Site Reliability Developer 2

3 weeks ago


Pune, India Oracle Full time

Job Description

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

In this role you will need to:

  • Take ownership of the implementation and production operations of a wide array of core system platform solutions
  • React to production deficiencies by continuously implementing automation, self-learning, and real-time monitoring for production systems
  • Be a strong contributor to the development of platform services, including architecture, provisioning, configuration, deployment, and support
  • Partner with the distributed team in prototyping new database platform services
  • Stay informed of cloud infrastructure stacks
  • Innovate

Work with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Be responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Hold authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate the technical characteristics of services and technology areas, and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, and performance attributes and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles.

Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.

Preferred Qualifications:

  • Degree level: BE/BS/MS
  • Programming languages such as Python and Bash; technical skills in cloud platforms, Chef, Grafana, and Terraform
  • Fair knowledge and experience of Oracle Engineered Systems and subsystems
  • Ability to troubleshoot and resolve hardware/software issues, restore environments to an operational state, perform root cause analysis, and provide forward-thinking mitigation strategies
  • Fair understanding of Oracle Database technology, with implementation and troubleshooting experience, including RAC, Data Guard, ASM, RMAN, etc.
  • Demonstrated operations experience with the Linux platform (e.g., RHEL, OEL), including administration, management, and troubleshooting
  • Strong communication and analytical skills
  • Familiarity with security practices in web application delivery and general knowledge of network topology
  • Experience with configuration management tools

Career Level - IC2



  • Bengaluru, India Oracle Full time

    Job Description Oracle's Health and AI Database Services Team is a client-facing organization supporting more than 400 customer databases across multiple geographies. SLA adherence is mission-critical, as any lapse directly impacts customers who rely on OHAI to run their healthcare businesses. In alignment with our mission, we are focused on enhancing the...


  • India Oracle Full time

    Description Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale distributed systems....


  • Pune, India emagine Full time

    Job Description Job Overview: As a Site Reliability Engineer (SRE) working in a 24/7 shift rotation, you will be responsible for ensuring the reliability, availability, and performance of critical systems and services. You will combine strong technical skills with operational excellence to proactively monitor, troubleshoot, and resolve issues. Your expertise...


  • India Akamai Technologies Full time

    Job Description Do you like collaborating across teams to solve complex problems? Do you enjoy solving large scale distributed content delivery challenges? Join our highly skilled Compute Site Reliability team. Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We...


  • Pune, India Siemens Digital Industries Software Full time

    Job Description Siemens Digital Industries Software is a leading provider of solutions for the design, simulation, and manufacture of products across many different industries. Formula 1 cars, skyscrapers, ships, space exploration vehicles, and many of the objects we see in our daily lives are being conceived and manufactured using our Product Lifecycle...


  • Pune, India NR Consulting Full time

    Job Description About the Company We are seeking a highly skilled Site Reliability Engineer (SRE) with strong expertise in Google Cloud Platform (GCP) and CI/CD automation to lead cloud infrastructure initiatives. The ideal candidate will design and implement robust CI/CD pipelines, automate deployments, ensure platform reliability, and drive...


  • India Akamai Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Description Do you like collaborating across teams to solve complex problems? Do you enjoy solving large scale distributed content delivery challenges? Join our highly skilled Compute Site Reliability team. Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We specialize in creating...


  • Pune, India Talent Worx Full time

    Site Reliability Engineer (SRE) At Talent Worx, we are looking for a dedicated Site Reliability Engineer (SRE) to join our team. This role involves maintaining high availability and reliability of our services through the application of software engineering practices and systems administration skills. The ideal candidate will bridge the gap between...


  • India Oracle Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Description You will be responsible for working with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission...


  • Pune, India Talent Worx Full time

    Site Reliability Engineer (SRE) At Talent Worx, we are looking for a dedicated Site Reliability Engineer (SRE) to join our team. This role involves maintaining high availability and reliability of our services through the application of software engineering practices and systems administration skills. The ideal candidate will bridge the gap between...