Senior Site Reliability Engineer

1 week ago


India Oracle Full time

Role : Senior Site Reliability Engineer

Team: OCI Reliability

Shift : 6 AM - 2 PM IST

Skills required : Production Incidents, Automation, Python.

Location : Remote

Job description

As a Senior Site Reliability Engineer, you will focus on detecting, triaging, and mitigating OCI service-impacting events quickly and efficiently. You will be responsible for minimising downtime by delivering exceptional major incident management and ensuring the reliability, scalability, performance, and security of the systems that prevent incidents from occurring. Your work will directly contribute to reducing event duration by leveraging your operational expertise, best practices, and the ability to develop tools that automate and improve incident management processes.

Oracle Cloud is cutting-edge and continuously evolving. When issues arise, your team will respond within minutes to mitigate customer impact and ensure service continuity. This role will give you deep insight into the inner workings of OCI’s systems and operations. You’ll collaborate with and influence leaders across Oracle, driving organisational initiatives aimed at continually improving OCI-wide service availability. As part of an agile, high-impact team, you will play a crucial role in shaping the future of Oracle Cloud. If you're excited to be part of a fast-moving team that’s pushing the boundaries of innovation, we’d love to connect with you.

We are looking for candidates who are flexible to work APAC shift hours (6 AM to 2 PM IST).

Career Level - IC3

Responsibilities :

  • Lead major incident recovery by orchestrating cross-functional collaboration, driving rapid escalation, clear communication, and seamless stakeholder alignment to ensure swift and effective resolution.
  • Identify opportunities to automate and streamline critical incident workflows, taking full ownership of developing and implementing innovative solutions to enhance efficiency and drive faster resolutions.
  • Leverage deep expertise in cloud computing design patterns and dependencies to proactively mitigate complex major incidents, optimize cloud-based solutions, quickly diagnose root causes, and implement long-term fixes.
  • Troubleshoot cloud infrastructure issues using observability platforms to monitor, analyse, and resolve performance and reliability challenges.
  • Continuously improve operational processes, tools, and workflows to enhance the reliability and efficiency of the cloud infrastructure.

Minimum Qualifications

  • Bachelor's degree or higher in Computer Science or a related field, or equivalent work experience.
  • 4+ years of experience in Site Reliability Engineering (SRE), DevOps, or Systems Engineering.
  • Extensive hands-on experience with public cloud operations (e.g., AWS, Azure, GCP, OCI).
  • Proven track record in Major Incident Management within cloud-based environments, with the ability to drive effective incident resolution.
  • Strong understanding of automation and orchestration principles, with a focus on improving system reliability and efficiency.
  • Proficiency in at least one modern object-oriented programming language (e.g., Python, Java, Go).
  • Solid experience in software engineering best practices, including Agile methodologies, coding standards, code reviews, version control, build processes, testing, and operations.
  • Familiarity with infrastructure automation tools such as Chef, Ansible, Jenkins, and Terraform.
  • Expertise in several key technologies, including Infrastructure-as-a-Service (IaaS), CI/CD systems, Docker, RESTful APIs, log analysis, and debugging tools.
  • Experience with observability platforms such as Grafana, Prometheus, and other monitoring, logging, and tracing tools to optimize system visibility, performance, and issue resolution.


