Senior Site Reliability Engineer
1 month ago
Team: OCI Reliability
Shift: 6 AM - 2 PM IST
Skills required: Production Incidents, Automation, Python.
Location : Remote
Job description
As a Senior Site Reliability Engineer, you will focus on detecting, triaging, and mitigating OCI service-impacting events quickly and efficiently. You will be responsible for minimising downtime by delivering exceptional major incident management and ensuring the reliability, scalability, performance, and security of the systems that prevent incidents from occurring. Your work will directly contribute to reducing event duration by leveraging your operational expertise, best practices, and the ability to develop tools that automate and improve incident management processes.
Oracle Cloud is cutting-edge and continuously evolving. When issues arise, your team will respond within minutes to mitigate customer impact and ensure service continuity. This role will give you deep insight into the inner workings of OCI’s systems and operations. You’ll collaborate with and influence leaders across Oracle, driving organisational initiatives aimed at continually improving OCI-wide service availability. As part of an agile, high-impact team, you will play a crucial role in shaping the future of Oracle Cloud. If you're excited to be part of a fast-moving team that’s pushing the boundaries of innovation, we’d love to connect with you.
We are looking for candidates who are flexible to work APAC shift hours (6 AM to 2 PM IST).
Career Level - IC3
Responsibilities :
Lead major incident recovery by orchestrating cross-functional collaboration, driving rapid escalation, clear communication, and seamless stakeholder alignment to ensure swift and effective resolution.
Identify opportunities to automate and streamline critical incident workflows, taking full ownership of developing and implementing innovative solutions to enhance efficiency and drive faster resolutions.
Leverage deep expertise in cloud computing design patterns and dependencies to proactively mitigate complex major incidents and optimize cloud-based solutions. Quickly diagnose root causes, mitigate impact, and implement long-term fixes.
Troubleshoot cloud infrastructure issues using observability platforms to monitor, analyse, and resolve performance and reliability challenges.
Continuously improve operational processes, tools, and workflows to enhance the reliability and efficiency of the cloud infrastructure.
Minimum Qualifications
Bachelor's degree or higher in Computer Science or a related field, or equivalent work experience.
4+ years of experience in Site Reliability Engineering (SRE), DevOps, or Systems Engineering.
Extensive hands-on experience with public cloud operations (e.g., AWS, Azure, GCP, OCI).
Proven track record in Major Incident Management within cloud-based environments, with the ability to drive effective incident resolution.
Strong understanding of automation and orchestration principles, with a focus on improving system reliability and efficiency.
Proficiency in at least one modern object-oriented programming language (e.g., Python, Java, Go, etc.).
Solid experience in software engineering best practices, including Agile methodologies, coding standards, code reviews, version control, build processes, testing, and operations.
Familiarity with infrastructure automation tools such as Chef, Ansible, Jenkins, and Terraform.
Expertise in several key technologies, including Infrastructure-as-a-Service (IaaS), CI/CD systems, Docker, RESTful APIs, log analysis, and debugging tools.
Experience with observability platforms such as Grafana, Prometheus, and other monitoring, logging, and tracing tools to optimize system visibility, performance, and issue resolution.
-
Senior Site Reliability Engineer
3 weeks ago
Delhi, India GeekBull Consulting Full time. Job Code: GBC-2411129. Job Role: Senior Site Reliability Engineer. Job Type: Contract-to-Hire (C2H). Duration: 6 Months. Experience: 7-10 Years. Location: Hyderabad. Work Location: Hyderabad/Remote. Shift Timings: 6 PM to 3 AM IST. About Company: We collaborate with a wide range of clients, from startups to industry giants in sectors like Healthcare, Education, IT,...
-
Senior Site Reliability Engineer
4 weeks ago
Delhi, India Ushur Full time. Location: Bangalore. Experience: 6-8 Years. Work Mode: Hybrid/Remote. The Role: Senior Site Reliability Engineers at Ushur perform a unique blend of customer support engineering, solution engineering, and operational engineering. You will work on our largest customers’ most complex problems and craft intuitive, elegant solutions. You’ll also proactively work...
-
Senior Site Reliability Engineer
3 weeks ago
Delhi, India SwiftWIN | A Concord Company Full time. Job Title: Site Reliability Engineer (SRE) - Azure DevOps. Job Overview: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) with strong experience in Azure DevOps to join our dynamic team. The SRE will be responsible for maintaining the reliability, availability, and performance of our production environments, with a specific focus on...
-
Senior Site Reliability Engineer
2 months ago
Delhi, India Vertex Agility Full time. Senior Site Reliability Engineer - Remote. Vertex Agility is a dynamic, cross-geographic remote consultancy specializing in software engineering, DevOps, and cloud, partnered with some of the most well-known brands globally. We specialise in transforming businesses by providing cloud solutions tailored to our clients' needs. Operating from 15+ countries across...
-
Site Reliability Engineer
4 weeks ago
Delhi, India Tranzeal Incorporated Full time. Job Title: Site Reliability Engineer (SRE). Location: Bangalore, KA. Work Mode: Office (5 Days/Week). Position Type: Contract based. We're hiring a Site Reliability Engineer to join our team in Bangalore! If you have a strong background in maintaining and scaling cloud services and love automating infrastructure at scale, this is for you. Experience with Ansible and...
-
Site Reliability Engineer
3 weeks ago
Delhi, India Delphic (South Asia) Full time. Job Title: Site Reliability Engineer (SRE). Location: Remote. Job Type: Full-time. Experience: 7 years. Introduction: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will play a critical role in ensuring the reliability, scalability, and performance of our systems and infrastructure. You will...
-
Site Reliability Engineer
3 weeks ago
Delhi, India Delphic Full time. Job Title: Site Reliability Engineer (SRE). Location: Remote. Job Type: Full-time. Experience: 7 years. Introduction: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will play a critical role in ensuring the reliability, scalability, and performance of our systems and infrastructure. You will...
-
Site Reliability Engineer
1 month ago
Delhi, India IDEMIA Full time. We are hiring for a Site Reliability Engineer role at our Noida location. Responsibility: involved in deploying, managing, and operating medium to large scale production systems; understanding of Linux as a runtime environment; familiar with cloud-native concepts and virtualisation; familiar with CI/CD concepts and tools like Jenkins, GitLab, etc.; previous experience of working...
-
Site Reliability Engineer
1 month ago
Delhi, India IDEMIA Full time. We are hiring for a Site Reliability Engineer role at our Noida location. Responsibility: involved in deploying, managing, and operating medium to large scale production systems; understanding of Linux as a runtime environment; familiar with cloud-native concepts and virtualisation; familiar with CI/CD concepts and tools like Jenkins, GitLab, etc.; previous experience of working with Docker,...
-
Site Reliability Engineer
1 month ago
Delhi, India IDEMIA Full time. We are hiring for a Site Reliability Engineer role at our Noida location. Responsibility: involved in deploying, managing, and operating medium to large scale production systems; understanding of Linux as a runtime environment; familiar with cloud-native concepts and virtualisation; familiar with CI/CD concepts and tools like Jenkins, GitLab, etc.; previous experience of working with...
-
Site Reliability Engineer
3 weeks ago
Delhi, India K&K Social Resources and Development GmbH Full time. K&K Social Resources & Development GmbH is an international recruiting agency that has been providing technical resources in the European region since 1993. This position is with one of our clients in India who is actively hiring candidates to expand their teams. Title: Site Reliability Engineer. Location: India - Remote. Employment Type: Permanent. Notice...
-
Site Reliability Engineer
3 weeks ago
Delhi, India Hirextra - World's First Staffing Aggregator Full time. Job Description: Highly skilled Cloud Site Reliability Engineer to ensure high availability, reliability and performance of cloud infrastructure and services; experience in cloud platforms (AWS, GCP), automation, monitoring, and incident management; experience in Prometheus, Grafana, Splunk, CloudWatch; automate routine operational tasks and cloud...
-
Site Reliability Engineer
3 weeks ago
Delhi, India Tata Consultancy Services Full time. TCS has been a great pioneer in feeding the fire of young techies like you. We are a global leader in the technology arena and there’s nothing that can stop us from growing together. What we are looking for: Role: Site Reliability Engineer. Experience Range: 8-12 Years. Location: Pune & Chennai, Bangalore, Delhi. Must-Have: Exceptional skills in...
-
Site Reliability Engineer
4 weeks ago
Delhi, India Tata Consultancy Services Full time. TCS has been a great pioneer in feeding the fire of young techies like you. We are a global leader in the technology arena and there’s nothing that can stop us from growing together. What we are looking for: Role: Site Reliability Engineer. Experience Range: 8-12 Years. Location: Pune & Chennai, Bangalore, Delhi. Must-Have: Exceptional skills in...
-
Site Reliability Engineer
2 weeks ago
Delhi, India Coforge Full time. Job Title: Site Reliability Engineer. Skills: SRE, CI/CD, AWS, Python, Terraform & Kubernetes. Location: Hyderabad (Work from Office). Experience: 7-15 Years. Note: Immediate joiners are preferable. Job Description: We at Coforge are hiring a Site Reliability Engineer with the following skillset: Design, implement, and manage scalable and secure cloud-based...
-
Staff Site Reliability Engineer
4 weeks ago
Delhi, India Zscaler Full time. About the role: Our Engineering team built the world's largest cloud security platform from the ground up, and we keep building. With more than 100 patents and big plans for enhancing services and increasing our global footprint, the team has made us and our multitenant architecture today's cloud security leader, with more than 15 million users in 185...