
Site Reliability Engineering Lead
2 weeks ago
Job Description:
About the Role:
We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team. This position is responsible for leading the SRE side of our products, making technical decisions, and collaborating with development teams and platform engineers.
This role involves quantitatively measuring and managing system reliability and striking an appropriate risk balance through SLIs and SLOs. By automating operations, responding quickly to incidents, conducting root cause analysis, and driving continuous improvement, we enhance service resilience.
We cultivate a culture within the organization that blends engineering and operational best practices. The successful candidate will act as a leader who identifies technical challenges within development teams, proactively plans solutions, and drives projects to resolution.
Responsibilities:
- Analyze and improve system bottlenecks and conduct capacity planning
- Conduct postmortems and root cause analyses to prevent recurrence
- Continuously improve the incident management process and optimize on-call operations
- Optimize deployment pipelines and CI/CD workflows to improve release efficiency and rollback capabilities
- Design and implement comprehensive monitoring, logging, and tracing strategies using tools like OpenTelemetry, Grafana, Prometheus, and Datadog
- Work closely with other SREs, platform engineers, and developers to optimize infrastructure and improve reliability
Requirements:
- A few years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering
- Some coding experience required (experience limited to batch processing or small automation scripts is acceptable)
- Experience with statically typed languages (e.g., C, C++, Java, Rust, Go, Scala) required
- Experience operating Kubernetes in a production environment
- Experience with CI/CD automation tools
- Familiarity with cloud platforms (AWS or others) and cloud-native architectures
- Experience in incident management, disaster recovery, and high availability strategies
- Experience fostering SRE best practices within an organization
What We Offer:
- Technical leadership experience (mentoring and supporting team members in technical areas)
- Proven experience in project management (identifying issues, planning solutions, driving execution, and coordinating stakeholders)
- Collaboration with global teams in an agile and technically driven environment
Preferred Qualifications:
- Contributions to CNCF projects or open-source communities
- Hands-on experience with large-scale distributed systems and cutting-edge cloud-native technologies
About Us:
We are a company that values innovation, collaboration, and customer satisfaction.
-
Site Reliability Engineer
3 weeks ago
Chennai, Tamil Nadu, India Zyoin Group Full time Position: Site Reliability Engineer (SRE) Experience: 4 – 10 Years Location: Chennai (Hybrid – 2 days in office) Role Overview: We are seeking a Site Reliability Engineer (SRE) responsible for leading reliability practices, ensuring scalable systems, and collaborating with development teams to maintain highly available services. Key Responsibilities - Design,...
-
Site Reliability Engineering Lead
2 weeks ago
Chennai, Tamil Nadu, India Horizon56 Full time US$ 90,000 - US$ 1,20,000 per year We are seeking an experienced and dynamic Site Reliability Engineering Lead to oversee the reliability, scalability, and performance of our critical systems. In this role, you will lead a team of Technical Support Engineers, managing both day-to-day operations and a 24/7 shift schedule. You will collaborate with cross-functional teams to ensure system...
-
Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India Zyoin Group Full time Work Mode: Hybrid (2 days Office) We are looking for a Site Reliability Engineer (SRE) who will lead the Site Reliability Engineering (SRE) side of each of our products. This position is responsible for making technical decisions, collaborating with development teams and platform engineers, and building and operating highly reliable and scalable products....
-
Site Reliability Engineer
3 weeks ago
Chennai, Tamil Nadu, India Zyoin Group Full time Job Description Exp: 4-10 Years Location: Chennai Work Mode: Hybrid (2 days Office) We are looking for a Site Reliability Engineer (SRE) who will lead the Site Reliability Engineering (SRE) side of each of our products. This position is responsible for making technical decisions, collaborating with development teams and platform engineers, and building...
-
Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India Zyoin Group Full time Job Description Exp: 4-10 Years Location: Chennai Work Mode: Hybrid (2 days Office) We are looking for a Site Reliability Engineer (SRE) who will lead the Site Reliability Engineering (SRE) side of each of our products. This position is responsible for making technical decisions, collaborating with development teams and platform engineers, and building and...
-
Lead Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India Trimble Full time Job Description Lead Site Reliability Engineer Reporting to: Sr Manager, Availability Management Office Location: Chennai, India Flexible Working: Hybrid (Part Office/Part Home) Cloud Site Reliability Engineer Responsibilities - On-board internal customers to our 24x7 Applications Support and Enterprise Status Page services - Be involved with creating an SRE culture...
-
Site Reliability Engineer
1 week ago
Chennai, Tamil Nadu, India Concord Full time SRE Sr. Engineers (Individual Contributors) Key Attributes: Strong SRE (Site Reliability Engineering) experience; DevOps skills – CI/CD, monitoring, automation, infrastructure as code, etc.; Excellent troubleshooting and debugging skills (infrastructure + application level); Perseverance – must push through complex/challenging issues without giving up; Able to...
-
Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India Intellect Design Arena Full time ₹ 5,00,000 - ₹ 8,00,000 per year Job Title: Site Reliability Engineer Company: Intellect Design Arena Ltd Location: Chennai, India Experience Required: 6+ years Job Type: Full-time Department: SRE / DevOps / Engineering Enablement About Intellect Design Arena Ltd: Intellect Design Arena Ltd is a global leader in digital financial technology, offering cutting-edge solutions for banking, insurance,...