Manager - Site Reliability

1 week ago


Hyderabad, Telangana, India | ZORTECH SOLUTIONS PRIVATE LIMITED | Full time | ₹ 20,00,000 - ₹ 25,00,000 per year

Job Title : Site Reliability Engineering (SRE) Manager

Location : Hyderabad

Employment Type : Full-Time

Work Model : Hybrid (3 days in office)

Summary :


The SRE Manager will lead the reliability engineering function, ensuring infrastructure resiliency and optimal operational performance. This hybrid role blends technical leadership with team mentorship and cross-functional coordination.

Experience Required :


10+ years total experience, with 3+ years in a leadership role in SRE or Cloud Operations.

Technical Knowledge and Skills :

Mandatory :


- Deep understanding of Kubernetes, GKE, Prometheus, Terraform

- Cloud : Advanced GCP administration

- CI/CD : Jenkins, Argo CD, GitHub Actions

- Incident Management : Full lifecycle, tools like OpsGenie

Nice to Have :


- Knowledge of service mesh and observability stacks

- Strong scripting skills (Python, Bash)

- BigQuery/Dataflow exposure for telemetry

Scope :


- Build and lead a team of SREs

- Standardize practices for reliability, alerting, and response

- Engage with Engineering and Product leaders

Roles and Responsibilities :


- Establish and lead the implementation of organizational reliability strategies, aligning SLAs, SLOs, and Error Budgets with business goals and customer expectations.

- Develop and institutionalize incident response frameworks, including escalation policies, on-call scheduling, service ownership mapping, and RCA process governance.

- Lead technical reviews for infrastructure reliability design, high-availability architectures, and resiliency patterns across distributed cloud services.

- Champion observability and monitoring culture by standardizing tooling, alert definitions, dashboard templates, and telemetry data schemas across all product teams.

- Drive continuous improvement through operational maturity assessments, toil elimination initiatives, and SRE OKRs aligned with product objectives.

- Collaborate with cloud engineering and platform teams to introduce self-healing systems, capacity-aware autoscaling, and latency-optimized service mesh patterns.

- Act as the principal escalation point for reliability-related concerns and ensure incident retrospectives lead to measurable improvements in uptime and MTTR.

- Own runbook standardization, capacity planning, failure mode analysis, and production readiness reviews for new feature launches.

- Mentor and develop a high-performing SRE team, fostering a proactive ownership culture, encouraging cross-functional knowledge sharing, and establishing technical career pathways.

- Collaborate with leadership, delivery, and customer stakeholders to define reliability goals, track performance, and demonstrate ROI on SRE investments.


