Site Reliability Engineering Manager

4 weeks ago


Hyderabad, India Zortech Solutions Full time

Job Title: Site Reliability Engineering (SRE) Manager

Location: Hyderabad

Employment Type: Full-Time

Work Model: Hybrid (3 Days from Office)

About the Role

We are looking for an experienced Site Reliability Engineering (SRE) Manager to lead our reliability engineering function, ensuring infrastructure resiliency, operational excellence, and seamless user experiences. This hybrid leadership role blends hands-on technical expertise with strategic team leadership and cross-functional collaboration.

Experience Required

  • 10+ years of total IT experience.
  • 3+ years in a leadership role within SRE or Cloud Operations.

Technical Knowledge & Skills

Mandatory:

  • Kubernetes, GKE, Prometheus, Terraform
  • Advanced GCP administration
  • CI/CD: Jenkins, Argo CD, GitHub Actions
  • Incident Management (full lifecycle) with tools like OpsGenie

Nice to Have:

  • Knowledge of service mesh and observability stacks
  • Strong scripting skills (Python, Bash)
  • Exposure to BigQuery / Dataflow for telemetry

Scope of the Role

  • Build and mentor a high-performing SRE team.
  • Standardize practices for reliability, alerting, and response.
  • Partner with Engineering and Product leaders to align reliability goals with business objectives.

Key Responsibilities

  • Define and implement organizational reliability strategies, aligning SLAs, SLOs, and error budgets with business goals.
  • Develop and institutionalize incident response frameworks, including escalation, on-call scheduling, service ownership, and RCA governance.
  • Lead technical reviews for reliability design and high-availability architectures across distributed cloud services.
  • Champion a culture of observability and monitoring, standardizing dashboards, telemetry schemas, and alert definitions.
  • Drive continuous improvement initiatives (toil reduction, operational maturity, and SRE OKRs).
  • Collaborate with platform teams to introduce self-healing systems, autoscaling, and latency-optimized service mesh patterns.
  • Act as the principal escalation point for reliability-related concerns and ensure incident retrospectives lead to measurable improvements.
  • Own runbook standardization, capacity planning, and production readiness reviews for new launches.
  • Mentor and guide SREs, creating career pathways and fostering a proactive ownership culture.
  • Engage with leadership and stakeholders to track performance, reliability KPIs, and ROI on SRE investments.

Why Join Us?

  • Opportunity to build and lead a world-class SRE function.
  • Work with cutting-edge cloud-native technologies.
  • Hybrid work flexibility (3 days in office).
  • Collaborative culture with strong leadership support.
  • Apply Now to be part of our mission to deliver resilient, scalable, and reliable systems.

