Site Reliability Engineer

3 weeks ago


Chennai, Tamil Nadu, India | Ascendion | Full time

Job Description :

We are looking for an experienced Azure Site Reliability Engineer (SRE) with 6-9 years of experience to support and administer Azure Kubernetes Service (AKS) clusters running critical middleware handling thousands of transactions per second (TPS).

The ideal candidate will have a strong background in Infrastructure as Code (IaC), cloud networking, automation, and observability to ensure high availability, scalability, and reliability.

This role requires an engineering-first mindset, focusing on IaC-driven deployments, automation, monitoring, and operational excellence while maintaining a 99.999% availability target.

Key Responsibilities :

- Deploy, manage, and maintain Azure Kubernetes Service (AKS) clusters with a focus on scalability and availability.

- Handle cluster cutovers, base image updates, and IaC-driven changes.

- Apply SRE principles to ensure high availability and resiliency.

- Write and maintain Terraform scripts for IaC deployments.

- Manage Kubernetes configurations using Helm charts.

- Automate infrastructure provisioning, scaling, and disaster recovery processes.

- Implement GitOps methodologies using ArgoCD for deployment automation.

- Implement and maintain monitoring & logging solutions using ELK Stack (Elasticsearch, Logstash, Kibana) or Grafana Loki.

- Analyze logs and metrics to proactively detect issues.

- Integrate OpenTelemetry (preferred) for distributed tracing and observability.

- Ensure compliance with security best practices for handling sensitive data in regulated environments.

- Implement and manage secrets using HashiCorp Vault (preferred).

- Maintain secure cloud networking within Azure.

- Build and optimize CI/CD pipelines using GitHub Actions (preferred) or any CI/CD tool.

- Automate deployments using GitOps principles with ArgoCD.

- Optimize build, release, and rollback processes for high availability.

- Conduct disaster recovery testing and build fault-tolerant systems.

- Respond to production incidents, troubleshoot issues, and implement long-term fixes.

- Participate in an on-call rotation to ensure system reliability.

Required Skills & Qualifications :

Cloud :

- Azure (Must-have) with strong networking expertise.

- Terraform (Hands-on experience required).

Container Orchestration :

- Kubernetes (AKS) with Helm.

- GitHub Actions (preferred) or any CI/CD tool.

- ArgoCD for deployment automation.

- Proficiency in Python or Golang (any one required).

- Experience with ELK Stack or Grafana Loki.

- Linux with networking skills.

- OpenTelemetry (preferred).

- HashiCorp Vault (preferred).

- Familiarity with security and compliance best practices.

- Bachelor's degree in Computer Science, Engineering, or related fields (or equivalent experience).

- 6-9 years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering.

- Minimum 2 years of hands-on experience as an SRE working with Azure, Kubernetes, and Terraform.

- Experience working in highly available, regulated, and security-sensitive environments.

What We Offer :

- Work with highly scalable, mission-critical systems handling thousands of transactions per second.

- Be part of a fast-paced, DevOps-driven engineering culture.

- Leverage the latest cloud-native technologies and automation frameworks.

- Competitive compensation and growth opportunities.

(ref:hirist.tech)
