Sandhata - Site Reliability Engineer - Cloud Infrastructure

3 weeks ago


Chennai, Tamil Nadu, India Sandhata Technologies Pvt Ltd Full time

Position Overview :

We are seeking an experienced Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will play a pivotal role in ensuring the reliability, availability, and performance of our cloud-based infrastructure hosted on AWS with EKS. You will work closely with cross-functional teams to implement best practices for monitoring, automation, and continuous integration and deployment using tools such as Datadog and Azure DevOps. The ideal candidate should have a solid background in cloud technologies, troubleshooting, and production release support.

Responsibilities :

- Collaborate with development and operations teams to design, implement, and manage scalable and reliable infrastructure solutions on AWS using EKS (Elastic Kubernetes Service).

- Develop, maintain, and enhance monitoring and alerting systems using Datadog to proactively identify and address potential issues, ensuring optimal system performance.

- Participate in the design and implementation of CI/CD pipelines using Azure DevOps, enabling automated and reliable software delivery.

- Lead efforts in incident response and troubleshooting to quickly diagnose and resolve production incidents, minimizing downtime and impact on users.

- Take ownership of reliability initiatives by identifying areas for improvement, conducting root cause analysis, and implementing solutions to prevent recurrence of incidents.

- Work with the development teams to define and establish Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and maintain the system's reliability.

- Contribute to the documentation of processes, procedures, and best practices to enhance knowledge sharing within the team.

Requirements :

- Minimum of 4 years of experience in a Site Reliability Engineer or similar role, managing cloud-based infrastructure on AWS with EKS.

- Strong expertise in AWS services, especially EKS, including cluster provisioning, scaling, and management.

- Proficiency in using monitoring and observability tools, with hands-on experience in Datadog or similar tools for tracking system performance and generating meaningful alerts.

- Experience in implementing CI/CD pipelines using Azure DevOps or similar tools to automate software deployment and testing.

- Solid understanding of containerization and orchestration technologies (e.g., Docker, Kubernetes) and their role in modern application architectures.

- Excellent troubleshooting skills and the ability to analyze complex issues, determine root causes, and implement effective solutions.

- Strong scripting and automation skills (Python, Bash, etc.). (ref:hirist.tech)

  • Chennai, Tamil Nadu, India Forbes Advisor Full time

    Job Title: Staff Site Reliability Engineer - Cloud Infrastructure Description: We're seeking an experienced Staff Site Reliability Engineer - Cloud Infrastructure to join our team at Forbes Advisor. As a leader in personal finance and business insights, we strive to provide accurate and reliable information to help consumers make informed...


  • Chennai, Tamil Nadu, India Centific Global Technologies Full time

    Job Summary Centific Global Technologies is seeking a highly skilled SRE Engineer to join our team. As a key member of our infrastructure reliability team, you will be responsible for designing, implementing, and maintaining scalable, secure, and highly available cloud-based systems. About Us Centific Global Technologies is a leading provider of innovative...


  • Chennai, Tamil Nadu, India 10decoders Full time

    JD: Site Reliability Engineer - GCP With Terraform The Role: We are looking for a Senior SRE with 5+ years of experience to work primarily with our Application development team. An ideal candidate would have extensive experience building cloud infrastructure on Google Cloud with Terraform and have strong experience running workloads that scale on Google's...


  • Chennai, Tamil Nadu, India Bright Vision Technologies Full time

    Bright Vision Technologies has an immediate full-time opportunity for a Site Reliability Engineer (SRE). Job Role: Site Reliability Engineer (SRE) Job Type: Full Time Candidates looking for visa sponsorship and willing to relocate to the USA are encouraged to apply. About Bright Vision Technologies: Bright Vision Technologies is a fast-growing technology company...


  • Chennai, Tamil Nadu, India 10decoders Full time

    Job Summary We are seeking a Senior Site Reliability Engineer (SRE) with 5+ years of experience to join our team and work primarily with our Application development team. The ideal candidate will have extensive experience building cloud infrastructure on Google Cloud Platform using Terraform and strong experience running workloads that scale on Google's...


  • Chennai, Tamil Nadu, India 10decoders Full time

    JD: Site Reliability Engineer - GCP With Terraform The Role: We are looking for a Senior SRE with 5+ years of experience to work primarily with our Application development team. An ideal candidate would have extensive experience building cloud infrastructure on Google Cloud with Terraform and have strong experience running workloads that scale on Google's Kubernetes...


  • Chennai, Tamil Nadu, India Kiash Solutions LLp Full time

    We are hiring a Site Reliability Engineer (SRE) with strong expertise in Azure operations, containerized workflows (Docker), and Python scripting. The ideal candidate will lead efforts to ensure system reliability, automate operational tasks, and optimize cloud-based infrastructure, while collaborating with cross-functional teams to deliver high-performing...


  • Chennai, Tamil Nadu, India Burgeon It Services Pvt Ltd Full time

    Job Title : SRE Engineer Location : Chennai Experience : 8 Years Job Description : We are seeking an experienced Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a strong background in software engineering and operations, with a passion for building scalable and reliable systems. Key Responsibilities : - Design,...


  • Chennai, Tamil Nadu, India 10decoders Full time

    Job Description The Role: We are seeking a Senior Site Reliability Engineer with 5+ years of experience to work closely with our Application Development team. Responsibilities: Contribute to establishing best practices and shaping the SRE culture within our organization. Collaborate with teams to design, build, and improve Google Cloud infrastructure using...


  • Chennai, Tamil Nadu, India Centific Global Technologies Full time

    About the Role We are seeking a highly motivated and experienced Cloud Architect to lead our infrastructure reliability efforts. As a key member of our team, you will be responsible for designing, implementing, and maintaining scalable, secure, and highly available cloud-based systems. Responsibilities * Lead the design and implementation of scalable and...


  • Chennai, Tamil Nadu, India Burgeon It Services Pvt Ltd Full time

    Job Title : SRE Engineer Location : Chennai Experience : 8+ Years Job Description : We are seeking an experienced Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a strong background in software engineering and operations, with a passion for building scalable and reliable systems. Key Responsibilities : - Design, implement,...


  • Chennai, Tamil Nadu, India Tredence Inc. Full time

    Site Reliability Engineer (SRE) Experience: 8-12yrs Pune/ Chennai/ Gurgaon/ Kolkata We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) with a deep understanding of SRE principles and practices. This role will be instrumental in shaping and guiding the SRE journey, ensuring high availability, reliability, and performance. The...


  • Chennai, Tamil Nadu, India triSys Full time

    Job Description Experience: 5-8yrs Job Location : Chennai/Pune/Gurgaon/Kolkata We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) with a deep understanding of SRE principles and practices. This role will be instrumental in shaping and guiding the SRE journey, ensuring high availability, reliability, and performance. The ideal...


  • Chennai, Tamil Nadu, India MEEDEN LABS PRIVATE LIMITED Full time

    Experience Required : - 1 to 3 years of hands-on experience in Site Reliability Engineering (SRE) with a focus on observability, triaging, and automation. Mandatory Skills : Monitoring & Reporting : - Hands-on experience with monitoring tools like AppDynamics, Prometheus, Data Dog, New Relic, Dynatrace, Splunk, Kibana, Grafana, Alert Manager, and PagerDuty....


  • Chennai, Tamil Nadu, India Everstage Inc. Full time

    Everstage is looking to hire Site Reliability Engineer. Please write to bharath@everstage.com if the below opportunity excites you. We are seeking a skilled and motivated Site Reliability Engineer (SRE) with at least 2 years of experience in maintaining and optimising infrastructure. The ideal candidate will be responsible for ensuring system reliability...


  • Chennai, Tamil Nadu, India HCLTech Full time

    Site Reliability Engineer, in Application/Cloud Support, will be responsible for ensuring the reliability, scalability, and performance of critical business systems. Involved in identifying and solving issues within multiple components of these systems, utilizing expertise in Site Reliability Engineering. Roles & Responsibilities: - Lead efforts to improve the...


  • Chennai, Tamil Nadu, India HCLTech Full time

    Job Description Site Reliability Engineer, in Application/Cloud Support, will be responsible for ensuring the reliability, scalability, and performance of critical business systems. Involved in identifying and solving issues within multiple components of these systems, utilizing expertise in Site Reliability Engineering. Roles & Responsibilities: - 1. Lead...