Sandhata - Site Reliability Engineer - Cloud Infrastructure
3 weeks ago
Position Overview :
We are seeking an experienced Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will play a pivotal role in ensuring the reliability, availability, and performance of our cloud-based infrastructure hosted on AWS with EKS. You will work closely with cross-functional teams to implement best practices for monitoring, automation, and continuous integration and deployment using tools such as Datadog and Azure DevOps. The ideal candidate should have a solid background in cloud technologies, troubleshooting, and production release support.
Responsibilities :
- Collaborate with development and operations teams to design, implement, and manage scalable and reliable infrastructure solutions on AWS using EKS (Elastic Kubernetes Service).
- Develop, maintain, and enhance monitoring and alerting systems using Datadog to proactively identify and address potential issues, ensuring optimal system performance.
- Participate in the design and implementation of CI/CD pipelines using Azure DevOps, enabling automated and reliable software delivery.
- Lead efforts in incident response and troubleshooting to quickly diagnose and resolve production incidents, minimizing downtime and impact on users.
- Take ownership of reliability initiatives by identifying areas for improvement, conducting root cause analysis, and implementing solutions to prevent recurrence of incidents.
- Work with the development teams to define and establish Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and maintain the system's reliability.
- Contribute to the documentation of processes, procedures, and best practices to enhance knowledge sharing within the team.
Requirements :
- Strong expertise in AWS services, especially EKS, including cluster provisioning, scaling, and management.
- Proficiency in using monitoring and observability tools, with hands-on experience in Datadog or similar tools for tracking system performance and generating meaningful alerts.
- Experience in implementing CI/CD pipelines using Azure DevOps or similar tools to automate software deployment and testing.
- Solid understanding of containerization and orchestration technologies (e.g., Docker, Kubernetes) and their role in modern application architectures.
- Excellent troubleshooting skills and the ability to analyze complex issues, determine root causes, and implement effective solutions.
- Strong scripting and automation skills (Python, Bash, etc.). (ref:hirist.tech)
-
Staff Site Reliability Engineer
5 days ago
Chennai, Tamil Nadu, India Forbes Advisor Full time Job Title: Staff Site Reliability Engineer - Cloud Infrastructure Description: We're seeking an experienced Staff Site Reliability Engineer - Cloud Infrastructure to join our team at Forbes Advisor. As a leader in personal finance and business insights, we strive to provide accurate and reliable information to help consumers make informed...
-
Site Reliability Engineer
13 hours ago
Chennai, Tamil Nadu, India Centific Global Technologies Full time Job Summary: Centific Global Technologies is seeking a highly skilled SRE Engineer to join our team. As a key member of our infrastructure reliability team, you will be responsible for designing, implementing, and maintaining scalable, secure, and highly available cloud-based systems. About Us: Centific Global Technologies is a leading provider of innovative...
-
Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India 10decoders Full time JD: Site Reliability Engineer - GCP With Terraform The Role: We are looking for a Senior SRE with 5+ years of experience to work primarily with our Application development team. An ideal candidate would have extensive experience building cloud infrastructure on Google Cloud with Terraform and have strong experience running workloads that scale on Google's...
-
Site Reliability Engineer
4 weeks ago
Chennai, Tamil Nadu, India 10decoders Full time JD: Site Reliability Engineer - GCP With Terraform The Role: We are looking for a Senior SRE with 5+ years of experience to work primarily with our Application development team. An ideal candidate would have extensive experience building cloud infrastructure on Google Cloud with Terraform and have strong experience running workloads that scale on Google's...
-
Site Reliability Engineer
1 week ago
Chennai, Tamil Nadu, India Bright Vision Technologies Full time Bright Vision Technologies has an immediate full-time opportunity for a Site Reliability Engineer (SRE). Job Role: Site Reliability Engineer (SRE) Job Type: Full Time Candidates looking for visa sponsorship and willing to relocate to the USA are encouraged to apply. About Bright Vision Technologies: Bright Vision Technologies is a fast-growing technology company...
-
Site Reliability Engineer
3 weeks ago
Chennai, Tamil Nadu, India 10decoders Full time Job Summary: We are seeking a Senior Site Reliability Engineer (SRE) with 5+ years of experience to join our team and work primarily with our Application development team. The ideal candidate will have extensive experience building cloud infrastructure on Google Cloud Platform using Terraform and strong experience running workloads that scale on Google's...
-
Site Reliability Engineer
3 weeks ago
Chennai, Tamil Nadu, India 10decoders Full time JD: Site Reliability Engineer - GCP With Terraform The Role: We are looking for a Senior SRE with 5+ years of experience to work primarily with our Application development team. An ideal candidate would have extensive experience building cloud infrastructure on Google Cloud with Terraform and have strong experience running workloads that scale on Google's Kubernetes...
-
Site Reliability Engineer
4 weeks ago
Chennai, Tamil Nadu, India 10decoders Full time JD: Site Reliability Engineer - GCP With Terraform The Role: We are looking for a Senior SRE with 5+ years of experience to work primarily with our Application development team. An ideal candidate would have extensive experience building cloud infrastructure on Google Cloud with Terraform and have strong experience running workloads that scale on Google's...
-
Site Reliability Engineer
4 weeks ago
Chennai, Tamil Nadu, India Kiash Solutions LLp Full time We are hiring a Site Reliability Engineer (SRE) with strong expertise in Azure operations, containerized workflows (Docker), and Python scripting. The ideal candidate will lead efforts to ensure system reliability, automate operational tasks, and optimize cloud-based infrastructure, while collaborating with cross-functional teams to deliver high-performing...
-
Site Reliability Engineer
5 hours ago
Chennai, Tamil Nadu, India Burgeon It Services Pvt Ltd Full time Job Title : SRE Engineer Location : Chennai Experience : 8 Years Job Description : We are seeking an experienced Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a strong background in software engineering and operations, with a passion for building scalable and reliable systems. Key Responsibilities : - Design,...
-
Site Reliability Engineer
4 weeks ago
Chennai, Tamil Nadu, India 10decoders Full time Job Description The Role: We are seeking a Senior Site Reliability Engineer with 5+ years of experience to work closely with our Application Development team. Responsibilities: Contribute to establishing best practices and shaping the SRE culture within our organization. Collaborate with teams to design, build, and improve Google Cloud infrastructure using...
-
Cloud Architect and Site Reliability Leader
9 hours ago
Chennai, Tamil Nadu, India Centific Global Technologies Full time About the Role: We are seeking a highly motivated and experienced Cloud Architect to lead our infrastructure reliability efforts. As a key member of our team, you will be responsible for designing, implementing, and maintaining scalable, secure, and highly available cloud-based systems. Responsibilities: * Lead the design and implementation of scalable and...
-
Site Reliability Engineer
3 weeks ago
Chennai, Tamil Nadu, India Burgeon It Services Pvt Ltd Full time Job Title : SRE Engineer Location : Chennai Experience : 8+ Years Job Description : We are seeking an experienced Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a strong background in software engineering and operations, with a passion for building scalable and reliable systems. Key Responsibilities : - Design, implement,...
-
Senior Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India Tredence Inc. Full time Site Reliability Engineer (SRE) Experience: 8-12 yrs Location: Pune/Chennai/Gurgaon/Kolkata We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) with a deep understanding of SRE principles and practices. This role will be instrumental in shaping and guiding the SRE journey, ensuring high availability, reliability, and performance. The...
-
Senior Site Reliability Engineer
5 days ago
Chennai, Tamil Nadu, India Tredence Inc. Full time Site Reliability Engineer (SRE) Experience: 8-12 yrs Location: Pune/Chennai/Gurgaon/Kolkata We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) with a deep understanding of SRE principles and practices. This role will be instrumental in shaping and guiding the SRE journey, ensuring high availability, reliability, and performance. The...
-
Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India triSys Full time Job Description Experience: 5-8 yrs Job Location : Chennai/Pune/Gurgaon/Kolkata We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) with a deep understanding of SRE principles and practices. This role will be instrumental in shaping and guiding the SRE journey, ensuring high availability, reliability, and performance. The ideal...
-
Site Reliability Engineer
4 days ago
Chennai, Tamil Nadu, India MEEDEN LABS PRIVATE LIMITED Full time Experience Required : - 1 to 3 years of hands-on experience in Site Reliability Engineering (SRE) with a focus on observability, triaging, and automation. Mandatory Skills : Monitoring & Reporting : - Hands-on experience with monitoring tools like AppDynamics, Prometheus, Datadog, New Relic, Dynatrace, Splunk, Kibana, Grafana, Alertmanager, and PagerDuty....
-
Site Reliability Engineer
3 weeks ago
Chennai, Tamil Nadu, India Everstage Inc. Full time Everstage is looking to hire a Site Reliability Engineer. Please write to bharath@everstage.com if the opportunity below excites you. We are seeking a skilled and motivated Site Reliability Engineer (SRE) with at least 2 years of experience in maintaining and optimising infrastructure. The ideal candidate will be responsible for ensuring system reliability...
-
Site Reliability Engineer
3 days ago
Chennai, Tamil Nadu, India HCLTech Full time The Site Reliability Engineer, in Application/Cloud Support, will be responsible for ensuring the reliability, scalability, and performance of critical business systems. The role involves identifying and solving issues within multiple components of these systems, utilizing expertise in Site Reliability Engineering. Roles & Responsibilities: - Lead efforts to improve the...
-
Site Reliability Engineer
4 days ago
Chennai, Tamil Nadu, India HCLTech Full time Job Description The Site Reliability Engineer, in Application/Cloud Support, will be responsible for ensuring the reliability, scalability, and performance of critical business systems. The role involves identifying and solving issues within multiple components of these systems, utilizing expertise in Site Reliability Engineering. Roles & Responsibilities: - 1. Lead...