Site Reliability Engineer
2 days ago
Description:
Site Reliability Engineer (SRE) Kubernetes & Cloud
Position Summary:
We are seeking a highly skilled Site Reliability Engineer (SRE) with deep expertise in Kubernetes and cloud technologies (AWS, Azure, or GCP). The SRE will be responsible for designing, deploying, automating, and supporting highly available, scalable, and secure containerized applications in cloud-native environments. You will work closely with development, operations, and security teams to ensure the reliability, performance, and efficiency of our production systems.
Key Responsibilities:
- Design, deploy, and manage Kubernetes clusters (on-premises and/or cloud-managed such as EKS, AKS, GKE) to support scalable microservices architectures.
- Automate infrastructure provisioning and application deployment using Infrastructure as Code (IaC) tools such as Terraform, Helm, or CloudFormation.
- Monitor, troubleshoot, and optimize system performance using observability tools (Prometheus, Grafana, ELK, Datadog, etc.).
- Implement and manage CI/CD pipelines to ensure rapid, repeatable, and reliable software delivery.
- Ensure system reliability, availability, and security through proactive monitoring, incident response, and root cause analysis.
- Develop and maintain runbooks, dashboards, and documentation for operational procedures and system architectures.
- Participate in on-call rotations and respond to production incidents, ensuring minimal downtime and fast recovery.
- Collaborate with development and operations teams to drive DevOps and SRE best practices, including capacity planning, scaling, and cost optimization.
- Continuously improve automation, tooling, and processes to reduce manual work and increase system reliability.
Required Skills & Experience:
- Years of experience as an SRE, DevOps Engineer, or in a similar role supporting large-scale, production-grade environments.
- Expertise in Kubernetes (deployment, scaling, upgrades, troubleshooting, networking, RBAC, etc.).
- Hands-on experience with at least one major cloud provider: AWS, Azure, or GCP.
- Proficiency in scripting/programming (Python, Bash, Go, etc.).
- Experience with IaC tools (Terraform, Helm, CloudFormation, ARM, etc.).
- Strong knowledge of Linux systems administration and networking concepts.
- Familiarity with monitoring, logging, and alerting tools (Prometheus, Grafana, ELK/EFK, Datadog, etc.).
- Experience with CI/CD tools (Jenkins, GitLab CI, ArgoCD, etc.).
- Understanding of security best practices in cloud and containerized environments.
- Excellent troubleshooting and problem-solving skills.
- Strong communication and collaboration skills.
Preferred Qualifications:
- Certified Kubernetes Administrator (CKA) or similar certification.
- Experience with service mesh (Istio, Linkerd), ingress controllers, and API gateways.
-
Cloud Site Reliability Engineer
1 week ago
Chennai, Tamil Nadu, India Ford Global Career Site Full time ₹ 15,00,000 - ₹ 25,00,000 per year
Be at the Forefront of Mobility's Future: Join Ford as a Site Reliability Engineer
Enterprise Technology is the engine driving the future of transportation, and we're looking for a talented Site Reliability Engineer (SRE) to help us redefine mobility. In this role, you'll leverage cutting-edge technology to enhance customer experiences, improve lives, and...
-
Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India NatWest Group Full time
Site Reliability Engineer, AVP
Join us as a Site Reliability Engineer
You'll manage the provision of stable, resilient, reliable applications with the end goal of minimising disruption to Customer & Colleague Journeys (CCJ). We'll look to you to identify and automate manual tasks and implement observability solutions, ensuring a thorough understanding of...
-
Site Reliability Engineer
1 week ago
Chennai, Tamil Nadu, India NatWest Group Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Site Reliability Engineer
Join us as a Site Reliability Engineer
In this key role, you'll support the improvement of non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services. You'll enjoy significant...
-
Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India Elgebra Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Role Overview: We are seeking a highly experienced and technically proficient Site Reliability Engineer (SRE) to join our team in support of our client, Qincline. The ideal candidate will have 7 or more years of dedicated experience in Site Reliability Engineering or a closely related discipline. This pivotal role requires a strong focus on ensuring the...
-
Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India Ford Motor Full time
SRE - Software Engineer
Enterprise Technology plays a critical part in shaping the future of mobility. If you're looking for the chance to leverage advanced technology to redefine the transportation landscape, enhance the customer experience and improve people's lives, this is the opportunity for you. Join us and challenge your IT expertise and analytical...
-
Site Reliability Engineer
2 weeks ago
Chennai, Tamil Nadu, India NatWest Group Full time ₹ 9,00,000 - ₹ 12,00,000 per year
Join us as a Site Reliability Engineer
In this key role, you'll support the improvement of non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services. You'll enjoy significant stakeholder interaction, working in...
-
Site Reliability Engineer III
2 weeks ago
Chennai, Tamil Nadu, India ACV Full time ₹ 1,04,000 - ₹ 1,30,878 per year
ACV's mission is to build and enable the most trusted and efficient digital marketplaces for buying and selling used vehicles with transparency and comprehensive data that was previously unimaginable. We are powered by a combination of the world's best people and the industry's best technology. At ACV, we are driven by an entrepreneurial spirit and...
-
Site Reliability Engineer
1 week ago
Chennai, Tamil Nadu, India Intellect Design Arena Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Job Title: Site Reliability Engineer
Company: Intellect Design Arena Ltd
Location: Chennai, India
Experience Required: 6+ years
Job Type: Full-time
Department: SRE / DevOps / Engineering Enablement
About Intellect Design Arena Ltd
Intellect Design Arena Ltd is a global leader in digital financial technology, offering cutting-edge solutions for banking, insurance,...
-
Site Reliability Engineer
7 days ago
Chennai, Tamil Nadu, India Trimble Full time ₹ 10,000 - ₹ 25,000 per year
Site Reliability Engineer
Cloud Site Reliability Engineer
Reporting to: Sr Manager, Availability Management
Office Location: Chennai, India
Flexible Working: Hybrid (Part Office/Part Home)
Cloud Site Reliability Engineer Responsibilities
AI in Observability: Heavily utilise migration tooling and AI to eliminate key tasks as well as optimising...
-
Site Reliability Engineer
5 days ago
Chennai, Tamil Nadu, India Trimble Full time ₹ 10,00,000 - ₹ 25,00,000 per year
Lead Site Reliability Engineer
Cloud Site Reliability Engineer
Reporting to: Sr Manager, Availability Management
Office Location: Chennai, India
Flexible Working: Hybrid (Part Office/Part Home)
Cloud Site Reliability Engineer Responsibilities
AI in Observability: Heavily utilise migration tooling and AI to eliminate key tasks as well as...