
Senior Site Reliability Engineer
4 weeks ago
Senior Site Reliability Engineer (SRE) – Job Description
Key Responsibilities
SRE & Application Reliability
- Implement and tune SLOs/SLIs, build reliability dashboards, and respond to incidents using Grafana IRM, JSM, and escalation workflows.
- Monitor application performance and availability across Kubernetes clusters using Grafana, Prometheus, Loki, Mimir, and Tempo.
- Participate in on-call rotation, postmortems, and continual improvement processes.
Application Support & Troubleshooting
- Act as the primary escalation point for production issues, whether internal or client-facing.
- Monitor logs, traces, and alerts to proactively identify and resolve incidents.
- Debug issues across the stack: Kubernetes, Helm releases, application logs, API errors, database bottlenecks.
- Coordinate with development, QA, and client teams to ensure timely and effective resolution of issues.
DevOps & Infrastructure Automation
- Implement GitOps workflows using FluxCD and ArgoCD to manage Kubernetes deployments.
- Manage and maintain infrastructure-as-code using Terraform and Terragrunt, with Azure as the preferred cloud.
- Automate CI/CD pipelines with GitHub Actions for Docker image builds, Helm-based deployments, release tagging, etc.
Post-QA & Release Validation
- Work closely with QA engineers to validate release branches, tag images, and verify integration across services.
- Test application functionality after deployments (sanity and product functional tests).
- Assist in defining performance benchmarks (e.g., pgbench for PostgreSQL clusters) and validating pre-production readiness.
Must-Have Qualifications
- 6–8 years of experience in DevOps, SRE, or Production Support roles.
- Strong hands-on experience with Azure, Kubernetes (AKS preferred), and Helm/Kustomize.
- Solid knowledge of GitHub Actions, GitOps (FluxCD/ArgoCD), and Terraform/Terragrunt.
- Experience with monitoring/logging stacks (Grafana, Prometheus, Loki, Tempo, Mimir) and incident response tools.
- Experience debugging microservices written in Node.js, Go, or similar.
- Excellent troubleshooting and debugging skills across the stack.
-
Senior Site Reliability Engineer
3 weeks ago
Bengaluru, Karnataka, India Akamai Full time
Job Category: Site Reliability. Would you like to lead modernization initiatives while building a public cloud platform from scratch? Would you like to own critical services in a new public cloud platform? Join our IaaS Site Reliability Engineering (SRE) team. We design, develop, and operate infrastructure and services that power the backbone of our...
-
Site Reliability Engineer
4 weeks ago
Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time
We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...
-
Site Reliability Engineer
3 weeks ago
Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time
We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...
-
Site Reliability Engineer
1 week ago
Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time ₹ 9,00,000 - ₹ 12,00,000 per year
We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system availability,...
-
Senior Site Reliability Engineer
4 days ago
Bengaluru, Karnataka, India Saviynt Full time ₹ 12,00,000 - ₹ 36,00,000 per year
About the job: Saviynt's AI-powered identity platform manages and governs human and non-human access to all of an organization's applications, data, and business processes. Customers trust Saviynt to safeguard their digital assets, drive operational efficiency, and reduce compliance costs. Built for the AI age, Saviynt is today helping organizations safely...
-
Senior Site Reliability Engineer
6 days ago
Bengaluru, Karnataka, India Saviynt Full time ₹ 15,00,000 - ₹ 25,00,000 per year
About the job: Saviynt's AI-powered identity platform manages and governs human and non-human access to all of an organization's applications, data, and business processes. Customers trust Saviynt to safeguard their digital assets, drive operational efficiency, and reduce compliance costs. Built for the AI age, Saviynt is today helping organizations safely...
-
Site Reliability Engineer
4 days ago
Bengaluru, Karnataka, India AppHelix Full time ₹ 9,00,000 - ₹ 12,00,000 per year
Role Description: This is a full-time on-site role located in Bengaluru for a Site Reliability Engineer. The Site Reliability Engineer will be responsible for maintaining and improving the reliability of AppHelix's systems. Daily tasks include monitoring system performance, troubleshooting issues, managing infrastructure, and supporting software development....
-
Site Reliability Engineer
2 days ago
Bengaluru, Karnataka, India HireAlpha Full time ₹ 8,00,000 - ₹ 24,00,000 per year
We're Hiring | Senior Site Reliability Engineer (SRE) | Bangalore | Hybrid | Permanent Role. Are you ready to help shape the future of cloud contact centers? We're building scalable, reliable, and cutting-edge infrastructure for world-class customer experiences, and we're looking for a Senior SRE to join our team. What you'll do: Lead efforts in building a seamless ...
-
Site Reliability Engineer
14 hours ago
Bengaluru, Karnataka, India Luxoft Full time ₹ 20,00,000 - ₹ 25,00,000 per year
Project description: Luxoft partners with a next-generation digital bank, built from the ground up to deliver seamless, secure, and scalable financial services. Our platform is cloud-native, API-first, and focused on reliability, speed, and security. We are growing fast and looking for top-tier Site Reliability / Ops Engineers to join our core team and help run...
-
Senior Site Reliability Engineer
2 days ago
Bengaluru, Karnataka, India Aerospike Full time ₹ 15,00,000 - ₹ 20,00,000 per year
About Aerospike: Aerospike is the real-time database for mission-critical use cases and workloads, including machine learning, generative, and agentic AI. Aerospike powers millions of transactions per second with millisecond latency, at a fraction of the total cost of ownership compared to other databases. Global leaders, including Adobe, Airtel, Barclays,...