
Site Reliability Engineer
3 days ago
Key Responsibilities
- Cloud Infrastructure (AWS):
  - Design, implement, and manage scalable, resilient, and cost-optimized cloud infrastructure using AWS services (EC2, EKS, Lambda, RDS, S3, CloudFront, IAM, VPC, etc.).
  - Implement Infrastructure as Code (IaC) using tools like Terraform / CloudFormation.
- DevOps & Automation:
  - Build and maintain CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, or AWS CodePipeline) for automated deployments.
  - Automate repetitive tasks to improve development velocity and operational efficiency.
- Observability & Monitoring:
  - Define and implement observability strategy covering monitoring, logging, tracing, and alerting.
  - Work with tools like Prometheus, Grafana, ELK/EFK stack, AWS CloudWatch, Datadog, New Relic, Splunk, or Dynatrace.
  - Establish SLIs, SLOs, and SLAs to measure and improve system reliability.
- Site Reliability Engineering (SRE):
  - Drive incident management processes: detection, alerting, root cause analysis, and postmortems.
  - Apply chaos engineering principles to validate resilience and recovery.
  - Optimize reliability, latency, scalability, and system efficiency.
- Security & Compliance:
  - Implement best practices for cloud security, identity & access management, and compliance frameworks (ISO, SOC2, GDPR, etc.).
  - Ensure observability and monitoring meet security and audit requirements.
- Collaboration & Leadership:
  - Partner with development, QA, and product teams to ensure seamless deployments.
  - Mentor junior engineers and promote a culture of reliability, automation, and continuous improvement.
Required Skills & Qualifications
- 7+ years of professional experience in DevOps, Cloud Infrastructure, or SRE roles.
- Strong expertise in AWS Cloud (certification preferred: AWS Certified DevOps Engineer, Solutions Architect, or SysOps).
- Proficiency in IaC tools (Terraform, CloudFormation).
- Solid experience in CI/CD pipeline tools (Jenkins, GitHub Actions, GitLab CI/CD, AWS CodePipeline).
- Hands-on experience with observability tools: Prometheus, Grafana, CloudWatch, ELK, Datadog, New Relic, Splunk, or similar.
- Deep understanding of SRE principles: SLIs/SLOs, error budgets, incident response, chaos testing.
- Strong scripting/coding experience (Python, Bash, Go, or similar).
- Knowledge of containers & orchestration (Docker, Kubernetes, EKS).
- Familiarity with security best practices in cloud-native environments.
Preferred Skills
- Experience with multi-cloud or hybrid-cloud environments.
- Exposure to resiliency testing & chaos engineering tools (Gremlin, Litmus, Chaos Mesh).
- Knowledge of cost-optimization and FinOps in AWS.
- Excellent communication and stakeholder management skills.
What We Offer
- Opportunity to work on cutting-edge cloud-native architectures.
- A culture focused on automation, reliability, and innovation.
- Growth opportunities with certifications, training, and leadership exposure.

- Site Reliability Engineer
  3 days ago
  Bengaluru, Karnataka, India Enterprise Minds, Inc Full time
  We're Hiring | Site Reliability Engineer | 8-10 years
- Site Reliability Engineer
  15 hours ago
  Bengaluru, Karnataka, India Randstad Full time
  Role: Site Reliability Engineer
  Summary: The Network Engineer 2 provides technical design, planning, operation, maintenance, and advanced troubleshooting of the Bread Financials' network infrastructure. This position ensures continuity and alignment of the network administration/engineering direction. This position supports Bread Financials' strategies and...
- Site Reliability Engineer
  5 days ago
  Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time
  We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high...
- Site Reliability Engineer
  19 hours ago
  Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time
  We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...
- Site Reliability Engineer
  6 days ago
  Bengaluru, Karnataka, India beBeeReliability Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
  Role Overview: As a Site Reliability Engineer, you will play a pivotal role in driving innovation and modernizing complex systems by leveraging cutting-edge technologies and collaboration with cross-functional teams.
- Site Reliability Engineer
  2 days ago
  Bengaluru, Karnataka, India Coforge Full time
  Job Description
  - Design, implement, and maintain scalable infrastructure to ensure high availability and performance of software applications.
  - Collaborate with development teams to identify and resolve issues affecting application performance, stability, and reliability.
  - Develop automated monitoring scripts using tools like Prometheus, Grafana, etc. to...
- Site Reliability Engineering
  2 days ago
  Bengaluru, Karnataka, India Infrasoft Technologies Limited Full time
  Job Title: Developer
  Work Location: Bangalore, Karnataka
  Experience Range: 6-8 Years
  Job Description: We are looking for a skilled Developer with strong hands-on experience in Site Reliability Engineering (SRE), Java, JavaScript, and Production Support. The ideal candidate should have a solid background in application monitoring and troubleshooting...
- Site Reliability Engineer
  2 weeks ago
  Bengaluru, Karnataka, India Collabera Full time
  Job Description: As a Principal/Chief Site Reliability Engineer, you will play a critical role in designing, developing, and maintaining scalable and highly reliable systems. You'll work closely with development teams to improve system reliability, monitor critical applications, and design fail-proof infrastructure.
  Responsibilities: Design and implement...
- Site Reliability Engineer
  4 weeks ago
  Bengaluru, Karnataka, India NatWest Group Full time
  Join us as a Site Reliability Engineer. In this key role, you'll support the improvement of non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services. You'll enjoy significant stakeholder interaction working in...
- Site Reliability Engineer
  4 weeks ago
  Bengaluru, Karnataka, India Tata Technologies Full time
  Job Description: Site Reliability Engineer
  What awaits you / Job Profile: An SRE is responsible for maintaining reliability. That means facilitating automated, streamlined, and efficient error responses and reducing human error at scale. SREs spend a lot of time removing pain points, configuring internal tools, and setting and testing system benchmarks. They also...