Site Reliability Engineer

2 weeks ago


Bengaluru, Karnataka, India CES Full time

We're looking for a highly skilled Site Reliability Engineer to help us build, manage, and scale modern infrastructure systems for high-availability applications. If you're passionate about automation, cloud platforms, and solving tough operational challenges, we would love to hear from you.

Key Skills and Competencies:

  • 3+ years of extensive experience with Infrastructure as Code (IaC) and Desired State Configuration (DSC) tools like Terraform, CDK, and Chef
  • Experience in packaging, deploying, and managing containerized workloads on Docker and Kubernetes
  • Expertise in managing AWS infrastructure at scale – EC2, S3, ELB, Lambda, Route 53, ECS, SQS, CloudWatch
  • Prior experience working in DevOps or SRE environments
  • Strong automation/scripting skills using PowerShell, Ruby, Go, Python, and Bash
  • Hands-on experience with monitoring and reporting tools – ELK Stack, Dynatrace, New Relic, Nagios
  • Experience with IIS management, performance monitoring, and troubleshooting
  • Background in web farm management for high-traffic SaaS applications
  • Strong problem-solving and root-cause analysis skills
  • Experience working with .NET application architectures – caching, content delivery, high availability, load balancing
  • Familiarity with CI/CD pipelines and tools – TeamCity, Octopus Deploy, GitHub, Jenkins, Codefresh, etc.

Responsibilities:

  • Drive initiatives to improve platform scalability and operational efficiency
  • Lead standardization efforts across engineering and infrastructure teams
  • Identify opportunities to improve and automate deployments, visibility, and management
  • Apply cloud security best practices to ensure infrastructure safety
  • Provide full-stack diagnostics and resolve complex infrastructure issues
  • Track performance metrics and make data-backed improvement decisions
  • Proactively suggest infrastructure or process changes for system reliability
  • Ensure disaster recovery readiness and implement high availability systems
  • Build support workflows and assist with incident response
  • Own and improve the customer experience through system reliability and uptime

Personal Attributes:

  • Passionate about learning and applying new technologies
  • A strong collaborator who believes in team success
  • Excellent communicator – verbal, written, and virtual
  • High integrity and commitment to ethical standards
  • Self-motivated, driven, and detail-oriented
  • Able to work independently on short-term projects


  • Bengaluru, Karnataka, India Enterprise Minds, Inc Full time

    We're Hiring | Site Reliability Engineer | 8-10 years


  • Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time

    We are looking for a L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...


  • Bengaluru, Karnataka, India beBeeReliability Full time ₹ 2,00,00,000 - ₹ 2,50,00,000

    Role Overview: As a Site Reliability Engineer, you will play a pivotal role in driving innovation and modernizing complex systems by leveraging cutting-edge technologies and collaborating with cross-functional teams.


  • Bengaluru, Karnataka, India Coforge Full time

    Job Description: Design, implement, and maintain scalable infrastructure to ensure high availability and performance of software applications. Collaborate with development teams to identify and resolve issues affecting application performance, stability, and reliability. Develop automated monitoring scripts using tools like Prometheus, Grafana, etc. to...


  • Bengaluru, Karnataka, India Infrasoft Technologies Limited Full time

    Job Title: Developer. Work Location: Bangalore, Karnataka. Experience Range: 6-8 Years. Job Description: We are looking for a skilled Developer with strong hands-on experience in Site Reliability Engineering (SRE), Java, JavaScript, and Production Support. The ideal candidate should have a solid background in application monitoring and troubleshooting...


  • Bengaluru, Karnataka, India Collabera Full time

    Job Description: As a Principal/Chief Site Reliability Engineer, you will play a critical role in designing, developing, and maintaining scalable and highly reliable systems. You'll work closely with development teams to improve system reliability, monitor critical applications, and design fail-proof infrastructure. Responsibilities: Design and implement...


  • Bengaluru, Karnataka, India Xebia Full time

    We are seeking an experienced AWS DevOps Engineer with strong expertise in Observability and Site Reliability Engineering (SRE) to design, build, and manage scalable, reliable, and secure cloud environments. The role requires hands-on experience with AWS services, Infrastructure as Code (IaC), CI/CD, monitoring & observability frameworks, and incident...


  • Bengaluru, Karnataka, India NatWest Group Full time

    Join us as a Site Reliability Engineer. In this key role, you'll support the improvement of non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services. You'll enjoy significant stakeholder interaction, working in...


  • Bengaluru, Karnataka, India Tata Technologies Full time

    Job Description: Site Reliability Engineer. What awaits you / Job Profile: An SRE is responsible for maintaining reliability. That means facilitating automated, streamlined, and efficient error responses and reducing human error at scale. SREs spend a lot of time removing pain points, configuring internal tools, and setting and testing system benchmarks. They also...