Site Reliability Engineer

2 days ago


Bengaluru, Karnataka, India Xebia Full time ₹ 15,00,000 - ₹ 25,00,000 per year

We are seeking an experienced AWS DevOps Engineer with strong expertise in Observability and Site Reliability Engineering (SRE) to design, build, and manage scalable, reliable, and secure cloud environments. The role requires hands-on experience with AWS services, Infrastructure as Code (IaC), CI/CD, monitoring and observability frameworks, and incident response practices to ensure high availability, performance, and resilience of business-critical systems.

Key Responsibilities

  • Cloud Infrastructure (AWS):
      • Design, implement, and manage scalable, resilient, and cost-optimized cloud infrastructure using AWS services (EC2, EKS, Lambda, RDS, S3, CloudFront, IAM, VPC, etc.).
      • Implement Infrastructure as Code (IaC) using tools like Terraform / CloudFormation.
  • DevOps & Automation:
      • Build and maintain CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, or AWS CodePipeline) for automated deployments.
      • Automate repetitive tasks to improve development velocity and operational efficiency.
  • Observability & Monitoring:
      • Define and implement observability strategy covering monitoring, logging, tracing, and alerting.
      • Work with tools like Prometheus, Grafana, ELK/EFK stack, AWS CloudWatch, Datadog, New Relic, Splunk, or Dynatrace.
      • Establish SLIs, SLOs, and SLAs to measure and improve system reliability.
  • Site Reliability Engineering (SRE):
      • Drive incident management processes: detection, alerting, root cause analysis, and postmortems.
      • Apply chaos engineering principles to validate resilience and recovery.
      • Optimize reliability, latency, scalability, and system efficiency.
  • Security & Compliance:
      • Implement best practices for cloud security, identity & access management, and compliance frameworks (ISO, SOC2, GDPR, etc.).
      • Ensure observability and monitoring meet security and audit requirements.
  • Collaboration & Leadership:
      • Partner with development, QA, and product teams to ensure seamless deployments.
      • Mentor junior engineers and promote a culture of reliability, automation, and continuous improvement.

Required Skills & Qualifications

  • 7+ years of professional experience in DevOps, Cloud Infrastructure, or SRE roles.
  • Strong expertise in AWS Cloud (certification preferred: AWS Certified DevOps Engineer, Solutions Architect, or SysOps).
  • Proficiency in IaC tools (Terraform, CloudFormation).
  • Solid experience in CI/CD pipeline tools (Jenkins, GitHub Actions, GitLab CI/CD, AWS CodePipeline).
  • Hands-on with observability tools: Prometheus, Grafana, CloudWatch, ELK, Datadog, New Relic, Splunk, or similar.
  • Deep understanding of SRE principles: SLIs/SLOs, error budgets, incident response, chaos testing.
  • Strong scripting/coding experience (Python, Bash, Go, or similar).
  • Knowledge of containers & orchestration (Docker, Kubernetes, EKS).
  • Familiarity with security best practices in cloud-native environments.

Preferred Skills

  • Experience with multi-cloud or hybrid-cloud environments.
  • Exposure to resiliency testing & chaos engineering tools (Gremlin, Litmus, Chaos Mesh).
  • Knowledge of cost-optimization and FinOps in AWS.
  • Excellent communication and stakeholder management skills.

What We Offer

  • Opportunity to work on cutting-edge cloud-native architectures.
  • A culture focused on automation, reliability, and innovation.
  • Growth opportunities with certifications, training, and leadership exposure.


  • Bengaluru, Karnataka, India AppHelix Full time ₹ 9,00,000 - ₹ 12,00,000 per year

    Role Description: This is a full-time on-site role located in Bengaluru for a Site Reliability Engineer. The Site Reliability Engineer will be responsible for maintaining and improving the reliability of AppHelix's systems. Daily tasks include monitoring system performance, troubleshooting issues, managing infrastructure, and supporting software development....


  • Bengaluru, Karnataka, India FIS Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    About the Role: Site Reliability Engineer (SRE) with deep expertise in Mainframe technologies like COBOL, JCL, etc. to support and enhance our Card Management & Payment processing functions. This role will be responsible for ensuring reliability, high availability, scalability, stability and performance of mission-critical mainframe software applications and...


  • Bengaluru, Karnataka, India eBay Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    At eBay, we're more than a global ecommerce leader; we're changing the way the world shops and sells. Our platform empowers millions of buyers and sellers in more than 190 markets around the world. We're committed to pushing boundaries and leaving our mark as we reinvent the future of ecommerce for enthusiasts. Our customers are our compass, authenticity...


  • Bengaluru, Karnataka, India ViewSonic Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Job Requirements: Bachelor's degree in Computer Science, Engineering, or a related field. 3+ years of experience in a relevant role, such as Site Reliability Engineer, DevOps Engineer, or similar, is preferred but not mandatory. Basic understanding of AWS solutions including EC2, S3, CloudWatch, Lambda, and RDS. Interest and understanding of Platform Engineering...


  • Bengaluru, Karnataka, India HDFC Limited Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Hiring for Lead / Sr Site Reliability Engineer for Mumbai & Bangalore Location. Experience Years. Job Purpose: Analysing, troubleshooting, and designing vital services, platforms, and infrastructure on GCP while always thinking about reliability, scalability, resilience, security, and performance. Job Responsibilities: Help build a Site Reliability Engineering...


  • Bengaluru, Karnataka, India Visa Inc. Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Job Description: We seek a Site Reliability Engineer, working in the Product Reliability Engineering function, who will: Perform day-to-day site reliability engineering functions including maintenance and incident resolution for all debit applications, products, and services including debit, prepaid, and risk lines of business. Perform ongoing/proactive...


  • Bengaluru, Karnataka, India NatWest Group Full time ₹ 9,00,000 - ₹ 12,00,000 per year

    Site Reliability Engineer. Join us as a Site Reliability Engineer. In this key role, you'll support the improvement of non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services. You'll enjoy significant...


  • Bengaluru, Karnataka, India Funic Tech Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Job Title: Site Reliability Engineer (SRE). Experience Required: 7 Years. Location: Bangalore / Chennai. Employment Type: Full-Time. Work Mode: Onsite. Role Overview: We are seeking a highly skilled Site Reliability Engineer (SRE) with 7 years of experience to ensure the reliability, scalability, and performance of our systems. The ideal candidate will bring...


  • Bengaluru, Karnataka, India PROGRESS SOFTWARE Full time ₹ 6,00,000 - ₹ 12,00,000 per year

    Job Description: Site Reliability Engineer. Hybrid. Hyderabad, India; Bengaluru, India. DevOps. Job Summary: We are Progress (Nasdaq: PRGS), the trusted provider of software that enables our customers to develop, deploy and manage responsible, AI-powered applications and experiences with agility and ease. We're proud to have a diverse, global team...


  • Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time ₹ 9,00,000 - ₹ 12,00,000 per year

    We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system availability,...