Lead Site Reliability Engineer

4 weeks ago


Bengaluru, Karnataka, India | Landmark Group | Full time

Job Title: SRE Lead (Engineering & Reliability)

Job Summary:

We are seeking an experienced and dynamic Site Reliability Engineering (SRE) Lead to oversee the reliability, scalability, and performance of our critical systems. As an SRE Lead, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and driving automation, monitoring, and incident response strategies. This position combines software engineering and systems engineering expertise to build and maintain high-performing, reliable systems.

Experience: 6+ years

Key Responsibilities:

Reliability & Performance:

  • Lead efforts to maintain high availability and reliability of critical services.
  • Define and monitor SLIs, SLOs, and SLAs to ensure business requirements are met.
  • Proactively identify and resolve performance bottlenecks and system inefficiencies.

Incident Management & Response:

  • Establish and improve incident management processes and on-call rotations.
  • Lead incident response and root cause analysis for high-priority outages.
  • Drive post-incident reviews and ensure actionable insights are implemented.

Automation & Tooling:

  • Develop and implement automated solutions to reduce manual operational tasks.
  • Enhance system observability through metrics, logging, and distributed tracing tools (e.g., Prometheus, Grafana, Elastic APM).
  • Optimize CI/CD pipelines for seamless deployments.

Collaboration:

  • Partner with software engineering teams to improve the reliability of applications and infrastructure.
  • Work closely with product and engineering teams to design scalable and robust systems.
  • Ensure seamless integration of monitoring and alerting systems across teams.

Leadership & Team Building:

  • Manage, mentor, and grow a team of SREs.
  • Promote SRE best practices and foster a culture of reliability and performance across the organization.
  • Drive performance reviews, skills development, and career progression for team members.

Capacity Planning & Cost Optimization:

  • Perform capacity planning and implement autoscaling solutions to handle traffic spikes.
  • Optimize infrastructure and cloud costs while maintaining reliability and performance.

Skills & Qualifications:

Required Skills:

  • Technical Expertise:
    • Experience with cloud platforms (AWS, Azure, or GCP) and Kubernetes.
    • Hands-on knowledge of infrastructure-as-code tools such as Terraform, Helm, or Ansible.
    • Proficiency in Java.
    • Expertise in distributed systems, databases, and load balancing.
  • Monitoring & Observability:
    • Proficiency with tools such as Prometheus, Grafana, Elastic APM, or New Relic.
    • Understanding of metrics-driven approaches to system monitoring and alerting.
  • Automation & CI/CD:
    • Hands-on experience with CI/CD pipelines (e.g., Jenkins, Azure Pipelines).
    • Skill with automation frameworks and tools for infrastructure and application deployments.
  • Incident Management:
    • Proven track record in handling incidents, running post-mortems, and implementing solutions to prevent recurrence.

Leadership & Communication Skills:

  • Strong people management and leadership skills with the ability to inspire and motivate teams.
  • Excellent problem-solving and decision-making skills.
  • Clear and concise communication, with the ability to translate technical concepts for non-technical stakeholders.

Preferred Skills:

  • Experience with database optimization, Kafka, or other messaging systems.
  • Knowledge of autoscaling techniques.
  • Previous experience in an SRE, DevOps, or infrastructure engineering leadership role.
  • Understanding of compliance and security best practices in distributed systems.

Why Join Us?

  • Be a key driver in building and scaling reliable systems in a fast-paced environment.
  • Work with cutting-edge technologies and influence the evolution of the infrastructure.
  • Lead a high-impact team and foster a culture of reliability and innovation.

