Site Reliability Engineering Director

2 weeks ago


Pune, Maharashtra, India beBeeSre Full time ₹ 2,25,00,000 - ₹ 2,62,50,000

Job Title: SRE Lead (Engineering & Reliability)

We are seeking a seasoned and dynamic Site Reliability Engineering leader to oversee the reliability, scalability, and performance of our critical systems.

As an SRE Lead, you will play a pivotal role in establishing and implementing SRE best practices, leading a team of engineers, and driving automation, monitoring, and incident response strategies.

This position combines software engineering and systems engineering expertise to build and maintain high-performing, reliable systems.

Main Responsibilities:

  • Lead efforts to maintain high availability and reliability of critical services
  • Define and monitor SLIs, SLOs, and SLAs to ensure business requirements are met
  • Proactively identify and resolve performance bottlenecks and system inefficiencies
  • Establish and improve incident management processes and on-call rotations
  • Lead incident response and root cause analysis for high-priority outages
  • Drive post-incident reviews and ensure actionable insights are implemented
  • Develop and implement automated solutions to reduce manual operational tasks
  • Enhance system observability through metrics, logging, and distributed tracing tools
  • Optimize CI/CD pipelines for seamless deployments
  • Partner with software engineering teams to improve the reliability of applications and infrastructure
  • Work closely with product/engineering teams to design scalable and robust systems
  • Ensure seamless integration of monitoring and alerting systems across teams
  • Manage, mentor, and grow a team of SREs
  • Promote SRE best practices and foster a culture of reliability and performance across the organization
  • Drive performance reviews, skills development, and career progression for team members
  • Perform capacity planning and implement autoscaling solutions to handle traffic spikes
  • Optimize infrastructure and cloud costs while maintaining reliability and performance

Required Skills and Qualifications:

  • Technical Expertise:
    • Experience with cloud platforms and Kubernetes
    • Hands-on knowledge of infrastructure-as-code tools like Terraform/Helm/Ansible
    • Proficiency in Java
    • Expertise in distributed systems, databases, and load balancing
  • Monitoring & Observability:
    • Proficient with tools like Prometheus/Grafana/Elastic APM or New Relic
    • Understanding of metrics-driven approaches for system monitoring and alerting
  • Automation & CI/CD:
    • Hands-on experience with CI/CD pipelines
    • Skilled in automation frameworks and tools for infrastructure and application deployments
  • Incident Management:
    • Proven track record in handling incidents, post-mortems, and implementing solutions to prevent recurrence
  • Leadership & Communication Skills:
    • Strong people management and leadership skills
    • Excellent problem-solving and decision-making skills
    • Clear and concise communication, with the ability to translate technical concepts for non-technical stakeholders

Benefits:

  • Be a key driver in building and scaling reliable systems in a fast-paced environment
  • Work with cutting-edge technologies and influence the evolution of the infrastructure
  • Lead a high-impact team and foster a culture of reliability and innovation


  • Pune, Maharashtra, India Accelya Group Full time US$ 90,000 - US$ 1,20,000 per year

    For more than 40 years, Accelya has been the industry's partner for change, simplifying airline financial and commercial processes and empowering the air transport community to take better control of the future. Whether partnering with IATA on industry-wide initiatives or enabling digital transformation to simplify airline processes, Accelya drives the...


  • Pune, Maharashtra, India beBeeCloudReliability Full time ₹ 25,00,000 - ₹ 35,00,000

    Job Title: DevOps/Site Reliability Engineer. We are seeking a skilled DevOps/Site Reliability Engineer to optimize our infrastructure and improve the overall quality of our software solutions. Engage in process development, implementation, and measurement for Continuous Integration and Delivery, Site Reliability Engineering, and automation of deployment and...


  • Pune, Maharashtra, India ENGEL Full time ₹ 6,00,000 - ₹ 18,00,000 per year

    Company Description: ENGEL is a global leader in the production of injection moulding machines and their automation. The company produces systems that manufacture plastic parts used in various industries such as automotive, packaging, and consumer goods. With nine production plants worldwide and subsidiaries and representatives in over 85 countries, ENGEL...


  • Pune, Maharashtra, India Ather Energy Full time ₹ 15,00,000 - ₹ 28,00,000 per year

    You'll be our: Site Reliability Engineer. You'll be based at: Pune Zonal Office. You'll be aligned with: Cloud and Data Platform Lead / Cloud Architect. You'll be a member of: Cloud and Data Platform Team. Ather's fleet of smart scooters is growing rapidly, and so is the volume of data they generate. Our Vehicle Data Platform (VDP) is the core of this ecosystem, and...


  • Pune, Maharashtra, India Synechron Full time

    We have an immediate opportunity for a Site Reliability Engineer with 5 to 9 years of experience. Synechron – Pune. Job Role: SRE (Senior Site Reliability Engineer). Job Location: Pune. About Synechron: We began life in 2001 as a small, self-funded team of technology specialists. Since then, we've grown our organization to 14,500+ people, across 58 offices, in 21 countries, in key...


  • Pune, Maharashtra, India Apex One Full time ₹ 7,00,000 - ₹ 12,00,000 per year

    Job Overview: We are looking for a detail-oriented and experienced Site Reliability Engineer to join our team. The Site Reliability Engineer will be responsible for creating and implementing scalable software solutions in order to meet system and application performance goals. You will also be responsible for troubleshooting system errors and resolving any...


  • Pune, Maharashtra, India Fiserv Full time

    Site Reliability Engineering Expert (Architect). Experience range: 9 to 12 years. Location: Pune. Job Description: What does a successful Site Reliability Engineer (SRE) Expert do at Fiserv? The Site Reliability Engineer blends the principles of software engineering with the discipline of operations to create high-performing and reliable software systems....


  • Pune, Maharashtra, India SailPoint Full time US$ 1,25,000 - US$ 1,75,000 per year

    SailPoint is the leader in identity security for the cloud enterprise. Our identity security solutions secure and enable thousands of companies worldwide, giving our customers unmatched visibility into the entirety of their digital workforce, ensuring workers have the right access to do their job – no more, no less.  IdentityNow is SailPoint's Identity...


  • Pune, Maharashtra, India Reveille Technologies Full time

    Job Summary: We are seeking a skilled and proactive Site Reliability Engineer (SRE) with a strong DevOps mindset and hands-on experience in application troubleshooting. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our applications and infrastructure. This role requires a blend of software engineering,...


  • Pune, Maharashtra, India Barclays Full time US$ 90,000 - US$ 1,20,000 per year

    Join us as a Site Reliability Engineer - Linux & KDB – AVP at Barclays. We are seeking a highly skilled and motivated KDB Site Reliability Engineer (SRE) to manage and enhance our KDB infrastructure estate. This role is ideal for someone with a strong background in Linux systems, shell scripting, and hands-on experience in financial services. You will be...