Site Reliability Engineering Manager

2 months ago


Bengaluru, India CloudBees Full time

Job Title - Manager, Site Reliability Engineer

Location - Bangalore and Chennai

Years of Experience - 10+ Years


About CloudBees

CloudBees is the leading software delivery platform that enables enterprises to deliver scalable, compliant, and secure software, empowering developers to do their best work.


Seamlessly integrating into any hybrid and heterogeneous environment, CloudBees is more than a tool—it's a strategic partner in your cloud transformation journey, ensuring security, compliance, and operational efficiency while enhancing the developer experience across your entire software development lifecycle. It allows developers to bring and execute their code anywhere, providing greater flexibility and freedom through fast, self-serve, and secure workflows.


CloudBees supports organizations at every step of their DevSecOps journey, whether they are running Jenkins on-premises or moving software delivery to the cloud and looking to accelerate their cloud transformation by years. CloudBees is helping customers build the future, today.


About the Role

As an SRE Manager at CloudBees, you will be an essential contributor to the development of our industry-leading software products. You'll work within the SaaS Platform team to manage, design, develop, and deliver high-quality solutions to achieve high availability and performance of our systems.


What You'll Do

  • Lead efforts to design, implement, and manage highly available, scalable, and fault-tolerant systems and services.
  • Drive the automation of processes, deployments, monitoring, and incident response to improve efficiency and reliability.
  • Collaborate with development teams to ensure that architecture and applications are designed with scalability, reliability, and cost in mind.
  • Develop and maintain monitoring, alerting, and logging solutions to proactively identify and address performance issues and outages.
  • Participate in follow-the-sun on-call rotations, responding to incidents, conducting post-incident reviews, and contributing to incident response improvements.
  • Analyze system performance data, identify bottlenecks, and recommend solutions to optimize performance and resource utilization.
  • Contribute to the design and implementation of disaster recovery strategies and backup solutions.
  • Mentor and provide guidance to junior SREs and other team members, fostering a culture of continuous learning and improvement.
  • Stay current with industry trends, emerging technologies, and best practices to drive innovation and improvements in system reliability.

Requirements

  • Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
  • Bachelor's degree in Computer Science, Engineering, or related field (or equivalent practical experience).
  • 10+ years of experience in Site Reliability Engineering or a similar role, including at least two years in a leadership position, with a proven track record of managing complex systems in a production environment.
  • Proficiency in programming/scripting languages such as Go, Python, or similar.
  • Strong experience with cloud platforms (e.g., AWS, Google Cloud, Azure) and infrastructure-as-code tools (e.g., Terraform, CloudFormation).
  • Solid knowledge of networking concepts, including load balancing, DNS, routing, and security.
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog).
  • Strong problem-solving skills and the ability to troubleshoot complex issues under pressure.
  • Excellent communication and collaboration skills to work effectively across teams.
  • Experience with CI/CD pipelines and version control systems (e.g., Jenkins, GitHub Actions).
  • Relevant certifications (e.g., AWS Certified DevOps Engineer, Google Professional DevOps Engineer) are a plus.
  • Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
  • Passion for reliability, demonstrated through participation in architectural design.
  • Proven ability to lead and guide technical projects and initiatives.



  • Bengaluru, India First American (India) Full time

    The Role: An SRE Manager is ultimately responsible for system reliability, developer productivity and reducing time to market by striving to reduce technical debt of the services your SRE team supports. We seek managers who are passionate about site reliability to influence and drive the strategic SRE mission. As a Site Reliability Engineering Manager working...


  • Bengaluru, India Ensono Full time

    About Role: Ensono is continuing its growth and building a cloud-native managed service offering for our clients. We are looking for energetic and skilled remote Site Reliability Engineers to join us on this exciting new journey. As a Site Reliability Engineer, you and your team will be responsible for between four and ten of Ensono cloud-native managed...


  • Bengaluru, India Cricbuzz.com Full time

    Site Reliability Engineer We are looking for a highly skilled and motivated Web Server Site Reliability Engineer to join our team. As a Web Server Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our web server infrastructure and CDN services. Experience - 3 - 5 years Responsibilities: ●...


  • Bengaluru, Karnataka, India Cyitechsearch Full time

    We are hiring for Site Reliability Engineer Skills : Develop and provide operational support for fullstack software applications. Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation. Five years' experience as a site reliability engineer or similar role. Collaborate with development operations staff to create,...


  • Bengaluru, India Kunato Full time

    Site Reliability Engineer (SRE) - Python/Golang. Job Description: We are seeking a highly skilled and passionate Site Reliability Engineer (SRE) to join our technology team. The ideal candidate will possess strong programming skills with expertise in Python, Golang, or both. This role is pivotal in ensuring the high availability, performance, and security of...

