Reliability Engineering Manager

1 week ago


Nagpur, Maharashtra, India beBeeCloud Full time ₹ 18,00,000 - ₹ 25,00,000
Cloud Platform Architect

We transform infrastructure and operations into software engineering challenges. Our mission is to build software platforms that enable safe, reliable, and scalable provisioning and management of all services.

Key Responsibilities:
  • Boldly challenge the status quo and drive innovation in building modern technology.
  • Design and architect new solutions, and optimize existing ones, to improve agility in managing infrastructure for hundreds of microservices.

The ideal candidate:

  • Believes in automating DevOps and SRE concerns such as infrastructure provisioning, deployment, observability, the incident lifecycle, and uptime SLAs.
  • Is bold, open-minded, and eager to learn and grow.

Day-to-Day Activities:
  • Work with Kubernetes clusters hosted in cloud environments.
  • Use Infrastructure as Code tooling such as Terraform and Ansible to manage resources.
  • Collaborate with development teams to build software for reliability and scale, and coach them on best practices.
  • Troubleshoot priority incidents, facilitate post-mortems, and ensure closure of incidents.
  • Analyze previous incidents and usage patterns to predict issues and take proactive actions.
  • Build self-healing and resiliency patterns and drive their adoption.
  • Design automated software upgrades, change management, and release management solutions.
  • Own tools and services end to end: design, code, test, and deliver software that automates manual work.
  • Optimize infrastructure for performance and cost.
  • Participate in on-call rotations and 24x7 support coverage as needed.

Must-Haves:

  • Bachelor's degree in information systems, information technology, computer science, or a similar field.
  • 3+ years of professional experience.
  • Experience administering Kubernetes clusters.
  • Experience managing infrastructure as code using Terraform.
  • Direct production operations experience in a cloud environment.
  • Experience contributing to technology strategy.
  • Experience leading capability-building initiatives across diverse areas such as infrastructure automation, observability, incident management, architecting HA systems, and other core engineering disciplines.
  • Demonstrated experience in driving operational efficiency and transparency in a growing organization.


  • Nagpur, Maharashtra, India GXS Bank Full time

    Get to know the Role: We treat infrastructure and operations as software engineering problems. Our mission is to build and advance software platforms that enable the provisioning and management of all Digibank services in safe, reliable, and scalable ways. We consistently challenge the status quo, use new technologies to build platforms and tooling for...


  • Nagpur, Maharashtra, India Natobotics Full time

    We're on an exciting journey with our client and we want you to join us. With our client, you will be exposed to the latest technologies and work with some of the brightest minds in the industry. Our client is a leading banking company, so you will play a key role as VP – Site Reliability Engineering (SRE), assisting with the below: Roles &...


  • Nagpur, Maharashtra, India beBeeSiteReliabilityEngineer Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    About the Position: We seek a seasoned system reliability expert to design and develop scalable service level agreements, indicators, and objectives for high-availability services. The successful candidate will be responsible for designing and implementing automation solutions for routine manual production/non-production operations using technologies like...


  • Nagpur, Maharashtra, India beBeeelkexpert Full time US$ 1,80,000 - US$ 2,60,000

    We're seeking a seasoned Site Reliability Engineer with ELK expertise to spearhead our platform's scalability and reliability. Key Responsibilities:


  • Nagpur, Maharashtra, India beBeeTechnical Full time ₹ 9,60,000 - ₹ 14,40,000

    Executive Technical Lead for Site Reliability Engineering


  • Nagpur, Maharashtra, India beBeeDevOps Full time ₹ 2,00,00,000 - ₹ 3,00,00,000

    Key Roles: As a highly skilled Principal DevOps / Senior SRE Engineer, you will be responsible for ensuring the reliability and scalability of our SaaS platform deployed across tens of thousands of GCP projects and large-scale GKE clusters. This is a hands-on leadership role where you will not only build scalable systems but also guide best practices across...


  • Nagpur, Maharashtra, India beBeeReliability Full time ₹ 15,00,000 - ₹ 25,00,000

    Job Title: System Reliability Engineer. We are seeking an experienced System Reliability Engineer to drive system reliability and performance through automation, cloud infrastructure, and observability solutions. Key Responsibilities: Cloud & Automation: Experience with Python for cloud service automation; skills to integrate cloud services into monitoring...

  • Reliability Expert

    1 week ago


    Nagpur, Maharashtra, India beBeeResilience Full time ₹ 2,00,00,000 - ₹ 2,50,00,000

    About the Role: As a Site Reliability Engineer, you will play a crucial part in ensuring the robustness and efficiency of our systems. Your responsibilities will include: Identifying potential system issues early and implementing preventive measures to enhance system resilience. Automating processes to eliminate manual effort and enable rapid, secure...


  • Nagpur, Maharashtra, India beBeeSystem Full time ₹ 1,84,32,000 - ₹ 2,38,40,000

    Accounting Tech Platform Engineer Role: We are seeking a highly skilled System Reliability Engineer to join our team and contribute to the stability, scalability, and operational excellence of our Accounting and Finance platforms. The ideal candidate will be responsible for building automation, implementing monitoring and logging systems, improving incident...


  • Nagpur, Maharashtra, India beBeeScalability Full time ₹ 1,20,00,000 - ₹ 1,80,00,000

    Job Summary: We are seeking a highly skilled Senior Site Reliability Engineer to join our team. As a Senior SRE, you will be responsible for ensuring the reliability and scalability of our products and processes. In this role, you will work closely with product development teams, Cloud Infrastructure, and other SRE teams to identify and resolve observability...