Distributed System Stability Expert

5 days ago


Bengaluru, Karnataka, India beBeeSystemReliability Full time ₹ 1,04,000 - ₹ 1,30,878
System Reliability Specialist

This role is responsible for ensuring the stability and performance of critical systems and services. The System Reliability Specialist is the first line of defense in incident management and monitoring. The role requires real-time response, proactive problem solving, and strong coordination skills to address production issues efficiently.

  • Monitoring and alerting: proactively monitoring system health, performance, and uptime with tools such as Datadog and Prometheus.
  • Acting as the primary responder for incidents to troubleshoot and resolve issues quickly, ensuring minimal impact on end-users.
  • Accurately categorizing incidents, prioritizing them based on severity, and escalating to L2/L3 teams when necessary.
  • Ensuring systems meet Service Level Objectives (SLOs) and maintain uptime as per SLAs.
  • Collaborating with DevOps and L2 teams to automate manual processes for incident response and operational tasks.
  • Performing root cause analysis (RCA) of incidents using log aggregators and observability tools to identify patterns and recurring issues.
  • Following predefined runbooks/playbooks to resolve known issues, and documenting fixes for new problems.
Required Skills and Qualifications

Our ideal candidate has 4 to 6 years of relevant experience in SRE, DevOps, or Production Support, including hands-on use of monitoring tools (e.g., Prometheus, Datadog).

  • Working knowledge of Linux/Unix operating systems, basic scripting skills (Python, GitLab actions), and cloud platforms (AWS, Azure, or GCP).
  • Familiarity with containers and orchestration (Docker, Kubernetes, Helm charts) and CI/CD pipelines.
  • Exposure to Argo CD for implementing GitOps workflows and automated deployments for containerized applications.
  • Experience with monitoring (Datadog), infrastructure (AWS EC2, Lambda, ECS/EKS, RDS), networking (VPC, Route 53, ELB), and storage (S3, EFS, Glacier).
  • Strong troubleshooting and analytical skills to resolve production incidents effectively.
  • Basic understanding of networking concepts (DNS, Load Balancers, Firewalls).
  • Good communication and interpersonal skills for incident communication and escalation.
  • Preferred certifications: AWS Certified SysOps Administrator - Associate, AWS Certified Solutions Architect - Associate, or AWS Certified DevOps Engineer - Professional.


  • Bengaluru, Karnataka, India beBeeEngineer Full time ₹ 12,91,224 - ₹ 28,36,440

    Reliability Engineer Role Overview: Our Platform Engineering team is seeking a highly skilled Reliability Engineer to ensure the reliability and performance of our systems. As a critical member, you will make data-driven decisions, drive innovation, and collaborate with cross-functional teams. We strive for operational excellence, focusing on availability,...


  • Bengaluru, Karnataka, India beBeeReliability Full time ₹ 9,00,000 - ₹ 13,50,000

    System Reliability Expert: We are seeking a highly skilled System Reliability Expert to join our Technology Team. As a key member of our infrastructure team, you will be responsible for designing, implementing, and maintaining the foundational technology platforms that power all of our applications and businesses. You will play a critical operational role,...

  • Data Engineer

    2 weeks ago


    Bengaluru, Karnataka, India beBeeDataEngineer Full time ₹ 1,50,00,000 - ₹ 2,00,00,000

    Job Title: Data Engineer - Distributed Systems Expert. Job Description: We are seeking a highly skilled Data Engineer to join our organization. The ideal candidate will have extensive experience with Big Data technologies and distributed data processing frameworks. This individual will be responsible for designing, developing, and maintaining large-scale data...


  • Bengaluru, Karnataka, India beBeeInfrastructure Full time ₹ 20,00,000 - ₹ 30,00,000

    Cloud Infrastructure Engineer Opportunity: We're looking for highly skilled engineers with expertise in solving complex problems in distributed systems, virtualized infrastructure, and highly available services. Key Responsibilities: Design, develop, and deploy software to enhance the availability, scalability, and efficiency of cloud products and services. Design...


  • Bengaluru, Karnataka, India beBeeInfrastructure Full time ₹ 1,04,000 - ₹ 1,30,878

    Job Description: Maintaining the stability and performance of large-scale distributed systems, including Elasticsearch clusters and MongoDB installations, is a critical role in operations. Key Responsibilities: Work with one of the largest Elasticsearch cluster deployments, ensuring high availability and minimizing downtime. Maintain services once they are live by...


  • Bengaluru, Karnataka, India beBeeDevelopment Full time ₹ 1,04,000 - ₹ 1,30,878

    Software Development Expert: We are seeking a skilled Software Development Expert to join our team. In this role, you will be responsible for designing and implementing high-quality software solutions that meet the needs of our users. Main Responsibilities: Design and implement scalable, resilient distributed systems by generating software specifications,...


  • Bengaluru, Karnataka, India beBeeInfrastructureManager Full time ₹ 1,20,00,000 - ₹ 2,40,00,000

    Job Overview: A System Operations Expert is needed to lead the management of distributed systems, ensuring optimal performance and high availability. This role will focus on troubleshooting issues across hardware, software, and network layers, optimizing system operations using Python and other scripting tools. Key Responsibilities: Manage proxy infrastructure...


  • Bengaluru, Karnataka, India beBeeEngineering Full time ₹ 12,00,000 - ₹ 15,00,000

    About Distributed Systems Engineering: We are seeking an engineer with a strong foundation in distributed systems to design and build scalable, fault-tolerant systems that power ML applications. Responsibilities: Design and implement components of distributed systems with a focus on reliability, scalability, and performance. Write well-defined abstractions...


  • Bengaluru, Karnataka, India beBeeArchitecture Full time ₹ 15,00,000 - ₹ 20,00,000

    Cloud Native Architecture Lead. About this Role: Design & implement scalable microservices architectures on Kubernetes. Build & optimize large-scale distributed systems ensuring high availability & fault tolerance. Drive adoption of AI-powered development tools. Conduct design & code reviews to ensure best practices. Key Responsibilities: Lead, mentor & grow the...


  • Bengaluru, Karnataka, India beBeeCloud Full time ₹ 1,50,00,000 - ₹ 2,01,00,000

    Senior Systems Architect: We are seeking an experienced senior systems architect to lead the design and implementation of our data management platform. Main Responsibilities: Design and develop massively scalable distributed systems. Lead the development of the core backend of our data platform. Requirements: 3+ years of experience in infrastructure back-end...