Site Reliability Leader

6 days ago


Hyderabad, Telangana, India | beBeeReliability | Full time | ₹ 1,50,00,000 - ₹ 2,50,00,000
Job Title: Lead Site Reliability Engineer

About the Role: This role focuses on ensuring platform and application availability, scalability, and reliability.

Key Responsibilities:
  • Build, monitor, and maintain highly scalable deployments.
  • Deploy new releases and provision environments for applications.
  • Proactively monitor systems and applications, and develop monitoring tools and dashboards.
  • Lead incident response efforts, diagnose root causes, and implement long-term solutions to prevent recurrence. Ensure effective communication during outages.
  • Work closely with cross-functional teams to ensure efficient platform integration, API management, and campaign execution, providing technical guidance and support as needed.
  • Troubleshoot and perform root cause analysis to quickly resolve incidents in crisis situations.
  • Monitor performance metrics and implement corrective actions when necessary to ensure high availability of production environments.
  • Manage and oversee API integrations, ensuring seamless interoperability between systems and third-party services.
  • Ensure compliance and security integrity of environments.
  • Adhere to process compliance and ensure platform reliability.
  • Experience with monitoring and automation using Prometheus, Grafana, ELK, Datadog, Dynatrace, or other observability tools.
  • Experience with container management and microservices architectures, including Docker, on cloud or on-premises infrastructure.

Requirements:
  • Kubernetes expertise in creating, maintaining, scaling, and upgrading production clusters.
  • Docker experience, including writing Dockerfiles that follow industry best practices.
  • Hands-on experience with Azure DevOps or Jenkins in creating and executing pipelines in a multi-target environment.
  • Analytical troubleshooting skills, with expertise in logging stacks such as ELK, Dynatrace, and Splunk.
  • Expertise in monitoring stacks using Grafana, including building and managing dashboards across various data sources.
  • Scripting and automation skills in Bash and Ansible, with some exposure to Terraform.
  • Linux environment skills, including troubleshooting at the OS level.
  • Experience with project management tools such as Jira.
  • Experience deploying and managing distributed queuing systems such as Redis, Kafka, RabbitMQ, IBM MQ, and MSMQ.
  • Database deployment and management in standalone and cluster modes, with basic database skills in Postgres, MySQL, and ClickHouse.
  • Experience working on high traffic and highly scalable platforms is an added advantage.
  • Good understanding of Linux networking concepts (TLS/SSL, DNS, load balancers, etc.) and troubleshooting skills in large-scale environments.
  • Deep understanding of security concepts: authentication, authorization, signing, encryption, SSL/TLS, SSH/SFTP, and X.509 certificates.
  • Knowledge of ITIL terminology for incident and problem management.
  • Excellent interpersonal, analytical, and communication skills.
  • Bachelor of Science in Computer Science or related discipline.

Why Join Us?
  • Impactful Work: Play a pivotal role in safeguarding assets, data, and reputation in the industry.
  • Tremendous Growth Opportunities: Be part of a rapidly growing company in the telecom and CPaaS space, with opportunities for professional development.
  • Innovative Environment: Work alongside a world-class team in a challenging and fun environment where innovation is celebrated.

We are an equal opportunity employer.

