Site Reliability Leader

3 days ago


Vellore, Tamil Nadu, India beBeeCloudReliabilityEngineer Full time ₹ 1,80,00,000 - ₹ 2,00,00,000
Senior Cloud Reliability Engineer

Our ideal candidate owns the reliability of our cloud-based SaaS solution on Azure.

  • Define customer-centric Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for Tier-0/Tier-1 services. Publish, review quarterly, and align teams to them.
  • Implement error budget policies with multi-window, multi-burn-rate alerts, clear runbooks, and explicit paging thresholds.
  • Gate changes by budget status (freeze/relax rules) wired into Continuous Integration/Continuous Deployment (CI/CD).
  • Maintain SLO and error budget (EB) dashboards (Azure Monitor, Grafana/Prometheus, App Insights). Run weekly SLO reviews with engineering/product.
  • Drive roadmap tradeoffs when budgets are at risk; land reliability epics.
  • Lead SEV1/SEV2 incidents without drama: Own comms, run blameless postmortems, and make corrective actions stick.
  • Engineer reliability in: Multi-AZ/multi-region patterns (active-active/DR), Pod Disruption Budgets (PDBs) and Pod Topology Spread Constraints, Horizontal Pod Autoscaler/Virtual Machine Scale Sets/Cluster Autoscaler, resilient rollout/rollback.
  • Harden Kubernetes clusters (network, identity, policy); optimize node/pod density and ingress (AGIC/NGINX); service mesh optional.
  • Ensure observability that works: Metrics/traces/logs with Azure Monitor/App Insights, Log Analytics, Prometheus/Grafana, OpenTelemetry. Alert on symptoms, not noise.
  • Implement Infrastructure as Code (IaC) & automation: Terraform/Bicep modules, GitOps (Flux/Argo), policy-as-code (Azure Policy/Open Policy Agent). No snowflakes.
  • Guarantee CI/CD reliability: Azure DevOps/GitHub Actions with canary/blue-green, progressive delivery, auto-rollback, Key Vault-backed secrets.
  • Maximize capacity & performance: Load testing, right-sizing, autoscaling; partner with FinOps to reduce spend without hurting SLOs.
  • Deliver Disaster Recovery you can trust: Define RTO/RPO, test backup and restore, run game days/chaos drills, validate Azure Site Recovery (ASR) and multi-region failover.
  • Ensure security by default: Entra ID (Azure AD), managed identities, Key Vault rotation, VNets/NSGs/Private Link, shift-left checks in CI.
  • Reduce toil: Automate recurring ops, build self-service runbooks/chatops, publish golden paths for product teams.
  • Handle customer escalations: Be the technical owner on calls; communicate tradeoffs and recovery plans with authority.
  • Document architectures, runbooks, postmortems, and SLIs/SLOs, and keep them current and discoverable.


  • Vellore, Tamil Nadu, India beBeeReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    Job Summary: We are seeking a highly skilled Site Reliability Specialist to join our team in an exciting opportunity. The ideal candidate will have 7-12 years of experience in technical support or engineering, preferably in AI/ML/GenAI environments. Proven expertise in GenAI models (e.g., GPT, Claude, PaLM2, Llama2) and frameworks (e.g., RAG, Agents,...


  • Vellore, Tamil Nadu, India beBeeReliability Full time ₹ 2,00,00,000 - ₹ 2,50,00,000

    Site Reliability Engineering Lead: Serving as a key technical authority, you will oversee the reliability, scalability, and performance of our critical systems. This role combines software engineering and systems engineering expertise to build and maintain high-performing, reliable systems. About this Role. Reliability & Performance: Maintain high availability and...


  • Vellore, Tamil Nadu, India beBeeExcellence Full time ₹ 1,50,00,000 - ₹ 2,00,00,000

    Job Opportunity: Site Excellence Leader. Elevate site performance and embed a culture of innovation as our ideal candidate leads transformation and drives improvement initiatives. About the Role: This is a chance to shape how we work, think, and grow. As our Site Excellence Leader, you will be part of a team that values ingenuity, collaboration, and principled...


  • Vellore, Tamil Nadu, India beBeeReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    Site Reliability Engineering Role Overview: The primary objective of this role is to design and architect solutions that ensure the reliability, scalability, and stability of software platforms. To achieve this, you will collaborate with engineering teams throughout the development lifecycle, leveraging your expertise in site reliability engineering best...


  • Vellore, Tamil Nadu, India beBeeEngineering Full time ₹ 1,80,00,000 - ₹ 2,40,00,000

    Job Description: As a site reliability engineer, you will play a crucial role in ensuring the digital backbone runs seamlessly for millions of customers. Key Responsibilities: Engineer Reliability: Identify potential system issues early and implement preventive measures to minimize downtime and maximize uptime. Automate for Speed: Build tools, pipelines,...


  • Vellore, Tamil Nadu, India Xebia Full time

    We are looking for a highly skilled AWS Engineer with strong Python development and Chaos Engineering expertise to design, build, and validate resilient, scalable, and automated cloud-native environments. The ideal candidate will combine cloud engineering, DevOps, and chaos experimentation to improve reliability, fault tolerance, and operational efficiency...


  • Vellore, Tamil Nadu, India beBeeDevops Full time US$ 90,000 - US$ 1,25,000

    Job Overview: We are seeking an experienced Reliability Operations Specialist to join our team. This role involves designing, implementing, and maintaining scalable monitoring, alerting, and logging solutions to ensure the availability and performance of backend services. In this position, you will work closely with development teams to design and support...


  • Vellore, Tamil Nadu, India beBeeSRE Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    Senior System Reliability Expert Opportunity: We are seeking an experienced Senior Site Reliability Engineer to ensure the reliability and performance of our systems.


  • Vellore, Tamil Nadu, India beBeeSite Full time ₹ 11,00,000 - ₹ 15,40,000

    Job Title: Site Reliability Engineer. This role focuses on delivering highly reliable financial applications and data services that meet the demanding requirements of accuracy, compliance, and availability supporting business operations.


  • Vellore, Tamil Nadu, India beBeeSiteReliabilityEngineer Full time ₹ 27,00,000 - ₹ 36,00,000

    Job Summary: We are seeking a seasoned Site Reliability Engineer to join our team. The successful candidate will be responsible for designing, implementing, and maintaining the reliability and scalability of our systems.