
Site Reliability Champion
2 weeks ago
Senior Site Reliability Engineer II
Job Summary
We are seeking a highly skilled Senior Site Reliability Engineer II to join our team. As a key member of our SRE team, you will be responsible for ensuring the reliability and performance of our SaaS platform on Azure.
Key Responsibilities
- Define customer-centric SLIs/SLOs for Tier-0/Tier-1 services; publish them, review them quarterly, and align teams to them
- Run the error-budget policy with multi-window, multi-burn-rate alerts, clear runbooks, and paging thresholds
- Gate changes by budget status (freeze/relax rules) wired into CI/CD
- Maintain SLO/EB dashboards (Azure Monitor, Grafana/Prometheus, App Insights). Run weekly SLO reviews with engineering/product.
- Drive roadmap tradeoffs when budgets are at risk; land reliability epics
- Incidents without drama: Lead SEV1/SEV2, own comms, run blameless postmortems, and make corrective actions stick
- Engineer reliability in: Multi-AZ/region patterns (active-active/DR), PDBs/Pod Topology Spread, HPA/VPA/KEDA, resilient rollout/rollback
- AKS at scale: Harden clusters (network, identity, policy), optimize node/pod density, ingress (AGIC/Nginx); mesh optional
- Observability that works: Metrics/traces/logs with Azure Monitor/App Insights, Log Analytics, Prometheus/Grafana, OpenTelemetry. Alert on symptoms, not noise
- IaC & policy: Terraform/Bicep modules, GitOps (Flux/Argo), policy-as-code (Azure Policy/OPA Gatekeeper). No snowflakes
- CI/CD reliability: Azure DevOps/GitHub Actions with canary/blue-green, progressive delivery, auto-rollback, Key Vault-backed secrets
- Capacity & performance: Load testing, right-sizing, autoscaling; partner with FinOps to reduce spend without hurting SLOs
- DR you can trust: Define RTO/RPO, test backups/restore, run game days/chaos drills, validate ASR and multi-region failover
- Secure by default: Entra ID (Azure AD), managed identities, Key Vault rotation, VNets/NSGs/Private Link, shift-left checks in CI
- Reduce toil: Automate recurring ops, build self-service runbooks/chatops, publish golden paths for product teams
- Customer escalations: Be the technical owner on calls; communicate tradeoffs and recovery plans with authority
- Document to scale: Architectures, runbooks, postmortems, SLIs/SLOs—kept current and discoverable
- (If applicable) Streaming/ETL reliability: Apply SRE practices (SLOs, backpressure, idempotency, replay) to NiFi/Flink/Kafka/Redpanda data flows
Qualifications
- Bachelor's in CS/Engineering (or equivalent experience)
- 12+ years in production ops/platform/SRE, including 5+ years on Azure
- PostgreSQL (must-have): Deep operational expertise incl. HA/DR, logical/physical replication, performance tuning (indexes/EXPLAIN/ANALYZE, pg_stat_statements), autovacuum strategy, partitioning, backup/restore testing, and connection pooling (pgBouncer)
- Azure core: AKS (must-have); Front Door/App Gateway, API Management, VNets/NSGs/Private Link, Storage, Key Vault, Redis, Service Bus/Event Hubs
- Observability: Azure Monitor/App Insights, Log Analytics, Prometheus/Grafana; SLO design and error-budget operations
- IaC/automation: Terraform and/or Bicep; PowerShell and Python; GitOps (Flux/Argo). Pipelines in Azure DevOps or GitHub Actions
- Proven incident leadership at scale, blameless postmortems, and SLO/error-budget governance with change gating
- Mentorship and crisp written/verbal communication
- Apache NiFi, Apache Flink, Apache Kafka or Redpanda (self-managed on AKS or managed equivalents); schema management, exactly-once semantics, backpressure, dead-letter/replay patterns
- Azure Solutions Architect Expert, CKA/CKAD
- ITSM (ServiceNow), on-call tooling (PagerDuty/Opsgenie)
- Compliance/SecOps (SOC 2, ISO 27001), policy-as-code, workload identity
- OpenTelemetry, eBPF tooling, or service mesh
- Multi-tenant SaaS and cost optimization at scale
-
Site Reliability Specialist
2 weeks ago
Tirupati, Andhra Pradesh, India beBeeReliability Full time ₹ 18,40,000 - ₹ 26,40,000
Job Title: Site Reliability Engineer
We are seeking a skilled Site Reliability Engineer to join our team. The ideal candidate will have expertise in ensuring the reliability, scalability, and performance of our systems. This includes identifying potential issues early, implementing preventive measures, and boosting system resilience. This role requires a strong...
-
Reliability Engineering Leader
2 weeks ago
Tirupati, Andhra Pradesh, India beBeeReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Job Title: Reliability Engineering Leader
A pivotal role in the reliability engineering function, ensuring infrastructure robustness and optimal operational efficiency. The Reliability Engineering Manager will spearhead a team of Site Reliability Engineers, focusing on establishing and implementing organizational reliability strategies, aligning SLAs, SLOs,...
-
Site Reliability Engineer
3 weeks ago
Tirupati, Andhra Pradesh, India Employ Full time
Role - Site Reliability Engineer (SRE), Platform Engineering, or DevOps Engineering roles
Location - Bangalore/Remote
Type - Contract
Work Ex - 4-6 yrs
We're working with an AI product company that's building the next generation of GenAI-powered developer platforms. We're looking for an experienced Site Reliability Engineer to join their Platform Engineering...
-
High-Performance Site Reliability Expert Wanted
2 weeks ago
Tirupati, Andhra Pradesh, India beBeeSiteReliability Full time ₹ 1,80,00,000 - ₹ 2,20,00,000
Site Reliability Engineer Job Opportunity
The ideal candidate will play a pivotal role in ensuring the reliability and performance of our applications, providing technical expertise to drive business growth. The successful Site Reliability Engineer will design, develop, and support various tools, services, and applications to maintain a reliable site...
-
Tirupati, Andhra Pradesh, India beBeeSiteReliabilityEngineer Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Job Overview
We are seeking an experienced Principal Engineer, Site Reliability to join our team. This individual will play a critical role in ensuring the stability, scalability, and operational excellence of Accounting and Finance platforms. The successful candidate will lead the operational health of these platforms, ensuring the delivery of highly reliable...
-
Reliable Infrastructure Developer
2 weeks ago
Tirupati, Andhra Pradesh, India beBeeEngineering Full time ₹ 18,00,000 - ₹ 22,00,000
Our organization is seeking a Site Reliability Engineer with expertise in SRE. The ideal candidate will have a strong foundation in DevOps skills such as CI/CD, monitoring, automation, and infrastructure as code.
Key Qualifications:
- Exceptional Troubleshooting Skills
- Advanced DevOps Expertise
- Persistence in Complex Issues
- Independence and Self-Initiative
- Effective...
-
Enterprise Reliability Engineer
2 weeks ago
Tirupati, Andhra Pradesh, India beBeeReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
System Reliability Engineer
We are looking for a highly skilled System Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for designing and implementing reliable systems that meet the needs of our business. Your primary focus will be on ensuring the high availability, scalability, and performance of our...
-
Reliable Network Infrastructure Specialist
2 weeks ago
Tirupati, Andhra Pradesh, India beBeeNetwork Full time ₹ 60,00,000 - ₹ 1,20,00,000
Job Overview
We are seeking a skilled Network Engineer with expertise in firewall management, cloud networking, and automation to join our team. As a Site Reliability & Network Engineer, you will play a critical role in designing, deploying, and monitoring network infrastructure, ensuring regulatory compliance and security.
-
Observability Engineer
2 weeks ago
Tirupati, Andhra Pradesh, India beBeeSystemReliability Full time ₹ 15,00,000 - ₹ 20,00,000
Job Title: System Reliability Engineer
We are seeking a highly skilled System Reliability Engineer to join our team. The successful candidate will be responsible for building and maintaining the platform components for observability. This role will involve working closely with the Lead engineer, performance team, data ingestion, platform DevOps, and data...
-
Strengthening Digital Foundations
2 weeks ago
Tirupati, Andhra Pradesh, India beBeeReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team. In this role, you will be responsible for ensuring the smooth operation of our digital systems, identifying potential issues early, and implementing preventive measures.
Your Key Responsibilities:
- Engineer reliability: Implement proactive measures to prevent...