Site Reliability Engineer II

3 days ago


Anantapur, Andhra Pradesh, India beBeeReliability Full time ₹ 1,70,00,000 - ₹ 2,54,00,000
Job Description

You will own and manage the entire lifecycle of services, from availability and latency to performance and efficiency. You'll lead high-impact projects, mentor engineers, and eliminate toil at scale. This role reports to the Director of SRE.

  • Define customer-centric SLIs/SLOs for Tier-0/Tier-1 services. Publish, review quarterly, and align teams to them.
  • Error budgeting (policy & tooling): Define an error-budget policy with multi-window, multi-burn-rate alerts; clear runbooks and paging thresholds. Gate changes by budget status (freeze/relax rules) wired into CI/CD.
  • Maintain SLO/EB dashboards (Azure Monitor, Grafana/Prometheus, App Insights). Run weekly SLO reviews with engineering/product.
  • Drive roadmap tradeoffs when budgets are at risk; land reliability epics.
  • Incidents without drama: Lead SEV1/SEV2, own comms, run blameless postmortems, and make corrective actions stick.
  • Engineer reliability in: Multi-AZ/region patterns (active-active/DR), PDBs/Pod Topology Spread, HPA/VPA/KEDA, resilient rollout/rollback.
  • Azure Kubernetes Service (AKS) at scale: Harden clusters (network, identity, policy), optimize node/pod density, ingress (AGIC/Nginx); mesh optional.
  • Observability that works: Metrics/traces/logs with Azure Monitor/App Insights, Log Analytics, Prometheus/Grafana, OpenTelemetry. Alert on symptoms, not noise.
  • IaC & policy: Terraform/Bicep modules, GitOps (Flux/Argo), policy-as-code (Azure Policy/OPA Gatekeeper). No snowflakes.
  • CI/CD reliability: Azure DevOps/GitHub Actions with canary/blue-green, progressive delivery, auto-rollback, Key Vault-backed secrets.
  • Capacity & performance: Load testing, right-sizing, autoscaling; partner with FinOps to reduce spend without hurting SLOs.
  • Disaster recovery you can trust: Define RTO/RPO, test backups/restore, run game days/chaos drills, validate ASR and multi-region failover.
  • Secure by default: Entra ID (Azure AD), managed identities, Key Vault rotation, VNets/NSGs/Private Link, shift-left checks in CI.
  • Reduce toil: Automate recurring ops, build self-service runbooks/chatops, publish golden paths for product teams.
  • Customer escalations: Be the technical owner on calls; communicate tradeoffs and recovery plans with authority.
  • Document to scale: Architectures, runbooks, postmortems, SLIs/SLOs—kept current and discoverable.
  • (If applicable) Streaming/ETL reliability: Apply SRE practices (SLOs, backpressure, idempotency, replay) to NiFi/Flink/Kafka/Redpanda data flows.
Required Skills and Qualifications

Bachelor's degree in CS/Engineering, or equivalent experience, is required.

  • 12+ years in production ops/platform/SRE, including 5+ years on Azure.
  • PostgreSQL expertise including HA/DR, logical/physical replication, performance tuning, autovacuum strategy, partitioning, backup/restore testing, and connection pooling.
  • Core Azure services such as AKS, Front Door/App Gateway, API Management, VNets/NSGs/Private Link, Storage, Key Vault, Redis, and Service Bus/Event Hubs.
  • Observability tools such as Azure Monitor/App Insights, Log Analytics, Prometheus/Grafana, and OpenTelemetry.
  • IaC/automation skills with Terraform, Bicep, PowerShell, Python, and GitOps.
  • Proven incident leadership at scale, blameless postmortems, and SLO/error-budget governance with change gating.
  • Mentorship and crisp written/verbal communication.
Benefits

As a Site Reliability Engineer II, you'll have opportunities for professional growth and development. Our company fosters a culture of collaboration, innovation, and continuous learning.

  • Cross-functional teams with experts in various fields.
  • Regular training and upskilling programs.
  • Opportunities for career advancement.
Others

We value diversity, equity, and inclusion in our workplace. If you're passionate about technology, reliability, and teamwork, we encourage you to apply.

  • Diverse and inclusive work environment.
  • Flexible working arrangements.
  • Paid time off and holidays.

