Senior Site Reliability Engineer

1 day ago


Hyderabad, India Elios Talent Full time

Key Highlights

🛠️ Build, scale, and optimize cloud-native infrastructure powering global, high-availability platforms
⚡ Drive automation-first engineering across AWS, Terraform, CI/CD, observability, and resilient systems
📊 Own reliability, uptime, system health, costs, and performance across mission-critical environments
🔐 Strengthen DevSecOps practices, improving security, delivery velocity, and operational excellence
🚨 Lead major incident response, troubleshoot complex issues, and uphold production stability at scale

Position Overview

We are seeking a Senior Site Reliability Engineer to drive reliability, automation, and performance for large-scale, cloud-based platforms. This role blends deep technical engineering, systems thinking, DevOps collaboration, and operational leadership.

You will design and implement scalable infrastructure, improve observability, enhance resiliency, manage incident operations, and champion modern DevSecOps practices. This role plays a critical part in supporting tens of thousands of daily users while ensuring platforms remain secure, fast, and highly available.

Key Responsibilities

Cloud Engineering

  • Architect, deploy, and optimize AWS environments using automation and Infrastructure-as-Code
  • Build tooling that increases predictability, stability, and delivery speed
  • Optimize systems for scale, reliability, cost, and performance
  • Maintain repeatable, traceable, and transparent infrastructure through Terraform and automation
  • Monitor cloud spend and usage, ensuring alignment with service-level objectives

Observability & Reliability

  • Own uptime, reliability, system security, performance metrics, and golden signals
  • Lead incident management and triage bridges during major events
  • Enhance telemetry systems (New Relic, CloudWatch, Datadog) for deep operational visibility
  • Use data-driven analysis to improve system stability and customer experience
  • Ensure architecture and deployment patterns meet SLAs and reliability goals

DevSecOps & Automation

  • Strengthen CI/CD pipelines, code-review practices, and engineering standards
  • Partner with Cybersecurity to address vulnerabilities through automation
  • Support secure, consistent, and scalable delivery workflows across engineering teams

Resiliency Engineering

  • Identify failure points, blast-radius risks, and architectural gaps
  • Run failure-injection / chaos testing to validate resiliency
  • Forecast traffic, plan for seasonal peaks, and scale systems for 2x+ load scenarios
  • Drive improvements to infrastructure and software to meet resiliency targets

Leadership & Collaboration

  • Mentor engineers across levels, promoting high-quality engineering practices
  • Collaborate daily with product, engineering, and security teams in a DevOps model
  • Document, uplift, and share knowledge through cross-team forums and best practices

Qualifications

  • Experience as a software engineer with strong debugging and deployment skills
  • Hands-on expertise with AWS and Terraform (required)
  • Experience with ECS; Kubernetes/EKS experience strongly preferred
  • Strong proficiency in Python, Golang, Bash, and automation frameworks
  • CI/CD experience with Jenkins, GitHub Enterprise, CircleCI, or similar
  • Ability to troubleshoot across web servers, app servers, OS, networks, storage, and databases
  • Experience running large-scale, high-availability production systems
  • Strong communication, root-cause analysis, and incident leadership skills
  • BS in Computer Science or equivalent industry experience

About Us

We build scalable, secure, and high-performing digital platforms that power global user experiences. By combining cloud engineering, automation, observability, and resilient systems design, we help organizations operate more reliably, innovate faster, and support long-term platform stability and growth.

Why Join Us

Join a forward-thinking engineering organization where reliability, automation, and performance are core values. You’ll work with a modern cloud stack, collaborate with exceptional engineers, and own meaningful technical impact across large-scale applications. This is an opportunity to shape infrastructure strategy, elevate engineering practices, and build systems that support millions with consistency and excellence.



  • Hyderabad, India Elios Talent Full time

    Senior Site Reliability Engineer Key Highlights 🛠️ Build, scale, and optimize cloud-native infrastructure powering global, high-availability platforms ⚡ Drive automation-first engineering across AWS, Terraform, CI/CD, observability, and resilient systems 📊 Own reliability, uptime, system health, costs, and performance across mission-critical...


  • Hyderabad, Telangana, India Instaresz Business Services Pvt Ltd Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Job Title: Senior Site Reliability Engineer (SRE) Experience Required: 10+ Years Location: Hyderabad (On-site) Employment Type: Full-Time About Instaresz: Instaresz Business Services Pvt. Ltd. focuses on building and scaling high-performance SaaS products with expertise in: • SaaS Product Development • Infrastructure & DevOps • Data & Analytics • AI & Automation Our...


  • Hyderabad, India SID Global Solutions Full time

    Job Role: Site Reliability Engineer (SRE) – GCP Experience: 3+ years Location: Hyderabad About SIDGS: SIDGS is a premium global systems integrator and global implementation partner of Google corporation, providing Digital Solutions & Services to Fortune 500 companies. Our Digital solutions go across following domains: User Experience, CMS, API Management,...


  • Hyderabad, India Whatjobs IN C2 Full time

    Job Role: Site Reliability Engineer (SRE) – GCP Experience: 3+ years Location: Hyderabad About SIDGS: SIDGS is a premium global systems integrator and global implementation partner of Google corporation, providing Digital Solutions & Services to Fortune 500 companies. Our Digital solutions go across following domains: User Experience, CMS, API Management,...


  • Hyderabad, Telangana, India 2a1d0a41-1875-4bbb-b5a8-e4d5620cfd5f Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Role & responsibilities: Coordinates cross-product chaos experimentation to proactively test system resilience and uncover reliability gaps. Maintains the centralized incident response playbook for the subdivision, documenting standards for communication, escalation, and recovery during incidents. Aggregates and reports quantifiable availability data to senior...


  • Hyderabad, Telangana, India Instaresz Business Services Pvt Ltd Full time ₹ 8,00,000 - ₹ 20,00,000 per year

    Experience Required: 10+ Years Location: Hyderabad (On-site) Employment Type: Full-Time About Instaresz: Instaresz Business Services Pvt. Ltd. focuses on building and scaling high-performance SaaS products with expertise in: • SaaS Product Development • Infrastructure & DevOps • Data & Analytics • AI & Automation Our mission is to create robust, secure, and...