Sr. Site Reliability Engineer
24 hours ago
Site Reliability Engineer (SRE)
Position Summary
The Site Reliability Engineer (SRE) will be a hands-on contributor within the Site Reliability Engineering Center of Excellence (CoE), responsible for building monitoring and observability solutions, troubleshooting production issues, and participating in 24x7 on-call operations.
This role focuses on executing reliability practices, implementing observability tooling, improving MTTR/MTTD through automation, and ensuring production systems are resilient, observable, and performant. The SRE will collaborate closely with Principal and Senior Staff SREs, adopting best practices and frameworks defined by the CoE while directly contributing to enterprise reliability goals. This role reports to the Sr. Manager, SRE.
Key Responsibilities
Execution & CoE Alignment
· Implement SRE frameworks, best practices, and playbooks provided by the CoE.
· Act as a hands-on engineer, contributing to observability, reliability, and incident response initiatives.
· Partner with senior SREs and leadership to maintain consistency in monitoring and incident processes.
· Contribute to automation projects that improve reliability and reduce manual work.
Observability & Monitoring
· Build and maintain monitoring solutions with New Relic, Datadog, Prometheus, Grafana, CloudWatch, OpenTelemetry, and Graylog.
· Create and refine dashboards, metrics, and alerts for proactive anomaly detection.
· Extend observability coverage across infrastructure, applications, APIs, and databases.
Reliability Engineering & Automation
· Implement SLIs, SLOs, SLAs, and error budgets in partnership with product and platform teams.
· Contribute to reducing MTTD and MTTR through improved instrumentation and automation.
· Participate in capacity planning, resiliency testing, and scaling reviews.
· Support chaos engineering and reliability validation activities.
Incident & Problem Management
· Participate in incident response, including on-call rotations for 24x7 coverage.
· Assist with root cause analysis (RCA) and implement corrective actions.
· Ensure alignment with ITSM processes for incident, problem, and change management.
· Contribute to playbooks and runbooks to strengthen on-call readiness.
Collaboration & Knowledge Sharing
· Collaborate with Engineering, Product, Security, Cloud, and DevSecOps teams to embed reliability practices.
· Provide input on instrumentation, monitoring hooks, and operational readiness for services.
· Work with DBAs and platform teams on database observability and performance optimization.
· Share knowledge within the SRE team and adopt practices from Staff and Principal SREs.
Qualifications & Experience
Required
· 7+ years in SRE, Operations, or Infrastructure Engineering.
· Strong hands-on experience with monitoring and observability platforms.
· Experience with tools such as New Relic, Datadog, Prometheus, Grafana, CloudWatch, OpenTelemetry, and Graylog.
· Proven experience in incident response, troubleshooting production issues, and improving MTTR/MTTD.
· Good knowledge of SLIs, SLOs, SLAs, and error budgets.
· Hands-on experience with AWS services (EC2, ECS, EKS, networking, Auto Scaling groups).
· Proficiency in containers and Kubernetes (Docker, EKS).
· Scripting or programming in Python, Go, or shell.
· Understanding of networking, distributed systems, and high-availability architectures.
· Exposure to ITIL/ITSM processes.
Preferred
· Experience in SaaS or healthcare environments.
· Knowledge of databases (MongoDB, Elasticsearch, SQL Server, Oracle).
· Familiarity with chaos engineering and resiliency testing.
· Certifications: AWS Solutions Architect / DevOps Engineer, CKA/CKAD.
GHX: It's the way you do business in healthcare
Global Healthcare Exchange (GHX) enables better patient care and billions in savings for the healthcare community by maximizing automation, efficiency and accuracy of business processes.
GHX is a healthcare business and data automation company, empowering healthcare organizations to enable better patient care and maximize industry savings using our world-class, cloud-based supply chain technology exchange platform, solutions, analytics, and services. We bring together healthcare providers, manufacturers, and distributors in North America and Europe, who rely on smart, secure, healthcare-focused technology and comprehensive data to automate their business processes and make more informed decisions.
It is our passion and vision for a more operationally efficient healthcare supply chain, helping organizations reduce - not shift - the cost of doing business, paving the way to delivering patient care more effectively. Together we take more than a billion dollars out of the cost of delivering healthcare every year. GHX is privately owned, operates in the United States, Canada and Europe, and employs more than 1000 people worldwide. Our corporate headquarters is in Colorado, with additional offices in Europe.
Disclaimer
Global Healthcare Exchange, LLC and its North American subsidiaries (collectively, "GHX") provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, national origin, sex, sexual orientation, gender identity, religion, age, genetic information, disability, veteran status or any other status protected by applicable law. All qualified applicants will receive consideration for employment without regard to any status protected by applicable law. This EEO policy applies to all terms, conditions, and privileges of employment, including hiring, training and development, promotion, transfer, compensation, benefits, educational assistance, termination, layoffs, social and recreational programs, and retirement.
GHX believes that employees should be provided with a working environment which enables each employee to be productive and to work to the best of his or her ability. We do not condone or tolerate an atmosphere of intimidation or harassment based on race, color, national origin, sex, sexual orientation, gender identity, religion, age, genetic information, disability, veteran status or any other status protected by applicable law. GHX expects and requires the cooperation of all employees in maintaining a discrimination- and harassment-free atmosphere. Improper interference with the ability of GHX's employees to perform their expected job duties is absolutely not tolerated.