
Lead Site Reliability Engineer
2 days ago
Job Description
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.
What makes this a one-of-a-kind opportunity? We have more than 12,000 technology colleagues serving the IT needs of our clients across the globe and our own Fortune 6 IT needs. At Optum, you'll be encouraged to combine your passion and technical expertise to help us shape the health care system for years to come. You'll help change the way our businesses and consumers engage with technology across a wide platform of health services and delivery systems by setting team goals, forecasting resource needs, and guiding solutions developed to solve business and operational challenges. If you're out to make a difference, apply today.
Medicare & Retirement (M&R) | Community and State | Individual and Family Plan - Technology Operations needs an experienced Senior Site Reliability Engineer (SRE) to act as a bridge between software engineering and IT operations. The primary goal of this role is to keep software applications and infrastructure reliable, scalable, and resilient, and to improve performance and operational efficiency. The role also ensures that all business-critical products have the right tools in place and have run exercises to validate system availability, latency, performance, efficiency, monitoring, incident priority, and capacity planning. This role will enable Government Programs (M&R, C&S and IFP) Technology Operations to meet our business segment's needs as an IT partner and advocate.
Primary Responsibilities:
- Define and set up industry-best alerting and monitoring practices across the line of business, and design efficient monitoring dashboards in Splunk/Dynatrace/Grafana that are common to all applications and products across the line of business
- Participate in the 5-9 program and other peak-season readiness initiatives, and collaborate with application teams to evaluate applications from a resiliency, availability, and reliability perspective
- Act as a gatekeeper for changes rolling into production
- Embrace continuous learning of engineering practices to ensure industry best practices and technology adoption, including DevOps, Cloud and Agile thinking
- Tech debt reduction/tech transformation, including open source/inner source adoption, Cloud adoption, HCP assessment and adoption
- Improve processes and runbooks, and lead automation of manual support tasks to cut down toil
- Participate in on-call rotation
- Improve operational tooling and frameworks, and perform chaos engineering activities
- Respond to platform emergencies, alerts, and escalations from Customer Support
- Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies regarding flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so
Required Qualifications:
- Undergraduate degree or equivalent experience
- 10+ years of experience in the IT industry across the entire SDLC
- 5+ years of experience in integrating monitoring and alerting into cloud software solutions
- 3+ years of coding experience with one or more of the following languages: Java, C#, C/C++, Go, Python, Perl, PowerShell or JavaScript, with a willingness and ability to learn new ones
- 3+ years of experience with Splunk/Dynatrace/Datadog/Grafana/Telemetry or similar monitoring tools
- 2+ years of experience building and programmatically consuming REST APIs
- ServiceNow experience
- Work experience as a Site Reliability Engineer or similar role
- Experience with any database
- Experience in operations support for any application
- Experience with programmatic interaction with a relational database (SQL Server/MySQL/PostgreSQL)
- Experience planning and supporting 99.999% availability for critical applications in production
- Knowledge of any scripting or programming language
- Solid understanding of engineering fundamentals: unit testing, performance testing, code reviews, telemetry, agile and DevOps
- Solid understanding of: continuous integration / continuous delivery tools, serverless architecture, containerization, public / private cloud, application observability and/or messaging / stream architecture
- Technical writing skills (creating flow diagrams, end user documentation, etc.)
- Proven ability to communicate effectively to both technical and non-technical, globally distributed audiences
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.
-
Senior Site Reliability Engineer
2 weeks ago
India Microsoft Full time
Job Description
The Windows Cloud division is looking for a Senior Site Reliability Engineer who will help us take the Windows Cloud platform, as well as the Windows 365 Cloud PC and Azure Virtual Desktop business, to the next level. Windows 365 Cloud PC (W365) and Azure Virtual Desktop (AVD) have recently been recognized as leaders in the Gartner Magic...
-
Remote Site Reliability Engineer (DevOps)
1 week ago
India Zafin Full time
Senior Site Reliability Engineer (SRE II): Own availability, latency, performance, and efficiency for Zafin's SaaS on Azure. You'll define and enforce reliability standards, lead high-impact projects, mentor engineers, and eliminate toil at scale. Error budgeting (policy & tooling): Run the error-budget policy with multi-window, multi-burn-rate alerts; Run...
-
Senior Site Reliability Engineer
1 week ago
Bengaluru, Karnataka, India Allegion Full time
Allegion India is seeking a highly motivated Senior Site Reliability Engineer who will play a critical role in ensuring the reliability, scalability, and performance of our organization's systems and infrastructure, and who will work with a team of cross-functional product development engineers to design, implement, and maintain highly available and resilient...
-
Site Reliability Engineer
2 days ago
India Synechron Full time
We have an immediate opportunity for SRE (Senior Site Reliability Engineer), 5 to 9 years. Synechron – Bangalore
Job Role: SRE (Senior Site Reliability Engineer)
Job Location: Bangalore
Notice Period: Within 30 days
About Synechron: We began life in 2001 as a small, self-funded team of technology specialists. Since then, we’ve grown our organization to 14,500+...
-
Lead Site Reliability Engineer
2 weeks ago
Bengaluru, Karnataka, India Landmark Group Full time
Job Title: SRE Lead (Engineering & Reliability)
Experience: 8-12 years
We are seeking an experienced and dynamic Site Reliability Engineering (SRE) Lead to oversee the reliability, scalability, and performance of our critical systems. As an SRE Lead, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and...
-
Site Reliability Engineer
2 days ago
Bengaluru, India VidPro Consultancy Services Full time
Job Description
Experience: 2.55 Years
Location: Bangalore (On-site)
Work Mode: 5 Days WFO
Mandatory Skills: Site Reliability Engineer or SRE, Linux, system architecture, TCP/IP, HTTP, DNS, Grafana, Prometheus and Loki, troubleshooting, root cause, complex systems, CI/CD, Docker, Kubernetes
Experience: 2-4 years of relevant experience
Key Skills...
-
Lead Site Reliability Engineer
3 days ago
Bengaluru, India Landmark Group Full time
Job Title: SRE Lead (Engineering & Reliability)
Job Summary: We are seeking an experienced and dynamic Site Reliability Engineering (SRE) Lead to oversee the reliability, scalability, and performance of our critical systems. As an SRE Lead, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and driving...
-
Site Reliability Engineer
2 days ago
India Concord Full time
SRE Sr. Engineers (Individual Contributors)
Key Attributes:
- Strong SRE (Site Reliability Engineering) experience
- DevOps skills: CI/CD, monitoring, automation, infrastructure as code, etc.
- Excellent troubleshooting and debugging skills (infrastructure + application level)
- Perseverance: must push through complex/challenging issues without...
-
Site Reliability Engineering Lead
1 week ago
Bengaluru, Karnataka, India beBeeReliability Full time ₹ 15,00,000 - ₹ 20,00,000
We are seeking an experienced and dynamic Site Reliability Engineering (SRE) Leader to oversee the reliability, scalability, and performance of our critical systems. As an SRE Leader, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and driving automation, monitoring, and incident response...
-
Lead Site Reliability Engineer- Remote
2 weeks ago
India Sprinto Full time
Job Description
Sprinto is a leading platform that automates information security compliance. By raising the bar on information security, Sprinto ensures compliance, healthy operational practices, and the ability for businesses to grow and scale with unwavering confidence. We are a team of 200+ employees helping 1000+ customers across 75+ countries. We are...