Site Reliability Engineer II
5 days ago
About NCR Atleos
NCR Atleos, headquartered in Atlanta, is a leader in expanding financial access. Our dedicated 20,000 employees optimize the branch, improve operational efficiency and maximize self-service availability for financial institutions and retailers across the globe.
Job Title: Site Reliability Engineer II
Location: Hyderabad
Job Type: Full-Time, 24×7 Availability
Grade: 10
Job Description:
As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and observability of our transaction processing and settlement platforms, while also providing hands-on support for critical business applications. You will work across hybrid environments (on-prem and cloud), contribute to automation and monitoring enhancements, and support incident response and platform stability. This position requires availability to work in a 24×7 support model, including rotational shifts and on-call duties.
Key Responsibilities:
- Monitoring & Observability: Maintain and enhance monitoring systems using Prometheus, Grafana, Splunk, and SolarWinds. Ensure timely detection and resolution of issues through effective alerting and dashboards.
- Application Support: Provide L1/L2 support for business-critical applications, including incident triage, health checks, deployment validation, and coordination with development and product teams.
- Incident Management: Lead response for moderate to complex incidents, perform root cause analysis, and contribute to post-incident reviews and documentation.
- Automation & Scripting: Develop and maintain automation scripts using Python, PowerShell, or Bash to streamline operational tasks and reduce manual effort.
- Infrastructure Support: Monitor and support infrastructure health across on-prem and cloud platforms (GCP, Azure), including performance tuning and capacity planning.
- Kubernetes Operations: Support containerized workloads and microservices running on Kubernetes clusters. Perform health checks, troubleshoot deployments, and optimize resource usage.
- Process Adherence: Participate in ITIL-aligned processes for incident, change, and problem management. Ensure compliance with operational standards and audit requirements.
- Knowledge Sharing: Document SOPs, recurring issues, and resolutions. Mentor junior engineers and contribute to the team knowledge base.
- Collaboration: Work closely with development, QA, and platform teams to support deployments, platform transitions, and reliability improvements.
- Continuous Improvement: Proactively identify areas for improvement in system reliability, alerting, and operational workflows.
- 24×7 Support: Provide on-call support for critical issues.
Qualifications:
- Bachelor's or master's degree in Computer Science, Engineering, or a related field.
- 2–5 years of experience in SRE, infrastructure operations, system administration, or application support.
- Proficiency in monitoring tools (Prometheus, Grafana, Splunk, SolarWinds).
- Strong scripting skills (Python, Bash, PowerShell).
- Experience with cloud platforms (GCP, Azure, AWS).
- Hands-on experience with Kubernetes in production environments.
- Solid understanding of ITIL practices and enterprise support workflows.
- Hands-on experience with automation tools (ActiveBatch or similar).
- Strong analytical, communication, and problem-solving skills.
- Willingness to work in a 24×7 support environment and take ownership of reliability outcomes.
Hybrid
Offers of employment are conditional upon passage of screening criteria applicable to the job.
EEO Statement
NCR Atleos is an equal-opportunity employer. It is NCR Atleos policy to hire, train, promote, and pay associates based on their job-related qualifications, ability, and performance, without regard to race, color, creed, religion, national origin, citizenship status, sex, sexual orientation, gender identity/expression, pregnancy, marital status, age, mental or physical disability, genetic information, medical condition, military or veteran status, or any other factor protected by law.
Statement to Third Party Agencies
To ALL recruitment agencies: NCR Atleos only accepts resumes from agencies on the NCR Atleos preferred supplier list. Please do not forward resumes to our applicant tracking system, NCR Atleos employees, or any NCR Atleos facility. NCR Atleos is not responsible for any fees or charges associated with unsolicited resumes.