Senior Site Reliability Engineer

3 hours ago

Pune, Maharashtra, India Mastercard Full time

Our Purpose

Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Title and Summary

Senior Site Reliability Engineer

Overview

The Distributed Platform Operations team is looking for a Site Reliability Engineer who can help us solve problems, implement automation, and leverage best practices.

Are you a born problem solver who loves to figure out how something works?
Are you a detail-oriented individual who enjoys complex problem solving?
Do you love determining the correct actions required to fix a problem?
Do you have a low tolerance for manual work and look to automate everything you can?

The Site Reliability Engineer (SRE) will be responsible for ensuring the reliability, scalability, and performance of IT infrastructure supporting VMware virtualization and Oracle Linux environments. This role combines operational excellence with automation and engineering practices to reduce toil, improve system resilience, and deliver a seamless experience for internal and external customers.

Key Responsibilities

Infrastructure Reliability & Performance

Monitor, maintain, and optimize VMware clusters, ESXi hosts, and Oracle Linux servers
Ensure high availability and disaster recovery readiness for virtualized environments
Troubleshoot and resolve incidents impacting virtualization and Linux platforms

Automation & Tooling

Design and implement automation for patching, configuration management, and routine operational tasks using tools like Chef, Ansible, Jenkins, and Python
Develop scripts and pipelines to reduce manual effort and improve operational agility

Capacity & Configuration Management

Manage resource allocation across VMware clusters and Oracle Linux systems
Implement standardization and compliance for OS configurations and security baselines

Monitoring & Alerting

Configure and maintain monitoring solutions (e.g., vROps, Splunk, Prometheus) for proactive issue detection
Optimize alerting thresholds to reduce noise and improve incident response times

Incident & Problem Management

Lead root cause analysis for critical incidents and implement permanent fixes
Collaborate with cross-functional teams to resolve complex infrastructure issues

Security & Compliance

Ensure timely patching of VMware and Oracle Linux environments to address vulnerabilities
Maintain compliance with enterprise security standards and regulatory requirements

All About You and Required Skills & Qualifications

Bachelor's degree in Computer Science, Information Technology, or a related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience
Analytical/problem solving and planning skills
The ability to organize, multi-task and prioritize work based on current business needs.
Strong communication skills, both verbal and written
Strong relationship skills, collaborative skills and customer service skills
Interest in designing, analysing and troubleshooting large-scale distributed systems
We need team members with an appetite for change and for pushing the boundaries of what can be done with automation. Experience working across development, operations, and engineering teams to prioritize needs and build relationships is a must
We support many different stakeholders. Experience dealing with difficult situations and making decisions with a sense of urgency is needed
Ability to work with little or no supervision
Strong experience with VMware vSphere, ESXi, vCenter, and related virtualization technologies.
Proficiency in Oracle Linux administration, including kernel tuning and patching.
Hands-on experience with automation tools (Chef, Ansible, Jenkins) and scripting (Python, Bash).
Familiarity with monitoring and logging tools (vROps, Splunk, Prometheus).
Knowledge of networking fundamentals, storage (vSAN), and virtualization best practices.
Experience with incident management, root cause analysis, and performance optimization.
Understanding of cloud platforms (AWS, Azure) and container technologies (Docker, Kubernetes) is a plus.

Preferred Qualifications

Certifications: VMware Certified Professional (VCP), Oracle Linux Certified Administrator
Experience in Site Reliability Engineering principles (SLIs, SLOs, error budgets)
Strong collaboration and communication skills for cross-team engagement

Corporate Security Responsibility

All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. Therefore, every person working for, or on behalf of, Mastercard is responsible for information security and must:

Abide by Mastercard's security policies and practices;
Ensure the confidentiality and integrity of the information being accessed;
Report any suspected information security violation or breach; and
Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.


