Senior Site Reliability Engineer
13 hours ago
Job Description

Our Purpose
Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Title and Summary
Senior Site Reliability Engineer

Overview
The Distributed Platform Operations team is looking for a Site Reliability Engineer who can help us solve problems, implement automation, and leverage best practices.
- Are you a born problem solver who loves to figure out how something works?
- Are you a detail-oriented individual who enjoys complex problem solving?
- Do you love determining the correct actions required to fix a problem?
- Do you have a low tolerance for manual work and look to automate everything you can?

The Site Reliability Engineer (SRE) will be responsible for ensuring the reliability, scalability, and performance of IT infrastructure supporting VMware virtualization and Oracle Linux environments. This role combines operational excellence with automation and engineering practices to reduce toil, improve system resilience, and deliver a seamless experience for internal and external customers.

Key Responsibilities

Infrastructure Reliability & Performance
- Monitor, maintain, and optimize VMware clusters, ESXi hosts, and Oracle Linux servers
- Ensure high availability and disaster recovery readiness for virtualized environments
- Troubleshoot and resolve incidents impacting virtualization and Linux platforms

Automation & Tooling
- Design and implement automation for patching, configuration management, and routine operational tasks using tools like Chef, Ansible, Jenkins, and Python
- Develop scripts and pipelines to reduce manual effort and improve operational agility

Capacity & Configuration Management
- Manage resource allocation across VMware clusters and Oracle Linux systems
- Implement standardization and compliance for OS configurations and security baselines

Monitoring & Alerting
- Configure and maintain monitoring solutions (e.g., vROps, Splunk, Prometheus) for proactive issue detection
- Optimize alerting thresholds to reduce noise and improve incident response times

Incident & Problem Management
- Lead root cause analysis for critical incidents and implement permanent fixes
- Collaborate with cross-functional teams to resolve complex infrastructure issues

Security & Compliance
- Ensure timely patching of VMware and Oracle Linux environments to address vulnerabilities
- Maintain compliance with enterprise security standards and regulatory requirements

All About You and Required Skills & Qualifications
- BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience
- Bachelor's degree in Information Technology, Computer Science or equivalent work experience
- Analytical/problem solving and planning skills
- The ability to organize, multi-task and prioritize work based on current business needs
- Possess strong communication skills, both verbal and written
- Strong relationship skills, collaborative skills and customer service skills
- Interest in designing, analysing and troubleshooting large-scale distributed systems
- We need team members with an appetite for change and pushing the boundaries of what can be done with automation. Experience in working across development, operations, and engineering teams to prioritize needs and to build relationships is a must
- We support many different stakeholders. Experience in dealing with difficult situations and making decisions with a sense of urgency is needed
- Ability to work with little or no supervision
- Strong experience with VMware vSphere, ESXi, vCenter, and related virtualization technologies
- Proficiency in Oracle Linux administration, including kernel tuning and patching
- Hands-on experience with automation tools (Chef, Ansible, Jenkins) and scripting (Python, Bash)
- Familiarity with monitoring and logging tools (vROps, Splunk, Prometheus)
- Knowledge of networking fundamentals, storage (vSAN), and virtualization best practices
- Experience with incident management, root cause analysis, and performance optimization
- Understanding of cloud platforms (AWS, Azure) and container technologies (Docker, Kubernetes) is a plus

Preferred Qualifications
- Certifications: VMware Certified Professional (VCP), Oracle Linux Certified Administrator
- Experience in Site Reliability Engineering principles (SLIs, SLOs, error budgets)
- Strong collaboration and communication skills for cross-team engagement

Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization and, therefore, it is expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
- Abide by Mastercard's security policies and practices;
- Ensure the confidentiality and integrity of the information being accessed;
- Report any suspected information security violation or breach; and
- Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.
-
Senior Site Reliability Engineer
7 days ago
Pune, India Jade Global Full time Description: Job Title: Senior Site Reliability Engineer (SRE), Datadog Observability. Experience Required: 8+ years overall in SRE and Infrastructure Operations, with 3+ years of hands-on experience in Datadog. Location: Hyderabad preferred, but open to Pune and remote. Job Summary: We are seeking an experienced Site Reliability Engineer (SRE) to lead...
-
Senior Site Reliability Engineer
2 weeks ago
India Akamai Full time Description Do you like collaborating across teams to solve complex problems? Do you enjoy solving large-scale distributed content delivery challenges? Join our highly skilled Compute Site Reliability team. Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We specialize in creating...
-
Site Reliability Engineer
16 hours ago
India Pagos Consultants Full time We are looking for experienced site reliability engineers to join a founding team of startup-minded individuals that will lay the groundwork for our new fintech offering. This team will play a pivotal role in spearheading innovation. As such, you will have the opportunity to shape the early architecture and design of the system and set the trajectory for its...
-
Site Reliability Engineer
14 hours ago
India Pagos Consultants Full time We are looking for experienced site reliability engineers to join a founding team of startup-minded individuals that will lay the groundwork for our new fintech offering. This team will play a pivotal role in spearheading innovation. As such, you will have the opportunity to shape the early architecture and design of the system and set the trajectory for its...
-
Site Reliability Engineer
11 hours ago
India Pagos Consultants Full time We are looking for experienced site reliability engineers to join a founding team of startup-minded individuals that will lay the groundwork for our new fintech offering. This team will play a pivotal role in spearheading innovation. As such, you will have the opportunity to shape the early architecture and design of the system and set the trajectory for its...
-
Site Reliability Engineer
11 hours ago
India Pagos Consultants Full time We are looking for experienced site reliability engineers to join a founding team of startup-minded individuals that will lay the groundwork for our new fintech offering. This team will play a pivotal role in spearheading innovation. As such, you will have the opportunity to shape the early architecture and design of the system and set the trajectory for its...
-
Senior Site Reliability Engineer
2 weeks ago
Pune, India GSPANN Full time Description GSPANN is hiring a Senior Site Reliability Engineer (SRE) to join our team in Pune or Hyderabad. This full-time role focuses on enhancing the reliability, scalability, and observability of global cloud-based systems through automation, performance tuning, and modern DevOps practices. Role and Responsibilities: Manage and support production...
-
Senior Site Reliability Engineer
2 weeks ago
India Akamai Full time ₹ 10,50,000 - ₹ 22,50,000 per year Description Do you like collaborating across teams to solve complex problems? Do you enjoy solving large-scale distributed content delivery challenges? Join our highly skilled Compute Site Reliability team. Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We specialize in creating...
-
Site Reliability Engineer
2 days ago
Pune, India UBS Full time Job Description Job Reference #: 326131BR. Job Type: Full Time. Your role: We are seeking a highly experienced Site Reliability Engineer (SRE) to join our technology team in a mission-critical financial environment. This role is ideal for someone who has a proven track record of building and operating reliable, scalable systems in regulated industries such as...
-
Site Reliability Engineer
2 weeks ago
Pune, India Synechron Full time We have an immediate opportunity for an SRE (Senior Site Reliability Engineer) with 5 to 9 years of experience. Synechron – Pune. Job Role: SRE (Senior Site Reliability Engineer). Job Location: Pune. About Synechron: We began life in 2001 as a small, self-funded team of technology specialists. Since then, we’ve grown our organization to 14,500+ people, across 58 offices, in 21...