Site Reliability Engineer
3 weeks ago
Job Description

We are seeking a seasoned Site Reliability Engineer (SRE) with a solid background in payment systems and high-availability architectures. The ideal candidate will have hands-on experience managing large-scale distributed systems in production, with a deep understanding of reliability, scalability, and performance tuning in the financial services or payments industry.

Key Responsibilities
- Design, build, and maintain scalable, resilient, and secure infrastructure for high-volume payment platforms.
- Ensure system uptime, reliability, and performance through effective monitoring, alerting, and incident response strategies.
- Collaborate with software engineering and DevOps teams to implement CI/CD pipelines and improve deployment efficiency.
- Automate infrastructure management tasks using Infrastructure-as-Code (IaC) tools (Terraform, Ansible, etc.).
- Proactively identify and mitigate system bottlenecks, failures, and potential points of failure.
- Manage disaster recovery strategies, failover planning, and performance testing for critical payment services.
- Work with development teams to ensure services are designed for reliability, scalability, and observability from the ground up.
- Participate in root cause analysis and post-incident reviews to prevent future outages.

Required Skills & Experience
- 8+ years of overall experience in infrastructure engineering or SRE roles, including at least 3 years in the payments/fintech domain.
- Strong understanding of payment protocols (UPI, IMPS, RTGS, NEFT, SWIFT, etc.) and transaction processing systems.
- Proven expertise in Linux systems administration, cloud platforms (AWS, GCP, or Azure), and container orchestration (Kubernetes).
- Solid experience with monitoring/logging tools such as Prometheus, Grafana, ELK Stack, and Splunk.
- Proficiency in one or more scripting languages (Python, Shell, Go, etc.) for automation.
- Experience with incident management, SLAs, and system troubleshooting in high-pressure environments.
- Familiarity with security and compliance practices in the financial sector (e.g., PCI-DSS, ISO 27001).

Preferred Qualifications
- Previous experience supporting mission-critical applications in banking or financial services.
- Exposure to Kafka, Redis, or other real-time streaming and caching technologies.
- Experience with Site Reliability Engineering principles and implementing SLOs/SLIs.
- Understanding of the error budget concept and how it ties into availability and release decisions.
- Experience with a performance testing tool such as k6, JMeter, or LoadRunner.
- Familiarity with mocking tools such as Mockito, WireMock, or Microcks.
-
Manager, Site Reliability Engineering
2 days ago
Gurugram, India Cvent Full time
Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems;...
-
Manager, Site Reliability Engineering
1 day ago
Gurugram, India Cvent Full time
Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems;...
-
Manager, Site Reliability Engineering
21 hours ago
Gurugram, India Cvent Full time
Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems;...
-
Manager, Site Reliability Engineering
11 hours ago
Gurugram, India Cvent Full time
Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems;...
-
Manager, Site Reliability Engineering
13 hours ago
Gurugram, India Cvent Full time
Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems;...
-
Manager, Site Reliability Engineering
22 hours ago
Gurugram, India Cvent Full time
Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems;...
-
Manager, Site Reliability Engineering
15 hours ago
Gurugram, India Cvent Full time
Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems;...
-
Site Reliability Engineer
12 hours ago
Gurugram, India S&P Global Full time
This job is with S&P Global, an inclusive employer and a member of myGwork – the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly. About the Role: OSTTRA India. The Role: Site Reliability Engineer. The Team: SRE is a global team that provides technical support across the suite of OSTTRA products. The SRE...
-
Site Reliability Engineer
2 weeks ago
Gurugram, Noida, India S&P Global Market Intelligence Full time ₹ 1,20,000 - ₹ 3,00,000 per year
Position Summary: We are seeking a proactive and innovative Site Reliability Engineer to join our growing team. In this role, you will be a key player in ensuring the reliability, scalability, and performance of our critical systems. You will move beyond traditional monitoring to implement advanced observability, leverage AIOps for predictive insights, and...