Site Reliability Manager

2 days ago


Gurgaon, India dunnhumby Full time

dunnhumby, headquartered in London, employs nearly 2,500 experts in offices throughout Europe, Asia, Africa, and the Americas, working for transformative, iconic brands such as Tesco, Coca-Cola, Meijer, Procter & Gamble and Metro.

dunnhumby is the global leader in Customer Data Science, empowering businesses everywhere to compete and thrive in the modern data-driven economy. We always put the Customer First. Our mission: to enable businesses to grow and reimagine themselves by becoming advocates and champions for their Customers. With deep heritage and expertise in retail – one of the world’s most competitive markets, with a deluge of multi-dimensional data – dunnhumby today enables businesses all over the world, across industries, to be Customer First. dunnhumby helps retailers and brands deliver better experiences through Customer First strategies.

We are seeking a talented Service Experience Manager with 8+ years in service management, including systems monitoring, site reliability engineering, or infrastructure operations, and 2+ years in a team lead or managerial role.

Coaches team members to deepen both technical monitoring skills and business understanding. Builds a culture of continuous improvement, automation, and proactive detection. Balances hands-on work with strategic oversight. Experience in a 24/7 operational environment, ideally in Media, SaaS, or streaming platforms.

Owns the end-to-end service management strategy for Media systems. Accountable for defining, maintaining, and evolving service management standards across all Media platforms. Ensures monitoring coverage aligns with business-critical services and revenue-driving workflows. Acts as escalation point for major monitoring-related incidents.

Experience with observability stacks (e.g. Grafana, Prometheus, Splunk, New Relic). Solid understanding of APIs, event pipelines, and log management. Familiarity with cloud environments (GCP or Azure) and their native monitoring tools.
Experience integrating monitoring with ITSM tools (ServiceNow, Zendesk, etc.). Define alerting thresholds that balance sensitivity with noise reduction. Partner with system owners to ensure monitoring reflects SLAs and KPIs. Lead post-incident reviews to improve detection and alert quality.

Work closely with engineering, operations, and product teams to align service management priorities, from monitoring to incident and change management. Communicate system health and incident impact to both technical and non-technical stakeholders. Translate complex technical issues into clear business implications.

Contribute to the observability roadmap, driving automation and the adoption of AIOps/ML for predictive monitoring. Evaluate new tools and methodologies to improve system reliability, visibility and usability.

What you can expect from us

We won’t just meet your expectations. We’ll defy them. So you’ll enjoy the comprehensive rewards package you’d expect from a leading technology company. But also, a degree of personal flexibility you might not expect. Plus, thoughtful perks, like flexible working hours and your birthday off.

You’ll also benefit from an investment in cutting-edge technology that reflects our global ambition. But with a nimble, small-business feel that gives you the freedom to play, experiment and learn.

And we don’t just talk about diversity and inclusion. We live it every day – with thriving networks including dh Gender Equality Network, dh Proud, dh Family, dh One and dh Thrive as the living proof. Everyone’s invited.

Our approach to Flexible Working

At dunnhumby, we value and respect difference and are committed to building an inclusive culture by creating an environment where you can balance a successful career with your commitments and interests outside of work. We believe that you will do your best at work if you have a work/life balance. Some roles lend themselves to flexible options more than others, so if this is important to you, please raise it with your recruiter, as we are open to discussing agile working opportunities during the hiring process.

For further information about how we collect and use your personal information, please see our Privacy Notice.



  • Gurgaon, India Cvent Full time

    Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure stability, reliability and performance and rapid deployments of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems;...


  • Gurgaon, India People Hire Consulting Full time

    Looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure stability, reliability and performance and rapid deployments of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems; moreover, have...

  • Site Reliability

    4 days ago


    Gurgaon, Haryana, India Weekday Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    This role is for one of our clients. Company Name: Neemtree. Industry: Technology, Information and Media. Seniority level: Mid-Senior level. Min Experience: 4 years. Location: Gurugram, Delhi, NCR. Job Type: full-time. We're looking for a Site Reliability & Automation Engineer who thrives at the intersection of infrastructure, automation, and reliability. In this role,...


  • Gurgaon, India People Hire Consulting Full time

    Looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure stability, reliability and performance and rapid deployments of our platform. As Manager, SRE you will demonstrate both emerging and current technologies, methods, and processes contributing to the evolution of software deployment processes, enhancing security,...


  • Gurgaon, India dunnhumby Full time

    dunnhumby, headquartered in London, employs nearly 2,500 experts in offices throughout Europe, Asia, Africa, and the Americas, working for transformative, iconic brands such as Tesco, Coca-Cola, Meijer, Procter & Gamble and Metro. dunnhumby is the global leader in Customer Data Science, empowering businesses everywhere to compete and thrive...


  • Gurgaon, India beBeeEngineering Full time

    We are seeking an experienced professional to fill a key role in our team. As Manager, Site Reliability Engineering, you will be responsible for ensuring the stability and performance of our systems. Job Description: The ideal candidate will have a strong technical background and experience leading site reliability engineering teams. You will be responsible for...


  • Gurgaon, Haryana, India Datum Technologies Group Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Job Details: Job Title: Site Reliability Engineer (SRE) With Azure & AI. Duration: Contract Position (On the Payroll of Datum Technology Group). Location: Chennai || Mumbai || Gurugram. Interview Process: Virtual (2 Rounds) + 1 Technical screening. Job Description: We are seeking a skilled and collaborative Site Reliability Engineer (SRE) with deep expertise in Azure...