Site Reliability Engineering Manager
2 days ago
About Business Unit: SaaSOps leads post-production support and the overall experience of Epsilon PeopleCloud products for our global clients. This function is responsible for product support, incident management, managed operations and the automation of processes. The team has successfully incubated and mainstreamed Site Reliability Engineering (SRE) as a practice to ensure reliable product operations on a global scale. The team is also actively leading the adoption of AI in operations (AIOps) and recently launched AI-driven self-service capabilities to enhance operational efficiency and improve client experiences.

About the Role
This is a senior IC role responsible for driving strong operations engineering practices in SaaS product operations.
The role will drive incident triage practices, implement effective monitoring and observability tools, and help build SRE competence in the team.
The role will work closely with the product operations team to deep-dive into production issues, identify their root causes, and work with the concerned teams on permanent fixes for recurring issues.
The role will identify automation opportunities to streamline repetitive tasks.
The role will contribute to the evolution of the AIOps strategy by identifying use cases and developing AI / agentic autonomous solutions.

What you’ll need
15+ years of hands-on experience in SRE.
The candidate will be a hands-on technology leader with proven experience working as an SRE leader in a SaaS product setup.
The candidate should have a deep understanding of monitoring tools (New Relic, Prometheus) and observability practices.
Prior experience working with ServiceNow, JIRA, Bitbucket and Confluence is required.
The candidate should be proficient at designing effective Ops dashboards, especially for peak traffic events in a SaaS environment.
The candidate should have prior experience handling communications with leadership across an organization for peak traffic events.
The ideal candidate should have a strong full-stack engineering background with Cloud Engineering, L1-L3 Operations and AI / Gen AI experience.
Must have strong development skills in at least two of Python, Java and C#; strong DB skills (RDBMS, NoSQL, cloud DBs); and experience with containers / orchestration and cloud infrastructure.
Highly proficient in at least one hyperscaler cloud (AWS, GCP, Azure).
Demonstrated real-world experience deploying traditional ML and Gen AI use cases in production.
The candidate should have worked closely with Engineering and Operations teams and must have strong DevOps, incident management, release management and change management experience.
Prior experience with at least one AIOps solution preferred.
Must have proven skills in collaboration and getting things done.
ITIL certification and experience working in an ITIL environment will be a plus.

Epsilon is a global data, technology and services company that powers the marketing and advertising ecosystem. For decades, we’ve provided marketers from the world’s leading brands the data, technology and services they need to engage consumers with 1 View, 1 Vision and 1 Voice. 1 View of their universe of potential buyers. 1 Vision for engaging each individual. And 1 Voice to harmonize engagement across paid, owned and earned channels. Epsilon’s comprehensive portfolio of capabilities across our suite of digital media, messaging and loyalty solutions bridges the divide between marketing and advertising technology. We process 400+ billion consumer actions every single day using advanced AI and hold many patents for proprietary technology, including real-time modeling languages and consumer privacy advancements. Thanks to the work of every employee, Epsilon has been consistently recognized as industry-leading by Forrester, Adweek and the MRC. Epsilon is a global company with more than 9,000 employees around the world.
Additional Information
Epsilon has a core set of 5 values that define our culture and guide us to bring value for our clients, our people and consumers. We are seeking candidates who align with our values, demonstrate them and make them meaningful in their day-to-day work:
Act with integrity. We are transparent and have the courage to do the right thing.
Work together to win together. We believe collaboration is the catalyst that unlocks our full potential.
Innovate with purpose. We shape the market with big ideas that drive big outcomes.
Respect all voices. We embrace differences and foster a culture of connection and belonging.
Empower with accountability. We trust each other to own and deliver on common goals.

Because You Matter
YOUniverse. A work-world with you at the heart of it.
At Epsilon, we believe people make the place. And everything we do is designed with you in mind. That’s why our work-world, aptly named ‘YOUniverse’, is passionate about crafting a nurturing environment that elevates your growth, wellbeing and work-life harmony. So, come be part of a people-centric workspace where care for you is at the core of all we do. Take a trip to YOUniverse and explore our outstanding benefits.

Epsilon is an Equal Opportunity Employer. Epsilon is committed to promoting diversity, inclusion, and equal employment opportunities by using reasonable efforts to attract, recruit, engage and retain qualified individuals of all ethnicities and backgrounds, including, but not limited to, women, people of color, LGBTQ individuals, people with disabilities and any other underrepresented groups, traits or characteristics.
-
Site Reliability Engineer
1 week ago
bangalore district, India IntraEdge Full time
Job Title: Site Reliability Engineer (SRE) – Production Support Location: Bengaluru Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong experience in production support, DevOps practices, and cloud infrastructure management. The ideal candidate will be responsible for maintaining the reliability, performance, and...
-
Site Reliability Engineer
2 weeks ago
bangalore district, India Synechron Full time
Good day, we have an immediate opportunity for a Senior Site Reliability Engineer. Job Role: Senior Site Reliability Engineer Job Location: Synechron (Bengaluru) Experience: 10 to 15 years Notice: Immediate Joiner About Company: At Synechron, we believe in the power of digital to transform businesses for the better. Our global consulting firm combines...
-
Site Reliability Engineering Manager
4 days ago
bangalore, India Tata Consultancy Services Full time
Role: Manager, Site Reliability Engineering. Required Technical Skill Set: Manager, Site Reliability Engineering. Desired Experience Range: 12 - 18 yrs. Notice Period: Immediate to 90 Days only. Location of Requirement: Bangalore. We are currently planning to do a Virtual Interview. Job Description: Describe what the person will do in the role - how he/she will impact...
-
Manager, Site Reliability Engineering
2 days ago
gurgaon district, India Cvent Full time
Cvent is looking for a Manager, Site Reliability Engineering to help us scale our systems and ensure stability, reliability, performance and rapid deployments of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and track record for solving problems;...
-
Site Reliability Engineer
2 weeks ago
bangalore district, India Xebia Full time
Performance & Reliability Engineer (Senior, Lead, Principal & Manager) Hybrid Location: Pune, Chennai, Bangalore & Gurgaon Need immediate joiners only Job description Role: Performance & Reliability Engineer Job Location: Gurgaon, Chennai, Pune, Bangalore Hybrid Job Overview: We are seeking a highly skilled and motivated Performance & Reliability Engineer...
-
Senior Site Reliability Engineer
2 weeks ago
bangalore district, India Allegion Full time
Allegion India is seeking a highly motivated Senior Site Reliability Engineer who will play a critical role in ensuring the reliability, scalability, and performance of our organization's systems and infrastructure, and who will work with a team of cross-functional product development engineers to design, implement, and maintain highly available and resilient...
-
Site Reliability Engineer
1 week ago
IND - Karnataka - BANGALORE, India Globalfoundries Engineering Private Limited Full time ₹ 12,00,000 - ₹ 24,00,000 per year
Site Reliability Engineer About GlobalFoundries GlobalFoundries is a leading full-service semiconductor foundry providing a unique combination of design, development, and fabrication services to some of the world's most inspired technology companies. With a global manufacturing footprint spanning three continents, GlobalFoundries makes possible the...
-
Site Reliability Engineering Manager
4 days ago
bangalore, India CloudHire Full time
Job Summary The Technical Manager for Site Reliability Engineering (SRE) will lead a remote team of Site Reliability Engineers, ensuring operational excellence and fostering a high-performing team culture. Reporting to the US-based Director of Systems and Security, this role is responsible for overseeing day-to-day operations, technical mentorship, and...
-
Site Reliability Engineering Manager
3 days ago
bangalore, India CloudHire Full time
Job Summary The Technical Manager for Site Reliability Engineering (SRE) will lead a remote team of Site Reliability Engineers, ensuring operational excellence and fostering a high-performing team culture. Reporting to the US-based Director of Systems and Security, this role is responsible for overseeing day-to-day operations, technical mentorship, and...
-
Site Reliability Engineer
2 weeks ago
bangalore district, India Groww Full time
About Groww We are a passionate group of people focused on making financial services accessible to every Indian through a multi-product platform. Each day, we help millions of customers take charge of their financial journey. Customer obsession is in our DNA. Every product, every design, every algorithm down to the tiniest detail is executed keeping the...