Site Reliability Engineering Manager

2 weeks ago


Bengaluru, Karnataka, India | Epsilon | Full time

About Business Unit:


SaaSOps leads post-production support and the overall experience of Epsilon PeopleCloud products for our global clients. This function is responsible for product support, incident management, managed operations and process automation. The team has successfully incubated and mainstreamed Site Reliability Engineering (SRE) as a practice to ensure reliable product operations on a global scale. The team is also leading the adoption of AI in operations (AIOps) and recently launched AI-driven self-service capabilities to enhance operational efficiency and improve client experiences.




Responsibilities

  • This is a senior individual contributor (IC) role responsible for driving strong operations engineering practices in SaaS product operations.
  • Work closely with the engineering, delivery and operations teams to ensure streamlined release and change management processes.
  • Work closely with the product operations team to deep-dive into production issues, identify their root causes and work with the relevant teams on permanent fixes for recurring issues.
  • Identify automation opportunities to streamline repetitive tasks.
  • Contribute to the evolution of the AIOps strategy: identify use cases and develop AI / agentic autonomous solutions.


Qualifications

  • 15+ years of experience in SRE
  • A hands-on technology leader with proven experience working as an SRE leader in a product setup
  • Strong full-stack engineering background with Cloud and AI / Gen AI experience
  • Strong development skills in at least two of Python, Java and C#; strong DB skills (RDBMS, NoSQL, cloud DBs); containers / orchestration; cloud infrastructure
  • Highly proficient in at least one hyperscaler cloud (AWS, GCP, Azure)
  • Demonstrated real-world experience deploying traditional ML and Gen AI use cases in production
  • Experience working closely with Engineering and Operations teams; strong DevOps, release management and change management experience is a must
  • Experience in AIOps is an added advantage
  • Proven skills in collaboration and getting things done


Epsilon is a global data, technology and services company that powers the marketing and advertising ecosystem. For decades, we’ve provided marketers from the world’s leading brands the data, technology and services they need to engage consumers with 1 View, 1 Vision and 1 Voice. 1 View of their universe of potential buyers. 1 Vision for engaging each individual. And 1 Voice to harmonize engagement across paid, owned and earned channels.

Epsilon’s comprehensive portfolio of capabilities across our suite of digital media, messaging and loyalty solutions bridges the divide between marketing and advertising technology. We process 400+ billion consumer actions every single day using advanced AI and hold many patents on proprietary technology, including real-time modeling languages and consumer privacy advancements. Thanks to the work of every employee, Epsilon has been consistently recognized as industry-leading by Forrester, Adweek and the MRC. Epsilon is a global company with more than 9,000 employees around the world.


Additional Information

Epsilon has a core set of 5 values that define our culture and guide us to bring value for our clients, our people and consumers. We are seeking candidates who align with our values, demonstrate them and make them meaningful in their day-to-day work:

  • Act with integrity. We are transparent and have the courage to do the right thing.
  • Work together to win together. We believe collaboration is the catalyst that unlocks our full potential.
  • Innovate with purpose. We shape the market with big ideas that drive big outcomes.
  • Respect all voices. We embrace differences and foster a culture of connection and belonging.
  • Empower with accountability. We trust each other to own and deliver on common goals.


Because You Matter

YOUniverse. A work-world with you at the heart of it

At Epsilon, we believe people make the place, and everything we do is designed with you in mind. That’s why our work-world, aptly named ‘YOUniverse’, is passionate about crafting a nurturing environment that elevates your growth, wellbeing and work-life harmony. So come be part of a people-centric workspace where care for you is at the core of all we do.



Epsilon is an Equal Opportunity Employer.

Epsilon is committed to promoting diversity, inclusion, and equal employment opportunities by using reasonable efforts to attract, recruit, engage and retain qualified individuals of all ethnicities and backgrounds, including, but not limited to, women, people of color, LGBTQ individuals, people with disabilities and any other underrepresented groups, traits or characteristics.


