Site Reliability Engineer

2 days ago


Bengaluru, India Epsilon Full time

About Business Unit:

At the core of all that Epsilon does is a team that sets the foundation of our IT infrastructure. The team drives innovation and efficiency through disruptive technology across Epsilon's platforms and business verticals. From being the first point of contact for infrastructure needs to final deployment, the team provides end-to-end solutions for our client-facing platforms. ETS supports all aspects of revenue-generating platforms for Epsilon and sets the architectural direction for our enterprise deployments. By embracing the latest technologies, such as Cloud, Automation, and Artificial Intelligence, the team is at the forefront of transforming our digital business and capturing new opportunities.

Why we are looking for you:

1. You have experience building world-class automation with a Site Reliability mindset for cloud and on-premises infrastructure
2. You take a shift-left approach and have strong cloud-native experience
3. You have solid experience building products and platforms at scale
4. You enjoy new challenges and are solution-oriented in complex infrastructure environments
5. You like mentoring people and enabling collaboration of the highest order

What you will enjoy in this role:

As part of the Epsilon CPTS team, you will work at a pace that matches the fast-evolving demands of Fortune 500 clients across the globe.

1. As part of an innovative team that’s not afraid to do things differently, you will see your ideas come to life in building next-gen infrastructure that supports our Fortune 500 global customers
2. You will implement a shift-left approach in our infrastructure life cycle practices
3. You will work in an open and transparent environment that values innovation and efficiency


Responsibilities

1. Lead SRE initiatives across a hybrid infrastructure (on-prem + AWS, Azure, GCP)
2. Manage and optimize 11,000+ servers across Linux and Windows platforms
3. Architect and support scalable, resilient AWS infrastructure (EKS, EC2, S3, RDS, Lambda, etc.)
4. Administer Kubernetes clusters at scale; ensure health, upgrades, and secure deployments
5. Drive infrastructure automation using Python, Shell, and Infrastructure as Code (Terraform, Ansible, Chef)
6. Design and implement AI agents for observability, RCA, and incident triage using modern MLOps/DevOps paradigms
7. Collaborate with development, IT Ops, Command Center, cloud, and platform teams to strengthen CI/CD, security posture, and SLA alignment

Qualifications

BE/B.Tech – No correspondence course

1. 14+ years of experience in Platform/Cloud Engineering, SRE, and DevOps
2. Strong hands-on coding experience in Go, Python, and Shell
3. Strong expertise in Cloud, Kubernetes, and Linux Administration
4. Hands-on experience with AWS services and Kubernetes
5. Proficiency in IaC tools like Terraform, Crossplane, and Ansible
6. Extensive experience delivering an efficient developer experience
7. Familiarity with monitoring tools (Zabbix, PagerDuty, Grafana)

Additional Information

Epsilon is a global data, technology and services company that powers the marketing and advertising ecosystem. For decades, we’ve provided marketers from the world’s leading brands the data, technology and services they need to engage consumers with 1 View, 1 Vision and 1 Voice. 1 View of their universe of potential buyers. 1 Vision for engaging each individual. And 1 Voice to harmonize engagement across paid, owned and earned channels.

Epsilon’s comprehensive portfolio of capabilities across our suite of digital media, messaging and loyalty solutions bridges the divide between marketing and advertising technology. We process 400+ billion consumer actions each day using advanced AI and hold many patents for proprietary technology, including real-time modeling languages and consumer privacy advancements. Thanks to the work of every employee, Epsilon has been consistently recognized as industry-leading by Forrester, Adweek and the MRC. Epsilon is a global company with more than 9,000 employees around the world.

Epsilon has a core set of 5 values that define our culture and guide us to bring value for our clients, our people and consumers. We are seeking candidates who align with our values, demonstrate them and make them meaningful in their day-to-day work:

- Act with integrity. We are transparent and have the courage to do the right thing.
- Work together to win together. We believe collaboration is the catalyst that unlocks our full potential.
- Innovate with purpose. We shape the market with big ideas that drive big outcomes.
- Respect all voices. We embrace differences and foster a culture of connection and belonging.
- Empower with accountability. We trust each other to own and deliver on common goals.

Because You Matter

YOUniverse. A work-world with you at the heart of it

At Epsilon, we believe people make the place. And everything we do is designed with you in mind. That’s why our work-world, aptly named ‘YOUniverse’, is focused on creating a nurturing environment that elevates your growth, wellbeing and work-life harmony. So, come be part of a people-centric workspace where care for you is at the core of all we do.


Epsilon is an Equal Opportunity Employer.

Epsilon is committed to promoting diversity, inclusion, and equal employment opportunities by using reasonable efforts to attract, recruit, engage and retain qualified individuals of all ethnicities and backgrounds, including, but not limited to, women, people of color, LGBTQ individuals, people with disabilities and any other underrepresented groups, traits or characteristics.


