Site Reliability Engineer

1 week ago

Chennai, Tamil Nadu, India Workday Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Your work days are brighter here.

We're obsessed with making hard work pay off, for our people, our customers, and the world around us. As a Fortune 500 company and a leading AI platform for managing people, money, and agents, we're shaping the future of work so teams can reach their potential and focus on what matters most. The minute you join, you'll feel it. Not just in the products we build, but in how we show up for each other. Our culture is rooted in integrity, empathy, and shared enthusiasm. We're in this together, tackling big challenges with bold ideas and genuine care. We look for curious minds and courageous collaborators who bring sun-drenched optimism and drive. Whether you're building smarter solutions, supporting customers, or creating a space where everyone belongs, you'll do meaningful work with Workmates who've got your back. In return, we'll give you the trust to take risks, the tools to grow, the skills to develop and the support of a company invested in you for the long haul. So, if you want to inspire a brighter work day for everyone, including yourself, you've found a match in Workday, and we hope to be a match for you too.

About the Team

The Data Platform and Observability team is based in Pleasanton, CA; Boston, MA; Atlanta, GA; Dublin, Ireland; and Chennai, India. Our focus is on developing large-scale distributed data systems that support critical Workday products and provide real-time insights across Workday's platforms, infrastructure, and applications.

The team provides platforms that process hundreds of terabytes of data and enable core Workday products and use cases such as core HCM, Fins, AI/ML SKUs, internal data products, and Observability. If you enjoy writing efficient software or tuning and scaling large distributed systems, you will enjoy working with us.

Do you want to tackle exciting challenges at massive scale across private and public clouds for our global customers? Do you want to work with world-class engineers and help develop the next generation of distributed systems platforms? If so, we should chat.

About the Role

The Messaging, Streaming and Caching team is a full-service Distributed Systems Engineering team. We architect and provide async messaging, streaming, and NoSQL platforms and solutions that power Workday products and SKUs, including core HCM, Fins, Integrations, and AI/ML. We develop client libraries and SDKs that make it easy for teams to build Workday products. We develop automation to deploy and run hundreds of clusters, and we operate and tune those clusters ourselves. As a team member, you will play a key role in improving our services and encouraging their adoption within Workday's infrastructure, in both our private cloud and the public cloud. You will design and build new capabilities from inception to deployment to exploit the full power of the core middleware infrastructure and services, and you will work hand in hand with our application and service teams.

Primary Responsibilities

Design, build, and enhance critical distributed services, including Kafka, Redis, and RabbitMQ.
Design, develop, build, deploy and maintain core distributed services using a combination of open source and proprietary stacks across diverse infrastructure environments (Kubernetes, OpenStack, Bare Metal, etc.)
Design and develop core software modules for streaming, messaging and caching.
Construct observability modules, alerts, and automation for dashboard lifecycle management for the distributed services.
Build, deploy and operate infrastructure components in production environments.
Champion all aspects of streaming, messaging and caching with a focus on resiliency and operational excellence.
Evaluate and implement new open-source and cloud-native tools and technologies as needed.
Participate in the on-call rotation to support the distributed systems platforms.
Manage and optimize Workday distributed services in AWS, GCP, and private cloud environments.

About You

Basic Qualifications

4-8 years of software engineering experience using one or more of the following: Java/Scala, Golang.
3+ years of distributed systems experience.
3+ years of development and DevOps experience in designing and operating large-scale deployments of distributed NoSQL & messaging systems.
1+ year of leading a NoSQL-related product from conception through deployment and maintenance.

Preferred Qualifications

Expertise in developing distributed system software and deployments that perform well and degrade gracefully under excessive load.
Hands-on experience with at least one distributed systems technology such as Kafka/RabbitMQ, Redis, or Cassandra.
Experience learning complex open source service internals via code inspection.
Extensive experience with modern software development tools, including CI/CD, and methodologies like Agile.
Expertise with configuration management using Chef and service deployment on Kubernetes via Helm and ArgoCD.
Experience with Linux system internals and tuning.
Experience with distributed system performance analysis and optimization.
Strong written and oral communication skills and the ability to explain esoteric technical details clearly to engineers without a similar background.

Our Approach to Flexible Work

With Flex Work, we're combining the best of both worlds: in-person time and remote. Our approach enables our teams to deepen connections, maintain a strong community, and do their best work. We know that flexibility can take shape in many ways, so rather than a number of required days in-office each week, we simply spend at least half (50%) of our time each quarter in the office or in the field with our customers, prospects, and partners (depending on role). This means you'll have the freedom to create a flexible schedule that caters to your business, team, and personal needs, while being intentional to make the most of time spent together. Those in our remote "home office" roles also have the opportunity to come together in our offices for important moments that matter.

Are you being referred to one of our roles? If so, ask your connection at Workday about our Employee Referral process.

At Workday, we value our candidates' privacy and data security. Workday will never ask candidates to apply to jobs through websites that are not Workday Careers.

Please be wary of sites that may ask you to enter your data in connection with a job posting that appears to be from Workday but is not.

In addition, Workday will never ask candidates to pay a recruiting fee, or pay for consulting or coaching services, in order to apply for a job at Workday.

Site Reliability Engineer

2 weeks ago

Chennai, Tamil Nadu, India Grootan Technologies Full time ₹ 12,00,000 - ₹ 36,00,000 per year

About the Role: We are seeking a skilled Site Reliability Engineer (SRE) with 4–5 years of hands-on experience to join our engineering team. In this role, you will be responsible for building and maintaining reliable, scalable, and secure infrastructure to support our applications. You will leverage your expertise in automation, cloud platforms, and monitoring...
Site Reliability Engineer

4 days ago

Chennai, Tamil Nadu, India GSR Business Services Full time ₹ 6,00,000 - ₹ 12,00,000 per year

Dear Aspirants, Urgent Hiring: Site Reliability Engineer, 3-5 Years, Chennai. Role Summary: Supports the reliability and performance of systems and infrastructure. Assists in monitoring, troubleshooting, and automating tasks to maintain high-availability environments. Key Responsibilities: Assist in managing VMware and Linux servers. Monitor system health and respond to...
Site Reliability Engineer

2 weeks ago

Chennai, Tamil Nadu, India Ford Motor Company Full time ₹ 8,00,000 - ₹ 24,00,000 per year

Job Description: Ford is seeking an experienced Site Reliability Engineer (SRE) to join our team and lead the development, enhancement, and extension of our global monitoring and observability platform. Enterprise Technology plays a critical part in shaping the future of mobility. If you're looking for the chance to leverage advanced technology...
Site Reliability Engineer

6 days ago

Chennai, Tamil Nadu, India Flex Full time ₹ 8,00,000 - ₹ 24,00,000 per year

Experience: 3.5 to 7 years. Location: Chennai. Work mode: Hybrid. Role Overview: As a Site Reliability Engineer (SRE) on the Factory Applications team, you will help maintain and scale "Brix" - a cloud-native, containerized, microservices-based platform used to build global shop floor systems. Your focus will be on automation, reliability, and performance. Key...
Site Reliability Engineer, AVP

3 days ago

Chennai, Tamil Nadu, India NatWest Group Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Join us as a Site Reliability Engineer. You'll manage the provision of stable, resilient, reliable applications with the end goal of minimising disruption to Customer & Colleague Journeys (CCJ). We'll look to you to identify and automate manual tasks and implement observability solutions, ensuring a thorough understanding of CCJ across applications. This is a...
AWS Site Reliability Engineer

2 days ago

Chennai, Tamil Nadu, India HTC Global Services Full time ₹ 12,00,000 - ₹ 36,00,000 per year

HTC – A brief profile: Established in 1990, HTC Inc., a company with headquarters in Troy, Michigan, is a leading global Information Technology solution and BPO provider. HTC assists clients across multiple industry verticals, offering turnkey project lifecycle in e-business, data warehousing, embedded systems, ECM, SCM, CRM, and ERP solutions. HTC Inc....
Site Reliability Engineer

2 weeks ago

Chennai, Tamil Nadu, India Barclays Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Join us as a Site Reliability Engineer - Containers at Barclays, responsible for supporting the successful delivery of Location Strategy projects to plan, budget, agreed quality and governance standards. You'll spearhead the evolution of our digital landscape, driving innovation and excellence. You will harness cutting-edge technology to revolutionise our...
Site Reliability Engineer

2 weeks ago

Chennai, Tamil Nadu, India Zoho Full time ₹ 4,00,000 - ₹ 6,00,000 per year

Zoho is one of the world's most prolific software companies. With 55+ applications in nearly every major business category, including sales, marketing, customer service, accounting and back office operations, and an array of productivity and collaboration tools built from the ground up, Zoho has the depth and breadth to solve even the most complex business...
Site Reliability Engineer

6 days ago

Chennai, Tamil Nadu, India Barclays Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Join Barclays as a Site Reliability Engineer - Container Platform role, where you will report into the Application Platforms Engineering Lead, playing a key role in building the products, services, software, APIs, and infrastructure that will be central to this new strategy, ensuring we have a world-class product set which is simplified and provides long...
Site Reliability Engineer

4 days ago

Chennai, Tamil Nadu, India Trimble Inc. Full time ₹ 5,00,000 - ₹ 15,00,000 per year

Job Summary: We are seeking a motivated Site Reliability Engineer (SRE) Level 1 / Level 2 to enhance the infrastructure and operational reliability of our ERP product, specifically within Azure and Windows environments. The ideal candidate will utilize SRE principles to ensure high system availability, stability, and performance while collaborating closely...
