GCP Site Reliability Engineer, Staff

5 days ago


Hyderabad, Telangana, India Warner Bros. Discovery Full time ₹ 10,00,000 - ₹ 25,00,000 per year

Welcome to Warner Bros. Discovery… the stuff dreams are made of.

Who We Are…

When we say, "the stuff dreams are made of," we're not just referring to the world of wizards, dragons and superheroes, or even to the wonders of Planet Earth. Behind WBD's vast portfolio of iconic content and beloved brands are the storytellers bringing our characters to life, the creators bringing them to your living rooms and the dreamers creating what's next…

From brilliant creatives, to technology trailblazers, across the globe, WBD offers career defining opportunities, thoughtfully curated benefits, and the tools to explore and grow into your best selves. Here you are supported, here you are celebrated, here you can thrive.

Your New Role:

We are seeking a highly skilled Lead Google Cloud Platform (GCP) Site Reliability Engineer (SRE) to join the Global Infrastructure Cloud Technologies (GICT) team to ensure the reliability, availability, scalability, and security of our cloud infrastructure and services.

The ideal candidate will bring expertise in GCP, automation, monitoring, and incident management to drive operational excellence.

The role serves as a technical leader across our Hyderabad Cloud Team, supporting hundreds of applications, websites and services in the fleet of Warner Bros. Discovery (WBD) cloud accounts.

The selected individual will help craft management and governance strategies, and work to unify processes with other cloud providers. As a team player, the GCP Lead SRE will collaborate with other SRE Leads, the rest of the cloud engineering team, software developers and management to build and manage highly resilient and performant infrastructure.

This individual will have a strong background in Linux and Windows Systems Engineering. Proficiency in Terraform and related Infrastructure-as-Code (IaC) is required. The role also calls for experience with the software development lifecycle, fluency in distributed computing techniques and technologies, and demonstrated experience managing enterprise-scale infrastructure and tooling. Direct, hands-on experience writing software is ideal.

This position reports to the Sr. Manager of Cloud Engineering.

Your Role Accountabilities:

Key Responsibilities

  • Primarily accountable for managing GCP environments

  • Identify, optimize and eliminate performance bottlenecks and proactively remediate security concerns through monitoring, profiling, and tuning.

  • Establish and improve SLOs, SLIs, and error budgets to drive system reliability.

  • Collaborate with stakeholders, including application developers, to improve application observability and optimize performance.

  • Lead and mentor a team of engineers working to reduce toil across the total team load, and to implement security features, roles, user access and privileges according to best practices.

  • Proactively identify, design, and implement process and architectural improvements.

  • Stay informed on the latest features and best practices across the GCP Public Cloud and the WBD GCP environment.

  • Work with a peer group of complementary public cloud leads (Azure/AWS) to facilitate consistency in WBD's management of resources wherever possible.

Methodology

  • Automate deployment, monitoring, and self-healing capabilities to improve operational efficiency.

  • Develop and manage infrastructure using Terraform and other IaC tools.

  • Drive incident response efforts, conduct root cause analyses (RCA), and implement preventative measures to minimize downtime.

  • Build and enhance monitoring, alerting, and observability systems to proactively resolve incidents before they impact users. Evangelize telemetry and metrics-driven application development.

  • Improve on-call processes and reduce toil by automating repetitive tasks.

  • Contribute to the software development of cloud management tooling and support applications.

  • Develop detailed technical documentation, including runbooks, troubleshooting guides, and system diagrams.

Continuous Improvement

  • Work with stakeholders to ensure systems meet security baselines, best practices, compliance requirements and resiliency standards.

  • Implement effective backup strategies and conduct regular disaster recovery testing.

  • Implement robust access controls, secrets management, and security monitoring solutions.

  • Collaborate with security teams to manage vulnerabilities and respond to threats.

  • Engage with our FinOps/CostOps team to optimize cloud costs by implementing efficient resource utilization and right-sizing strategies.

  • Work closely with development, infrastructure, and security teams to drive best practices and improvements.

  • Mentor junior engineers and contribute to a culture of continuous learning and improvement.

  • Participate in architectural discussions and provide guidance on reliability and scalability considerations.

Qualifications & Experience

  • 8+ years of prior experience in Site Reliability Engineering, DevOps, Cloud Infrastructure or related fields.

  • Expert in Google Cloud Platform.

  • Strong experience in Linux/Unix administration, networking, and distributed systems.

  • Fluency in two or more programming languages (Python, Golang, JavaScript, PowerShell, etc.).

  • Extensive hands-on experience in container orchestration technologies, such as GKE, Kubernetes, Docker.

  • Deep knowledge of monitoring, logging and observability tools (Prometheus, Grafana, ELK, Splunk, etc.).

  • Hands-on experience with Infrastructure-as-Code (IaC) using Terraform and Google Cloud Deployment Manager (GDM) templates.

  • Strong background in CI/CD pipelines, GitOps, and infrastructure automation (Terraform, Helm, Ansible or Chef).

Soft Skills

  • Strong problem-solving, troubleshooting, and debugging skills.

  • Excellent written and verbal communication and collaboration abilities.

  • English language fluency required.

  • Ability to handle multiple assignments concurrently.

  • Passion for automation, reliability, and continuous improvement.

  • Move quickly and intelligently - seeing technical debt as your nemesis.

  • Ability to solve problems independently while knowing when to request assistance.

Not Required but preferred experience

  • Experience with other cloud providers such as AWS, Azure, Oracle, etc.

  • Knowledge of and passion for media, entertainment, and technology industries (including key players, growth trends and drivers, new media models, industry structure, etc.)

  • Familiarity with streaming and similar products/services.

  • Experience working in a national or global company.

  • Comfortable working in a highly iterative and somewhat unstructured environment.

How We Get Things Done…

This last bit is probably the most important. Here at WBD, our guiding principles are the core values by which we operate and are central to how we get things done. You can find them at

along with some insights from the team on what they mean and how they show up in their day to day. We hope they resonate with you and look forward to discussing them during your interview.

Championing Inclusion at WBD

Warner Bros. Discovery embraces the opportunity to build a workforce that reflects a wide array of perspectives, backgrounds and experiences. Being an equal opportunity employer means that we take seriously our responsibility to consider qualified candidates on the basis of merit, regardless of sex, gender identity, ethnicity, age, sexual orientation, religion or belief, marital status, pregnancy, parenthood, disability or any other category protected by law.

If you're a qualified candidate with a disability and you require adjustments or accommodations during the job application and/or recruitment process, please visit our accessibility page for instructions to submit your request.


