
Associate Manager SRE
12 hours ago
We are seeking a self-driven, inquisitive Site Reliability Engineer (SRE) to drive reliability, availability, performance, and security across our global digital product ecosystem. This role is central to ensuring a seamless and resilient experience for our users by blending deep engineering expertise with operational excellence and automation.
You will be part of a global SRE practice supporting a portfolio of 260+ modern cloud-native applications across consumer, commercial, supply chain, and enablement functions. Your mission: prevent incidents before they occur, ensure rapid recovery when they do, and build scalable systems that evolve with our growing business.
Responsibilities
- Champion reliability, observability, and operational excellence across mission-critical applications.
- Develop and maintain service-level indicators (SLIs), objectives (SLOs), and error budgets to measure and improve system performance.
- Implement automated monitoring, alerting, and recovery mechanisms to reduce manual intervention and improve response times.
- Collaborate closely with software engineering, platform, and operations teams to embed SRE practices across the development lifecycle.
- Lead and participate in incident response, root cause analysis, and postmortem reviews to drive long-term improvements.
- Identify and eliminate sources of toil through automation, tooling, and process refinement.
- Continuously improve resiliency design, capacity planning, and release management in production systems.
- Influence engineering teams with best practices on cloud-native architecture, observability, and deployment strategies.
Qualifications
Required Skills:
- 5+ years of experience in production engineering, DevOps, or SRE roles.
- Strong foundation in Linux systems, networking, and cloud platforms (Azure, AWS, or GCP).
- Hands-on experience with observability tools (e.g., AppDynamics, Prometheus, Grafana, ELK, FullStory).
- Proficiency in scripting or programming (e.g., Python, Bash, Go) and automation frameworks (e.g., Ansible, Terraform).
- Deep understanding of CI/CD pipelines, release strategies, and deployment automation.
- Experience in managing high-scale, distributed systems in cloud-native environments.
- Strong analytical skills and a passion for continuous improvement.
Preferred Skills:
- Familiarity with microservices, Kubernetes, containers, and service mesh architecture.
- Exposure to incident and problem management frameworks (e.g., ITIL, RCA practices).
- Experience working in global teams supporting mission-critical applications.