Principal Site Reliability Engineer

1 day ago


Hyderabad, Telangana, India Amgen Inc Full time ₹ 8,00,000 - ₹ 12,00,000 per year

We are looking for a Site Reliability Engineer/Cloud Engineer (SRE) to work on the performance optimization, standardization, and automation of Amgen's critical infrastructure and systems. This role is crucial to ensuring the reliability, scalability, and cost-effectiveness of our production systems. The ideal candidate will work on operational excellence through automation, incident response, and proactive performance tuning, while also reducing infrastructure costs. You will work closely with cross-functional teams to establish best practices for service availability, efficiency, and cost control.

Roles & Responsibilities:

  • System Reliability, Performance Optimization & Cost Reduction: Ensure the reliability, scalability, and performance of Amgen's infrastructure, platforms, and applications. Proactively identify and resolve performance bottlenecks and implement long-term fixes. Continuously evaluate system design and usage to identify opportunities for cost optimization, ensuring infrastructure efficiency without compromising reliability.
  • Automation & Infrastructure as Code (IaC): Drive the adoption of automation and Infrastructure as Code (IaC) across the organization to streamline operations, minimize manual interventions, and enhance scalability. Implement tools and frameworks (such as Terraform, Ansible, or Kubernetes) that increase efficiency and reduce infrastructure costs through optimized resource utilization.
  • Standardization of Processes & Tools: Establish standardized operational processes, tools, and frameworks across Amgen's technology stack to ensure consistency, maintainability, and best-in-class reliability practices. Champion the use of industry standards to optimize performance and increase operational efficiency.
  • Monitoring, Incident Management & Continuous Improvement: Implement and maintain comprehensive monitoring, alerting, and logging systems to detect issues early and ensure rapid incident response. Lead the incident management process to minimize downtime, conduct root cause analysis, and implement preventive measures to avoid future occurrences. Foster a culture of continuous improvement by leveraging data from incidents and performance monitoring.
  • Collaboration & Cross-Functional Leadership: Partner with software engineering and IT teams to integrate reliability, performance optimization, and cost-saving strategies throughout the development lifecycle. Act as an SME for SRE principles and advocate for best practices on assigned projects.
  • Capacity Planning & Disaster Recovery: Execute capacity planning processes to support future growth, performance, and cost management. Maintain disaster recovery strategies to ensure system reliability and minimize downtime in the event of failures.

Basic Qualifications:

  • Master's degree and 8 to 10 years of experience in IT infrastructure, Site Reliability Engineering, or a related field, OR
  • Bachelor's degree and 10 to 14 years of experience in IT infrastructure, Site Reliability Engineering, or a related field, OR
  • Diploma and 14 to 18 years of experience in IT infrastructure, Site Reliability Engineering, or a related field.

Must-Have Skills:

  • Extensive experience with AWS cloud services.
  • Proficiency in CI/CD (Jenkins/GitLab), observability, IaC, GitOps, etc.
  • Experience with containerization (Docker) and orchestration tools (Kubernetes) to optimize resource usage and improve scalability.
  • Ability to identify and specify SRE tasks.
  • Strong hands-on experience performing SRE tasks and automating them with Python or another scripting language.
  • Well versed in FinOps, Infra-Ops, and Platform Operations.
  • Ability to learn new technologies quickly. Strong problem-solving and analytical skills. Excellent communication and teamwork skills.
  • Leadership skills are mandatory: you will lead a team of 4 to 5 and guide them through technical blockers.

Good-to-Have Skills:

  • Knowledge of cloud-native technologies and strategies for cost optimization in multi-cloud environments.
  • Familiarity with distributed systems, databases, and large-scale system architectures.
  • Bachelor's degree in computer science and engineering preferred; other engineering fields are considered.
  • Databricks knowledge or exposure is good to have (upskilling is expected if hired).

Soft Skills:

  • Ability to foster a collaborative and innovative work environment.
  • Strong problem-solving abilities and attention to detail.
  • High degree of initiative and self-motivation.

We will ensure that individuals with disabilities are provided with reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request an accommodation.



  • Hyderabad, Telangana, India Cubic Transportation Systems Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Hiring Principal Site Reliability Engineer. Experience: 12+ Years. Location: Hyderabad. Notice: Immediate to 30 Days. We're seeking an experienced Site Reliability Engineer (SRE) to ensure our services are robust, scalable, secure, and maintainable. You will blend software engineering and systems operations to automate processes, monitor performance, lead incident...


  • Hyderabad, Telangana, India IntraEdge Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Site Reliability Engineer. Experience: 7+ Years. Location: Hyderabad. Skills for Principal: Strong leadership and people management skills. Exceptional technical proficiency in Pearson's technology stack. Advanced project management capabilities. Excellent communication and collaboration skills. Adept at risk assessment and crisis management. Strategic thinking with a...


  • Hyderabad, Telangana, India Cubic Transportation Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Hiring Principal Site Reliability Engineer. Experience: 12 to 18 Years. Location: Hyderabad. Notice Period: Immediate to 30 Days. Key Responsibilities: Design, deploy, and maintain scalable, secure applications and infrastructure in cloud or hybrid environments. Implement and manage robust monitoring, alerting, and observability systems. Automate recurrent operational...


  • Hyderabad, Telangana, India IntraEdge Full time

    Site Reliability Engineer. Experience: 7+ Years. Location: Hyderabad. Hybrid: 4 days in office and 1 day remote. Skills for Principal: Strong leadership and people management skills. Exceptional technical proficiency in Pearson's technology stack. Advanced project management capabilities. Excellent communication and collaboration skills. Adept at risk assessment and crisis...


  • Hyderabad, Telangana, India Oracle Financial Services Software Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Senior Principal Site Reliability Engineer, Fusion SRE About Oracle Cloud: Oracle Cloud is a comprehensive suite of cloud services—including infrastructure, platform, and applications—designed to help organizations build, deploy, and manage workloads securely at scale. At Oracle, we are building the most intelligent future of cloud computing. Our...


  • Hyderabad, Telangana, India ANSR Full time

    ANSR is hiring for one of its clients. About T-Mobile: T-Mobile US, Inc. (NASDAQ: TMUS), headquartered in Bellevue, Washington, is America's supercharged Un-carrier, connecting millions through its strong nationwide network and flagship brands, T-Mobile and Metro by T-Mobile. Customers benefit from an unmatched combination of value, quality, and exceptional...


  • Hyderabad, Telangana, India Acesoft Labs Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Hi, kindly find the JD below. Job Title: Site Reliability Engineering (SRE) Manager. Location: Hyderabad. Employment Type: Full-Time. Work Model: 3 days from office (Hybrid). Summary: The SRE Manager at TechBlocks India will lead the reliability engineering function, ensuring infrastructure resiliency and optimal operational performance. This hybrid role blends...


  • Hyderabad, Telangana, India TECHBLOCKS Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Job Title: Site Reliability Engineering (SRE) Manager. Location: Hyderabad. Employment Type: Full-Time. Work Model: 3 days from office (Hybrid). Summary: The SRE Manager at TechBlocks India will lead the reliability engineering function, ensuring infrastructure resiliency and optimal operational performance. This hybrid role blends technical leadership with team...


  • Hyderabad, Telangana, India Cubic Corporation Full time ₹ 15,00,000 - ₹ 20,00,000 per year

    Business Unit: Cubic Transportation Systems. Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...


  • Hyderabad, Telangana, India Cubic Corporation Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Business Unit: Cubic Transportation Systems. Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...