Site Reliability Engineering

3 days ago


Pune, Maharashtra, India | Amadeus | Full time | ₹ 12,00,000 - ₹ 36,00,000 per year

Job Title

Site Reliability Engineering (SRE) Manager – iHotelier

Role Overview

As an SRE Manager for iHotelier, you will lead a team responsible for ensuring the availability, scalability, and performance of mission-critical hospitality services. This role combines technical leadership, operational excellence, and strategic planning to deliver a seamless booking experience for thousands of hotels worldwide. You will define and enforce SRE best practices, drive automation, and partner with cross-functional teams to maintain reliability across iHotelier's complex ecosystem.

Key Responsibilities
  • Lead and mentor a global team of SREs, fostering a culture of reliability and continuous improvement.
  • Define and enforce SRE best practices, including error budgets, SLOs, and SLIs.
  • Drive automation initiatives to reduce toil and improve deployment velocity.
  • Oversee incident response, root cause analysis, and post-mortems for iHotelier services.
  • Manage on-call rotations and ensure effective escalation processes.
  • Implement observability frameworks (monitoring, logging, alerting) using Datadog, Grafana, Prometheus, and Splunk.
  • Own CI/CD pipelines and deployment strategies using ArgoCD, Jenkins, and Kubernetes.
  • Ensure compliance with security and privacy standards for hospitality data.
  • Optimize cloud infrastructure (Azure) for cost and performance.
  • Govern ArgoCD/Jenkins workflows, including PR and backout PR processes and prod1/prod1-pci branch patterns.
  • Maintain WLI/runbooks for Kafka lag, URM Router, Email Engine, EQC Provider Booking, Cache Invalidator, and Couchbase maintenance.
  • Collaborate with R&D, DevOps, and Product teams to design resilient architectures.
  • Align with business stakeholders to prioritize reliability improvements.
  • Participate in capacity planning for peak booking periods and ensure operational readiness.
  • Support monitoring tools currently in production and enhance alert dashboards for proactive detection.

Required Skills & Experience
  • Bachelor's or Master's degree in Computer Science or related field.
  • 10+ years in software engineering/operations, with 4+ years in SRE leadership.
  • Proven track record managing large-scale distributed systems.
  • Strong knowledge of Linux and Windows OS, cloud-native environments, and container orchestration (Kubernetes, Azure AKS).
  • Experience with SLO/SLA management, automation, and operational readiness testing.
  • Hands-on experience with monitoring tools (Datadog, Grafana, Prometheus, Splunk) and incident management platforms (ServiceNow).
  • Familiarity with CI/CD pipelines, infrastructure-as-code (Terraform), and GitOps tools (Flux).
  • Knowledge of networking fundamentals and API performance optimization.

Preferred Skills
  • Experience leading SRE or DevOps teams in a high-availability SaaS environment.
  • Familiarity with hospitality systems or booking platforms.
  • Knowledge of CDN technologies (Akamai, Cloudflare) and containerization (Docker).
  • Strong collaboration and communication skills.

Performance Metrics
  • Service uptime and reliability (meeting or exceeding SLOs).
  • Reduction in incident MTTR (Mean Time to Recovery).
  • Change success rate and rollback efficiency.
  • Automation coverage and reduction in manual operational tasks.
  • Team engagement and retention.
  • Sustaining SSI streaks and Hospitality Stability Program deliverables.

iHotelier-Specific Context

iHotelier is a hospitality distribution and booking platform under Amadeus Hospitality, providing booking engines, analytics, and call center solutions. Core modules include Admin, Booking Engine (BE4/BE5), CRS APIs, Call Center/VoicePro, and Analytics. The platform supports multi-channel distribution (Direct, Meta, OTAs, GDS) and PMS connectivity via PMS Connect and the Property Connectivity Gateway (PCG). Architecture leverages Java services with Oracle as the primary data store, Coherence for distributed caching, Couchbase for constant hotel data, and Kafka for ARI/OB/Reservation event flows. Deployments use Kubernetes with ArgoCD GitOps and Jenkins pipelines, with strict governance for PCI vs non-PCI environments. Monitoring and alerting rely on Datadog synthetics, Grafana/Prometheus, and Splunk dashboards, complemented by incident dashboards and ServiceNow workflows. Operational excellence is driven by runbooks for critical components and proactive observability enhancements. Programs like the Hospitality Stability Program (HSP) and SSI initiatives ensure continuous improvement in reliability and performance.

Diversity & Inclusion

Amadeus aspires to be a leader in Diversity, Equity and Inclusion in the tech industry, enabling every employee to reach their full potential by fostering a culture of belonging and fair treatment, attracting the best talent from all backgrounds, and serving as a role model for an inclusive employee experience.

Amadeus is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to gender, race, ethnicity, sexual orientation, age, beliefs, disability or any other characteristics protected by law.


