Site Reliability Engineer II

2 weeks ago


Gurgaon, Haryana, India American Express Full time ₹ 15,00,000 - ₹ 28,00,000 per year

At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. As part of Team Amex, you'll experience this powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career.

Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express.

The Enterprise Data Management Technology Team brings together foundational strategic technology capabilities in data governance, data privacy, data retention and deletion, data quality, and automation, grounded in our data technology model that prioritizes data management. It employs a ground-breaking focus with development responsibilities for regulatory needs that deepen and expand data strategy, as well as core technical capabilities that cut across business lines and customer segments.

Responsibilities:

  • Infrastructure Management: Design, implement, and manage scalable, reliable infrastructure using cloud-native technologies and Infrastructure as Code (IaC) tools.
  • Automation & CI/CD: Develop and maintain automated processes and Continuous Integration/Continuous Delivery (CI/CD) pipelines to streamline deployments and operational tasks.
  • Monitoring & Alerting: Implement and manage comprehensive monitoring and alerting systems to detect issues early and ensure system health.
  • Incident Management: Lead incident response efforts, perform root cause analysis (RCA) for outages, and implement measures to prevent future disruptions.
  • Performance Tuning & Optimization: Gather and analyze metrics from systems and applications to identify performance bottlenecks and conduct tuning.
  • Collaboration: Work closely with development teams to integrate reliability into software design and deployment processes.
  • Capacity Planning: Manage server capacity to ensure systems can handle current and future demand.
  • Site Reliability Engineering (SRE) Principles: Balance feature development speed with reliability, and establish and maintain Service Level Objectives (SLOs).

Minimum Qualifications
- Programming Languages: Python, Bash, Perl, or similar for automation.
- Cloud Platforms: GCP, Hydra.
- Containerization: Docker, Kubernetes.
- Monitoring Tools: Prometheus, Grafana, Splunk.
- IaC Tools: Terraform.
- Operating Systems: Linux (proficient in command-line tools like strace, truss).
- Networking: TCP/IP, firewalls, load balancers.

We back you with benefits that support your holistic well-being so you can be and deliver your best. This means caring for you and your loved ones' physical, financial, and mental health, as well as providing the flexibility you need to thrive personally and professionally:

  • Competitive base salaries
  • Bonus incentives
  • Support for financial well-being and retirement
  • Comprehensive medical, dental, vision, life insurance, and disability benefits (depending on location)
  • Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need
  • Generous paid parental leave policies (depending on your location)
  • Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
  • Free and confidential counseling support through our Healthy Minds program
  • Career development and training opportunities

American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, disability status, age, or any other status protected by law.

Offer of employment with American Express is conditioned upon the successful completion of a background verification check, subject to applicable laws and regulations.




