Site Reliability Engineer II

3 days ago


Bangalore, India, Full time

Role Description:

Key Responsibilities

Software Systems Design

- Create software that will address availability, scalability, latency, and efficiency for Bookings’ systems/services

- Have a product-based mindset that takes both customer needs and future roadmap plans into account. Focus development efforts on solving the general case for the technology or subsystem you are responsible for, without ruling out that the tooling or product can be leveraged by other teams

Technical Incident Management

- Take ownership of the procedures for handling emergency situations. The SRE should write the playbook for dealing with a degrading system/service or even a full outage

- Conduct post-mortem meetings (RFOs) to ensure learnings are applied and shared in case of incidents

- Take part in our incident management program by participating in on-call rotation.

- Be available to provide expertise and feedback for our service health program

Automation and Toil Reduction

- Build automation and application orchestration to prevent recurrent problems and to reduce human labor

Observability (Monitoring and Alerting Improvements)

- Implement monitoring and alerting. This might not always mean writing the software itself; it can also mean creating best practices for how to monitor and alert on a system/service

- Engage in service capacity planning and demand forecasting, software performance analysis and system tuning

Architectural Guidance

- Maintain holistic knowledge and understanding of a system/service instead of only knowing some fraction of the problem space

- Create, document and implement Booking Reliability Engineering best practices.

- Collaborate with other teams and tech POs to support them in building reliable and scalable systems/services for their users and stakeholders

- Influence the business and tech colleagues to adopt engineering, reliability and security best practices

Community Involvement

- Take an active part in educating and skilling up members of our engineering community

Requirements of special knowledge/skills

- Proficiency in the core skills of a software developer: coding, large-scale software design & scaling, complexity analysis, algorithms, data structures, design patterns
- Expertise in source control management (such as Git and Bitbucket) and infrastructure provisioning with Terraform
- Solid hands-on experience with configuration management tools (Ansible & Puppet)
- Deep understanding of Unix/Linux systems internals and networking; this includes topics like: kernel, shell and client-server protocols
- Proficiency in Unix/Linux system administration (Red Hat/CentOS)
- Networking: significant knowledge and understanding of network theory, such as different protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing
- Extensive experience in the design, configuration and implementation of a system/service in a large-scale production environment (systems engineering and architectural skills)
- Expertise in various AWS services and their use cases (EC2, Network, Lambda, IAM and more)
- Eagerness to keep up with latest developments in technology
- Connection with the worldwide SRE community
- Exhibit the following behaviours: be curious; be data driven; have a systematic problem solving approach; constantly aiming to improve systems/services
- 6 to 10 years of experience in a similar role

Architectural Guidance

- Advise product teams towards a technical solution that meets the functional, nonfunctional & architectural requirements by challenging the rationale for an application design and providing context in the wider architectural landscape

- Set a clear direction for a technical capability by evaluating and aligning the target architecture improvements, reframing architectural designs and decisions for varied stakeholders

Critical Thinking

- Find solutions to difficult or complex issues by applying different skills and techniques like analytical thinking, lateral thinking, and logical reasoning


