Senior Cloud Site Reliability Engineer

6 days ago


Pune, Maharashtra, India ZS Full time

ZS is a global firm that transforms healthcare and beyond through management consulting and technology. As a Senior Cloud Site Reliability Engineer, you will be part of our Cloud Center of Excellence (CCoE) team, responsible for building, maintaining, and architecting systems that enable ZS client-facing software solutions.

Key Responsibilities:
  • Analyze and maintain cloud solutions to support ZS's growing clientele.
  • Work with operations engineers and software developers to ensure the stability of the environment.
  • Coordinate emergency responses, perform root cause analysis, and implement solutions to prevent recurrence.
  • Identify ways to increase Mean Time Between Failures (MTBF) and lower Mean Time To Recover (MTTR) for the environment.
  • Review application stacks and execute initiatives to reduce failures, defects, and issues with overall performance.
  • Maintain environment monitoring systems to provide visibility into deployed products/solutions.
  • Perform root cause analysis on incoming infrastructure alerts and work with teams to resolve them.
  • Maintain performance analysis tools and identify adverse changes to performance, working with teams to resolve them.
  • Research industry trends and technologies, promoting adoption of best-in-class tools and technologies.
  • Take the initiative to advance the quality, performance, or scalability of cloud solutions by influencing architecture or design.
  • Design, develop, and execute automated tests to validate solutions and environments.
  • Troubleshoot issues across the entire stack – infrastructure, software, application, and network.
Requirements:
  • 3+ years' experience as a Site Reliability Engineer or equivalent position.
  • 2+ years' experience with AWS cloud technologies and at least one AWS certification (Solutions Architect / DevOps Engineer).
  • 1+ years' experience as a senior member in an infrastructure/software team.
  • Hands-on experience with AWS services like EC2, RDS, EMR, CloudFront, ELB, API Gateway, CodeBuild, AWS Config, Systems Manager, Service Catalog, Lambda, etc.
  • Full-stack IT experience with *nix, Windows, network/firewall concepts, source control (Bitbucket), build/dependency management, and continuous integration systems (TeamCity, Jenkins).
  • Expertise in at least one scripting language, Python preferred.
  • Firm understanding of application reliability, performance tuning, and scalability.
  • Exposure to the big data technology stack (Spark, Hadoop, Scala, etc.) is preferred.
  • Solid knowledge of infrastructure and cloud-native services along with network technologies.
  • Solid understanding of RDBMS and cloud database engines such as PostgreSQL, MySQL, etc.
  • Firm understanding of clusters, load balancers, and CDNs.
  • Experience in fault-tolerant system design.
  • Familiarity with Splunk data analysis, Datadog, or similar tools is a plus.
  • A Bachelor's degree (Master's preferred) in a related technical field.
  • Excellent analytical, troubleshooting, and communication skills.
  • Strong verbal, written, and team presentation skills. ZS is a global firm; fluency in English is required.
  • This role requires healthy doses of initiative and the ability to remain flexible and responsive in a very dynamic environment.
  • Ability to quickly learn new platforms, languages, tools, and techniques as needed to meet project requirements.
Perks & Benefits:

ZS offers a comprehensive total rewards package covering health and well-being, financial planning, annual leave, personal growth, and professional development. Our robust skills development programs, multiple career progression options, internal mobility paths, and collaborative culture empower you to thrive as an individual and global team member.

We are committed to giving our employees a flexible and connected way of working. A flexible and connected ZS allows us to combine work from home and on-site presence at clients/ZS offices for the majority of our week. The magic of ZS culture and innovation thrives in both planned and spontaneous face-to-face connections.


