Site Reliability Engineer

24 hours ago


Noida India S&P Global Full time

Job Description

About The Role

Grade Level (for internal use): 10

Department overview

S&P Global provides innovative products and services that enhance transparency, reduce risk, and improve operational efficiency. Our customers include banks, hedge funds, asset managers, central banks, regulators, auditors, fund administrators and insurance companies. We develop large-scale technology platforms and enterprise software to produce global financial data with a focus on analysis and regulatory requirements.

Position Summary

We are seeking a proactive and innovative Site Reliability Engineer to join our growing team. In this role, you will be a key player in ensuring the reliability, scalability, and performance of our critical systems. You will move beyond traditional monitoring to implement advanced observability, leverage AIOps for predictive insights, and use Chaos Engineering to proactively uncover system weaknesses. This is an opportunity to help shape a modern SRE culture, automate away toil, and empower our development teams to build more resilient applications from the ground up.

Key Responsibilities

- Observability & Proactive System Health
  - Design, build, and maintain a comprehensive observability platform using tools like Splunk and OpenTelemetry to provide deep insights into system health and performance.
  - Leverage AIOps principles and platforms to enhance anomaly detection, automate event correlation, and enable predictive alerting, reducing mean time to detection (MTTD).
  - Develop and manage robust alerting strategies and SLO-based dashboards to ensure critical issues are addressed before they impact customers.
  - Drive a data-driven culture by providing engineering teams with the visibility they need to understand the impact of their code in production.
- Reliability & Resilience Engineering
  - Design, implement, and conduct Chaos Engineering experiments to proactively identify and remediate system weaknesses, architectural flaws, and potential cascading failures.
  - Partner with software engineering teams throughout the application lifecycle to architect for high availability, disaster recovery, and fault tolerance.
  - Define, measure, and evangelize Service Level Indicators (SLIs) and Service Level Objectives (SLOs), and manage the associated error budgets to balance reliability with feature velocity.
  - Analyze and lead blameless post-mortems for incidents, ensuring that root causes are addressed and preventative measures are implemented to avoid recurrence.
- Performance & Efficiency Optimization
  - Analyze performance metrics and distributed traces to identify and resolve latency bottlenecks across our infrastructure and applications.
  - Implement cost optimization (FinOps) strategies by identifying and eliminating resource waste, optimizing cloud service usage, and promoting efficient architecture patterns.
  - Work with development teams to conduct performance testing and ensure new features do not introduce performance regressions.
- Automation & Platform Engineering
  - Identify and aggressively automate manual operational tasks (toil) by developing scripts, tools, and self-healing systems.
  - Enhance and maintain our Infrastructure as Code (IaC) modules, promoting reusable patterns and best practices with Terraform.
  - Improve and secure CI/CD pipelines (e.g., GitHub Actions, Azure DevOps) to enable safe, automated, and rapid deployment and rollback procedures.

Requirements And Qualifications

Core Technical Skills

- Experience: 4+ years in a Site Reliability, DevOps, or Cloud Engineering role, with demonstrable experience in a large-scale production environment.
- Cloud Proficiency: Deep experience with AWS services (EKS, ECS, EC2, S3, RDS, Lambda) and managing production workloads in the cloud.
- Observability: Proficient in application observability, monitoring, and logging. Hands-on experience with tools like Splunk, OpenTelemetry, Prometheus, Grafana, or Datadog is essential.
- Infrastructure as Code (IaC): Strong experience with Terraform for provisioning and managing cloud infrastructure.
- Containerization: Solid understanding of containerization technology, particularly with managed services like EKS or ECS.
- CI/CD: Experience building and maintaining CI/CD pipelines using tools like GitHub Actions, Azure DevOps, or Jenkins.
- Scripting & Automation: Strong scripting skills in languages like Python, Bash, or PowerShell for automation and tooling. Familiarity with a higher-level language such as C# (.NET) is a plus.
- Modern Practices: Experience with or a demonstrated understanding of AIOps concepts and Chaos Engineering principles and tools (e.g., Gremlin, AWS Fault Injection Simulator).

Professional Attributes

- SRE Mindset: A true understanding of Site Reliability Engineering principles, including SLOs, error budgets, and the value of eliminating toil.
- Problem-Solving: Excellent troubleshooting and problem-solving skills, with a methodical approach to resolving complex technical issues under pressure.
- Collaboration: Ability to work effectively with development teams, product managers, and other stakeholders, communicating complex technical ideas clearly.
- Ownership & Drive: A strong sense of ownership, urgency, and a passion for building and maintaining highly available, performant, and reliable systems.
- Agile Experience: Comfortable working in an agile environment and contributing to team sprints and planning.
- On-Call: Willingness to participate in a scheduled on-call rotation.

Education & Certifications

- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
- AWS certification (e.g., AWS Certified Solutions Architect, DevOps Engineer) is highly preferred.

About S&P Global Market Intelligence

At S&P Global Market Intelligence, a division of S&P Global, we understand the importance of accurate, deep and insightful information. Our team of experts delivers unrivaled insights and leading data and technology solutions, partnering with customers to expand their perspective, operate with confidence, and make decisions with conviction. For more information, visit www.spglobal.com/marketintelligence.

What's In It For You

Our Purpose

Progress is not a self-starter. It requires a catalyst to be set in motion. Information, imagination, people, technology: the right combination can unlock possibility and change the world. Our world is in transition and getting more complex by the day. We push past expected observations and seek out new levels of understanding so that we can help companies, governments and individuals make an impact on tomorrow. At S&P Global we transform data into Essential Intelligence, pinpointing risks and opening possibilities. We Accelerate Progress.

Our People

We're more than 35,000 strong worldwide, so we're able to understand nuances while having a broad perspective. Our team is driven by curiosity and a shared belief that Essential Intelligence can help build a more prosperous future for us all. From finding new ways to measure sustainability to analyzing energy transition across the supply chain to building workflow solutions that make it easy to tap into insight and apply it. We are changing the way people see things and empowering them to make an impact on the world we live in. We're committed to a more equitable future and to helping our customers find new, sustainable ways of doing business. We're constantly seeking new solutions that have progress in mind. Join us and help create the critical insights that truly make a difference.

Our Values

Integrity, Discovery, Partnership

At S&P Global, we focus on Powering Global Markets.
Throughout our history, the world's leading organizations have relied on us for the Essential Intelligence they need to make confident decisions about the road ahead. We start with a foundation of integrity in all we do, bring a spirit of discovery to our work, and collaborate in close partnership with each other and our customers to achieve shared goals.

Benefits

We take care of you, so you can take care of business. We care about our people. That's why we provide everything you, and your career, need to thrive at S&P Global.

Our Benefits Include

- Health & Wellness: Health care coverage designed for the mind and body.
- Flexible Downtime: Generous time off helps keep you energized for your time on.
- Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
- Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
- Family Friendly Perks: It's not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
- Beyond the Basics: From retail discounts to referral incentive awards, small perks can make a big difference.

For more information on benefits by country visit: https://spgbenefits.com/benefit-summaries

Global Hiring And Opportunity At S&P Global

At S&P Global, we are committed to fostering a connected and engaged workplace where all individuals have access to opportunities based on their skills, experience, and contributions. Our hiring practices emphasize fairness, transparency, and merit, ensuring that we attract and retain top talent. By valuing different perspectives and promoting a culture of respect and collaboration, we drive innovation and power global markets.

Recruitment Fraud Alert

If you receive an email from a spglobalind.com domain or any other regionally based domains, it is a scam and should be reported to [Confidential Information]. S&P Global never requires any candidate to pay money for job applications, interviews, offer letters, pre-employment training or for equipment/delivery of equipment. Stay informed and protect yourself from recruitment fraud by reviewing our guidelines, fraudulent domains, and how to report suspicious activity here.

Equal Opportunity Employer

S&P Global is an equal opportunity employer and all qualified candidates will receive consideration for employment without regard to race/ethnicity, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, marital status, military veteran status, unemployment status, or any other status protected by law. Only electronic job submissions will be considered for employment. If you need an accommodation during the application process due to a disability, please send an email to: [HIDDEN TEXT] and your request will be forwarded to the appropriate person.

US Candidates Only: The EEO is the Law Poster http://www.dol.gov/ofccp/regs/compliance/posters/pdf/eeopost.pdf describes discrimination protections under federal law. Pay Transparency Nondiscrimination Provision - https://www.dol.gov/sites/dolgov/files/ofccp/pdf/pay-transp_%20English_formattedESQA508c.pdf

IFTECH202.1 - Middle Professional Tier I (EEO Job Group)

Job ID: 319575
Posted On: 2025-09-24
Location: Gurgaon, Haryana, India



  • Noida, India CorroHealth Full time

    We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a deep understanding of both software engineering and systems administration, with a focus on creating scalable and reliable systems. You will work closely with development and operations teams to ensure the reliability, availability, and...


  • Noida, Uttar Pradesh, India, Ghaziabad CorroHealth Full time

    We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a deep understanding of both software engineering and systems administration, with a focus on creating scalable and reliable systems. You will work closely with development and operations teams to ensure the reliability, availability, and...


  • India Employ Full time

    Role - Site Reliability Engineer (SRE) / Platform Engineering / DevOps Engineering roles Location - Fully Remote Type - 6 months Contract Work Ex - 5+ Yrs We’re working with an AI product company that’s building the next generation of Gen AI-powered developer platforms. We’re looking for an experienced Site Reliability Engineer to join...

