Site Reliability Engineer III

2 days ago


Gurugram, India | RELX | Full time

About the Role

We are seeking a Site Reliability Engineer (SRE) with experience in Azure and a track record of successful cloud migration projects. The successful candidate will help design and coordinate the implementation of cloud infrastructure, including Kubernetes clusters, databases and storage, serverless functions, CI/CD pipelines, and solutions for monitoring, alerting, and security.

In this role, you will work to understand business needs and technical solutions, provide evidence-based input, communicate effectively with people of different technical backgrounds, influence decision-making, and implement and maintain best-practice solutions, all in a fast-paced environment with multiple projects running in parallel.

Responsibilities

Develop, deploy, and maintain scalable and highly available systems on Kubernetes.

Design and implement automation processes for system deployments and scaling.

Monitor system performance, troubleshoot issues, and drive ongoing improvements.

Collaborate with development teams to enhance infrastructure, including CI/CD pipelines.

Respond to and resolve operational incidents, provide detailed reports, and participate in post-incident reviews.

Manage code deployments, updates, and processes across multiple environments.

Requirements

Hands-on experience with Azure solutions and observability tools (preferably Grafana), including designing and implementing observability pipelines for logs, metrics, and traces, along with setting up dashboards and alerts.

Understanding of authentication and authorization mechanisms in Azure, including Microsoft Entra ID.

Experience with Infrastructure-as-Code (IaC) tools such as Terraform (Ansible, Puppet, ARM templates also valued).

Knowledge of automated CI/CD pipelines (GitHub Actions preferred; Jenkins, Argo CD also relevant).

Familiarity with containerized workloads (EKS, other Kubernetes distributions, Docker, JFrog).

Exposure to serverless solutions (e.g., Logic Apps, Function Apps, Functions, WebJobs).

Experience with logging and monitoring tools (Azure Monitor, Log Analytics, Metrics Explorer, Activity Log).

Preferred Experience and Skills

Enthusiasm for technology, and a broad understanding of cloud solutions.

Willingness to share expertise and advise on best practices.

Experience with budget management and cost control is a plus.

Skills in system integration and troubleshooting.

Experience with performance analysis and optimization.

Knowledge of Kubernetes service meshes (Linkerd preferred; Istio, Traefik Mesh also valued).

Ability to code or script (for example Linux/Bash/Sh, Windows/PowerShell/Batch, Python, Java).

Familiarity with load balancing and service proxies (Nginx, Traefik, HAProxy, F5).

Experience with tools such as Jira, Confluence, MySQL Workbench, Maven.

Professional certifications for Cloud Developers or Architects (Azure preferred; AWS also beneficial).

Accessibility and Inclusion

We are committed to fostering an inclusive workplace where everyone feels welcome. If you require any accommodations or adjustments during the application or interview process, please let us know. Candidates from all backgrounds are encouraged to apply, including those with non-traditional career paths, gaps in employment, or alternative educational experiences. We value diversity and are dedicated to providing equal opportunities for all.


We are committed to providing a fair and accessible hiring process. If you have a disability or other need that requires accommodation or adjustment, please let us know.

Criminals may pose as recruiters asking for money or personal information. We never request money or banking details from job applicants.
