Senior Site Reliability Engineer
15 hours ago
Job Description
Unified Dashboards, Elastic Stack (ELK), Loki, Splunk, Dynatrace, Datadog, Grafana, New Relic, Azure, Python, GitLab, Jenkins, Ansible, Terraform, DevOps, SLO/SLAs Monitoring, Incident Response, Root Cause Analysis (RCA), E2E Implementation

Location: Pune / Hyderabad
Role Type: Full Time
Published On: 30 May 2025
Experience: 6 - 10 Years

Description
GSPANN is hiring a Senior Site Reliability Engineer (SRE) to join our team in Pune or Hyderabad. This full-time role focuses on enhancing the reliability, scalability, and observability of global cloud-based systems through automation, performance tuning, and modern DevOps practices.

Role and Responsibilities
- Manage and support production environments on cloud platforms, with a strong preference for Microsoft Azure.
- Apply expertise in observability tools such as Dynatrace, Splunk, Datadog, Grafana, and New Relic to monitor system health.
- Implement modern observability practices including end-to-end (E2E) instrumentation, telemetry, and unified dashboard creation.
- Drive organizational change by influencing senior leadership and improving SRE practices company-wide.
- Write automation scripts using Python (strongly preferred) to streamline operations and eliminate manual effort.
- Deploy cloud infrastructure using tools like Ansible, Terraform, and Azure DevOps.
- Work confidently with Continuous Integration/Continuous Deployment (CI/CD) tools such as GitLab, Jenkins, Bamboo, Travis CI, and CircleCI.
- Operate and orchestrate containerized environments using Kubernetes and Docker.
- Troubleshoot complex issues and provide reliable, scalable solutions.
- Embrace continuous learning and demonstrate a strong passion for automation and process improvement.
- Use logging stacks like ELK (Elasticsearch, Logstash, and Kibana), Loki, and Splunk to maintain visibility and traceability.
- Influence organizational adoption of Infrastructure as Code (IaC) and CI/CD methodologies.
- Define and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
- Lead incident response efforts and perform Root Cause Analysis (RCA) to minimize recurrence.

Skills And Experience
- Bachelor's degree in Computer Science, Information Science, Engineering, or a related discipline.
- 6+ years of experience in Site Reliability Engineering (SRE) or DevOps roles, with a focus on cloud-based production systems.
- Ensure the availability, low latency, performance, and cost efficiency of global e-commerce platforms.
- Design and maintain full-stack observability solutions, including dashboards and standardized instrumentation.
- Implement advanced monitoring and alerting systems tailored for both internal engineering teams and external stakeholders.
- Advocate for SRE best practices and promote operational excellence across teams and departments.
- Collaborate with engineering, product, and operations teams to increase reliability and accelerate delivery timelines.
- Build automation tools that support incident response, system recovery, and software delivery pipelines.
- Track and maintain error budgets, achieve defined SLOs, and guarantee high uptime for mission-critical services.
- Identify system bottlenecks and anomalies proactively, ensuring optimal performance under peak loads.
- Automate infrastructure management to reduce costs and scale efficiently during traffic surges.
- Lead strategic, cross-functional initiatives that enhance overall system architecture and reliability.
-
Senior Site Reliability Engineer
4 weeks ago
Hyderabad, India Elios Talent Full time
Senior Site Reliability Engineer Key Highlights
-
Senior Site Reliability Engineer
7 days ago
Hyderabad, India Elios Talent Full time
Senior Site Reliability Engineer Key Highlights
-
Senior Site Reliability Engineer
3 days ago
Hyderabad, Telangana, India Jade Global Full time
Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability
Experience Required: 8+ years overall in SRE and Infrastructure Operations, with a minimum of 3 years of hands-on experience in Datadog
Location: Hyderabad preferred, but open to Pune and remote
Job Summary: We are seeking an...
-
Senior Site Reliability Engineer
1 week ago
India Capgemini Engineering Full time
Job Description
Job Title: Senior Site Reliability Engineer (SRE)
Experience: 4+ years
Location: Mysuru
Employment Type: Full-time
About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team. The ideal candidate will have deep expertise in Microsoft Azure or AWS, strong experience in building and maintaining reliable,...
-
Senior Site Reliability Engineer
4 weeks ago
Hyderabad, India Elios Talent Full time
Senior Site Reliability Engineer
Key Highlights
🛠️ Build, scale, and optimize cloud-native infrastructure powering global, high-availability platforms
⚡ Drive automation-first engineering across AWS, Terraform, CI/CD, observability, and resilient systems
📊 Own reliability, uptime, system health, costs, and performance across mission-critical...
-
Senior Site Reliability Engineer
3 weeks ago
Hyderabad, India Elios Talent Full time
Senior Site Reliability Engineer
Key Highlights
🛠️ Build, scale, and optimize cloud-native infrastructure powering global, high-availability platforms
⚡ Drive automation-first engineering across AWS, Terraform, CI/CD, observability, and resilient systems
📊 Own reliability, uptime, system health, costs, and performance across mission-critical...
-
Senior Site Reliability Engineer
1 week ago
Hyderabad, India Elios Talent Full time
Senior Site Reliability Engineer
Key Highlights
🛠️ Build, scale, and optimize cloud-native infrastructure powering global, high-availability platforms
⚡ Drive automation-first engineering across AWS, Terraform, CI/CD, observability, and resilient systems
📊 Own reliability, uptime, system health, costs, and performance across mission-critical...
-
Senior Site Reliability Engineer
3 days ago
Hyderabad, Telangana, India Jade Global Full time
Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability
Experience Required: 8+ years overall in SRE and Infrastructure Operations, with a minimum of 3 years of hands-on experience in Datadog
Location: Hyderabad preferred, but open to Pune and remote
Job Summary: We are seeking an experienced Site Reliability Engineer (SRE) to lead end-to-end SRE...
-
Site Reliability Engineer
3 weeks ago
India Pagos Consultants Full time
We are looking for experienced site reliability engineers to join a founding team of startup-minded individuals that will lay the groundwork for our new fintech offering. This team will play a pivotal role in spearheading innovation. As such, you will have the opportunity to shape the early architecture and design of the system and set the trajectory for its...