Senior Site Reliability Engineer

15 hours ago


Hyderabad, India GSPANN Technologies, Inc Full time

Job Description

Unified Dashboards, Elastic Stack (ELK), Loki, Splunk, Dynatrace, Datadog, Grafana, New Relic, Azure, Python, GitLab, Jenkins, Ansible, Terraform, DevOps, SLO/SLAs Monitoring, Incident Response, Root Cause Analysis (RCA), E2E Implementation

Description

GSPANN is hiring a Senior Site Reliability Engineer (SRE) to join our team in Pune or Hyderabad. This full-time role focuses on enhancing the reliability, scalability, and observability of global cloud-based systems through automation, performance tuning, and modern DevOps practices.

Location: Pune / Hyderabad
Role Type: Full Time
Published On: 30 May 2025
Experience: 6 - 10 Years

Role and Responsibilities

- Manage and support production environments on cloud platforms, with a strong preference for Microsoft Azure.
- Apply expertise in observability tools such as Dynatrace, Splunk, Datadog, Grafana, and New Relic to monitor system health.
- Implement modern observability practices including end-to-end (E2E) instrumentation, telemetry, and unified dashboard creation.
- Drive organizational change by influencing senior leadership and improving SRE practices company-wide.
- Write automation scripts using Python (strongly preferred) to streamline operations and eliminate manual effort.
- Deploy cloud infrastructure using tools like Ansible, Terraform, and Azure DevOps.
- Work confidently with Continuous Integration/Continuous Deployment (CI/CD) tools such as GitLab, Jenkins, Bamboo, Travis CI, and CircleCI.
- Operate and orchestrate containerized environments using Kubernetes and Docker.
- Troubleshoot complex issues and provide reliable, scalable solutions.
- Embrace continuous learning and demonstrate a strong passion for automation and process improvement.
- Use logging stacks like ELK (Elasticsearch, Logstash, and Kibana), Loki, and Splunk to maintain visibility and traceability.
- Influence organizational adoption of Infrastructure as Code (IaC) and CI/CD methodologies.
- Define and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
- Lead incident response efforts and perform Root Cause Analysis (RCA) to minimize recurrence.

Skills And Experience

- Bachelor's degree in Computer Science, Information Science, Engineering, or a related discipline.
- 6+ years of experience in Site Reliability Engineering (SRE) or DevOps roles, with a focus on cloud-based production systems.
- Ensure the availability, low latency, performance, and cost efficiency of global e-commerce platforms.
- Design and maintain full-stack observability solutions, including dashboards and standardized instrumentation.
- Implement advanced monitoring and alerting systems tailored for both internal engineering teams and external stakeholders.
- Advocate for SRE best practices and promote operational excellence across teams and departments.
- Collaborate with engineering, product, and operations teams to increase reliability and accelerate delivery timelines.
- Build automation tools that support incident response, system recovery, and software delivery pipelines.
- Track and maintain error budgets, achieve defined SLOs, and guarantee high uptime for mission-critical services.
- Identify system bottlenecks and anomalies proactively, ensuring optimal performance under peak loads.
- Automate infrastructure management to reduce costs and scale efficiently during traffic surges.
- Lead strategic, cross-functional initiatives that enhance overall system architecture and reliability.



  • Hyderabad, India Elios Talent Full time

    Senior Site Reliability Engineer Key Highlights


  • Hyderabad, India Elios Talent Full time

    Senior Site Reliability Engineer Key Highlights


  • Hyderabad, Telangana, India Jade Global Full time

    Senior Site Reliability Engineer (SRE) – Datadog Observability. Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability. Experience Required: 8+ years overall in SRE and Infrastructure Operations, with a minimum of 3+ years hands-on experience in Datadog. Location: Hyderabad preferable but open for Pune and remote. Job Summary: We are seeking an...


  • India Capgemini Engineering Full time

    Job Description. Job Title: Senior Site Reliability Engineer (SRE). Experience: 4+ years. Location: Mysuru. Employment Type: Full-time. About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team. The ideal candidate will have deep expertise in Microsoft Azure or AWS, strong experience in building and maintaining reliable,...


  • Hyderabad, India Elios Talent Full time

    Senior Site Reliability Engineer. Key Highlights: 🛠️ Build, scale, and optimize cloud-native infrastructure powering global, high-availability platforms ⚡ Drive automation-first engineering across AWS, Terraform, CI/CD, observability, and resilient systems 📊 Own reliability, uptime, system health, costs, and performance across mission-critical...




  • Hyderabad, India Elios Talent Full time

    Senior Site Reliability Engineer. Key Highlights: Build, scale, and optimize cloud-native infrastructure powering global, high-availability platforms ⚡ Drive automation-first engineering across AWS, Terraform, CI/CD, observability, and resilient systems. Own reliability, uptime, system health, costs, and performance across mission-critical...


  • Hyderabad, India Elios Talent Full time

    Senior Site Reliability Engineer. Key Highlights: 🛠️ Build, scale, and optimize cloud-native infrastructure powering global, high-availability platforms ⚡ Drive automation-first engineering across AWS, Terraform, CI/CD, observability, and resilient systems 📊 Own reliability, uptime, system health, costs, and performance across mission-critical...


  • Hyderabad, Telangana, India Jade Global Full time

    Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability. Experience Required: 8+ years overall in SRE and Infrastructure Operations, with a minimum of 3+ years hands-on experience in Datadog. Location: Hyderabad preferable but open for Pune and remote. Job Summary: We are seeking an experienced Site Reliability Engineer (SRE) to lead end-to-end SRE...


  • India Pagos Consultants Full time

    We are looking for experienced site reliability engineers to join a founding team of startup-minded individuals that will lay the groundwork for our new fintech offering. This team will play a pivotal role in spearheading innovation. As such, you will have the opportunity to shape the early architecture and design of the system and set the trajectory for its...