SRE Observability Engineer

3 weeks ago


Hyderabad, India TerraGiG Full time

We are looking for an SRE Observability Engineer.

About the Role:
Duration: Permanent
Location: Hyderabad
Timings: Full Time (as per company timings)
Notice Period: Immediate joiners only
Experience: 6-10 Years

JD:
Position: SRE Observability Engineer
Exp: 5+ to 10 Years
Location: Hyderabad
Mandatory Skills: Observability, Grafana, and writing queries using Prometheus and Loki.

Job Description:
We are seeking a highly experienced and driven Senior Observability Engineer to lead the design, development, and maintenance of observability solutions across our infrastructure, applications, and services. As a Senior Observability Engineer, you will implement monitoring, logging, and tracing solutions that ensure the reliability, performance, and availability of our complex, distributed systems. You will collaborate with cross-functional teams, including Development, Infrastructure Engineering, DevOps, and SRE, to optimize system observability and improve our incident response capabilities.

Key Responsibilities:
- Lead the design and implementation of observability solutions, including monitoring, logging, and tracing, for both cloud and on-premises environments.
- Drive the development and maintenance of advanced monitoring tools such as Prometheus, Grafana, Datadog, New Relic, and AppDynamics.
- Implement distributed tracing frameworks such as OpenTelemetry, Jaeger, or Zipkin, and enhance application performance diagnostics and troubleshooting.
- Optimize log management and analysis strategies using tools such as Elasticsearch, Splunk, Loki, and Fluentd, ensuring efficient log processing and insights.
- Develop advanced alerting and anomaly detection strategies to proactively identify system issues, minimizing downtime and improving Mean Time to Recovery (MTTR).
- Collaborate with Development and SRE teams to enhance observability in CI/CD pipelines, microservices architectures, and across various platform environments.
- Automate observability tasks using scripting languages such as Python, Bash, or Golang to increase efficiency and scale observability operations.
- Ensure the scalability and efficiency of monitoring solutions to manage large-scale distributed systems and handle evolving business requirements.
- Lead incident response by providing actionable insights from observability data for effective troubleshooting and root cause analysis.
- Stay abreast of industry trends in observability, Site Reliability Engineering (SRE), and monitoring practices, continuously improving processes.

Required Qualifications:
- 5+ years of hands-on experience in observability, SRE, DevOps, or a related field, with a proven track record of successfully managing complex, large-scale distributed systems.
- Expert-level proficiency in observability tools such as Prometheus, Grafana, Datadog, New Relic, and AppDynamics, with the ability to lead the design and implementation of these solutions at scale.
- Advanced experience with log management platforms such as Elasticsearch, Splunk, Loki, and Fluentd, and the ability to optimize log aggregation and analysis for better performance insights.
- Deep expertise in distributed tracing tools such as OpenTelemetry, Jaeger, or Zipkin, with a focus on performance optimization and root cause analysis.
- Extensive experience with cloud environments (preferably Azure, AWS, GCP) and Kubernetes for deploying and managing observability solutions across modern, cloud-native infrastructures.
- Advanced proficiency in scripting languages such as Python, Bash, or Golang, and strong experience with Infrastructure as Code (IaC) tools such as Terraform and Ansible.
- Strong understanding of system architecture, performance tuning, and troubleshooting of complex production environments, with an emphasis on scalability and high availability.
- Proven experience in leading and mentoring teams, providing technical direction, and driving the adoption of best practices for observability and monitoring.
- Exceptional problem-solving skills, with a focus on providing actionable insights and data-driven decision-making.
- Ability to lead high-impact projects, communicate effectively with stakeholders, and influence cross-functional teams.
- Strong communication and collaboration skills, with a demonstrated ability to work closely with engineering teams, leadership, and external partners to meet observability and system reliability goals.

Preferred Qualifications:
- Experience with AI-driven observability tools and anomaly detection techniques.
- Familiarity with microservices, serverless architectures, and event-driven systems.
- Proven track record of handling on-call rotations and incident management workflows in high-availability environments.
- Relevant certifications in observability tools, cloud platforms, or SRE best practices are a plus.

Interested candidates please share your resume to



  • Hyderabad, India Awign Expert Full time

    Job Description Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using Prometheus and Loki. Job Description: We are seeking a highly experienced and driven Senior Observability Engineer to lead the design, development, and maintenance of observability solutions across...


  • Hyderabad, India Awign Expert Full time

    Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using Prometheus and Loki. Job Description: We are seeking a highly experienced and driven Senior Observability Engineer to lead the design, development, and maintenance of observability solutions across our...


  • Hyderabad, India TerraGiG Full time

    We are looking for SRE Observability Engineer About the Role: Duration: Permanent Location: Hyderabad Timings: Full Time (As per company timings) Notice Period: (Immediate Joiner - Only) Experience: 6-10 Years JD: Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using Prometheus...


  • Hyderabad, India TerraGiG Full time

    We are looking for SRE Observability Engineer About the Role: Duration: Permanent Location: Hyderabad Timings: Full Time (As per company timings) Notice Period: (Immediate Joiner - Only) Experience: 6-10 Years JD: Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using Prometheus...


  • Hyderabad, India TerraGiG Full time

    We are looking for SRE Observability Engineer About the Role: Duration: Permanent Location: Hyderabad Timings: Full Time (As per company timings) Notice Period: (Immediate Joiner - Only) Experience: 6-10 Years JD: Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using Prometheus...


  • Hyderabad, India TerraGiG Full time

    We are looking for SRE Observability Engineer About the Role: Duration: Permanent Location: Hyderabad Timings: Full Time (As per company timings) Notice Period: (Immediate Joiner - Only) Experience: 6-10 Years JD: Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using Prometheus...


  • Hyderabad, India TerraGiG Full time

    We are looking for SRE Observability Engineer About the Role: Duration: Permanent Location: Hyderabad Timings: Full Time (As per company timings) Notice Period: (Immediate Joiner - Only) Experience: 6-10 Years JD: Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using Prometheus...


  • Hyderabad, India Whatjobs IN C2 Full time

    We are looking for SRE Observability Engineer About the Role: Duration: Permanent Location: Hyderabad Timings: Full Time (As per company timings) Notice Period: (Immediate Joiner - Only) Experience: 6-10 Years JD: Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using...


  • Hyderabad, India TerraGiG Full time

    We are looking for SRE Observability Engineer About the Role: Duration: Permanent Location: Hyderabad Timings: Full Time (As per company timings) Notice Period: (Immediate Joiner - Only) Experience: 6-10 Years JD: Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using...


  • Hyderabad, India TerraGiG Full time

    We are looking for SRE Observability Engineer About the Role: Duration: Permanent Location: Hyderabad Timings: Full Time (As per company timings) Notice Period: (Immediate Joiner - Only) Experience: 6-10 Years JD: Position: SRE Observability Engineer Exp: 5+ to 10 Years Location: Hyderabad Mandatory Skills: Observability, Grafana and Writing queries using...