SRE Observability Engineer
10 hours ago
Job Description

Job Title: SRE Observability Engineer
Experience: 6 Years
Location: Hyderabad
Notice Period: Immediate Joiners Only

About the Role

We are seeking a highly skilled and motivated SRE Observability Engineer to design, build, and scale observability platforms across our distributed systems. The ideal candidate will have deep expertise in monitoring, logging, tracing, and alerting frameworks along with hands-on experience in Prometheus, Grafana, and Loki. This role involves close collaboration with Development, DevOps, Infrastructure, and SRE teams to ensure end-to-end visibility, reliability, performance, and availability of critical systems.

Mandatory Skills

- Observability
- Grafana
- Prometheus & Loki (including strong query-writing skills)

Key Responsibilities

- Lead the design and implementation of observability solutions spanning monitoring, logging, and distributed tracing across cloud and on-prem environments.
- Develop and maintain advanced monitoring frameworks using Prometheus, Grafana, Datadog, New Relic, AppDynamics and other observability platforms.
- Implement and optimize distributed tracing using OpenTelemetry, Jaeger, or Zipkin to enhance application visibility and performance diagnostics.
- Improve log management pipelines using tools such as Elasticsearch, Splunk, Loki, Fluentd, ensuring efficient log ingestion, parsing, storage, and analysis.
- Build advanced alerting and anomaly detection mechanisms for proactive issue resolution and improved MTTR.
- Work with development and SRE teams to enhance observability integration within CI/CD pipelines, microservices, and cloud-native architectures.
- Automate observability processes using Python, Bash, or Golang to scale operations and reduce manual effort.
- Ensure observability platforms are resilient, scalable, and cost-effective for large-scale distributed systems.
- Lead incident response efforts, offering actionable insights through logs, metrics, and traces for rapid troubleshooting.
- Stay updated on evolving observability, SRE, and monitoring practices to continuously strengthen observability posture.

Required Qualifications

- 5+ years of hands-on experience in Observability, SRE, DevOps, or similar roles, managing large-scale distributed systems.
- Strong experience designing and implementing solutions using Prometheus, Grafana, Datadog, New Relic, AppDynamics.
- Expertise in log management tools such as Elasticsearch, Splunk, Loki, Fluentd, including performance optimization.
- Deep proficiency in distributed tracing frameworks (OpenTelemetry, Jaeger, Zipkin).
- Hands-on experience with cloud platforms Azure, AWS, or GCP, and Kubernetes-based environments.
- Strong scripting skills in Python, Bash, or Golang, and experience with IaC tools such as Terraform, Ansible.
- Solid understanding of system architecture, performance tuning, scalability, and high-availability architectures.
- Proven experience in guiding teams, providing technical leadership, and enforcing observability best practices.
- Excellent problem-solving skills with the ability to provide data-driven, actionable insights.
- Strong stakeholder management, communication, and collaboration abilities.

Preferred Qualifications

- Experience with AI-driven observability and automated anomaly detection.
- Familiarity with microservices, serverless, and event-driven architectures.
- Prior experience in on-call rotations and incident management in high-availability environments.
- Certifications in cloud platforms, SRE, or observability tools.

Requirements: SRE
-
SRE Observability Engineer
10 hours ago
Hyderabad, India Evnek Technologies Pvt Ltd Full time. Job Title: SRE Observability Engineer Experience: 6 Years Location: Hyderabad Notice Period: Immediate Joiners Only About the Role We are seeking a highly skilled and motivated SRE Observability Engineer to design, build, and scale observability platforms across our distributed systems. The ideal candidate will have deep expertise in monitoring, logging,...
-
Observability SRE
7 days ago
Hyderabad, India Ifintalent Global Private Limited Full time. Job Description Key Responsibilities: - Design, build, and maintain observability platforms including monitoring, logging, tracing, and alerting systems. - Implement and optimize metrics collection using tools like Prometheus, Grafana, OpenTelemetry, or similar. - Develop and maintain centralized logging infrastructure (e.g., Datadog, OpenTelemetry,...
-
Observability Engineer
1 week ago
Hyderabad, Telangana, India algoleap Full time ₹ 12,00,000 - ₹ 36,00,000 per year. Role: Observability Engineer. Job Description: Senior Platform Engineer. We are seeking a highly experienced and driven Senior Observability Engineer to lead the design, development, and maintenance of observability solutions across our infrastructure, applications, and services. As a Senior Observability Engineer, you will be at the forefront of implementing...
-
sre
2 weeks ago
Gurugram, Hyderabad, Noida, India Zensar Full time ₹ 15,00,000 - ₹ 25,00,000 per year. Short Description for Internal Candidates: - Bachelor's degree in Computer Science, IT, or equivalent. - 3–6 years in SRE, Observability, Application Monitoring, or Performance Engineering roles. - Hands-on exposure to Glassbox and Sumo Logic strongly preferred. Description for Candidates: We are seeking a Site Reliability Engineer (SRE) with a strong focus on...
-
SRE Design
4 weeks ago
Hyderabad, India PepsiCo Full time. Overview: We are looking for a self-driven SRE engineer with a software engineering mindset to - Drive new shift-left activities critical to applying Site Reliability Engineering (SRE) and quality assurance principles within the application design / project roadmap that enable resilient outcomes - Apply a pre-emptive approach into production, minimizing business...
-
SRE Observability Platform Architect
6 days ago
Hyderabad, India Virtusa Full time. SRE Observability Platform Architect - Description Observability Platform Architect Experience: · Minimum 10 years of relevant work experience with monitoring setup using any product (Dynatrace, Datadog, ELK stack, Splunk, Grafana/Prometheus, etc.) set up in critical production environments. · Minimum 5-6 years of work experience in end-to-end...
-
SRE
6 days ago
Hyderabad, India Virtusa Full time. SRE - CREQ Description: BI Tools, API & Batch monitoring Support. Responsibilities: 1. Troubleshoot recurring failures and participate in incident triages 2. Troubleshoot issues, both from a production as well as a performance standpoint 3. Be on-call to respond during app failures 4. Monitor critical applications and services to minimize downtime and...
-
Observability Engineer
2 weeks ago
Hyderabad, Telangana, India Jobhedge Consultancy Full time ₹ 12,00,000 - ₹ 36,00,000 per year. Job Description: AI-Driven Observability Engineer. Experience: 10 Years. About the Role: We are seeking a highly skilled AI-Driven Observability Engineer to design, implement, and maintain end-to-end observability solutions for infrastructure and application. You will play a key role in ensuring the reliability, performance, and...
-
Observability Lead
2 days ago
Hyderabad, Telangana, India Micron Technology Full time ₹ 12,00,000 - ₹ 36,00,000 per year. Our vision is to transform how the world uses information to enrich life for all. Micron Technology is a world leader in innovating memory and storage solutions that accelerate the transformation of information into intelligence, inspiring the world to learn, communicate and advance faster than ever. We are seeking a seasoned Observability Lead to drive the...
-
SRE Lead Design
6 days ago
Hyderabad, India PepsiCo Full time. Overview: We are looking for a self-driven SRE engineer with a software engineering mindset to - Drive new shift-left activities critical to applying Site Reliability Engineering (SRE) and quality assurance principles within the application design / project roadmap that enable resilient outcomes - Apply a pre-emptive approach into production, minimizing business impact,...