
Observability Engineer
4 weeks ago
Job Title: Observability Engineer
Location: Hyderabad, Pune, Gurgaon
Experience: 8 - 14 Years
Notice Period: Immediate to 15 Days
Skills: ITIL, ITSM, Sumo Logic, APM Tools
Role and Responsibilities:
- Develop and implement comprehensive observability strategies to monitor IT systems and applications.
- Utilize ITSM/ITIL frameworks to align observability practices with organizational processes and standards.
- Deploy and manage APM tools to monitor the performance and health of applications.
- Analyse APM data to identify performance bottlenecks and optimize application performance.
- Configure and maintain Sumo Logic for log management and analytics.
- Use Sumo Logic to collect, process, and analyse logs, metrics, and traces to gain insights into system behaviour.
- Collaborate with IT teams to identify, investigate, and resolve incidents and problems.
- Conduct Root Cause Analysis (RCA) to prevent recurrence of issues and improve system reliability.
- Analyse data to detect anomalies, trends, and potential issues before they impact users.
- Create and maintain dashboards to visualize key performance indicators (KPIs) and system health metrics.
- Generate reports to communicate findings and recommendations to stakeholders.
- Implement best practices from ITIL/ITSM to enhance the efficiency and effectiveness of observability processes.
- Work closely with development, operations, and security teams to ensure comprehensive observability coverage.
Required Skills:
- 8+ years of experience, with thorough knowledge of ITIL/ITSM processes.
- Good exposure to APM tools available in the market, such as Dynatrace and Datadog.
- Expertise in Sumo Logic.
- Design and implement alerting mechanisms, leveraging monitoring, alerting, and logging tools to detect potential issues.
- Reduce alert noise.
- Conduct periodic reviews to increase observability coverage across applications.
- Periodically review metrics to identify anomalies.
- Update dashboards periodically as required.
-
Observability Engineer
3 weeks ago
Pune, Maharashtra, India Growel Softech Pvt. Ltd. Full time
Observability Tools: Experience with open-source observability tools such as Grafana, Prometheus, Mimir, Loki, FluentD, OpenTelemetry, and Tempo. Experience designing, implementing, and managing observability platforms to monitor the performance and reliability of distributed Observability: Exposure to AI/ML-based observability tools and techniques,...
-
Observability Engineer
4 weeks ago
Pune, Maharashtra, India Sarvaha Systems Full time
Sarvaha would like to welcome a skilled Observability Engineer with a minimum of 3 years of experience to contribute to designing, deploying, and scaling our monitoring and logging infrastructure on Kubernetes. In this role, you will play a key part in enabling end-to-end visibility across cloud environments by processing petabyte-scale data, helping...
-
Observability Engineering Specialist
1 day ago
Hyderabad, Telangana, India beBeeObservability Full time ₹ 1,04,000 - ₹ 1,30,878
Job Title: Observability Engineering Specialist
Job Summary: We are seeking an Observability Engineering Specialist to join our team. The ideal candidate will have a strong background in DevOps, Observability, and related technologies.
Key Responsibilities:
- Design, implement, and maintain observability solutions for distributed systems.
- Develop...
-
Senior Observability Engineer
1 day ago
Hyderabad, Telangana, India beBeeObservability Full time ₹ 20,00,000 - ₹ 24,00,000
Job Opportunity
We are seeking a highly skilled Observability Engineer with expertise in designing and implementing scalable observability platforms, driving adoption of APM tooling, and embedding synthetic monitoring into modern service architectures. The ideal candidate will have hands-on experience in building secure and resilient synthetic monitoring...
-
Observability Expert
2 days ago
Pune, Hyderabad / Secunderabad, Telangana, Gurgaon / Gurugram, India beBeeObservability Full time ₹ 15,00,000 - ₹ 20,00,000
Job Title: Observability Engineer
We are seeking a seasoned Observability professional to join our team. The ideal candidate will have a strong background in ITIL/ITSM and extensive experience with APM tools.
Key Responsibilities:
- Develop and implement comprehensive observability strategies to monitor IT systems and applications.
- Utilize ITSM/ITIL frameworks to...
-
Site Reliability Engineer
2 hours ago
Hyderabad, Telangana, India beBeeObservability Full time US$ 1,50,000 - US$ 2,00,000
Site Reliability Engineer - Observability Expert
We are seeking a highly skilled Site Reliability Engineer to join our team. As an Observability expert, you will design and develop next-generation observability platforms that enable our clients to monitor and improve their complex IT systems. The ideal candidate will have a strong background in software...
-
Observability Systems Engineer
2 days ago
Hyderabad / Secunderabad, Telangana, India beBeeObservability Full time ₹ 1,04,000 - ₹ 1,30,878
Job Title: Sr Observability Engineer
We are seeking an experienced Observability Engineer to join our team. This is a key role in the development and implementation of our observability strategy.
Key Responsibilities:
- Design, implement and maintain observability systems for monitoring and notification technologies.
- Collaborate with cross-functional teams to...
-
Observability/AIOps
4 days ago
Hyderabad, Telangana, India IntraEdge Full time
L2 - Observability/AIOps (5 to 8 yrs exp). Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures internally critical and externally visible systems have reliability and uptime appropriate to users' needs and a fast...
-
Observability/AIOps
2 days ago
Hyderabad, Telangana, India IntraEdge Full time
L2 - Observability/AIOps (5 to 8 yrs exp). Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures internally critical and externally visible systems have reliability and uptime appropriate to users' needs and a fast...