Senior Observability Engineer
4 days ago
Role Overview: We are seeking a Senior Observability Engineer with strong expertise in designing, implementing, and optimizing observability solutions. In this role, you will be key to shaping the future of observability at Cognite: assessing existing observability frameworks, identifying gaps, and building robust capabilities encompassing log aggregation, event correlation, noise reduction, and comprehensive telemetry analysis to enable proactive operational excellence and reliability for our services.

Key Responsibilities
Conduct assessments of existing observability architectures to identify gaps and improvement opportunities.
Design and implement scalable log aggregation pipelines for centralized and efficient data collection.
Apply noise-reduction techniques to filter irrelevant or false-positive alerts, enhancing focus on actionable issues.
Develop and maintain monitoring dashboards that deliver actionable insights across applications and infrastructure.
Lead the migration from Lightstep to Honeycomb, ensuring seamless data pipeline transitions, OpenTelemetry alignment, and stakeholder adoption.
Collaborate with infrastructure and product teams to integrate observability tooling into CI/CD workflows and cloud environments.
Analyze telemetry data (metrics, logs, traces) to troubleshoot complex system behaviors and recommend improvements.
Participate in production debugging and incident troubleshooting using telemetry data.
Mentor junior engineers on log management, event correlation, distributed tracing, and alert management.
Stay current on observability innovations and recommend adoption strategies aligned with organizational goals.
Support post-incident reviews and continuous improvement through data-driven root cause analysis.
Drive continuous improvement in reliability and operational excellence through proactive observability initiatives.

Key Skills
8+ years of experience in software or systems engineering, with at least 3 years focused on observability or SRE practices.
Hands-on experience with observability tools such as Honeycomb, VictoriaMetrics, Lightstep, Prometheus, Grafana, OpenTelemetry, Splunk, Datadog, or New Relic.
Strong knowledge of OpenTelemetry instrumentation (metrics, traces, logs) and SLIs/SLOs for reliability tracking.
Experience with distributed tracing, event correlation, and noise reduction frameworks.
Proficiency in one or more programming/scripting languages such as Python, Java, Kotlin, Go, or Shell.
Working knowledge of Infrastructure as Code (Terraform) and CI/CD pipelines (Jenkins, GitHub Actions, ...).
Familiarity with cloud platforms (AWS, Azure, GCP) and container orchestration (Kubernetes).
Strong analytical, troubleshooting, and communication skills with the ability to work effectively across teams.
Experience conducting observability gap assessments and defining improvement plans.
Experience working in complex or multi-cloud environments is preferred.
-
Senior Observability Engineer
2 days ago
Bangalore, India Cognite Full time
Role Overview: We are seeking a Senior Observability Engineer with strong expertise in designing, implementing, and optimizing observability solutions. In this role, you will be key to shaping the future of observability at Cognite, assessing existing observability frameworks, identifying gaps, and building robust capabilities encompassing log aggregation,...
-
Senior Consultant – Observability
2 days ago
Bangalore, India World Wide Technology Full time
World Wide Technology (WWT) is a global technology integrator and IT solutions provider. Established in 1990 in St. Louis, Missouri, World Wide Technology collaborates with OEMs like Cisco and Dell EMC to offer infrastructure security and custom app development services to Fortune 500 companies in various sectors. With over 10,000 employees globally, we...
-
Senior Lead Observability Engineer
2 weeks ago
Bangalore, India London Stock Exchange Group Full time
We're seeking talented DevOps Engineers, Site Reliability Engineers, or Platform Engineers who want to join us and use their experience and technological skills to help build an outstanding centralized observability team. The role involves collaborating with development teams to instrument services, promoting standard methodologies...
-
Machine Learning Engineer
1 week ago
Bangalore, India beBeeMachineLearningEngineer Full time
Job Title: A senior machine learning engineer with a focus on observability platforms is required to develop scalable and reliable AI/ML systems.
-
Observability Engineer
2 weeks ago
Bangalore, India London Stock Exchange Group Full time
Key Responsibilities: Design and implement scalable telemetry pipelines for metrics, logs, traces, and events across distributed systems. Develop and maintain observability standards, NMS tooling, dashboards, alerting frameworks, and SLOs in collaboration with product and platform teams. Champion best practices in instrumentation, monitoring, and incident...
-
Cloud Engineer-Observability
1 day ago
Bangalore, India Smarsh Full time
About the team: The Observability team builds and manages the single telemetry and observability service used by all product teams on the Smarsh platform. It provides "as a service" telemetry, monitoring, and visualization capabilities that enable our product teams to operate, support, and triage the applications and services under their product portfolio. We...
-
Cloud Engineer-Observability
1 week ago
Bangalore, India Smarsh Full time
About the team: The Observability team builds and manages the single telemetry and observability service used by all product teams on the Smarsh platform. It provides "as a service" telemetry, monitoring, and visualization capabilities that enable our product teams to operate, support, and triage the applications and services under their product portfolio....
-
Cloud Engineer-Observability
7 days ago
Bangalore, India Smarsh Full time
About the team: The Observability team builds and manages the single telemetry and observability service used by all product teams on the Smarsh platform. It provides "as a service" telemetry, monitoring, and visualization capabilities that enable our product teams to operate, support, and triage the applications and services under their product portfolio....
-
Senior Platform Engineer
1 week ago
Bangalore, India FM Full time
About FM: FM is a 190-year-old, Fortune 500 commercial property insurance company of 6,000+ employees with a unique focus on science and risk engineering. Serving over a quarter of the Fortune 500 and major corporations globally, it delivers data-driven strategies that enhance resilience, ensure business continuity, and empower organizations to thrive. FM...
-
Cloud Engineer- Observability
1 week ago
Bangalore, India Smarsh Full time
Who are we? Smarsh empowers its customers to manage risk and unleash intelligence in their digital communications. Our growing community of over 6500 organizations in regulated industries counts on Smarsh every day to help them spot compliance, legal or reputational risks in 80+ communication channels before those risks become regulatory fines or headlines....