
Lead - Site Reliability Engineer
4 days ago
We are looking for a Lead - Site Reliability Engineer with 8+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to SolarWinds. You should be comfortable working with metrics, logs, and traces, and be able to correlate telemetry data to proactively detect, diagnose, and resolve performance issues.
Key Responsibilities:
- Design and maintain observability pipelines using OpenTelemetry, Prometheus, and Grafana.
- Build dashboards and alerts to monitor system health, application performance, and business KPIs.
- Integrate observability solutions with Google Cloud Platform services and SolarWinds.
- Correlate logs, metrics, and traces to troubleshoot incidents and reduce MTTR.
- Collaborate with SREs, DevOps, and development teams to improve end-to-end system observability.
- Implement best practices for telemetry data collection, enrichment, storage, and visualization.
Requirements:
- Strong experience with Prometheus and Grafana for monitoring and alerting.
- Proficiency in OpenTelemetry for instrumenting distributed systems.
- Working knowledge of observability tools in Google Cloud (e.g., Cloud Monitoring, Logging, Trace).
- Exposure to SolarWinds for network and infrastructure monitoring.
- Solid understanding of telemetry data types: metrics, logs, and traces.
- Ability to correlate and analyze multi-source observability data.
- Scripting skills (Python, Bash); familiarity with Infrastructure-as-Code is a plus.
Preferred Qualifications:
- Experience in Site Reliability Engineering or Platform Engineering roles.
- Knowledge of SLIs/SLOs and performance benchmarking.
- Experience with APM tools (e.g., Datadog, New Relic) is a plus.
-
Lead Site Reliability Engineer
3 weeks ago
Hyderabad, Telangana, India
JP Morgan Chase & Co.
Full time
Job Description: Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking Team, you will take the lead in conducting resiliency design reviews, break...
-
Lead Site Reliability Engineer
4 days ago
Hyderabad, India
JP Morgan Chase & Co.
Full time
Job Description: Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within Consumer & Community Banking, you hold a leadership role in your team, demonstrate strong knowledge...
-
Lead Site Reliability Engineer
2 days ago
Hyderabad, India
Chase Bank
Full time
Job Description: Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within Consumer & Community Banking, you hold a leadership role in your team, demonstrate strong knowledge...
-
Lead - Site Reliability Engineer
2 days ago
Hyderabad, India
VXI Global Solutions
Full time
We are looking for a Lead - Site Reliability Engineer with 8+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to ...
-
Senior Site Reliability Engineer
7 days ago
Hyderabad, India
CloudHire
Full time
Job Summary: The Technical Manager for Site Reliability Engineering (SRE) will lead a remote team of Site Reliability Engineers, ensuring operational excellence and fostering a high-performing team culture. Reporting to the US-based Director of Systems and Security, this role is responsible for overseeing day-to-day operations, technical mentorship, and...