
Lead - Site Reliability Engineer
2 days ago
We are looking for a Lead - Site Reliability Engineer with 8+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to SolarWinds. You should be comfortable working with metrics, logs, and traces, and be able to correlate telemetry data to proactively detect, diagnose, and resolve performance issues.

Key Responsibilities:
Design and maintain observability pipelines using OpenTelemetry, Prometheus, and Grafana.
Build dashboards and alerts to monitor system health, application performance, and business KPIs.
Integrate observability solutions with Google Cloud Platform services and SolarWinds.
Correlate logs, metrics, and traces to troubleshoot incidents and reduce MTTR.
Collaborate with SREs, DevOps, and development teams to improve end-to-end system observability.
Implement best practices for telemetry data collection, enrichment, storage, and visualization.

Requirements:
Strong experience with Prometheus and Grafana for monitoring and alerting.
Proficiency in OpenTelemetry for instrumenting distributed systems.
Working knowledge of observability tools in Google Cloud (e.g., Cloud Monitoring, Logging, Trace).
Exposure to SolarWinds for network and infrastructure monitoring.
Solid understanding of telemetry data types: metrics, logs, and traces.
Ability to correlate and analyze multi-source observability data.
Scripting skills (Python, Bash) and familiarity with Infrastructure-as-Code are a plus.

Preferred Qualifications:
Experience in Site Reliability Engineering or Platform Engineering roles.
Knowledge of SLIs/SLOs and performance benchmarking.
Experience with APM tools (e.g., Datadog, New Relic) is a plus.
-
Lead Site Reliability Engineer
3 weeks ago
Hyderabad, Telangana, India JP Morgan Chase & Co. Full time
Job Description: Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking Team, you will take the lead in conducting resiliency design reviews, break...
-
Lead Site Reliability Engineer
4 days ago
Hyderabad, India JP Morgan Chase & Co. Full time
Job Description: Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking, you hold a leadership role in your team, demonstrate strong knowledge...
-
Lead Site Reliability Engineer
2 days ago
Hyderabad, India Chase Bank Full time
Job Description: Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking, you hold a leadership role in your team, demonstrate strong knowledge...
-
Lead - Site Reliability Engineer
3 weeks ago
Hyderabad, Telangana, India VXI Global Solutions Full time
We are looking for a Lead - Site Reliability Engineer with 8+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to SolarWinds. You...
-
Lead - Site Reliability Engineer
7 days ago
Hyderabad, India VXI Global Solutions Full time
We are looking for a Lead - Site Reliability Engineer with 8+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to ...
-
Lead - Site Reliability Engineer
7 days ago
Hyderabad, India VXI Global Solutions Full time
We are looking for a Lead - Site Reliability Engineer with 8+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to SolarWinds. You...
-
Lead - Site Reliability Engineer
6 days ago
Hyderabad, India VXI Global Solutions Full time
We are looking for a Lead - Site Reliability Engineer with 8+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to SolarWinds. You...
-
Lead - Site Reliability Engineer
6 days ago
Hyderabad, India VXI Global Solutions Full time
We are looking for a Lead - Site Reliability Engineer with 8+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to SolarWinds. You...
-
Lead - Site Reliability Engineer
4 days ago
Hyderabad, India VXI Global Solutions Full time
We are looking for a Lead - Site Reliability Engineer with 8+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to SolarWinds. You should...
-
Senior Site Reliability Engineer
7 days ago
Hyderabad, India CloudHire Full time
Job Summary: The Technical Manager for Site Reliability Engineering (SRE) will lead a remote team of Site Reliability Engineers, ensuring operational excellence and fostering a high-performing team culture. Reporting to the US-based Director of Systems and Security, this role is responsible for overseeing day-to-day operations, technical mentorship, and...