SRE Observability Architect
2 days ago
SRE Observability Architect - Description

Experience:
• Minimum 10 years of relevant work experience setting up monitoring with any product (Dynatrace, Datadog, ELK stack, Splunk, Grafana/Prometheus, etc.) in critical production environments.
• Minimum 5-6 years of work experience in end-to-end observability covering technical, user experience and business outcome metrics. Experience with AIOps is an advantage.
• Experience working with applications hosted on private cloud and cloud-native public cloud (particularly AWS).
• Multi-tenancy setup and data segregation on the observability and AIOps stack.
• Designing and building an Observability & Maintenance (O&M) module for multi-tenant solutions.
• Defining SLIs and setting up SLOs for multi-tenant solutions.

Core Capabilities:
• Experience in implementing Container, Network, APM, RUM, Log Analytics, end-to-end tracing, and custom alerts with Grafana, Prometheus, Grafana Loki (alternatively Logstash or Fluent Bit). Implementing the same on any other third-party product like Dynatrace is also considered.
• Proficiency with containers and multi-tenancy setup for the observability solution is critical.
• Ability to configure custom alerts and monitors, and to build AIOps workflows based on telemetry.
• Good understanding of setting up integration capabilities with other systems via APIs, consuming external APIs for IAM, and ingesting metric-based telemetry via collectors.
• Ability to build custom observability dashboards across different portfolios and personas.
• Setting up Synthetic Monitoring and Test Automation while integrating its telemetry into the observability stack.
• Tenant and data segregation, as well as the ability to obfuscate sensitive information on the common observability schema.
• Ability to code is preferable; Python / Java and Ansible scripting preferred.

Qualification:
• Observability Foundation certification from DevOps Institute or any product-level accreditation.
• Any recognized System Architecture qualifications (TOGAF) are a bonus.

Role & Responsibilities:
• Architect, design and ensure implementation of the entire observability solution to be packaged as a module in a multi-tenant private cloud solution.
• Implement the observability solution to monitor and apply the same feature set across all tenants (monitor and act upon telemetry from tenants, serving as a hypervisor).
• Design and implement integrations as well as externalize APIs.
• Set up authentication and authorization controls by integrating with an IAM layer.
• Work with UI/UX teams to design dashboards for the Observability & Maintenance platform for both the tenants and the host.
• Design and set up an AIOps module responsible for automated remediation workflows such as capacity scaling, container restarts, anomaly detection, etc.
• Work on building Proof-of-Concept solutions to view end-to-end tube-maps / service flows for the respective tenant’s services.
• Define and set up a CMDB to serve as a source for the infrastructure and application telemetry.
• Work with other teams to ensure the system is well-tested and scalable, meeting tenant demands.
• Define business-aligned SLIs and set SLOs for core services and journeys.

Primary Location: Bangalore, Karnataka, India
Job Type: Experienced
Years of Experience: 12
Travel: No
-
SRE - DevOps and Observability
2 days ago
Bengaluru, India EMBARKGCC SERVICES PRIVATE LIMITED Full time
Key Responsibilities - Own and manage AKS-based Kubernetes clusters (multi-tenant, namespace isolation). - Implement and maintain GitOps workflows using FluxCD and Helm. - Manage infrastructure as code with Terraform. - Build and operate observability stack (Prometheus, Grafana, Loki, Tempo) and integrate with external tools (Datadog, Dynatrace, Grafana...
-
Observability Platform and SRE Engineer
2 weeks ago
Bengaluru, Karnataka, India Kotak Mahindra Bank Full time ₹ 8,00,000 - ₹ 20,00,000 per year
Dev Ops Engineering III-SUPPORT SERVICES-Applications-CTB Title: Observability Platforms and SRE Engg. The Company: World of Kotak product suite encompasses a powerful suite of cross banking assets, all-in-one stop banking services, securities, and investment banking; insights across a wide spectrum of the major financial and banking markets. ...
-
SRE - DevOps and Observability
2 days ago
Bengaluru, India EMBARKGCC SERVICES PRIVATE LIMITED Full time
Job Description Key Responsibilities - Own and manage AKS-based Kubernetes clusters (multi-tenant, namespace isolation). - Implement and maintain GitOps workflows using FluxCD and Helm. - Manage infrastructure as code with Terraform. - Build and operate observability stack (Prometheus, Grafana, Loki, Tempo) and integrate with external tools (Datadog,...
-
SRE Engineer
4 weeks ago
Bengaluru, Pune, India Zentek Infosoft Full time
Job Description SRE Engineer / Architect / Consultant - Design and implement SRE practices - Build robust monitoring and alerting systems - Automate routine operational tasks - Ensure reliability, scalability, and high system performance - Deep understanding of cloud platforms - Familiar with containerization technologies Observability Engineer / Lead -...
-
Observability Architect
2 weeks ago
Bengaluru, Karnataka, India Infosys Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Observability Architect
Experience in Observability, Observability Architect, Dynatrace, Monitoring, Logging and alerting, End-to-End Visibility. Key Tools: Dynatrace, Splunk, New Relic, Prometheus and Grafana (Our immediate requirement is for Dynatrace and Splunk)
-
SRE – Cloud Security and Observability
2 weeks ago
Bengaluru, Karnataka, India RapidCircle Advisory Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Making a difference and driving positive change is what we do every day at Rapid Circle. Our Cloud Pioneers help our clients in their digital transformation. Are you someone who goes for constant, positive change? Then this vacancy is for you. As a Cloud Pioneer at Rapid Circle, you will work with our customers on different projects. For example, making impact...
-
SRE, Observability System Administrator
2 days ago
Bengaluru, India Toast Full time
The Observability System Administrator role at Toast fits within the Observability Enablement & Administration team, which is part of Site Reliability Engineering, responsible for overseeing Toast production services, with a commitment to quality, reliability, and low latency. The Observability Enablement & Administration team is responsible for setting the...
-
Observability Architect
2 weeks ago
Bengaluru, Karnataka, India Infosys Full time ₹ 1,00,00,000 - ₹ 2,00,00,000 per year
Key Responsibilities: Experience in Observability, Observability Architect, Dynatrace, Monitoring, Logging and alerting, End-to-End Visibility. Key Tools: Dynatrace, Splunk, New Relic, Prometheus and Grafana. Our immediate requirement is for Dynatrace and Splunk. Preferred Skills: Technology->DevOps->Continuous delivery - Continuous deployment and release->dbDeply
-
SRE, Observability System Administrator
1 week ago
Bengaluru, Karnataka, India Toast Full time ₹ 12,00,000 - ₹ 36,00,000 per year
The Observability System Administrator role at Toast fits within the Observability Enablement & Administration team, which is part of Site Reliability Engineering, responsible for overseeing Toast production services, with a commitment to quality, reliability, and low latency. The Observability Enablement & Administration team is responsible for setting the...
-
SRE, Observability System Administrator
1 week ago
Bengaluru, Karnataka, India Toast Full time ₹ 12,00,000 - ₹ 24,00,000 per year
The Observability System Administrator role at Toast fits within the Observability Enablement & Administration team, which is part of Site Reliability Engineering, responsible for overseeing Toast production services, with a commitment to quality, reliability, and low latency. The Observability Enablement & Administration team is responsible for setting the...