Observability Engineer

3 weeks ago


Hyderabad, India | Mindlance | Full time
Job Summary:
We are seeking a highly skilled and motivated Grafana Dashboard Specialist with deep expertise in DevOps automation to join our team. The ideal candidate will be responsible for designing, developing, and maintaining advanced Grafana dashboards that provide actionable insights into system performance, application metrics, and business KPIs. Additionally, the candidate will be a Subject Matter Expert (SME) in automation, developing and contributing to CI/CD pipelines, infrastructure as code (IaC), and cloud-native operations for Grafana.
________________________________________
Key Responsibilities:
Grafana & Observability:
• Design and implement visually compelling and data-rich Grafana dashboards for Observability.
• Integrate Grafana Cloud with data sources such as Prometheus, Loki, ServiceNow, PagerDuty, Snowflake, and AWS.
• Integrate telemetry data sources such as Tomcat, Liberty, Ping, Linux, Windows, databases (Oracle, PostgreSQL), and REST APIs.
• Create alerting mechanisms for SLA breaches, latency spikes and transaction anomalies.
• Develop custom panels and alerts to monitor infrastructure, applications, and business metrics.
• Collaborate with stakeholders to understand monitoring needs and translate them into KPIs and visualization requirements.
• Optimize dashboard performance and usability across teams.
• Implement and manage OpenTelemetry instrumentation across services to collect distributed traces, metrics, and logs.
• Integrate OpenTelemetry data pipelines with Grafana and other observability platforms.
• Develop and maintain OpenTelemetry collectors and exporters for various environments.
• Develop and implement monitoring solutions for applications and infrastructure to ensure high availability and performance.
• Collaborate with development, operations, and other IT teams to ensure monitoring solutions are integrated and aligned with business needs.
DevOps & Automation:
• Architect, design and maintain CI/CD pipelines using tools such as Jenkins, Bitbucket, and Nexus.
• Implement Infrastructure as Code (IaC) using Terraform and Ansible.
• Automate deployment, scaling, and monitoring of both cloud-native and on-premises environments.
• Ensure system reliability, scalability, and security through automated processes.
• Collaborate with development and operations teams to streamline workflows and reduce manual intervention.
SME Responsibilities:
• Act as a technical advisor on automation and observability best practices.
• Lead initiatives to improve system performance, reliability, and developer productivity.
• Conduct training sessions and create documentation for internal teams.
• Stay current with industry trends and emerging technologies in DevOps and observability.
• Advocate for and guide the adoption of OpenTelemetry standards and practices across engineering teams.
• Optimize monitoring processes and tools to enhance efficiency and effectiveness.
Required Qualifications:
• Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
• + years of experience in DevOps, SRE, or infrastructure automation roles.
• + years of hands-on experience with Grafana and dashboard development.
• Strong proficiency in scripting languages (Python, Bash, Go).
• Experience with monitoring tools (Grafana Cloud, Prometheus, Loki, Dynatrace, Splunk, etc.).
• Deep understanding of CI/CD and cloud platforms (AWS and Azure).
• Expertise in Kubernetes, Docker, and container orchestration.
• Familiarity with security and compliance in automated environments.
• Hands-on experience with OpenTelemetry instrumentation and data collection.
________________________________________
Preferred Qualifications:
• Grafana certification or equivalent experience.
• Experience with custom Grafana plugins or panel development.
• Knowledge of business intelligence tools and data visualization principles.
• Contributions to open-source DevOps or observability projects.
• Strong communication and stakeholder management skills.
• Experience with OpenTelemetry Collector configuration and integration.
• Familiarity with distributed tracing concepts.



