Observability Engineer

6 days ago


Gurugram, India GSPANN Full time
Description

GSPANN is hiring an experienced Observability Engineer (AI Ops) with 12-15 years of expertise in monitoring, automation, and AI-driven operations. The role involves enhancing system reliability and performance through APM tools, cloud observability, scripting, and Site Reliability Engineering (SRE) practices.

Role and Responsibilities

  • Use Application Performance Management (APM) tools such as Dynatrace and LogicMonitor to monitor and enhance system performance.
  • Write and maintain automation scripts using Python and Bash to streamline monitoring and alerting processes.
  • Deploy and manage Splunk for log analysis, real-time monitoring, and root cause troubleshooting.
  • Operate and oversee Kubernetes clusters through Amazon Elastic Kubernetes Service (EKS) for high availability and scalability.
  • Implement observability solutions on Amazon Web Services (AWS) and Microsoft Azure to ensure cloud-based systems are monitored and well-managed.
  • Apply Site Reliability Engineering (SRE) principles to improve system resilience, scalability, and performance.
  • Incorporate AI and machine learning in observability workflows to enable predictive monitoring and boost operational efficiency.
  • Respond promptly to incidents and drive resolution efforts to minimize business disruptions.
  • Continuously analyze and tune system performance, using proactive monitoring and feedback loops.
  • Partner with development and operations teams to integrate observability tools and practices seamlessly across environments.

Skills and Experience

  • Hold a Bachelor’s degree in Computer Science, Information Technology, or a related discipline.
  • Bring 12-15 years of experience in observability engineering or related technical roles.
  • Demonstrate advanced proficiency in APM tools (e.g., Dynatrace, LogicMonitor), scripting languages (Python, Bash), and Splunk.
  • Have hands-on experience working with EKS, AWS, and Azure platforms.
  • Show deep understanding of SRE concepts and how to apply them in production environments.
  • Exhibit strong problem-solving and communication skills.
  • Thrive in a fast-paced and dynamic environment.
  • Hold certifications such as AWS Certified DevOps Engineer - Professional, Google Cloud Professional Cloud DevOps Engineer, or equivalent.
  • Apply knowledge of AI and machine learning techniques in operational contexts.
  • Understand and utilize performance optimization frameworks and related best practices.

  • Gurugram, India AHEAD Full time

    AHEAD builds platforms for digital business. By weaving together advances in cloud infrastructure, automation and analytics, and software delivery, we help enterprises deliver on the promise of digital transformation. At AHEAD, we prioritize creating a culture of belonging, where all perspectives and voices are represented, valued, respected, and heard. We...


  • Gurugram, India GSPANN Full time

    Description GSPANN is hiring an Observability Engineer with expertise in Site Reliability Engineering (SRE). The role focuses on leveraging SRE principles, automation, and AI-driven observability to enhance reliability and scalability across cloud and ERP environments. Role and Responsibilities Leverage Application Performance Management (APM) tools such...


  • Gurugram, India GSPANN Full time

    Description GSPANN is hiring an Observability Engineer with expertise in Site Reliability Engineering (SRE). The role focuses on leveraging SRE principles, automation, and AI-driven observability to enhance reliability and scalability across cloud and ERP environments. Role and Responsibilities Leverage Application Performance Management (APM) tools such as...


  • Gurugram, India AHEAD Full time

    AHEAD builds platforms for digital business. By weaving together advances in cloud infrastructure, automation and analytics, and software delivery, we help enterprises deliver on the promise of digital transformation. At AHEAD, we prioritize creating a culture of belonging, where all perspectives and voices are represented, valued, respected, and heard. We...


  • Gurugram, India GSPANN Full time

    Description GSPANN is hiring an experienced Observability Engineer (AI Ops) with 12-15 years of expertise in monitoring, automation, and AI-driven operations. The role involves enhancing system reliability and performance through APM tools, cloud observability, scripting, and Site Reliability Engineering (SRE) practices. Role and Responsibilities Use...


  • Gurugram, India Nexthire Full time

    Designation/Role: Cloud and Observability Engineer. Role: Cloud and Observability Engineer. Experience: 3-6 Years+. Location: Gurugram. About the Job: Coralogix is a modern, full-stack observability platform transforming...


  • Gurugram, India Nexthire Full time

    Designation/Role: Cloud and Observability Engineer. Role: Cloud and Observability Engineer. Experience: 3-6 Years+. Location: Gurugram. About the Job: Coralogix is a modern, full-stack observability platform...


  • Gurugram, India American Express Global Business Travel Full time

    Amex GBT is a place where colleagues find inspiration in travel as a force for good and – through their work – can make an impact on our industry. We’re here to help our colleagues achieve success and offer an inclusive and collaborative culture where your voice is valued. We’re transforming business travel technology. Amex GBT gives travellers...


  • Gurugram, India Success Pact Consulting Pvt Ltd Full time

    Description: Position: Infrastructure Engineer. Experience: 7+ Years (Principal or Staff Level). Job Type: Full-time. Job Summary: We are seeking a highly experienced Infrastructure Engineer at the Principal or Staff level, with 7+ years of specialized experience in cloud infrastructure, DevOps, or Site Reliability Engineering (SRE). This critical role...