Observability Engineer

3 days ago


Chennai, Tamil Nadu, India Standard Chartered Full time

Job Summary

As the Technical Squad Lead, Central Platform Development, you will play a critical role in making the internal state of the bank's application and infrastructure services visible to stakeholders for troubleshooting, performance analysis, capacity planning and reporting through the Central Monitoring and Observability Platform. You will lead the development of the bank's central monitoring and observability platform and tooling to enable product owners, developers and operators to efficiently trace performance problems to their source and map their application performance to business objectives. You will lead the Predictive Monitoring, Predictive Observability and AIOps practices for the platform to enable the observability platform to predict issues early.

Strategy

  • Awareness and understanding of the TTO 25 business strategy and model appropriate to the role.
  • Support and enablement of the Central Monitoring & Observability strategy, goals and objectives by developing prioritized features aligned to the Catalyst and Tech Simplification programmes.

Business

The Monitoring & Observability Platform team is a global team ensuring the design, development, delivery and support of the bank's central monitoring and observability services for all TTO teams and technology domains. The ideal candidate will possess a deep understanding of one or more Observability technologies (Elastic Observability, Grafana Observability) and experience as an AIOps Specialist, Data Specialist, Machine Learning Engineer or Data Transformation Lead, enabling the design, development, implementation and management of machine learning frameworks and integrating advanced technological tools and techniques, with a strong focus on applying ML techniques to real-world problems.

  • Participation in weekend releases and overnight major incidents to help teams enable the Observability Predictive Capability is a must, as this is a key capability for the role.
  • Must have working experience in AIOps and MLOps.

Processes

As the Technical Squad Lead, Central Platform Development, you will play a crucial role in ensuring the stability, reliability and use of machine learning in our applications and platform integrations, thereby enabling our organization to deliver predictive observability services to our internal stakeholders by adhering to the Enterprise SDLC (eSDLC) framework and guidelines.

People & Talent

Actively engage in stakeholder conversations, providing timely, clear and actionable feedback to deliver solutions within timelines.

Key Responsibilities

Risk Management

The ability to interpret the Group's technical and security (ICS) control requirements and information, to identify potential risks and key issues based on this information, and to put in place appropriate controls and measures to mitigate or minimize risk to the central monitoring & observability platform delivery.

Governance

  • Awareness and understanding of the eSDLC framework in which TTO software delivery operates, and the requirements and expectations relevant to the role.
  • Responsible for the effectiveness of the central monitoring and observability platform's delivery governance, based on the oversight and controls of the eSDLC framework.

Regulatory & Business Conduct

  • Display exemplary conduct and live by the Group's Values and Code of Conduct.
  • Take personal responsibility for embedding the highest standards of ethics, including regulatory and business conduct, across Standard Chartered Bank. This includes understanding and ensuring compliance with, in letter and spirit, all applicable laws, regulations, guidelines and the Group Code of Conduct.
  • Effectively and collaboratively identify, escalate, mitigate and resolve risk, conduct and compliance matters.

Key stakeholders

  • TTO CIO Development teams
  • TTO Product Owners
  • TTO SRE / PSS
  • TTO Cloud Engineering
  • ET Foundation Service Owners

Other Responsibilities

  • Participate in solution architecture design, consulting, platform management and capacity planning activities.
  • Create sustainable solutions and services through automation and service uplifts within the monitoring and observability disciplines.
  • Participation in weekend releases and overnight major incidents to help teams enable the Observability Predictive Capability is a must.

Skills and Experience

  • Agile Delivery
  • Application Delivery Process
  • AIOps Specialist
  • Machine Learning Frameworks
  • Software Product Technical Knowledge
  • Monitoring and Observability Experience
  • Data Pipelines
  • Application Programming Integration

Qualifications

  • Education: Degree
  • Training: Agile delivery, DevOps, predictive monitoring
  • Licenses: Any
  • Membership: Any
  • Certifications: Machine learning, AIOps
  • Languages: English

Our ideal candidate should have

  • An overall minimum of 8 years of IT experience.
  • A Bachelor's Degree in Computer Science or Information Systems, or equivalent applicable experience.
  • Proven experience (4 years) working as an Observability Monitoring Specialist, AIOps Specialist, Data Transformation Lead or in a similar role, with a strong focus on applying observability techniques to real-world problems to reduce incident impacts.
  • The ability to design and develop AI-powered solutions for Observability using machine learning techniques and appropriately chosen models.
  • Experience mentoring a team by creating structure for the book of work, and helping the team maintain an organised product backlog.
  • Participation in weekend releases and overnight major incidents to help teams enable the Observability Predictive Capability is a must, as this is a key capability for the role.
  • Hands-on experience with machine learning frameworks (e.g. TensorFlow, PyTorch, SKLearn, XGBoost) and proficiency in programming languages and tools such as Python, Hive, Spark and PySpark.
  • Working experience in AIOps, creating data ingestion and ETL pipelines, aggregation, analytics and machine learning.
  • Working experience in AIOps and MLOps using traditional and Gen-AI LLM models such as Mistral, Llama and BERT.
  • Enables the responsible use of AI, and enables AI/ML technologies to identify historical trends, perform dynamic baselining and drive root cause analysis actions.
  • Enables the use of machine learning as an assistant to help observability analysts, product owners, incident managers and hive leads arrive at the right actions.
  • Addresses problems through risk management and contingency planning.
  • Knowledge of the software development life cycle across the analysis, development and testing phases.
  • Problem-solving skills using open source technologies and solutions.

About Standard Chartered

We're an international bank, nimble enough to act, big enough for impact. For more than 170 years, we've worked to make a positive difference for our clients, communities and each other. We question the status quo, love a challenge and enjoy finding new opportunities to grow and do better than before. If you're looking for a career with purpose and you want to work for a bank making a difference, we want to hear from you. You can count on us to celebrate your unique talents, and we can't wait to see the talents you can bring us.

Our purpose, to drive commerce and prosperity through our unique diversity, together with our brand promise, to be here for good, are achieved by how we each live our valued behaviours. When you work with us, you'll see how we value difference and advocate inclusion.

Together we:

  • Do the right thing and are assertive, challenge one another, and live with integrity, while putting the client at the heart of what we do.
  • Never settle, continuously striving to improve and innovate, keeping things simple and learning from doing well, and not so well.
  • Are better together: we can be ourselves, be inclusive, see more good in others, and work collectively to build for the long term.

What we offer

In line with our Fair Pay Charter, we offer a competitive salary and benefits to support your mental, physical, financial and social wellbeing.

  • Core bank funding for retirement savings, medical and life insurance, with flexible and voluntary benefits available in some locations.
  • Time-off including annual leave, parental/maternity (20 weeks), sabbatical (12 months maximum) and volunteering leave (3 days), along with minimum global standards for annual and public holiday, which is combined to 30 days minimum.
  • Flexible working options based around home and office locations, with flexible working patterns.
  • Proactive wellbeing support through Unmind, a market-leading digital wellbeing platform, development courses for resilience and other human skills, a global Employee Assistance Programme, sick leave, mental health first-aiders and all sorts of self-help toolkits.
  • A continuous learning culture to support your growth, with opportunities to reskill and upskill and access to physical, virtual and digital learning.
  • Being part of an inclusive and values-driven organisation, one that embraces and celebrates our unique diversity across our teams, business functions and geographies - everyone feels respected and can realise their full potential.

40310



  • Chennai, India Michael Page Full time

    World-renowned organization. Fast-track growth. About Our Client: The hiring organisation specialises in providing logistics services. They are known for their commitment to technological advancements and operational excellence. Job Description: Design and implement observability solutions to monitor IT systems and applications effectively. Collaborate with...


  • Chennai, Tamil Nadu, India VCS Staffing Geek Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Urgent Hiring. Role: Observability Engineer. Office Location: Chennai, India. Work Mode: Hybrid. Responsibilities: Building data pipelines: Design, build, and maintain observability data pipelines to ingest metrics, logs, and traces, ideally using the OpenTelemetry (OTEL) standard. Scripting & Automation: Develop and maintain automation scripts and tools to streamline...


  • Chennai, India Theomnihire Full time

    Job description: The Engineer/Senior Engineer – Observability Engineering is a key member of Service Reliability Engineering. He/she will be ultimately responsible for system observability, reliability monitoring and reducing time to detect by continuously fine-tuning the monitoring infrastructure of the services our SRE team supports. As a...


  • Chennai, India Theomnihire Full time

    Job description: The Engineer/Senior Engineer – Observability Engineering is a key member of Service Reliability Engineering. He/she will be ultimately responsible for system observability, reliability monitoring and reducing time to detect by continuously fine-tuning the monitoring infrastructure of the services our SRE team supports. As a Reliability...


  • Chennai, India Hapag-Lloyd Full time

    IT Engineer – Observability Integration. Full Time. Perungudi, Chennai, Tamil Nadu, India. With Professional Experience. 7/3/25. About Hapag-Lloyd: With a fleet of modern container ships and a Vessel Capacity of 2.2 million TEU, as well as a Container Capacity of 3.2 million TEU including one of the world’s largest and most modern reefer container fleets,...


  • Chennai, India Hapag-Lloyd AG Full time

    IT Engineer – Observability Integration. Full Time. Perungudi, Chennai, Tamil Nadu, India. With Professional Experience. 7/3/25. About Hapag-Lloyd: With a fleet of 287 modern container ships and a Vessel Capacity of 2.2 million TEU, as well as a Container Capacity of 3.2 million TEU including one of the world’s largest and most modern reefer container fleets,...


  • Chennai, Tamil Nadu, India TOCUMULUS Full time ₹ 1,00,00,000 - ₹ 2,00,00,000 per year

    About Us: ToCumulus Technology Solutions is a leading IT solutions provider specializing in cloud transformation, application modernization, and digital innovation. We are committed to helping enterprises like ADNOC accelerate their digital journey and modernize their legacy applications. Job description: The Engineer/Senior Engineer – Observability...


  • Chennai, Tamil Nadu, India VCS Staffing Geek Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Job type: 1-Year Extendable Contract. Role: Observability Migration Engineer. Location: Chennai. Mode: Hybrid. Responsibilities: Building data pipelines: Design, build, and maintain observability data pipelines to ingest metrics, logs, and traces, ideally using the OpenTelemetry (OTEL) standard. Scripting & Automation: Develop and maintain automation scripts and tools...


  • Chennai, Tamil Nadu, India HariNex Solutions Full time ₹ 9,60,000 - ₹ 15,60,000 per year

    Job description: Engineer/Senior Engineer – Observability. Location: Chennai (Preferred) / Mumbai. Role Type: Contract. Grafana Developer Expertise (Grafana, Prometheus, Splunk) with 2~3 years of experience. The Engineer/Senior Engineer – Observability Engineering is a key member of Service Reliability Engineering. He/she will be ultimately responsible for system...


  • Chennai, India INDIGLOBE IT SOLUTIONS PRIVATE LIMITED Full time

    Job Description: Key Responsibilities: - Set up, configure, and manage Dynatrace agents across on-premises, AWS, and Kubernetes environments. - Build custom dashboards, alerts, and monitoring views tailored for various teams and business units. - Collect, integrate, and analyze observability data from cloud, on-prem, and containerized environments. - Develop...