
Cloud AIOps and MLOps Operations Manager
1 week ago
Job Description:
We are seeking a highly skilled Cloud AIOps and MLOps Operations Manager to join our team. The successful candidate will be responsible for implementing AIOps strategies, deploying observability solutions, and enabling AI-driven anomaly detection and root cause analysis.
The ideal candidate will have 5+ years of experience in technology work, with a strong background in Data & Analytics and cloud-based platforms. They will possess excellent communication skills, a customer-centric approach, and a growth mindset.
Responsibilities:
- Implement AIOps strategies for automating IT operations using Azure Monitor, Azure Log Analytics, and AI-driven alerting.
- Deploy Azure-based observability solutions (Azure Monitor, Application Insights, Azure Synapse for log analytics, and Azure Data Explorer) to enhance real-time system performance monitoring.
- Enable AI-driven anomaly detection and root cause analysis by collaborating with data science teams using Azure Machine Learning (Azure ML) and AI-powered log analytics.
- Contribute to developing self-healing and auto-remediation mechanisms using Azure Logic Apps, Azure Functions, and Power Automate to proactively resolve system issues.
- Support ML lifecycle automation using Azure ML, Azure DevOps, and Azure Pipelines for CI/CD of ML models.
- Assist in deploying scalable ML models with Azure Kubernetes Service (AKS), Azure Machine Learning Compute, and Azure Container Instances.
- Automate feature engineering, model versioning, and drift detection using Azure ML Pipelines and MLflow.
- Optimize ML workflows with Azure Data Factory, Azure Databricks, and Azure Synapse Analytics for data preparation and ETL/ELT automation.
- Implement basic monitoring and explainability for ML models using Azure Responsible AI Dashboard and InterpretML.
- Collaborate with Data Science, DevOps, CloudOps, and SRE teams to align AIOps/MLOps strategies with enterprise IT goals.
- Work closely with business stakeholders and IT leadership to implement AI-driven insights and automation to enhance operational decision-making.
- Track and report AI/ML operational KPIs, such as model accuracy, latency, and infrastructure efficiency.
- Assist in coordinating with cross-functional teams to maintain system performance and ensure operational resilience.
- Support the implementation of AI ethics, bias mitigation, and responsible AI practices using Azure Responsible AI Toolkits.
- Ensure adherence to Azure Information Protection (AIP), Role-Based Access Control (RBAC), and data security policies.
- Assist in developing risk management strategies for AI-driven operational automation in Azure environments.
- Prepare and present program updates, risk assessments, and AIOps/MLOps maturity progress to stakeholders as needed.
- Support efforts to attract and build a diverse, high-performing team to meet current and future business objectives.
- Help remove barriers to agility and enable the team to adapt quickly to shifting priorities without losing productivity.
- Contribute to developing the appropriate organizational structure, resource plans, and culture to support business goals.
- Leverage technical and operational expertise in cloud and high-performance computing to understand business requirements and earn trust with stakeholders.
Qualifications:
- 5+ years of technology work experience in a global organization, preferably in CPG or a similar industry.
- 5+ years of experience in the Data & Analytics field, with exposure to AI/ML operations and cloud-based platforms.
- 5+ years of experience working within cross-functional IT or data operations teams.
- 2+ years of experience in a leadership or team coordination role within an operational or support environment.
- Experience in AI/ML pipeline operations, observability, and automation across platforms such as Azure, AWS, and GCP.
- Excellent Communication: Ability to convey technical concepts to diverse audiences and empathize with stakeholders while maintaining confidence.
- Customer-Centric Approach: Strong focus on delivering the right customer experience by advocating for customer needs and ensuring issue resolution.
- Problem Ownership & Accountability: Proactive mindset to take ownership, drive outcomes, and ensure customer satisfaction.
- Growth Mindset: Willingness and ability to adapt and learn new technologies and methodologies in a fast-paced, evolving environment.
- Operational Excellence: Experience in managing and improving large-scale operational services with a focus on scalability and reliability.
- Site Reliability & Automation: Understanding of SRE principles, automated remediation, and operational efficiencies.
- Cross-Functional Collaboration: Ability to build strong relationships with internal and external stakeholders through trust and collaboration.
- Familiarity with CI/CD processes, data pipeline management, and self-healing automation frameworks.
- Strong understanding of data acquisition, data catalogs, data standards, and data management tools.
- Knowledge of master data management concepts, data governance, and analytics.