ML Ops Engineer
4 weeks ago
Role Brief: We are seeking a skilled ML Ops Engineer to design, implement, and maintain scalable machine learning and large language model (LLM) pipelines in cloud environments, primarily using AWS services. This role is critical to ensuring the reliability, efficiency, and performance of ML systems in production. The ideal candidate will have hands-on experience with AWS tools such as SageMaker, Lambda, Bedrock, and Batch with Fargate, and with infrastructure components like RDS, DynamoDB, and SQS. You will be responsible for automating CI/CD workflows, managing auto-scaling APIs, and provisioning cloud resources to support high-performance ML workloads, including RAG systems.

Primary Responsibilities:
- Plan and implement scalable infrastructure for ML or LLM model pipelines using tools and cloud services such as AWS (e.g., AWS Batch, Fargate, Bedrock).
- Manage auto-scaling mechanisms to handle varying workloads and ensure high availability of REST APIs.
- Automate CI/CD pipelines and Lambda functions for model testing, deployment, and updates, reducing manual errors and improving efficiency.
- Use Amazon SageMaker Pipelines for end-to-end ML workflow automation, and optimize workflows using Step Functions.
- Conduct drift analysis to detect and respond to data drift, concept drift, and label drift.
- Implement mitigation strategies such as automated alerts, model retraining triggers, and performance audits.
- Set up reproducible workflows for data preparation, model training, and deployment.
- Provision and optimize cloud resources (e.g., GPUs, memory) to meet the computational demands of large models like those used in RAG systems.
- Automate retraining workflows to keep models updated as data evolves.
- Work closely with data scientists, ML engineers, and DevOps teams to integrate models into production environments.
- Implement monitoring tools to track model performance and detect issues like drift or degradation in real time.
- Build monitoring dashboards with real-time alerts for pipeline failures or performance issues.
- Implement model observability frameworks.

Required Skills:
- Education: any engineering degree (BE/BTech/ME/MTech).
- Minimum 4 years of experience with AWS services such as Lambda, Bedrock, Batch with Fargate, RDS (PostgreSQL), DynamoDB, SQS, CloudWatch, API Gateway, and SageMaker.
- Hands-on experience in drift analysis, including detecting and mitigating data, concept, and label drift in production ML systems.
- Knowledge of ML frameworks (e.g., PyTorch, TensorFlow) to understand model requirements during deployment.
- Experience with REST API frameworks such as FastAPI and Flask.
- Familiarity with model observability tools such as Evidently, NannyML, and Phoenix, monitoring tools (Grafana, etc.), and retraining tools such as MLflow, Kubeflow, and Airflow.
- AWS Certified Machine Learning – Specialty certification is good to have.
-
ML Ops Engineer
3 weeks ago
New Delhi, India SatSure Full time
We are looking for a Machine Learning Operations Engineer to join our team to design, build, and integrate ML Ops for large-scale, distributed machine learning systems, focusing on cutting-edge tools, distributed GPU training, and enhancing research experimentation.
About SatSure: SatSure is a deep tech, decision intelligence company that works primarily at...
-
ML Ops Engineer
5 days ago
Delhi, India People Prime Worldwide Full time
About Company: Our client is a trusted global innovator of IT and business services. They help clients transform through consulting, industry solutions, business process services, digital & IT modernization, and managed services. Our client enables them, as well as society, to move confidently into the digital future. We are committed to our clients’...
-
ML Ops Engineer
6 days ago
Delhi, India People Prime Worldwide Full time
About Company: Our client is a trusted global innovator of IT and business services. They help clients transform through consulting, industry solutions, business process services, digital & IT modernization, and managed services. Our client enables them, as well as society, to move confidently into the digital future. We are committed to our clients’...
-
Lead Software Engineer: ML Ops
4 weeks ago
New Delhi, India AppZen Full time
Role: Lead Software Engineer: ML Ops & System Engineering
About Us: AppZen is the leader in autonomous spend-to-pay software. Its patented artificial intelligence accurately and efficiently processes information from thousands of data sources so that organizations can better understand enterprise spend at scale to make smarter business decisions. It...
-
ML Ops
2 weeks ago
Delhi, India EXL Full time
Around 2+ years of prior experience working with ML Ops & DS.
Responsibilities & Skills:
- Deploy, monitor, and scale ML models on AWS (SageMaker, EKS, Lambda) or GCP (Vertex AI, GKE, Cloud Functions).
- Build and maintain CI/CD pipelines for ML workflows using GitHub Actions / Jenkins / cloud-native tools.
- Containerize and orchestrate workloads with Docker & Kubernetes;...
-
Senior ML Ops Engineer-Databricks
1 week ago
New Delhi, India ValueMomentum Full time
Job Responsibilities:
- Evaluate and source appropriate cloud infrastructure solutions for machine learning needs, ensuring cost-effectiveness and scalability based on project requirements.
- Automate and manage the deployment of machine learning models into production environments, ensuring version control for models and datasets using tools like Docker and...
-
Senior ML Ops Engineer-Databricks
3 days ago
New Delhi, India ValueMomentum Full time
Job Responsibilities:
- Evaluate and source appropriate cloud infrastructure solutions for machine learning needs, ensuring cost-effectiveness and scalability based on project requirements.
- Automate and manage the deployment of machine learning models into production environments, ensuring version control for models and datasets using tools like Docker and...
-
ML Ops Engineer
1 week ago
New Delhi, India Maaze Underwriting Solutions Pvt. Ltd Full time
About the Role
We are seeking an experienced MLOps Engineer to design, build, and maintain scalable machine learning infrastructure with a strong focus on the Azure cloud ecosystem. You will deploy and optimize ML/AI models in production, with emphasis on GPU-accelerated workloads and large language models.
Key Responsibilities
Design and implement MLOps pipelines...