MLOps Engineer - Billion Dollar US Enterprise Software
1 week ago
Role Focus: Production ML Systems | GPU Orchestration | Inference at Scale

What You'll Actually Do (Not Buzzwords)

Infrastructure That Doesn't Break
• Design and maintain the backbone for training, fine-tuning, and deploying ML models that actually work in production
• Orchestrate GPU workloads on Kubernetes (EKS) with node autoscaling, intelligent bin-packing, and cost-aware scheduling (spot instances, preemptibles; you know the drill)
• Build CI/CD pipelines that handle ML code, data versioning, and model artifacts like a well-oiled machine (GitHub Actions, Argo Workflows, Terraform)

Production ML, Not Science Projects
• Partner with Data Scientists and ML Engineers to turn Jupyter notebooks into production-grade systems
• Deploy and scale inference backends (vLLM, Hugging Face, NVIDIA Triton) that serve real traffic
• Optimize GPU utilization because every idle A100 hour is money burning
• Build observability that actually tells you why things broke (Prometheus, Grafana, OpenTelemetry)

Ship Fast, Sleep Well
• Create tooling for seamless model deployment, instant rollback, and A/B testing
• Lead incident response when production AI systems decide to have opinions
• Work with security and compliance teams to implement best practices without slowing down innovation

What We're Really Looking For

Must-Haves (No Negotiation)
• 5+ years in MLOps, infrastructure, or platform engineering: you've been in the trenches
• Production ML experience: At least one project that's serving real users, not a Kaggle competition
• Kubernetes expertise with GPUs: You understand taints, tolerations, affinity rules, and why GPU scheduling is its own special hell
• Cloud-native architecture (AWS preferred): You think in VPCs, IAM roles, and cost optimization
• Training pipeline experience: Set up or scaled training/fine-tuning for ML models in production (PyTorch Lightning, Hugging Face Accelerate, DeepSpeed)
• IaC fluency: Terraform, Helm, Kustomize are second nature
• Python engineering skills: You can debug a distributed training failure and fix it
• Inference scaling: You've deployed and scaled inference workloads and lived to tell the tale

The "We're Very Interested" Signals
• You mention scaling inference and we can see the fire in your eyes
• You've used MLflow, W&B, or SageMaker Experiments and have opinions on which is best
• You understand CI/CD for ML and why it's different from regular software
• You've built monitoring systems that caught issues before users did

Nice to Have (But Seriously Nice)
• GPU scheduling wizardry in Kubernetes
• Model drift monitoring and versioning tools
• Low-latency inference optimization (quantization, FP8, TensorRT: the good stuff)
• Experience in compliance or regulated industries where "just ship it" isn't an option

What Makes This Role Different
Ownership. You're not a ticket-taker or a consultant passing through. You'll own infrastructure that powers real AI products, make architectural decisions that matter, and have the autonomy to build things the right way.
Impact. Your work directly affects model training speed, inference latency, GPU costs, and system reliability. You'll see the results of your optimizations in dollars saved and milliseconds gained.
Quality over speed. We value security, operational excellence, and sustainable systems. No "move fast and break things" chaos here; we move deliberately and build things that last.

The Reality Check

This role is not for you if:
• You prefer working on proofs-of-concept over production systems
• You think "it works on my machine" is an acceptable answer
• You haven't shipped ML systems to production
• You're looking for pure research or pure DevOps (this is the intersection)

This role is for you if:
• You get excited about making GPUs go brrr efficiently
• You've been on call for ML systems and learned hard lessons
• You believe infrastructure is a product, not an afterthought
• You want to build the foundation for AI that actually works

Write to MLOps@CareerXperts.com to get connected
-
Staff Software Development Engineer
24 hours ago
Bangalore, India | Razorpay | Full time
Razorpay was founded by Shashank Kumar and Harshil Mathur in 2014. Razorpay is building a new-age digital banking hub (Neobank) for businesses in India with the mission to enable frictionless banking and payments experiences for businesses of all shapes and sizes. What started as a B2B payments company is processing billions of dollars of payments for...
-
MLOps Engineer
3 days ago
Bangalore, India | Acura Solution | Full time
Job Description:
Experience Range: 5 to 12 yrs (min 4 years relevant)
Responsibilities:
• Enable model tracking, model experimentation, model automation
• Develop ML pipelines to support
• Develop MLOps components in Machine learning development life cycle using
Model Repository (either of): MLflow, Kubeflow Model Registry
Machine Learning Services (either of):...
-
Enterprise Architect
2 weeks ago
Bangalore District, India | Bosch Global Software Technologies | Full time
Job Description
Job Summary: We are seeking an experienced Enterprise Architect to design and govern the enterprise-wide AI platform architecture by bringing Architecture best principles. This role will be responsible for defining the technical vision, architectural standards, and integration patterns for AI/ML, Generative AI and Agentic AI capabilities...
-
Enterprise Architect
3 weeks ago
Bangalore Division, India | Bosch Global Software Technologies | Full time
Job Description
Job Summary: We are seeking an experienced Enterprise Architect to design and govern the enterprise-wide AI platform architecture by bringing Architecture best principles. This role will be responsible for defining the technical vision, architectural standards, and integration patterns for AI/ML, Generative AI and Agentic AI capabilities...
-
Senior Mlops Engineer
1 week ago
Bangalore, Karnataka, India | Elanco | Full time
At Elanco (NYSE: ELAN), it all starts with animals. As a global leader in animal health, we are dedicated to innovation and delivering products and services to prevent and treat disease in farm animals and pets. We're driven by our vision of Food and Companionship Enriching Life and our approach to sustainability, the Elanco Healthy Purpose™, to advance the...
-
Lead MLOps Engineer
2 weeks ago
Bangalore, India | MathCo | Full time
As a Lead MLOps Engineer, you will play a pivotal role in building scalable and reliable machine learning infrastructure for enterprise-grade applications. We are looking for a Lead Data Engineer with strong exposure to MLOps practices, ideally someone with a core data engineering background who has worked on large-scale data platforms. This is a hybrid role...
-
Lead MLOps Engineer
2 weeks ago
Bangalore, India | MathCo | Full time
As a Lead MLOps Engineer, you will play a pivotal role in building scalable and reliable machine learning infrastructure for enterprise-grade applications. We are looking for a Lead Data Engineer with strong exposure to MLOps practices, ideally someone with a core data engineering background who has worked on large-scale data platforms. This is a hybrid role...
-
Cloud – MLOps Engineer
1 week ago
Bangalore South, Karnataka, India | Excellence and Eminence LLP | Full time | ₹ 2,00,00,000 - ₹ 4,00,00,000 per year
Position: Cloud – MLOps Engineer
Location: Bangalore (Work from Office)
Notice Period: Not more than 30 days
Summary: We are seeking a highly skilled Cloud – MLOps Engineer to design, implement, and manage scalable cloud-based machine learning infrastructure on AWS. The ideal candidate will have strong expertise in AWS cloud architecture, MLOps pipeline...
-
Senior MLOps/DevOps Engineer
2 weeks ago
Bangalore, India | Yubi | Full time
About Yubi
Yubi, formerly known as CredAvenue, is re-defining global debt markets by freeing the flow of finance between borrowers, lenders, and investors. We are the world's possibility platform for the discovery, investment, fulfillment, and collection of any debt solution. At Yubi, opportunities are plenty and we equip you with tools to seize them. In March...