SRE & DevOps Engineer (ML/AI Platform)
3 weeks ago
SRE & DevOps Engineer (ML/AI Platform)
Contract Position | Global E-Commerce Leader | Hybrid
About the Opportunity
We're partnering with a leading global e-commerce company to find an exceptional SRE & DevOps Engineer to join their AI Platform Team. This is your chance to shape the future of machine learning infrastructure that powers innovation for millions of users worldwide. In this role, you'll support cutting-edge AI platforms and services, working alongside researchers, data scientists, and engineering teams in a purpose-driven, inclusive environment.
What You'll Do
Platform Operations & Support: Support next-generation AI architecture for research and engineering teams; partner with vendors and infrastructure teams to ensure security and 99.999% service availability; diagnose and resolve production issues, including performance and functional challenges; provide technical support to customers and document solutions.
DevOps & Automation: Design and implement zero-downtime monitoring for highly available services; build CI/CD pipelines for automated deployment and configuration; identify automation opportunities to streamline problem management; develop operational standards for tools, versioning, source control, and deployment practices.
Continuous Improvement: Drive customer service enhancements and recommend product improvements; define engineering excellence and operational maturity standards; conduct customer training and generate insights reports; accelerate team efficiency through automation and knowledge sharing.
What You Bring
Required Expertise: 5+ years of experience; strong Python development skills, including data structures and algorithms, with experience designing, building, and releasing production software; hands-on experience with ML frameworks (PyTorch, TensorFlow, Triton); cloud-native technologies (Kubernetes, Docker, Linux); DevOps proficiency (CI/CD pipelines, Jenkins, test automation); framework troubleshooting (version upgrades, compatibility management); excellent debugging and triaging capabilities.
Preferred Skills: Experience with AI/ML model training and inference platforms; knowledge of LLM fine-tuning systems; performance monitoring and application deployment automation.
#SRE #DevOps #MLOps #AI #MachineLearning #Kubernetes #Python #PyTorch #TensorFlow #CloudEngineering #Hiring #TechJobs #ContractRole
-
Senior AI Platform DevOps Engineer
2 weeks ago
New Delhi, India Cloudely, Inc Full time
AI Platform DevOps/SRE Engineer
Location: India - 100% Remote
Full-time Permanent Position
Responsibilities/What You’ll Do
Platform Design and Architecture: building and operating a highly available, scalable, modular AI platform using technologies such as Qdrant, Anyscale, and Ray to support LLM orchestration, vector search, and multi-agent frameworks....
-
Senior AI Platform DevOps Engineer
1 week ago
New Delhi, India Cloudely, Inc Full time
AI Platform DevOps/SRE Engineer
Location: India - 100% Remote
Full-time Permanent Position
Responsibilities/What You’ll Do
Platform Design and Architecture: building and operating a highly available, scalable, modular AI platform using technologies such as Qdrant, Anyscale, and Ray to support LLM orchestration, vector search, and multi-agent frameworks....
-
Site Reliability Engineer
4 weeks ago
New Delhi, India Stoopa AI Full time
Company Description
Stoopa.AI is building next-generation AI-driven platforms for ports and is focused on reliability, speed, and intelligent automation. As we scale our next-generation smart port product Turi, we are hiring our first dedicated SRE/DevOps Engineer to build, optimize, and own our reliability engineering function from the ground up. This is a...
-
Senior DevOps Platform Engineer
3 weeks ago
New Delhi, India apna Full time
Job Title: Senior Engineer (SDE-2) – Platform Engineering
Location: Bengaluru
Employment Type: Full-time
Team: Platform Engineering
About the Role: We are looking for a passionate and hands-on DevOps Engineer to join our Platform Engineering team and accelerate our platform modernization journey. This role is ideal for engineers who thrive in automation-heavy...
-
Senior DevOps Platform Engineer
2 weeks ago
New Delhi, India apna Full time
Job Title: Senior Engineer (SDE-2) – Platform Engineering
Location: Bengaluru
Employment Type: Full-time
Team: Platform Engineering
About the Role: We are looking for a passionate and hands-on DevOps Engineer to join our Platform Engineering team and accelerate our platform modernization journey. This role is ideal for engineers who thrive in automation-heavy...
-
AI Platform Engineer
2 weeks ago
New Delhi, India BayOne Solutions Full time
Job Description
We are seeking a highly skilled AI Platform Engineer to design, build, and operate our next-generation AI application platform. In this role, you will work on advanced AI systems including Retrieval-Augmented Generation (RAG) pipelines, multi-model gateways, Model Context Protocol (MCP) tools, agentic workflow automations (e.g., n8n), and secure...
-
Senior AI Platform DevOps Engineer
2 weeks ago
Delhi, India Cloudely, Inc Full time
AI Platform DevOps/SRE Engineer
Location: India - 100% Remote
Full-time Permanent Position
Responsibilities/What You’ll Do
Platform Design and Architecture: building and operating a highly available, scalable, modular AI platform using technologies such as Qdrant, Anyscale, and Ray to support LLM orchestration, vector search, and multi-agent frameworks....
-
AI Data Platform Reliability
4 weeks ago
New Delhi, India Oracle Full time
Responsibilities
Design, develop, and execute end-to-end (E2E) scenario validations that simulate real-world usage of complex AI data platform workflows (data ingestion, transformation, ML pipeline orchestration, etc.). Collaborate closely with product, engineering, and field teams to identify gaps in coverage and propose test automation strategies. Develop...