MLOps Engineer

4 days ago


Malappuram, India Recro Full time

Role Overview

We are looking for an experienced MLOps Lead with deep expertise in Azure and AWS cloud ecosystems, who can design, deploy, and manage scalable AI/ML infrastructure. The ideal candidate should bring a strong background in cloud governance, Gen AI tooling, automation, and CI/CD pipelines, with hands-on experience across modern MLOps frameworks.

Key Responsibilities

  • Design, implement, and manage scalable cloud-based AI/ML infrastructure across Azure and AWS.
  • Drive the end-to-end MLOps lifecycle: model deployment, monitoring, retraining, and governance.
  • Enable Gen AI and Agentic AI platforms leveraging Azure OpenAI, Bedrock, Anthropic Claude, LangChain, etc.
  • Implement CI/CD pipelines using Azure DevOps or AWS CodePipeline.
  • Ensure security, observability, and compliance across ML and Gen AI ecosystems.
  • Manage infrastructure automation via Terraform, Bicep, CloudFormation, or similar IaC tools.
  • Collaborate with data science and engineering teams to optimize ML workflows, data pipelines, and API integrations.
  • Implement monitoring and alerting using Grafana, Prometheus, Azure Monitor, and Application Insights.
  • Oversee networking, identity management, and role-based access controls (IAM, RBAC) across clouds.
  • Support model lifecycle management: drift monitoring, retraining, technical evaluation, and business validation.

Technical Skills & Expertise

Cloud & MLOps Platforms

  • Azure: Azure ML, Azure AI Services, Azure OpenAI, Azure Kubernetes Service (AKS), Databricks, Azure Search, Azure Blob, Cosmos DB, Azure SQL, Azure Functions, Azure Event Hub, Azure Resource Manager (ARM), Bicep.
  • AWS: SageMaker, Bedrock, Lambda, DynamoDB, S3, RDS, Redshift, ECR, CloudFormation, CDK, KMS, EventBridge, Step Functions.

AI/ML & Programming

  • Hands-on in Python, with exposure to TensorFlow, PyTorch, scikit-learn.
  • Understanding of LLM tokenization, prompt injection risks, jailbreak prevention, and AI safety techniques.
  • Familiarity with LangChain, LlamaCloud, AI Foundry, and related frameworks.
  • Experience in model monitoring, retraining, and evaluation workflows.

DevOps & Infrastructure

  • Expertise in CI/CD pipelines, containerization (Docker, Kubernetes), and infrastructure automation.
  • Strong in governance, audit logging, security policies (Azure Policy, AWS SCP, IAM).
  • Deep understanding of networking, DNS, load balancers, VNets/VPCs, VPNs.
  • Skilled in IaC tools: Terraform, Bicep, ARM, CloudFormation.

Monitoring & Observability

  • Experience with Grafana, Prometheus, Application Insights, Log Analytics Workspaces, Azure Monitor.

Security & Access Management

  • Understanding of Microsoft AD, least privilege principles, IAM, RBAC.

Testing & Automation

  • Familiarity with unit testing and integration testing in CI/CD workflows (preferably Azure DevOps).

Good to Have

  • Experience with Azure Bot Framework, M365 Copilot, and APIM.
  • Exposure to code assistants such as GitHub Copilot, Cursor, Claude Code.
  • Knowledge of Boto3 SDK (AWS Python) and TypeScript for IaC.

Preferred Background

  • Strong background in cloud infrastructure engineering and machine learning operations.
  • Proven ability to lead cross-functional teams and implement AI governance at scale.
  • Excellent problem-solving, communication, and documentation skills.

