[Immediate Start] AI Lead

2 days ago


Chandigarh, India PaladinAi Full time

Job Description

AIOps Lead
Location: Chandigarh (On-site)
Experience: 3 to 5 years (AI/ML + DevOps + Observability)
Employment Type: Full-time

About the Role

We are looking for a next-generation AIOps Engineer to design and operate AI-driven, self-healing, and intelligent infrastructure systems. In this role, you'll fuse MLOps, DevOps, and agentic AI systems, leveraging technologies like Ray, vLLM, SGLang, and PyTorch Lightning to build predictive, autonomous, and scalable operational pipelines. You will develop intelligent observability systems capable of detecting, diagnosing, and resolving issues in real time, powered by distributed AI and LLM-based automation.

Key Responsibilities

• Design, implement, and scale AIOps pipelines that collect, analyze, and act on telemetry data across infrastructure and applications.
• Build and deploy distributed ML/LLM workflows using Ray, PyTorch Lightning, vLLM, or SGLang for anomaly detection, event correlation, and predictive maintenance.
• Orchestrate LLM-based operations agents using LangChain, LangGraph, or SGLang to power AI-assisted diagnostics and root-cause analysis.
• Implement intelligent observability layers over systems like Prometheus, Grafana, ELK, OpenTelemetry, or Datadog to enable AI-driven insights and alerting.
• Develop self-healing systems that use AI and automation frameworks to auto-remediate incidents.
• Optimize inference serving and distributed compute with vLLM, Ray Serve, and Triton Inference Server for ultra-fast response times.
• Build real-time data ingestion pipelines using Kafka, Spark, or Flink for operational and telemetry data.
• Collaborate with SRE, MLOps, and AI engineering teams to create autonomous, adaptive infrastructure systems.
• Integrate CI/CD pipelines for AI workflows using MLflow, Kubeflow, or Airflow, with model monitoring and drift detection.
• Evaluate and integrate AIOps platforms (Moogsoft, BigPanda, Datadog AIOps, Dynatrace, etc.) and agentic frameworks for proactive automation.

Required Skills & Qualifications

• Bachelor's or Master's in Computer Science, Engineering, or a related field.
• 4+ years of experience in DevOps, SRE, or AI infrastructure engineering.
• Strong programming experience in Python (preferred), Go, or Bash scripting.
• Deep understanding of cloud platforms (AWS, GCP, Azure) and Kubernetes/Docker orchestration.
• Expertise in infrastructure as code (Terraform, Helm, Pulumi).
• Experience with distributed compute frameworks: Ray, PyTorch Lightning, vLLM, SGLang.
• Proficiency with observability and monitoring stacks (Prometheus, Grafana, ELK, OpenTelemetry, Splunk).
• Familiarity with MLOps and LLMOps tools (MLflow, Kubeflow, Airflow, ArgoCD).
• Experience with event-driven systems and message queues (Kafka, RabbitMQ, AWS SQS).
• Understanding of AI-powered automation, root cause analysis, and predictive operational analytics.

Preferred / Nice-to-Have

• Hands-on experience with vLLM for optimized LLM inference and observability agents.
• Experience deploying and optimizing Ray Serve, vLLM, or Triton in production.
• Exposure to SGLang for LLM-based orchestration, workflow automation, and diagnostics reasoning.
• Familiarity with vector databases (Milvus, Weaviate, Pinecone) and RAG-based observability.
• Experience with agentic AIOps frameworks and LLM-driven operational reasoning (LangGraph, AutoGen, CrewAI).
• Understanding of AI observability, drift detection, cost-aware scaling, and fault-tolerant AI systems.
• Contributions to open-source AIOps, observability, or distributed AI infrastructure projects.

What We Offer

• Opportunity to build the foundation for autonomous, intelligent operations.
• Hands-on exposure to SGLang, vLLM, Ray, PyTorch Lightning, and LangGraph ecosystems.
• Collaborative, cross-functional environment spanning AI, cloud, and systems engineering.
• Competitive compensation, flexible work setup, and professional development opportunities.



  • Chandigarh, India Labellerr AI Full time

    Company Description: Labellerr AI is a cutting-edge SaaS platform dedicated to accelerating AI development for organizations by offering AI-powered data labeling and automation services. Our unique Smart Feedback Loop technology empowers machine learning teams to train computer vision models efficiently. We optimize unstructured data for seamless model...


  • India Data-Hat AI Full time

    Department: AI Strategy & Implementation Reports To: Chief AI Officer / Lead AI Solutions Architect Travel: International travel required Position Summary: We are seeking an AI Solutions Architect with over 20 years of experience to serve as a strategic and technical leader in the design, implementation, and evolution of enterprise-grade AI systems. This...


  • Chandigarh, India Weekday AI Full time

    This role is for one of Weekday's clients. Salary range: Rs 2100000 - Rs 7500000 (i.e., INR 21-75 LPA) Min Experience: 3 years Location: Chandigarh, India Job Type: Full-time We are seeking an innovative and strategic Product Manager to drive the vision, strategy, and execution of next-generation AI-driven FinTech products. This role sits at the intersection...


  • Gurugram, Gurugram, India Sirius AI Full time

    Job Description Role Overview We are seeking a dynamic and visionary Associate Director to lead solutioning and innovation initiatives within our AI Innovations Lab. This role involves designing, delivering, and scaling AI/ML solutions for clients in the financial services ecosystem. The ideal candidate brings a mix of hands-on technical expertise, strategic...


  • Noida, India HCLTech Full time

    Job Description Job Title: Senior Technical Lead - Generative AI Location: Noida, Hyderabad, Chennai, Bangalore, Pune Job Type: Full-Time Experience: 7-13 years (with at least 1 year in Generative AI) Key Responsibilities: - Lead the design, development, and deployment of Generative AI models and solutions on cloud platforms such as Microsoft Azure, AWS, or...


  • India Mindfire Solutions Full time

    About the Job As a Lead AI/ML Engineer, you spearhead the design, development, and implementation of advanced AI and machine learning models. Your role involves guiding a team of engineers and ensuring the successful deployment of projects that leverage AI/ML technologies to solve complex problems. You collaborate closely with stakeholders to understand business...

  • AI Developer

    2 weeks ago


    India Omnibound AI Full time

    Please Note: ➡️ Only candidates who are immediate joiners (0–7 days) will be considered ➡️ This also includes those who are already serving a notice period, as long as they are available to join within 10-15 days. ➡️ If you do not have hands-on experience with building AI software, please do not apply About us: Omnibound...

  • AI Developer

    2 weeks ago


    India Omnibound AI Full time

    Please Note: ➡️ Only candidates who are immediate joiners (0–7 days) will be considered ➡️ This also includes those who are already serving a notice period, as long as they are available to join within 10-15 days. ➡️ If you do not have hands-on experience with building AI software, please do not apply About us: Omnibound is building the...



  • India Mercor Full time

    Job Description Company Introduction Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. Role Overview - Position: Data Science Professional Freelance, Remote - Commitment: 1525...