VLM Research Engineer

1 day ago


Vapi, Gujarat, India Meril Full time ₹ 15,00,000 - ₹ 25,00,000 per year

Role Overview

The VLM Research Engineer specializes in Vision–Language Models (VLMs), enabling robots and autonomous vehicles to understand natural language instructions, ground perception, and translate commands into actionable tasks. This role is pivotal in advancing multimodal AI systems, bridging vision and language for real-world autonomy.

Key Responsibilities

  • Research, design, and implement state-of-the-art Vision–Language Models for robotics and autonomous vehicles.
  • Enable robots/vehicles to understand natural language instructions and ground them in perception data.
  • Collaborate with cross-functional teams (AI, robotics, perception, and software) to integrate VLMs into autonomous systems.
  • Develop algorithms for instruction following, scene understanding, and multimodal reasoning.
  • Evaluate model performance using benchmarks, real-world datasets, and simulation environments.
  • Stay up to date with the latest research in vision-language models, multimodal AI, and robotics.
  • Document research findings, model architectures, and best practices for internal and external dissemination.

Required Qualifications

  • Master's or PhD in Computer Science, AI, Robotics, or a related field, with specialization in deep learning or multimodal AI.
  • 3+ years of experience in research and development of vision-language models or multimodal AI systems.
  • Strong expertise in deep learning frameworks (PyTorch, TensorFlow) and transformer-based architectures.
  • Experience in applying VLMs to robotics, autonomous vehicles, or related real-world systems is preferred.
  • Strong programming skills in Python and C++.
  • Excellent problem-solving, analytical, and communication skills.

Desired Competencies

  • Deep understanding of multimodal learning, vision-language grounding, and instruction following.
  • Ability to innovate and adapt cutting-edge AI research to practical robotics applications.
  • Experience with simulation environments and real-world robotic platforms.
  • Strong collaboration skills to work with cross-functional research and engineering teams.
  • Publication record in relevant conferences/journals is a plus.


  • VLM Research Engineer

    2 weeks ago


    Vapi, Gujarat, India Meril Full time

    Job Title: VLM Research Engineer Location: Vapi, Gujarat Employment Type: Full-Time Overview We are seeking a highly skilled VLM Research Engineer to build multimodal (vision-language-action) models for instruction following, scene grounding, and tool use across platforms. The role involves developing advanced models that bridge perception and...

  • Vapi, Gujarat, India beBeeMultimodal Full time ₹ 1,00,00,000 - ₹ 2,00,00,000

    Research Engineer Opportunity. Build cutting-edge multimodal models for instruction following, scene grounding, and tool use across various platforms. Key Responsibilities: Develop advanced VLMs that bridge perception and language understanding for autonomous systems. Pretrain and finetune VLMs to align them with robotics data, including video, teleoperation, and...


  • Vapi, Gujarat, India beBeeMachineLearning Full time ₹ 10,00,000 - ₹ 18,00,000

    Job Title: Research Interns (MS / PhD) Location: Vapi, Gujarat Employment Type: Internship (Full-Time) Overview Seeking talented professionals to contribute to scoped projects across Vision-Language Models (VLM), Reinforcement Learning and Planning, Perception, SLAM, 3D Vision, and Simulation. A focus on achieving publishable results and productionizable...


  • Vapi, Gujarat, India Meril Full time

    Job Title: Research Interns (MS / PhD) Location: Vapi, Gujarat Employment Type: Internship (Full-Time) Overview We are seeking talented Research Interns (MS/PhD) to contribute to scoped projects across Vision-Language Models (VLM), Reinforcement Learning and Planning, Perception, SLAM, 3D Vision, and Simulation. Interns will focus on achieving publishable...

