VLM Research Engineer

4 weeks ago


Vapi, Gujarat, India Meril Full time

Job Title: VLM Research Engineer

Location: Vapi, Gujarat

Employment Type: Full-Time

Overview

We are seeking a highly skilled VLM Research Engineer to build multimodal (vision-language-action) models for instruction following, scene grounding, and tool use across platforms. The role involves developing advanced models that bridge perception and language understanding for autonomous systems.

Key Responsibilities

  • Pretrain and fine-tune VLMs, aligning them with robotics data including video, teleoperation, and language.
  • Build perception-to-language grounding for referring expressions, affordances, and task graphs.
  • Develop Toolformer/actuator interfaces to convert language intents into actionable skills and motion plans.
  • Create evaluation pipelines for instruction following, safety filters, and hallucination control.
  • Collaborate with cross-functional teams for integration of models into robotics platforms.

Must-Haves

  • Master's or PhD in a relevant field.
  • 1–2+ years of experience in Computer Vision/Machine Learning.
  • Strong proficiency in PyTorch or JAX; experience with LLMs and VLMs.
  • Familiarity with multimodal datasets, distributed training, and RL/IL.

Nice-to-Haves

  • Experience with world models, diffusion-policy integration, and speech interfaces.
  • Familiarity with sim-to-real transfer in robotics applications.

Success Metrics

  • on language-based tasks.
  • Grounding precision and latency.
  • Sim-to-real performance retention.

Domain Notes

Humanoids:

- Language-guided manipulation and tool use.

AGVs (Autonomous Ground Vehicles):

- Natural language tasking for warehouse operations; semantic maps.

Cars:

- Gesture and sign interpretation; driver interaction.

Drones:

- Natural language mission specification; target search and inspection.

Application Instructions

Interested candidates may apply by sending their resume and cover letter with the subject line: "VLM Research Engineer Application".



