Multimodal Model Developer

6 days ago

Vapi, Gujarat, India beBeeResearch Full time ₹ 12,00,000 - ₹ 16,00,000

Research and Development Position

We are seeking a highly skilled Research Engineer to build advanced models that bridge perception and language understanding for autonomous systems across multiple platforms.

Main Responsibilities:

Develop multimodal models using deep learning techniques for instruction following, scene grounding, and tool use.
Pretrain and fine-tune vision-language models (VLMs), aligning them with robotics data, including video, teleoperation, and language inputs.
Build perception-to-language grounding for referring expressions, affordances, and task graphs in robotics applications.
Create evaluation pipelines for instruction following, safety filters, and hallucination control.
Collaborate with cross-functional teams for integration of models into robotics platforms.

Required Skills and Qualifications:

Master's or Ph.D. in Computer Science, Robotics, or a related field.
1-2+ years of experience in Machine Learning and/or Computer Vision.
Strong proficiency in PyTorch or JAX; experience with Large Language Models (LLMs) and VLMs.
Familiarity with multimodal datasets, distributed training, and Reinforcement Learning/Imitation Learning.

Nice-to-Haves:

Experience with world models, diffusion-policy integration, and speech interfaces in robotics.
Familiarity with sim-to-real transfer in robotics applications.
Success on language-based tasks such as natural language processing and machine translation.

Multimodal Modeling Specialist

2 weeks ago

Vapi, Gujarat, India beBeeMachineLearning Full time ₹ 1,50,00,000 - ₹ 2,00,00,000

Job Title: Vision-Led Model Research Engineer Job Description: We are seeking a highly skilled Research Engineer to build multimodal models for instruction following, scene grounding, and tool use across platforms. The role involves developing advanced models that bridge perception and language understanding for autonomous systems. Key...
Researcher - Multimodal Model Development

2 weeks ago

Vapi, Gujarat, India beBeeMultimodal Full time US$ 1,20,000 - US$ 1,50,000

Job Summary: We are seeking a highly skilled researcher to build multimodal models for instruction following, scene grounding, and tool use across platforms. The role involves developing advanced models that bridge perception and language understanding for autonomous systems. About the Role: Develop vision-language models (VLMs) aligning them with robotics data...
Multimodal AI Software Modeler

2 weeks ago

Vapi, Gujarat, India beBeeModeler Full time ₹ 40,00,000 - ₹ 50,00,000

Software Modeler - Multimodal AI Expert. We seek a skilled model developer to create cutting-edge models for task-oriented dialogue systems, vision-language understanding, and multimodal perception. Main Responsibilities: Pretrain and fine-tune visual language models (VLMs) aligning them with robotics data including video, teleoperation, and language. Build...
Multimodal Model Expert

2 weeks ago

Vapi, Gujarat, India beBeeVlm Full time ₹ 1,20,00,000 - ₹ 2,00,00,000

Job Title: VLM Research Engineer Location: Vapi, Gujarat Employment Type: Full-Time Overview We are seeking a highly skilled expert in multimodal (vision-language-action) models to build instruction following, scene grounding, and tool use across platforms. The role involves developing advanced models that bridge perception and language...
Multimodal Model Architect

1 week ago

Vapi, Gujarat, India beBeemultimodel Full time US$ 1,00,000 - US$ 1,50,000

We are seeking an expert in multimodal models to develop advanced systems that bridge perception and language understanding for autonomous systems. The ideal candidate will have a strong background in Computer Vision, Machine Learning, and Large Language Models. Experience with distributed training, Reinforcement Learning, and Imitation Learning is also...
Multimodal Research Specialist

1 week ago

Vapi, Gujarat, India beBeeMultimodal Full time ₹ 1,00,00,000 - ₹ 2,00,00,000

Research Engineer Opportunity: Build cutting-edge multimodal models for instruction following, scene grounding, and tool use across various platforms. Key Responsibilities: Develop advanced VLMs that bridge perception and language understanding for autonomous systems. Pretrain and finetune VLMs to align them with robotics data, including video, teleoperation, and...
VLM Research Engineer

6 days ago

Vapi, Gujarat, India Meril Full time ₹ 15,00,000 - ₹ 25,00,000 per year

Role Overview: The VLM Research Engineer specializes in Vision-Language Models (VLMs), enabling robots and autonomous vehicles to understand natural language instructions, ground perception, and translate commands into actionable tasks. This role is pivotal in advancing multimodal AI systems, bridging vision and language for real-world autonomy. Key...
VLM Research Engineer

2 weeks ago

Vapi, Gujarat, India Meril Full time

Job Title: VLM Research Engineer Location: Vapi, Gujarat Employment Type: Full-Time Overview We are seeking a highly skilled VLM Research Engineer to build multimodal (vision-language-action) models for instruction following, scene grounding, and tool use across platforms. The role involves developing advanced models that bridge perception and...
VLM Research Engineer

3 days ago

Vapi, Gujarat, India Meril Full time

Job Title: VLM Research Engineer Location: Vapi, Gujarat Employment Type: Full-Time Overview We are seeking a highly skilled VLM Research Engineer to build multimodal (vision-language-action) models for instruction following, scene grounding, and tool use across platforms. The role involves developing advanced models that bridge...
VLM Research Engineer

2 weeks ago

Vapi, Gujarat, India Meril Full time

Job Title: VLM Research Engineer Location: Vapi, Gujarat Employment Type: Full-Time Overview We are seeking a highly skilled VLM Research Engineer to build multimodal (vision-language-action) models for instruction following, scene grounding, and tool use across platforms. The role involves developing advanced models that bridge perception and language...
