Machine Learning Engineer

2 weeks ago


Greater Bengaluru Area, India Mulya Technologies Full time ₹ 8,00,000 - ₹ 24,00,000 per year

TITLE: Machine Learning Engineer - Multimodal AI & Inference

LOCATION: GREATER BENGALURU AREA

Company Description

We are looking for exceptional talent and leadership to join a fast-growing startup in Scalable Intelligence, the world's first company developing Agentic Silicon to power the future of AI.

Founded in 2023, we have deep customer engagements across America, Europe, and Asia, and have demonstrated functional prototypes that prove our concept and vision.

Job Description

Overview:

You will design, optimize, and deploy large multimodal models (language, vision, audio, video) to run efficiently on a compact, high-performance AI appliance capable of supporting 100B+ parameter models at real-time speeds. Your mission is to deliver state-of-the-art multimodal inference locally through advanced model optimization, quantization, and system-level integration.

Key Responsibilities:

1. Model Integration & Porting

  • Optimize large-scale foundation models (e.g., Llama, gpt-oss, Whisper, HiDream, Qwen, Wan, etc.) for on-device inference.
  • Adapt pre-trained models for multimodal tasks (text, image, audio, video, or cross-modal reasoning).
  • Ensure seamless interoperability between modalities — e.g., enabling the system to "see, hear, and talk" naturally.

2. Model Optimization for Edge Hardware

  • Quantize and compress large models (4-bit or mixed precision) while maintaining high accuracy and low latency.
  • Implement and benchmark inference runtimes using frameworks like Ollama, vLLM, ONNX, etc.
  • Collaborate with hardware engineers to co-design model architectures optimized for the appliance's compute fabric.

3. Inference Pipeline Development

  • Build and maintain scalable, high-throughput inference pipelines capable of handling concurrent multimodal requests (text, audio, image, video).
  • Implement token streaming, caching, and scheduling strategies for real-time responses.
  • Develop APIs for low-latency local inference accessible via a web interface.

4. Evaluation & Benchmarking

  • Profile and benchmark performance (throughput, latency, energy efficiency) of deployed models.
  • Run regression tests to validate numerical accuracy after quantization or pruning.
  • Define KPIs for multimodal model performance under real-world usage.

5. Research & Prototyping

  • Investigate emerging multimodal architectures and lightweight model variants for local deployment.
  • Prototype hybrid models that combine LLMs, diffusion models, and ASR/TTS pipelines for advanced multimodal applications.
  • Stay current on state-of-the-art inference frameworks, compression techniques, and multimodal learning trends.

Required Qualifications:

  • Strong background in deep learning and model deployment, with hands-on experience in PyTorch and/or TensorFlow.
  • Expertise in model optimization — quantization, pruning, distillation, or mixed-precision inference.
  • Practical knowledge of inference engines (vLLM, ONNX Runtime, or similar).
  • Experience deploying large models locally or on edge devices under tight memory and compute constraints.
  • Familiarity with multimodal model architectures — e.g., CLIP, Flamingo, LLaVA, or AudioGPT-style systems.
  • Strong software engineering skills (Python, C++, CUDA) and experience integrating models into production systems.
  • Understanding of GPU/accelerator utilization, memory bandwidth optimization, and distributed inference.

Preferred Qualifications:

  • Experience with model-parallel or tensor-parallel inference at scale.
  • Contributions to open-source inference frameworks or model serving systems.
  • Familiarity with hardware-aware training or co-optimization of neural networks and hardware.
  • Background in speech, vision, or multimodal ML research.
  • Track record of deploying models that run entirely offline or on embedded/edge systems.

Contact

Sumit S. B

"Mining the Knowledge Community"

Practice Head (Talent Acquisition, Semiconductors Domain)


