Machine Learning Engineer
2 weeks ago
TITLE: Machine Learning Engineer - Multimodal AI & Inference
LOCATION: GREATER BENGALURU AREA
Company Description
We are looking for exceptional talent and leadership to join a fast-growing startup in scalable intelligence, the world's first company developing Agentic Silicon to power the future of AI.
Founded in 2023, we have deep customer engagements across America, Europe, and Asia, and have demonstrated functional prototypes that prove our concept and vision.
Job Description
Overview:
You will design, optimize, and deploy large multimodal models (language, vision, audio, video) to run efficiently on a compact, high-performance AI appliance capable of supporting 100B+ parameter models at real-time speeds. Your mission is to deliver state-of-the-art multimodal inference locally through advanced model optimization, quantization, and system-level integration.
Key Responsibilities:
1. Model Integration & Porting
- Optimize large-scale foundation models (e.g., Llama, gpt-oss, Whisper, HiDream, Qwen, Wan) for on-device inference.
- Adapt pre-trained models for multimodal tasks (text, image, audio, video, or cross-modal reasoning).
- Ensure seamless interoperability between modalities — e.g., enabling the system to "see, hear, and talk" naturally.
2. Model Optimization for Edge Hardware
- Quantize and compress large models (4-bit or mixed precision) while maintaining high accuracy and low latency.
- Implement and benchmark inference runtimes using frameworks such as Ollama, vLLM, and ONNX.
- Collaborate with hardware engineers to co-design model architectures optimized for the appliance's compute fabric.
3. Inference Pipeline Development
- Build and maintain scalable, high-throughput inference pipelines capable of handling concurrent multimodal requests (text, audio, image, video).
- Implement token streaming, caching, and scheduling strategies for real-time responses.
- Develop APIs for low-latency local inference accessible via a web interface.
4. Evaluation & Benchmarking
- Profile and benchmark performance (throughput, latency, energy efficiency) of deployed models.
- Run regression tests to validate numerical accuracy after quantization or pruning.
- Define KPIs for multimodal model performance under real-world usage.
5. Research & Prototyping
- Investigate emerging multimodal architectures and lightweight model variants for local deployment.
- Prototype hybrid models that combine LLMs, diffusion models, and ASR/TTS pipelines for advanced multimodal applications.
- Stay current on state-of-the-art inference frameworks, compression techniques, and multimodal learning trends.
Required Qualifications:
- Strong background in deep learning and model deployment, with hands-on experience in PyTorch and/or TensorFlow.
- Expertise in model optimization — quantization, pruning, distillation, or mixed-precision inference.
- Practical knowledge of inference engines (vLLM, ONNX Runtime, or similar).
- Experience deploying large models locally or on edge devices with limited memory and compute.
- Familiarity with multimodal model architectures — e.g., CLIP, Flamingo, LLaVA, or AudioGPT-style systems.
- Strong software engineering skills (Python, C++, CUDA) and experience integrating models into production systems.
- Understanding of GPU/accelerator utilization, memory bandwidth optimization, and distributed inference.
Preferred Qualifications:
- Experience with model-parallel or tensor-parallel inference at scale.
- Contributions to open-source inference frameworks or model serving systems.
- Familiarity with hardware-aware training or co-optimization of neural networks and hardware.
- Background in speech, vision, or multimodal ML research.
- Track record of deploying models that run entirely offline or on embedded/edge systems.
Contact
Sumit S. B
"Mining the Knowledge Community"
Practice Head (Talent Acquisition, Semiconductors Domain)
-
Machine Learning Engineer
1 week ago
Greater Kolkata Area, India Emperen Technologies Full time ₹ 8,00,000 - ₹ 25,00,000 per year. Job Title: Machine Learning Engineer. Job Type: Full-Time. Experience Level: Mid to Senior [5+ Years]. Department: Data Science / AI Engineering. Job Summary: We are seeking a highly skilled and mathematically grounded Machine Learning Engineer to join our AI team. The ideal candidate will have 5+ years of ML experience with a deep understanding of machine learning...
-
Machine Learning Engineer
2 weeks ago
Greater Kolkata Area, India Cyanous Software Private Limited Full time ₹ 12,00,000 - ₹ 36,00,000 per year. Job Description: We are seeking a highly skilled Machine Learning Engineer with expertise in building and deploying end-to-end ML solutions. The ideal candidate will have strong experience in model development, deployment, and monitoring in cloud environments (preferably Azure). You will be responsible for the full ML lifecycle, ensuring robust, scalable, and...
-
Machine Learning Engineer
6 days ago
Greater Bengaluru Area, India Valiance Solutions Full time ₹ 12,00,000 - ₹ 36,00,000 per year. About the Role: We are seeking an experienced MLOps Engineer to lead the deployment, scaling, and performance optimization of open-source Generative AI models on cloud infrastructure. You'll work at the intersection of machine learning, DevOps, and cloud engineering to help productize and operationalize large-scale LLM and diffusion models. Key...
-
Senior Machine Learning Compiler Engineer
6 days ago
Greater Bengaluru Area, India Mulya Technologies Full time ₹ 1,20,000 - ₹ 12,00,000 per year. TITLE: MACHINE LEARNING COMPILER ENGINEER (Principal / Senior Staff / Staff Machine Learning Compiler Engineer). LOCATION: GREATER BENGALURU AREA. COMPANY DESCRIPTION: We are looking for exceptional talent and leadership to join a fast-growing startup in scalable intelligence, the world's first company developing Agentic Silicon to power the future of AI. Founded...
-
Full-Stack Machine Learning Engineer
1 week ago
Greater Delhi Area, India Miran World Full time. Company: Miran World. Location: New Delhi (Hybrid). Experience: 2 to 4 Years. Job Type: Full-Time. Job Description: Miran World is actively seeking a highly skilled and passionate full-stack Machine Learning Engineer with 2 to 5 years of experience to join our dynamic team. The ideal candidate will have a combination of expertise in applied machine learning,...
-
Full-Stack Machine Learning Engineer
2 weeks ago
Greater Delhi Area, India Miran World Full time. Company: Miran World. Location: New Delhi (Hybrid). Experience: 2 to 4 Years. Job Type: Full-Time. Job Description: Miran World is actively seeking a highly skilled and passionate full-stack Machine Learning Engineer with 2 to 5 years of experience to join our dynamic team. The ideal candidate will have a combination of expertise in applied machine learning,...
-
Senior Machine Learning Engineer
3 weeks ago
Industrial Area, India GEDU Services Full time. Position Overview: We are looking for a highly skilled and experienced Senior AI/ML Engineer to join our team. The ideal candidate will have a strong background in artificial intelligence, machine learning, and deep learning, with a proven track record of building and deploying scalable AI models in real-world applications. Key Responsibilities: Design,...
-
Machine Learning Engineer
4 days ago
Bengaluru, Karnataka, India NatWest Group Full time. Machine Learning Engineer, VP. Join us as a Machine Learning Engineer. In this role, you'll be driving and embedding the deployment, automation, maintenance and monitoring of machine learning models and algorithms. Day-to-day, you'll make sure that models and algorithms work effectively in a production environment while promoting data literacy education...
-
AI Engineer – Machine Control
1 week ago
Greater Bengaluru Area, India Tata Electronics Full time ₹ 12,00,000 - ₹ 36,00,000 per year. Role Overview: We are looking for a Machine Control Engineer with strong coding skills and a passion for learning AI, reinforcement learning (RL), and intelligent automation. This role will focus on developing and optimizing control systems for complex industrial machines, integrating AI-driven approaches to improve precision, efficiency, and automation. Key...