Senior Deep Learning Engineer

4 days ago


Bangalore District, India Nanonets Full time

Join Nanonets to push the boundaries of what's possible with deep learning. We're not just implementing models – we're setting new benchmarks in document AI, with our open-source models achieving nearly 1 million downloads on Hugging Face and recognition from global AI leaders. Backed by $40M+ in total funding including our recent $29M Series B from Accel, alongside Elevation Capital and Y Combinator, we're scaling our deep learning capabilities to serve enterprise clients including Toyota, Boston Scientific, and Bill.com. You'll work on genuinely challenging problems at the intersection of computer vision, NLP, and generative AI.

What You'll Build

Core Technical Challenges:
  • Train & Fine-tune SOTA Architectures: Adapt and optimize transformer-based models, vision-language models, and custom architectures for document understanding at scale
  • Production ML Infrastructure: Design high-performance serving systems handling millions of requests daily using frameworks like TorchServe, Triton Inference Server, and vLLM
  • Agentic AI Systems: Build reasoning-capable OCR that goes beyond extraction – models that understand context, chain operations, and provide confidence-grounded outputs
  • Optimization at Scale: Implement quantization, distillation, and hardware acceleration techniques to achieve fast inference while maintaining accuracy
  • Multi-modal Innovation: Tackle alignment challenges between vision and language models, reduce hallucinations, and improve cross-modal understanding using techniques like RLHF and PEFT

Engineering Responsibilities:
  • Design distributed training pipelines for models with billions of parameters using PyTorch FSDP/DeepSpeed
  • Build comprehensive evaluation frameworks benchmarking against GPT-4V, Claude, and specialized document AI models
  • Implement A/B testing infrastructure for gradual model rollouts in production
  • Create reproducible training pipelines with experiment tracking
  • Optimize inference costs through dynamic batching, model pruning, and selective computation

We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity.

Technical Requirements

Must-Have:
  • 4+ years of hands-on deep learning experience with production deployments
  • Strong PyTorch expertise – ability to implement custom architectures, loss functions, and training loops from scratch
  • Experience with distributed training and large-scale model optimization
  • Proven track record of taking models from research to production
  • Solid understanding of transformer architectures, attention mechanisms, and modern training techniques
  • B.E./B.Tech from top-tier engineering colleges

Highly Valued:
  • Experience with model serving frameworks (TorchServe, Triton, Ray Serve, vLLM)
  • Knowledge of efficient inference techniques (ONNX, TensorRT, quantization)
  • Contributions to open-source ML projects
  • Experience with vision-language models and document understanding
  • Familiarity with LLM fine-tuning techniques (LoRA, QLoRA, PEFT)

Why This Role is Exceptional
  • Proven Impact: Our models are approaching 1 million downloads – your work will have global reach
  • Real Scale: Your models will process millions of documents daily for Fortune 500 companies
  • Well-Funded Innovation: $40M+ in funding means significant GPU resources and freedom to experiment
  • Open Source Leadership: Publish your work and contribute to models already trusted by nearly a million developers
  • Research-Driven Culture: Regular paper reading sessions and collaboration with the research community
  • Rapid Growth: Strong financial backing and Series B momentum mean ambitious projects and fast career progression

Our Recent Achievements
  • Nanonets-OCR model: ~1 million downloads on Hugging Face – one of the most adopted document AI models globally
  • Launched industry-first Automation Benchmark defining new standards for AI reliability
  • Published research recognized by leading AI researchers
  • Built agentic OCR systems that reason and adapt, not just extract


  • Deep Learning Engineer

    10 hours ago


    Bangalore, India AIMonk Labs Private Ltd Full time

    Job description A job where you increase the depth of your expertise in computer vision. A job where you learn and implement the SOTA papers. A job where you write vectorized code that runs in seconds, not in minutes. A job where models learn to see and understand the world around them. A job where models run real-time because you optimize every byte. A job...


  • Bangalore, India Nanonets Full time

    Join Nanonets to push the boundaries of what's possible with deep learning. We're not just implementing models – we're setting new benchmarks in document AI, with our open-source models achieving nearly 1 million downloads on Hugging Face and recognition from global AI leaders. Backed by $40M+ in total funding including our recent $29M Series B from Accel, alongside...


  • Bangalore, India Nanonets Full time

    Join Nanonets to push the boundaries of what's possible with deep learning. We're not just implementing models – we're setting new benchmarks in document AI, with our open-source models achieving nearly 1 million downloads on Hugging Face and recognition from global AI leaders. Backed by $40M+ in total funding including our recent $29M Series B from Accel,...


  • Bengaluru District, Karnataka, India Grafton Biosciences Inc Full time

    **Company Overview** Grafton Biosciences is a US-based biotech startup (with offices in Bengaluru) focused on solving cancer and cardiovascular disease through groundbreaking innovations in early detection, diagnostics, and therapeutics. We combine cutting-edge molecular and synthetic biology, machine learning, device engineering, and manufacturing to...


  • Bangalore Division, India Nanonets Full time

    Join Nanonets to push the boundaries of what's possible with deep learning. We're not just implementing models – we're setting new benchmarks in document AI, with our open-source models achieving nearly 1 million downloads on Hugging Face and recognition from global AI leaders. Backed by $40M+ in total funding including our recent $29M Series B from Accel, alongside...


  • Bangalore, India Deep Armor Full time

    Years of Experience: 6-10 years About the Role We’re looking for a Senior Security Engineer to lead and support product security efforts for cloud-hosted web applications. You will be responsible for deep-tech product security design reviews, code reviews, threat modeling, and other technical activities in software security development life cycle. Key...


  • Bangalore, India Deep Armor Full time

    Years of Experience: 6-10 years About the Role We’re looking for a Senior Security Engineer to lead and support product security efforts for cloud-hosted web applications. You will be responsible for deep-tech product security design reviews, code reviews, threat modeling, and other technical activities in software security development life cycle. Key...


  • Bangalore, India Deep Armor Full time

    Years of Experience: 5-10 years About the Role We're looking for a Senior Security Engineer to lead and support product security efforts for cloud-hosted web applications. You will be responsible for deep-tech product security design reviews, code reviews, threat modeling, and other technical activities in software security development life cycle. Key...


  • Gurgaon District, Haryana, India Agrex Technologies Private Limited Full time

    **Requirements** - Strong understanding of deep learning concepts and experience with popular deep learning frameworks such as PyTorch, TensorFlow, and Keras. - Familiarity with image and video processing libraries such as OpenCV, scikit-image, and scikit-video. - Experience with computer vision tasks such as object detection, segmentation, and tracking. -...


  • Bangalore District, India Diligente Technologies Full time

    Title: Senior Machine Learning Engineer Location: Vaishnavi Signature, Bellandur, Bengaluru (hybrid, 2 days onsite a week) Full time What You Will Achieve and Key Responsibilities Research, Design, Develop and Deploy AI models and systems Lead the research and development of AI models - a varied portfolio ranging from small classifiers to fine-tuning LLMs...