GPU Optimization Engineer
23 hours ago
Role
We're hiring a GPU Optimization Engineer who understands GPUs at a deep, architectural level: someone who knows exactly how to squeeze every last millisecond out of a model, which GPU constraints matter, and how to restructure models for real-world inference performance. You'll work across CUDA kernels, model graph optimizations, hardware-specific tuning, and porting models across GPU architectures. Your work directly impacts the latency, throughput, and reliability of smallest's real-time speech models.
What You'll Do
- Optimize model architectures (ASR, TTS, SLMs) for maximum performance on specific GPU hardware
- Profile models end-to-end to identify GPU bottlenecks — memory bandwidth, kernel launch overhead, fusion opportunities, quantization constraints
- Design and implement custom kernels (CUDA/Triton/Tinygrad) for performance-critical model sections
- Perform operator fusion, graph optimization, and kernel-level scheduling improvements
- Tune models to fit GPU memory limits while maintaining quality
- Benchmark and calibrate inference across NVIDIA, AMD, and potentially emerging accelerators
- Port models across GPU chipsets (NVIDIA → AMD / edge GPUs / new compute backends)
- Work with TensorRT, ONNX Runtime, and custom runtimes for deployment
- Partner with the research and infra teams to ensure the entire stack is optimized for real-time workloads
Requirements
- Strong understanding of GPU architecture: SMs, warps, memory hierarchy, occupancy tuning
- Hands-on experience with CUDA, kernel writing, and kernel-level debugging
- Experience with kernel fusion and model graph optimizations
- Familiarity with TensorRT, ONNX, Triton, tinygrad, or similar inference engines
- Strong proficiency in PyTorch and Python
- Deep understanding of model architectures (transformers, convs, RNNs, attention, diffusion blocks)
- Experience profiling GPU workloads using Nsight, nvprof, or similar tools
- Strong problem-solving abilities with a performance-first mindset
Great to Have
- Experience with quantization (INT8, FP8, hybrid formats)
- Experience with audio/speech models (ASR, TTS, SSL, vocoders)
- Contributions to open-source GPU stacks or inference runtimes
- Published work related to systems-level model optimization
Who Will Succeed in This Role
Someone who:
- thinks in kernels, not just layers
- knows which optimizations are theoretical vs practically impactful
- understands GPU boundaries (memory, bandwidth, latency) and how to work around them
- is excited by the challenge of ultra-low latency and large-scale real-time inference
- loves debugging at the CUDA + model level
-
GPU Modeling Engineer
6 days ago
Bengaluru, Karnataka, India NICHESPACE IT SOLUTIONS PVT LTD Full time
Role: GPU Modeling Engineer
Experience:
1. Lead Role: 15+ years
2. Team Members: 3 to 10+ years (15 members)
Location: Bengaluru (On-site)
Key Responsibilities:
- Design and develop GPU models
- Collaborate with hardware and software teams
- Optimize GPU performance
- Debug GPU-related issues
- Implement and validate GPU features in cross-functional teams
Required...
-
Sr. GPU Engineer
1 week ago
Bengaluru, Karnataka, India Norwin Technologies Full time
Dear Candidate,
We are looking for immediate joiners with GPU experience. Interested candidates can share their profile on
Location: Bangalore
Mode: Work from office
Shifts: US shift
Below is the JD for this role.
Job Description: We are looking for an experienced AI/ML & DevOps Engineer to design, develop, and scale AI-driven applications using...
-
GPU STA Engineer
22 hours ago
Bengaluru, Karnataka, India Qualcomm Full time
Company: Qualcomm India Private Limited
Job Area: Engineering Group, Engineering Group > Hardware Engineering
General Summary: Qualcomm GPU team is actively seeking candidates for several physical design engineering positions. Graphics HW team in Bangalore is part of a worldwide team responsible for developing and delivering GPU solutions which are setting the...
-
GPU STA Engineer
4 days ago
Bengaluru, Karnataka, India Qualcomm Full time
Company: Qualcomm India Private Limited
Job Area: Engineering Group, Engineering Group > Hardware Engineering
General Summary: Qualcomm GPU team is actively seeking candidates for several physical design engineering positions. Graphics HW team in Bangalore is part of a worldwide team responsible for developing and delivering GPU solutions which are setting the...
-
GPU + Kubernetes Expert
1 week ago
Bengaluru, Karnataka, India Norwin Technologies Full time
Title: GPU + Kubernetes Expert
Location: Bangalore
Experience: 8+ Years
Job Description: We're looking for an experienced Infrastructure Engineer with a strong background in Kubernetes (K8s), GPU-based workloads, and scaling large distributed systems. We need builders, not just maintainers. Ideal candidates will have hands-on experience developing and...
-
Senior GPU Compiler Engineer
3 days ago
Bengaluru, Karnataka, India Best NanoTech Full time
About the Company - Undisputed leader in AI computing
Our client is the world's leading pioneer in accelerated computing. Originally known for inventing the GPU and revolutionizing gaming, they are now the primary force powering the AI era, providing the infrastructure for everything from self-driving cars to ChatGPT. You will be joining a trillion-dollar...
-
GPU + Kubernetes expert
1 week ago
Bengaluru, Karnataka, India Norwin Technologies Full time
Dear Candidate,
We're looking for an experienced Infrastructure Engineer with a strong background in Kubernetes (K8s), GPU-based workloads, and scaling large distributed systems. We need builders, not just maintainers. Ideal candidates will have hands-on experience developing and stress-testing infrastructure at scale, not just reporting bottlenecks, but solving...
-
Adreno GPU AI Compiler Perf specialist
5 days ago
Bengaluru, Karnataka, India Qualcomm Full time
Company: Qualcomm India Private Limited
Job Area: Engineering Group, Engineering Group > Systems Engineering
General Summary: Qualcomm's Adreno GPU is the industry-leading mobile graphics solution in today's Android smartphone market and is rapidly expanding into new domains, including the Snapdragon Elite Windows on Arm platform. The Adreno GPU compiler supports...
-
GPU Software Development Engineer
1 week ago
Bengaluru, Karnataka, India Intel Corporation Full time
Job Details:
Job Description: Develops and/or validates software that enables Intel GPUs. Scope can span the entire stack, from firmware and device drivers through APIs and the application layer, and may also include the tools, infrastructure, and technologies necessary to develop, profile, optimize, and productize Intel GPUs or graphics/GPGPU software...
-
GPU Design Verification Engineer, Silicon
1 week ago
Bengaluru, Karnataka, India Google Full time
Minimum qualifications:
- Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science, a related field, or equivalent practical experience.
- 4 years of experience with standard GPU workloads like Manhattan/3DMark.
- Experience with GPU architecture and AMBA bus protocols like AHB/AXI/ACE.
Preferred qualifications:
- Master's degree or PhD in...