ML GPU Kernel Development Engineer

1 day ago


Hyderabad, India Advanced Micro Devices, Inc Full time

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE:

We are seeking a talented Machine Learning Kernel Developer to design, develop, and optimize low-level machine learning kernels for AMD GPUs using the ROCm software stack. In this role, you will work on high-impact projects to accelerate AI frameworks and libraries, with a focus on emerging technologies like Large Language Models (LLMs) and other generative AI workloads.

THE PERSON:

The ideal candidate will have hands-on experience with GPU programming (ROCm or CUDA) and a passion for pushing the boundaries of AI performance.

KEY RESPONSIBILITIES:

  • Design and implement highly optimized ML kernels (e.g., matrix operations, attention mechanisms) for AMD GPUs using ROCm.
  • Profile, debug, and tune kernel performance to maximize hardware utilization for AI workloads.
  • Collaborate with ML researchers and framework developers to integrate kernels into AI frameworks (e.g., PyTorch, TensorFlow) and inference engines (e.g., vLLM, SGLang).
  • Contribute to the ROCm software stack by identifying and resolving bottlenecks in libraries like MIOpen, BLAS, or Composable Kernel.
  • Stay updated on the latest AI/ML trends (LLMs, quantization, distributed inference) and apply them to kernel development.
  • Document and communicate technical designs, benchmarks, and best practices.
  • Troubleshoot and resolve issues related to GPU compatibility, performance, and scalability.

REQUIRED EXPERIENCE:

  • 2+ years of experience in GPU kernel development for machine learning (ROCm or CUDA).
  • Proficiency in C/C++ and Python, with experience in performance-critical programming.
  • Strong understanding of ML frameworks (PyTorch, TensorFlow) and GPU-accelerated libraries.
  • Basic knowledge of modern AI technologies (LLMs, transformers, inference optimization).
  • Familiarity with parallel computing, memory optimization, and hardware architectures.
  • Problem-solving skills and ability to work in a fast-paced environment.

PREFERRED EXPERIENCE:

  • Direct experience with AMD ROCm development (HIP, MIOpen, Composable Kernel).
  • Knowledge of LLM-specific optimizations (e.g., FlashAttention, PagedAttention in vLLM).
  • Experience with distributed training/inference or model compression techniques.
  • Contributions to open-source ML projects or GPU compute libraries.

ACADEMIC CREDENTIALS:

  • Bachelor's/Master's in Computer Science, Electrical Engineering, or related field.

Benefits offered are described: AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.



  • Hyderabad, Telangana, India AMD Full time ₹ 12,00,000 - ₹ 24,00,000 per year

    Overview: WHAT YOU DO AT AMD CHANGES EVERYTHING. At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to...


  • Hyderabad, Telangana, India Advanced Micro Devices, Inc Full time ₹ 10,00,000 - ₹ 20,00,000 per year

    WHAT YOU DO AT AMD CHANGES EVERYTHING At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create...


  • Hyderabad, India Intel Full time

    Job Description Are you interested in computer graphics and the opportunity to work with the Linux software engineering team on Intel's leading-edge Graphics/Compute products? Come join us. WHO WE ARE : The GPU and System Software Engineering organization is responsible for developing Linux drivers and technology for Intel's Graphics/Compute products, for...


  • Hyderabad, India Stealth Mode Startup - AI Product Based Company Full time

    Job Summary: We are seeking experienced developers for our kernel development team, focused on building and optimizing AI/ML operators using our specialised Instruction Set Architecture (ISA). In this role, you will be responsible for the design, development, and performance tuning of core kernel components that directly influence the efficiency and...


  • Hyderabad, India LeadSoc Technologies Pvt Ltd Full time

    Greetings from LeadSoc Technologies, Hyderabad. Position: Machine Learning C++ Engineer. ML Engineer (C/C++ and Python programming profiles). Deep knowledge of C/C++ and Python programming. Experience: 2-4 years. Notice period: immediate joiner. · Experience with Linux commands is a must · Experience with scripting languages like bash/PowerShell · Understanding of various...


  • Hyderabad, Telangana, India IIT Hyderabad Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Kernel Design & Development: Design and development of core kernel modules, optimized for both performance and energy efficiency under AI/ML workloads. Design and development of advanced performance profiling/optimization and debugging tools to ensure low latency. Analyse kernel performance, identify bottlenecks, and implement optimizations at the software and...


  • Hyderabad, India Mobius by Gaian Full time

    Description: About the Role: We are seeking an experienced DevOps Engineer to join our infrastructure team, with a strong focus on managing and optimizing GPU-based compute environments for machine learning and deep learning workloads. In this role, you will be responsible for the end-to-end infrastructure lifecycle, from provisioning with Terraform/Ansible to...

  • Open source AI/ML

    4 days ago


    Hyderabad, Telangana, India Source-Right Full time ₹ 15,00,000 - ₹ 20,00,000 per year

    Position: Open source AI/ML (SI35FT RM 3718). Experience (must have): Strong C++ and Python programming skills. Performance analysis skills for both CPU and GPU. Good knowledge of AI/ML frameworks and architecture. Basic GPU kernel programming knowledge. Experience with software engineering methodologies such as Agile, Scrum, Kanban. Experience in all the phases of...

  • Kernel Developer

    3 weeks ago


    Hyderabad, India 5G-AI Full time

    Job Summary: We are seeking experienced developers for our kernel development team, focused on building and optimizing AI/ML operators using our specialised Instruction Set Architecture (ISA). In this role, you will be responsible for the design, development, and performance tuning of core kernel components that directly influence the efficiency and...


  • Hyderabad, Telangana, India Mobius by Gaian Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Description: About the Role: We are seeking an experienced DevOps Engineer to join our infrastructure team, with a strong focus on managing and optimizing GPU-based compute environments for machine learning and deep learning workloads. In this role, you will be responsible for the end-to-end infrastructure lifecycle, from provisioning with...