MulticoreWare - GPU Engineer - CUDA/OpenGL

3 weeks ago


Chennai, Tamil Nadu, India MulticoreWare Inc. Full time

Job Summary:

We are seeking an experienced GPU Programming Engineer to join our team. In this role, you will focus on developing, optimizing, and deploying GPU-accelerated solutions for high-performance machine learning workloads. The ideal candidate has strong expertise in GPU programming across one or more platforms (e.g., NVIDIA CUDA, AMD ROCm/HIP, or OpenCL) and is comfortable working at the intersection of parallel computing, performance tuning, and ML system integration.

Key Responsibilities:

- Develop, optimize, and maintain GPU-accelerated components for machine learning pipelines using frameworks such as CUDA, HIP, or OpenCL.

- Analyze and improve GPU kernel performance through profiling, benchmarking, and resource optimization.

- Optimize memory access, compute throughput, and kernel execution to improve overall system performance on the target GPUs.

- Port existing CPU-based implementations to GPU platforms while ensuring correctness and performance scalability.

- Work closely with system architects, software engineers, and domain experts to integrate GPU-accelerated solutions.

Required Qualifications:

- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field.

- 2+ years of hands-on experience in GPU programming using CUDA, HIP, OpenCL, or other GPU compute APIs.

- Strong understanding of GPU architecture, memory hierarchy, and parallel programming models.

- Proficiency in C/C++ and hands-on experience developing on Linux-based systems.

- Familiarity with profiling and tuning tools such as Nsight, rocprof, or Perfetto.

Preferred Qualifications:

- Familiarity with cuDNN, TensorRT, OpenCL, or other GPU computing libraries.

(ref:hirist.tech)

  • Chennai, Tamil Nadu, India MulticoreWare Inc Full time ₹ 15,00,000 - ₹ 20,00,000 per year

    Job Summary: We are seeking an experienced GPU Programming Engineer to join our team. In this role, you will focus on developing, optimizing, and deploying GPU-accelerated solutions for high-performance machine learning workloads. The ideal candidate has strong expertise in GPU programming across one or more platforms (e.g., NVIDIA CUDA, AMD ROCm/HIP, or...


  • Chennai, Tamil Nadu, India Adecco Full time

    GPU programming engineer. Experience: 4+ Years. Location: Chennai, India. Employment Type: Contract role. Job Summary: We are seeking an experienced GPU Programming Engineer to join our team. In this role, you will focus on developing, optimizing, and deploying GPU-accelerated solutions for high-performance machine learning workloads. The ideal candidate has strong...


  • Chennai, Tamil Nadu, India Adecco Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    GPU programming engineer. Experience: 4+ Years. Location: Chennai, India. Employment Type: Contract role. Job Summary: We are seeking an experienced GPU Programming Engineer to join our team. In this role, you will focus on developing, optimizing, and deploying GPU-accelerated solutions for high-performance machine learning workloads. The ideal candidate has strong...


  • Chennai, Tamil Nadu, India KLA Corporation Full time ₹ 10,00,000 - ₹ 25,00,000 per year

    Company Overview: KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents...


  • Chennai, Tamil Nadu, India MulticoreWare Inc. Full time

    Job Description: - Developing a software pipeline for end-to-end ML model inference on a specific hardware accelerator, achieving maximum performance and accuracy. - Implementing cutting-edge deep learning layers for various model categories such as CNN, RNN, LSTM, GANs, etc., using a customized inference pipeline for the NN Processor. - Performance optimization for...


  • Chennai, Tamil Nadu, India Manvian Group Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Job Overview: We are hiring an AI Engineer for an execution-driven role to implement a GPU-powered, real-time CCTV video analytics solution for a high-scale retail business. Key Responsibilities: Integrate AI video analytics with live RTSP CCTV feeds (multi-camera environments). Build real-time detection & tracking for people, objects, and behaviors. Develop visual...


  • Chennai, Tamil Nadu, India MulticoreWare Inc Full time ₹ 5,00,000 - ₹ 15,00,000 per year

    Key Responsibilities: Debugging and Troubleshooting: Investigate and resolve complex software issues within OpenStack environments (particularly those running on Ubuntu), including networking, compute, and storage. Diagnose and troubleshoot problems related to Kubernetes container orchestration, including pod failures, service outages, and networking...


  • Chennai, Tamil Nadu, India KLA Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Job Description / Preferred Qualifications. Your Day-to-day Roles: Expose limitations in existing solutions, based on clusters of CPUs & GPUs, to deploy AI-based solutions on on-prem & cloud infrastructures at scale. Develop distributed frameworks and system-level solutions that enable scaling out image processing & AI loads from single GPU to multi-node clusters...