HPC / CUDA Software Engineer
1 month ago
Job Description
KLA’s AI Advanced Computing Labs is looking for an extraordinary HPC System R&D Engineer to join its team to develop system-level HPC technologies that would form the foundation of next-generation clusters used in KLA tools. The technologies would be developed and demonstrated on on-prem clusters that serve as testbeds for next-generation KLA tools.
Your Day-to-day Roles
- Expose limitations in existing solutions, based on clusters of CPUs and GPUs, for deploying AI-based solutions at scale on on-prem and cloud infrastructure.
- Develop system-level solutions that enable scaling out image-processing and AI workloads from a single GPU to multi-node clusters with multiple GPUs.
- Install and benchmark pre-release hardware for early-stage evaluation and prototyping, identifying (or developing) relevant workloads.
- Explore modern HPC systems software (such as new distributions of Linux) for adoption into KLA’s tools.
Minimum Qualifications
- Master's / PhD in Computer Science or a related field; bachelor's degree holders with relevant experience and an extraordinary track record will also be considered.
- Deep understanding of operating systems, computer networks, and high-performance applications.
- Good mental model of the architecture of a modern distributed system composed of CPUs, GPUs, and accelerators.
- Experience deploying deep-learning frameworks such as TensorFlow and PyTorch on large-scale on-prem or cloud infrastructure.
- Solid understanding of container infrastructure such as Docker or Singularity, and Kubernetes.
- Strong scripting skills in Bash, Python, or similar.
- Good communication skills.
Things to Make Us Go Wow
- Hands-on experience in architecting, building, and maintaining (against all odds) large-scale distributed HPC clusters.
- Experience with model development on DL frameworks such as TensorFlow and PyTorch.
- Experience with building open-source operating systems and software stacks on pre-release hardware.
- Hands-on involvement with cluster management tools (such as Prometheus and Grafana), scheduling and resource management (such as SLURM, PBS, MPI/OSHMEM), and virtualization technologies (such as KVM, VMware, Nutanix).
- Experience working with developers who use clusters and sysadmins who maintain them.
-
HPC / CUDA Software Engineer
1 month ago
Chennai, India KLA Full time
Job Description: KLA’s AI Advanced Computing Labs is looking for an extraordinary HPC System R&D Engineer to join its team to develop system-level HPC technologies that would form the foundation of next-generation clusters used in KLA tools. The technologies would be developed and demonstrated on on-prem clusters that serve as testbeds for next-generation...
-
Specialist, HPC Software Systems
4 weeks ago
Chennai, India KLA Full time
KLA Overview: KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA...
-
HPC SW Infrastructure Engineer
6 months ago
Chennai, India 3110 K-T India Full time
Description: Architect and Design High-Performance Compute Clusters: Collaborate with cross-functional teams to design, implement, and support HPC clusters. Optimize compute resources for maximum efficiency, considering CPU/GPU architecture, storage scalability, and high-bandwidth interconnects. Project Specifications and Timelines: Understand...
-
Chennai, Tamil Nadu, India KLA Full time
About the Role: We are seeking a highly skilled Senior HPC System Software Engineer to join our team in India. This is an exceptional opportunity to be at the forefront of developing cutting-edge system software that powers AI advancements. Job Description: The ideal candidate will possess strong object-oriented programming skills in Java and/or C++ and...
-
HPC Admin
4 weeks ago
Chennai, India ScaleneWorks Full time
Assist in cloud engineering projects and tasks, contributing to project success. • Collaborate with team members to deploy, maintain, and optimize cloud solutions. • Provide technical support, troubleshoot issues, and document solutions. • Contribute to the creation of technical documentation and knowledge sharing. • Participate in cloud training and...