HPC - Team Lead
2 days ago
Hi,
We have an immediate requirement for an HPC Team Lead position in Hyderabad with our organization, SHI Locuz Enterprise Solutions Pvt Ltd.
PFB JD:
Experience - 6+ years
Work location - Hyderabad
ROLE SUMMARY
The Technology Lead – HPC ensures that critical IT services and high-performance computing (HPC) infrastructure are available, efficient, and secure. The person in this role manages daily operations of mission-critical systems in multiple clients' data centres, working closely with both facilities engineering teams (power, cooling, physical infrastructure) and IT infrastructure/operations teams to support service clients around the clock. This role combines technical leadership, operations oversight, incident/problem management, and strategic planning.
PRIMARY ROLES & RESPONSIBILITIES
- Experience architecting and maintaining HPC/AI systems.
- Linux system administration
- Cluster management
- System and software configuration management
- High speed networking
- Resource managers and schedulers
- High speed parallel storage
- Monitoring and alerting
- Strong understanding of HPC/AI architectures and concepts.
- Experience supporting and managing a group of HPC/AI Clusters.
- Excellent knowledge in prototyping and deploying HPC/AI clusters.
- Extensive experience in troubleshooting Linux OS, filesystems and cluster hardware.
- Good command of Linux scripting tools such as Bash, Perl, and Python.
- Experience implementing, maintaining, and verifying defined security policies.
- Willingness to maintain a flexible work schedule.
- A positive attitude and willingness to help lab users succeed.
- Excellent guidance and teamwork skills.
TECHNICAL SKILLS
- RedHat, Ubuntu, SuSE OS
- Cluster Tools (Bright, xCAT, Warewulf, OpenHPC, ROCKS, etc.)
- InfiniBand
- Lustre, BeeGFS and GPFS architecture and maintenance
- Configuration management software (Ansible, Puppet)
- SLURM/PBS/LSF/Grid Engine schedulers
- Spack package manager
- Experience in AI Servers & Software stack Deployment.
- Experience with container technologies and orchestration tools: Docker, Singularity, Apptainer, Kubernetes.
- Hands-on with AI/ML tools: TensorFlow, PyTorch, Keras, ONNX, JAX.
- Experience in benchmarking and performance optimization of large-scale HPC/AI systems
- Experience with Linux and/or Windows operating systems (OS), including file management, scripting, editing, and security.
- Log consolidation and monitoring (Ganglia, Grafana, etc.)
- Lifecycle and patch management experience.
SOFT SKILLS
- Good logical reasoning and analytical skills
- Good communication skills
OTHER SKILLS
- Collaborative, co-operative, and committed mindset.
- Teamwork
- Excellent analytical and problem-solving skills.
- Ability to work independently and within cross-functional teams.
- Detail-oriented with good documentation practices.
- Excellent interpersonal, communication, customer-interaction, and documentation skills, and sound decision-making ability.
-
HPC Team Lead
6 days ago
Hyderabad, Telangana, India SHI | Locuz - An SHI Company Full time ₹ 15,00,000 - ₹ 25,00,000 per year Hi, We have an immediate requirement for HPC Team Lead position in Hyderabad with our organization SHI Locuz Enterprise Solutions Pvt Ltd. PFB JD: Experience - 6+ years Work location - Hyderabad Should know the below mentioned: HPC Skill Set Cluster Tool Kit: Rocks, xCAT, OpenHPC Scheduler: PBS Pro, SLURM MPI: Intel, OpenMPI PFS: Lustre, GPFS Linux Skill: OS Deep Dive (...
-
Hpc Applications
1 week ago
Madhapur, Hyderabad, Telangana, India Locuz Enterprise Solutions Full time L2 Skill HPC Engineer with Application Expertise Role Overview: Core Responsibilities: - HPC Cluster Support: Manage day-to-day operations of HPC clusters (Slurm, PBS, LSF), monitor jobs and node health, and manage user issues at L2. - Application Support & Optimization: - User & Job Management: Handle user access, and...
-
Hpc Engineer
2 weeks ago
Hyderabad, India Whatjobs IN C2 Full time HPC Engineer (L2) with Application Expertise Role Overview: An L2 HPC (High-Performance Computing) Engineer with an application skillset is responsible for supporting, troubleshooting, and maintaining HPC infrastructure and assisting users with scientific and engineering applications. They operate between infrastructure and application layers, ensuring...
-
Hpc Application Engineer
3 weeks ago
Hyderabad, India Whatjobs IN C2 Full time Hi, We have an immediate requirement for HPC Applications Engineer with our organization SHI Locuz Enterprise Solutions Pvt Ltd. PFB JD L2 Skill HPC Engineer with Application Expertise Role Overview: An L2 HPC (High-Performance Computing) Engineer with an application skillset is responsible for supporting, troubleshooting, and maintaining HPC infrastructure...
-
HPC Engineer For SHI
1 week ago
Hyderabad, Telangana, India Locuz Full time ₹ 15,00,000 - ₹ 25,00,000 per year L2 Skill HPC Engineer with Application Expertise Role Overview: An L2 HPC (High-Performance Computing) Engineer with an application skillset is responsible for supporting, troubleshooting, and maintaining HPC infrastructure and assisting users with scientific and engineering applications. They operate between infrastructure and application layers, ensuring...
-
Snr. Spec. Platform Analytics Hpc DevOps
15 hours ago
Hyderabad, Telangana, India Novartis Full time Summary: The Snr. Specialist DDIT APD HPC DevOps will be a core member of a F1 Foundry team supporting the data42 HPC platform maintenance and ensuring delivering per the roadmap / vision laid out. About the Role: Your responsibilities include but are not limited to: - Responsible for design & development of features required in Analytics...
-
Bengaluru, Hyderabad, India Cognizant Full time ₹ 12,00,000 - ₹ 36,00,000 per year Dear candidate, Please find the JD below for your reference and if you are interested kindly share your updated profile as attachment to Role Overview We are seeking seasoned professionals with deep expertise in operating and managing High-Performance Computing (HPC) platforms. The ideal candidate will have hands-on experience in designing, deploying, and...
-
Hyderabad, India Advanced Micro Devices, Inc Full time WHAT YOU DO AT AMD CHANGES EVERYTHING At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create...
-
Hyderabad, India Xilinx Full time Job Description WHAT YOU DO AT AMD CHANGES EVERYTHING At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to...
-
Staff Systems Engineer, DA HPC
24 hours ago
Hyderabad, India Micron Full time Our vision is to transform how the world uses information to enrich life for all. Micron Technology is a world leader in innovating memory and storage solutions that accelerate the transformation of information into intelligence, inspiring the world to learn, communicate and advance faster than ever. Responsibilities and Tasks: You will work regularly with...