HPC Network Engineer
3 days ago
Location: Hyderabad or Mumbai
Experience: Minimum 5 years of relevant network experience
Job Overview:
We are seeking a highly skilled and experienced HPC Network Engineer to join our team. The ideal candidate will have a strong background in setting up and managing high-performance computing (HPC) networks with cutting-edge technologies such as 400G and 800G network connectivity. This role involves designing, implementing, and troubleshooting complex network architectures tailored to HPC and GPU-based systems. The engineer will also play a critical role in enabling efficient GPU interconnects and scaling AI and HPC workloads.
Key Responsibilities:
HPC Network Deployment:
Design, deploy, and maintain HPC networks with 400G/800G connectivity.
Optimize network performance for large-scale computing environments.
Advanced Networking Expertise:
Deep understanding of and hands-on experience with RoCE (RDMA over Converged Ethernet) and InfiniBand technologies.
Collaborate with cross-functional teams to architect and implement robust HPC networking solutions.
Architectural Design and Communication:
Develop and present complex network architectures to technical and non-technical stakeholders.
Translate customer requirements into scalable and efficient network designs.
GPU Communication Frameworks:
Expertise in NVLink and NVSwitch for high-speed GPU-to-GPU communication.
Optimize interconnects for distributed training and inference workloads.
Technology Expertise:
Hands-on experience with switches and networking equipment from Broadcom, Arista, Mellanox, Juniper, Cisco, SONiC, or Dell.
Familiarity with NVIDIA, AMD, and Intel HPC architectures and their network integration requirements.
Storage Networking for HPC and AI:
Integrate GPUDirect Storage and NVMe-oF for efficient data movement between storage and GPUs.
Optimize data pathways for high-speed storage access in HPC workloads.
Problem Solving and Troubleshooting:
Monitor, analyze, and troubleshoot network performance issues.
Implement monitoring tools to ensure high availability and reliability of the HPC network.
Customer-Centric Solutions:
Engage with customers to understand their requirements and deliver tailored solutions on the fly.
Provide ongoing support and documentation for implemented solutions.
Comprehensive Network Knowledge:
Expertise in end-to-end network monitoring, analysis, troubleshooting, and implementation.
Stay updated on industry trends, standards, and best practices for HPC networking.
AI and HPC Workload Integration:
Support hybrid workloads combining AI and traditional HPC tasks.
Scale large language models and scientific simulations across GPU clusters with minimal latency.
Required Skills and Qualifications:
Minimum 5 years of hands-on experience in core network engineering.
Proven expertise in configuring and managing 400G or 800G network environments.
Strong knowledge of RoCE and InfiniBand protocols.
Hands-on experience with NVLink and NVSwitch in GPU-based environments.
Familiarity with networking equipment and technologies from vendors such as Broadcom, Arista, Mellanox, Juniper, Cisco, SONiC, or Dell.
Experience working with NVIDIA, AMD, or Intel HPC and GPU architectures.
Ability to conceptualize and explain complex network designs to diverse audiences.
Strong analytical and troubleshooting skills in high-performance environments.
Excellent communication and customer engagement skills to address requirements and provide solutions effectively.
Preferred Qualifications:
Industry certifications such as CCNP, CCIE, or equivalent.
Experience in scripting and automation for network operations.
Exposure to large-scale HPC deployments in data center environments.
Knowledge of software-defined networking (SDN) and virtualized networking environments.
Familiarity with AI-specific frameworks like TensorFlow, PyTorch, or Horovod in distributed setups.
Why Join Us?
Work on cutting-edge HPC and GPU-based technologies.
Collaborate with industry leaders in AI and cloud infrastructure.
Competitive compensation and growth opportunities.
Opportunity to work in a dynamic and fast-paced environment.
-
HPC Network Engineer
5 days ago
Delhi, India
Stealth AI Startup
Full time
Job Title: HPC Network Engineer
Location: Hyderabad or Mumbai
Experience: Minimum 5 years of relevant network experience
Job Overview: We are seeking a highly skilled and experienced HPC Network Engineer to join our team. The ideal candidate will have a strong background in setting up and managing high-performance computing (HPC) networks with cutting-edge...
-
HPC Engineer
2 months ago
Delhi, India
SHI | Locuz - An SHI Company
Full time
HPC Field Engineer
Please find the JD below:
Work Location - Pune
Experience - 2+ years
Should know the below mentioned:
HPC Skill Set
Cluster Tool Kit: Rocks, xCAT, OpenHPC
Scheduler: PBS Pro, SLURM
MPI: Intel, Open MPI
PFS: Lustre, GPFS
Linux Skill:
• OS Deep Dive (Red Hat, SLES, Ubuntu)
• Unattended Installation Deep Dive (PXE, Cobbler, xCAT, etc.)
• File...
-
HPC Linux Support Engineer
3 weeks ago
Delhi, India
Self-employed
Full time
Role: HPC Linux Systems Engineer
Location: Remote working; where you currently reside.
Salary: Open for discussion
Type: Permanent, full-time (work hours for India will be 9:30 pm to 8 am initially).
Our client is a global IT solutions and managed services provider that focuses on helping organizations digitally transform their operations. With expertise in...
-
HPC Administrator
4 months ago
Delhi, India
Esconet Technologies
Full time
Job Title: HPC Cluster Integrator/Administrator
Location: Delhi/NCR - Okhla Phase 1
Work Experience: 3+ years of relevant work experience
Mandatory Skills: HPC, Linux, and any scripting language (Bash, Perl, or Python).
Esconet is looking for an HPC Senior System Integrator/System Administrator who is highly motivated, creative and innovative and has a strong...
-
HPC Admin
1 month ago
Delhi, India
Yotta Data Services Private Limited
Full time
Job Scope: As an HPC Admin L3, you will be responsible for the provisioning, management, and maintenance of GPU Supercomputing clusters on NVIDIA reference architecture. You will ensure optimal performance and uptime of these critical systems, supporting high-performance computing (HPC) requirements.
Job Responsibilities:
- Provision, configure, and maintain...
-
Delhi, India
Self-employed
Full time
Role: HPC Linux Systems Engineer
Location: Remote working; where you currently reside.
Salary: Open for discussion
Type: Permanent, full-time
Our client is a global IT solutions and managed services provider that focuses on helping organizations digitally transform their operations. With expertise in cloud computing, data center technologies, networking, and...
-
HPC Systems Engineer
2 weeks ago
Delhi, Delhi, India
NextGen Innovation Labs
Full time
Job Title: HPC Systems Engineer
Estimated Salary: $120,000 - $180,000 per year
Company Overview: NextGen Innovation Labs is a cutting-edge research and development facility that pushes the boundaries of innovation.
Job Description:
Manage workload schedulers using HPC tools and middleware.
Troubleshoot HPC applications from an infrastructure...
-
HPC Admin L3
3 weeks ago
Delhi, India
SIRO Clinpharm Pvt. Ltd.
Full time
10+ years of professional experience in HPC
Installation, configuration and troubleshooting of HPC clusters, networks, and storage.
Installation, configuration and troubleshooting of the parallel filesystem GPFS
Installation, configuration and troubleshooting of the HPC job schedulers LSF and SLURM
Installation, configuration and troubleshooting Cluster Manager BCM,...