
Enterprise HPC Infrastructure Solutions Specialist
24 hours ago
Job Summary:
High-Performance Computing experts are sought to provide operational support, plan and perform maintenance activities, assess customer environments for performance and design issues, and troubleshoot complex infrastructure issues.
Key Responsibilities:
- Operational support for incident, problem, and change management activities.
- Maintenance planning and execution to keep HPC infrastructure operating smoothly.
- Customer environment assessment for performance and design issues, proposing resolutions to improve system efficiency.
- Troubleshooting of complex infrastructure issues, implementation of solutions, and serving as the subject matter expert escalation point.
- Creation and maintenance of detailed documentation for HPC infrastructure and related processes.
- Expertise in storage technologies and HPC-related issues, including communication with vendors to resolve storage issues and improve system reliability.
- Communication with customers and internal teams for transparency and timely issue resolution.
- Participation in an on-call rotation for 24/7 support of critical HPC infrastructure.
Requirements:
- Bachelor's degree or equivalent in Information Systems or a related field.
- 5+ years of expert-level experience managing infrastructure in high-performance computing environments.
- 1+ year experience with Nvidia DGX preferred.
- Experience with HPC schedulers (e.g. SLURM, PBS, Torque).
- Experience configuring, maintaining, and troubleshooting Kubernetes.
- Experience with storage technology (e.g. Ceph, Vast Data Platform) and distributed file systems (e.g. Lustre, GPFS, NFS, GlusterFS).
- Experience with machine learning or data science workflows in HPC/AI environments.
- Advanced experience with Linux operating systems.
- Experience with Nvidia/Mellanox switches a plus.
- Experience with Ethernet and InfiniBand networking a plus.
- 1+ year working with monitoring platforms (e.g. Prometheus, Grafana); Elastic Observability experience is a bonus.
- 1+ year working with enterprise ITSM systems (ServiceNow is a bonus).
- Experience with automation tools such as Ansible, Puppet, or Chef is a plus.
- Managed Services or consulting experience is required.
- Strong background in customer service.
- High-level problem-solving and communication skills.
- Strong oral and written communication skills.
- Related network certifications are a bonus.
-
HPC Infrastructure Engineer
1 day ago
Gurgaon, Haryana, India AHEAD Full time
Job Description
Roles & Responsibilities
- Provide enterprise-level operational support to Managed Services customers for incident, problem, and change management activities
- Plan and perform maintenance activities
- Assess customer environments for performance and design issues and propose resolutions
- Work across technical teams to troubleshoot complex...
-
HPC System Administrator
3 days ago
Gurgaon, Haryana, India NVISH SOLUTIONS PRIVATE LIMITED Full time
Responsibilities:
- Administration of HPC and VDI clusters
- User account management for HPC onboarding and offboarding
- Creation and maintenance of AMI images in AMI accounts
- Install, configure, and maintain Linux operating systems on HPC clusters.
- Support HPC necessary components and native services of the platform by coordinating with respective...
-
HPC Storage Engineer
1 day ago
Gurgaon, Haryana, India AHEAD Full time
Job Description
The High-Performance Computing Storage Engineer is primarily responsible for the overall health and maintenance of storage technologies in our managed services customers' environments. Our Storage Engineers are valued members of the Managed Services Infrastructure Practice responsible for Tier 3 incident management, service request management...
-
HPC Network Engineer
1 day ago
Gurgaon, Haryana, India AHEAD Full time
Job Description
The High-Performance Computing Network Engineer is primarily responsible for the overall health and maintenance of storage technologies in our managed services customers' environments. Our Network Engineers are valued members of the Managed Services Infrastructure Practice responsible for Tier 3 incident management, service request management...
-
Cloud Infrastructure Specialist
24 hours ago
Gurgaon, Haryana, India beBeeNetwork Full time ₹ 18,00,000 - ₹ 24,00,000
Job Opportunity: We are seeking a highly skilled High-Performance Computing Network Engineer to join our team.
Job Description: This individual will play a vital role in maintaining the overall health and infrastructure of storage technologies for managed services customers. They will be responsible for Tier 3 incident management, service request management,...
-
Chief High-Performance Infrastructure Specialist
21 hours ago
Gurgaon, Haryana, India beBeeInfrastructure Full time ₹ 15,00,000 - ₹ 28,00,000
Job Description
We are seeking an experienced professional to fill the role of High-Performance Computing Engineer. The successful candidate will provide operational support for enterprise-level customers, planning and performing maintenance activities, assessing customer environments for performance and design issues, and collaborating with technical teams...
-
HPC & Infrastructure Engineer
6 days ago
Gurgaon, Haryana, India Tower Research Capital Full time US$ 1,50,000 - US$ 2,00,000 per year
Tower Research Capital is a leading quantitative trading firm founded in 1998. Tower has built its business on a high-performance platform and independent trading teams. We have a 25+ year track record of innovation and a reputation for discovering unique market opportunities.
Tower is home to some of the world's best systematic trading and engineering...
-
IT Infrastructure Solutions Specialist
5 days ago
Gurgaon, Haryana, India beBeeinfrastructure Full time ₹ 12,00,000 - ₹ 15,00,000
Job Opportunity
This is a full-time, on-site position for an IT Infrastructure Solutions Specialist based in Gurugram.
-
Principal Storage Architect
22 hours ago
Gurgaon, Haryana, India beBeeStorage Full time ₹ 15,00,000 - ₹ 28,00,000
Job Description: The primary responsibility of the Storage Infrastructure Specialist lies in ensuring the optimal functioning and maintenance of storage technologies within our managed services customer environments.
This role is an integral part of the Managed Services Infrastructure Practice, responsible for Tier 3 incident management, service request...
-
IT Infrastructure Solutions Specialist
1 week ago
Gurgaon, Haryana, India beBeeInfrastructure Full time ₹ 15,48,000 - ₹ 21,47,000
Job Title: IT Infrastructure Solutions Specialist
Description: As an IT Infrastructure Solutions Specialist, you will play a crucial role in ensuring the stability and reliability of our organization's infrastructure. You will be responsible for resolving incidents and problems across multiple business system components and ensuring operational stability.
Key...