Enterprise HPC Infrastructure Solutions Specialist

24 hours ago


Gurgaon, Haryana, India beBeeHpc Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

Job Summary:

We are seeking High-Performance Computing experts to provide operational support, plan and perform maintenance activities, assess customer environments for performance and design issues, and troubleshoot complex infrastructure issues.

Key Responsibilities:

  • Provide operational support for incident, problem, and change management activities.
  • Plan and perform maintenance to keep HPC infrastructure running smoothly.
  • Assess customer environments for performance and design issues and propose resolutions that improve system efficiency.
  • Troubleshoot complex infrastructure issues, implement solutions, and serve as the subject matter expert escalation point.
  • Create and maintain detailed documentation of HPC infrastructure and related processes.
  • Apply expertise in storage technologies and HPC-related issues, and work with vendors to resolve storage issues and improve system reliability.
  • Communicate with customers and internal teams to ensure transparency and timely issue resolution.
  • Participate in the on-call rotation for 24/7 support of critical HPC infrastructure.

Requirements:

  • Bachelor's degree or equivalent in Information Systems or a related field.
  • 5+ years of expert-level experience managing infrastructure in high-performance computing environments.
  • 1+ year of experience with Nvidia DGX preferred.
  • Experience with HPC schedulers (e.g. SLURM, PBS, Torque).
  • Experience configuring, maintaining, and troubleshooting Kubernetes.
  • Experience with storage technology (e.g. Ceph, Vast Data Platform) and distributed file systems (e.g. Lustre, GPFS, NFS, GlusterFS).
  • Experience with machine learning or data science workflows in HPC/AI environments.
  • Advanced experience with Linux operating systems.
  • Experience with Nvidia/Mellanox switches is a plus.
  • Experience with Ethernet and InfiniBand networking is a plus.
  • 1+ year working with monitoring platforms (e.g. Prometheus, Grafana); Elastic Observability experience is a bonus.
  • 1+ year working with enterprise ITSM systems (ServiceNow is a bonus).
  • Experience with automation tools such as Ansible, Puppet, or Chef is a plus.
  • Managed Services or consulting experience is required.
  • Strong background in customer service.
  • High-level problem-solving and communication skills.
  • Strong oral and written communication skills.
  • Related network certifications are a bonus.


  • Gurgaon, Haryana, India AHEAD Full time

    Job Description: Roles & Responsibilities: Provide enterprise-level operational support to Managed Services customers for incident, problem, and change management activities; plan and perform maintenance activities; assess customer environments for performance and design issues and propose resolutions; work across technical teams to troubleshoot complex...


  • Gurgaon, Haryana, India NVISH SOLUTIONS PRIVATE LIMITED Full time

    Responsibilities: Administration of HPC and VDI clusters; user account management for HPC onboarding and offboarding; creation and maintenance of AMI images in AMI accounts; installation, configuration, and maintenance of Linux operating systems on HPC clusters; support for HPC necessary components and native services of the platform by coordinating with respective...


  • Gurgaon, Haryana, India AHEAD Full time

    Job Description: The High-Performance Computing Storage Engineer is primarily responsible for the overall health and maintenance of storage technologies in our managed services customers' environments. Our Storage Engineers are valued members of the Managed Services Infrastructure Practice, responsible for Tier 3 incident management, service request management...


  • Gurgaon, Haryana, India AHEAD Full time

    Job Description: The High-Performance Computing Network Engineer is primarily responsible for the overall health and maintenance of storage technologies in our managed services customers' environments. Our Network Engineers are valued members of the Managed Services Infrastructure Practice, responsible for Tier 3 incident management, service request management...


  • Gurgaon, Haryana, India beBeeNetwork Full time ₹ 18,00,000 - ₹ 24,00,000

    Job Opportunity: We are seeking a highly skilled High-Performance Computing Network Engineer to join our team. Job Description: This individual will play a vital role in maintaining the overall health and infrastructure of storage technologies for managed services customers. They will be responsible for Tier 3 incident management, service request management,...


  • Gurgaon, Haryana, India beBeeInfrastructure Full time ₹ 15,00,000 - ₹ 28,00,000

    Job Description: We are seeking an experienced professional to fill the role of High-Performance Computing Engineer. The successful candidate will provide operational support for enterprise-level customers, planning and performing maintenance activities, assessing customer environments for performance and design issues, and collaborating with technical teams...


  • Gurgaon, Haryana, India Tower Research Capital Full time US$ 1,50,000 - US$ 2,00,000 per year

    Tower Research Capital is a leading quantitative trading firm founded in 1998. Tower has built its business on a high-performance platform and independent trading teams. We have a 25+ year track record of innovation and a reputation for discovering unique market opportunities. Tower is home to some of the world's best systematic trading and engineering...


  • Gurgaon, Haryana, India beBeeinfrastructure Full time ₹ 12,00,000 - ₹ 15,00,000

    Job Opportunity: This is a full-time, on-site position for an IT Infrastructure Solutions Specialist based in Gurugram.


  • Gurgaon, Haryana, India beBeeStorage Full time ₹ 15,00,000 - ₹ 28,00,000

    Job Description: The primary responsibility of the Storage Infrastructure Specialist is to ensure the optimal functioning and maintenance of storage technologies within our managed services customer environments. This role is an integral part of the Managed Services Infrastructure Practice, responsible for Tier 3 incident management, service request...


  • Gurgaon, Haryana, India beBeeInfrastructure Full time ₹ 15,48,000 - ₹ 21,47,000

    Job Title: IT Infrastructure Solutions Specialist. Description: As an IT Infrastructure Solutions Specialist, you will play a crucial role in ensuring the stability and reliability of our organization's infrastructure. You will be responsible for resolving incidents and problems across multiple business system components and ensuring operational stability. Key...