HPC Admin

1 month ago


Mumbai Metropolitan Region, India | Yotta Data Services Private Limited | Full time

Job Scope:

As an HPC Admin L3, you will be responsible for the provisioning, management, and maintenance of GPU Supercomputing clusters on NVIDIA reference architecture. You will ensure optimal performance and uptime of these critical systems, supporting high-performance computing (HPC) requirements.


Job Responsibilities:

  • Provision, configure, and maintain GPU Supercomputing clusters and associated networking configuration.
  • Collaborate with NVIDIA Solution Architect & Engineering teams on large-scale GPU-as-a-service projects, in both on-premises and cloud deployments.
  • Implement and optimize software stacks including MaaS (metal-as-a-service), Job Scheduler (SLURM/PBS), Cloud Orchestration (Kubernetes), and Network Management (NetQ for Ethernet fabric and UFM for InfiniBand).
  • Conduct performance engineering activities such as debugging, profiling, benchmarking, and tuning of GPU applications on large-scale supercomputing clusters.
  • Run benchmarks and workloads from widely used suites and frameworks, such as MLPerf Training & Inference, AI training frameworks (PyTorch, TensorFlow, NeMo, Megatron-LM), and AI inference stacks (TensorRT-LLM, Triton Inference Server, vLLM).


Must-Have Skills:

  • Hands-on experience with NVIDIA GPUs, particularly NVIDIA data centre GPUs (A100/H100).
  • Proficiency in provisioning and managing software stacks like MaaS, Job Scheduler (SLURM/PBS), Cloud Orchestration (Kubernetes), and Network Management (NetQ for Ethernet fabric and UFM for InfiniBand).
  • Prior experience collaborating with NVIDIA Solution Architect & Engineering teams on large-scale GPU-as-a-service projects.
  • Familiarity with benchmarking applications from widely used platforms and frameworks, including MLPerf, PyTorch, TensorFlow, NeMo, Megatron-LM, TensorRT-LLM, Triton Inference Server, and vLLM.
  • Experience in performance engineering, including debugging, profiling, benchmarking, and tuning various GPU applications on large-scale supercomputing clusters.


Good-to-Have Skills:

  • Knowledge of other HPC technologies and architectures beyond NVIDIA, broadening expertise in the field.
  • Good knowledge of InfiniBand and other network switches.
  • Experience with other cloud platforms and orchestration tools, expanding versatility in deployment environments.
  • Strong problem-solving and troubleshooting abilities, enabling quick resolution of complex technical issues.
  • Excellent communication and collaboration skills to work effectively within cross-functional teams and with external partners.


Behavioral Attributes:

  • Strong problem-solving skills with a proactive and solution-oriented approach.
  • Excellent communication and collaboration skills for effective customer support.
  • Adaptability to handle a dynamic and fast-paced cloud administration environment.
  • Commitment to security best practices and continuous improvement.


Qualification and Experience:

  • Bachelor's degree in Engineering, or equivalent.
  • Minimum 10 years of experience in IT, including 5+ years of relevant experience in HPC engineering roles, with a focus on NVIDIA GPU and networking technologies.
  • Demonstrated success in deploying and managing large-scale GPU Supercomputing clusters, preferably in collaboration with NVIDIA teams.
  • Proven track record of performance engineering activities and optimizing GPU applications for high-performance computing workloads.

