Yotta - L3 HPC Administrator

2 weeks ago


Bengaluru, India YOTTA INFRASTRUCTURE SOLUTIONS LLP Full time

Job Scope :

As an HPC Admin L3, you will be responsible for the provisioning, management, and maintenance of GPU Supercomputing clusters on NVIDIA reference architecture.

You will ensure optimal performance and uptime of these critical systems, supporting high-performance computing (HPC) requirements.

Job Responsibilities :

- Provision, configure, and maintain GPU Supercomputing clusters and associated networking configuration.

- Collaborate with NVIDIA Solution Architect & Engineering teams on large-scale GPU-as-a-service projects, both on-premises and in cloud deployments.

- Implement and optimize software stacks including MaaS (metal-as-a-service), Job Scheduler (SLURM/PBS), Cloud Orchestration (Kubernetes), and Network Management (NetQ for Ethernet fabric and UFM for InfiniBand).

- Conduct performance engineering activities such as debugging, profiling, benchmarking, and tuning of GPU applications on large-scale supercomputing clusters.

- Run benchmarking applications from widely used platforms such as MLPerf Training & Inference, AI Training (PyTorch, TensorFlow, NeMo, Megatron-LM), and AI Inference (TensorRT-LLM, Triton Inference Server, vLLM).

Must-Have Skill :

- Hands-on experience with NVIDIA GPUs, particularly NVIDIA Data Centre GPUs (A100/H100).

- Proficiency in provisioning and managing software stacks like MaaS, Job Scheduler (SLURM/PBS), Cloud Orchestration (Kubernetes), and Network Management (NetQ for Ethernet fabric and UFM for InfiniBand).

- Prior experience collaborating with NVIDIA Solution Architect & Engineering teams on large-scale GPU-as-a-service projects.

- Familiarity with benchmarking applications from widely used platforms and frameworks, including MLPerf, PyTorch, TensorFlow, NeMo, Megatron-LM, TensorRT-LLM, Triton Inference Server, and vLLM.

- Experience in performance engineering, including debugging, profiling, benchmarking, and tuning various GPU applications on large-scale supercomputing clusters.

Good to Have Skill :

- Knowledge of other HPC technologies and architectures beyond NVIDIA, broadening expertise in the field.

- Good knowledge of InfiniBand and other switches.

- Experience with other cloud platforms and orchestration tools, expanding versatility in deployment environments.

- Strong problem-solving and troubleshooting abilities, enabling quick resolution of complex technical issues.

- Excellent communication and collaboration skills to work effectively within cross-functional teams and with external partners.

Behavioral Attributes :

- Strong problem-solving skills with a proactive and solution-oriented approach.

- Excellent communication and collaboration skills for effective customer support.

- Adaptability to handle a dynamic and fast-paced cloud administration environment.

- Commitment to security best practices and continuous improvement.

Qualification and Experience :

- Bachelor's degree in Engineering, or equivalent.

- Minimum of 10 years of experience in IT and 5+ years of relevant experience in HPC engineering roles, with a focus on NVIDIA GPU and networking technologies.

- Demonstrated success in deploying and managing large-scale GPU Supercomputing clusters, preferably in collaboration with NVIDIA teams.

- Proven track record of performance engineering activities and optimizing GPU applications for high-performance computing workloads.

(ref:hirist.tech)

  • Bengaluru, India Tata Consultancy Services Full time

    Role - HPC Cloud Programmer – L3
    Location - Pan India
    Experience - 5+ Yrs
    JD
    - Knowledge of HPC and AWS
    - Knowledge of HPC clusters
    - L3 support experience in CAE applications
    - Hands-on scripting experience in Linux Bash and Python
    - Hands-on scripting experience in Slurm and PBS


  • HPC Cloud Architect

    1 month ago


    Bengaluru, India Tata Consultancy Services Full time

    Role - HPC Cloud Architect - L3
    Location - Pan India
    Experience - 8 to 10 Yrs
    JD
    - Knowledge of HPC and AWS
    - Knowledge of HPC clusters
    - Knowledge of CAE applications
    - Linux Bash scripting, Python scripting
    - Scheduler - Slurm scripting, PBS scripting

  • Bengaluru, Karnataka, India Tata Consultancy Services Full time

    We are seeking a seasoned HPC Cloud Programmer with L3 support experience in CAE applications and expertise in AWS, HPC clusters, Linux Bash scripting, Python scripting, Slurm scripting, and PBS scripting. Job Description: Tata Consultancy Services is offering an exciting opportunity for a talented individual to join our team as an HPC Cloud Programmer. In this...

  • HPC Cloud Expert

    2 weeks ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Job Overview: Tata Consultancy Services is seeking a seasoned HPC Cloud Expert to join our team. This role involves leveraging expertise in high-performance computing and cloud technologies to deliver strategic cloud deployment solutions. About the Role: We are looking for an experienced professional with a strong background in HPC clusters, Linux scripting,...


  • Bengaluru, Karnataka, India Peopledecode Solutions Pvt Ltd Full time

    About Peopledecode Solutions Pvt Ltd: We are a leading technology company providing cutting-edge solutions to our clients. We're currently seeking an experienced AWS HPC Technical Lead to join our team. Salary: $150,000 - $200,000 per year. About the Role: The AWS HPC Technical Lead will be responsible for designing and implementing high-performance computing...

  • Delivery Lead-HPC

    4 weeks ago


    Bengaluru, India Algoleap Full time

    Algoleap is one of the fastest-growing digital engineering services companies, based out of Hyderabad and working with many Fortune 1000 customers to transform their digital landscape. At Algoleap, we believe that a truly innovative and successful workplace is one where diversity, equity, and inclusion thrive. We are dedicated to building a team that celebrates...
