High Performance Infrastructure Professional

7 days ago


Jodhpur, Rajasthan, India beBeeComputing Full time ₹ 19,80,000 - ₹ 25,20,000

Job Title: High Performance Computing Specialist

Description:

This highly skilled position involves designing, building, and supporting high-performance computing (HPC) infrastructure. The successful candidate will have extensive experience in AI infrastructure design and deployment, including CPU, GPU, middleware, and orchestration tools.

Key Responsibilities:

  • Assess current AI infrastructure requirements and future growth needs.
  • Design and deploy AI infrastructure to meet business requirements, including CPU, GPU, middleware, and orchestration tools.
  • Configure and manage parallel file systems (PFS) and NFS.
  • Install and configure libraries and compilers.
  • Deploy and manage monitoring and observability tools for AI clusters.
  • Manage and troubleshoot network issues involving InfiniBand and RoCE switches, UFM, NetQ, and NVIDIA GPUs.

Required Skills:

  • Operating systems: Linux (RHEL, CentOS, SUSE)
  • Languages: C, C++, bash, Python
  • Schedulers and resource management: PBS, SLURM, Grid Engine, LSF
  • Cluster management: xCAT, Bright Cluster Manager, Warewulf
  • Monitoring: Grafana, Nagios, Zabbix, Ganglia
  • Compilers and libraries: GNU, Intel, Open MPI, CUDA, MPI
  • Networking: InfiniBand/RoCE/UFM/Ethernet/DPU

Desirable Skills:

  • Parallel file systems: GPFS, Lustre, Weka, BeeGFS, VAST
  • Benchmarking tools: IOR, LINPACK
  • NVIDIA stack: NVIDIA AI Enterprise, Run:ai
  • Configuration and troubleshooting of parallel file systems such as Lustre, GPFS, and Weka
  • Job scheduler installation, configuration, and troubleshooting
  • Root cause analysis of cluster issues
  • Performance benchmarking of clusters
  • Basic knowledge of parallel programming – Open MPI, MPI
  • Knowledge of GPU orchestration and MIG instance creation
  • Scripting and automation – bash, Perl, Python
  • Advanced Linux OS debugging skills

