
HPC Expert for High-Performance Computing Roles
2 days ago
High-Performance Computing Expert
HPC System Administration & Troubleshooting:
- Manage and optimize HPC clusters for high availability and performance.
- Troubleshoot GPU, CPU, and network issues at the driver, firmware, and OS levels to ensure smooth operation.
- Debug storage, networking, and job scheduling bottlenecks in Slurm-based environments for optimal efficiency.
Kubernetes & Cloud HPC Environments:
- Deploy and manage HPC workloads in Kubernetes for AI/ML and parallel computing applications.
- Optimize OpenStack-based HPC clusters with Ceph, Cinder, and Neutron for cloud scalability and reliability.
- Implement containerized HPC workflows using Kubernetes and OpenShift for enhanced flexibility and portability.
Automation & Infrastructure as Code (IaC):
- Develop Ansible and Terraform scripts for provisioning and managing HPC resources efficiently.
- Automate job scheduling, cluster monitoring, and log analysis using Python for real-time insights.
- Optimize CI/CD pipelines for HPC and AI/ML applications to reduce development time and improve quality.
Performance Tuning & Benchmarking:
- Benchmark and optimize multi-node HPC workloads (MPI, NCCL, ROCm, CUDA) for peak performance and efficiency.
- Tune OS parameters, networking (InfiniBand, RoCE), and Slurm configurations for optimal performance.
- Enhance HPC storage performance (Ceph, Lustre, NFS) and distributed computing efficiency for improved productivity.
Client Support & Collaboration:
- Provide real-time technical support and troubleshooting for HPC users with a focus on customer satisfaction.
- Work with developers, DevOps engineers, and system administrators to optimize cluster performance.
- Document solutions and best practices, and contribute to internal knowledge bases for knowledge sharing and reuse.
Preferred Qualifications:
- Experience with AMD MI300 or MI2X0 GPUs, ROCm, MPI, UCX, or XPMEM for advanced HPC capabilities.
- Exposure to containerized workloads using Singularity or Docker in HPC environments for flexibility and portability.
- Familiarity with OpenStack deployment automation (e.g., TripleO, Kolla, or OpenStack-Ansible) for efficient cluster management.
- Experience in customer-facing technical roles, with a strong ability to troubleshoot live issues and provide exceptional support.
-
Cloud HPC Systems Manager
3 days ago
Hyderabad / Secunderabad, Telangana, India beBeeCloud Full time
Cloud HPC Engineer Position Overview
We are seeking a seasoned Cloud High-Performance Computing (HPC) Engineer to join our team. This role is responsible for designing, implementing, and managing cloud-based infrastructure that supports HPC environments. As a key member of our engineering team, you will collaborate with data scientists and ML engineers to...
-
High-Performance Computing Expert
18 hours ago
Hyderabad, Telangana, India beBeeArtificial Full time ₹ 2,50,00,000 - ₹ 3,50,00,000
Job Title: AI and HPC Engineer
The position involves designing, optimizing, and benchmarking CPU- and GPU-intensive environments to ensure maximum efficiency in scientific and machine learning workloads.
Expertise in Open-source and Commercial High-Performance Computing (HPC) AI Applications
Proficient in deploying and optimizing scientific codes such as...
-
Senior Systems Engineer
5 days ago
Hyderabad / Secunderabad, Telangana, India beBeeHighPerformance Full time ₹ 15,00,000 - ₹ 20,00,000
System Administrator for High-Performance Computing Environments
Are you an experienced professional in high-performance computing environments? We are seeking a skilled System Administrator to manage, optimize, and troubleshoot HPC clusters and cloud-based environments. This role requires hands-on experience with Python, Kubernetes (K8s), Slurm, OpenStack,...
-
High-Performance Computing Specialist
8 hours ago
Hyderabad, Telangana, India beBeeHpc Full time ₹ 20,00,000 - ₹ 23,17,500
Job Title:
A highly skilled HPC AI Applications Professional is required to drive the implementation of high-performance computing solutions.
Key Responsibilities:
- Design and implement high-performance computing (HPC) solutions using Open-source and Commercial HPC AI Applications
- Install, benchmark, and fine-tune open-source applications, libraries, and...
-
Senior High Performance Computing Engineer
3 days ago
Hyderabad, Telangana, India Amgen Inc Full time
Job Description
- Implement and manage cloud-based infrastructure that supports HPC environments for data science (e.g., AI/ML workflows, Image Analysis).
- Collaborate with data scientists and ML engineers to deploy scalable machine learning models into production.
- Ensure the security, scalability, and reliability of HPC systems in the cloud.
- Optimize cloud...
-
HPC Developer and Performance Optimizer
1 week ago
Hyderabad / Secunderabad, Telangana, India beBeeSoftware Full time
Job Overview
We are seeking a highly skilled and experienced professional to join our team as an HPC Software Optimization Engineer. This is a challenging role that requires strong technical skills, leadership experience, and the ability to work collaboratively with cross-functional teams.
-
Senior HPC Engineer
3 days ago
Hyderabad / Secunderabad, Telangana, India beBeeHighPerformanceComputingEngineer Full time
Job Description:
This position involves the deployment, maintenance, and support of HPC infrastructure in a multi-cloud environment. The ideal candidate will have hands-on engineering experience with deep technical expertise in HPC technology and standard methodologies.
Implement and manage cloud-based infrastructure that supports HPC environments for data...
-
Senior HPC
3 days ago
Hyderabad, Telangana, India Allnessjobs Full time
Job Title: Senior Cloud Engineer
Location: Hyderabad
Experience: 5+ Years
Work Mode: Hybrid
About the Role:
We are looking for a highly skilled and motivated Senior AWS Engineer to join our team. In this role, you will be responsible for designing, deploying, and managing cloud infrastructure using Amazon Web Services (AWS). You will collaborate closely with...
-
Senior HPC Systems Engineer
1 hour ago
Hyderabad, Telangana, India beBeeSystems Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Job Title: SMTS Systems Design Engineer
We are seeking an experienced systems engineer to join our team. The ideal candidate will have a strong background in system design and administration, with expertise in high-performance computing (HPC) clusters, Kubernetes, and cloud environments. The successful candidate will be responsible for designing, deploying,...
-
Hyderabad, Telangana, India Amgen Full time
Career Category: Information Systems
Join Amgen's Mission of Serving Patients
At Amgen, if you feel like you're part of something bigger, it's because you are. Our shared mission, to serve patients living with serious illnesses, drives all that we do. Since 1980, we've helped pioneer the world of biotech in our fight against the world's toughest...