Lead HPCC
What you’ll do:
Responsibilities
Leadership and Strategy
Develop HPC and container platform roadmaps and strategies for growth based on business needs
Engage in and enhance the complete service lifecycle—from conceptualization and design to implementation and operation
Identify the growth path and scalability options of a solution and include these in design activities
Solution Design, Architecture, and Planning
Gather requirements, assess technical feasibility, and design integrated HPC and container solutions that align with business objectives.
Architect and optimize the technical solutions to meet the requirements of the customer.
Identify the potential challenges and constraints that impact the solution and project plan.
Opportunity Assessment
Respond to the technical sections of RFIs/RFPs and lead proof-of-concept engagements to a successful conclusion
Utilize an effective consultative approach to advance opportunities
Innovation and Research
Stay abreast of emerging technologies, trends, and industry developments related to HPC, Kubernetes, containers, cloud computing, and security.
Develop best practices, accelerators, and show-and-tell demonstrations for HPC and container platform solutions and integrations.
Customer-Centric Mindset
Strong focus on understanding customer business requirements and solving complex cloud technology issues
Be the trusted advisor, delight customers, and deliver exceptional customer experiences to drive customer success.
Communicate complex technical concepts and findings to non-technical stakeholders
Team Collaboration
Collaborate with cross-functional teams, including system administrators, developers, data scientists, and project managers, to ensure successful project delivery.
Understand the roles of other teams and resources within the company and engage them effectively
Mentor and train new team members, and lead by example in tech talks, forums, and innovation initiatives.
Performance Optimization and Troubleshooting
Troubleshoot and resolve technical issues related to complete solutions.
Identify performance bottlenecks and provide remediations.
Project Delivery
Lead technical projects end to end: gather the requirements, prepare the architecture and design, and execute the delivery.
Bring clarity to and drive complex projects involving multiple stakeholders
Solid business acumen and the ability to converse with clients on issues and challenges
Technical Skills
Container Technologies and Orchestration Platform
In-depth knowledge and hands-on experience with containerization technologies like Docker or Podman
In-depth knowledge and hands-on experience with at least two (2) container orchestration technologies such as CNCF Kubernetes, Red Hat OpenShift, SUSE Rancher RKE/K3s, Canonical Charmed Kubernetes, or HPE Ezmeral Runtime
Linux
Knowledge and experience with Linux System Administration, package management, scheduling, boot procedures/troubleshooting, performance optimization, and networking concepts
Good knowledge and hands-on experience with at least two Linux distributions such as RHEL, SLES, Ubuntu, or Debian.
HPC
In-depth knowledge and hands-on experience with at least one (1) HPC technology: workload schedulers such as Slurm or Altair PBS Pro, and cluster managers such as HPCM or Bright Cluster Manager
Good experience in performance optimization and health assessment of HPC components such as operating systems, storage, servers, parallel file systems, schedulers.
Good knowledge and hands-on experience with HPC containerization technologies like Singularity
Good knowledge of parallel computing and MPI technologies
Virtualization
Good knowledge and hands-on experience with virtualization technologies like KVM and OpenShift Virtualization
Programming Languages
Good experience with programming languages like Python
Good experience with scripting languages like Bash
Cloud Platforms
Good knowledge and hands-on experience with OpenStack cloud solutions
Good knowledge of any of the public cloud container services: AKS, EKS, or GKE
Understanding of cloud infrastructure and services for scalable AI deployments.
Good understanding of cloud security and observability
Storage
In-depth knowledge and hands-on experience with CSI drivers
Good knowledge of storage concepts: block, file, and/or object storage (like MinIO)
Networks
Good knowledge of network protocols like TCP/IP, S3, FTP, NFS, or SMB/CIFS
Good knowledge of DNS, TCP/IP, routing, and load balancing
Networks - HPC
Good knowledge of the HPC networking stack (high-speed networking), such as InfiniBand.
GPU
Knowledge of GPU technologies such as the NVIDIA GPU Operator and NVIDIA vGPU
What you need to bring:
Qualifications:
- Bachelor’s/Master’s degree in Computer Science, Information Technology, or a related field
- Proven experience as a Solutions Architect, HPC and Container Platform Specialist, or similar role, with expertise in designing and implementing complex solutions
- Red Hat certification (Red Hat Certified Specialist in Containers and Kubernetes, RHCSA, or RHCE) or CNCF certification (CKA, CKAD, or CKS) is preferred
- Typically, 6-8 years of experience in delivering complex HPC and container platform projects
- Excellent communication and presentation skills with the ability to convey complex technical concepts to non-technical stakeholders.