Lead hpcc
6 days ago
What you’ll do:

Responsibilities:

Leadership and Strategy:
Develop HPC and container platform roadmaps and growth strategies based on business needs.
Engage in and enhance the complete service lifecycle, from conceptualization and design to implementation and operation.
Identify the growth path and scalability options of a solution and include these in design activities.

Solution Design, Architecture and Planning:
Gather requirements, assess technical feasibility, and design integrated HPC and container solutions that align with business objectives.
Architect and optimize technical solutions to meet the customer's requirements.
Identify the potential challenges and constraints that impact the solution and project plan.

Opportunity Assessment:
Respond to the technical sections of RFIs/RFPs and lead proof-of-concept engagements to a successful conclusion.
Use an effective consultative approach to advance opportunities.

Innovation and Research:
Stay abreast of emerging technologies, trends, and industry developments related to HPC, Kubernetes, containers, cloud computing, and security.
Develop best practices, accelerators, and Show & Tell for HPC and container platform solutions and integrations.

Customer-Centric Mindset:
Maintain a strong focus on understanding customer business requirements and solving complex cloud technology issues.
Be the trusted advisor, delight customers, and deliver exceptional customer experiences to drive customer success.
Communicate complex technical concepts and findings to non-technical stakeholders.

Team Collaboration:
Collaborate with cross-functional teams, including system administrators, developers, data scientists, and project managers, to ensure successful project delivery.
Understand the roles of other teams and resources within the company and engage them effectively.
Mentor and train new team members, and lead the way in participating in tech talks, forums, and innovation.
Performance Optimization and Troubleshooting:
Troubleshoot and resolve technical issues related to complete solutions.
Identify performance bottlenecks and provide remediations.

Project Delivery:
Lead technical projects by gathering the requirements, preparing the architecture and design, and executing end to end.
Bring clarity to and drive complex projects involving multiple stakeholders.
Solid business acumen and the ability to converse with clients on issues and challenges.

Technical Skills:

Container Technologies and Orchestration Platforms:
In-depth knowledge of and hands-on experience with containerization technologies like Docker or Podman.
In-depth knowledge of and hands-on experience with at least two (2) container orchestration technologies such as CNCF Kubernetes, Red Hat OpenShift, SUSE Rancher RKE/K3s, Canonical Charmed Kubernetes, or HPE Ezmeral Runtime.

Linux:
Knowledge of and experience with Linux system administration, package management, scheduling, boot procedures and troubleshooting, performance optimization, and networking concepts.
Good knowledge of and hands-on experience with at least two Linux distributions such as RHEL, SLES, Ubuntu, or Debian.

HPC:
In-depth knowledge of and hands-on experience with at least one (1) HPC technology: workload schedulers (Slurm, Altair PBS Pro) and cluster managers (HPCM, Bright Cluster Manager).
Good experience in performance optimization and health assessment of HPC components such as operating systems, storage, servers, parallel file systems, and schedulers.
Good knowledge of and hands-on experience with containerization technologies for HPC, like Singularity.
Good knowledge of parallel computing and MPI technologies.

Virtualization:
Good knowledge of and hands-on experience with virtualization technologies like KVM and OpenShift Virtualization.

Programming Languages:
Good experience with programming languages like Python.
Good experience with scripting languages like Bash.

Cloud Platforms:
Good knowledge of and hands-on experience with OpenStack cloud solutions.
Good knowledge of any of the public cloud container services (AKS, EKS, GKE).
Understanding of cloud infrastructure and services for scalable AI deployments.
Good understanding of cloud security and observability.

Storage:
In-depth knowledge of and hands-on experience with CSI drivers.
Good knowledge of storage concepts: block, file, and/or object storage (like MinIO).

Networks:
Good knowledge of network protocols like TCP/IP, S3, FTP, NFS, or SMB/CIFS.
Good knowledge of DNS, TCP/IP, routing, and load balancing.

Networks (HPC):
Good knowledge of the HPC networking stack (high-speed networking), such as InfiniBand.

GPU:
Knowledge of GPU technologies such as the NVIDIA GPU Operator and NVIDIA vGPU.

What you need to bring:

Qualifications:
Bachelor’s or master’s degree in Computer Science, Information Technology, or a related field.
Proven experience as a Solutions Architect, HPC and Container Platform Specialist, or similar role, with expertise in designing and implementing complex solutions.
Red Hat Certified Specialist in Containers and Kubernetes (RHCSA, RHCE) or CNCF certification (CKA, CKAD, CKS) is preferred.
Typically, 6-8 years of experience in delivering complex HPC and container platform projects.
Excellent communication and presentation skills, with the ability to convey complex technical concepts to non-technical stakeholders.