Senior High Performance Computing Engineer

3 weeks ago


Hyderabad, Telangana, India Amgen Inc Full time
Job Description

- Implement and manage cloud-based infrastructure that supports HPC environments for data science (e.g., AI/ML workflows, Image Analysis).
- Collaborate with data scientists and ML engineers to deploy scalable machine learning models into production.
- Ensure the security, scalability, and reliability of HPC systems in the cloud.
- Optimize cloud resources for cost-effective and efficient use.
- Stay current with the latest cloud services and industry-standard processes.
- Provide technical leadership and guidance in cloud and HPC systems management.
- Develop and maintain CI/CD pipelines for deploying resources to multi-cloud environments.
- Monitor and troubleshoot cluster operations, applications, and cloud environments.
- Document system design and operational procedures.

Must-Have Skills:

- Expertise in Linux/Unix system administration (RHEL, CentOS, Ubuntu, etc.).
- Proficiency with job scheduling and resource management tools (SLURM, PBS, LSF, etc.).
- Good understanding of parallel computing, MPI, OpenMP, and GPU acceleration (CUDA, ROCm).
- Knowledge of storage architectures and distributed file systems (Lustre, GPFS, Ceph).
- Experience with containerization technologies (Singularity, Docker) and cloud-based HPC solutions.
- Expertise in scripting languages (Python, Bash) and container orchestration (Docker, Kubernetes).
- Familiarity with automation tools (Ansible, Puppet, Chef) for system provisioning and maintenance.
- Understanding of networking protocols, high-speed interconnects, and security best practices.
- Demonstrable experience in cloud computing (AWS, Azure, GCP) and cloud architecture.
- Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation, and with Git.

What we expect of you

We are all different, yet we all use our unique contributions to serve patients.

- Expert knowledge in large Linux environments, networking, storage, and cloud-related technologies.
- Expertise in root-cause analysis and remediation while working with a team and stakeholders.
- Excellent communication and documentation skills are required.
- Expertise in Python, Bash, and YAML is expected.

Good-to-Have Skills:

- Experience with Kubernetes (EKS) and service mesh architectures.
- Knowledge of AWS Lambda and event-driven architectures.
- Familiarity with AWS CDK, Ansible, or Packer for cloud automation.
- Exposure to multi-cloud environments (Azure, GCP).

Basic Qualifications:

- Bachelor's degree in computer science, IT, or a related field, with 6-8 years of hands-on experience in HPC administration or a related area.

Additional Skills:

- Experience supporting research in healthcare life sciences.
- Deep, extensive experience with High Performance Computing (HPC) and cluster management.
- Familiarity with machine learning frameworks (TensorFlow, PyTorch) and data pipelines.
- Certifications in cloud architecture (AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, etc.).
- Experience in an Agile development environment.
- Prior work with distributed computing and big data technologies (Hadoop, Spark).

Professional Certifications (preferred):

- Red Hat Certified Engineer (RHCE) or Linux Professional Institute Certification (LPIC).
- AWS Certified Solutions Architect Associate or Professional.

Preferred Qualifications:

Soft Skills:

- Strong analytical and problem-solving skills.
- Ability to work effectively with global, virtual teams.
- Effective communication and collaboration with cross-functional teams.
- Ability to work in a fast-paced, cloud-first environment.

  • Hyderabad, Telangana, India beBeeHighPerformance Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    Job Overview: We are seeking a skilled High Performance Computing Specialist to join our team. As a Senior Consultant, you will be responsible for designing and implementing high-performance computing clusters on Azure. Your primary focus will be on automating cluster buildout workflows, tasks, and reports to produce innovative solutions for cluster buildout...


  • Hyderabad, Telangana, India beBeeVerification Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    Performance Verification Expert: We are seeking a skilled Performance Verification Engineer to join our team. The ideal candidate will have a strong background in performance verification and experience working with high-performance computing systems, including: Developing simulation infrastructure and methodology advances to model customer...


  • Hyderabad, Telangana, India beBeeEngineer Full time US$ 1,50,000 - US$ 2,00,000

    Job Overview: We're seeking an experienced engineer to develop and optimize software systems for our silicon platform. This role focuses on building efficient runtime systems that maximize chip performance while ensuring reliability and ease of use. Key Responsibilities: Design and implement runtime systems for AI accelerator execution and memory...


  • Hyderabad, Telangana, India Amgen Full time

    Career Category: Information Systems. Join Amgen's Mission of Serving Patients. At Amgen, if you feel like you're part of something bigger, it's because you are. Our shared mission, to serve patients living with serious illnesses, drives all that we do. Since 1980, we've helped pioneer the world of biotech in our fight against the world's toughest...


  • Hyderabad, Telangana, India beBeeEngineering Full time ₹ 18,00,000 - ₹ 20,10,000

    Senior Performance Engineering Manager: We are seeking a seasoned Performance Engineering Manager to lead our team in optimizing application performance and delivering exceptional user experiences. Key Responsibilities: Design and implement high-performance monitoring solutions using Dynatrace, Appmon, and Enterprise Synthetic script. Implement, configure, and...


  • Hyderabad, Telangana, India Genpact Full time

    Job Description: Inviting applications for the role of Consultant - High Performance Compute (HPC) Admin. Responsibilities: Administration of HPC and VDI clusters; User Account management for HPC onboarding and offboarding; Creation and Maintenance of AMI Images in AMI accounts; Install, configure, and maintain Linux operating systems on HPC clusters; Support...


  • Hyderabad, Telangana, India GENPACT Full time

    Ready to build the future with AI? At Genpact, we don't just keep up with technology, we set the pace. AI and digital innovation are redefining industries, and we're leading the charge. Genpact's industry-first accelerator is an example of how we're scaling advanced technology solutions to help global enterprises work smarter, grow faster and...


  • Hyderabad, Telangana, India beBeeMachineLearning Full time ₹ 25,00,000 - ₹ 48,00,000

    Job Description: We are seeking a skilled Machine Learning Engineer to join our team. This is an exciting opportunity to work with researchers on building high-performance and scalable software addressing novel ML research algorithms. As a key member of the team, you will apply solid software engineering skills to deal with the unexpected and explore research...


  • Hyderabad, Telangana, India beBeeFpga Full time ₹ 1,20,00,000 - ₹ 2,00,00,000

    FPGA Engineer Position: We are seeking an experienced FPGA application engineer to join our team. Job Responsibilities: Own a Vivado product area and work on high impact projects. Drive critical customer escalations to closure. Triage reported issues in several Vivado product areas. Actively explore innovative methodologies for the new 7nm Versal ACAP family. Work...


  • Hyderabad, Telangana, India beBeeSenior Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    About the Role: We are seeking an experienced Senior Manager to lead our Oracle ERP/EPM Engineering team in Hyderabad. This role will involve managing a high-performing team of Engineers, Senior Engineers, and Principal Engineers, with a focus on people leadership, performance management, and career development. As a Senior Manager, you will be responsible for...