Senior High Performance Computing Engineer

1 day ago


Hyderabad, Telangana, India Amgen Inc Full time
Job Description

- Implement and manage cloud-based infrastructure that supports HPC environments for data science (e.g., AI/ML workflows, Image Analysis).
- Collaborate with data scientists and ML engineers to deploy scalable machine learning models into production.
- Ensure the security, scalability, and reliability of HPC systems in the cloud.
- Optimize cloud resources for cost-effective and efficient use.
- Stay current with the latest cloud services and industry-standard processes.
- Provide technical leadership and guidance in cloud and HPC systems management.
- Develop and maintain CI/CD pipelines for deploying resources to multi-cloud environments.
- Monitor and troubleshoot cluster operations, applications, and cloud environments.
- Document system design and operational procedures.

Must-Have Skills:

- Expertise in Linux/Unix system administration (RHEL, CentOS, Ubuntu, etc.).
- Proficiency with job scheduling and resource management tools (SLURM, PBS, LSF, etc.).
- Good understanding of parallel computing, MPI, OpenMP, and GPU acceleration (CUDA, ROCm).
- Knowledge of storage architectures and distributed file systems (Lustre, GPFS, Ceph).
- Experience with containerization technologies (Singularity, Docker) and cloud-based HPC solutions.
- Expertise in scripting languages (Python, Bash) and containerization technologies (Docker, Kubernetes).
- Familiarity with automation tools (Ansible, Puppet, Chef) for system provisioning and maintenance.
- Understanding of networking protocols, high-speed interconnects, and security best practices.
- Demonstrable experience in cloud computing (AWS, Azure, GCP) and cloud architecture.
- Experience with infrastructure as code (IaC) tools like Terraform or CloudFormation and Git.

What we expect of you

- We are all different, yet we all use our unique contributions to serve patients.
- Expert knowledge in large Linux environments, networking, storage, and cloud-related technologies.
- Expertise in root-cause analysis and remediation while working with a team and stakeholders.
- Top-level communication and documentation skills are required.
- Expertise in Python, Bash, and YAML is expected.

Good-to-Have Skills:

- Experience with Kubernetes (EKS) and service mesh architectures.
- Knowledge of AWS Lambda and event-driven architectures.
- Familiarity with AWS CDK, Ansible, or Packer for cloud automation.
- Exposure to multi-cloud environments (Azure, GCP).

Basic Qualifications:

- Bachelor's degree in computer science, IT, or a related field, with 6-8 years of hands-on experience in HPC administration or a related area.

Additional Skills:

- Experience supporting research in healthcare life sciences.
- Deep, extensive experience with High Performance Computing (HPC) and cluster management.
- Familiarity with machine learning frameworks (TensorFlow, PyTorch) and data pipelines.
- Certifications in cloud architecture (AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, etc.).
- Experience in an Agile development environment.
- Prior work with distributed computing and big data technologies (Hadoop, Spark).

Professional Certifications (preferred):

- Red Hat Certified Engineer (RHCE) or Linux Professional Institute Certification (LPIC).
- AWS Certified Solutions Architect Associate or Professional.

Preferred Qualifications:

Soft Skills:

- Strong analytical and problem-solving skills.
- Ability to work effectively with global, virtual teams.
- Effective communication and collaboration with cross-functional teams.
- Ability to work in a fast-paced, cloud-first environment.
