Sr. HPC Engineer 8+ years AWS DevOps

4 weeks ago


Bengaluru, India Western Digital Full time
Job Description

Western Digital’s High-Performance Computing environments are key to bringing new storage solutions to market. As a Senior High-Performance Computing (HPC) engineer in the IT Infrastructure team, you will be at the heart of Western Digital’s engineering and product development process, delivering the IT HPC infrastructure and services that empower engineering teams to develop new storage technologies and deliver high-quality products to market quickly.

As a member of the HPC as a Service (HPCaaS) team, you will be responsible for establishing and executing strategic objectives focused on improving the effective utilization of compute resources while meeting or exceeding customer service level agreements for job prioritization, job concurrency, and job throughput in our EDA compute clusters. This includes leading architectural innovation and pathfinding efforts to create and implement Western Digital’s next-generation grid computing environment. You will be expected not only to deliver on technical requirements and solutions but also to present your solutions to senior management. Responsibilities include, but are not limited to, working as an individual contributor, a team member, and a technical team lead to explore, define, and pilot new solutions with little supervision, and developing solutions, scripts, and processes to automate the management of services and tools as required. In this role, you will collaborate closely with EDA and hardware design team stakeholders to define and deliver workload efficiency improvements in Western Digital’s EDA HPC infrastructure globally.

What you’ll be doing:

  • Support multi-site, high-performance compute infrastructure and services for the global engineering product development organizations
  • Design, create, deliver, and support the deployment of Ansible automation within HPC and Unix environments
  • Identify and propose solutions and new services for the distributed ASIC and GPU computing clusters
  • Perform troubleshooting and root cause analysis of HPC clusters and file system related issues
  • Develop and maintain documentation for all aspects of the HPC infrastructure
  • Improve root cause analysis and corrective action for problems large and small; identify patterns and propose ways to automate repetitive tasks
  • Recommend and implement solutions to improve the performance of workloads
  • Support a diverse Electronic Design Automation (EDA) environment

 

Tooling

  • GitHub
  • CI/CD (Jenkins, Terraform, Ansible)
  • Splunk, Grafana, Prometheus

Infrastructure

  • Kubernetes/OpenShift
  • Cloud Computing (AWS, Google Cloud, Azure)
  • Cloud Storage Systems (S3, FSx, CVO)
  • OS: Red Hat and related distributions
  • Containers (Singularity/Docker)

Qualifications

  • Bachelor’s degree in computer science or equivalent experience
  • 10+ years of Linux systems administration experience, specifically in managing or supporting Red Hat and/or CentOS Linux in production environments
  • Experience with configuration management tools: Ansible, Puppet, Chef
  • Experience with automation tools such as Terraform or other orchestration tools
  • Ability to technically lead a project through the lifecycle
  • Scripting skills: highly skilled in at least two typical scripting languages (shell/Bash, Python, Ruby)
  • Excellent problem-solving, multitasking, troubleshooting skills, and attention to detail are required to work in this challenging and dynamic environment
  • Very strong interpersonal, customer service, and team-building skills, with a results-oriented approach


Additional Information

Western Digital thrives on the power and potential of diversity. As a global company, we believe the most effective way to embrace the diversity of our customers and communities is to mirror it from within. We believe the fusion of various perspectives results in the best outcomes for our employees, our company, our customers, and the world around us. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.

Western Digital is committed to offering opportunities to applicants with disabilities and ensuring all candidates can successfully navigate our careers website and our hiring process. Please contact us at staffingsupport@wdc.com to advise us of your accommodation request. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.



  • Bengaluru, India Western Digital Full time

    Company Description At Western Digital, our vision is to power global innovation and push the boundaries of technology to make what you thought was once impossible, possible. At our core, Western Digital is a company of problem solvers. People achieve extraordinary things given the right technology. For decades, we’ve been doing just that. Our...


  • HPC/LSF Engineer

    1 day ago


    Bengaluru, India Arm Limited Full time

    Job Description Job Overview: The IT Infrastructure and Engineering group provides the high-performance compute environment that fuels product and solutions development for Arm's engineering community. Whether it's high-performance compute (HPC) on Arm’s on-prem infrastructure and/or in the cloud, Electronic Design Automation (EDA) tools, or customized...

  • HPC/LSF Engineer

    3 days ago


    Bengaluru, India Arm Limited Full time

    Job Description Job Overview: The IT Infrastructure and Engineering group provides the high-performance compute environment that fuels product and solutions development for Arm's engineering community. Whether it's high-performance compute (HPC) on Arm’s on-prem infrastructure and/or in the cloud, Electronic Design Automation (EDA) tools, or...

  • HPC Admin

    2 months ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Must Have - Drive innovative computational solutions and exploit emerging technologies (for example, converged HPC) - Must have experience of large-scale cluster and server computing and related software such as LSF, Slurm, Altair PBS Pro, etc. - Experience in implementing large-scale parallel filesystems like Spectrum Scale GPFS, Lustre, BeeGFS, Weka IO...

  • Sr DevOps Engineer

    2 months ago


    Bengaluru, India ANSR - Tech Full time

    About Us: ANSR is a global technology and consulting firm that provides end-to-end services and solutions to businesses across industries. The company was founded in 2015 and has its headquarters in Bangalore, India, with additional offices in the United States, Europe, and the Asia Pacific region. ANSR's expertise lies in providing digital transformation...

  • Sr DevOps Engineer

    2 months ago


    Bengaluru, Karnataka, India ANSR - Tech Full time

    About Us: ANSR is a global technology and consulting firm that provides end-to-end services and solutions to businesses across industries. The company was founded in 2015 and has its headquarters in Bangalore, India, with additional offices in the United States, Europe, and the Asia Pacific region. ANSR's expertise lies in providing digital transformation...


  • Bengaluru, India PradeepIT Consulting Services Pvt Ltd Full time

    Job Title: AWS DevOps Engineer Experience: 5 to 8 years Location: Remote Job Description: We are seeking a skilled DevOps Engineer with 5 to 8 years of experience to join our team. The ideal candidate will have a strong background in implementing DevOps practices, particularly with expertise in Liquibase Snowflake CI/CD implementation. This role requires...



  • AWS DevOps Engineer

    2 weeks ago


    Bengaluru, India Rapyder Cloud Solutions Pvt Ltd Full time

    Job Description of Sr DevOps Engineer. 1) Good hands-on experience in DevOps tools. 2) Sound understanding of OS (Linux, CentOS, Windows), OS hardening, and patching. 3) Experience designing and building environments on AWS, including services such as VPC, EC2, ELB, RDS, CloudWatch, Lambda, Step Functions, and S3. 4) At least two years' experience in...

  • AWS DevOps Engineer

    3 weeks ago


    Bengaluru, India Rapyder Cloud Solutions Pvt Ltd Full time

    Job Description of Sr DevOps Engineer. 1) Good hands-on experience in DevOps tools. 2) Sound understanding of OS (Linux, CentOS, Windows), OS hardening, and patching. 3) Experience designing and building environments on AWS, including services such as VPC, EC2, ELB, RDS, CloudWatch, Lambda, Step Functions, and S3. 4) At least two years' experience in...

  • AWS DevOps Engineer

    5 days ago


    Bengaluru, Karnataka, India PradeepIT Consulting Services Pvt Ltd Full time

    Job Title: AWS DevOps Engineer Experience: 5 to 8 years Location: Remote Job Description: We are seeking a skilled DevOps Engineer with 5 to 8 years of experience to join our team. The ideal candidate will have a strong background in implementing DevOps practices, particularly with expertise in Liquibase Snowflake CI/CD implementation. This role requires...


  • AWS DevOps Engineer

    4 weeks ago


    Bengaluru, India Rapyder Cloud Solutions Pvt Ltd Full time

    Job Description of Sr DevOps Engineer. 1) Good hands-on experience in DevOps tools. 2) Sound understanding of OS (Linux, CentOS, Windows), OS hardening, and patching. 3) Experience designing and building environments on AWS, including services such as VPC, EC2, ELB, RDS, CloudWatch, Lambda, Step Functions, and S3. 4) At least two years' experience in...

