Sr. HPC Engineer

2 months ago


Bengaluru, India Western Digital Full time
Job Description

Western Digital’s High-Performance Computing environments are key to bringing new storage solutions to market. As a Senior High-Performance Computing (HPC) engineer in the IT Infrastructure team, you will be at the heart of Western Digital’s engineering and product development process, delivering the IT HPC infrastructure and services that empower engineering teams to develop new storage technologies and deliver high-quality products to market quickly.

As a member of the HPC as a Service (HPCaaS) team, you will be responsible for establishing and executing strategic objectives focused on improving the effective utilization of compute resources while meeting or exceeding customer service level agreements for job prioritization, job concurrency, and job throughput in our EDA compute clusters. This includes leading architectural innovation and pathfinding efforts to create and implement Western Digital’s next-generation grid computing environment. You will be expected not only to deliver on technical requirements and solutions but also to present your solutions to senior management. Responsibilities include, but are not limited to, working as an individual contributor, a team member, and a technical team lead to explore, define, and pilot new solutions with little supervision, and developing solutions, scripts, and processes to automate the management of services and tools as required. In this role, you will collaborate closely with EDA and hardware design team stakeholders to define and deliver workload efficiency improvements in Western Digital’s EDA HPC infrastructure globally.

What you’ll be doing:

  • Support multi-site, high-performance compute infrastructure and services for the global engineering product development organizations
  • Design, create, deliver, and support the deployment of Ansible automation within HPC and Unix environments
  • Identify and propose solutions and new services for the distributed ASIC and GPU computing clusters
  • Perform troubleshooting and root cause analysis of HPC clusters and file system related issues
  • Develop and maintain documentation for all aspects of the HPC infrastructure
  • Improve root cause analysis and corrective action for problems large and small: identify patterns and propose ways to automate repetitive tasks
  • Recommend and implement solutions to improve the performance of workloads
  • Support a diverse Electronic Design Automation (EDA) environment

 

Tooling

  • GitHub
  • Terraform, Ansible
  • Splunk, Grafana, Prometheus

Infrastructure

  • OS: Red Hat and any related distribution
  • Monitoring tools such as Nagios, Cacti, or an equivalent
  • PXE/Kickstart configuration
  • NFS storage management & automounter
  • Installation and support of EDA tools such as Cadence and Synopsys
  • Installation and support of open-source tools
  • Unix/Linux authentication with AD
  • Infrastructure automation with scripting knowledge 

Qualifications

  • Bachelor’s degree in computer science or equivalent experience
  • 10+ years of Linux systems administration experience, specifically managing or supporting Red Hat and/or CentOS Linux in production environments
  • Experience with configuration management tools: Ansible, Puppet, Chef
  • Experience with automation
  • Ability to technically lead a project through the lifecycle
  • Scripting skills: highly skilled in at least two common scripting languages (shell/Bash, Python, Ruby)
  • Excellent problem-solving, multitasking, troubleshooting skills, and attention to detail are required to work in this challenging and dynamic environment
  • Very strong interpersonal, customer service, and team-building skills, with a results-oriented approach


Additional Information

Western Digital thrives on the power and potential of diversity. As a global company, we believe the most effective way to embrace the diversity of our customers and communities is to mirror it from within. We believe the fusion of various perspectives results in the best outcomes for our employees, our company, our customers, and the world around us. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.

Western Digital is committed to offering opportunities to applicants with disabilities and ensuring all candidates can successfully navigate our careers website and our hiring process. Please contact us at staffingsupport@wdc.com to advise us of your accommodation request. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.


  • HPC/LSF Engineer

    3 months ago


    Bengaluru, India Arm Limited Full time

    Job Overview: The IT Infrastructure and Engineering group provides the high-performance compute environment that fuels product and solutions development for Arm's engineering community. Whether it's high-performance compute (HPC) on Arm’s on-prem infrastructure and/or in the cloud, Electronic Design Automation (EDA) tools, or...


  • Bengaluru, India BosonQ Psi (BQP) Full time

    High-Performance Computing (HPC) Engineer. Position Overview: We are seeking a highly skilled and motivated HPC Engineer to join our dynamic team. The successful candidate will play a crucial role in extending our HPC capabilities and driving new ventures within BQP. The ideal candidate will have a strong background in developing physics solvers, extensive...

  • HPC Admin

    4 months ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Must Have - Drive innovative computational solutions and exploit emerging technologies (for example, Converged HPC) - Must have experience of large-scale cluster and server computing and related software like LSF, Slurm, Altair PBS Pro etc. - Experience in implementing large-scale parallel filesystems like Spectrum Scale GPFS & Lustre, BeeGFS, Weka IO...

  • HPC

    5 months ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Drive innovative computational solutions and exploit emerging technologies (for example, Converged HPC) - Must have experience of large-scale cluster and server computing and related software like LSF, Slurm, Altair PBS Pro etc. - Experience in implementing large-scale parallel filesystems like Spectrum Scale GPFS & Lustre, BeeGFS, Weka IO etc. - Working...


  • Bengaluru/Bangalore, India ExxonMobil Company India Pvt Ltd Full time

    Apply for HPC Technical Software Engineer, Career Progress Consultants, in Bengaluru/Bangalore for 3-5 years of experience on TimesJobs.com.


  • Bengaluru, India NVIDIA Full time

    NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 fueled the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI and enabled the next era of computing. NVIDIA is a “learning machine” that constantly evolves...


  • Bengaluru, India ExxonMobil Corporation Full time

    What you will do: Work on a scrum team as a software developer to develop and support proprietary applications used for seismic imaging. Collaborate closely with researchers to enable their research and to commercialize prototype research code. Support internal business partners that are globally distributed. Optimize applications...

  • HPC Expert

    2 months ago


    Bengaluru, India Hewlett Packard Enterprise Full time

    This role has been designated as ‘Edge’, which means you will primarily work outside of an HPE office. Who We Are: High Performance Computing, AI and Labs is a critical element of HPE. We are focused on delivering innovative solutions that accelerate our customers’ digital transformation, enabling them to tackle their complex, and...


  • Bengaluru, India Lenovo Full time

    Lenovo HPC PS team is looking for a Consultant for Power and Cooling based in North America (NA) to join its Global HPC Professional Services team. This position represents a rewarding opportunity to use and expand your experience and skills in a dynamic, complex organization and to deliver across our major customer sites. Your Responsibilities: Consultant...


  • Bengaluru, India Lenovo Full time

    Lenovo HPC PS team is looking for a Consultant for Power and Cooling based in North America (NA) to join its Global HPC Professional Services team. This position represents a rewarding opportunity to use and expand your experience and skills in a dynamic, complex organization and to deliver across our major customer sites. Your Responsibilities: Consultant...


  • Bengaluru, India Lenovo Full time

    Lenovo HPC PS team is looking for a Consultant for Power and Cooling based in North America (NA) to join its Global HPC Professional Services team. This position represents a rewarding opportunity to use and expand your experience and skills in a dynamic, complex organization and to deliver across our major customer sites. Your Responsibilities: ...


  • Bengaluru, India ExxonMobil Full time

    About us: At ExxonMobil, our vision is to lead in energy innovations that advance modern living and a net-zero future. As one of the world’s largest publicly traded energy and chemical companies, we are powered by a unique and diverse workforce fueled by the pride in what we do and what we stand for. The success of our Upstream, Product Solutions and...


  • Bengaluru, India ExxonMobil Full time

    About us: At ExxonMobil, our vision is to lead in energy innovations that advance modern living and a net-zero future. As one of the world’s largest publicly traded energy and chemical companies, we are powered by a unique and diverse workforce fueled by the pride in what we do and what we stand for. The success of our Upstream,...


  • Bengaluru, India Lenovo Full time

    Description and Requirements: Lenovo HPC PS team is looking for a Consultant for Power and Cooling based in North America (NA) to join its Global HPC Professional Services team. This position represents a rewarding opportunity to use and expand your experience and skills in a dynamic, complex organization and to deliver across our major customer...


  • Bengaluru, India ExxonMobil Full time

    About us: At ExxonMobil, our vision is to lead in energy innovations that advance modern living and a net-zero future. As one of the world’s largest publicly traded energy and chemical companies, we are powered by a unique and diverse workforce fueled by the pride in what we do and what we stand for. The success of our Upstream,...


  • Bengaluru, India ExxonMobil Full time

    About us: At ExxonMobil, our vision is to lead in energy innovations that advance modern living and a net-zero future. As one of the world’s largest publicly traded energy and chemical companies, we are powered by a unique and diverse workforce fueled by the pride in what we do and what we stand for. The success of our Upstream, Product Solutions and...


  • Bengaluru, India L&T Technology Services Full time

    Job Description. SME: EE Architecture – Automotive Ethernet, Zonal Controller, HPC. Education: Bachelor’s or Master’s in EEE/ECE/EI. Exp Level: 10-15 yrs. Location: Bangalore or Mys. Exp in understanding vehicle EE architectures (zonal architectures, HPCs, sensors and actuators) and system engineering. Exp in in-vehicle communication and gateways. Strong...


  • Bengaluru, India L&T Technology Services Full time

    Job Description. SME: EE Architecture – Automotive Ethernet, Zonal Controller, HPC. Education: Bachelor’s or Master’s in EEE/ECE/EI. Exp Level: 10-15 yrs. Location: Bangalore or Mys. Exp in understanding vehicle EE architectures (zonal architectures, HPCs, sensors and actuators) and system engineering. Exp in in-vehicle communication and gateways. Strong...


  • Bengaluru, India L&T Technology Services Full time

    Job Description. SME: EE Architecture – Automotive Ethernet, Zonal Controller, HPC. Education: Bachelor’s or Master’s in EEE/ECE/EI. Exp Level: 10-15 yrs. Location: Bangalore or Mys. Exp in understanding vehicle EE architectures (zonal architectures, HPCs, sensors and actuators) and system engineering. Exp in in-vehicle communication and gateways....


  • Bengaluru, India ExxonMobil Corporation Full time

    What you will do: Primary Job Functions: Establish strategies for overall support of the system. Evaluate new hardware and software and understand potential benefits/impacts it can have in the environment. Perform hardware maintenance. Perform software installations and upgrades, inclusive of operating system. Monitor overall system performance...