
GPU Engineer
15 hours ago
Role & responsibilities
Job Summary
We are seeking a highly skilled GPU Infrastructure Engineer to join our team. This role focuses on the design, implementation, and management of enterprise network and cloud-based infrastructure to support evolving Azure cloud needs. The ideal candidate will have a strong background in software, network, or systems engineering, along with hands-on experience in managing large-scale cloud and data center operations.
Responsibilities
- Respond to incidents during regular on-call rotations and resolve issues efficiently to minimize downtime.
- Design and plan scalable GPU infrastructure solutions to meet organizational capacity and performance needs.
- Collaborate with cross-functional teams to define and implement GPU infrastructure architecture that aligns with business objectives.
- Evaluate GPU technologies and recommend the best hardware and software configurations.
- Configure and deploy GPU servers, including installation and setup of hardware, software, and networking components.
- Coordinate with vendors for procurement and installation of GPUs and related infrastructure.
- Implement and manage GPU clustering setups for compute-intensive tasks.
- Utilize monitoring tools to assess GPU performance metrics and system health.
- Conduct benchmarking tests and analyze the results to identify performance bottlenecks.
- Optimize workload distribution across GPU resources to ensure maximum efficiency.
- Provide expert troubleshooting support, helping team members report and resolve GPU-related issues.
- Maintain incident response protocols to address hardware and software failures swiftly and effectively.
- Develop FAQs and knowledge base articles to streamline support processes for internal users.
Infrastructure Maintenance:
- Schedule and perform routine maintenance, including updates to software, firmware, and drivers related to GPU systems.
- Plan and execute capacity upgrades and expansions as needed, ensuring minimal disruption to services.
- Conduct post-mortem analyses on significant incidents to improve overall system reliability.
- Write scripts (e.g., in Python or Bash) to automate deployment, configuration management, and system monitoring tasks.
- Develop tools that increase productivity for engineering and data science teams using GPUs.
- Implement Infrastructure as Code (IaC) practices for efficient and repeatable deployments.
Requirements
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
Technical Experience:
- Proven expertise in software engineering, network engineering, or systems administration.
- Hands-on experience managing and debugging cloud backend servers, networking infrastructure, and services.
- Strong understanding of enterprise network and cloud-based architectures, including experience working with Cisco and Azure.
- Experience with cloud platforms providing GPU services (e.g., AWS, Google Cloud, Azure).
- Understanding of virtualization and container technologies (e.g., Docker, Kubernetes) and server orchestration tools.
- Knowledge of network configurations and storage solutions used in GPU environments.
- Strong understanding of GPU architectures and compute platforms (NVIDIA CUDA, AMD ROCm, etc.).
- Experience with AI/ML workloads, HPC, or rendering applications.
- Familiarity with PCIe, memory subsystems (DDR, HBM), and high-speed I/O.
- Understanding of Azure Pipelines and Azure DevOps.
- Demonstrated knowledge of deploying servers and network infrastructure equipment at scale.
Specialized Skills:
- Experience working with GPU hardware or related system engineering.
- Experience with:
  - Data center architecture and cloud infrastructure.
  - Network infrastructure design and management in hybrid environments.
- Certifications in relevant technologies such as:
  - Cisco (e.g., CCNA/CCNP).
  - AZ-900 (Mandatory), AZ-104 (Optional).
  - OCI Foundations Associate (Optional).
  - ITIL or equivalent certifications (Optional).