BitOoda | Systems/Network Engineer – High-Performance Compute GPU Infrastructure | India
1 day ago
Key Responsibilities:

System Optimization
- Configure and optimize bare-metal servers, including Linux OS, NVIDIA/AMD GPU drivers, and system libraries.
- Fine-tune NUMA settings, CPU-GPU affinity, and storage I/O for peak performance.
- Benchmark and tune HPC systems for specific workloads, ensuring sustained high performance.

GPU Cluster Management
- Deploy and manage GPU clusters using job orchestration tools like Kubernetes, Slurm, or similar platforms.
- Monitor GPU utilization, thermals, and overall system health using tools like NVIDIA DCGM, ROCm, and Prometheus/Grafana.

Networking
- Design and maintain high-speed networking solutions (e.g., NVLink, InfiniBand, RDMA) for distributed GPU systems.
- Optimize data transfer between nodes and reduce latency in cluster communication.

Storage Solutions
- Manage and configure storage solutions such as NVMe, SSD arrays, Ceph, or Lustre for high-throughput workloads.

Automation
- Automate system deployment, updates, and monitoring using tools like Ansible, Terraform, or Python scripts.

Security
- Implement secure access controls, firewalls, and VPNs to protect GPU resources and user data.
- Ensure compliance with security best practices for HPC environments.

Hybrid/Cloud Integration
- Manage integrations between on-premise GPU clusters and cloud platforms (e.g., AWS, GCP, Azure).
- Build and maintain hybrid HPC setups for seamless scalability.

Data Center Infrastructure
- Work on power, cooling, and rack design for HPC setups, ensuring reliable and efficient operations.
- Deploy and maintain systems in on-premise or hybrid cloud data center environments.
Required Qualifications:

Technical Skills
- Strong experience with Linux (CentOS, Ubuntu, RHEL) and system-level configuration.
- Expertise in managing NVIDIA GPU ecosystems (CUDA, NVLink, NVIDIA drivers).
- Familiarity with AMD ROCm, HIP, or OpenCL for AMD GPUs.
- Knowledge of high-speed networking protocols (InfiniBand, RDMA, Ethernet).
- Proficiency in scripting and automation (Python, Bash, Ansible, Terraform).
- Experience with job orchestration tools like Kubernetes or Slurm.
- Familiarity with containerization (Docker, NVIDIA Docker, Singularity).
- Understanding of storage technologies, including NVMe and parallel file systems.

Soft Skills
- Strong analytical and problem-solving skills.
- Ability to work independently and as part of a remote team.
- Excellent communication skills for cross-team collaboration.

Preferred Qualifications:
- Experience with hybrid cloud setups, including AWS Outposts, Azure Stack, or GCP Anthos.
- Hands-on experience with hardware management tools like IPMI/BMC for remote server management.
- Familiarity with emerging accelerators (e.g., SambaNova, Cerebras, Graphcore).
What We Offer:
- Competitive salary and benefits package.
- Work with a talented and collaborative team of engineers.
- Opportunities to work on cutting-edge GPU and HPC projects.
- A flexible and dynamic startup environment where you can grow and innovate.
- Opportunities for professional development and continuous learning.
-
Delhi, India BitOoda Full time
Role Overview: As a Systems/Network Engineer, you will be responsible for architecting, deploying, and maintaining GPU-based compute infrastructure. You will work on bare-metal systems, high-speed networks, and hybrid cloud integrations to ensure maximum performance, reliability, and scalability. This role is primarily remote but may occasionally require...
-
High-Performance Compute Engineer
8 hours ago
Delhi, Delhi, India BitOoda Full time
Job Overview: As a Systems/Network Engineer, you will be responsible for architecting, deploying, and maintaining high-performance compute infrastructure leveraging NVIDIA GPUs. This role involves working on bare-metal systems, high-speed networks, and hybrid cloud integrations to ensure maximum performance, reliability, and scalability. Key...
-
GPU Optimization Engineer
13 hours ago
Delhi, India BitOoda Full time
Job Posting: GPU Optimization Engineer (Bare Metal Expertise)
Location: Remote
Job Type: Full-Time
About Us: We are an innovative company at the forefront of high-performance computing (HPC) and AI, building cutting-edge solutions powered by GPUs and specialized accelerators. We’re looking for a highly skilled GPU Optimization Engineer to design, develop, and...
-
High-Performance Data Center Specialist
1 day ago
Delhi, Delhi, India Vivekananda Institute of Professional Studies Full time
About the Job: At Vivekananda Institute of Professional Studies, we are seeking a highly skilled and dedicated Data Center Engineer (NVIDIA Specialist) to join our team. This role involves the management, optimization, and maintenance of data center hardware and systems, with a specific focus on NVIDIA technologies such as GPUs and AI/ML infrastructure. Key...
-
Senior Systems Engineer
3 weeks ago
Delhi, India DC Tech Consulting Full time
Job Profile: Senior Systems Engineer - Kubernetes & Linux Platform
Summary: An experienced Systems Engineer with over 10 years of specialized expertise in Linux platforms, Kubernetes cluster management, and advanced troubleshooting. Skilled in Kubernetes Day 2 operations, Linux networking, Linux storage, and NVIDIA GPU configurations within Kubernetes...
-
E2E Networks
1 week ago
Delhi NCR, India E2E Networks Limited Full time
Job Title: Technical Lead - Python Developer
Experience Required: 5-8 Years
Job Summary: We are seeking a highly skilled Tech Lead with 5-8 years of experience to join our dynamic team. The ideal candidate will have a strong background in software development, excellent leadership capabilities, and the ability to oversee and guide a team of developers....
-
High-Performance Infrastructure Engineer
8 hours ago
Delhi, Delhi, India LinkedIn Full time
As a Cloud-Native Systems Developer at LinkedIn, you will play a crucial role in building the next-generation infrastructure platforms. With a focus on information retrieval (IR), you will be part of a high-performing team that develops distributed databases built using Rust to support multiple retrieval use cases.
Key Responsibilities: Design and build highly...
-
Delhi, India ClearML Full time
Information Technology Manager, AI Computing
Company Description: ClearML is a unified, open source platform for continuous AI/ML, trusted by forward-thinking Data Scientists, ML Engineers, DevOps, and decision makers at leading Fortune 500, enterprises, academia, and innovative start-ups worldwide. We enable customers to achieve the fastest time to production,...
-
Delhi, India Sakar Robotics Full time
Company Description: Sakar Robotics is a dynamic and innovative company located in Pune, dedicated to revolutionizing the construction industry. Our mission is to provide cutting-edge solutions that transform construction activities and drive innovation in the field. We are seeking a skilled Senior Computer Vision Engineer with at least 3 years of hands-on...
-
High-Performance Backend Systems Engineer
2 weeks ago
Delhi, Delhi, India Tykhe Inc Full time
Job Title: High-Performance Backend Systems Engineer
About Us: Tykhe Inc is a cutting-edge company at the forefront of Generative Artificial Intelligence (GenAI). We're seeking an exceptional Product/Software Engineer-Backend to join our team in shaping the future of GenAI. This role offers exciting opportunities to work closely with cross-functional teams and...
-
Delhi, Delhi, India Mulya Technologies Full time
Mulya Technologies Seeks Experienced Professional
We are currently looking for a highly skilled Senior Microarchitecture Designer for High-Performance Systems to join our team at Mulya Technologies.
About the Role: Design and integrate high-performance System on Chip, architecting SoCs for power, performance, and area efficiency. Develop microarchitecture and...
-
Delhi, India Vivekananda Institute of Professional Studies Full time
About the Job
Title: Data Centre Engineer (NVIDIA Specialist)
Reports to: Director General
Location: VIPS Campus, Delhi
Apply by: 20th December, 2024
About VIPS:
Summary: We are seeking a highly skilled and dedicated Data Center Engineer (NVIDIA Specialist) to join our team. This role involves the management, optimization, and maintenance of data center hardware and...
-
Network Engineer
3 weeks ago
New Delhi, India Esconet Technologies Full time
Role: Network Engineer
Work Experience Required: Minimum 3 Years as a network engineer
Location: Okhla Phase 1, New Delhi
About us: Founded in New Delhi, India, Esconet Technologies Limited is a leading name in IT Infrastructure solution sales and services. Formerly known as Esconet Technologies Pvt. Ltd., we transitioned to a public company in September...
-
High-Performance AI Developer
2 weeks ago
Delhi, Delhi, India AryaXAI Full time
AryaXAI is a pioneer in AI innovation, driving the development of explainable, safe, and aligned systems for mission-critical businesses. We are seeking a highly skilled High-Performance AI Developer to join our team and push the boundaries of high-performance AI computation. In this role, you will design, develop, and optimize GPU kernels that power...
-
Delhi, Delhi, India Mulya Technologies Full time
High-Performance SoC Design Engineer
We are seeking a highly skilled Senior ASIC Design Engineer to join our team at Mulya Technologies in Santa Clara, California.
About the Role: We are looking for candidates with a background in Arm IP, specifically CHI, CMN, and Arm CPUs. The ideal candidate will have experience designing and integrating...
-
Network Engineer
1 day ago
Delhi, India Esconet Technologies Full time
Role: Network Engineer
Work Experience Required: Minimum 4 Years as a network engineer
Location: Okhla Phase 1, New Delhi
About us: Founded in New Delhi, India, Esconet Technologies Limited is a leading name in IT Infrastructure solution sales and services. Formerly known as Esconet Technologies Pvt. Ltd., we transitioned to a public company in September 2023....