GPU Infrastructure

5 days ago


Hyderabad, India · PhoQtek labs · Full time

About the Role

We are seeking a highly skilled IT Solutions & GPU Infrastructure Lead to take complete ownership of our GPU-based server infrastructure. This role focuses on next-generation GPU systems used for AI/ML workloads, covering every aspect from data center colocation and setup to GPU slicing, MIG management, resource allocation, optimization, and compliance. You will lead the end-to-end lifecycle of GPU infrastructure, ensuring all servers are optimized, secure, and production-ready for both internal and customer use.

Key Responsibilities

1. Colocation & Infrastructure Setup
  • Take full ownership of GPU colocation and end-to-end infrastructure setup.
  • Coordinate with data centers for rack installation, power, and cooling.
  • Deploy and configure GPU-based servers for production readiness.

2. GPU & AI/ML Infrastructure
  • Manage GPU slicing and MIG (Multi-Instance GPU) for multi-tenant workloads.
  • Install and maintain the NVIDIA software stack: CUDA, cuDNN, NCCL, and DCGM.
  • Optimize GPU infrastructure for AI/ML workloads (TensorFlow, PyTorch, RAPIDS).
  • Support multi-GPU scaling using NVLink and PCIe passthrough.

3. Systems & Virtualization
  • Administer Linux-based environments (Ubuntu, CentOS, Rocky) alongside other environments.
  • Manage virtualization platforms such as VMware, KVM, or Proxmox with GPU passthrough.
  • Handle containerization and orchestration with Docker and Kubernetes (NVIDIA GPU Operator).
  • Integrate high-performance storage (NFS, Ceph, SAN/NAS) for large-scale datasets.

4. Monitoring & Performance Optimization
  • Monitor GPU and system performance using Prometheus, Grafana, NVIDIA DCGM, and nvidia-smi.
  • Proactively detect, analyze, and resolve GPU or system bottlenecks.
  • Optimize GPU nodes for training and inference performance.
  • Implement structured logging, alerts, and usage reporting.
  • Administer, manage, monitor, and maintain GPU infrastructure for AI workloads.

5. Security & Compliance
  • Harden GPU servers for multi-tenant workloads.
  • Manage driver, firmware, and software license compliance.
  • Ensure infrastructure security and audit readiness with periodic patching and updates.

6. Networking & High-Performance I/O
  • Configure and maintain high-speed network fabrics (InfiniBand, RDMA, RoCE).
  • Optimize low-latency interconnects for distributed GPU workloads.
  • Troubleshoot and enhance data transfer performance.

7. Customer & Infrastructure Ownership
  • Serve as the primary contact for GPU resource allocation.
  • Provision GPU slices or MIG instances for internal and external teams.
  • Troubleshoot, document, and optimize workload performance.

Qualifications
  • Proven experience in data center server setup and colocation.
  • Deep expertise in GPU server administration (NVIDIA A100/H100 or equivalent).
  • Strong working knowledge of GPU slicing, MIG, CUDA, NCCL, and NVIDIA drivers.
  • Experience with Linux administration, virtualization (VMware/KVM/Proxmox), and containers (Docker/Kubernetes).
  • Hands-on experience with AI/ML frameworks such as TensorFlow and PyTorch.
  • Familiarity with monitoring tools (Prometheus, Grafana, DCGM).
  • Knowledge of storage systems (NFS, Ceph) and high-performance networking.
  • Strong vendor coordination and infrastructure management skills.

Why This Role Matters

This position owns the entire lifecycle of GPU-based infrastructure, from colocation to slicing, monitoring, and optimization. You will build and maintain the backbone of our AI/ML infrastructure, ensuring that all systems are efficient, scalable, and production-grade.



