
Sr Systems Engineer Linux – AI Infrastructure
3 weeks ago
Key Responsibilities
- Support deployment and maintenance of NVIDIA GPU-accelerated systems.
- Deploy and support Kubernetes clusters across various environments and distros (e.g., RKE, OpenShift, AKS, EKS, GKE).
- Perform day-to-day system administration across compute, storage, and networking layers.
- Automate infrastructure tasks using Shell scripts, Ansible, or similar tools.
- Collaborate with DevOps, data science, and engineering teams to ensure scalable, resilient infrastructure for AI/ML workloads.
- Monitor infrastructure health and performance; participate in troubleshooting and root cause analysis.
Qualifications
• Extensive experience managing, configuring, and troubleshooting Linux-based systems (e.g., RHEL, Ubuntu, CentOS, Debian) in enterprise environments, including kernel tuning, system monitoring, and performance optimization.
• Hands-on experience deploying and configuring Linux servers for AI/ML applications, including setup of GPU-accelerated environments, storage optimization for large datasets (e.g., using RAID, LVM), and ensuring system stability under intensive computational loads. Training on NVIDIA technologies will be provided.
• Expertise in tuning Linux systems for performance, including CPU/GPU resource allocation, memory management, and I/O optimization, tailored to on-premises setups handling AI/ML training and inference workloads.
• Proven ability to diagnose and resolve complex problems in Linux environments, such as hardware failures, network bottlenecks, or software conflicts, with an emphasis on minimizing downtime in mission-critical on-premises AI/ML systems.
• Min 7 years of experience in systems engineering or enterprise infrastructure roles.
• Understanding of enterprise storage, networking, and system monitoring tools.
• Scripting and automation experience (e.g., Bash, Python, Ansible).
• Strong communication, documentation, and troubleshooting skills.
• Comfortable working independently in a remote environment.
-
Sr Systems Engineer Linux – AI Infrastructure
4 weeks ago
India DC Tech Consulting Full time
We are seeking a highly skilled Senior Linux Administrator to join our team, focusing on the implementation and management of on-premises Linux servers optimized for AI/ML workloads. The ideal candidate will have deep expertise in core Linux system administration, with a strong foundation in configuring and optimizing servers for high-performance computing...
-
Senior AI Engineer
4 weeks ago
India BugRaid AI Full time
Company Description: Bug Raid.AI harnesses advanced AIOps and AI bots to proactively manage and respond to incidents, revolutionizing the entire process. Our innovative solution integrates comprehensive incident analysis with real-time response capabilities, distinguishing us within the industry. We expedite resolution by swiftly identifying and addressing...
-
Backend AI Engineer
3 weeks ago
India Coderbotics AI Full time
Company Description: Coderbotics AI is a team of passionate tech enthusiasts dedicated to revolutionizing the way software evolves. We specialize in AI-powered code migration solutions, helping businesses seamlessly transition legacy systems, refactor codebases, and manage technical debt with speed and precision. Our advanced technology and expert team ensure...
-
India Scubyt Full time
We're hiring NCP-Certified Engineers. Join us as a Network (AIN), Deployment (AII), or Operations (AIO) Engineer and help power next-gen AI infrastructure with NVIDIA H100 racks. Apply now to be part of cutting-edge AI deployments and scalable data center innovation.
1. Network Design & Installation Engineer (NCP-AIN Certified)
Location: India REMOTE
Duration:...
-
Backend AI Engineer
4 weeks ago
India Coderbotics AI Full time
Company Description: Coderbotics AI is a team of passionate tech enthusiasts dedicated to revolutionizing the way software evolves. We specialize in AI-powered code migration solutions, helping businesses seamlessly transition legacy systems, refactor codebases, and manage technical debt with speed and precision. Our advanced technology and expert team ensure...
-
System Engineer
3 weeks ago
India WTMF AI Full time
Position: System Engineer
Location: Remote
Type: Full-time
Experience: 1–3 years (but problem-solving instincts matter more than years on paper)
At WTMF, we're building something more than just another app: we're creating an emotionally intelligent space where people feel heard, understood, and safe. Behind all the AI magic and mood-matching conversations,...
-
Linux System Admin IRC248259
2 weeks ago
India GlobalLogic Full time
Description: Linux Administration
Experience Range: 6 years
Requirements
Education: Bachelor's degree in Computer Science, Information Technology, or a related field; Master's degree preferred
· In-depth knowledge of Linux (RedHat, CentOS, Debian, etc.)
· Hands-on experience with MySQL or a related database
· Experience with server hardware and...
-
Linux System Administrator
2 weeks ago
India ZettaMine Labs Pvt. Ltd. Full time
Hello, Greetings from ZettaMine. Hiring for Linux Server Administration.
Exp: 3 to 8 Years
Location: Hyderabad, Bangalore
Immediate joiners only
Job Description: Overall 3–8 years' experience in Server Management Administration (Linux). Experience in server (physical/VM) installation, maintenance & decommissioning. Profound Linux OS...
-
Sr System Development Engineer, AFT
2 weeks ago
India Amazon Music Full time
Job Description
The Amazon Fulfillment Technologies team in Hyderabad is looking for a Sr System Development Engineer to manage all aspects of mission-critical services. Our team of engineers innovates, automates, drives process and service improvements, and manages highly available systems that power the Amazon fulfillment network worldwide. The ideal...
-
Senior SRE Engineer
4 weeks ago
India BugRaid AI Full time
Company Description: Bug Raid.AI adopts advanced AIOps and AI bots for proactive incident management and response, transforming the entirety of the process. By integrating sophisticated AIOps for comprehensive incident analysis with AI bots for immediate response, Bug Raid.AI provides automated and intelligent incident handling. Our platform enables...