AI Inference Kernel Engineer
1 week ago
We look forward to the day when AI can discover the next quantum AI accelerator or make RL much more compute-efficient. We want to enable AI to bootstrap its own intelligence and to discover new computational paradigms. Just as AlphaEvolve discovered a 23% speedup in Gemini's critical kernels and achieved 32.5% improvements in FlashAttention, we're building the infrastructure that will enable every AI model to optimize its own compute stack.

To automate algorithm and hardware discovery, we need to break the data barrier. CUDA is a low-resource language, and kernel optimization depends heavily on context and hardware details that models simply are not trained on. Phinity is building the canonical training data infrastructure that will enable agentic hardware engineering and optimization, which in turn will fuel algorithmic discovery. We are building environments in which agents learn to write kernels from a spec, optimize them for specific hardware, and eventually discover new hardware breakthroughs. Our customers include one of the largest frontier model labs.

We're seeking top engineers for a contractor role who can optimize hardware for model training and inference workloads and bake their industry experience into a model. This is a hybrid Systems Engineer/AI research role in which you will read and debug model reasoning traces and design the optimal CUDA problems to teach unreleased models to automate your work in industry. Please do not apply unless you have optimized kernels before.
Skill requirements:
- Languages: CUDA, C++, Python
- Frameworks: JAX/XLA, PyTorch, TensorFlow (at the C++ level), Pallas
- Libraries: cuBLAS, cuDNN, CUTLASS, CUB, Thrust
- Compiler Tools: NVCC, PTX assembly, MLIR/XLA understanding
- Hardware Knowledge: SM architecture, tensor cores, memory hierarchies (HBM, L2, shared, registers)

Apply if you have:
- Achieved >10x speedups on production ML workloads
- Written kernels that outperform vendor libraries
- Optimized attention, GEMM, or convolution at the assembly level
- Built custom fusions that beat XLA/Triton compiler output
- Published papers or open-source kernels used in production
-
Gen AI Inference Engineer
2 weeks ago
Chennai, India · artcube.ai (Artcube AI Pvt. Ltd.) · Full time
Job Description
Job Title: GenAI Inference Engineer (12 Years Experience)
Location: Chennai, India
Company: Artcube AI, Pioneers in GenAI for Virtual Product Placement
About Us: We are a next-generation AI company building proprietary models and intelligent algorithms for post-production product placement in TV/OTT and movies. Our GenAI models allow us to...
-
Manager, Kernel Software
2 weeks ago
Bengaluru, India · Cerebras Systems · Full time
Job Description: Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine...
-
AI Systems Engineer
4 weeks ago
Hyderabad, India · SEMI LEAF · Full time
Job Description
Job Title: AI Systems Engineer GPU/ROCm/CUDA | ML Frameworks Optimization
Location : : 3-6 [Mid-Senior]
Job Description: We are looking for a passionate and experienced AI Systems Engineer to join our team to work on next-generation Machine Learning technologies and optimize performance across AMD GPU accelerators. This role involves...
-
AI/ML Validation Engineer
4 weeks ago
Bengaluru, India · SEMI LEAF · Full time
Job Description
Required Skills
- Strong background in machine learning fundamentals, including deep learning, large language models, and recommender systems.
- Strong background in validation, defects, and the software development life cycle
- Strong knowledge of Ubuntu / Yocto Linux
- Experience working with open-source frameworks such as PyTorch, TensorFlow, and...
-
AI Engineer
3 weeks ago
India · TalentBridge · Full time
Job Title: AI Engineer
Job Type: 6-Month Contract, converting to full time after 6 months
Job Description: We are looking for an experienced AI/ML Engineer with 4–8 years of expertise in AI/ML solutions, specifically in building intelligent applications leveraging LLM orchestration tools. The ideal candidate will have hands-on experience with Semantic...
-
LLM Systems Performance Engineer
1 week ago
India · Phinity · Full time
We look forward to when AI can discover the next quantum AI accelerator, or when AI can make RL much more compute-efficient. We want to enable AI to bootstrap its own intelligence, to discover new computational paradigms. Just as AlphaEvolve discovered a 23% speedup in Gemini's critical kernels and achieved 32.5% improvements in FlashAttention, we're...
-
LLM Systems Performance Engineer
2 weeks ago
India · Phinity · Full time
We look forward to when AI can discover the next quantum AI accelerator, or when AI can make RL much more compute-efficient. We want to enable AI to bootstrap its own intelligence, to discover new computational paradigms. Just as AlphaEvolve discovered a 23% speedup in Gemini's critical kernels and achieved 32.5% improvements in FlashAttention, we're...
-
AI Engineer
3 weeks ago
India, IN · TalentBridge · Full time
Job Title: AI Engineer
Job Type: 6-Month Contract, converting to full time after 6 months
Job Description: We are looking for an experienced AI/ML Engineer with 4–8 years of expertise in AI/ML solutions, specifically in building intelligent applications leveraging LLM orchestration tools. The ideal candidate will have hands-on experience with Semantic...