LLM Systems Performance Engineer

1 week ago


India Phinity Full time

We look forward to the day when AI can discover the next quantum AI accelerator or make RL much more compute-efficient. We want to enable AI to bootstrap its own intelligence and to discover new computational paradigms. Just as AlphaEvolve discovered a 23% speedup in Gemini's critical kernels and achieved 32.5% improvements in FlashAttention, we're building the infrastructure that will enable every AI model to optimize its own compute stack. To automate algorithm and hardware discovery, however, we need to break the data barrier. CUDA is a low-resource language, and kernel optimization depends heavily on context and hardware that models simply are not trained on.

Phinity is building the canonical training data infrastructure that will enable agentic hardware engineering and optimization, which will in turn fuel algorithmic discovery. We are building environments in which agents learn to write kernels from a spec, optimize them on specific hardware, and eventually discover new hardware breakthroughs. Our customers include one of the largest frontier model labs.

We're seeking top engineers for a contractor role who can optimize hardware for model training and inference workloads and who can bake their industry experience into a model. This is a hybrid Systems Engineer/AI research role in which you will review and debug model reasoning traces and design the optimal CUDA problems to teach unreleased models to automate your work in industry.
Please do not apply unless you have optimized kernels before.

Skill requirements:

  • Languages: CUDA, C++, Python
  • Frameworks: JAX/XLA, PyTorch, TensorFlow (at the C++ level), Pallas
  • Libraries: cuBLAS, cuDNN, CUTLASS, CUB, Thrust
  • Compiler Tools: NVCC, PTX assembly, MLIR/XLA understanding
  • Hardware Knowledge: SM architecture, tensor cores, memory hierarchies (HBM, L2, shared, registers)

Apply if you have:

  • Achieved >10x speedups on production ML workloads
  • Written kernels that outperform vendor libraries
  • Optimized attention, GEMM, or convolution at the assembly level
  • Built custom fusions that beat XLA/Triton compiler output
  • Published papers or open-source kernels used in production


  • Shro Systems

    4 weeks ago


    Pune, India Shro Systems Pvt. Ltd. Full time

    Job Description We are looking for a highly skilled Generative AI Developer with expertise in Large Language Models (LLMs) to join our AI/ML innovation team. The ideal candidate will be responsible for building, fine-tuning, deploying, and optimizing generative AI models to solve complex real-world problems. Responsibilities You will collaborate with data...


  • India Trident Consulting Full time

    Job Description Trident Consulting is looking for a Distinguished LLM Engineer - Chennai/ Tirunelveli/ Coimbatore. Role: Distinguished LLM Engineer Location: Chennai/ Tirunelveli/ Coimbatore Type: Full time Salary: Depends on your experience and the current market rate. Do you want to use your AI expertise to drive real-world impact? We're hiring a Distinguished...

  • LLM Engineer

    3 weeks ago


    Noida, India Algoscale Full time

    Job Description Location: Noida (WFO) Timings: 10:30 AM to 7:30 PM; Mon-Fri About the Role We are seeking a talented LLM (Large Language Model) Engineer to join our growing AI/ML team. The ideal candidate will have hands-on experience working with modern natural language processing (NLP) systems, large-scale model training, fine-tuning, and deployment. You...

  • Senior AI Engineer

    4 weeks ago


    Mumbai, India The Goodstack Company Full time

    Job Description An SF based startup is looking to hire a Senior AI Engineer. The role is fully remote. Here's What The Day-to-day Responsibilities Will Look Like - - Design and implement conversational AI agents using LLMs - Build RAG pipelines to process e-commerce data - Optimize prompt engineering for accuracy and cost efficiency - Create data ingestion...


  • India Jobgether Full time

    Job Description This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior AI/LLM Engineer - Agentic Systems in India. As a Senior AI/LLM Engineer, you will be at the forefront of designing and building intelligent, self-directed AI agents that transform how enterprises leverage AI. You will work in a highly...



  • Senior LLM Engineer

    3 weeks ago


    Bengaluru, Karnataka, India, Karnataka RingCentral Full time

    Job Description: We are seeking an experienced AI Engineer with a strong background in Natural Language Understanding (NLU) who is passionate about pushing the boundaries of Conversational AI. In this role, you will design, develop, and deploy scalable AI solutions leveraging LLMs, Retrieval-Augmented Generation (RAG), and prompt engineering techniques to...


  • Pune, Maharashtra, India, Maharashtra Rapid7 Full time

    Principal LLM Engineer. Join Rapid7: Secure the Future with AI. Are you ready to lead the charge in integrating cutting-edge Large Language Models (LLMs) into world-class Cyber Security products? Rapid7 is looking for a Principal LLM Engineer with a rare combination of deep Data Science expertise, mastery of production MLOps, and 13+ years of experience. You...


  • india, IN FlashIntel Full time

    Role Overview: FlashIntel is seeking a dedicated and innovative Research Engineer with a focus on Multimodal Large Language Models (m-LLMs), Text-to-Speech (TTS) technologies, and agentic workflows. This position offers a unique opportunity to engage in cutting-edge research and development of AI solutions that integrate various data modalities and enhance...

