Senior Distributed Training Research Engineer
3 weeks ago
Location: Bangalore (India)
Type of Job: Full-time
About Krutrim:
Krutrim is building AI computing for the future. Our envisioned AI computing stack encompasses the AI computing infrastructure, AI Cloud, multilingual and multimodal foundational models, and AI-powered end applications. We are India's first AI unicorn and built the first foundation model from the country.
Our AI stack is empowering consumers, startups, enterprises and scientists across India and the world to build their end AI applications or AI models. While we are building foundational models across text, voice, and vision relevant to our focus markets, we are also developing AI training and inference platforms that enable AI research and development across industry domains. The platforms being built by Krutrim have the potential to impact millions of lives in India, across income and education strata, and across languages.
The team at Krutrim represents a convergence of talent across AI research, Applied AI, Cloud Engineering, and semiconductor design. Our teams operate from three locations: Bangalore, Singapore & San Francisco.
Job Description:
We are seeking an experienced Senior Generative AI Model Research Engineer to efficiently train frontier and foundation multimodal large language models. In this critical role, you will be responsible for scalable training methodologies used to develop a variety of generative AI models, such as large language models, voice/speech foundation models, and vision and multimodal foundation models, using cutting-edge techniques and frameworks. In this hands-on role, you will implement and optimize state-of-the-art neural architectures and robust training and inference infrastructure to efficiently take complex models with hundreds of billions to trillions of parameters to production while optimizing for low latency, high throughput, and cost efficiency.
Key Responsibilities:
1. Architect Distributed Training Systems: Design and implement highly scalable distributed training pipelines for LLMs and frontier models, leveraging model parallelism (tensor, pipeline, expert) and data parallelism techniques.
2. Optimize Performance: Utilize deep knowledge of CUDA, C++, and low-level optimizations to enhance model training speed and efficiency across diverse hardware configurations.
3. Implement Novel Techniques: Research and apply cutting-edge techniques such as Flash Attention to accelerate model training and reduce computational costs.
4. Framework Expertise: Demonstrate proficiency in deep learning frameworks such as PyTorch, TensorFlow, and JAX, and tailor them for distributed training scenarios.
5. Scale to Hundreds of Billions of Parameters: Work with massive models, ensuring stable and efficient training across distributed resources.
6. Evaluate Scaling Laws: Design and conduct experiments to analyze the impact of model size, data, and computational resources on model performance.
7. Collaborate: Partner closely with research scientists and engineers to integrate research findings into production-ready training systems.
Qualifications:
1. Advanced Degree: Ph.D. or Master's degree in Computer Science, Machine Learning, or a related field.
2. Proven Experience: 5+ years of experience in distributed training of large-scale deep learning models, preferably LLMs or similar models.
3. Deep Learning Expertise: Strong theoretical and practical understanding of deep learning algorithms, architectures, and optimization techniques.
4. Parallelism Mastery: Extensive experience with various model and data parallelism techniques, including tensor parallelism, pipeline parallelism, and expert parallelism.
5. Framework Proficiency: Expert-level knowledge of PyTorch, TensorFlow, or JAX, with a demonstrated ability to extend and customize these frameworks.
6. Performance Optimization: Proven track record of optimizing deep learning models for speed and efficiency using CUDA, C++, and other performance-enhancing tools.
7. Research Acumen: Familiarity with current research trends in large model training and the ability to apply new techniques to real-world problems.
Join Krutrim to shape the future of AI and make a significant impact on hundreds of millions of lives across India and the world. If you're passionate about pushing the boundaries of AI and want to work with a team at the forefront of innovation, we want to hear from you.