Data Platform Engineer
1 month ago
Job Summary:
BharatGen is on a mission to create AI that truly represents the diversity, culture, and unique context of India. At the heart of this mission lies the need for robust, scalable infrastructure to build multilingual and multimodal datasets that power foundational AI models. We’re seeking a skilled Data Platform Engineer to build the tools, platforms, and pipelines that process these datasets at scale.
In this role, you will build scalable data pipelines to ingest, transform, and prepare data from diverse sources (text, speech, images, and video), making it ready for Generative AI model training. Your work will involve developing and managing the underlying platform while addressing challenges like governance, security, observability, lineage, and scalability. The outcomes of your work will include efficient tools for data processing, a reliable data platform, and high-quality datasets tailored to the evolving needs of large-scale AI and LLM training.
Collaborating closely with researchers and ML engineers, you will play a pivotal role in enabling BharatGen to deliver state-of-the-art AI models, contributing to the advancement of India’s AI ecosystem through innovative data engineering solutions.
Key Responsibilities:
- Design and Build Scalable Platforms: Develop distributed infrastructure for ingesting, processing, and transforming diverse datasets (text, speech, images, video) at terabyte to petabyte scale.
- Develop Robust Data Pipelines: Create reliable, scalable pipelines to prepare datasets for Generative AI and LLM training.
- Implement Governance and Observability: Build frameworks for data lineage, monitoring, and access control to ensure data quality and operational reliability.
- Optimize Performance and Cost: Enhance platform performance and resource utilization using cost-effective strategies, including GPU-accelerated preprocessing.
- Collaborate and Innovate: Work closely with researchers and ML engineers to adapt platforms and data pipelines to evolving LLM requirements and to resolve data challenges as they arise.
- Drive Innovation: Stay updated on emerging tools, frameworks, and best practices to implement cutting-edge solutions for large-scale dataset creation.
Minimum Qualifications and Experience:
Education:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- [Preferred] Advanced degrees or certifications in Distributed Systems, Data Engineering, or Big Data technologies.
Experience and Expertise:
- 3+ years of overall industry experience in engineering roles, demonstrating strong foundations in software development, systems engineering, or related disciplines.
- 1+ years of specific hands-on experience in developing large-scale, distributed data pipelines and platforms, preferably in high-performance AI or ML environments.
- Expertise in managing unstructured data (text, speech, or multimodal datasets) for high-performance use cases, ideally in the context of LLM/AI datasets.
- Understanding of challenges in scalable data engineering, including ingestion, transformation, and storage optimization for large-scale accelerated workflows.
Skills:
1. Technical
- Proficiency in distributed systems and frameworks (e.g., Kafka, Ray, PySpark) for scalable data workflows.
- Exposure to end-to-end data lifecycle management, including DataOps.
- Strong programming skills in Python, Scala, or Go, with a focus on high-performance pipeline development.
- Experience with building and optimizing data pipelines, including ETL processes, data modeling, and integration into scalable workflows.
- Expertise in data scraping, crawling frameworks, and modern dataset development techniques such as synthetic data generation.
- Experience with cloud platforms (AWS, GCP, Azure) and container orchestration (Docker, Kubernetes).
- Deep understanding of data platform design, including data architecture, metadata tracking, data lineage, observability, monitoring, and scalability best practices.
- Familiarity with Infrastructure-as-Code tools (e.g., Terraform, CloudFormation), CI/CD pipelines, relational/NoSQL databases, and GPU-accelerated workflows.
- Familiarity with visualization and monitoring tools for lifecycle management and pipeline performance tracking.
2. Soft Skills
- Adaptability and innovation in fast-paced, dynamic environments.
- Strong collaboration skills for interdisciplinary teamwork.
- Proactive problem-solving and a growth mindset to thrive in a mission-driven organization.
-
Data Platform Architect
6 days ago
Mumbai, Maharashtra, India Straive Full time About Straive: Straive is a leading provider of innovative data solutions, empowering businesses to make informed decisions and drive growth. Job Overview: This role offers the opportunity to lead the design, build, and evolution of in-house data platforms, driving business success through strategic data engineering. Key Responsibilities: Provide subject matter...
-
Data platform lead
3 weeks ago
Mumbai, India Godrej Consumer Products Limited Full time Job Title: Data Platform Lead. Location: Mumbai, Maharashtra, India. About the role: As Azure Data Platform Lead, you'll design and implement global data solutions on Azure, lead the data platform team, collaborate with leadership, and ensure our data infrastructure supports business goals. You'll also drive the adoption of new features, standardize...
-
Data Platform Developer
1 month ago
Navi Mumbai, Maharashtra, India Nouryon Full time Job Overview: Nouryon, a global leader in sustainable specialty chemicals, seeks an experienced Data Platform Developer to join our team. In this role, you will play a vital part in designing and delivering IT solutions that drive business growth and innovation. About the Job: This is a key position within Nouryon's Azure data platform team, requiring a strong...
-
Data Engineer Position in Azure Platform
1 month ago
Mumbai, Maharashtra, India MSRcosmos LLC Full time MSR Cosmos Group is currently looking for a skilled Data Engineer to join their team on a short-term contract basis in Mumbai. The successful candidate will have strong experience working with Snowflake as a data warehouse, including writing complex SQL queries, designing and optimizing database schemas, and fine-tuning performance. Additionally, they should...
-
Data platform engineer
3 weeks ago
Mumbai, India TIH-IoT Full time Job Summary: BharatGen is on a mission to create AI that truly represents the diversity, culture, and unique context of India. At the heart of this mission lies the need for robust, scalable infrastructure to build multilingual and multimodal datasets that power foundational AI models. We’re seeking a skilled Data Platform Engineer to build scalable...
-
Data Engineer for Foundational Data Platform
3 weeks ago
Mumbai, Maharashtra, India smalldata Full time About smalldata.ai: We are a research and consulting firm focused on building effective data frameworks and AI products. Our expertise spans over a decade, with numerous collaborations across various industries. Our clients face common challenges in consolidating data from disparate systems. They possess vast amounts of customer, product, and engagement data...
-
Open Shift Admin
3 months ago
Mumbai, Maharashtra, India NTT DATA Full time Make an impact with NTT DATA: Join a company that is pushing the boundaries of what is possible. We are renowned for our technical excellence and leading innovations, and for making a difference to our clients and society. Our workplace embraces diversity and inclusion - it’s a place where you can grow, belong and thrive. Your day at NTT DATA: The...
-
Data platform engineer
4 weeks ago
Mumbai, India TIH-IoT Full time Job Summary: BharatGen is on a mission to create AI that truly represents the diversity, culture, and unique context of India. At the heart of this mission lies the need for robust, scalable infrastructure to build multilingual and multimodal datasets that power foundational AI models. We’re seeking a skilled Data Platform Engineer to build scalable tools,...
-
Global Data Centre Design Engineer
3 days ago
Mumbai, Maharashtra, India Colt Data Centre Services Full time Job Description: We are seeking a highly skilled and experienced Global Data Centre Design Engineer to join our team at Colt Data Centre Services. As a key member of our Delivery Team, you will play a leading role in the delivery of multi-disciplinary projects, managing external engineering consultants and contractors to meet programme, design and budget...
-
Cloud Data Platform Developer
3 weeks ago
Mumbai, Maharashtra, India XHire Full time Job Title: Cloud Data Platform Developer. About the Role: XHire is seeking an experienced Cloud Data Platform Developer to join our team. The successful candidate will be responsible for designing and developing automated data pipelines and data structures, as well as working closely with other teams to deliver business tenancies as part of the data platform...
-
Data Platform Architect
7 months ago
Mumbai, Maharashtra, India Hyre Global pvt ltd Full time Bachelor's degree in computer science or Management of Information Services - At least 10 years in Data Engineering - At least 5 years in a Manager Role (Techno-functional) - Mastery Knowledge of AWS or Azure - Successfully implemented a Master Data Management Platform - Successfully support Data Governance Technologies and Process - Familiarity with GCPR...
-
Data Platform Engineer
4 weeks ago
Mumbai, India TIH-IoT Full time Job Summary: BharatGen is on a mission to create AI that truly represents the diversity, culture, and unique context of India. At the heart of this mission lies the need for robust, scalable infrastructure to build multilingual and multimodal datasets that power foundational AI models. We’re seeking a skilled Data Platform Engineer to build scalable...
-
Cloud DevOps Engineer
1 month ago
Mumbai, Maharashtra, India Qrata Full time Job Summary: We are seeking an experienced Cloud DevOps Engineer to support the design, development and management of an advanced data and analytics platform on AWS.
-
Data Platform Lead
1 month ago
Mumbai Metropolitan Region, India Godrej Consumer Products Limited Full time Job Title: Data Platform Lead. Location: Mumbai, Maharashtra, India. About the role: As Azure Data Platform Lead, you'll design and implement global data solutions on Azure, lead the data platform team, collaborate with leadership, and ensure our data infrastructure supports business goals. You'll also drive the adoption of new features, standardize the Azure tech...
-
Data Platform Lead
4 weeks ago
Mumbai Metropolitan Region, India Godrej Consumer Products Limited Full time Job Title: Data Platform Lead. Location: Mumbai, Maharashtra, India. About the role: As Azure Data Platform Lead, you'll design and implement global data solutions on Azure, lead the data platform team, collaborate with leadership, and ensure our data infrastructure supports business goals. You'll also drive the adoption of new features, standardize the Azure...
-
Large Scale Data Platform Architect
2 days ago
Mumbai, Maharashtra, India TIH | IIT Bombay Full time Job Summary: BharatGen is committed to developing AI that represents the diversity and culture of India. To achieve this mission, we need a robust infrastructure for building multilingual and multimodal datasets that power foundational AI models. We're seeking an experienced Large Scale Data Platform Architect to design scalable tools, platforms, and pipelines...
-
Azure Data Platform Architect
6 days ago
Mumbai, Maharashtra, India Tata Consultancy Services Full time Seeking an experienced Cloud Data Warehouse Engineer to lead our data engineering team in Tata Consultancy Services. The ideal candidate will have a strong background in cloud-based data warehousing and excellent technical skills. About the Role: We are looking for a talented individual to join our team as a Cloud Data Warehouse Engineer. As a key member of...
-
Mumbai, Maharashtra, India Adept Global Full time Adept Global is seeking a seasoned Data Analyst to join our team. Estimated salary: 15 LPA. The successful candidate will be responsible for designing and developing our data warehouse, including the critical user discovery & data exploration system. Job Description: We are looking for an individual who can define the architecture and implementation of our data...
-
Mumbai, Maharashtra, India Qrata Full time The Qrata company is a commercially driven, front office aligned team that works in close partnership with the trading desks, global research and enterprise technology. We are recruiting an experienced Cloud DevOps engineer to support the ongoing design, development and management of an advanced data and analytics platform on AWS. Key...