Data Platform Engineer

4 hours ago


Mumbai Metropolitan Region, IN | BharatGen | Full time

Job Summary:

BharatGen is on a mission to create AI that truly represents the diversity, culture, and unique context of India. At the heart of this mission lies the need for robust, scalable infrastructure to build the multilingual and multimodal datasets that power foundational AI models. We’re seeking a skilled Data Platform Engineer to build the tools, platforms, and pipelines that process these datasets at scale.


In this role, you will build scalable data pipelines to ingest, transform, and prepare data from diverse sources (text, speech, images, and video), making it ready for Generative AI model training. Your work will involve developing and managing the underlying platform while addressing challenges like governance, security, observability, lineage, and scalability. The outcomes of your work will include efficient tools for data processing, a reliable data platform, and high-quality datasets tailored to the evolving needs of large-scale AI and LLM training.


Collaborating closely with researchers and ML engineers, you will play a pivotal role in enabling BharatGen to deliver state-of-the-art AI models, contributing to the advancement of India’s AI ecosystem through innovative data engineering solutions.


Key Responsibilities:

  • Design and Build Scalable Platforms: Develop distributed infrastructure for ingesting, processing, and transforming diverse datasets (text, speech, images, video) at terabyte to petabyte scale.
  • Develop Robust Data Pipelines: Create reliable, scalable pipelines to prepare datasets for Generative AI and LLM training.
  • Implement Governance and Observability: Build frameworks for data lineage, monitoring, and access control to ensure data quality and operational reliability.
  • Optimize Performance and Cost: Enhance platform performance and resource utilization using cost-effective strategies, including GPU-accelerated preprocessing.
  • Collaborate and Adapt: Work closely with researchers and ML engineers to adapt platforms and data pipelines to evolving LLM requirements and the data challenges they raise.
  • Drive Innovation: Stay updated on emerging tools, frameworks, and best practices to implement cutting-edge solutions for large-scale dataset creation.


Minimum Qualifications and Experience:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field with 3+ years of industry experience.


Required Skills:

  • Proficiency in distributed systems and frameworks (e.g., Kafka, Ray, PySpark) for scalable data workflows.
  • Exposure to end-to-end data lifecycle management, including DataOps.
  • Strong programming skills in Python, Scala, or Go, with a focus on high-performance pipeline development.
  • Experience building and optimizing data pipelines, including ETL processes, data modeling, and integration into scalable workflows.
  • Expertise in data scraping, crawling frameworks, and modern dataset development techniques such as synthetic data generation.
  • Experience with cloud platforms (AWS, GCP, Azure), containerization (Docker), and container orchestration (Kubernetes).
  • Deep understanding of data platform design, including data architecture, metadata tracking, data lineage, observability, monitoring, and scalability best practices.
  • Familiarity with Infrastructure-as-Code tools (e.g., Terraform, CloudFormation), CI/CD pipelines, relational/NoSQL databases, and GPU-accelerated workflows.
  • Familiarity with visualization and monitoring tools for lifecycle management and pipeline performance tracking.
  • Expertise in managing unstructured data (text, speech, or multimodal datasets) for high-performance use cases, ideally in the context of LLM/AI datasets.
  • Understanding of challenges in scalable data engineering, including ingestion, transformation, and storage optimization for large-scale accelerated workflows.

