Mastering Large-Scale Data Engineering

6 days ago


Ghaziabad, India beBeeData Full time

Senior Data Engineer Opportunity

We are seeking a seasoned Senior Data Engineer to design, develop, and own robust, scalable, and automated data pipelines that power our large language model development.

Key Responsibilities:

Data Pipeline Design: Develop efficient ETL/ELT pipelines in Python for ingesting and processing terabyte-scale text datasets.
Data Quality Management: Implement rigorous data cleaning, deduplication, filtering, and normalization strategies. Define and enforce data quality standards to ensure the highest integrity for model training.
Data Transformation: Structure and format diverse datasets (JSON, Parquet, etc.) for consumption by LLM training frameworks.
Collaboration: Work closely with AI researchers and ML engineers to understand data requirements, define metrics, and support the model training lifecycle.
Workflow Optimization: Continuously optimize data processing workflows for speed, cost, and reliability.

Requirements:

8+ years of professional experience in data engineering, data processing, or backend software engineering.
Expert-level proficiency in Python and its data ecosystem (e.g., Pandas, NumPy, Dask, Polars).
Proven experience building and maintaining large-scale data pipelines.
Deep understanding of data structures, data modeling, and software engineering best practices (Git, CI/CD, testing).
Experience handling and parsing diverse data formats at scale.

Benefits:

Opportunity to lead cutting-edge AI and ML projects.
Collaborative team culture.
Competitive compensation with continuous learning opportunities.



  • Ghaziabad, India CodeMyMobile Full time

    Experience Required: 7 to 10 Years. How to Apply: Are you a Data Engineer who cares about clean engineering, autonomy, and solving real data challenges? If this sounds like you, we’d love to connect! Email your application to with: Your GitHub links or other professional works. Your resume (PDF or online profile). A Data Engineering project you are...


  • Ghaziabad, Uttar Pradesh, India CodeFire Technologies Pvt Ltd Full time ₹ 12,00,000 - ₹ 30,00,000 per year

    We are Hiring: Senior Data Engineer / Analyst (6–8 Years Experience). Looking for a skilled professional with strong PySpark, Python, and data pipeline experience. The ideal candidate should have hands-on expertise in moving and processing large datasets, along with strong communication skills in English. Job Profile: Senior Data Engineer / Analyst. Location:...


  • Ghaziabad, India beBeeData Full time

    Unlock Your Potential as a Data Engineering Expert. We are seeking an experienced and innovative Data Engineer to lead the design and implementation of scalable data pipelines for ingesting, transforming, and activating customer data. About This Role: Leverage your technical expertise to develop and orchestrate workflows using Apache Airflow and Spark. Develop...


  • Ghaziabad, India Karamtara Engineering Full time

    Job Summary: Karamtara Engineering Ltd is seeking a highly experienced General Manager - Procurement to lead and manage procurement operations, with a primary focus on capital equipment and capex purchases. The ideal candidate will have at least 15 - 20 years of expertise in procuring high-value capital assets, industrial machinery, and raw materials for...


  • Ghaziabad, India beBeeCloudDataEngineer Full time

    Cloud Data Engineering Expert. We are seeking a seasoned Cloud Data Engineer with expertise in designing and developing efficient ETL/ELT pipelines, optimizing data workflows, and ensuring robust data observability and streaming. This role is ideal for an engineer with extensive experience in cloud services and tools for high-performance data processing. The...


  • Ghaziabad, India beBeeDataEngineer Full time

    Job Description: A data engineer role in a cloud environment requires expertise in designing, building, deploying, and maintaining large-scale data processing systems. The ideal candidate should have hands-on experience with GCP services such as Dataflow, BigQuery, and Storage Classes. GCP - Data Side: Proficiency in setting up and managing cloud-based data...


  • Ghaziabad, India beBeeEngineer Full time

    Data Center Engineer Job Summary: We are seeking a skilled Data Center Engineer to support our delivery function. The role focuses on delivering high-quality technical solutions, resolving complex issues, and driving technology adoption across large-scale environments. Key Responsibilities: Deliver customer engagements and provide support for large-scale...


  • Ghaziabad, India beBeeData Full time

    Job Overview: We are seeking a seasoned data professional to lead our data engineering efforts. As a Principal Data Engineer, you will be responsible for designing and implementing scalable and reliable data infrastructure from the ground up. You will oversee the development of robust ETL/ELT pipelines to process and transform large datasets efficiently. You...


  • Ghaziabad, India beBeeData Full time

    Job Description: We are seeking a highly skilled Senior Data Engineer to design, build, optimize, and manage large-scale data pipelines that support both batch and real-time analytics. Develop scalable, high-performance data pipelines using Azure cloud technologies. Create and optimize Spark (Scala/PySpark) jobs for large-volume data processing and...