Data Pipeline Architect

1 day ago


Mangalore, India beBeeData Full time

Senior Data Engineer

We are seeking an experienced Senior Data Engineer to design, build, and own data pipelines that fuel large language model (LLM) development.

  • Create robust, scalable, automated ETL/ELT pipelines in Python for ingesting and processing terabyte-scale text datasets.
  • Implement rigorous data cleaning, deduplication, filtering, and normalization strategies. Define and enforce data quality standards to ensure the highest integrity for model training.
  • Efficiently structure and format diverse datasets (JSON, Parquet, etc.) for consumption by LLM training frameworks.
  • Collaborate with AI researchers and ML engineers to understand data requirements, define metrics, and support the model training lifecycle.

Key Skills:

  • Expert-level proficiency in Python and its data ecosystem (e.g., Pandas, NumPy, Dask, Polars).
  • Proven experience building and maintaining large-scale data pipelines.
  • Deep understanding of data structures, data modeling, and software engineering best practices (Git, CI/CD, testing).



  • Mangalore, India beBeeData Full time

    Solutions Architect – Data Pipeline Specialist
    At the core of our Mobility DataOps ecosystem lies a complex network of data flows, tools, and processes. As a Solutions Architect – Data Pipeline Specialist, you will play a crucial role in designing and implementing the technical architecture of our projects. Your expertise will be essential in ensuring...


  • Mangalore, India beBeePlatformEngineer Full time

    Job Title
    We are seeking a highly skilled Platform Engineer to join our team.
    Key Responsibilities
    Data Pipeline Development: Develop scalable data pipelines using Python, Apache Spark, and Databricks.
    Notebook-Based Workflows: Create and manage notebook-based workflows for data processing, analysis, and automation.
    Cloud-Native Solutions: Implement and maintain...


  • Mangalore, India beBeeDataEngineer Full time

    Job Overview
    As a Data Engineer, you will be responsible for designing and developing scalable data pipelines and cloud-based data solutions. This role requires strong Python programming skills and expertise in ETL/ELT processes. The ideal candidate will have hands-on experience with AWS cloud services such as S3, Glue, Lambda, Redshift, Kinesis, and DynamoDB....


  • Mangalore, India beBeeData Full time

    Enterprise Data Architect Position
    We are seeking a senior, hands-on Enterprise Data Architect to lead the design and delivery of large-scale data solutions. This individual will be responsible for owning the architecture for data pipelines, data warehousing, analytics, and integrations with various systems.
    Data Architecture Roadmap: Define end-to-end data...


  • Mangalore, India beBeeSolution Full time

    Data Architect Role
    Overview of Data Architect Position
    We are seeking a senior data architect to lead the design and implementation of enterprise data analytics solutions. This individual will be responsible for defining end-to-end data architectures, roadmaps, and technical strategies for large-scale education data platforms.
    Responsibilities of Senior Data...


  • Mangalore, India beBeeDataPipelineValidator Full time

    Seeking a skilled Data Pipeline Validator to ensure the accuracy and reliability of data pipelines, business intelligence reports, dashboards, and analytics solutions.
    Key Responsibilities
    Validate and analyze data pipelines to ensure accuracy and performance.
    Test Business Intelligence (BI) reports and dashboards to ensure reliable results.
    Collaborate with...


  • Mangalore, India beBeeQuality Full time

    Job Title: QA Consultant – ETL
    Description: We are seeking an experienced professional to play a crucial role in ensuring the quality and reliability of our data pipelines. As a QA Consultant – ETL, you will work closely with our integration teams to test upstream/downstream interfaces, validate JSON/XML schema structure, status codes, and pagination...


  • Mangalore, India beBeeDataEngineering Full time

    Job Opportunity: We are seeking an experienced Data Engineering Specialist to join our team.
    About the Role: This position involves designing and developing scalable data pipelines using cloud-native tools such as AWS DMS, AWS Glue, Kafka, Azure Data Factory, GCP Dataflow, etc. The successful candidate will be responsible for architecting and implementing data...


  • Mangalore, India beBeeCloud Full time

    Job Opportunity: We are seeking a skilled Data Support Engineer to join our application development team. The ideal candidate will have a strong foundation in Unix system administration, a solid understanding of cloud-based data services (especially AWS), and hands-on experience with ETL pipelines and production support. This role plays a key part in ensuring...


  • Mangalore, India beBeeDataEngineer Full time

    Data Engineer Role Summary
    Perform data extraction tasks using various web scraping techniques and process the extracted data for analysis.
    Design, implement, and maintain efficient data pipelines to ingest and schedule large datasets.
    About this Data Engineer Position
    Collaborate with cross-functional teams to understand business requirements and develop...