Senior Data Engineer

5 days ago


Chennai GPO, Chennai, Tamil Nadu, India Urgent opening Shree Dhanya Info for the post of the Data Entry Operator Full time

Data Engineer (LLM Data Pipeline)

Description:

Location: India (Remote/Hybrid)

Experience: 3–6 years in Data Engineering

Role Overview:

You will own the data pipeline powering our LLM training and fine-tuning. This includes ingestion, cleaning, deduplication, and building high-quality datasets from structured/unstructured sources.

Responsibilities:

  • Design ETL pipelines for text, PDFs, and structured data.
  • Implement data deduplication, filtering (toxicity, PII), and normalization.
  • Train and manage tokenizers (SentencePiece/BPE).
  • Build datasets for supervised fine-tuning and evaluation.
  • Work closely with domain experts to generate instruction/response pairs.

Requirements:

  • Strong skills in Python, SQL, and data-wrangling frameworks (Pandas, Spark).
  • Experience cleaning and preprocessing large text datasets.
  • Familiarity with NLP-specific preprocessing (chunking, embeddings).
  • Knowledge of cloud data storage (S3/GCS/Blob).
  • Bonus: Prior experience in AI/ML pipelines.

Job Type: Permanent

Pay: ₹469, ₹1,812,611.94 per year



  • Chennai, Tamil Nadu, India AABM Cloud Data Solution Full time

    Senior Data Engineer (Remote, India). 10+ years of Data Engineer experience required. SQL, Python, Snowflake: 8+ years of mandatory hands-on experience. Experience with any ETL tools (SSIS preferred), cloud environments, and DB2/DBT is a plus.

  • Senior Data Scientist

    3 weeks ago


    Chennai, Tamil Nadu, India Crayon Data Full time

    Role: Sr Data Scientist. Experience level: 5 to 7 years. Location: Chennai. Why Crayon? Why now? Crayon is transforming into an AI-first company, and every Crayon (that’s what we call ourselves!) is undergoing a journey of upskilling and expanding their capabilities in the AI space. We're building an organization where AI is not a department; it’s a way of...

  • Senior MLOps

    3 weeks ago


    Chennai, Tamil Nadu, India NTT DATA Full time

    Req ID: 343712. NTT DATA strives to hire exceptional, innovative, and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptable, and forward-thinking organization, apply now. We are currently seeking a Senior MLOps AIOps Platform Engineer to join our team in Chennai, Tamil Nadu (IN-TN), India (IN). Job Summary: We are seeking a Senior...


  • Chennai, India Eucloid Data Solutions Full time

    Job Description: Eucloid is looking for a Senior Data Engineer to join our Data Platform team supporting various business applications. The ideal candidate will support development of data infrastructure for our clients by participating in activities which may include starting from upstream and downstream technology selection to designing and building of...


  • Tamil Nadu, India Eucloid Data Solutions Full time

    Job Description: Eucloid is looking for a Senior Data Engineer to join our Data Platform team supporting various business applications. The ideal candidate will support development of data infrastructure for our clients by participating in activities which may include starting from upstream and downstream technology selection to designing and building of...


  • Chennai, India Eucloid Data Solutions Full time

    Job Description: Eucloid is looking for a Senior Data Engineer to join our Data Platform team supporting various business applications. The ideal candidate will support development of data infrastructure for our clients by participating in activities which may include starting from upstream and downstream technology selection to designing and building of...


  • Chennai, India Eucloid Data Solutions Full time

    Job Description: Eucloid is looking for a Senior Data Engineer to join our Data Platform team supporting various business applications. The ideal candidate will support development of data infrastructure for our clients by participating in activities which may include starting from upstream and downstream technology selection to designing and building of...


  • Chennai, India Eucloid Data Solutions Full time

    Job Description: Eucloid is looking for a Senior Data Engineer to join our Data Platform team supporting various business applications. The ideal candidate will support development of data infrastructure for our clients by participating in activities which may include starting from upstream and downstream technology selection to designing and building of...


  • Chennai, India Eucloid Data Solutions Full time

    Job Description: Eucloid is looking for a Senior Data Engineer to join our Data Platform team supporting various business applications. The ideal candidate will support development of data infrastructure for our clients by participating in activities which may include starting from upstream and downstream technology selection to designing and building of...


  • Chennai, India Eucloid Data Solutions Full time

    Job Description: Eucloid is looking for a Senior Data Engineer to join our Data Platform team supporting various business applications. The ideal candidate will support development of data infrastructure for our clients by participating in activities which may include starting from upstream and downstream technology selection to designing and building of...