Senior Data Engineer

1 week ago


Coimbatore, Tamil Nadu, India Aivar Innovations Full time

Senior Data Engineer - Data Processing & Feature Engineering 

Location: Coimbatore

Experience Level: 6+ years

About the Role

We are seeking exceptional Senior Data Engineers to build the data foundation powering Velogent AI's autonomous agents. You will design and implement large-scale data ingestion, processing, and feature engineering systems that transform unstructured enterprise data (invoices, documents, transactions, RFQs) into structured, high-quality datasets. Your work enables agentic AI systems to make accurate, compliance-aware decisions while maintaining data quality, lineage, and auditability standards required by regulated industries.

Core Responsibilities

  • Design and architect end-to-end data pipelines processing large volumes of unstructured enterprise data (documents, PDFs, transaction records, email, etc.)
  • Build sophisticated data ingestion frameworks supporting multiple data sources and formats with automated validation and quality checks
  • Implement large-scale data processing solutions using distributed computing frameworks handling terabytes of data efficiently
  • Develop advanced feature engineering pipelines extracting meaningful signals from unstructured data (document classification, entity extraction, semantic tagging)
  • Design data warehousing architecture supporting both operational (near real-time) and analytical queries for agentic AI reasoning
  • Build robust data quality frameworks ensuring high data accuracy critical for agent decision-making and regulatory compliance
  • Implement data governance patterns including lineage tracking, metadata management, and audit trails for regulated environments
  • Optimize data pipeline performance, reliability, and cost through intelligent partitioning, caching, and resource optimization
  • Lead data security implementation protecting sensitive information (PII, financial data, healthcare records) with encryption and access controls
  • Collaborate with AI engineers to understand data requirements and optimize data for model training and inference
  • Establish best practices for data documentation, SLA management, and operational excellence

Must-Have Qualifications

  • Unstructured Data Expertise: Production experience ingesting and processing large volumes of unstructured data (documents, PDFs, images, text, logs)
  • Large-Scale Data Processing: Advanced expertise with distributed data processing frameworks (Apache Spark, Flink, or cloud-native alternatives like AWS Glue)
  • Feature Engineering: Deep knowledge of advanced feature engineering techniques for ML systems, including automated feature extraction and transformation
  • Python Proficiency: Expert-level Python for data processing, ETL pipeline development, and data science workflows
  • NLP/Text Processing: Strong background in NLP and text analysis techniques for document understanding, entity extraction, and semantic processing
  • Data Architecture: Experience designing data warehouses, data lakes, or lakehouse architectures supporting both batch and real-time processing
  • ETL/ELT Pipeline Design: Proven expertise building production-grade ETL/ELT pipelines with error handling, retry logic, and monitoring
  • Cloud Data Platforms: Advanced experience with AWS data services (S3, Athena, Glue, RDS, DynamoDB) or equivalent cloud platforms
  • Data Quality & Governance: Understanding of data quality frameworks, metadata management, and data governance practices

Nice-to-Have Qualifications

  • Experience with document parsing and layout analysis libraries (Pydantic, PyPDF, etc.)
  • Knowledge of information extraction pipelines and vector databases for semantic search
  • Familiarity with Apache Kafka or other event streaming platforms for real-time data processing
  • Experience with dbt (data build tool) or similar data transformation frameworks
  • Understanding of data privacy and compliance frameworks (GDPR, HIPAA, SOC 2)
  • Experience optimizing costs in cloud data platforms through intelligent resource allocation
  • Background in building recommendation systems or ranking systems using feature engineering
  • Knowledge of graph databases and knowledge graphs for relationship extraction
  • Familiarity with computer vision techniques for document analysis and processing
  • Published work or open-source contributions in NLP, document processing, or data engineering

What You'll Work With

  • Large-scale document processing pipelines handling millions of invoices, contracts, and business documents
  • Apache Spark and distributed computing frameworks for ETL
  • AWS data services (S3, Glue, Athena, RDS) for data infrastructure
  • Advanced NLP and text processing libraries (spaCy, transformers, LangChain)
  • Vector databases and semantic search infrastructure
  • Data quality and monitoring frameworks
  • Cloud data warehouses and data lakes on AWS
  • Compliance and governance frameworks for regulated industries

