
Data Engineer
2 days ago
Roles and Responsibilities
- Build and maintain scalable, fault-tolerant data pipelines to support GenAI and analytics workloads across OCR, documents, and case data.
- Manage ingestion and transformation of semi-structured legal documents (PDF, Word, Excel) into structured formats.
- Enable RAG workflows by processing data into chunked, vectorized formats with metadata.
- Handle large-scale ingestion from multiple sources into cloud-native data lakes (S3, GCS), data warehouses (BigQuery, Snowflake), and PostgreSQL.
- Automate pipelines using orchestration tools like Airflow/Prefect, including retry logic, alerting, and metadata tracking.
- Collaborate with ML Engineers to ensure data availability, traceability, and performance for inference and training pipelines.
- Implement data validation and testing frameworks using Great Expectations or dbt.
- Integrate OCR pipelines and post-processing outputs for embedding and document search.
- Design infrastructure for streaming vs batch data needs and optimize for cost, latency, and reliability.
Qualifications
- Bachelor's or Master's degree in Computer Science, Data Engineering, or equivalent.
- 3+ years of experience in building distributed data pipelines and managing multi-source ingestion.
- Proficiency with Python, SQL, and data tools like Pandas and PySpark.
- Experience working with data orchestration tools (Airflow, Prefect) and file formats like Parquet, Avro, and JSON.
- Hands-on experience with cloud storage/data warehouse systems (S3, GCS, BigQuery, Redshift).
- Understanding of GenAI and vector database ingestion pipelines is a strong plus.
- Bonus: Experience with OCR tools (Tesseract, Google Document AI), PDF parsing libraries (PyMuPDF), and API-based document processors.
-
Principal Software/Data Engineering
2 days ago
Gurugram, India CoPoint Data Full time
About CoPoint Data: CoPoint Data is a specialized consulting firm focused on transforming businesses through process improvement, data insights, and technology-driven innovation. We leverage AI technologies, Microsoft cloud platforms, and modern web development frameworks to deliver intelligent, scalable solutions that drive measurable impact for our clients....
-
Senior Web Developer and Data Engineer
2 days ago
Gurugram, India CoPoint Data Full time
About CoPoint AI: CoPoint AI is a specialized consulting firm focused on transforming businesses through process improvement, data insights, and technology-driven innovation. We leverage AI technologies, Microsoft cloud platforms, and modern web development frameworks to deliver intelligent, scalable solutions that drive measurable impact for our clients. Our...
-
Data Engineer
2 days ago
Gurugram, India Obrimo Technologies Full time
Responsibilities: Develop, implement, and maintain efficient and scalable data engineering solutions to support the company's data initiatives. Collaborate with cross-functional teams to understand business requirements and translate them into technical data engineering solutions. Build and maintain data pipelines to ingest, transform, and store...
-
Data Engineer
2 days ago
Gurugram, India AuxoAI Full time
Role Summary: AuxoAI is seeking a skilled and experienced Data Engineer to join our dynamic team. The ideal candidate will have 7-10 years of prior experience in data engineering, with a strong background in Databricks. This role offers an exciting opportunity to work on diverse projects, collaborating with cross-functional teams to design, build, and optimize...
-
Data engineer
2 days ago
Gurugram, India Orbion Infotech Full time
Company Description: Orbion Infotech is your trusted partner for comprehensive software services and top-tier staff augmentation solutions. With a track record of success, we empower organizations to thrive in today's digital landscape. Our dedicated team of industry experts offers custom software development, staff augmentation, and strategic technology...
-
Data Engineer
2 days ago
Gurugram, India Healthpoint Ventures Full time
Company Description: At Healthpoint Ventures, we are dedicated to fostering innovation in healthcare through strategic collaboration with providers, payers, and stakeholders. Our mission is to harness the power of artificial intelligence to maximize value across the healthcare ecosystem. By driving impactful projects and joint ventures, we leverage AI to...
-
Data Engineer
2 days ago
Gurugram, India NatWest Group Full time
Join us as a Data Engineering Lead. This is an exciting opportunity to use your technical expertise to collaborate with colleagues and build effortless, digital first customer experiences. You'll be simplifying the bank through developing innovative data driven solutions, inspiring to be commercially successful through insight, and keeping our customers' and...
-
Data Engineer
2 days ago
Gurugram, India Digital Business People Full time
Job Title: Data Engineer. Location: Gurgaon (On-site). Experience: 4 Years. Joining: Immediate. About the Role: We are looking for a skilled Data Engineer with strong expertise in Python, AWS Cloud, and Redshift to design, build, and maintain scalable data pipelines and solutions. The ideal candidate will have hands-on experience with large-scale data systems, cloud...
-
Data Engineer
2 days ago
Gurugram, India ExcelGens, Inc. Full time
We're Hiring: Data Engineer (Microsoft Fabric Specialist). Are you an experienced Data Engineer (5–7 years) with strong hands-on expertise in Microsoft Fabric? Join ExcelGens in Gurgaon and play a key role in building scalable, modern, and intelligent data solutions. What you'll do: Design & develop data pipelines using Microsoft Fabric (Data Factory,...
-
Data Engineer
2 days ago
Gurugram, India ExcelGens, Inc. Full time
We're Hiring: Data Engineer (Microsoft Fabric Specialist). Are you an experienced Data Engineer (2-3 years) with strong hands-on expertise in Microsoft Fabric? Join ExcelGens, Inc. in Gurgaon and play a key role in building scalable, modern, and intelligent data solutions. What you'll do: Design & develop data pipelines using Microsoft Fabric (Data Factory,...