Data Engineer

3 days ago


India, IN OWOW Full time

What You'll Build

Core Responsibilities

Data Architecture & Infrastructure (40%)
● Design and implement a multi-database architecture (MongoDB, Redis, Milvus, Neo4j, BigQuery)
● Build scalable data pipelines for real-time conversation processing and personalization
● Architect ETL/ELT workflows for data migration from legacy systems
● Implement data partitioning, sharding, and optimization strategies for high-throughput systems
● Create data governance frameworks ensuring quality, security, and compliance

Vector & Graph Database Systems (25%)
● Design and optimize Milvus vector collections for semantic search (1024-dim embeddings)
● Build graph schemas in Neo4j for customer journey mapping and persona relationships
● Implement HNSW indexing strategies and similarity search optimization
● Create hybrid search systems combining vector, full-text, and graph queries
● Monitor and tune database performance (query latency, throughput, resource utilization)

ML Data Infrastructure (20%)
● Build data collection pipelines for LLM fine-tuning (conversation logs, tool executions)
● Create feature stores for GNN training (customer interactions, engagement signals)
● Implement data versioning and lineage tracking for ML experiments
● Design A/B testing data infrastructure with CUPED variance reduction
● Build real-time feature computation pipelines for contextual bandits

Analytics & Monitoring (15%)
● Design BigQuery schemas for marketing analytics and performance tracking
● Create materialized views and aggregation pipelines for real-time dashboards
● Implement data quality monitoring and anomaly detection
● Build observability infrastructure (Prometheus metrics, Grafana dashboards)
● Develop cost optimization strategies for cloud data warehousing

Technical Stack You'll Work With

Databases & Storage
● MongoDB (conversation state, active sessions)
● Redis (caching, rate limiting, real-time data)
● Milvus (vector embeddings, semantic search)
● Neo4j (customer journey graphs, persona networks)
● BigQuery (analytics warehouse, historical data)

Data Processing & Orchestration
● Apache Airflow or Prefect (workflow orchestration)
● Pandas, Polars (data transformation)
● Apache Spark (optional, for large-scale processing)
● dbt (data transformation and modeling)

ML/AI Data Pipeline
● vLLM (LLM inference serving)
● MLflow (model registry, experiment tracking)
● Sentence Transformers (embedding generation)
● PyTorch, TensorFlow (ML model training)

Cloud & Infrastructure
● Google Cloud Platform (BigQuery, Cloud Storage, Compute)
● Docker & Kubernetes (containerization, orchestration)
● Terraform (infrastructure as code)
● GitHub Actions or GitLab CI (CI/CD pipelines)

Programming & Tools
● Python 3.10+ (primary language)
● SQL (complex queries, query optimization)
● Shell scripting (Bash/Zsh)
● Git (version control)

Requirements

Must-Have Skills
● 5+ years of data engineering experience with production systems
● Expert-level SQL and database design skills
● Strong Python programming (async/await, type hints, testing)
● Experience with at least 3 different database technologies (SQL, NoSQL, Vector, Graph)
● Proven track record building high-scale data pipelines (>1M records/day)
● Deep understanding of data modeling (dimensional, normalized, denormalized)
● Experience with cloud data warehouses (BigQuery, Redshift, or Snowflake)
● Strong knowledge of data quality, validation, and governance
● Excellent debugging and optimization skills

Highly Desirable
● Experience with vector databases (Milvus, Pinecone, Weaviate, Qdrant)
● Experience with graph databases (Neo4j, ArangoDB, Neptune)
● Knowledge of embedding models and semantic search
● Experience with ML data pipelines (feature stores, model training data)
● Understanding of A/B testing and experimental design
● Experience with real-time streaming (Kafka, Pub/Sub, Kinesis)
● Knowledge of LLMs and conversational AI systems
● Experience with data migration projects (especially large-scale)
● Background in marketing technology or customer data platforms

Nice-to-Have
● Experience with PyTorch Geometric or graph neural networks
● Knowledge of marketing analytics (attribution, segmentation, personalization)
● Familiarity with LangChain, LangGraph, or agent frameworks
● Experience with cost optimization in cloud environments
● Contributions to open-source data engineering projects
● Experience with data compliance (GDPR, CCPA)

Key Projects You'll Own

Phase 1: Foundation
● Migrate 10M+ conversation vectors from Pinecone to Milvus
● Design and implement MongoDB schemas for real-time agent state
● Set up Neo4j graph database with customer journey models
● Create BigQuery data warehouse with partitioned tables

Phase 2: Optimization
● Build automated data quality monitoring system
● Implement caching strategies (Redis) for 10x latency reduction
● Optimize vector search queries (target:


  • Data Engineer

    2 weeks ago


    India, IN Data-Hat AI Full time

    Department: Data Engineering & AI Solutions. Reports To: Lead Data Solutions Architect. Travel: International travel required (up to 30–40%). Position Summary: We are hiring a senior-level Data Engineer to lead the design, development, and optimization of high-performance data infrastructure that underpins mission-critical AI systems. With 12+ years of...

  • Data Engineer

    1 week ago


    India, IN Insight Global Full time

    Job Description: Insight Global is looking for a skilled senior-level Data Engineer in Hyderabad or remote India for a large AAA gaming company on a 3-month contract with possibility of extension or conversion. You will be assisting the Data and Analytics team on one of the company's largest game titles set to launch this fall. Your day-to-day will be...

  • Data Engineer

    5 days ago


    India, IN KPG99 INC Full time

    Role: Databricks Engineer. Location: Remote. Duration: 12+ months with extensions. REQUIRED SKILLS AND EXPERIENCE: 3–5 years of experience in data engineering roles; strong hands-on experience with Databricks for data processing and pipeline development; proficiency in SQL for data querying, transformation, and troubleshooting; solid programming skills in...

  • Data Engineer

    5 days ago


    India, IN Insight Global Full time

    Position: GCP Data Engineer. Location: 100% remote in India. Duration: 12-month contract + extensions + conversions. Package: 10 LPA to 26 LPA. Interview Process: 2 rounds. REQUIRED SKILLS AND EXPERIENCE: 6+ years of experience as a Data Engineer; experience with GCP data services, i.e. BigQuery, Cloud Storage, BigTable, Airflow, Dataproc, Dataflow; strong SQL experience (NoSQL,...

  • Data Engineer

    2 weeks ago


    India, IN Bahwan CyberTek Full time

    Job Title: Data Engineer – Google Cloud Platform (GCP) Job Summary We are seeking a skilled and motivated Data Engineer with hands-on experience in building scalable data pipelines and cloud-native data solutions on Google Cloud Platform. The ideal candidate will be proficient in GCP services like Pub/Sub, Dataflow, Cloud Storage, and BigQuery, with a...

  • Data Engineer

    1 week ago


    India, IN Response Informatics Full time

    We’re Hiring | Data Engineering Experts (Hyderabad / Remote). We’re looking for passionate and experienced Data Engineering professionals to join our growing team. If you love building scalable data pipelines, optimizing cloud-based workflows, and leading innovation in data infrastructure, we’d love to meet you! Open Positions: Data Engineer – 8 to...

  • Data Engineer

    1 week ago


    India, IN NuVision Auto Glass Full time

    NuVision Auto Glass is a leading auto glass service provider in the USA, serving customers across Arizona, Florida, South Carolina, and Colorado. It is known for delivering reliable mobile windshield replacement and expert auto glass services, ensuring convenience and safety at every step. With seamless insurance claims, easy financing options for cash payments,...

  • Data Engineer

    6 days ago


    India, IN Magma Consultancy Part time

    Location: Remote. Type: Part-time (20–25 hours per week). About the Role: We are seeking a skilled Data Engineer to support our growing data operations on a part-time basis. The ideal candidate has hands-on experience in building, optimizing, and maintaining data pipelines and architectures. You’ll work closely with analysts, developers, and business teams to...

  • Data Engineer

    1 week ago


    India, IN Sapaad Full time

    WHO WE ARE: Sapaad is a global leader in unified commerce platforms, delivering world-class software solutions for the food and beverage industry. Our flagship product, also named Sapaad, has achieved remarkable success over the past decade, empowering thousands of F&B businesses across 40+ countries, with many more coming onboard each day. Driven by a...

  • Data Engineer

    3 days ago


    India, IN Digivance Solutions Full time

    Position: Data Engineer. Experience: 5–10 years. Location: Chennai, Bengaluru, Pune, Hyderabad, Mumbai, Delhi NCR (candidates will be required to work at any of these locations in hybrid mode). Key Responsibilities: Collaborate with business and technology stakeholders to understand current and future data requirements. Design, build, and maintain reliable,...