Data Engineer

17 hours ago


Anantapur, India OWOW Full time

What You'll Build

Core Responsibilities

Data Architecture & Infrastructure (40%)
● Design and implement a multi-database architecture (MongoDB, Redis, Milvus, Neo4j, BigQuery)
● Build scalable data pipelines for real-time conversation processing and personalization
● Architect ETL/ELT workflows for data migration from legacy systems
● Implement data partitioning, sharding, and optimization strategies for high-throughput systems
● Create data governance frameworks ensuring quality, security, and compliance

Vector & Graph Database Systems (25%)
● Design and optimize Milvus vector collections for semantic search (1024-dim embeddings)
● Build graph schemas in Neo4j for customer journey mapping and persona relationships
● Implement HNSW indexing strategies and similarity search optimization
● Create hybrid search systems combining vector, full-text, and graph queries
● Monitor and tune database performance (query latency, throughput, resource utilization)

ML Data Infrastructure (20%)
● Build data collection pipelines for LLM fine-tuning (conversation logs, tool executions)
● Create feature stores for GNN training (customer interactions, engagement signals)
● Implement data versioning and lineage tracking for ML experiments
● Design A/B testing data infrastructure with CUPED variance reduction
● Build real-time feature computation pipelines for contextual bandits

Analytics & Monitoring (15%)
● Design BigQuery schemas for marketing analytics and performance tracking
● Create materialized views and aggregation pipelines for real-time dashboards
● Implement data quality monitoring and anomaly detection
● Build observability infrastructure (Prometheus metrics, Grafana dashboards)
● Develop cost optimization strategies for cloud data warehousing

Technical Stack You'll Work With

Databases & Storage
● MongoDB (conversation state, active sessions)
● Redis (caching, rate limiting, real-time data)
● Milvus (vector embeddings, semantic search)
● Neo4j (customer journey graphs, persona networks)
● BigQuery (analytics warehouse, historical data)

Data Processing & Orchestration
● Apache Airflow or Prefect (workflow orchestration)
● Pandas, Polars (data transformation)
● Apache Spark (optional - for large-scale processing)
● dbt (data transformation and modeling)

ML/AI Data Pipeline
● vLLM (LLM inference serving)
● MLflow (model registry, experiment tracking)
● Sentence Transformers (embedding generation)
● PyTorch, TensorFlow (ML model training)

Cloud & Infrastructure
● Google Cloud Platform (BigQuery, Cloud Storage, Compute)
● Docker & Kubernetes (containerization, orchestration)
● Terraform (infrastructure as code)
● GitHub Actions or GitLab CI (CI/CD pipelines)

Programming & Tools
● Python 3.10+ (primary language)
● SQL (complex queries, query optimization)
● Shell scripting (Bash/Zsh)
● Git (version control)

Requirements

Must-Have Skills
● 5+ years of data engineering experience with production systems
● Expert-level SQL and database design skills
● Strong Python programming (async/await, type hints, testing)
● Experience with at least 3 different database technologies (SQL, NoSQL, Vector, Graph)
● Proven track record building high-scale data pipelines (>1M records/day)
● Deep understanding of data modeling (dimensional, normalized, denormalized)
● Experience with cloud data warehouses (BigQuery, Redshift, or Snowflake)
● Strong knowledge of data quality, validation, and governance
● Excellent debugging and optimization skills

Highly Desirable
● Experience with vector databases (Milvus, Pinecone, Weaviate, Qdrant)
● Experience with graph databases (Neo4j, ArangoDB, Neptune)
● Knowledge of embedding models and semantic search
● Experience with ML data pipelines (feature stores, model training data)
● Understanding of A/B testing and experimental design
● Experience with real-time streaming (Kafka, Pub/Sub, Kinesis)
● Knowledge of LLMs and conversational AI systems
● Experience with data migration projects (especially large-scale)
● Background in marketing technology or customer data platforms

Nice-to-Have
● Experience with PyTorch Geometric or graph neural networks
● Knowledge of marketing analytics (attribution, segmentation, personalization)
● Familiarity with LangChain, LangGraph, or agent frameworks
● Experience with cost optimization in cloud environments
● Contributions to open-source data engineering projects
● Experience with data compliance (GDPR, CCPA)

Key Projects You'll Own

Phase 1: Foundation
● Migrate 10M+ conversation vectors from Pinecone to Milvus
● Design and implement MongoDB schemas for real-time agent state
● Set up Neo4j graph database with customer journey models
● Create BigQuery data warehouse with partitioned tables

Phase 2: Optimization
● Build automated data quality monitoring system
● Implement caching strategies (Redis) for 10x latency reduction
● Optimize vector search queries (target:


  • Data engineer

    3 weeks ago


    Anantapur, India Collabera Full time

    Data Engineer REMOTE, INDIA Day to day: As a Data Engineer, you will design, build, and maintain data products such as pipelines, models, APIs, and visualizations using best-in-class tools and Data Ops practices. Your role involves creating scalable solutions that integrate machine learning and AI capabilities, implementing software engineering standards...

  • Data Engineer

    7 days ago


    Anantapur, India LanceSoft, Inc. Full time

    Pay Rate: $11.00/hr to $12.00/hr. Location: 100% remote, working from India. Duration of contract: 4 to 4.5 months. Working Hours: US working hours (night shift from India). This team is looking to hire a motivated Sr. Data Engineer who will work closely with a team of highly motivated data engineers to build a state-of-the-art marketing data...

  • Data Engineer

    3 weeks ago


    Anantapur, India Hellowork Consultants Full time

    🔹 Role: Data Engineer 🔹 Experience: 5 to 7 Years 🔹 Location: Thiruvananthapuram 🔹 Work Mode: Hybrid 🔹 Notice Period: Immediate Joiners. Role Summary: Data Engineer. Key Responsibilities: 1. Develop and manage Azure Data Factory (ADF) pipelines to ingest data from legacy and cloud systems into the Data Lake. 2. Build and optimize Databricks notebooks...

  • Senior Data Engineer

    3 weeks ago


    Anantapur, India RapidBrains Full time

    Job Title: Senior Data Engineer. Experience: 6+ Years. Employment Type: Contract. Location: Remote. Overview: We are looking for a Senior Data Engineer with deep expertise in Azure Data Engineering to design, build, and optimize large-scale data pipelines. The ideal candidate will have strong experience with Azure Data Factory (ADF), Azure Synapse, PySpark, and SQL,...


  • Anantapur, India Primesoft Inc Full time

    Hiring for Data Engineer! Company: Primesoft Enterprise IT Services Pvt. Ltd. Experience: 7+ years. Location: Chennai (Work From Office). Notice Period: Immediate to 30 days only. About the role: As a Software Engineer II - Data, you will contribute to the design and development of data systems including pipelines, APIs, analytics, AI and machine learning at...

  • Data engineer

    3 weeks ago


    Anantapur, India NP Group Full time

    Data Engineer - Palantir Foundry, Workshop, PySpark & TypeScript. Fully Remote - Long Term (initially 6 months) full-time contract, c$12.00 per hour. We have an immediate requirement for an experienced Data Engineer to join the global engineering team for an international enterprise organisation. You will bring Data Engineering expertise onto greenfield multi...


  • Anantapur, India iVoyant Full time

    One of our clients is looking for an experienced Senior Snowflake Data Engineer to join their team. Key Responsibilities: We are seeking a Senior Data Engineer with 8+ years of experience in end-to-end data engineering and Snowflake development. Expert in Snowflake native features: Snowpipe, Streams, Tasks, Time Travel, Zero-Copy Cloning, and Secure Data...

  • Data AI Engineer

    12 hours ago


    Anantapur, India Verdantas Full time

    Join Verdantas – A Top #ENR 81 Firm! Position: Data AI Engineer Key Responsibilities: Your duties will include but are not limited to the following: - Architect and develop AI-powered agents using low-code platforms (e.g., Power Platform) and pro-code frameworks (e.g., LangChain, Semantic Kernel). - Build and deploy solutions using OpenAI models via Azure...

  • Senior data engineer

    2 weeks ago


    Anantapur, India Delphi Consulting Middle East Full time

    Ready to embark on a journey where your growth is intertwined with our commitment to making a positive impact? Join the Delphi family - where Growth Meets Values. At Delphi Consulting Pvt. Ltd., we foster a thriving environment with a hybrid work model that lets you prioritize what matters most. Interviews and onboarding are conducted virtually, reflecting...

  • SAP ABAP

    6 days ago


    Anantapur, Andhra Pradesh, India NTT DATA Full time ₹ 5,00,000 - ₹ 15,00,000 per year

    Company Description: NTT DATA, a part of NTT Group, is headquartered in Tokyo and provides IT and business services globally. We help clients transform through consulting, industry solutions, business process services, digital and IT modernization, and managed services. Our commitment to our clients' long-term success allows us to enable not only them but...