Product Data Ingestion Expert

1 week ago


Thoothukudi, Tamil Nadu, India beBeeDataIngestionEngineer Full time ₹ 19,97,000 - ₹ 21,49,900

We're building a high-throughput product data ingestion pipeline across hundreds of domains.

About the Role

As our expert, you'll own the crawling/extraction layer end-to-end: HTTP-first crawling with a Playwright fallback, per-domain learned selectors, and reliable PDF handling (datasheets/specs).

Key Responsibilities
  • Design an HTTP-first crawler (Scrapy or aiohttp) with Playwright fallback only for JS-heavy pages. Ensure efficient data extraction and minimize load times.
  • Implement sitemap diffing and conditional GETs (ETag/Last-Modified) for incremental runs. This will enable seamless updates and reduce unnecessary requests.
  • Develop a lightweight 'needs JS?' classifier (HTML length, JSON-LD presence, data-product markers) to auto-route HTTP vs Playwright. This will streamline your workflow and improve accuracy.
  • Enforce per-domain throttles/backoff (2–4 concurrent/domain; auto-lower on 429/503). This will prevent overloading and ensure smooth operations.
  • Add URL normalization/canonicalization and de-duplication (respect rel=canonical; hash PDFs). This will maintain data integrity and eliminate duplicates.
  • Handle PDF discovery & download (HEAD first to dedupe; size/concurrency caps; SHA-256 keys). Ensure secure and efficient PDF management.
  • Apply resource budgets to Playwright browser automation (block images/fonts/analytics; kill outliers by size/CPU/time). Optimize browser performance and prevent abuse.
  • Integrate third-party APIs (REST/GraphQL) as first-class sources: handle auth (API keys/OAuth2), pagination, and rate limits; unify API + crawl outputs. Enhance data diversity and quality.
  • Own automation & orchestration for scheduled runs (Airflow/Temporal/Celery or cron), idempotent retries, and alerting. Ensure seamless execution and proactive issue resolution.
  • Create per-domain selectors (YAML) with verification on hold-outs; re-learn only when health drops. Maintain accurate and up-to-date selectors.
  • Ship observability: per-site field coverage, error rates, retries, avg page time, and PDF success. Provide actionable insights and inform data-driven decisions.
  • Maintain allow/deny paths; adhere to robots.txt and Terms of Service. Uphold data quality and respect domain boundaries.
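
For candidates wondering what the "needs JS?" classifier might look like, here is a minimal sketch under stated assumptions. It uses only the three signals named above (HTML length, JSON-LD presence, data-product markers). The function name `needs_js`, the 5,000-character threshold, and the exact regular expressions are hypothetical and would need tuning per domain; this is not a description of an existing implementation.

```python
import re

# Hypothetical threshold: static HTML shorter than this is treated as a thin JS shell.
MIN_HTML_CHARS = 5_000

# Signals that product data is already present in the static HTML.
JSON_LD_RE = re.compile(r'<script[^>]+type=["\']application/ld\+json["\']', re.IGNORECASE)
DATA_PRODUCT_RE = re.compile(r'\bdata-product[\w-]*\s*=', re.IGNORECASE)


def needs_js(html: str) -> bool:
    """Return True when the page should be routed to the Playwright fallback."""
    has_json_ld = bool(JSON_LD_RE.search(html))
    has_markers = bool(DATA_PRODUCT_RE.search(html))
    if has_json_ld or has_markers:
        # Structured product data is in the raw response: stay on the HTTP path.
        return False
    # No product signals: only very short documents are assumed to be JS shells.
    return len(html) < MIN_HTML_CHARS
```

A crawler could call `needs_js(response_text)` after the plain HTTP fetch and re-queue the URL for Playwright only when it returns True, which keeps the browser path limited to JS-heavy pages.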

Join us in shaping the future of product data ingestion and take on this exciting challenge.



  • Thoothukudi, Tamil Nadu, India beBeeData Full time ₹ 1,80,00,000 - ₹ 2,50,00,000

    Job Title: Senior Data Scientist. We are seeking a highly skilled data expert to join our digital transformation team. The ideal candidate will have hands-on experience with model development and a deep understanding of customer data ingestion, real-time data pipelines, and AI-driven marketing strategies.


  • Thoothukudi, Tamil Nadu, India beBeeData Full time ₹ 20,00,000 - ₹ 25,00,000

    Job Title: A Data Architect Position. Design and develop scalable data infrastructure to drive business growth. Develop, implement, and maintain large-scale data pipelines. Ingest, process, and store structured and unstructured data from various sources. Implement data lakes and warehouses using cloud services. Optimize data pipelines for performance, cost, and...


  • Thoothukudi, Tamil Nadu, India beBeeTransformation Full time US$ 1,20,000 - US$ 1,60,000

    Data Transformation Expert Wanted. We are seeking a skilled Data Transformation Engineer to join our team. As a seasoned Snowflake developer and infrastructure expert, you will be responsible for designing and implementing scalable, secure data pipelines that integrate with various cloud platforms. Key Responsibilities: Develop ETL/ELT pipelines using dbt,...


  • Thoothukudi, Tamil Nadu, India beBeeproductivity Full time ₹ 8,00,000 - ₹ 15,00,000

    Production Efficiency Expert. We are seeking a skilled Production Analyst to drive data-driven decision making and process improvements. If you excel in analyzing numbers, identifying trends, and streamlining operations, this role is ideal for you. The ideal candidate will have prior experience in retail/manufacturing merchandising and at least 5 years'...


  • Thoothukudi, Tamil Nadu, India beBeeDataEngineer Full time ₹ 1,24,70,000 - ₹ 2,09,71,520

    Job Title: Data Engineer - IoT Solutions. We are seeking an experienced Data Engineer to join our team and contribute to the design and implementation of scalable IoT data solutions. As a key member of our engineering team, you will be responsible for developing and maintaining data engineering solutions leveraging AWS IoT services. Key Responsibilities: Design,...


  • Thoothukudi, Tamil Nadu, India beBeeDataArchitect Full time ₹ 1,50,00,000 - ₹ 2,00,00,000

    Imagine leading a team of skilled professionals to drive data-driven projects and develop a strong understanding of Data Architecture and models. Key Responsibilities: Lead the design and development of features in the existing Data Warehouse, fostering collaboration with cross-functional teams. Provide strategic leadership in establishing connections between...


  • Thoothukudi, Tamil Nadu, India beBeeMigration Full time US$ 80,000 - US$ 1,30,000

    Database Migration Expertise. The role of a Data Migration Specialist involves transitioning large-scale databases from Fivetran to Nexla, with Snowflake as the target data platform. This project requires hands-on expertise in data migration and exposure to diverse database systems. Mission Overview: We are seeking skilled professionals with experience in ETL...

  • Data Engineer L3

    1 week ago


    Thoothukudi, Tamil Nadu, India Costco IT Full time

    About Costco Wholesale: Costco Wholesale is a multi-billion-dollar global retailer with warehouse club operations in eleven countries. They provide a wide selection of quality merchandise, plus the convenience of specialty departments and exclusive member services, all designed to make shopping a pleasurable experience for their members. About Costco Wholesale...


  • Thoothukudi, Tamil Nadu, India beBeeDataScience Full time ₹ 1,91,40,000 - ₹ 2,51,20,000

    Unlock the Power of Data Science. About Us: We are a leading provider of data services and technology solutions, delivering cutting-edge analytics and AI capabilities to multiple domains. Our expertise spans data analytics, machine learning, and natural language processing. Our long-term vision is built on three core pillars: Data Analytics & AI Solutions, Data...


  • Thoothukudi, Tamil Nadu, India beBeeDataEngineer Full time ₹ 18,00,000 - ₹ 25,00,000

    Job Title: Data Engineer. We are seeking an experienced Data Engineer to join our team. As a Data Engineer, you will play a crucial role in designing and building scalable data models using dbt (Data Build Tool) and collaborating with data engineers to ingest and transform data from various sources. This is an excellent opportunity for you to enhance your...