
Product Data Ingestion Expert
1 week ago
We're building a high-throughput product data ingestion pipeline across hundreds of domains.
About the Role
As our expert, you'll own the crawling/extraction layer end-to-end: HTTP-first crawling with a Playwright fallback, per-domain learned selectors, and reliable PDF handling (datasheets/specs).
Key Responsibilities
- Design an HTTP-first crawler (Scrapy or aiohttp) with Playwright fallback only for JS-heavy pages. Ensure efficient data extraction and minimize load times.
- Implement sitemap diffing and conditional GETs (ETag/Last-Modified) for incremental runs. This will enable seamless updates and reduce unnecessary requests.
- Develop a lightweight 'needs JS?' classifier (HTML length, JSON-LD presence, data-product markers) to auto-route HTTP vs Playwright. This will streamline your workflow and improve accuracy.
- Enforce per-domain throttles/backoff (2–4 concurrent/domain; auto-lower on 429/503). This will prevent overloading and ensure smooth operations.
- Add URL normalization/canonicalization and de-dup (respect rel=canonical; hash PDFs). This will maintain data integrity and eliminate duplicates.
- Handle PDF discovery & download (HEAD first to dedupe; size/concurrency caps; SHA-256 keys). Ensure secure and efficient PDF management.
- Apply Playwright browser automation resource budgets (block images/fonts/analytics; kill outliers by size/CPU/time). Optimize browser performance and prevent abuse.
- Integrate third-party APIs (REST/GraphQL) as first-class sources: handle auth (API keys/OAuth2), pagination, and rate limits; unify API + crawl outputs. Enhance data diversity and quality.
- Own automation & orchestration for scheduled runs (Airflow/Temporal/Celery or cron), idempotent retries, and alerting. Ensure seamless execution and proactive issue resolution.
- Create per-domain selectors (YAML) with verification on hold-outs; re-learn only when health drops. Maintain accurate and up-to-date selectors.
- Ship observability: per-site field coverage, error rates, retries, avg page time, and PDF success. Provide actionable insights and inform data-driven decisions.
- Maintain allow/deny paths; adhere to robots.txt and Terms of Service. Uphold data quality and respect domain boundaries.
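As an illustration of the 'needs JS?' routing responsibility above, here is a minimal sketch that uses only the three signals the posting names (HTML length, JSON-LD presence, data-product markers). The function name `needs_js`, the 5000-character threshold, and the decision rule are assumptions made for this example, not part of the role description.

```python
import re

# Hypothetical patterns for the two structured-data signals.
JSON_LD_RE = re.compile(
    r'<script[^>]+type=["\']application/ld\+json["\']', re.IGNORECASE
)
DATA_PRODUCT_RE = re.compile(r'\bdata-product[\w-]*\s*=', re.IGNORECASE)


def needs_js(html: str, min_length: int = 5000) -> bool:
    """Guess whether a page fetched over plain HTTP must be re-rendered.

    Assumed rule: if the static HTML already carries JSON-LD or
    data-product attributes, HTTP is enough. Otherwise a very short
    document is treated as an empty JS shell and routed to the browser.
    """
    has_json_ld = bool(JSON_LD_RE.search(html))
    has_markers = bool(DATA_PRODUCT_RE.search(html))
    if has_json_ld or has_markers:
        return False
    return len(html) < min_length


if __name__ == "__main__":
    shell = "<html><body><div id='root'></div></body></html>"
    print("route to Playwright" if needs_js(shell) else "stay on HTTP")
```

In practice the threshold would likely be tuned per domain, and a long page with no structured markers may still need a browser; this sketch keeps such pages on HTTP by default.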
Join us in shaping the future of product data ingestion and take on this exciting challenge.
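For the URL normalization and SHA-256 PDF key responsibilities listed in this posting, a rough sketch with the Python standard library might look like the following. The function names, the list of tracking parameters, and the normalization rules (lowercase host, drop default ports and fragments, sort query parameters) are assumptions chosen for illustration.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of query parameters that never change page content.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign",
    "utm_term", "utm_content", "gclid", "fbclid",
}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` for de-duplication.

    Userinfo and fragments are dropped; IPv6 hosts are not handled.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    is_default = (scheme == "http" and port == 80) or (
        scheme == "https" and port == 443
    )
    if port and not is_default:
        host = f"{host}:{port}"
    # Remove tracking parameters and sort the rest for a stable order.
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in TRACKING_PARAMS
    )
    path = parts.path or "/"
    return urlunsplit((scheme, host, path, urlencode(query), ""))


def pdf_key(content: bytes) -> str:
    """Content-addressed key: the SHA-256 hex digest of the PDF bytes."""
    return hashlib.sha256(content).hexdigest()


if __name__ == "__main__":
    print(normalize_url("HTTPS://Example.com:443/p?utm_source=x&b=2&a=1#frag"))
```

Keying PDFs by content hash rather than by URL means the same datasheet served from two paths is stored once.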
-
Chief Data Architect
1 week ago
Thoothukudi, Tamil Nadu, India beBeeData Full time ₹ 1,80,00,000 - ₹ 2,50,00,000
Job Title: Senior Data Scientist
We are seeking a highly skilled data expert to join our digital transformation team. The ideal candidate will have hands-on experience with model development and a deep understanding of customer data ingestion, real-time data pipelines, and AI-driven marketing strategies.
-
Principal Data Solutions Developer
1 week ago
Thoothukudi, Tamil Nadu, India beBeeData Full time ₹ 20,00,000 - ₹ 25,00,000
Job Title: A Data Architect Position
Design and develop scalable data infrastructure to drive business growth. Develop, implement, and maintain large-scale data pipelines. Ingest, process, and store structured and unstructured data from various sources. Implement data lakes and warehouses using cloud services. Optimize data pipelines for performance, cost, and...
-
Data Transformation Engineer
2 weeks ago
Thoothukudi, Tamil Nadu, India beBeeTransformation Full time US$ 1,20,000 - US$ 1,60,000
Data Transformation Expert Wanted
We are seeking a skilled Data Transformation Engineer to join our team. As a seasoned Snowflake developer and infrastructure expert, you will be responsible for designing and implementing scalable, secure data pipelines that integrate with various cloud platforms.
Key Responsibilities: Develop ETL/ELT pipelines using dbt,...
-
Data-Driven Production Specialist
1 week ago
Thoothukudi, Tamil Nadu, India beBeeproductivity Full time ₹ 8,00,000 - ₹ 15,00,000
Production Efficiency Expert
We are seeking a skilled Production Analyst to drive data-driven decision making and process improvements. If you excel in analyzing numbers, identifying trends, and streamlining operations, this role is ideal for you. The ideal candidate will have prior experience in retail/manufacturing merchandising and at least 5 years'...
-
Scalable IoT Data Solutions Expert
1 week ago
Thoothukudi, Tamil Nadu, India beBeeDataEngineer Full time ₹ 1,24,70,000 - ₹ 2,09,71,520
Job Title: Data Engineer - IoT Solutions
We are seeking an experienced Data Engineer to join our team and contribute to the design and implementation of scalable IoT data solutions. As a key member of our engineering team, you will be responsible for developing and maintaining data engineering solutions leveraging AWS IoT services.
Key Responsibilities: Design,...
-
Chief Data Architect
1 week ago
Thoothukudi, Tamil Nadu, India beBeeDataArchitect Full time ₹ 1,50,00,000 - ₹ 2,00,00,000
Imagine leading a team of skilled professionals to drive data-driven projects and develop a strong understanding of Data Architecture and models.
Key Responsibilities
Lead the design and development of features in the existing Data Warehouse, fostering collaboration with cross-functional teams. Provide strategic leadership in establishing connections between...
-
Data Migration Specialist
1 week ago
Thoothukudi, Tamil Nadu, India beBeeMigration Full time US$ 80,000 - US$ 1,30,000
Database Migration Expertise
The role of a Data Migration Specialist involves transitioning large-scale databases from Fivetran to Nexla, with Snowflake as the target data platform. This project requires hands-on expertise in data migration and exposure to diverse database systems.
Mission Overview
We are seeking skilled professionals with experience in ETL...
-
Data Engineer L3
1 week ago
Thoothukudi, Tamil Nadu, India Costco IT Full time
About Costco Wholesale
Costco Wholesale is a multi-billion-dollar global retailer with warehouse club operations in eleven countries. They provide a wide selection of quality merchandise, plus the convenience of specialty departments and exclusive member services, all designed to make shopping a pleasurable experience for their members.
About Costco Wholesale...
-
Expert Data Scientist
1 week ago
Thoothukudi, Tamil Nadu, India beBeeDataScience Full time ₹ 1,91,40,000 - ₹ 2,51,20,000
Unlock the Power of Data Science
About Us: We are a leading provider of data services and technology solutions, delivering cutting-edge analytics and AI capabilities to multiple domains. Our expertise spans data analytics, machine learning, and natural language processing. Our long-term vision is built on three core pillars: Data Analytics & AI Solutions, Data...
-
Analytic Data Specialist
6 days ago
Thoothukudi, Tamil Nadu, India beBeeDataEngineer Full time ₹ 18,00,000 - ₹ 25,00,000
Job Title: Data Engineer
We are seeking an experienced Data Engineer to join our team. As a Data Engineer, you will play a crucial role in designing and building scalable data models using dbt (Data Build Tool) and collaborating with data engineers to ingest and transform data from various sources. This is an excellent opportunity for you to enhance your...