High-Throughput Web Data Ingestion Specialist

1 week ago


Thiruvananthapuram, Kerala, India beBeeEngineer Full time ₹ 1,50,00,000 - ₹ 2,00,00,000
Job Title: Senior Web Scraping Engineer

We are seeking a highly skilled web scraping engineer to build a high-throughput product data ingestion pipeline across hundreds of domains.

The ideal candidate will have 4+ years of Python experience, including 2+ years building production web crawlers at scale.

  • Design an HTTP-first crawler with Playwright fallback only for JS-heavy pages.
  • Implement sitemap diffing and conditional GETs for incremental runs.
  • Build a lightweight classifier to auto-route HTTP vs Playwright.
  • Enforce per-domain throttles/backoff.
  • Add URL normalization/canonicalization and de-duplication.
  • Handle PDF discovery & download.
  • Apply browser automation resource budgets.
  • Integrate third-party APIs as first-class sources.
  • Own automation & orchestration for scheduled runs.
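
To give candidates a feel for the URL normalization and de-duplication item above, here is a minimal sketch using only the Python standard library. The function names and the `TRACKING_PARAMS` list are illustrative assumptions, not part of our codebase; a production version would need per-domain rules.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of tracking parameters to drop; tune per domain in practice.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign",
                   "utm_term", "utm_content", "gclid", "fbclid"}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` suitable as a de-duplication key."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()  # note: userinfo is discarded
    port = parts.port
    # Keep the port only when it is not the default for the scheme.
    if port and not ((scheme == "http" and port == 80) or
                     (scheme == "https" and port == 443)):
        host = f"{host}:{port}"
    path = parts.path or "/"
    # Drop tracking parameters and sort the rest so equivalent URLs compare equal.
    query = sorted((k, v)
                   for k, v in parse_qsl(parts.query, keep_blank_values=True)
                   if k.lower() not in TRACKING_PARAMS)
    # Fragments are never sent to the server, so they are discarded.
    return urlunsplit((scheme, host, path, urlencode(query), ""))


def dedupe(urls):
    """Yield-order-preserving de-duplication on the normalized form."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Sorting query parameters assumes the target sites treat parameter order as insignificant, which is common but not guaranteed.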

Must-have qualifications include strong experience with Scrapy or aiohttp, Playwright, proxy management, polite anti-bot tactics, and per-domain rate limiting. Experience with ETag/Last-Modified, retries, backoff, and HTTP caching is also required. Additionally, the ideal candidate should be confident with CSS/XPath, schema.org/JSON-LD, and HTML parsing.
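
As an illustration of the ETag/Last-Modified requirement, the sketch below shows the bookkeeping behind a conditional GET without tying it to any HTTP client. `CacheEntry`, `conditional_headers`, and `apply_response` are hypothetical names, and the response headers are assumed to arrive as a plain dict with canonical header casing.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CacheEntry:
    """Validators and body remembered from the previous fetch (hypothetical)."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None
    body: bytes = b""


def conditional_headers(entry: Optional[CacheEntry]) -> dict:
    """Build request headers that let the server answer 304 Not Modified."""
    headers = {}
    if entry is None:
        return headers
    if entry.etag:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified
    return headers


def apply_response(entry: Optional[CacheEntry], status: int,
                   resp_headers: dict, body: bytes) -> Tuple[CacheEntry, bool]:
    """Return (cache_entry, changed) after a fetch."""
    if status == 304 and entry is not None:
        # Server confirmed the cached copy is still current; skip re-parsing.
        return entry, False
    fresh = CacheEntry(etag=resp_headers.get("ETag"),
                       last_modified=resp_headers.get("Last-Modified"),
                       body=body)
    return fresh, True
```

On an incremental run, pages that return 304 can skip parsing and extraction entirely, which is where most of the savings come from.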

Responsibilities:
  • Containerize workers and provide runbooks/CI.
  • Collaborate with data team on schemas/normalization.
  • Ship observability: per-site field coverage, error rates, retry counts, average page time, and PDF download success rate.
  • Maintain allow/deny paths; adhere to robots.txt and Terms of Service.
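
The allow/deny and robots.txt item above could look roughly like the following sketch, built on the standard library's `urllib.robotparser`. The `USER_AGENT` value, the `deny_prefixes` parameter, and the helper names are assumptions for illustration; the robots.txt text is passed in directly so the example does no network I/O.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-ingest-bot"  # hypothetical crawler name


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that was fetched elsewhere."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, deny_prefixes=()) -> bool:
    """Apply our own deny list first, then defer to robots.txt."""
    path = urlsplit(url).path or "/"
    if any(path.startswith(prefix) for prefix in deny_prefixes):
        return False
    return parser.can_fetch(USER_AGENT, url)


policy = build_policy("User-agent: *\nDisallow: /private/\n")
print(is_allowed(policy, "https://example.com/products/1"))
print(is_allowed(policy, "https://example.com/private/x"))
```

Checking the local deny list before robots.txt keeps operator decisions authoritative even when a site's robots.txt is permissive. Terms of Service compliance still needs human review and is not something this check covers.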

Nice-to-have qualifications include Go or Node.js experience, cloud experience (AWS/GCP), and workflow engine experience (Airflow/Temporal/Argo/Celery).

How We Work:
  • Ship in small, measurable increments.
  • Track coverage and freshness as north-star metrics.
  • Prefer simple designs that are easy to operate at scale.

Compensation: Competitive. Please include your expected CTC (INR LPA) and any variable/benefits expectations.



  • Thiruvananthapuram, Kerala, India beBeeDataIngestion Full time ₹ 20,00,000 - ₹ 25,00,000

    Job Title: Databricks Data Ingestion Specialist. We are seeking a skilled data ingestion specialist to join our team. As an expert in designing and developing efficient data pipelines, you will play a key role in integrating multiple sources into Databricks. This is an exciting opportunity for individuals who are passionate about working with big data, cloud...


  • Thiruvananthapuram, Kerala, India beBeeDataValidation Full time ₹ 9,00,000 - ₹ 12,00,000

    We are seeking a Data Validation Specialist to ensure data accuracy and consistency in our event-driven systems. Major Responsibilities: Validate ETL processes and perform data reconciliation across systems. Test Kafka/Redpanda pipelines (producers, consumers, topics, schema registry, Avro/JSON). Ensure high-quality data with high throughput. Run...


  • Thiruvananthapuram, Kerala, India beBeeData Full time ₹ 10,00,000 - ₹ 15,00,000

    As a senior data integration specialist, you will be responsible for designing and developing data pipelines to integrate multiple sources into Databricks. This role requires a strong understanding of data ingestion processes focusing on scalability and efficiency.


  • Thiruvananthapuram, Kerala, India beBeeWebData Full time ₹ 9,00,000 - ₹ 12,00,000

    Key Responsibilities: We are seeking a highly skilled and ambitious Senior Web Data Specialist to join our team on a long-term project. The ideal candidate will have strong experience in web scraping, data extraction, and automation. They will be responsible for extracting and structuring data from various online sources involving dynamic content, custom...


  • Thiruvananthapuram, Kerala, India beBeeExpert Full time ₹ 10,00,000 - ₹ 15,00,000

    Job Opportunity: We are seeking a skilled Web Data Extraction Specialist to lead the development of a long-term project. Your goal will be to create an intelligent system that efficiently extracts data from various online sources involving dynamic content, custom headers, and automation flows. Key Responsibilities: Design and implement advanced data scraping...


  • Thiruvananthapuram, Kerala, India beBeeData Full time ₹ 15,00,000 - ₹ 22,50,000

    Senior Data Transformation Specialist. We are seeking an expert in data transformation to help us build scalable, high-performance pipelines. Data Ingestion and Pipelining: Design and implement efficient data ingestion pipelines using Snowflake and dbt. Data Modelling and Transformation: Implement data modelling and transformation logic to support a layered...


  • Thiruvananthapuram, Kerala, India beBeeDataLeader Full time US$ 2,00,000 - US$ 2,50,000

    Job Title: Team Lead – Advanced Data Systems Delivery. Key Responsibilities: Leads the end-to-end delivery of advanced data systems for mission-critical applications focusing on revenue management and other high-impact use cases. Ensures solutions meet scalability, performance, availability, security, and reliability standards delivering on time within...


  • Thiruvananthapuram, Kerala, India beBeeData Full time ₹ 9,87,654 - ₹ 12,34,567

    ETL Tester Job Description: As a skilled ETL tester, you will play a pivotal role in ensuring the accuracy and reliability of our data pipelines. Your exceptional testing skills will be instrumental in identifying and resolving defects, thereby guaranteeing high-quality data. The ideal candidate should possess a solid understanding of data validation,...


  • Thiruvananthapuram, Kerala, India beBeeData Full time US$ 2,00,000 - US$ 2,50,000

    Job Overview: We're seeking a Senior Full Stack Software Development Engineer with expertise in data engineering to help build cutting-edge software and data infrastructure for our AI-driven analytics platform. Our mission is to create groundbreaking data science that transforms the industry. This role involves hands-on development of solutions by combining...


  • Thiruvananthapuram, Kerala, India beBeeDatabase Full time ₹ 18,00,000 - ₹ 24,00,000

    Job Title: Database Engineer. Key Responsibilities: We are seeking a highly skilled and experienced Database Engineer to lead the design, development, and maintenance of high-performance database systems. The ideal candidate will be responsible for architecting and implementing scalable solutions that handle vast volumes of financial market data. The Database...