Advanced Data Extractor Position

4 days ago


Tiruchi, Tamil Nadu, India beBeeWeb Full time ₹ 1,50,00,000 - ₹ 2,00,00,000
Senior Web Scraping Developer

We are seeking an experienced web scraping expert to lead high-throughput product data ingestion pipelines across hundreds of domains.

Key Responsibilities:
  • Design and implement HTTP-first crawlers with a Playwright fallback for JS-heavy pages.
  • Implement sitemap diffing and conditional GETs for incremental runs.
  • Build a lightweight classifier to auto-route HTTP vs Playwright.
  • Enforce per-domain throttles/backoff and add URL normalization/canonicalization and de-duplication.
  • Handle PDF discovery and download, apply resource budgets to Playwright browser automation, and integrate third-party APIs as first-class sources.
  • Own automation & orchestration for scheduled runs, idempotent retries, and alerting.
  • Ship observability: per-site field coverage, error rates, retries, average page time, and PDF success rate.

Requirements:
  • 4+ years of experience in Python, including 2+ years of building production web crawlers at scale.
  • Strong skills in Scrapy or aiohttp/asyncio and Playwright (or Puppeteer) in production.
  • Practical proxy management, polite anti-bot tactics, and per-domain rate limiting.
  • Hands-on with ETag/Last-Modified, retries, backoff, and HTTP caching.
  • Confident with CSS/XPath, schema.org/JSON-LD, and HTML parsing.
  • APIs: consuming REST/GraphQL (auth, pagination, backoff) and building small internal services (FastAPI or similar).
  • Automation/Orchestration: Airflow/Temporal/Celery (or equivalent schedulers/queues) for scheduled runs and monitoring.
  • PDF handling (requests/HEAD, hashing, size limits) and file integrity checks.
  • Queues (Redis/Kafka), Docker, Linux basics; comfort with logs/metrics.

This is an opportunity to work on complex data extraction projects. The ideal candidate is familiar with current web scraping technologies and can scale extraction pipelines across a large number of domains.


  • Lead Data Extractor

    5 days ago


    Tiruchi, Tamil Nadu, India beBeeData Full time ₹ 80,00,000 - ₹ 1,50,00,000

    Job Description: We are seeking an expert Data Extractor to join our team. The ideal candidate will have hands-on experience in web scraping and data extraction, with a strong proficiency in Python and expertise in tools like Selenium, Scrapy, and Pandas. They should also be familiar with API integration, SQL, and cloud services. Key Responsibilities: Design,...

  • Senior Data Extractor

    2 weeks ago


    Tiruchi, Tamil Nadu, India beBeeDataEngineer Full time ₹ 13,44,000 - ₹ 20,16,000

    Job Summary: We are seeking a seasoned Data Engineer - Extraction to join our team. As a key member of our data operations, you will be responsible for designing and implementing efficient data extraction pipelines using machine learning models. Main Responsibilities: Extract relevant information from unstructured/semi-structured documents using combination of...


  • Tiruchi, Tamil Nadu, India beBeeETL Full time ₹ 15,00,000 - ₹ 20,00,000

    Job Title: ETL Consultant. We are looking for a highly skilled professional to take on the role of a Talend ETL Developer with expertise in Talend Open Studio, BigQuery, PostgreSQL, Python, and GCP. Key Skills: Talend Data Integration: Design & optimize complex ETL pipelines, debugging, and deployment using TMC. BigQuery: Partitioning, clustering, federated...


  • Tiruchi, Tamil Nadu, India beBeeDatabase Full time ₹ 1,80,00,000 - ₹ 2,50,00,000

    Chief Data Engineering Architect. This is a senior leadership position within our data engineering team. The successful candidate will be responsible for architecting and scaling our core data foundation, which supports our enterprise SaaS and AI-driven products. Data Model Development: Design and develop a robust semantic layer using dbt pipelines and...


  • Tiruchi, Tamil Nadu, India beBeeArtificial Full time ₹ 1,41,60,000 - ₹ 2,01,60,000

    AI Engineering Role. Are you a skilled AI engineer looking to leverage your expertise in building, deploying, and optimizing artificial intelligence (AI) and machine learning (ML) models? We are seeking an experienced professional to join our team as a key player in developing scalable, production-ready AI solutions. In this role, you will collaborate with...


  • Tiruchi, Tamil Nadu, India beBeeBackend Full time ₹ 15,00,000 - ₹ 20,00,000

    About the Role: We are seeking a skilled Backend Developer with experience in Go or Rust to join our team. You will play a key role in transforming raw on-chain data into actionable insights by decoding smart contract events and implementing pricing logic from decentralized exchanges (DEXs). This role is ideal for someone who enjoys low-level protocol work,...


  • Tiruchi, Tamil Nadu, India beBeeDataQuality Full time ₹ 21,00,000 - ₹ 27,00,000

    Optimize data integrity through advanced testing methodologies. About the Opportunity: We are seeking an experienced Data Quality Engineer to develop and execute tests using SQL, Tricentis, Python (PySpark), and data quality frameworks for data pipeline, ETL, and ingestion testing. This role involves working closely with stakeholders to understand business...


  • Tiruchi, Tamil Nadu, India beBeeData Full time ₹ 60,00,000 - ₹ 1,20,00,000

    Job Title: Senior Data Architect. About the Role: Our company is seeking a highly skilled Senior Data Architect to join our team. The successful candidate will be responsible for designing and implementing large-scale data solutions, ensuring seamless integration with global clinical trial systems. This role is ideal for individuals who possess in-depth knowledge...


  • Tiruchi, Tamil Nadu, India beBeeMachineLearning Full time ₹ 15,00,000 - ₹ 25,00,000

    Job Title: Data Scientist / Data Analyst. About the Role: This position involves working as a data scientist to design, develop, and deploy machine learning solutions for real-world applications. Key Responsibilities: We are looking for someone with expertise in Python and ML frameworks (PyTorch, TensorFlow) to work on fine-tuning LLMs and building...


  • Tiruchi, Tamil Nadu, India beBeeData Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    Senior Data Engineer Position. We are seeking an experienced Senior Data Engineer to join our organization. The ideal candidate will have a strong background in building scalable real-time and batch processing workflows using Azure Databricks, PySpark, and Apache Spark. The successful candidate will be responsible for: Performing data pre-processing tasks,...