Advanced Data Extractor Position
4 days ago
We are seeking an experienced web scraping expert to lead high-throughput product data ingestion pipelines across hundreds of domains.
Key Responsibilities:
- Design and implement HTTP-first crawlers with a Playwright fallback for JS-heavy pages.
- Implement sitemap diffing and conditional GETs for incremental runs.
- Build a lightweight classifier to auto-route HTTP vs Playwright.
- Enforce per-domain throttles/backoff and add URL normalization/canonicalization and de-duplication.
- Handle PDF discovery and download, apply resource budgets to Playwright browser automation, and integrate third-party APIs as first-class sources.
- Own automation & orchestration for scheduled runs, idempotent retries, and alerting.
- Ship observability: per-site field coverage, error rates, retries, avg page time, and PDF success.
Requirements:
- 4+ years of experience in Python, including 2+ years of building production web crawlers at scale.
- Strong skills in Scrapy or aiohttp/asyncio and Playwright (or Puppeteer) in production.
- Practical proxy management, polite anti-bot tactics, and per-domain rate limiting.
- Hands-on with ETag/Last-Modified, retries, backoff, and HTTP caching.
- Confident with CSS/XPath, schema.org/JSON-LD, and HTML parsing.
- APIs: consuming REST/GraphQL (auth, pagination, backoff) and building small internal services (FastAPI or similar).
- Automation/Orchestration: Airflow/Temporal/Celery (or equivalent schedulers/queues) for scheduled runs and monitoring.
- PDF handling (requests/HEAD, hashing, size limits) and file integrity checks.
- Queues (Redis/Kafka), Docker, Linux basics; comfort with logs/metrics.
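As an illustration of the URL normalization, de-duplication, and conditional GET items listed above, a minimal Python sketch might look like the following. The function names and the list of tracking parameters are assumptions made for this example; they do not describe the team's actual codebase.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of tracking parameters to drop; a real pipeline would tune this per domain.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"}
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url: str) -> str:
    """Return a normalized form of `url` suitable for use as a de-duplication key.

    Simplifications in this sketch: userinfo is dropped and IPv6 hosts are not handled.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it differs from the scheme's default.
    netloc = host if port in (None, DEFAULT_PORTS.get(scheme)) else f"{host}:{port}"
    path = parts.path or "/"
    # Drop tracking parameters and sort the rest so parameter order does not matter.
    pairs = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in TRACKING_PARAMS
    ]
    query = urlencode(sorted(pairs))
    # The fragment is discarded because it is never sent to the server.
    return urlunsplit((scheme, netloc, path, query, ""))


def dedupe(urls):
    """Return canonical URLs in first-seen order, skipping duplicates."""
    seen = set()
    unique = []
    for url in urls:
        key = canonicalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def conditional_headers(etag=None, last_modified=None):
    """Build request headers for a conditional GET from stored validators."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers
```

On an incremental run, a crawler would send the headers from `conditional_headers` with each request and skip re-parsing whenever the server answers 304 Not Modified.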
This is an opportunity to work on complex data extraction projects. The ideal candidate will be familiar with the latest web scraping technologies and able to scale large projects.
-
Lead Data Extractor
5 days ago
Tiruchi, Tamil Nadu, India beBeeData Full time ₹ 80,00,000 - ₹ 1,50,00,000
Job Description
We are seeking an expert Data Extractor to join our team. The ideal candidate will have hands-on experience in web scraping and data extraction, with a strong proficiency in Python and expertise in tools like Selenium, Scrapy, and Pandas. They should also be familiar with API integration, SQL, and cloud services.
Key Responsibilities:
Design,...
-
Senior Data Extractor
2 weeks ago
Tiruchi, Tamil Nadu, India beBeeDataEngineer Full time ₹ 13,44,000 - ₹ 20,16,000
Job Summary
We are seeking a seasoned Data Engineer - Extraction to join our team. As a key member of our data operations, you will be responsible for designing and implementing efficient data extraction pipelines using machine learning models.
Main Responsibilities:
Extract relevant information from unstructured/semi-structured documents using combination of...
-
Data Architect Position
2 weeks ago
Tiruchi, Tamil Nadu, India beBeeETL Full time ₹ 15,00,000 - ₹ 20,00,000
Job Title: ETL Consultant
We are looking for a highly skilled professional to take on the role of a Talend ETL Developer with expertise in Talend Open Studio, BigQuery, PostgreSQL, Python, and GCP.
Key Skills:
- Talend Data Integration: Design & optimize complex ETL pipelines, debugging, and deployment using TMC
- BigQuery: Partitioning, clustering, federated...
-
Data Engineer Leadership Position
6 days ago
Tiruchi, Tamil Nadu, India beBeeDatabase Full time ₹ 1,80,00,000 - ₹ 2,50,00,000
Chief Data Engineering Architect
This is a senior leadership position within our data engineering team. The successful candidate will be responsible for architecting and scaling our core data foundation, which supports our enterprise SaaS and AI-driven products.
Data Model Development: Design and develop a robust semantic layer using dbt pipelines and...
-
Advanced AI Engineer Position
2 weeks ago
Tiruchi, Tamil Nadu, India beBeeArtificial Full time ₹ 1,41,60,000 - ₹ 2,01,60,000
AI Engineering Role
Are you a skilled AI engineer looking to leverage your expertise in building, deploying, and optimizing artificial intelligence (AI) and machine learning (ML) models? We are seeking an experienced professional to join our team as a key player in developing scalable, production-ready AI solutions. In this role, you will collaborate with...
-
Data Pipeline Architect
1 week ago
Tiruchi, Tamil Nadu, India beBeeBackend Full time ₹ 15,00,000 - ₹ 20,00,000
About the Role
We are seeking a skilled Backend Developer with experience in Go or Rust to join our team. You will play a key role in transforming raw on-chain data into actionable insights by decoding smart contract events and implementing pricing logic from decentralized exchanges (DEXs).
This role is ideal for someone who enjoys low-level protocol work,...
-
Advanced Data Integrity Specialist
2 weeks ago
Tiruchi, Tamil Nadu, India beBeeDataQuality Full time ₹ 21,00,000 - ₹ 27,00,000
Optimize data integrity through advanced testing methodologies.
About the Opportunity
We are seeking an experienced Data Quality Engineer to develop and execute tests using SQL, Tricentis, Python (PySpark), and data quality frameworks for data pipeline, ETL, and ingestion testing. This role involves working closely with stakeholders to understand business...
-
Data Architect Position
5 days ago
Tiruchi, Tamil Nadu, India beBeeData Full time ₹ 60,00,000 - ₹ 1,20,00,000
Job Title: Senior Data Architect
About the Role
Our company is seeking a highly skilled Senior Data Architect to join our team. The successful candidate will be responsible for designing and implementing large-scale data solutions, ensuring seamless integration with global clinical trial systems.
This role is ideal for individuals who possess in-depth knowledge...
-
Data Scientist Position
7 days ago
Tiruchi, Tamil Nadu, India beBeeMachineLearning Full time ₹ 15,00,000 - ₹ 25,00,000
Job Title: Data Scientist / Data Analyst
About the Role:
This position involves working as a data scientist to design, develop, and deploy machine learning solutions for real-world applications.
Key Responsibilities:
We are looking for someone with expertise in Python and ML frameworks (PyTorch, TensorFlow) to work on fine-tuning LLMs and building...
-
Advanced Cloud Data Specialist
2 weeks ago
Tiruchi, Tamil Nadu, India beBeeData Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Senior Data Engineer Position
We are seeking an experienced Senior Data Engineer to join our organization. The ideal candidate will have a strong background in building scalable real-time and batch processing workflows using Azure Databricks, PySpark, and Apache Spark.
The successful candidate will be responsible for:
Performing data pre-processing tasks,...