
Senior Web Data Engineering Specialist
4 days ago
We are seeking an experienced Senior Web Scraping Engineer to develop a high-throughput product data ingestion pipeline across hundreds of domains. This role requires expertise in designing and building web crawlers, handling large datasets, and integrating third-party APIs.
The ideal candidate will have strong knowledge of Python, including Scrapy or aiohttp/asyncio and Playwright (or Puppeteer) in production. Practical experience with proxy management, polite anti-bot tactics, and per-domain rate limiting is also essential. Additionally, hands-on experience with ETag/Last-Modified, retries, backoff, and HTTP caching is required.
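To give a concrete sense of the ETag/Last-Modified and backoff experience described above, the following is a minimal sketch, not a prescribed implementation. The function names `conditional_headers` and `backoff_delay`, and the default base and cap values, are assumptions made for illustration only.

```python
import random
from typing import Dict, Optional


def conditional_headers(etag: Optional[str], last_modified: Optional[str]) -> Dict[str, str]:
    """Build request headers for a conditional GET from validators saved on a prior run.

    A server that honors them can answer 304 Not Modified instead of resending the body.
    """
    headers: Dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: a random delay in [0, min(cap, base * 2**attempt)].

    The base and cap defaults are illustrative assumptions, not values from the posting.
    """
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

A crawler would typically store the `ETag` and `Last-Modified` response headers per URL and pass them back through `conditional_headers` on the next incremental run.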
The successful candidate will be responsible for owning the crawling/extraction layer end-to-end, including HTTP-first crawling with a Playwright fallback, per-domain learned selectors, and reliable PDF handling. They will also drive automation around scheduling, retries, and monitoring to ensure hands-off runs. Furthermore, they will integrate vendor/public APIs (REST/GraphQL) wherever available to complement crawling.
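As a rough illustration of HTTP-first crawling with a browser fallback, one possible heuristic is to fetch the page over plain HTTP and route it to a browser only when the static HTML carries very little visible text. The sketch below assumes that heuristic; the name `needs_js` and the 200-character threshold are hypothetical, and a production classifier would likely use more signals.

```python
import re

# Patterns that strip non-visible content before measuring visible text.
_SCRIPT_RE = re.compile(r"<script\b.*?</script>", re.IGNORECASE | re.DOTALL)
_STYLE_RE = re.compile(r"<style\b.*?</style>", re.IGNORECASE | re.DOTALL)
_TAG_RE = re.compile(r"<[^>]+>")


def needs_js(html: str, min_text_chars: int = 200) -> bool:
    """Guess whether a page needs JavaScript rendering.

    Heuristic (an assumption for illustration): if the static HTML contains
    fewer than `min_text_chars` characters of visible text, the content is
    probably rendered client-side and should go to the browser fallback.
    """
    without_scripts = _SCRIPT_RE.sub(" ", html)
    without_styles = _STYLE_RE.sub(" ", without_scripts)
    text = _TAG_RE.sub(" ", without_styles)
    visible = " ".join(text.split())  # collapse whitespace
    return len(visible) < min_text_chars
```

Pages for which `needs_js` returns `False` stay on the cheap HTTP path; the rest would be queued for Playwright.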
Requirements
- 4+ years Python, including 2+ years building production web crawlers at scale.
- Strong with Scrapy or aiohttp/asyncio and Playwright (or Puppeteer) in production.
- Practical proxy management, polite anti-bot tactics, and per-domain rate limiting.
- Hands-on with ETag/Last-Modified, retries, backoff, and HTTP caching.
Responsibilities
- Design and build high-throughput web crawlers using Scrapy or aiohttp/asyncio.
- Implement sitemap diffing and conditional GETs (ETag/Last-Modified) for incremental runs.
- Build lightweight 'needs JS?' classifiers to auto-route HTTP vs Playwright.
- Enforce per-domain throttles/backoff (2–4 concurrent/domain; auto-lower on 429/503).
- Add URL normalization/canonicalization and de-duplication (respect ; hash PDFs).
- Handle PDF discovery & download (HEAD first to dedupe; size/concurrency caps; SHA-256 keys).
- Apply resource budgets to Playwright browser automation (block images/fonts/analytics; kill outliers by size/CPU/time).
- Integrate third-party APIs (REST/GraphQL) as first-class sources: handle auth (API keys/OAuth2), pagination, and rate limits; unify API + crawl outputs.
- Own automation & orchestration for scheduled runs (Airflow/Temporal/Celery or cron), idempotent retries, and alerting.
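The per-domain throttling item above (2–4 concurrent requests per domain, lowered automatically on 429/503) could be sketched roughly as follows. This is a minimal illustration under stated assumptions: the class name `DomainThrottle`, its methods, and the choice to step down by one slot per throttling response are all hypothetical, and recovery back toward the starting limit is left out.

```python
from collections import defaultdict
from typing import DefaultDict
from urllib.parse import urlsplit


class DomainThrottle:
    """Track a concurrency limit per domain and lower it on throttling responses."""

    THROTTLE_STATUSES = (429, 503)

    def __init__(self, start: int = 4, floor: int = 1) -> None:
        self.floor = floor
        # Every unseen domain starts at the same concurrency limit.
        self._limits: DefaultDict[str, int] = defaultdict(lambda: start)

    @staticmethod
    def _domain(url: str) -> str:
        return (urlsplit(url).hostname or "").lower()

    def limit_for(self, url: str) -> int:
        """Current number of concurrent requests allowed for the URL's domain."""
        return self._limits[self._domain(url)]

    def record(self, url: str, status: int) -> None:
        """Lower the domain's limit by one (never below the floor) on 429/503."""
        if status in self.THROTTLE_STATUSES:
            domain = self._domain(url)
            self._limits[domain] = max(self.floor, self._limits[domain] - 1)
```

A scheduler would consult `limit_for` before dispatching a request and call `record` with each response status.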
-
Senior IT Data Engineer
2 weeks ago
Guntur, Andhra Pradesh, India beBeeData Full time ₹ 15,00,000 - ₹ 20,00,000
Lead Data Extraction Specialist
The ideal candidate will have a minimum of 4 years of hands-on experience in IT data extraction, with at least 2 years leading a team of 5+ developers. This role requires deep technical knowledge in advanced data extraction techniques, reverse engineering, automation, and leadership skills to drive the team towards...
-
Senior Data Engineer
7 days ago
Guntur, Andhra Pradesh, India beBeeDataEngineering Full time US$ 1,20,000 - US$ 1,80,000
Job Title: Data Engineering Specialist
We are seeking a highly skilled Data Engineering Specialist to join our team and drive data-powered outcomes in complex telecommunications and IT Asset Management (ITAM) environments. The ideal candidate will design robust Extract, Transform, Load (ETL) pipelines, build scalable Business Intelligence (BI) solutions, and...
-
Data-Driven Software Engineer
1 week ago
Guntur, Andhra Pradesh, India beBeeEngineering Full time US$ 1,00,000 - US$ 1,20,000
Senior Full Stack SDE with Data Engineering for Analytics
Truckmentum is seeking a Senior Full Stack Software Development Engineer (SDE) with deep data engineering experience to help build cutting-edge software and data infrastructure for an AI-driven Trucking Science-as-a-Service platform. You'll be part of a team responsible for the development of dynamic...
-
Senior Web Data Extraction Specialist
1 week ago
Guntur, Andhra Pradesh, India beBeeData Full time ₹ 8,00,000 - ₹ 12,00,000
Job Description: We are seeking a skilled professional to build and maintain web crawlers, extracting valuable insights from the web, and ensuring data quality. The ideal candidate will have strong Python programming skills and experience in web scraping frameworks, browser automation tools, and handling anti-scraping mechanisms.
About Us: Our company...
-
Senior Web Application Specialist
2 weeks ago
Guntur, Andhra Pradesh, India beBeeFullStack Full time ₹ 15,00,000 - ₹ 25,00,000
As a web application specialist, you will be responsible for developing and maintaining scalable web platforms using cutting-edge technologies.
Key Responsibilities:
- Develop responsive web applications with optimal performance
- Collaborate with UX/UI designers to translate wireframes into interactive features
- Build and integrate RESTful APIs and third-party...
-
Senior Data Engineering Role
1 week ago
Guntur, Andhra Pradesh, India beBeeDataEngineer Full time ₹ 15,00,000 - ₹ 19,00,000
Senior Data Engineer
We are seeking a skilled Senior Data Engineer to join our team and contribute to the design, implementation, and maintenance of large-scale data processing systems. The successful candidate will have expertise in Airflow, Python, and Snowflake, and a strong understanding of data architecture, design patterns, and data modeling.
-
Web Development Specialist
3 days ago
Guntur, Andhra Pradesh, India beBeeWebDevelopment Full time ₹ 15,00,000 - ₹ 25,00,000
Job Overview
We're seeking a talented individual to join our team as a web development specialist. As a key member of our design group, you'll be responsible for creating visually appealing and conversion-focused websites using Webflow.
Key Responsibilities:
- Collaborate with designers to craft engaging web experiences that drive results.
- Implement user...
-
Senior Web Data Extractor
2 weeks ago
Guntur, Andhra Pradesh, India beBeeScraping Full time ₹ 80,00,000 - ₹ 1,00,00,000
Web Scraping Developer
We are seeking a skilled web scraping developer to design, develop and optimize large-scale scraping solutions that power data-driven decision making. The ideal candidate will have expertise in Python development with Scrapy, proficiency in automation libraries such as Playwright or Selenium, experience with REST APIs, asynchronous...
-
AWS Data Engineer
1 week ago
Guntur, Andhra Pradesh, India beBeeData Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Job Title: AWS Data Engineer - Big Data Specialist
Job Summary: We are seeking a highly skilled AWS Data Engineer to join our team. As a Data Engineer with a strong background in big data, you will have the opportunity to work on a project focused on integrating and processing channel activity data from multiple sources.
Key Responsibilities:
-
Senior Web Developer
5 days ago
Guntur, Andhra Pradesh, India beBeeFullstack Full time ₹ 15,10,000 - ₹ 20,10,000
Job Overview
This role seeks a highly skilled developer with expertise in full-stack development and front-end design to work on our web applications. The selected candidate will be responsible for designing, developing, and maintaining the user interface of our web applications and assist in web deployment. Day-to-day tasks include converting design concepts...