
Highly Skilled Web Scraping Professional
4 days ago
This is a senior-level position for a web scraping engineer who will own the crawling/extraction layer end-to-end, ensuring high-quality data extraction and efficient automation of web scraping processes.
Main Responsibilities:
- Design and Develop Crawling Layer: Design an HTTP-first crawler with a Playwright fallback for JS-heavy pages, enabling seamless data extraction from dynamic websites.
- Sitemap Diffing and Incremental Runs: Implement sitemap diffing and conditional GETs for incremental runs, minimizing downtime and ensuring data accuracy.
- Classifier Development: Build a classifier to auto-route HTTP vs Playwright based on HTML length, JSON-LD presence, and data-product markers, streamlining the extraction process.
- Per-Domain Throttles and Normalization: Enforce per-domain throttles/backoff and add URL normalization/canonicalization and de-duplication, preventing data duplication and ensuring consistency.
- PDF Discovery and Integration: Handle PDF discovery & download, apply Playwright browser automation resource budgets, and integrate third-party APIs as first-class sources, expanding data collection capabilities.
- Automation and Orchestration: Own automation & orchestration for scheduled runs, idempotent retries, and alerting, ensuring smooth operation and timely issue resolution.
- Observability and Maintenance: Ship observability: per-site field coverage, error rates, retries, avg page time, and PDF success, maintaining a high level of data quality and system reliability. Maintain allow/deny paths and adhere to robots.txt and Terms of Service, ensuring compliance and preventing data loss.
- Technical Expertise: 4+ years of Python experience, including 2+ years building production web crawlers at scale, with strong skills in Scrapy or aiohttp/asyncio and Playwright (or Puppeteer) in production.
- Practical Knowledge: Practical proxy management, polite anti-bot tactics, and per-domain rate limiting, as well as hands-on experience with ETag/Last-Modified, retries, backoff, and HTTP caching.
- CSS/XPath and API Experience: Confident with CSS/XPath, schema.org/JSON-LD, and HTML parsing, as well as APIs: consuming REST/GraphQL and building small internal services.
- Automation Tools: Airflow/Temporal/Celery for scheduled runs and monitoring; comfortable with logs/metrics, with basic knowledge of Docker and Linux.
- Communication Skills: Clear, pragmatic communication and strong ownership, ensuring effective collaboration and issue resolution.
- Additional Skills: Go or Node.js experience for high-performance crawlers, cloud experience with AWS/GCP, S3, ECS/Kubernetes, and IaC basics.
- Workflow Engines and Data Tooling: Workflow engines (Airflow/Temporal/Argo/Celery), document extraction (Textract/Tika/Camelot/Tabula), search/analytics (Elasticsearch/OpenSearch), and warehousing (Snowflake/Postgres).
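For candidates gauging the scope of the role, the routing classifier and URL normalization mentioned in the responsibilities could look roughly like the Python sketch below. It is only an illustration under stated assumptions, not the team's actual implementation: the names `choose_fetcher` and `normalize_url`, the 2000-character threshold, the tracking-parameter list, and the rule "use HTTP only when the page is long enough and carries JSON-LD or a data-product attribute" are all hypothetical.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical values; a real crawler would tune these per site.
MIN_HTML_LENGTH = 2000
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"}

JSON_LD_RE = re.compile(
    r'<script[^>]+type=["\']application/ld\+json["\']', re.IGNORECASE
)
PRODUCT_MARKER_RE = re.compile(r"\bdata-product[\w-]*\s*=", re.IGNORECASE)


def choose_fetcher(html: str) -> str:
    """Return "http" if the static HTML looks complete, else "playwright"."""
    has_json_ld = bool(JSON_LD_RE.search(html))
    has_product_marker = bool(PRODUCT_MARKER_RE.search(html))
    if len(html) >= MIN_HTML_LENGTH and (has_json_ld or has_product_marker):
        return "http"
    # Short pages or pages without structured markers are assumed to need JS.
    return "playwright"


def normalize_url(url: str) -> str:
    """Lowercase scheme/host, drop default ports, fragments and tracking
    parameters, and sort the remaining query parameters."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    is_default_port = (scheme == "http" and port == 80) or (
        scheme == "https" and port == 443
    )
    if port and not is_default_port:
        host = f"{host}:{port}"
    path = parts.path or "/"
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    query = urlencode(sorted(params))
    return urlunsplit((scheme, host, path, query, ""))


def dedupe(urls):
    """Yield each URL once, keyed by its normalized form."""
    seen = set()
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            yield key
```

In this sketch, `list(dedupe([...]))` collapses URLs that differ only in case, default port, fragment, parameter order, or tracking parameters into a single entry. The sketch ignores userinfo and IPv6 hosts, which a production normalizer would need to handle.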
-
Nashik, Maharashtra, India beBeeScraping Full time ₹ 15,00,000 - ₹ 20,00,000
About Us
We are seeking a highly skilled Python Developer to join our team. The ideal candidate will have strong expertise in Scrapy, web scraping and automation. This role involves designing, developing and maintaining large-scale scraping solutions that power data-driven decision-making.
-
Data Mining Automation Specialist
5 days ago
Nashik, Maharashtra, India beBeeDataAnalyst Full time ₹ 20,00,000 - ₹ 25,00,000
Web Data Analyst
We are seeking a skilled Web Data Analyst to join our team. As a key member of our data analytics department, you will be responsible for designing and developing robust web scraping solutions to extract structured and unstructured data from various websites and APIs.
Key Responsibilities:
Design, develop, and maintain efficient and scalable...
-
Highly Skilled Web Developer Wanted
1 week ago
Nashik, Maharashtra, India beBeeSoftware Full time ₹ 12,00,000 - ₹ 25,00,000
Senior Software Engineer Position
We are seeking an experienced Senior Software Engineer to join our team. In this role, you will be integral in both front-end and back-end development for cutting-edge web applications using .NET Core and React JS.
Key Responsibilities:
Develop and maintain high-quality web applications using .NET Core for backend and React JS...
-
Senior Data Analyst
1 day ago
Nashik, Maharashtra, India beBeeData Full time ₹ 18,00,000 - ₹ 25,00,000
Data Analytics Specialist
We are seeking a highly skilled Data Analytics Specialist with expertise in data engineering, analysis, and visualization. The ideal candidate will be responsible for designing and implementing efficient data pipelines, performing web scraping, and generating actionable insights through dashboards and reports.
Key...
-
Lead Data Extraction Specialist
2 weeks ago
Nashik, Maharashtra, India beBeeData Full time ₹ 1,50,00,000 - ₹ 2,00,00,000
Lead Data Extraction Specialist
We are seeking a highly skilled and experienced professional to lead our data extraction efforts. The ideal candidate will have a minimum of 4 years of hands-on experience in IT scraping, with at least 2 years leading a team of 5+ developers.
About the Role:
The successful candidate will be responsible for designing and...
-
Innovative Web Data Specialist
6 days ago
Nashik, Maharashtra, India beBeeData Full time ₹ 10,00,000 - ₹ 14,00,000
Web Data Specialist
We're seeking a passionate and driven expert to join our tech-driven startup dedicated to solving fraud detection and prevention challenges. This role is ideal for someone eager to own the entire data collection process, who thrives on early-stage challenges, and loves building innovative, scalable solutions from day zero.
Key...
-
Advanced Data Engineer
2 weeks ago
Nashik, Maharashtra, India beBeeData Full time ₹ 8,00,000 - ₹ 15,00,000
Job Description
The ideal candidate will have expertise in automating data extraction processes from web platforms, with strong proficiency in Python and experience with web scraping frameworks like Selenium, Scrapy, and BeautifulSoup.
Main Responsibilities:
Design, develop, and maintain robust web scraping solutions to extract structured and unstructured data...
-
Highly Skilled SRE Professional
2 weeks ago
Nashik, Maharashtra, India beBeeSreEngineer Full time ₹ 15,00,000 - ₹ 25,00,000
Job Overview
We are seeking a skilled SRE Engineer to join our team. The successful candidate will have 7-10 years of experience in IT and be responsible for providing production, operations support and application administration to business and web applications. The application environment is primarily based on Microsoft technologies, including intranet and...
-
Senior Data Extraction Specialist
1 week ago
Nashik, Maharashtra, India beBeeWebCrawler Full time ₹ 5,00,000 - ₹ 10,00,000
Key Responsibilities
- Maintain and enhance existing web scraping and data crawling projects.
- Develop and refine crawlers using Python-based tools and frameworks.
- Utilize browser automation tools to handle dynamic content.
- Clean, validate, and integrate extracted data into downstream storage systems.
- Implement and manage solutions for anti-bot measures.
- Optimize...
-
Nashik, Maharashtra, India beBeeDevelopment Full time ₹ 6,00,000 - ₹ 8,00,000
Web Development Opportunity
We are seeking an experienced Web Developer to join our organization at Placement India. The successful candidate will have a strong understanding of HTML, CSS, and JavaScript, as well as experience with server-side languages. The ideal candidate will have excellent problem-solving skills, be able to communicate effectively, and be...