
Senior Data Scraping Engineer
2 weeks ago
Job Summary
We are seeking a highly skilled and experienced Senior Data Scraping Engineer to design, develop, and orchestrate robust web scraping frameworks. The ideal candidate will have 8-10 years of experience in ethical web scraping, including navigating login-protected websites, solving CAPTCHAs, and managing proxies or third-party services. You will be responsible for building scalable, efficient, and compliant scraping pipelines using industry-standard programming languages and tools, ensuring data integrity and adherence to legal and ethical guidelines.
Key Responsibilities
- Framework Development: Design and implement end-to-end web scraping frameworks to extract structured data from diverse web sources, including those requiring authentication (e.g., behind logins).
- CAPTCHA Handling: Develop and integrate solutions to bypass or solve CAPTCHAs (e.g., reCAPTCHA, hCaptcha) using ethical tools, services, or machine learning techniques.
- Proxy & Service Management: Configure and manage proxy services (e.g., rotating proxies, residential proxies) and third-party APIs (e.g., CAPTCHA-solving services) to ensure uninterrupted and anonymous scraping operations.
- Ethical Compliance: Ensure all scraping activities comply with website terms of service, data privacy regulations (e.g., GDPR, CCPA), and industry best practices for ethical data collection.
- Data Quality & Validation: Implement robust data validation and cleaning processes to ensure the accuracy, completeness, and consistency of scraped data.
- Scalability & Optimization: Build scalable scraping pipelines capable of handling large volumes of data with optimized performance, minimal latency, and efficient resource utilization.
- Monitoring & Maintenance: Develop monitoring tools to track scraping performance, detect failures (e.g., IP bans, structural changes in websites), and maintain scraping scripts to adapt to website updates.
- Collaboration: Work closely with data engineers, analysts, and product teams to understand data requirements and deliver high-quality datasets for downstream applications.
- Documentation: Maintain comprehensive documentation for scraping workflows, tools, and processes to ensure transparency and reproducibility.
Required Qualifications
- Experience: 4-10 years of professional experience in web scraping, data extraction, or related fields, with a proven track record of handling complex scraping projects.
- Programming Languages:
- Primary: Proficiency in Python (e.g., Scrapy, BeautifulSoup, Selenium, Requests) for building scraping scripts and frameworks.
- Secondary (Preferred): Familiarity with JavaScript (e.g., Puppeteer, Cheerio) for dynamic website scraping or Go for high-performance tasks.
- Tools & Technologies:
- Scraping Frameworks: Expertise in Scrapy, Selenium, Puppeteer, or equivalent tools for scraping static and dynamic web content.
- CAPTCHA Solutions: Experience with CAPTCHA-solving services (e.g., 2Captcha, Anti-CAPTCHA) or custom ML-based solutions.
- Proxy Management: Hands-on experience with proxy services like Bright Data, Oxylabs, Smartproxy, or ScrapingBee for IP rotation and anonymity.
- Headless Browsers: Proficiency in using headless browsers (e.g., Chrome, Firefox) for scraping JavaScript-heavy websites.
- Databases: Knowledge of SQL (e.g., PostgreSQL, MySQL) and NoSQL (e.g., MongoDB) for storing and querying scraped data.
- Cloud Platforms (Preferred): Familiarity with AWS, GCP, or Azure for deploying scraping pipelines or managing infrastructure.
- Orchestration & Automation:
- Experience with workflow orchestration tools like Apache Airflow, Prefect, or Celery for scheduling and managing scraping tasks.
- Knowledge of containerization (e.g., Docker) and CI/CD pipelines for deploying scraping scripts.
- Ethical & Legal Knowledge: Strong understanding of web scraping ethics, website terms of service, and data privacy regulations (e.g., GDPR, CCPA).
- Problem-Solving: Exceptional ability to troubleshoot issues like IP bans, rate limits, and website structural changes.
- Communication: Strong verbal and written communication skills to collaborate with cross-functional teams and document processes effectively.
Preferred Qualifications
- Experience with machine learning or AI-based techniques for CAPTCHA solving or dynamic content extraction.