Technical Lead – Web Crawling Systems, Data Pipelines

2 weeks ago

Bangalore, India AIMLEAP Full time

Experience: 7 to 12 Years
Location: Remote / Bangalore
Engagement: Full-time
Positions: 2
Qualification: B.E / B.Tech / M.Tech / MCA / Computer Science / IT
Industry: IT / Data / AI / E-commerce / FinTech / Healthcare
Notice Period: Immediate

What We Are Looking For

Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR/CCPA-safe crawling).
Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
Define and enforce data quality, validation, and security measures across all data flows and pipelines.
Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS/GCP/Azure.
Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

Bachelor's or Master's degree in Engineering, Computer Science, or a related field.
7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
Strong expertise in Python, SQL, and modern data processing practices.
Experience working with Airflow, Celery, or similar workflow automation tools.
Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
Hands-on experience with cloud data platforms (AWS/GCP/Azure).
Experience with AI/LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
Strong analytical, architectural, and leadership skills.

Web Crawling Specialist

4 days ago

Bangalore, India beBeeWebCrawler Full time

Job Opportunity: Web Crawling Specialist. We are seeking an experienced Web Crawling Specialist to join our team and contribute to the development of innovative web crawling solutions. The ideal candidate will have strong expertise in Python programming, particularly in web scraping frameworks such as Scrapy. Experience with browser automation tools like...

Web Crawling Engineer

3 days ago

Bangalore, India Forage AI Full time

We are seeking a Web Crawling Engineer who will be responsible for building and maintaining web crawlers, extracting valuable insights from the web, and ensuring data quality. The ideal candidate will have strong Python programming skills and experience in web scraping frameworks, browser automation tools, and handling anti-scraping mechanisms. Salary budget...

Senior Data Engineering Lead

2 weeks ago

Bangalore, India beBeeData Full time

Job Summary: We are seeking a seasoned professional to lead our data engineering and web crawling teams. The ideal candidate will have extensive experience in architecting, implementing, and optimizing scalable data pipelines. A robust background in building and maintaining crawling systems is essential. The ability to establish pipeline orchestration using...

Expert Web Data Extractor

1 week ago

Bangalore, India beBeeWebCrawling Full time

Web Crawling Developer. We are seeking a skilled professional to build and maintain web crawlers, extract valuable insights from the web, and ensure data quality. The ideal candidate has strong programming skills in Python and experience with web scraping frameworks, browser automation tools, and handling anti-scraping mechanisms. Maintain and enhance existing...

Senior Software Engineer

6 days ago

Bangalore, India Crest Data Systems Full time

Company Overview: Crest Data is a leading provider of data center solutions and engineering/marketing services in the areas of Networking/SDN, Storage, Security, Virtualization, Cloud Computing, and Big Data / Data Analytics. The team has extensive experience in building and deploying various Data Center products from Cisco, VMware, NetApp, Amazon AWS, EMC,...

Lead Data Pipeline Engineer

2 weeks ago

Bangalore, India Persistent Systems Full time

About Position: We are seeking a highly skilled and experienced Senior Data Engineer to join our growing data team. The ideal candidate will have deep expertise in SQL/PLSQL development, AWS cloud services, DBT (Data Build Tool), Python, and Snowflake. You will be responsible for designing, building, and maintaining scalable data pipelines and...

Data Engineering Lead

2 weeks ago

Bangalore, India beBeeTechnical Full time

Job Title: A technical architect with hands-on coding skills is required to join our organization. The ideal candidate will have expertise in data/content crawling from the public internet, including bot detection avoidance techniques and VPN use. Key responsibilities include: developing ETL processes using cloud-native technologies; architecting and implementing...

Senior Web Crawler Specialist

2 weeks ago

Bangalore, India beBeeEngineer Full time

We are seeking a skilled engineer to build and maintain web crawlers, extract valuable insights from the web, and ensure data quality. The ideal candidate will have strong programming skills and experience in scraping frameworks, browser automation tools, and handling anti-scraping mechanisms. Key Responsibilities: Maintain and enhance existing web scraping...

Data Pipeline Specialist

4 days ago

Bangalore, India beBeeDataSupportEngineer Full time

Job Role: The Data Support Engineer serves as the primary point of contact for data ingestion, pipeline, and consumption issues. Ensuring high system stability through proactive monitoring and root-cause analysis is a key priority. Key Responsibilities: Support escalation point for issues with data ingestion, consumption, and pipelines. Proactive monitoring:...

Technical Lead

2 days ago

Bangalore, India Bharti AXA Life Insurance Full time

We at Bharti AXA Life Insurance are looking for candidates who are interested in joining a fast-paced company with plenty of growth opportunity. We are hiring for a Tech Lead in our Information Technology department. The roles and responsibilities are as follows: Tech Lead - IFRS Data Transformation. The Tech Lead will be responsible for the technical...
