Data Engineering Manager – Web Crawling

3 weeks ago


Mumbai, India AIMLEAP Full time

Data Engineering Manager – Web Crawling & Pipeline Architecture

Experience: 7 to 12 Years
Location: Remote / Bangalore
Engagement: Full-time
Positions: 2
Qualification: B.E / B.Tech / M.Tech / MCA / Computer Science / IT
Industry: IT / Data / AI / E-commerce / FinTech / Healthcare
Notice Period: Immediate

What We Are Looking For

- Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
- Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
- Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
- Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
- Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR/CCPA-safe crawling).
- Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities

- Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
- Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
- Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
- Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
- Define and enforce data quality, validation, and security measures across all data flows and pipelines.
- Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
- Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
- Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS/GCP/Azure.
- Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications

- Bachelor's or Master's degree in Engineering, Computer Science, or a related field.
- 7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
- Strong expertise in Python, SQL, and modern data processing practices.
- Experience working with Airflow, Celery, or similar workflow automation tools.
- Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
- Hands-on experience with cloud data platforms (AWS/GCP/Azure).
- Experience with AI/LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
- Strong analytical, architectural, and leadership skills.


  • Web Crawling Engineer

    4 weeks ago


    Mumbai, India Forage AI Full time

    We are seeking a Web Crawling Engineer who will be responsible for building and maintaining web crawlers, extracting valuable insights from the web, and ensuring data quality. The ideal candidate will have strong Python programming skills and experience in web scraping frameworks, browser automation tools, and handling anti-scraping mechanisms. Salary budget...


  • Web Crawling Specialist

    Mumbai, India beBeeData Full time

    We are seeking a highly skilled web crawling specialist to join our team. This role requires expertise in building and maintaining complex web crawlers, extracting valuable insights from the web, and ensuring data quality. The ideal candidate will have strong programming skills in Python, experience with web scraping frameworks such as...


  • Web Crawling Specialist

    Mumbai, India beBeeCrawling Full time

    We are looking for a skilled expert to develop and maintain web crawlers, extract valuable insights from the web, and ensure data quality. About the Position: Design and build high-performance web crawlers using Python-based tools and frameworks. Utilize browser automation tools (e.g., Playwright, Selenium) to handle dynamic content...

  • Web Performance

    2 days ago


    Mumbai, Maharashtra, India arrivia Full time

    Are you a developer who sees search engines as the ultimate API? Do you thrive on optimizing the code and architecture of major websites to drive massive organic traffic? We are searching for a Web Performance & SEO Developer to join our global Performance Marketing team. This is a critical hybrid role where you use your developer expertise (.NET, PHP,...


  • SEO - Associate Manager

    Mumbai, India GoodSpace AI Full time

    Responsibilities: The candidate is responsible for increasing organic visits to our client's website and making sure that the goals are achieved. The candidate needs to check information available from various tools and ensure that the suggestions are implemented to increase the number of organic keywords and to get better ranking for...


  • Senior Software Engineer (Puppeteer/React/NodeJS)

    Mumbai, Maharashtra, India Nexio Full time

    Bachelor's degree in Computer Science or similar engineering degree required, with 5+ years of relevant work experience. Qualifications: Hands-on and proficient in ReactJS, HTML5, CSS3, and JavaScript. Experience in creating responsive web applications and cross-browser compatibility. Proven experience with...


  • Mumbai, India beBeeAutomation Full time

    Job Opening: We are seeking an innovative and hands-on Full Stack Developer who will help us build internal tools, management consoles, and automation pipelines, enabling self-service capabilities and more efficient operations. You will play a critical role in systemizing crawler lifecycle management, such as launching new crawling instances, monitoring...