Data Engineering Manager – Web Crawling
2 weeks ago
Aimleap
India (Remote)
Posted on November 29, 2025
AIMLEAP is Hiring:
Data Engineering Manager – Web Crawling & Pipeline Architecture
Experience: 7 to 12 Years
Location: Remote / Bangalore
Engagement: Full-time
Positions: 2
Qualification: B.E / B.Tech / M.Tech / MCA / Computer Science / IT
Industry: IT / Data / AI / E-commerce / FinTech / Healthcare
Notice Period: Immediate
What We Are Looking For:
- Proven experience leading data engineering teams with strong ownership of web crawling systems and pipeline architecture.
- Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.
- Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.
- Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.
- Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR/CCPA-safe crawling).
- Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.
Key Responsibilities:
- Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.
- Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.
- Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.
- Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.
- Define and enforce data quality, validation, and security measures across all data flows and pipelines.
- Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.
- Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.
- Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS/GCP/Azure.
- Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.
Qualifications:
- Bachelor's or Master's degree in Engineering, Computer Science, or a related field.
- 7–12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.
- Strong expertise in Python, SQL, and modern data processing practices.
- Experience working with Airflow, Celery, or similar workflow automation tools.
- Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.
- Hands-on experience with cloud data platforms (AWS/GCP/Azure).
- Experience with AI/LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).
- Strong analytical, architectural, and leadership skills.
About AIMLEAP:
AIMLEAP is an ISO 9001:2015 and ISO/IEC 27001:2013 certified global technology consulting and service provider offering Digital IT, AI-augmented Data Solutions, Automation, and Research & Analytics Services.
AIMLEAP has been recognized as a 'Great Place to Work'. With a focus on an AI and automation-first approach, our services include end-to-end IT application management, Mobile App Development, Data Management, Data Mining Services, Web Data Scraping, Self-serving BI reporting solutions, Digital Marketing, and Analytics solutions.
We started in 2012 and have successfully delivered projects in IT & digital transformation, automation-driven data solutions, and digital marketing for more than 750 fast-growing companies in the USA, Europe, New Zealand, Australia, Canada, and more.
– ISO 9001:2015 and ISO/IEC 27001:2013 certified
– Served 750+ customers
– 12+ Years of industry experience
– 98% Client Retention
– Great Place to Work Certified
– Global Delivery Centers in the USA, Canada, India & Australia.