
High-Throughput Product Data Ingestion Engineer
13 hours ago
We are seeking a seasoned expert to design and develop a high-throughput product data ingestion pipeline across multiple domains.
- Create an HTTP-first crawler with a Playwright fallback for complex web pages.
- Implement sitemap diffing and conditional GETs (ETag/Last-Modified) for efficient incremental runs.
- Develop a lightweight classifier to automatically route HTTP vs Playwright requests.
- Enforce per-domain throttles/backoff (2–4 concurrent requests per domain).
- Add URL normalization/canonicalization and de-duplication.
- Handle PDF discovery & download.
- Integrate third-party APIs as first-class sources.
- Own automation & orchestration for scheduled runs, idempotent retries, and alerting.
- Ship observability metrics.
- 4+ years Python experience, including 2+ years building production web crawlers at scale.
- Strong proficiency with Scrapy or aiohttp/asyncio and Playwright in production.
- Practical proxy management, polite anti-bot tactics, and per-domain rate limiting.
- Hands-on with ETag/Last-Modified, retries, backoff, and HTTP caching.
- Confident with CSS/XPath, schema.org/JSON-LD, and HTML parsing.
- APIs: consuming REST/GraphQL and building small internal services.
- Automation/Orchestration: Airflow/Temporal/Celery for scheduled runs and monitoring.
- Go or Node.js experience for high-performance crawlers.
- Cloud: AWS/GCP, S3, ECS/Kubernetes; IaC basics.
- Workflow engines: Airflow/Temporal/Argo/Celery.
- Document extraction: Textract/Tika/Camelot/Tabula.
- Search/analytics: Elasticsearch/OpenSearch; warehousing (Snowflake/Postgres).
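
For a sense of the URL normalization/canonicalization and de-duplication work listed above, a minimal Python sketch might look like the following. It is illustrative only and not taken from this team's codebase: the names `normalize_url`, `dedupe`, and `TRACKING_PARAMS` are hypothetical, and the choice of which query parameters to drop is an assumption that would need tuning per domain.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of tracking parameters to strip; adjust per domain.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign",
    "utm_term", "utm_content", "gclid", "fbclid",
}
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` suitable for use as a dedup key.

    Assumptions in this sketch: userinfo is dropped, fragments are dropped,
    and IPv6 literal hosts are not handled.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it is not the default for the scheme.
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    # Drop tracking parameters and sort the rest so ordering does not matter.
    pairs = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in TRACKING_PARAMS
    )
    return urlunsplit((scheme, netloc, path, urlencode(pairs), ""))


def dedupe(urls):
    """Yield-order-preserving de-duplication on the normalized form."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

In a real crawler the `seen` set would usually live in a persistent store so that incremental runs can skip URLs already fetched.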
-
Big Data Ingestion Specialist
2 weeks ago
Amrāvati, Maharashtra, India beBeeDataEngineer Full time ₹ 20,00,000 - ₹ 25,00,000
Azure Databricks is a leading platform for big data analytics and machine learning. We are seeking an experienced professional to join our team and work on building scalable and efficient data ingestion pipelines. The ideal candidate will have hands-on experience with data pipelines, automation, and infrastructure management to support the integration of...
-
Data Engineer
2 weeks ago
Amrāvati, Maharashtra, India beBeeDataEngineer Full time ₹ 1,20,00,000 - ₹ 1,80,00,000
Unlock Your Potential as a Data Engineer
We're seeking an experienced and innovative Data Engineer to join our team. As a key member of our data analytics platforms, you will be responsible for developing and deploying scalable data processing pipelines using Azure Data Factory, Azure Databricks, and other relevant technologies. Your primary focus will be on...
-
Principal Data Architect
4 days ago
Amrāvati, Maharashtra, India beBeeData Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Key Roles and Responsibilities: Design, build, and operate scalable data infrastructure to support distributed computing and data orchestration for large language model research. Develop high-throughput systems for data ingestion, processing, and transformation to support model development. Collaborate with research teams to deliver critical data assets for...
-
High-Performance Data Architect
1 week ago
Amrāvati, Maharashtra, India beBeeDataEngineer Full time US$ 2,00,000 - US$ 2,50,000
About Us: We are a leading provider of innovative telenutrition and foodcare solutions. Our platform is guided by a robust network of Registered Dietitians, empowering members to make informed decisions about their diet and lifestyle. Our mission is to make nutritious food accessible and affordable for everyone, regardless of economic status. We achieve this...
-
Advanced Data Scientist
2 days ago
Amrāvati, Maharashtra, India beBeeData Full time ₹ 1,00,00,000 - ₹ 2,00,00,000
Senior Data Engineer
We're seeking a skilled data engineer to design, build, and optimize scalable data pipelines.
Key Responsibilities:
Design and Build Scalable Data Pipelines: Develop efficient and reliable data pipelines that process large datasets using Apache Spark on the cloud.
Implement Data Ingestion, Transformation, and Integration Solutions: Utilize...
-
Chief Technology Architect
13 hours ago
Amrāvati, Maharashtra, India beBeeEngineering Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Job Title: "Head of Engineering"
We are a forward-thinking organization in prop-commerce, aiming to transform complex problems into robust systems with real users. Over the next 12–18 months, we seek a Founding Engineer who excels at building end-to-end solutions: designing data models, writing production code, and shipping fast.
Key Deliverables: Artificial...
-
Principal Software Developer
1 week ago
Amrāvati, Maharashtra, India beBeeDataEngineering Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Job Overview: We are seeking an experienced Lead Python Engineer to lead the development of a shared component library for data pipelines. The successful candidate will have hands-on experience with Apache Beam and Databricks, as well as strong expertise in performance and concurrency.
Key Responsibilities: Design and build a shared component library/SDK for...
-
Data Engineering Specialist
2 days ago
Amrāvati, Maharashtra, India beBeeDataEngineer Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Job Title: Data Engineer
We are seeking a skilled Data Engineer with expertise in big data processing, cloud-based analytics, and advanced visualization technologies. The ideal candidate will have hands-on experience with Apache Airflow, Apache Spark, Azure data services, Python programming, and emerging technologies like IoT and AR/VR/MR. You will be...
-
Architecting High-Performance Data Solutions
2 weeks ago
Amrāvati, Maharashtra, India beBeeData Full time ₹ 15,00,000 - ₹ 20,00,000
Job Title: Senior Data Engineer
Role Summary: We are seeking a skilled Senior Data Engineer to join our team and contribute to the development of cutting-edge data pipelines, streaming platforms, and cloud-based infrastructures.
Key Responsibilities: Design and implement ETL pipelines for large-scale data ingestion and transformation. Build real-time streaming...
-
Data Engineering Lead
14 hours ago
Amrāvati, Maharashtra, India beBeeData Full time ₹ 1,50,00,000 - ₹ 2,10,00,000
Imagine contributing to the modernization of a large financial institution's platform.
About This Opportunity: We are seeking an experienced data engineer to lead the architecture, ingestion and ETL process modernization efforts. This role will involve designing and implementing data pipelines, orchestrating workflows using Apache Airflow, processing and...