
Data Pipeline Architectural Specialist
3 days ago
Our organization is seeking a visionary Lead Engineer to spearhead the development of our data pipelines.
We require an exceptional individual with hands-on expertise in Python, architecture, and performance optimization.
- Key Responsibilities:
- Platform Engineering:
- Design and implement a shared component library for ingestion, parsing, extraction, validation, enrichment, and publishing.
- Standardization: Define patterns and templates for Apache Beam pipelines and Databricks jobs; standardize configuration, packaging, versioning, CI/CD, and documentation.
- Interoperability: Create pluggable interfaces to enable seamless integration of multiple teams and services without code rewrites.
- Repository Strategy: Establish a repository strategy for each use case.
- Performance Optimization:
- Profiling and Tuning: Profile and tune pipelines end to end: CPU profiling with cProfile, py-spy, or line_profiler; memory profiling with tracemalloc; and CPU-bound vs. I/O-bound analysis.
- Instrumentation: Instrument services with Elastic APM and correlate traces/metrics with Splunk logs; build dashboards and runbooks.
- Concurrency Best Practices: Apply concurrency best practices: asyncio for I/O-bound work, thread pools for blocking calls, process pools for CPU-bound work, plus batching, rate limiting, and retries.
- LLM Rate Limiting: Implement robust rate limiting and governance for LLM APIs: enforce provider tokens-per-minute (TPM) and concurrency caps, queue requests and budget tokens, and emit APM/Splunk metrics (throttle rate, queue depth, cost per job) with alerts.
- SLOs and Alerts: Establish SLOs and alerts for throughput, latency, and error rates; set up dead-letter queues (DLQs) and recovery patterns.
- Team Enablement:
- Mentorship: Mentor junior engineers, lead design reviews, codify best practices, write clear documentation and examples.
- Partnerships: Partner with machine learning engineers on the future LLM/SLM path (evaluation harness, safety/PII, cost/perf).
- 7+ years of experience in Python with a strong focus on performance and concurrency (asyncio, concurrent.futures, multiprocessing), plus profiling and memory tuning.
- Observability expertise: Elastic APM instrumentation and dashboarding; Splunk for logs and correlation; OpenTelemetry familiarity.
- Must have implemented LLM-based solutions and supported them in production.
- API engineering for high-throughput integrations (REST, OAuth2), resilience patterns, and secure handling of sensitive data.
- Strong architecture/design skills: clean interfaces, packaging shared libraries, versioning, CI/CD (GitHub Actions/Azure DevOps), testing.
- 3+ years building large-scale data pipelines with Apache Beam and/or Spark, including hands-on Databricks experience (Jobs, Delta Lake, cluster tuning).
- Document processing: OCR (Tesseract, AWS Textract, Azure Form Recognizer), PDF parsing, text normalization.
- LLM/SLM integration experience (e.g., OpenAI/Azure AI, local SLMs), prompt/eval frameworks, PII redaction/guardrails.
- Cloud and tooling: AWS/Azure/GCP, Dataflow/Flink, Terraform, Docker; cost/performance tuning on Databricks.
- Security/compliance mindset (HIPAA), secrets management, least-privilege access.
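
For candidates who want a concrete picture of the LLM rate-limiting responsibility listed above, the following is a minimal sketch, not the team's actual implementation. It assumes a token-bucket budget for the TPM cap and an asyncio semaphore for the concurrency cap; the names `TokenBudget` and `governed_call` are hypothetical, and the token estimate is assumed to be supplied by the caller.

```python
import asyncio
import time


class TokenBudget:
    """Token bucket that refills at tokens_per_minute / 60 tokens per second."""

    def __init__(self, tokens_per_minute, clock=time.monotonic):
        self.capacity = tokens_per_minute
        self.clock = clock  # injectable so the budget can be tested without sleeping
        self.available = float(tokens_per_minute)
        self.last_refill = clock()

    def _refill(self):
        # Credit the tokens earned since the last check, capped at capacity.
        now = self.clock()
        earned = (now - self.last_refill) * self.capacity / 60.0
        self.available = min(self.capacity, self.available + earned)
        self.last_refill = now

    def try_spend(self, tokens):
        """Deduct tokens if the budget allows it; return True on success."""
        self._refill()
        if tokens <= self.available:
            self.available -= tokens
            return True
        return False

    def seconds_until(self, tokens):
        """Estimate how long to wait before `tokens` can be spent."""
        self._refill()
        deficit = max(0.0, tokens - self.available)
        return deficit * 60.0 / self.capacity


async def governed_call(budget, semaphore, estimated_tokens, call):
    """Wait for token budget and a concurrency slot, then await call()."""
    if estimated_tokens > budget.capacity:
        raise ValueError("request exceeds the per-minute token budget")
    while not budget.try_spend(estimated_tokens):
        await asyncio.sleep(budget.seconds_until(estimated_tokens))
    async with semaphore:  # hypothetical cap on in-flight provider requests
        return await call()
```

In a real service, the wait time and the number of queued callers would be natural places to emit the throttle-rate and queue-depth metrics mentioned above.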
-
Data Pipeline Specialist
3 days ago
Udaipur, Rajasthan, India beBeeFinancial Full time ₹ 20,00,000 - ₹ 25,00,000
We are seeking a highly skilled Big Data Engineer to join our growing team. In this role, you will be responsible for designing, building, and maintaining robust data pipelines that handle high-volume financial data, including stocks, cryptocurrencies, and third-party data sources. You will play a critical role in ensuring data integrity, scalability, and...
-
Data Pipeline Specialist
4 days ago
Udaipur, Rajasthan, India beBeeData Full time ₹ 15,00,000 - ₹ 22,50,000
Unlock Your Potential as a Data Engineer
We are seeking an experienced Data Engineer to drive innovation in our data pipeline team. Your expertise in designing, developing, and running complete end-to-end data pipelines on the Azure platform will be invaluable. As a key member of our team, you will work closely with Agile teams to handle data validation,...
-
Lead Data Pipeline Architect
3 days ago
Udaipur, Rajasthan, India beBeeDataEngineer Full time US$ 1,50,000 - US$ 1,75,000
Job Description: The Senior Data Engineer plays a pivotal role in constructing and optimizing the data pipeline architecture. This individual will lead the optimization and scalability of data delivery pipelines, ensuring performance and reliability for cross-functional stakeholders. This key position is responsible for architecting, building, and maintaining...
-
Data Architectural Specialist
3 days ago
Udaipur, Rajasthan, India beBeeDataEngineer Full time ₹ 80,00,000 - ₹ 1,60,00,000
Data Engineer Role Overview
The role of Data Engineer is pivotal in driving the organization's data strategy forward. Key responsibilities include developing and maintaining scalable data pipelines to support growing data volumes and complexities, collaborating with analytics and business teams to enhance data models that feed business intelligence tools,...
-
Data Pipelines Specialist
4 days ago
Udaipur, Rajasthan, India beBeeData Full time ₹ 20,00,000 - ₹ 25,00,000
Job Overview
Our organization is seeking a highly skilled Data Engineer to join our team. This individual will be responsible for designing, implementing, and managing CI/CD pipelines using GitHub Actions or equivalent tools. They will also configure and maintain GitHub branch protection rules and workflows to ensure smooth code integration and deployment. Key...
-
Senior Data Pipeline Specialist
4 days ago
Udaipur, Rajasthan, India beBeeDataEngineer Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Job Overview
We are seeking an experienced Senior Data Engineer to design and develop scalable data pipelines using Databricks and PySpark. Our ideal candidate will have a strong understanding of distributed data processing, CI/CD pipelines, and infrastructure as code using Terraform. The role involves implementing and managing data solutions on AWS, including...
-
Cloud Data Pipeline Specialist
3 days ago
Udaipur, Rajasthan, India beBeeData Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Job Title: Data Engineer
We are seeking an experienced data engineer with expertise in designing, building, and optimizing scalable data pipelines and cloud solutions. The ideal candidate will have strong experience in AWS cloud services, data engineering, and knowledge of upstream, midstream or downstream processes to support advanced analytics, reporting,...
-
Cloud Data Architecture Specialist
5 days ago
Udaipur, Rajasthan, India beBeeEngineering Full time ₹ 2,00,00,000 - ₹ 2,40,00,000
Big Data Engineering Lead
The ideal candidate will lead a team of talented data engineers to design and deliver scalable data pipelines and analytics platforms, ensuring high performance, reliability, and security in a cloud-native environment. Achieve scalability and high-performance data solutions by leveraging Azure Data Services and Databricks. Work closely...
-
Building Scalable Data Pipelines
2 days ago
Udaipur, Rajasthan, India beBeeDataEngineer Full time ₹ 15,00,000 - ₹ 25,00,000
We are seeking a seasoned Cloud Data Engineer to join our team and take charge of designing and building scalable data pipelines.
About the Role
The ideal candidate will have a strong background in Azure Data Factory, with expertise in ETL/ELT processes and pipeline design. Design and build data pipelines using Azure Data Factory that ensure smooth, secure, and...
-
Chief Data Pipeline Architect
4 days ago
Udaipur, Rajasthan, India beBeeDataEngineer Full time ₹ 15,00,000 - ₹ 25,00,000
Key to operational excellence and growth is the role of a Data Engineer. This individual creates seamless interactions between people and technology, responsible for designing and developing robust data pipelines and ETL processes using data platforms for centralized data warehousing. The ideal candidate will have extensive experience in conceptualizing...