
High Performance Data Pipeline Architect
2 weeks ago
Top Talent Sought for High-Performance Data Engineer Role
Job Description:
The company is looking for an experienced engineer to drive the architecture and development of data pipelines. The ideal candidate will have a strong background in performance and concurrency, with expertise in profiling and memory tuning.
Key Responsibilities:
- Architecture and Reuse:
  - Design and build a shared component library/SDK for ingestion, parsing/OCR, extraction, validation, enrichment, and publishing.
  - Define patterns/templates for Apache Beam pipelines and Databricks jobs; standardize configuration, packaging, versioning, CI/CD, and documentation.
  - Create pluggable interfaces so multiple teams can swap extractors (Regex/LLM), OCR providers, and EMR publishers without code rewrites.
  - Define the repo strategy: shared and child repos for each use case.
- Performance and Reliability:
  - Own end-to-end profiling and tuning: CPU profiling (cProfile, py-spy, line_profiler), memory profiling (tracemalloc), and CPU vs I/O analysis.
  - Instrument services with Elastic APM and correlate traces/metrics with Splunk logs; build dashboards and runbooks.
  - Implement concurrency best practices: asyncio for I/O-bound work, thread pools for blocking I/O, process pools for CPU-bound work, plus batching, rate limiting, retries, etc.
  - Implement robust LLM API rate limiting/governance: enforce provider TPM (tokens per minute) and concurrency caps, request queueing/token budgeting, and emit APM/Splunk metrics (throttle rate, queue depth, cost per job) with alerts.
  - Establish SLOs/alerts for throughput, latency, and error rates; set up dead-letter queues (DLQs) and recovery patterns.
- Team Enablement:
  - Mentor developers, lead design reviews, codify best practices, and write clear docs and examples.
  - Partner with ML engineers on the future LLM/SLM path (evaluation harness, safety/PII, cost/perf).
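For candidates gauging the level of the role, the LLM rate-limiting responsibility above could look roughly like the sketch below. It is only an illustration, not the team's implementation: `TokenBudgetLimiter`, its parameters, and `fake_llm_call` are hypothetical names, token counts are assumed to be estimated by the caller, and the budget window is shortened from a minute to a fraction of a second so the demo finishes quickly.

```python
import asyncio
import time


class TokenBudgetLimiter:
    """Hypothetical limiter: caps in-flight calls and tokens per rolling window."""

    def __init__(self, tokens_per_window, max_concurrency, window_s=60.0):
        self.tokens_per_window = tokens_per_window
        self.window_s = window_s
        self._slots = asyncio.Semaphore(max_concurrency)
        self._lock = asyncio.Lock()
        self._spent = []      # (timestamp, tokens) reservations inside the window
        self.throttled = 0    # requests that had to wait for budget (a metric to export)

    async def _reserve(self, tokens):
        if tokens > self.tokens_per_window:
            raise ValueError("request exceeds the whole window budget")
        counted = False
        while True:
            async with self._lock:
                now = time.monotonic()
                # Drop reservations that have aged out of the window.
                self._spent = [(t, n) for t, n in self._spent if now - t < self.window_s]
                if sum(n for _, n in self._spent) + tokens <= self.tokens_per_window:
                    self._spent.append((now, tokens))
                    return
                if not counted:
                    self.throttled += 1
                    counted = True
                # Sleep until the oldest reservation expires, then re-check.
                wait = self.window_s - (now - self._spent[0][0])
            await asyncio.sleep(max(wait, 0.01))

    async def run(self, tokens, call):
        """Reserve `tokens`, then await `call()` under the concurrency cap."""
        await self._reserve(tokens)
        async with self._slots:
            return await call()


async def demo():
    # Build the limiter inside the running loop so its primitives bind to it.
    limiter = TokenBudgetLimiter(tokens_per_window=100, max_concurrency=2, window_s=0.2)

    async def fake_llm_call(i):
        await asyncio.sleep(0.01)   # stands in for a provider request
        return f"result-{i}"

    results = await asyncio.gather(
        *(limiter.run(60, lambda i=i: fake_llm_call(i)) for i in range(3))
    )
    return results, limiter.throttled


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the demo, each request asks for 60 of the 100 tokens available per window, so the second and third requests wait for budget before running. A production version would also need to export `throttled` and queue depth to APM/Splunk and handle retries, which this sketch leaves out.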
Required Skills and Qualifications:
- 7+ years Python with strong depth in performance and concurrency (asyncio, concurrent.futures, multiprocessing), profiling and memory tuning.
- Observability expertise: Elastic APM instrumentation and dashboarding; Splunk for logs and correlation; OpenTelemetry familiarity.
- Must have implemented LLM-based solutions and supported them in production.
- API engineering for high-throughput integrations (REST, OAuth2), resilience patterns, and secure handling of sensitive data.
- Strong architecture/design skills: clean interfaces, packaging shared libs, versioning, CI/CD (GitHub Actions/Azure DevOps), testing.
- 3+ years building large-scale data pipelines with Apache Beam and/or Spark, including hands-on Databricks experience (Jobs, Delta Lake, cluster tuning).
- Document processing: OCR (Tesseract, AWS Textract, Azure Form Recognizer), PDF parsing, text normalization.
- LLM/SLM integration experience (e.g., OpenAI/Azure AI, local SLMs), prompt/eval frameworks, PII redaction/guardrails.
- Cloud and tooling: AWS/Azure/GCP, Dataflow/Flink, Terraform, Docker; cost/performance tuning on Databricks.
- Security/compliance mindset (HIPAA), secrets management, least-privilege access.
Benefits:
- Competitive salary and benefits package
- Culture focused on talent development with quarterly promotion cycles and company-sponsored higher education and certifications
- Opportunity to work with cutting-edge technologies
- Employee engagement initiatives such as project parties, flexible work hours, and Long Service awards
- Annual health check-ups
- Insurance coverage: group term life, personal accident, and Mediclaim hospitalization for self, spouse, two children, and parents
Values-Driven, People-Centric & Inclusive Work Environment:
Persistent Ltd. is dedicated to fostering diversity and inclusion in the workplace. We invite applications from all qualified individuals, including those with disabilities, and regardless of gender or gender preference. We welcome diverse candidates from all backgrounds.
Let's unleash your full potential at Persistent - persistent.com/careers
-
Data Pipeline Architect
2 weeks ago
Rajahmundry, Andhra Pradesh, India beBeeDataPipeline Full time ₹ 20,00,000 - ₹ 25,00,000
Key Role: Data Pipeline Architect
We are seeking a skilled professional to design and build robust data pipelines that efficiently curate and ingest data.
Main Responsibilities:
- Design, implement, and maintain large-scale data pipelines for seamless data workflows.
- Collaborate with cross-functional teams to ensure data quality, security, and...
-
Data Pipelines Architect
2 weeks ago
Rajahmundry, Andhra Pradesh, India beBeeMachineLearning Full time ₹ 1,50,00,000 - ₹ 2,00,00,000
Job Title: Data Pipelines Architect
Job Description:
We're evolving into an AI-native platform where intelligent agents predict churn, upsell opportunities and automate member engagement. The right candidate will design and build robust data pipelines to transform raw events into usable features. He/she will also develop and train machine learning models using...
-
Chief Data Pipeline Architect
7 days ago
Rajahmundry, Andhra Pradesh, India beBeeDataEngineering Full time ₹ 12,00,000 - ₹ 19,00,000
Unlock Your Data Engineering Potential
As a global leader in unified commerce platforms, we deliver world-class software solutions for the food and beverage industry. We are seeking an experienced Data Engineer to design and optimize our distributed data pipelines, infrastructure, and tools that power insights across our business.
Design, build, and maintain...
-
Senior Data Pipeline Architect
3 days ago
Rajahmundry, Andhra Pradesh, India beBeeETLD Full time ₹ 18,00,000 - ₹ 25,20,000
Job Description:
We are seeking an experienced ETL Developer to join our data engineering team. The successful candidate will work on building scalable and efficient data pipelines using IBM DataStage (on Cloud Pak for Data), AWS Glue, and Snowflake. As a key member of the team, you will collaborate with architects, business analysts, and data modelers to...
-
Data Architect Specialist
1 week ago
Rajahmundry, Andhra Pradesh, India beBeeData Full time ₹ 20,00,000 - ₹ 30,00,000
Job Title: Data Architect Specialist
A data architect specialist is required to design and implement scalable, high-performance data systems for enterprise-wide consumption.
Design, develop, and maintain robust data pipelines and ETL processes using cloud-based data platforms for centralized data storage.
Create or contribute to frameworks that improve logging...
-
Senior Data Engineer (Lead)
2 weeks ago
Rajahmundry, Andhra Pradesh, India beBeeData Full time ₹ 15,00,000 - ₹ 25,00,000
Lead Data Architect Role
We are seeking a highly skilled Lead Data Architect to join our team. The ideal candidate will have expertise in designing and developing scalable data pipelines using Apache Spark, AWS Glue, and Azure Data Factory. Strong collaboration skills are essential for working with cross-functional teams to build data lake and data warehouse...
-
Senior Data Pipeline Architect
6 days ago
Rajahmundry, Andhra Pradesh, India beBeeDataEngineer Full time ₹ 18,00,000 - ₹ 20,25,000
Job Overview
A comprehensive data engineering role involves designing, developing and maintaining robust data pipelines and ETL processes. The ideal candidate will conceptualize and own the build out of problem-solving data marts for consumption by data science and BI teams. They will create or contribute to frameworks that improve the efficacy of logging data...
-
Data Pipeline Specialist
7 days ago
Rajahmundry, Andhra Pradesh, India beBeeDataEngineering Full time ₹ 12,00,000 - ₹ 25,00,000
Senior Data Engineering Position
As a skilled data engineer, you will contribute to the growth and success of our organization by leveraging your expertise in data pipeline development, integration, and optimization.
About the Role:
We are seeking an experienced professional with strong PySpark and SQL skills for data manipulation and querying. The ideal...
-
Cloud Architect
2 weeks ago
Rajahmundry, Andhra Pradesh, India beBeeDataOps Full time ₹ 15,00,000 - ₹ 25,00,000
Senior AWS Glue Developer
As a senior-level developer, you will play a pivotal role in crafting serverless data pipelines using AWS services. Your expertise will empower clients to transition from traditional ETL methodologies to more modern ELT approaches, aligning with industry best practices. You will be responsible for designing and implementing...
-
Chief Data Architect
5 days ago
Rajahmundry, Andhra Pradesh, India beBeeData Full time ₹ 18,00,000 - ₹ 25,00,000
Senior Data Engineer Role
We are seeking a skilled Senior Data Engineer to drive business strategies through robust data pipelines.
Key Responsibilities:
- Develop and refine complex ETL pipelines for high-volume data processing
- Ensure data precision, consistency, and system performance
- Collaborate with architects, analysts, and stakeholders to deliver impactful...