Airflow Orchestration

3 days ago


India Lemongrass Consulting Full time

**Vacancy No**

VN1197

**Business Unit**

EMEA

**Job Location**

India/Philippines

**Employment Type**

Full Time

**Job Details and Responsibilities**

**Key Responsibilities**:

**Workflow Migration**:

- Analyze and convert Oozie workflows into Airflow DAGs using Python-based orchestration.
- Design and implement reusable, modular, and optimized Airflow pipelines for data ingestion, transformation, and orchestration.
- Maintain a one-to-one mapping between legacy workflows and Airflow DAGs, ensuring no data loss or business interruption.

**Cloud Data Migration**:

- Collaborate with data engineering teams to migrate Cloudera Hadoop workloads to Databricks on AWS.
- Leverage Airflow for scheduling and orchestrating data workflows on AWS-based services (e.g., S3, EMR, Glue, Redshift).

**Pipeline Optimization**:

- Optimize data ingestion pipelines to achieve high throughput and low latency on AWS cloud infrastructure.
- Integrate Airflow workflows with Databricks for data transformations and analytics.

**Error Handling and Monitoring**:

- Implement robust error-handling mechanisms, task retries, and alerting within Airflow workflows.
- Set up monitoring dashboards using tools like CloudWatch, Prometheus, or Airflow’s built-in features.

**Collaboration and Documentation**:

- Work closely with data architects, cloud engineers, and DevOps teams to align on migration goals and architecture.
- Document the migration process, workflow logic, and best practices for ongoing maintenance.

**Performance Testing**:

- Conduct performance testing and benchmarking of migrated pipelines to ensure efficient resource utilization on Databricks and AWS.

**CI/CD Implementation**:

- Design and maintain CI/CD pipelines for orchestrated workflows using Jenkins and Terraform.
- Automate deployment of Airflow DAGs and infrastructure components using Terraform IaC (Infrastructure as Code).
- Implement quality checks and validation for workflow pipelines during the deployment process.

**Qualifications**

**Technical Expertise**:

- Strong experience in Apache Airflow, including designing and managing complex DAGs.
- Hands-on experience with Apache Oozie and migration to modern orchestration tools.
- Proficiency in Python for writing Airflow DAGs and custom operators/hooks.
- Experience with Hadoop ecosystems (Cloudera distribution preferred) and their components, such as HDFS, Hive, and Spark.
- Solid experience with CI/CD tools, including Jenkins for pipeline automation and Terraform for data orchestration/pipeline provisioning.
- Familiarity with Databricks (on AWS preferred) and its integration with Airflow for ETL and data processing.

**Cloud and Infrastructure**:

- Solid understanding of AWS services such as S3, EMR, Glue, Lambda, Redshift, and IAM.
- Experience with containerization tools (Docker, Kubernetes) and CI/CD pipelines for workflow deployment.

**Analytical and Problem-Solving**:

- Ability to debug and resolve issues in data workflows and orchestrators.
- Experience optimizing workflows for performance and scalability.

**Preferred Qualifications**:

- Experience in large-scale cloud migrations, specifically from on-premises Hadoop to Databricks on AWS.
- Knowledge of Spark and PySpark for big data transformations.
- Familiarity with version control tools (e.g., Git) and workflow monitoring tools.
- Certifications in AWS (e.g., AWS Certified Solutions Architect) or Databricks.

Lemongrass Consulting is proud to be an Equal Opportunity and Affirmative Action employer. We do not discriminate on the basis of race, religion, color, national origin, religious creed, gender, sexual orientation, gender identity, gender expression, age, genetic information, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

**About Lemongrass**

Lemongrass is a software-enabled services provider, synonymous with SAP on Cloud, focused on delivering superior, highly automated Managed Services to Enterprise customers. Our customers span multiple verticals and geographies across the Americas, EMEA and APAC. We partner with AWS, SAP, Microsoft and other global technology leaders.


