PySpark & ETL Data Engineer

2 days ago


New Delhi, India CirrusLabs Full time

We are CirrusLabs. Our vision is to become the world's most sought-after niche digital transformation company that helps customers realize value through innovation. Our mission is to co-create success with our customers, partners, and community. Our goal is to enable employees to dream, grow, and make things happen. We are committed to excellence. We are a dependable partner organization that delivers on commitments. We strive to maintain integrity with our employees and customers. Every action we take is driven by value. The core of who we are is our well-knit teams and employees.

You are the core of a values-driven organization. You have an entrepreneurial spirit. You enjoy working as part of well-knit teams. You value the team over the individual. You welcome diversity at work and within the greater community. You aren't afraid to take risks. You appreciate a growth path with your leadership team that charts how you can grow inside and outside of the organization. You thrive on continuing education programs that your company sponsors to strengthen your skills and to help you become a thought leader ahead of the industry curve. You are excited about creating change because your skills can help the greater good of every customer, industry, and community.

We are hiring a talented PySpark Data Engineer to join our team. If you're excited to be part of a winning team, CirrusLabs (http://www.cirruslabs.io) is a great place to grow your career.

Experience - 4-8 years
Location - Hyderabad/Bengaluru

About the Role

CirrusLabs is seeking a skilled and experienced PySpark Data Engineer (ETL Lead) to join our growing data engineering team. As an ETL Lead, you will play a pivotal role in designing, developing, and maintaining robust data integration pipelines using PySpark and related technologies. You'll work closely with data architects, analysts, and stakeholders to transform raw data into high-quality, actionable insights, enabling data-driven decision-making across the organization.

This is an exciting opportunity for someone who is not only technically strong in PySpark and Python but also capable of leading data integration efforts for complex projects.

Key Responsibilities

- Lead Data Integration Projects:
  - Manage the data integration and ETL activities for enterprise-level data projects.
  - Gather requirements from stakeholders and translate them into technical solutions.
- Develop PySpark Pipelines:
  - Design and develop scalable and efficient PySpark scripts, both generic frameworks and custom solutions tailored to specific project requirements.
  - Implement end-to-end ETL processes to ingest, clean, transform, and load data (an illustrative pipeline sketch follows at the end of this posting).
- Schedule and Automate ETL Processes:
  - Create scheduling processes to manage and run PySpark jobs reliably and efficiently.
  - Integrate ETL workflows into automation tools and CI/CD pipelines.
- Optimize Data Processing:
  - Optimize PySpark jobs for performance and resource efficiency (see the tuning sketch after this posting).
  - Monitor, troubleshoot, and resolve issues related to data processing and pipeline execution.
- Data Transformation and Curation:
  - Transform raw data into consumable, curated data models suitable for reporting and analytics.
  - Ensure data quality, consistency, and reliability throughout all stages of the ETL process.
- Collaboration and Best Practices:
  - Collaborate with data architects, analysts, and business stakeholders to define requirements and deliver solutions.
  - Contribute to the evolution of data engineering practices, frameworks, and standards.
  - Provide guidance and mentorship to junior engineers on PySpark and ETL best practices.
- Documentation:
  - Develop and maintain technical documentation related to ETL processes, data flows, and solutions.

Required Skills and Qualifications

- Experience:
  - 5–8 years of professional experience in data engineering, ETL development, or related fields.
  - Proven experience leading data integration projects from design to deployment.
- Technical Skills:
  - Strong hands-on experience with PySpark for building large-scale data pipelines.
  - Proficiency in Python, including writing efficient, reusable, and modular code.
  - Solid knowledge of SQL for data extraction, transformation, and analysis.
  - Strong understanding of Spark architecture, including execution plans, partitions, memory management, and optimization techniques.
- Data Engineering Expertise:
  - Experience working on data integration projects, such as data warehousing, data lakes, or analytics solutions.
  - Familiarity with processing structured and semi-structured data formats (e.g., Parquet, Avro, JSON, CSV).
  - Ability to transform and harmonize data from raw to curated layers.

Additional Skills

- Familiarity with data pipeline orchestration tools (e.g., Airflow, Azkaban) is a plus.
- Experience with cloud platforms (e.g., AWS, Azure, GCP) is desirable.
- Strong analytical and problem-solving skills.
- Excellent communication and collaboration skills.
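To make the pipeline responsibilities above concrete, here is a minimal end-to-end ETL sketch in PySpark. The bucket paths, column names, and schema are hypothetical placeholders for illustration, not details from the posting.

```python
# Minimal ETL sketch: ingest raw CSV, clean and transform it, then load
# a curated Parquet layer. All paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw, semi-structured input with a header row.
raw = spark.read.option("header", True).csv("s3://raw-zone/orders/")

# Transform: drop incomplete records, normalize types, derive a date column.
curated = (
    raw.dropna(subset=["order_id", "order_ts"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .dropDuplicates(["order_id"])
)

# Load: write the curated layer partitioned by date for efficient scans.
curated.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://curated-zone/orders/"
)

spark.stop()
```

In practice a job like this would be parameterized (paths, run dates) and invoked from an orchestrator such as Airflow rather than run by hand, which is how the scheduling and automation responsibility is typically met.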
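For the optimization duties, a brief sketch of the usual first steps: inspect the physical plan, size shuffle parallelism to the workload, and cache only results that are reused. The configuration value and data source below are illustrative assumptions, not prescribed settings.

```python
# Tuning sketch: read the plan, control shuffle parallelism, cache reused
# results. The path and the partition count are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("tuning-demo")
    # Spark defaults to 200 shuffle partitions; smaller jobs often want fewer.
    .config("spark.sql.shuffle.partitions", "64")
    .getOrCreate()
)

orders = spark.read.parquet("s3://curated-zone/orders/")

# Wide aggregation: check the physical plan (scans, exchanges, aggregates)
# before running it, to spot unnecessary shuffles.
daily = orders.groupBy("order_date").sum("amount")
daily.explain()

# Cache only when several downstream actions reuse the same result.
daily.cache()
daily.count()   # first action materializes the cache
daily.show(10)  # subsequent actions read from memory

spark.stop()
```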



  • New Delhi, India Infosys Finacle Full time

    Mandatory Skills - SQL, ETL, Hadoop, PySpark. Required Skills: - Pipeline/ETL (Extract, Transform, Load) processes, API integration, scripting languages (Python), and big data technologies (Trino, Iceberg, DuckDB/Parquet). - Database design, data modeling, and data warehousing. - SQL and at least one cloud platform. - Analytical tools (Superset, Power BI),...


  • New Delhi, India Xebia Full time

    We’re Hiring: Senior Data Engineer – Python & PySpark. Location: Bangalore (Hybrid – 3 days office per week). We are looking for an experienced Senior Data Engineer with a strong background in Python (with OOP concepts), PySpark, and building test cases. The ideal candidate must have 6+ years of hands-on experience and be available to join immediately or...

  • ETL Data Engineer

    3 weeks ago


    New Delhi, India The Techgalore Full time

    Please rate the candidate (from 1 to 5; 1 = lowest, 5 = highest) in these areas: Big Data, PySpark, AWS, Redshift. Position Summary: Experienced ETL Developers and Data Engineers to ingest and analyze data from multiple enterprise sources into Adobe Experience Platform. Requirements: About 4-6 years of professional technology experience mostly focused on the...



  • New Delhi, India NTT DATA, Inc. Full time

    Mandatory Skills: SQL, ELT/ETL Testing, Python, Data Validation, PySpark/Pytest/JUnit/TestNG, and Azure Data services, Databricks QA, Data Lake QA. Good-to-have Skills: DevOps, Azure cloud DevOps, Data governance, API testing

  • Data Engineer

    2 weeks ago


    New Delhi, India Tata Consultancy Services Full time

    Job Title: Data Engineer - PySpark. Experience: 5 to 8 Years. Location: Pune/Hyderabad. Job Description - Required Skills: 5+ years of experience in Big Data and PySpark. Must-Have: good work experience on Big Data platforms like Hadoop, Spark, Scala, Hive, Impala, SQL. Good-to-Have: Spark, PySpark, Big Data experience; Spark UI/optimization/debugging techniques...

  • Pyspark Developer

    2 weeks ago


    New Delhi, India Vista Applied Solutions Group Inc Full time

    Job Summary: A PySpark Developer is responsible for designing, developing, and optimizing large-scale data processing applications and pipelines using Apache Spark and Python. This role involves leveraging PySpark to handle, transform, and analyze vast datasets in distributed computing environments, often integrating with other big data technologies and cloud...


  • Delhi, India DigiHelic Solutions Pvt. Ltd. Full time

    Job Role: Snowflake Developer. Experience: 6-10 Years. Location: Trivandrum/Kochi/Bangalore/Chennai/Pune/Noida/Hyderabad. Work Model: Hybrid. Mandatory Skills: Snowflake, PySpark, ETL, SQL. Must-Have Skills - Data Warehouse: · Design, implement, and optimize data warehouses on the Snowflake platform. · Ensure effective utilization of Snowflake features for scalable and...

  • Data Engineer

    3 days ago


    New Delhi, India Bahwan CyberTek Full time

    Role Overview: We are seeking a skilled and motivated Data Engineer with 5–8 years of experience in building scalable data pipelines using Python, PySpark, and AWS services. The ideal candidate will have hands-on expertise in big data processing, orchestration using AWS Step Functions, and serverless computing with AWS Lambda. Familiarity with DynamoDB and...