PySpark Developer

4 weeks ago


Bengaluru, Karnataka, India ValueLabs Full time

Job Title: PySpark Data Engineer

We're growing our Data Engineering team at ValueLabs and are looking for a talented individual to build scalable data pipelines on the Cloudera Data Platform (CDP).

Experience: 5 to 9 years.

PySpark Job Description:


• Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy.


• Data Ingestion: Implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) to the data lake or data warehouse on CDP.


• Data Transformation and Processing: Use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements.


• Performance Optimization: Conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing runtime of ETL processes.


• Data Quality and Validation: Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline.


• Automation and Orchestration: Automate data workflows using tools like Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem.


• Monitoring and Maintenance: Monitor pipeline performance, troubleshoot issues, and perform routine maintenance on the Cloudera Data Platform and associated data processes.


• Collaboration: Work closely with other data engineers, analysts, product managers, and other stakeholders to understand data requirements and support various data-driven initiatives.


• Documentation: Maintain thorough documentation of data engineering processes, code, and pipeline configurations.

Qualifications

Education and Experience


• Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.


• 5+ years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform.

Technical Skills


• PySpark: Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.


• Cloudera Data Platform: Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.


• Data Warehousing: Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).


• Big Data Technologies: Familiarity with Hadoop, Kafka, and other distributed computing tools.


• Orchestration and Scheduling: Experience with Apache Oozie, Airflow, or similar orchestration frameworks.


• Scripting and Automation: Strong scripting skills in Linux.


  • Pyspark Developer

    4 weeks ago


    Bengaluru, Karnataka, India ValueLabs Full time

    Pyspark Developer Location - Bangalore (5 Days WFO) Experience Level - 5+ yrs Notice Period - Immediate to 15 days Job Description: We are seeking a highly skilled Python & PySpark Developer to join our dynamic team. This position will be responsible for developing and maintaining complex data processing systems using Python and PySpark, ensuring high...

  • Pyspark Developer

    2 weeks ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Position - Pyspark Developer Experience: 6 to 12 Years Notice Period- 0-60 Days Required Skill Set: Pyspark, Python, SQL and relational databases, SparkSQL, Spark Scripting, UNIX Shell Scripting, ETL, Data Warehousing, CI/CD Job Location: Chennai / Pune / Bangalore / Hyderabad / Gurugram Must Have: Data Engineer, Python developer with specialty in...

  • Pyspark Developer

    2 weeks ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Role: Pyspark Developer Location: Bangalore, Chennai, Kolkata, Hyderabad, Pune Experience: 4 to 8 years Functional Skills: Experience in Credit Risk/Regulatory risk domain Technical Skills: Spark, PySpark, Python, Hive, Scala, MapReduce, Unix shell scripting Good to Have Skills: Exposure to Machine Learning Techniques Job Description: 4+ Years of experience...

  • Pyspark Developer

    4 weeks ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Greetings from TCS TCS is hiring for Pyspark Developer Required Skill Set: Pyspark, Python, SQL and relational databases, SparkSQL, Spark Scripting, UNIX Shell Scripting, ETL, Data Warehousing, CI/CD Desired Experience Range: 4 to 10 Years Job Location: Chennai / Pune / Bangalore / Hyderabad / Trivandrum / Kochi Must Have: Data Engineer, Python...

  • Pyspark Developer

    2 weeks ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Greetings from TCS TCS is hiring for Pyspark Developer Required Skill Set: Pyspark, Python, SQL and relational databases, SparkSQL, Spark Scripting, UNIX Shell Scripting, ETL, Data Warehousing, CI/CD Desired Experience Range: 4 to 10 Years Job Location: Chennai / Pune / Bangalore / Hyderabad / Trivandrum / Kochi Must Have: Data Engineer, Python developer with...

  • Pyspark Developer

    4 weeks ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Role: Pyspark Developer Desired Competencies: 1. Strong hands-on experience with Pyspark technology 2. Strong hands-on experience with Python 3. Strong knowledge of Python web frameworks 4. Good knowledge of SQL and AWS

  • Pyspark Developer

    3 weeks ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Role: Pyspark Developer Location: Bangalore, Chennai, Kolkata, Hyderabad, Pune Experience: 4 to 8 years Functional Skills: Experience in Credit Risk/Regulatory risk domain Technical Skills: Spark, PySpark, Python, Hive, Scala, MapReduce, Unix shell scripting Good to Have Skills: Exposure to Machine Learning Techniques Job Description: 4+ Years of...

  • Pyspark Developer

    3 weeks ago


    Bengaluru, Karnataka, India ValueLabs Full time

    Pyspark Developer Location - Bangalore (5 Days WFO) Experience Level - 5+ yrs Notice Period - Immediate to 15 days Job Description: Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy. Data Ingestion: Implement and manage...

  • Pyspark Developers

    3 weeks ago


    Bengaluru, Karnataka, India LTIMindtree Full time

    Skill: PySpark Developer Job Locations: Chennai, Pune Notice Period: Any Experience: 3-8 years Job Description: PySpark Developer Mandatory Skills: (Apache Spark, Big Data Hadoop Ecosystem, SparkSQL, Python) A good professional experience in Bigdata PySpark HIVE Hadoop PLSQL Good knowledge of AWS and Snowflake Good understanding of CICD and system...

  • Pyspark Developer

    3 weeks ago


    Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Role: Pyspark Developer Required technical skill set: Python, Pyspark Job location: PAN India requirement Exp range: 7-15 yrs Roles and responsibilities: 1. Strong hands-on experience with Pyspark technology 2. Strong hands-on experience with Python 3. Strong knowledge of Python web frameworks 4. Good knowledge of SQL and AWS 5. Working in Onsite and Offshore...