PySpark Developer
4 weeks ago
Job Title: PySpark Data Engineer
We're growing our Data Engineering team at ValueLabs and looking for a talented individual to build scalable data pipelines on Cloudera Data Platform.
Experience: 5 to 9 years
PySpark Job Description:
• Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy.
• Data Ingestion: Implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) to the data lake or data warehouse on CDP.
• Data Transformation and Processing: Use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements.
• Performance Optimization: Conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing runtime of ETL processes.
• Data Quality and Validation: Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline.
• Automation and Orchestration: Automate data workflows using tools like Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem.
• Monitoring and Maintenance: Monitor pipeline performance, troubleshoot issues, and perform routine maintenance on the Cloudera Data Platform and associated data processes.
• Collaboration: Work closely with other data engineers, analysts, product managers, and other stakeholders to understand data requirements and support various data-driven initiatives.
• Documentation: Maintain thorough documentation of data engineering processes, code, and pipeline configurations.
Qualifications
Education and Experience
• Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
• 5+ years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform.
Technical Skills
• PySpark: Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.
• Cloudera Data Platform: Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
• Data Warehousing: Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
• Big Data Technologies: Familiarity with Hadoop, Kafka, and other distributed computing tools.
• Orchestration and Scheduling: Experience with Apache Oozie, Airflow, or similar orchestration frameworks.
• Scripting and Automation: Strong scripting skills in Linux.
-
PySpark Developer
4 weeks ago
Bengaluru, Karnataka, India ValueLabs Full time
PySpark Developer. Location: Bangalore (5 Days WFO). Experience Level: 5+ yrs. Notice Period: Immediate to 15 days. Job Description: We are seeking a highly skilled Python & PySpark Developer to join our dynamic team. This position will be responsible for developing and maintaining complex data processing systems using Python and PySpark, ensuring high...
-
PySpark Developer
2 weeks ago
Bengaluru, Karnataka, India Tata Consultancy Services Full time
Position: PySpark Developer. Experience: 6 to 12 Years. Notice Period: 0-60 Days. Required Skill Set: PySpark, Python, SQL and relational databases, SparkSQL, Spark Scripting, UNIX Shell Scripting, ETL, Data Warehousing, CI/CD. Job Location: Chennai / Pune / Bangalore / Hyderabad / Gurugram. Must Have: Data Engineer, Python developer with specialty in...
-
PySpark Developer
2 weeks ago
Bengaluru, Karnataka, India Tata Consultancy Services Full time
Role: PySpark Developer. Location: Bangalore, Chennai, Kolkata, Hyderabad, Pune. Experience: 4 to 8 years. Functional Skills: Experience in Credit Risk/Regulatory risk domain. Technical Skills: Spark, PySpark, Python, Hive, Scala, MapReduce, Unix shell scripting. Good to Have Skills: Exposure to Machine Learning Techniques. Job Description: 4+ Years of experience...
-
PySpark Developer
4 weeks ago
Bengaluru, Karnataka, India Tata Consultancy Services Full time
Greetings from TCS. TCS is hiring for PySpark Developer. Required Skill Set: PySpark, Python, SQL and relational databases, SparkSQL, Spark Scripting, UNIX Shell Scripting, ETL, Data Warehousing, CI/CD. Desired Experience Range: 4 to 10 Years. Job Location: Chennai / Pune / Bangalore / Hyderabad / Trivandrum / Kochi. Must Have: Data Engineer, Python...
-
PySpark Developer
2 weeks ago
Bengaluru, Karnataka, India Tata Consultancy Services Full time
Greetings from TCS. TCS is hiring for PySpark Developer. Required Skill Set: PySpark, Python, SQL and relational databases, SparkSQL, Spark Scripting, UNIX Shell Scripting, ETL, Data Warehousing, CI/CD. Desired Experience Range: 4 to 10 Years. Job Location: Chennai / Pune / Bangalore / Hyderabad / Trivandrum / Kochi. Must Have: Data Engineer, Python developer with...
-
PySpark Developer
4 weeks ago
Bengaluru, Karnataka, India Tata Consultancy Services Full time
Role: PySpark Developer. Desired Competencies: 1. Strong hands-on experience with PySpark technology 2. Strong hands-on experience with Python 3. Strong knowledge of Python web frameworks 4. Good knowledge of SQL and AWS
-
PySpark Developer
3 weeks ago
Bengaluru, Karnataka, India Tata Consultancy Services Full time
Role: PySpark Developer. Location: Bangalore, Chennai, Kolkata, Hyderabad, Pune. Experience: 4 to 8 years. Functional Skills: Experience in Credit Risk/Regulatory risk domain. Technical Skills: Spark, PySpark, Python, Hive, Scala, MapReduce, Unix shell scripting. Good to Have Skills: Exposure to Machine Learning Techniques. Job Description: 4+ Years of...
-
PySpark Developer
3 weeks ago
Bengaluru, Karnataka, India ValueLabs Full time
PySpark Developer. Location: Bangalore (5 Days WFO). Experience Level: 5+ yrs. Notice Period: Immediate to 15 days. Job Description: Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy. Data Ingestion: Implement and manage...
-
PySpark Developers
3 weeks ago
Bengaluru, Karnataka, India LTIMindtree Full time
Skill: PySpark Developer. Job Locations: Chennai, Pune. Notice Period: Any. Experience: 3-8 years. Job Description: PySpark Developer. Mandatory Skills: (Apache Spark, Big Data Hadoop Ecosystem, SparkSQL, Python). Good professional experience in Big Data, PySpark, Hive, Hadoop, PL/SQL. Good knowledge of AWS and Snowflake. Good understanding of CI/CD and system...
-
PySpark Developer
3 weeks ago
Bengaluru, Karnataka, India Tata Consultancy Services Full time
Role: PySpark Developer. Required technical skill set: Python, PySpark. Job location: Pan India requirement. Exp range: 7-15 yrs. Roles and responsibilities: 1. Strong hands-on experience with PySpark technology 2. Strong hands-on experience with Python 3. Strong knowledge of Python web frameworks 4. Good knowledge of SQL and AWS 5. Working in Onsite and Offshore...