PySpark Lead

4 weeks ago


Hyderabad, Telangana, India ValueMomentum Full time
Job Description

- Design, develop, and maintain scalable data pipelines using PySpark and related big data technologies.
- Work with large datasets and develop data models for consumption by data scientists and analysts.
- Optimize Spark jobs for better performance and resource management.
- Design and implement data integration workflows between various data sources.
- Troubleshoot and resolve issues related to data pipelines.
- Collaborate with cross-functional teams to understand business requirements and deliver solutions.
- Ensure data quality and cleanliness using validation and transformation techniques.
- Write and maintain efficient, scalable code in Python and PySpark.
- Manage data storage, computation, and scaling on cloud platforms like AWS or Azure.

Requirements

- Bachelor's degree in Computer Science, Engineering, or a related field.
- 2 to 4 years of experience with PySpark for data processing on large-scale datasets.
- Solid understanding of Spark architecture, including RDDs, DataFrames, and Datasets.
- Strong programming experience in Python, including libraries such as pandas, numpy, and matplotlib.
- Experience with Hadoop, Hive, and NoSQL databases (e.g., Cassandra, MongoDB).
- Working knowledge of cloud computing services (e.g., AWS, Azure, or Google Cloud).
- Familiarity with batch and stream processing (using Kafka, Flink, Spark Streaming).
- Strong problem-solving skills and attention to detail.
- Excellent communication and teamwork skills.

Good To Have

- Experience with Apache Airflow or other orchestration tools.
- Familiarity with Docker or Kubernetes for containerized data environments.
- Experience in implementing and managing CI/CD pipelines, focusing on automating, testing, and deploying code.

  • Hyderabad, Telangana, India Careers at Tide Full time US$ 1,50,000 - US$ 2,00,000 per year

    ABOUT TIDE: At Tide, we are building a business management platform designed to save small businesses time and money. We provide our members with business accounts and related banking services, but also a comprehensive set of connected administrative solutions from invoicing to accounting. Launched in 2017, Tide is now used by over 1 million small businesses...

  • Python Pyspark

    22 hours ago


    Hyderabad, Telangana, India Capgemini Full time US$ 80,000 - US$ 1,20,000 per year

    Job Description Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you'd like, where you'll be supported and inspired by a collaborative community of colleagues around the world, and where you'll be able to reimagine what's possible. Join us and help the world's leading organizations unlock the value of...


  • Hyderabad, Telangana, India Infosys Limited Full time

    Job Description: Lead Data Engineer (AWS, Databricks, Python, PySpark). Key Responsibilities: A day in the life of an Infoscion: As part of the Infosys consulting team, your primary role would be to lead the engagement effort of providing high-quality, value-adding consulting solutions to customers at different stages, from problem definition to...


  • Hyderabad, Telangana, India HireAlpha Full time

    Data Engineer Lead: primary skills are Python, PySpark, and AWS; Palantir and GIS are secondary skills. Palantir Lead Role: primary skills are Python, PySpark, and Palantir. Minimum of 2 years of experience in Palantir, but overall we are looking for more than 12 years of experience for both lead roles. Experience: 12+ years. Location: Hyderabad (5 days a week Work...


  • Hyderabad, Telangana, India ValueMomentum Full time

    Job Title: Databricks Engineer - Lead. Primary skills: Databricks, PySpark, SQL. Secondary skills: Advanced SQL, Azure Data Factory, and Azure Data Lake. Mode of Work: Work from Office. Location: Hyderabad. Experience: 7 to 10 Years. Responsibilities: · Design and develop ETL pipelines using ADF for data ingestion and transformation. · Collaborate with Azure stack modules...


  • Lead Data Engineer

    3 days ago


    Hyderabad, Telangana, India Incedo Inc. Full time

    Technical Lead Data Engineer. Experience: 7+ to 10 years. Location: Hyderabad / Chennai. Notice: Looking for immediate joiners or candidates serving notice periods up to 15th Sept / 30 days official notice period. Technical Lead, Data Ingestion / ETL. Solution Design & Architecture: Lead the design and implementation of scalable data ingestion and ETL pipelines using tools...

  • Lead Data Engineer

    3 weeks ago


    Hyderabad, Telangana, India HireAlpha Full time

    Primary Skills: Strong in Python programming, PySpark queries, AWS, GIS, Palantir Foundry. PySpark queries: MUST. Experience: 12+ years. Location: Hyderabad (5 days a week, work from office). Responsibilities: • Develop and enhance data processing, orchestration, monitoring, and more by leveraging popular open-source software, AWS, and GitLab automation. •...

  • Lead Data Engineer

    1 day ago


    Hyderabad, Telangana, India beBeeDataEngineer Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    Lead Data Engineer - Databricks Specialist. Job Summary: We are seeking a seasoned data engineering expert with expertise in Databricks, PySpark, and SQL to spearhead our data engineering initiatives. Main Responsibilities: Design and develop end-to-end data pipelines using Azure Data Factory for seamless data ingestion and transformation. Collaborate closely with...