
PySpark Lead
4 weeks ago
- Design, develop, and maintain scalable data pipelines using PySpark and related big data technologies.
- Work with large datasets and develop data models for consumption by data scientists and analysts.
- Optimize Spark jobs for better performance and resource management.
- Design and implement data integration workflows between various data sources.
- Troubleshoot and resolve issues related to data pipelines.
- Collaborate with cross-functional teams to understand business requirements and deliver solutions.
- Ensure data quality and cleanliness using validation and transformation techniques.
- Write and maintain efficient, scalable code in Python and PySpark.
- Manage data storage, computation, and scaling on cloud platforms like AWS or Azure.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 2 to 4 years of experience with PySpark for data processing on large-scale datasets.
- Solid understanding of Spark architecture, including RDDs, DataFrames, and Datasets.
- Strong programming experience in Python, including libraries such as pandas, numpy, and matplotlib.
- Experience with Hadoop, Hive, and NoSQL databases (e.g., Cassandra, MongoDB).
- Working knowledge of cloud computing services (e.g., AWS, Azure, or Google Cloud).
- Familiarity with batch and stream processing (using Kafka, Flink, Spark Streaming).
- Strong problem-solving skills and attention to detail.
- Excellent communication and teamwork skills.
Good To Have
- Experience with Apache Airflow or other orchestration tools.
- Familiarity with Docker or Kubernetes for containerized data environments.
- Experience in implementing and managing CI/CD pipelines, focusing on automating, testing, and deploying code.
-
Lead Data Engineer (PySpark)
4 days ago
Hyderabad, Telangana, India Careers at Tide Full time US$ 1,50,000 - US$ 2,00,000 per year. About Tide: At Tide, we are building a business management platform designed to save small businesses time and money. We provide our members with business accounts and related banking services, but also a comprehensive set of connected administrative solutions from invoicing to accounting. Launched in 2017, Tide is now used by over 1 million small businesses...
-
Python Pyspark
22 hours ago
Hyderabad, Telangana, India Capgemini Full time US$ 80,000 - US$ 1,20,000 per year. Job Description: Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you'd like, where you'll be supported and inspired by a collaborative community of colleagues around the world, and where you'll be able to reimagine what's possible. Join us and help the world's leading organizations unlock the value of...
-
AWS, Databricks, Python, Pyspark
3 weeks ago
Hyderabad, Telangana, India Infosys Limited Full time. Job Description: Lead Data Engineer (AWS, Databricks, Python, PySpark). Key Responsibilities: A day in the life of an Infoscion: As part of the Infosys consulting team, your primary role would be to lead the engagement effort of providing high-quality, value-adding consulting solutions to customers at different stages from problem definition to...
-
Lead Data Engineer/Palantir Lead
1 week ago
Hyderabad, Telangana, India HireAlpha Full time. Data Engineer Lead role: Python, PySpark, and AWS are primary skills; Palantir and GIS are secondary skills. Palantir Lead role: primary skills are Python, PySpark, and Palantir. Minimum of 2 years of experience in Palantir, but overall we are looking for more than 12 years of experience for both lead roles. Experience: 12+ Years. Location: Hyderabad (5 days a week Work...
-
Lead Data Engineer/Palantir Lead
2 weeks ago
Hyderabad, Telangana, India HireAlpha Full time. Data Engineer Lead role: Python, PySpark, and AWS are primary skills; Palantir and GIS are secondary skills. Palantir Lead role: primary skills are Python, PySpark, and Palantir. Minimum of 2 years of experience in Palantir, but overall we are looking for more than 12 years of experience for both lead roles. Experience: 12+ Years. Location: Hyderabad (5 days a week Work...
-
Technical Lead-Databricks
1 day ago
Hyderabad, Telangana, India ValueMomentum Full time. Job Title: Databricks Engineer - Lead. Primary skills: Databricks, PySpark, SQL. Secondary skills: Advanced SQL, Azure Data Factory, and Azure Data Lake. Mode of Work: Work from Office. Location: Hyderabad. Experience: 7 to 10 Years. Responsibilities: · Design and develop ETL pipelines using ADF for data ingestion and transformation. · Collaborate with Azure stack modules...
-
Technical Lead-Databricks
2 weeks ago
Hyderabad, Telangana, India ValueMomentum Full time. Job Title: Databricks Engineer - Lead. Primary skills: Databricks, PySpark, SQL. Secondary skills: Advanced SQL, Azure Data Factory, and Azure Data Lake. Mode of Work: Work from Office. Location: Hyderabad. Experience: 7 to 10 Years. Responsibilities: · Design and develop ETL pipelines using ADF for data ingestion and transformation. · Collaborate with Azure stack modules...
-
Lead Data Engineer
3 days ago
Hyderabad, Telangana, India Incedo Inc. Full time. Technical Lead Data Engineer. Experience: 7+ to 10 Years. Location: Hyderabad / Chennai. Notice: Looking for immediate joiners or candidates serving notice periods up to 15th Sept / 30-day official notice period. Technical Lead – Data Ingestion / ETL. Solution Design & Architecture: Lead the design and implementation of scalable data ingestion and ETL pipelines using tools...
-
Lead Data Engineer
3 weeks ago
Hyderabad, Telangana, India HireAlpha Full time. Primary Skills: Strong in Python programming, PySpark queries, AWS, GIS, Palantir Foundry. PySpark queries are a MUST. Experience: 12+ Years. Location: Hyderabad (5 days a week Work from Office). Responsibilities: • Develop and enhance data-processing, orchestration, monitoring, and more by leveraging popular open-source software, AWS, and GitLab automation. •...
-
Lead Data Engineer
1 day ago
Hyderabad, Telangana, India beBeeDataEngineer Full time ₹ 1,50,00,000 - ₹ 2,50,00,000. Lead Data Engineer - Databricks Specialist. Job Summary: We are seeking a seasoned data engineering expert with expertise in Databricks, PySpark, and SQL to spearhead our data engineering initiatives. Main Responsibilities: Design and develop end-to-end data pipelines using Azure Data Factory for seamless data ingestion and transformation. Collaborate closely with...