
PySpark Developer
1 day ago
Type: Contract-to-Hire (C2H)
Job Summary
We are looking for a skilled PySpark Developer with hands-on experience building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark and Python, and experience with modern data engineering tools in cloud environments such as AWS.
Key Skills & Responsibilities
- Strong expertise in PySpark and Apache Spark for batch and real-time data processing.
- Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation.
- Proficiency in Python for scripting, automation, and building reusable components.
- Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows.
- Familiarity with AWS ecosystem, especially S3 and related file system operations.
- Strong understanding of Unix/Linux environments and Shell scripting.
- Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks.
- Ability to handle CDC (Change Data Capture) operations on large datasets.
- Experience in performance tuning, optimizing Spark jobs, and troubleshooting.
- Strong knowledge of data modeling, data validation, and writing unit test cases.
- Exposure to real-time and batch integration with downstream/upstream systems.
- Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging.
- Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git).
Preferred Skills
- Experience in building or integrating APIs for data provisioning.
- Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView.
- Familiarity with AI/ML model development using PySpark in cloud environments.
Skills: pyspark, apache spark, python, sql, etl pipelines, etl tools, cdc, data modeling, data validation, unit test cases, performance tuning, aws, aws s3, airflow, control-m, unix/linux, shell scripting, hive, hadoop, cloudera, hortonworks, ci/cd, jenkins, git, jupyter notebook, zeppelin, pycharm, api integration, informatica, tableau, jasper, qlikview, agile methodologies, ai/ml model development, batch integration, real-time integration
Mandatory Key Skills
ci/cd, zeppelin, pycharm, etl, control-m, performance tuning, jenkins, qlikview, informatica, jupyter notebook, api integration, unix, PySpark*
-
Kolkata, West Bengal, India GENPACT Full time
Genpact (NYSE: G) is a global professional services and solutions firm delivering outcomes that shape the future. Our 125,000 people across 30 countries are driven by our innate curiosity, entrepreneurial agility, and desire to create lasting value for clients. Powered by our purpose - the relentless pursuit of a world that works better for people - we...
-
AWS Databricks Developer
4 days ago
Kolkata, West Bengal, India Tata Consultancy Services Full time ₹ 15,00,000 - ₹ 25,00,000 per year
Role & responsibilities:
• Develop and maintain scalable data pipelines using Apache Spark on Databricks.
• Build end-to-end ETL/ELT pipelines on AWS using services like S3, Glue, Lambda, EMR, and Step Functions.
• Collaborate with data scientists, analysts, and business stakeholders to deliver high-quality data solutions.
• Design and implement data...
-
Senior Data Engineer
2 weeks ago
Kolkata, West Bengal, India beBeeData Full time ₹ 1,40,000 - ₹ 28,00,000
Our ideal candidate will possess extensive expertise in designing, building, and maintaining robust and scalable data pipelines on the Google Cloud Platform.
Key Responsibilities:
- Data Pipeline Development: Design and build data pipelines that are efficient, effective, and scalable.
- SQL and PySpark Expertise: Utilize expertise in SQL for complex data...
-
Senior Cloud Data Solutions Architect
1 week ago
Kolkata, West Bengal, India beBeeAzure Full time ₹ 9,00,000 - ₹ 12,00,000
Cloud Data Engineer Opportunity
As a Cloud Data Engineer, you will play a key role in designing, developing, and implementing data pipelines in a cloud-based environment. With a strong background in Azure, including ADF, Azure Databricks, Python, and PySpark, you will be responsible for ensuring seamless data flow and efficient processing.
Required...
-
Cloud Data Architect
2 weeks ago
Kolkata, West Bengal, India beBeeDataEngineer Full time ₹ 12,00,000 - ₹ 14,00,000
AWS Data Engineer Job
We are seeking a highly skilled Data Engineer to design and implement scalable data architectures on AWS. The ideal candidate will have a strong background in cloud-based data engineering, with a proven track record of designing, building, and maintaining scalable data pipelines using AWS services such as S3, Glue, and Redshift...
-
Unlock the Power of Advanced Analytics
2 weeks ago
Kolkata, West Bengal, India beBeeAnalytics Full time ₹ 1,80,00,000 - ₹ 2,52,00,000
Lead Advanced Analytics Manager
We're seeking a seasoned manager to lead our advanced analytics team in using data insights to drive business growth. This role involves leveraging Palantir Foundry to implement analytical solutions using PySpark and hyperscaler platforms. As a leader, you will manage large teams of 20-30 people for complex engineering...
-
Artificial Intelligence Model Developer
2 weeks ago
Kolkata, West Bengal, India beBeeMachineLearning Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Machine Learning Engineer Position
We are seeking a highly skilled Machine Learning Engineer to join our team. As a Machine Learning Engineer, you will be responsible for designing, developing, and deploying machine learning models using Python and popular frameworks such as scikit-learn, PyTorch, xgboost, lightgbm, and mlflow.
Key Responsibilities: Data Ingestion...
-
Data Warehousing Specialist
2 weeks ago
Kolkata, West Bengal, India beBeeDataWarehouse Full time ₹ 1,20,00,000 - ₹ 2,00,00,000
Job Title: Data Warehousing Specialist
Skillfully design and implement data warehouses on a leading cloud-based platform.
Responsibilities:
- Develop scalable and high-performance data storage solutions.
- Create efficient end-to-end data pipelines.
- Maintain ETL workflows for seamless data processing.
Requirements:
- Proven experience in data engineering with expertise...
-
Big Data Architect
2 weeks ago
Kolkata, West Bengal, India beBeeData Full time ₹ 1,80,00,000 - ₹ 2,50,00,000
Job Overview
We are seeking a seasoned data architect to design, develop, and optimize large-scale data pipelines and distributed data processing systems. The ideal candidate will have hands-on experience in Scala, Spark (PySpark), Python, Apache Kafka, and NiFi/Airflow for data processing, streaming solutions, and orchestration.
Key Responsibilities: Design and...
-
Advanced Data Architect
2 weeks ago
Kolkata, West Bengal, India beBeeDataEngineer Full time ₹ 15,00,000 - ₹ 25,00,000
We are seeking a highly skilled Data Engineer to join our dynamic team. As a Data Engineer, you will play a critical role in building and maintaining scalable data solutions in the cloud using cutting-edge technologies like PySpark, AWS Glue, Lambda, Step Functions, and more.
Key Responsibilities:
- Data Pipeline Development: Design, build, and maintain robust...