Spark/PySpark Developer
7 months ago
Job Profile: Spark (PySpark) Developer
Industry Type: IT Services
Job description:
- The developer must have sound knowledge of Apache Spark and Python programming.
- Deep experience developing data processing tasks with PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target destinations (see the first sketch after this list).
- Experience in deploying and operationalizing code is an added advantage.
- Knowledge of and skills in DevOps, version control, and containerization.
- Deployment knowledge is preferable.
- Create Spark jobs for data transformation and aggregation
- Produce unit tests for Spark transformations and helper methods (see the test sketch after this list)
- Write Scaladoc-style documentation for all code
- Design data processing pipelines to perform batch and real-time/stream analytics on structured and unstructured data
- Spark query tuning and performance optimization (see the tuning sketch after this list)
- Good understanding of different file formats (ORC, Parquet, Avro) and compression techniques to optimize queries and processing.
- SQL database integration (Microsoft, Oracle, Postgres, and/or MySQL)
- Experience working with storage systems such as HDFS, S3, Cassandra, and/or DynamoDB
- Deep understanding of distributed systems (e.g. CAP theorem, partitioning, replication, consistency, and consensus)
- Experience building scalable, high-performance data lake solutions in the cloud
- Hands-on expertise in cloud services such as AWS and/or Microsoft Azure.
- As a Spark developer, you will manage the development of the scalable distributed architecture defined by the architect or tech lead in our team.
- Analyze and assemble large data sets to meet functional and non-functional requirements.
- You will develop ETL scripts for big data sources.
- Identify, design, and optimize automated data processing for reports and dashboards.
- You will be responsible for workflow, data, and ETL optimization as per the requirements set out by the team.
- Work with stakeholders such as product managers, technical leads, and service-layer engineers to ensure end-to-end requirements are addressed.
- Strong team player who adheres to the Software Development Life Cycle (SDLC) and produces the documentation needed to represent every stage of it.
- Hands-on working experience with any of the data engineering/analytics platforms (Hortonworks, Cloudera, MapR, AWS); AWS preferred
- Hands-on experience with data ingestion and orchestration tools: Apache NiFi, Apache Airflow, Sqoop, and Oozie
- Hands-on working experience with data processing at scale using event-driven systems and message queues (Kafka, Flink, Spark Streaming); see the streaming sketch after this list
- Hands-on working experience with AWS services such as EMR, Kinesis, S3, CloudFormation, Glue, API Gateway, and Lake Formation
- Hands-on working experience with AWS Athena
- Data warehouse exposure to Apache NiFi, Apache Airflow, and Kylo
- Operationalization of ML models on AWS (e.g. deployment, scheduling, model monitoring)
- Feature engineering and data processing to be used for model development
- Experience gathering and processing raw data at scale (including writing scripts, web scraping, calling APIs, writing SQL queries, etc.)
- Experience building data pipelines for structured/unstructured data, real-time/batch workloads, and synchronous/asynchronous events using MQ, Kafka, and stream processing
- Hands-on working experience in analysing source system data and data flows, working with structured and unstructured data
- Must be very strong in writing SQL queries
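As context for the PySpark ETL tasks above (read from external sources, merge, enrich, load), here is a minimal sketch. The bucket paths, table layouts, and column names (customer_id, quantity, unit_price, order_date) are illustrative assumptions, not details from this posting.

```python
# Minimal PySpark ETL sketch: read from an external source, merge with
# reference data, enrich, and load into a target destination.
# All paths and column names below are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Read raw data from an external source (CSV here; could be JDBC, Kafka, S3, ...)
orders = spark.read.option("header", True).option("inferSchema", True).csv("s3://bucket/raw/orders/")
customers = spark.read.parquet("s3://bucket/curated/customers/")

# Merge and enrich: join reference data and derive a new column
enriched = (
    orders.join(customers, on="customer_id", how="left")
          .withColumn("order_value", F.col("quantity") * F.col("unit_price"))
          .filter(F.col("order_value") > 0)
)

# Load into the target destination, partitioned for downstream queries
(enriched.write
         .mode("overwrite")
         .partitionBy("order_date")
         .parquet("s3://bucket/marts/orders_enriched/"))
```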
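One way the unit-testing expectation above could be met is to keep transformations as pure functions and exercise them against a small local SparkSession, for example with pytest. The helper name, columns, and expected values below are hypothetical.

```python
# Sketch of a pytest-style unit test for a pure PySpark transformation helper.
# The helper, columns, and expected values are hypothetical examples.
from pyspark.sql import SparkSession, DataFrame, functions as F

def add_order_value(df: DataFrame) -> DataFrame:
    """Helper under test: derive order_value from quantity and unit_price."""
    return df.withColumn("order_value", F.col("quantity") * F.col("unit_price"))

def test_add_order_value():
    spark = SparkSession.builder.master("local[1]").appName("unit-test").getOrCreate()
    input_df = spark.createDataFrame([(2, 5.0), (3, 1.5)], ["quantity", "unit_price"])
    result = add_order_value(input_df).orderBy("quantity").collect()
    assert [row.order_value for row in result] == [10.0, 4.5]
```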
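As a rough illustration of the query-tuning and file-format points, the sketch below broadcasts a small lookup table to avoid shuffling the large side of a join and writes Snappy-compressed Parquet. The paths, and the assumption that the lookup table is small enough to broadcast, are illustrative.

```python
# Sketch of two common tuning levers: broadcasting a small dimension table to
# avoid shuffling the large fact table, and writing compressed columnar Parquet.
# Paths and the assumption that `dims` fits in executor memory are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

facts = spark.read.parquet("s3://bucket/facts/")  # large table
dims = spark.read.parquet("s3://bucket/dims/")    # small lookup table

# Broadcast join: ship the small side to every executor instead of shuffling facts
joined = facts.join(broadcast(dims), on="dim_id", how="left")

# Columnar format plus an explicit codec keeps scans and storage efficient
joined.write.option("compression", "snappy").mode("overwrite").parquet("s3://bucket/out/")
```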
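For the event-driven/streaming requirement, a Spark Structured Streaming job reading from Kafka might look like the sketch below. It assumes the spark-sql-kafka connector is on the classpath; the broker address, topic, and checkpoint path are placeholders.

```python
# Sketch of a Spark Structured Streaming job consuming a Kafka topic and
# appending the raw payload to Parquet. Assumes the spark-sql-kafka connector
# is available; broker, topic, and checkpoint path are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "orders")
         .load()
)

# Kafka delivers key/value as binary; cast the payload to string before use
parsed = events.select(F.col("value").cast("string").alias("payload"))

query = (
    parsed.writeStream.format("parquet")
          .option("path", "s3://bucket/streaming/orders/")
          .option("checkpointLocation", "s3://bucket/checkpoints/orders/")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```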
-
Spark Developer Lead
3 weeks ago
Mumbai, Maharashtra, India | The Celeritas AI | Full time
The Celeritas AI is searching for an experienced Spark Developer Lead to join our team. This individual will be responsible for designing and implementing scalable data pipelines using PySpark, Apache Spark, and related technologies.
Job Description: Create and maintain efficient data processing pipelines using Spark and related tools; implement NiFi streaming...
-
Python with Pyspark
7 months ago
Mumbai, India | INFOBEANS TECHNOLOGIES | Full time
Role: Python with Pyspark
Location: Bangalore, Chennai, Hyderabad, Pune, Mumbai, Noida, Indore
Experience: 4+ years
Key Skills: Python, Pyspark, AWS
Job Category: Python Development
What will your role look like: Python programming language, PySpark, Linux shell scripting, data integration...
-
Bigdata (Pyspark, Hive)
2 weeks ago
Mumbai, India | Tata Consultancy Services | Full time
Role - Bigdata (Pyspark, Hive)
Experience - 3 to 8 yrs
Location - Mumbai/Pune/Chennai
Desired Competencies (Technical/Behavioral Competency):
Must-Have: Spark, Pyspark, Hive, HBase
Good-to-Have: DQ tool, Agile/Scrum experience, exposure to data ingestion from disparate sources onto a big data platform
-
TCS Opportunity for Pyspark Developer
2 weeks ago
Mumbai, India | Tata Consultancy Services | Full time
Greetings from TCS!
Job Title: Pyspark Developer
Location: Mumbai, Chennai, Hyderabad, Bangalore, Indore, Ahmedabad
Experience required: 4 to 8 years
Key skills: design and implementation of Big Data pipelines using PySpark
Must-Have (Mandatory): Primary skill: Pyspark. 3-6 years of experience in the design and implementation of Big Data pipelines using PySpark,...
-
Subject Matter Expert
7 months ago
Navi Mumbai, India | Chabez Tech | Full time
Job Description / Role:
- Building the pipeline with PySpark
- Maintaining the pipeline
Requirements:
- PySpark and Python expert with 5-9 years of experience
- Subject matter expert in Python and Spark
- 3 years of experience in Hadoop, Spark, Scala, PySpark, Hive, Impala, and SQL technologies
- Analyzed migration plans for various versions of Cloudera...