Spark/PySpark Developer

4 weeks ago


Bihar/Jharkhand/Maharashtra/Pondicherry/Coimbatore/Patna/Aurangabad/Ranchi/Mumbai/Navi Mumbai/Pune/N, India ATech Full time

Job Profile: Spark (PySpark) Developer

Industry Type: IT Services

Job description:

- The developer must have sound knowledge of Apache Spark and Python programming.

- Deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading it into target data destinations.

- Experience in deploying and operationalizing code is an added advantage.

- Knowledge of and skills in DevOps, version control, and containerization.

- Deployment knowledge is preferred.

- Create Spark jobs for data transformation and aggregation


- Produce unit tests for Spark transformations and helper methods

- Write Scaladoc-style documentation with all code

- Design data processing pipelines to perform batch and real-time/stream analytics on structured and unstructured data

- Spark query tuning and performance optimization


- Good understanding of different file formats (ORC, Parquet, Avro) and compression techniques to optimize queries and processing.

- SQL database integration (Microsoft, Oracle, Postgres, and/or MySQL)

- Experience working with HDFS, S3, Cassandra, and/or DynamoDB

- Deep understanding of distributed systems (e.g. CAP theorem, partitioning, replication, consistency, and consensus)

- Experience in building scalable, high-performance cloud data lake solutions

- Hands-on expertise in cloud services such as AWS and/or Microsoft Azure.

- As a Spark developer, you will manage the development of the scalable distributed architecture defined by the architect or tech lead in our team.

- Analyse and assemble large data sets designed to meet functional and non-functional requirements.

- You will develop ETL scripts for big data sources.

- Identify, design, and optimise automated data processing for reports and dashboards.

- You will be responsible for workflow optimizations, data optimizations, and ETL optimizations as per the requirements elucidated by the team.

- Work with stakeholders such as product managers, technical leads, and service layer engineers to ensure end-to-end requirements are addressed.

- Strong team player who adheres to the Software Development Life Cycle (SDLC) and produces the documentation needed at every stage of the SDLC.

- Hands-on working experience with any of the data engineering and analytics platforms (Hortonworks, Cloudera, MapR, AWS); AWS preferred

- Hands-on experience with data ingestion tools: Apache NiFi, Apache Airflow, Sqoop, and Oozie

- Hands-on working experience of data processing at scale with event-driven systems and message queues (Kafka, Flink, Spark Streaming)

- Hands-on working experience with AWS services such as EMR, Kinesis, S3, CloudFormation, Glue, API Gateway, and Lake Formation

- Hands-on working experience with AWS Athena

- Data warehouse exposure with Apache NiFi, Apache Airflow, and Kylo

- Operationalization of ML models on AWS (e.g. deployment, scheduling, model monitoring etc.)

- Feature engineering and data processing for model development

- Experience gathering and processing raw data at scale (including writing scripts, web scraping, calling APIs, writing SQL queries, etc.)

- Experience building data pipelines for structured and unstructured, real-time and batch, and synchronous and asynchronous event data using MQ, Kafka, and stream processing

- Hands-on working experience in analysing source system data and data flows, working with structured and unstructured data

- Must be very strong in writing SQL queries

(ref:hirist.tech)

  • Big Data Developer

    4 weeks ago


    Pune, India CloudQ IT Services Full time

    Looking for a candidate to help with the migration of an on-premise capital markets data lake to Databricks. Key skills: Spark, Scala, PySpark - Good to have experience in Databricks - 4+ years of experience with various Hadoop/Databricks ecosystem components, 3+ years hands-on experience in Spark (Scala). - Working experience with Spark-based data processing in Scala-...

  • Java + Spark

    4 weeks ago


    Mumbai, India L&T Technology Services Ltd. Full time

    Location Pune/ Mumbai/ Chennai/ Hyderabad Years of Experience 5-10 Years Any Project specific Prerequisite skills Java Spark, Detailed JD Strong experience in **ETL development with Java & Spark** Strong experience with **Redshift, AWS S3, SQL** Experience in developing **microservices** Proficiency with **Lambda** expressions, **Pyspark** Hands...

  • Pyspark AWS Developer

    3 weeks ago


    Pune, India Virtusa Full time

    Pyspark AWS Developer - CREQ184802 Description 8-10 years of relevant work experience showing growth as a Data Engineer. Hands-on programming experience. Implementation experience on Kafka, Kinesis, Spark, AWS Glue, AWS LakeFormation. Experience of performance optimization in batch and real-time processing applications. Expertise in Data Governance and Data...


  • Pune, India LTIMindtree Full time

    Exp: 5 to 15 Years. Locations: PAN India LTIM locations. Rel Experience: 3 Years (in Python/PySpark/Scala Spark). Scala development and design using Scala 3+ or Python. Hadoop, Spark/PySpark, Hive, YARN. Data Modelling: A good data engineer should be able to design, implement and maintain data models that can support the organization's data storage and analysis needs,...

  • Pyspark AWS Developer

    4 weeks ago


    Pune, India Virtusa Full time

    Pyspark AWS Developer - CREQ184802 Description 8-10 years of relevant work experience showing growth as a Data Engineer. Hands-on programming experience. Implementation experience on Kafka, Kinesis, Spark, AWS Glue, AWS LakeFormation. Experience of performance optimization in batch and real-time processing applications. Expertise in Data Governance and Data...

  • Anywhere in India,Multiple Locations,Bangalore,Hyderabad,Pune,Chennai CA-One India Full time

    Job Opportunity: Pyspark Developer. Experience: 6+ Years. Location: Bangalore, Hyderabad, Pune, Chennai. Job description: Good hands-on experience in Pyspark, preferably more than 5 years. Should have good knowledge of Python and Spark concepts. Develop and maintain data pipelines and ETL processes using Python and Pyspark. Design, implement, and...


  • Pune, India LTIMindtree Full time

    Exp: 5 to 15 Years. Locations: PAN India LTIM locations. Rel Experience: 3 Years (in Python/PySpark/Scala Spark). Scala development and design using Scala 3+ or Python. Hadoop, Spark/PySpark, Hive, YARN. Data Modelling: A good data engineer should be able to design, implement and maintain data models that can support the organization's data storage and analysis...


  • Java + Spark

    4 weeks ago


    Pune, India L&T Technology Services Ltd. Full time

    Location Pune/ Mumbai/ Chennai/ Hyderabad Years of Experience 5-10 Years Any Project specific Prerequisite skills Java Spark, Detailed JD Strong experience in **ETL development with Java & Spark** Strong experience with **Redshift, AWS S3, SQL** Experience in developing **microservices** Proficiency with **Lambda** expressions, **Pyspark** Hands...

  • Big Data Engineer

    3 weeks ago


    Goa/Mumbai/Jammu & Kashmir/Jammu/Srinagar/Pondicherry/Jaipur/Lucknow/Varanasi/Banaras/Patna/Ranchi, IN ATech Full time

    Designation: BIG DATA ENGINEER. Job Description: Your Role and Responsibilities: - Understand a data warehousing solution and be able to work independently in such an environment - Project development and delivery experience on a few good-size projects - Design, build, optimize and support new and existing data models and ETL processes based on our...


  • Pune, India Coders Brain Pvt Ltd Full time

    Primary Skills: Databricks with Pyspark/Spark/Python and SQL. Secondary skills: ADF. Job Overview: - 7+ years of experience with detailed knowledge of data warehouse technical architectures, ETL/ELT and reporting/analytic tools. - 4+ years of work experience with very large data warehousing environments - 4+ years of work experience in Databricks, Pyspark and...

  • Data Engineer

    4 weeks ago


    Coimbatore, India XANDER CONSULTING AND ADVISORY PRIVATE LIMITED Full time

    Job Description: Primary Skills: Databricks with Pyspark, Python. Secondary skills: ADF - 6+ years of experience with detailed knowledge of data warehouse technical architectures, ETL/ELT and reporting/analytic tools. - 4+ years of work experience with very large data warehousing environments - 4+ years of work experience in Databricks, Pyspark and Python project...

  • Data Engineer

    3 weeks ago


    Coimbatore, Tamil Nadu, India XANDER CONSULTING AND ADVISORY PRIVATE LIMITED Full time

    Job Description: Primary Skills: Databricks with Pyspark, Python. Secondary skills: ADF - 6+ years of experience with detailed knowledge of data warehouse technical architectures, ETL/ELT and reporting/analytic tools. - 4+ years of work experience with very large data warehousing environments - 4+ years of work experience in Databricks, Pyspark and Python project...

  • Big Data Engineer

    2 weeks ago


    Goa/Mumbai/Jammu & Kashmir/Jammu/Srinagar/Pondicherry/Jaipur/Lucknow/Varanasi/Banaras/Patna/Ranchi, India ATech Full time

    Designation: BIG DATA ENGINEER. Job Description: Your Role and Responsibilities: - Understand a data warehousing solution and be able to work independently in such an environment - Project development and delivery experience on a few good-size projects - Design, build, optimize and support new and existing data models and ETL processes based on our...


  • Pune, India LTIMindtree Full time

    - Exp: 5 to 15 Years - Locations: PAN India LTIM locations - Rel Experience: 3 Years (in Python/PySpark/Scala Spark). Scala development and design using Scala 3+ or Python. Hadoop, Spark/PySpark, Hive, YARN. Data Modelling: A good data engineer should be able to design, implement and maintain data models that can support the organization's data storage and analysis...