Spark/PySpark Developer

3 weeks ago


Bihar/Jharkhand/Maharashtra/Pondicherry/Coimbatore/Patna/Aurangabad/Ranchi/Mumbai/Navi Mumbai/Pune/N, IN ATech Full time

Job Profile : Spark (PySpark) Developer

Industry Type : IT Services

Job description :

- The developer must have sound knowledge of Apache Spark and Python programming.

- Deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations.

- Experience in deploying and operationalizing code is an added advantage.


- Knowledge of and skills in DevOps, version control, and containerization.


- Deployment knowledge is preferred.

- Create Spark jobs for data transformation and aggregation


- Produce unit tests for Spark transformations and helper methods

- Write Scaladoc-style documentation with all code

- Design data processing pipelines to perform batch and real-time/stream analytics on structured and unstructured data

- Spark query tuning and performance optimization


- Good understanding of different file formats (ORC, Parquet, Avro) and compression techniques to optimize queries and processing.

- SQL database integration (Microsoft, Oracle, Postgres, and/or MySQL)

- Experience working with HDFS, S3, Cassandra, and/or DynamoDB

- Deep understanding of distributed systems (e.g. CAP theorem, partitioning, replication, consistency, and consensus)

- Experience in building scalable, high-performance cloud data lake solutions

- Hands-on expertise in cloud services such as AWS and/or Microsoft Azure.

- As a Spark developer, you will manage the development of the scalable distributed architecture defined by the Architect or Tech Lead in our team.

- Analyse and assemble large data sets designed to meet the functional and non-functional requirements.

- You will develop ETL scripts for big data sources.

- Identify, design, and optimise automated data processing for reports and dashboards.

- You will be responsible for workflow, data, and ETL optimizations as per the requirements set out by the team.

- Work with stakeholders such as Product Managers, Technical Leads, and Service Layer engineers to ensure end-to-end requirements are addressed.

- Strong team player who adheres to the Software Development Life Cycle (SDLC) and produces the documentation needed at every stage of the SDLC.

- Hands-on working experience with any of the data engineering/analytics platforms (Hortonworks, Cloudera, MapR, AWS); AWS preferred

- Hands-on experience with data ingestion tools: Apache NiFi, Apache Airflow, Sqoop, and Oozie

- Hands-on working experience of data processing at scale with event-driven systems and message queues (Kafka, Flink, Spark Streaming)

- Hands-on working experience with AWS services such as EMR, Kinesis, S3, CloudFormation, Glue, API Gateway, and Lake Formation

- Hands-on working experience with AWS Athena

- Data warehouse exposure with Apache NiFi, Apache Airflow, and Kylo

- Operationalization of ML models on AWS (e.g. deployment, scheduling, model monitoring)

- Feature engineering and data processing for model development

- Experience gathering and processing raw data at scale (including writing scripts, web scraping, calling APIs, writing SQL queries, etc.)

- Experience building data pipelines for structured/unstructured, real-time/batch, and synchronous/asynchronous event data using MQ, Kafka, and stream processing

- Hands-on working experience in analysing source system data and data flows, working with structured and unstructured data

- Must be very strong in writing SQL queries

(ref:hirist.tech)
  • Big Data Engineer

    3 weeks ago


    Goa/Mumbai/Jammu & Kashmir/Jammu/Srinagar/Pondicherry/Jaipur/Lucknow/Varanasi/Banaras/Patna/Ranchi, IN ATech Full time

    Designation: BIG DATA ENGINEER. Job Description: Your Role and Responsibilities: - Understand a data warehousing solution and able to work independently in such an environment - Responsible in Project development and delivery experience of a few good size projects - Design, build, optimize and support new and existing data models and ETL processes based on our...


  • Hyderabad/Pune, IN Coders Brain Pvt Ltd Full time

    Primary Skills : Databricks with PySpark/Spark/Python and SQL. Secondary skills : ADF. Job Overview : - 7+ years of experience with detailed knowledge of data warehouse technical architectures, ETL/ELT and reporting/analytic tools. - 4+ years of work experience with very large data warehousing environment - 4+ years of work experience in Databricks, PySpark and...

  • Senior Data Engineer

    3 weeks ago


    Bangalore/Kolkata/Pune, IN Nilasu consulting Full time

    Summary : - We are seeking a highly skilled and experienced Senior Data Engineer to join our team. - You will play a key role in designing, developing, and maintaining large-scale data processing pipelines using Python and Spark/PySpark. - Your expertise in distributed computing frameworks and DevOps tools will be instrumental in building efficient and scalable...

  • Senior Data Engineer

    3 weeks ago


    Bangalore/Hyderabad/Mumbai/Pune, IN MLOPS SOLUTIONS PRIVATE LIMITED Full time

    Job Description : Primary skillset : Experience working with distributed technology tools for developing Batch and Streaming pipelines using : - SQL, Spark, PySpark - Airflow - Spark with Scala (optional). - Able to write code which is optimized for performance. - Experience in Cloud platform, e.g., AWS, GCP, Azure, etc. - Able to quickly pick up new programming...

  • Data Engineer

    3 weeks ago


    Hyderabad/Pune/Ahmedabad, IN Talentoj Full time

    Azure Data Engineer : We are seeking a highly skilled Azure Data Engineer to join the data engineering team and play a crucial role in an optimization initiative. The current problem is that read speed is compromised and faster responses from delta are required. The ideal candidate should have extensive experience in Python programming and be proficient in...

  • Senior Data Engineer

    3 weeks ago


    Mumbai/Chennai, IN Cyber Sphere LLC Full time

    Senior Data Engineer. Onsite : Mumbai/Chennai. About the Role : - This role is more focused on Pyspark with Cloud developer. About the Responsibilities : - This position provides direct input to project plans, schedules, and follows software methodologies and best practices in the development of cross-functional software products under a micro-services styled...

  • Big Data Engineer

    2 weeks ago


    Bangalore/Chennai/Pune/Trivandrum/Thiruvananthapuram, IN Tata Elxsi Full time

    Job Description : - At least 4+ years of experience pyspark framework. - Good experience writing complex SQL and No SQL databases. - Excellent coding and design skills, particularly in Pyspark and Python. - Strong practical working experience with Unix scripting in at least one of Python, HDFS, SQL, PL/SQL, Shell (either bash or zsh). - Experience in AWS...


  • Bangalore/Chennai/Pune/Hyderabad, IN CGI Information Systems and Management Consultants Full time

    Job Description : - Designing, installing, testing, and maintaining scalable data management systems - Ensuring systems meet business requirements and industry standards - Integrating new data management tools into company ecosystems - Creating custom software components and analytics applications - Research data acquisition opportunities and new uses for...

  • Data Engineer

    3 weeks ago


    Pune/Bangalore/Chennai, IN ADVANSOFT Full time

    Skills : - Hadoop - Python - Spark - PySpark - ETL (Extract, Transform, Load). Roles & Responsibilities : - Data Ingestion: Develop and maintain data pipelines for ingesting raw data from various sources into the Hadoop ecosystem. - Data Processing: Utilize Python and Spark to process and transform large volumes of data efficiently, ensuring scalability and...

  • Big Data Engineer

    3 weeks ago


    Bangalore/Chennai/Pune, IN ADVANSOFT Full time

    We are hiring for Data engineer with top most MNC client; Exp : 5+ years. Np : immediate to 15 days. Location : & Responsibilities : - 5+ years of working experience in ETL scalable data pipeline using Scala, Python, Pyspark, Hadoop, Apache Spark, Spark SQL, Kafka, Nill, and incremental Data Load with Big data technologies. - Experience working with Databases like...

  • GCP Data Engineer

    3 weeks ago


    Pune/Hyderabad/Anywhere in India/Multiple Locations, IN Huquo Full time

    Job Description : Must-Have : - 5+ Years of Experience in Data Engineering and building and maintaining large-scale data pipelines. - Experience with designing and implementing a large-scale Data-Lake on Cloud Infrastructure - Strong technical expertise in Python and SQL - Extremely well-versed in Google Cloud Platform including BigQuery, Cloud Storage, Cloud...


  • Anywhere in India/Multiple Locations/Hyderabad/Bangalore/Pune/Mumbai/Chennai, IN BRISKWIN IT SOLUTIONS PRIVATE LIMITED Full time

    Position/Role : Azure Databricks with PySpark/Spark/Python and SQL. Experience : 6-12 yrs. Location : Hyderabad/Bangalore/Pune/Mumbai/Chennai (Hybrid). Job Overview : Primary Skills : Databricks with PySpark/Spark/Python and SQL. Secondary skills : ADF. Job Description : 7+ years of experience with detailed knowledge of data warehouse technical architectures, ETL/...


  • Bangalore/Chennai/Hyderabad/Mumbai/Pune/Noida, IN ACZ Global Private Limited Full time

    Azure Data Architect. Mandatory Skills : Solution Architecture - PySpark + Databricks + ADF + Synapse is mandatory. Job Description : We are seeking a highly skilled and experienced Azure Data Architect to join our team. As an Azure Data Architect, you will play a key role in designing and implementing data solutions on the Microsoft Azure platform. The ideal...


  • Gurgaon/Gurugram/Noida/Pune/Bangalore, IN HuQuo Consulting Pvt. Ltd. Full time

    Python Developer or System Analyst. Essential Requirements: 1. Ability to set up and manipulate Python data structures like lists, strings, dictionaries, tuples. 2. Strong expertise in pandas and numpy. 3. Familiar with data exploration, visualization and comparing metrics of large csv and parquet files including partitioned parquet files. 4. Strong skills in join,...

  • Senior Data Scientist

    3 weeks ago


    Mumbai/Bihar/Jharkhand/Patna/Ranchi/Aurangabad/Guwahati/Kolkata/Jamshedpur, IN ATech Full time

    Designation: Data Scientist. Job Description: Work you will do? - As a Senior Data Scientist, you will assume responsibility for guiding the planning, assessment, and execution of our content initiatives and the subsequent engagement and learner and student success - You will play a key role in identifying gaps in content and proposing ways to acquire new,...

  • AWS Data Engineer

    3 weeks ago


    Pune/Chennai/Mumbai/Bangalore/Delhi NCR/Hyderabad, IN Change leaders Full time

    Job Description : AWS Glue + Python PySpark. Job Overview : Candidate should be responsible for the configuration, administration, and optimization of the AWS Glue + Python PySpark platform. This role involves ensuring the smooth operation of AWS Glue + Python PySpark for data integration, transformation, and loading tasks in alignment with the...

  • Kafka Architect

    2 weeks ago


    Bangalore/Pune/Kolkata/Chennai, IN Ultrabot Innovations Full time

    Job Description : As a Kafka Architect specializing in Spark and Apache Server, you will play a key role in designing, architecting, and implementing real-time data streaming solutions using Apache Kafka, Apache Spark, and related technologies. You will work closely with our clients and internal teams to understand business requirements, design robust...

  • Big Data Developer

    3 weeks ago


    Bangalore/Gurgaon/Gurugram/Pune/Chennai/Coimbatore/Mumbai/Cochin/Kochi/Noida, IN SP Software Pvt. Ltd. Full time

    Job Description : - Should have hands-on experience in software development - Responsible for designing, deploying, and managing high quality data solutions in the AWS cloud ecosystem. - Create and maintain optimal data architecture pipeline. - Collaborate with various onsite and client stakeholders in Agile environment to identify data engineering requirements...

  • Data Engineer

    3 weeks ago


    Pune/Hyderabad/Remote, IN HARP Technologies and Services Full time

    Job Location : Pune & Hyderabad (Initial 1-2 months hybrid and then complete remote job. Client will offer accommodation (stay), food & travel exp.) Exp range : 6+ years. Shift timings : General 9am - 6pm IST or 10am - 7pm IST. Mandatory skills : Data engineering (4+ years), Databricks (3.5+ years), Pyspark (3+ years), Python (preferred) or bash, Data pipeline,...

  • Azure Data Engineer

    3 weeks ago


    Delhi NCR/Mumbai/Bangalore/Hyderabad/Pune/Chennai/Kolkata/Coimbatore, IN Changeleaders Counsultancy Full time

    Job Description : - Min 3+ years of IT experience, including experience in designing Azure data platform with large scale implementation experience. - Hands-on experience in developing data lake solutions using Azure (Azure Data Factory for ingestion, Data Lake Gen 2 and Azure SQL Server for storage, Azure Analysis Services for transformations, Azure data...