PySpark/Databricks Engineer

3 weeks ago


Anywhere in India/Multiple Locations/Hyderabad/Srinagar/Jaipur, IN Aricent Full time

Job : PySpark/Databricks Engineer

Open for multiple locations, with work-from-office (WFO) and work-from-home (WFH) options

Job Description :

We are looking for a PySpark solutions developer and data engineer who can design and build solutions for one of our Fortune 500 client programs, which aims to build a Hadoop cluster for standardized and curated data.

This high-visibility, fast-paced key initiative will integrate data across internal and external sources, provide analytical insights, and integrate with the customer's critical systems.

Key Responsibilities :

- Design, build, and unit test applications on the Spark framework in Python.

- Build PySpark-based applications for both batch and streaming requirements, which requires in-depth knowledge of most Hadoop and NoSQL databases.

- Develop and execute data pipeline testing processes and validate business rules and policies.

- Optimize the performance of Spark applications on Hadoop by tuning configurations for SparkContext, Spark SQL, DataFrames, and pair RDDs.

- Optimize performance for data access requirements by choosing the appropriate native Hadoop file formats (Avro, Parquet, ORC, etc.) and compression codecs.

- Design and build real-time applications using Apache Kafka and Spark Streaming.

- Build integrated solutions leveraging Unix shell scripting, RDBMS, Hive, and HDFS, including HDFS file types and compression codecs.

- Build data tokenization libraries and integrate them with Hive and Spark for column-level obfuscation.

- Experience in processing large amounts of structured and unstructured data, including integrating data from multiple sources.

- Create and maintain an integration and regression testing framework on Jenkins, integrated with Bitbucket and/or Git repositories.

- Participate in the agile development process, and document and communicate issues and bugs related to data standards in scrum meetings.

- Work collaboratively with onsite and offshore teams.

- Develop and review technical documentation for delivered artifacts.

- Ability to solve complex data-driven scenarios and triage defects and production issues.

- Ability to learn, unlearn, and relearn concepts with an open and analytical mindset.

- Participate in code release and production deployment.

- Challenge and inspire team members to achieve business results in a fast-paced and quickly changing environment.

Qualifications :

- BE/B.Tech/B.Sc. in Computer Science, Statistics, or Econometrics from an accredited college or university.

- Minimum 3 years of extensive experience in design, build and deployment of PySpark-based applications.

- Expertise in handling complex, large-scale Big Data environments (preferably 20 TB+).

- Minimum 3 years of experience in the following: Hive, YARN, and HDFS, preferably on Hortonworks Data Platform.

- Good implementation experience with object-oriented programming (OOP) concepts.

- Hands-on experience writing complex SQL queries and exporting and importing large amounts of data using utilities.

- Ability to build abstracted, modularized reusable code components.

- Hands-on experience in generating and parsing XML and JSON documents and REST API requests/responses.

(ref:hirist.tech)
