Unity Catalog Migration

12 hours ago


India PureSoftware Pvt Ltd Full time ₹ 20,00,000 - ₹ 25,00,000 per year

This role focuses on migrating existing data environments from Apache Hive Metastore to Databricks Unity Catalog, leveraging Scala for data transformations and pipeline adjustments.

The ideal candidate is a seasoned senior professional with 8+ years of relevant experience and strong expertise in Databricks, Scala, and Spark in an Azure cloud environment.

Typical responsibilities include:

  • Experience with large-scale data migrations.
  • Good knowledge and implementation experience in data lineage and auditing tools.
  • Assessment & Planning: Analyze the current Hive Metastore environment, including data models, pipelines, and access controls, to define a comprehensive migration strategy to Unity Catalog.
  • Unity Catalog Setup: Configure and manage Unity Catalog metastores, external locations, and credentials within Databricks workspaces.
  • Metadata Migration: Develop and execute Scala-based scripts and Databricks notebooks to migrate Hive Metastore tables, views, and associated metadata to Unity Catalog. This may involve using Unity Catalog's upgrade wizard or custom solutions for complex scenarios.
  • Data Governance & Security: Implement and enforce Unity Catalog's centralized access controls (ACLs, grants) to ensure secure data access and compliance.
  • Pipeline Modernization: Refactor existing Scala/Spark data pipelines to integrate seamlessly with Unity Catalog, updating table references and ensuring data integrity during and after migration.
  • Testing & Validation: Conduct thorough testing to validate data consistency, performance, and access control policies in the Unity Catalog environment.
  • Documentation: Create comprehensive documentation for the migration process, including architecture diagrams, migration scripts, and operational procedures.
  • Collaboration: Work closely with data architects, data scientists, and other engineering teams to ensure a smooth transition and adoption of Unity Catalog.

Required Skills & Qualifications:

  • Expertise in Scala: Strong proficiency in Scala for data manipulation, Spark development, and building robust data pipelines.
  • Databricks Platform: In-depth knowledge of Databricks, including Spark, Delta Lake, and Databricks notebooks.
  • Unity Catalog: Hands-on experience with Unity Catalog setup, configuration, and migration strategies.
  • Hive Metastore: Solid understanding of Hive Metastore concepts and its integration with data processing frameworks.
  • Cloud Platforms: Experience with cloud platforms (e.g., Azure, AWS, GCP) and their data storage services (e.g., ADLS, S3, GCS).
  • Data Governance: Familiarity with data governance principles, access control mechanisms, and data security best practices.
  • Problem-Solving: Excellent analytical and problem-solving skills to address complex migration challenges.



  • India Mastech Digital Full time

    Position: Unity Catalog + Databricks + Informatica (CDGC) Location: Remote Duration: Full Time and Full time Contractors (Both option available) Budget: 10-27 LPA Notice Period: Only Immediate Joiners/ Currently Serving Notice/ Notice is less than 30 days Primary - Unity Catalog + Databricks, Secondary – CDGC Ensure all core data assets have their definitions...



  • AVP

    6 days ago


    Hyderabad, India Impetus Career Consultants Full time

    Job Description Position: AVP Databricks Architect Champion Location: Hyderabad | Work Mode: 5 Days from Office | Shift: UK Timings | Experience: 14+ yrs We are looking for an experienced Databricks Architect Champion to lead data platform modernization and large-scale Lakehouse transformation initiatives. The ideal candidate will bring strong hands-on...

  • Data Engineer

    3 weeks ago


    Kolkata, West Bengal, India Tata Consultancy Services Full time

    TCS has been a great pioneer in feeding the fire of young techies like you. We are global leaders in the technology arena and there's nothing that can stop us from growing together. TCS Hiring for skill "Data Engineer". Role: Databricks Senior Developer Required Technical Skill Set: Databricks, Spark and Data Migration experience Experience: 8+ years Work...


  • India Oracle Full time

    Job Description We are seeking an experienced Data Architect specializing in Databricks to lead the architecture, design, and migration of enterprise data workloads from on-premises systems (e.g., Oracle, Exadata, Hadoop) to Databricks on Azure or AWS. The role involves designing scalable, secure, and high-performing data platforms based on the medallion...


  • Kolkata, West Bengal, India Tata Consultancy Services Full time

    Role: Databricks Senior Developer Location: Kolkata Experience: 8 Plus Years Required Technical Skill Set: Databricks, Spark and Data Migration experience Desired Competencies (Technical/Behavioral Competency) Must-Have: 1. 8+ years with strong Databricks, Spark and Data Migration experience 2. Should have experience of end-to-end...


  • India Epergne Solutions Full time

    Project Overview: We are seeking skilled Data Engineers with strong hands-on experience in Databricks for a high-impact data platform project. The project involves building scalable data pipelines, integrating with cloud platforms, and managing data governance using Unity Catalog. Key Requirements: Primary Skills: Expertise in Databricks working on either AWS or...


  • Hyderabad, India Oracle Full time

    Job Description We are seeking an experienced Data Architect specializing in Databricks to lead the architecture, design, and migration of enterprise data workloads from on-premises systems (e.g., Oracle, Exadata, Hadoop) to Databricks on Azure or AWS. The role involves designing scalable, secure, and high-performing data platforms based on...