Unity Catalog Migration
2 days ago
This role focuses on migrating existing data environments from Apache Hive Metastore to Databricks Unity Catalog, leveraging Scala for data transformations and pipeline adjustments.
The ideal candidate is a seasoned senior engineer with 8+ years of relevant experience and strong expertise in Databricks, Scala, and Spark in an Azure cloud environment, along with experience in large-scale data migrations and hands-on implementation experience with data lineage and auditing tools.
Typical responsibilities include:
- Assessment & Planning: Analyze the current Hive Metastore environment, including data models, pipelines, and access controls, to define a comprehensive migration strategy to Unity Catalog.
- Unity Catalog Setup: Configure and manage Unity Catalog metastores, external locations, and credentials within Databricks workspaces.
- Metadata Migration: Develop and execute Scala-based scripts and Databricks notebooks to migrate Hive Metastore tables, views, and associated metadata to Unity Catalog. This may involve using Unity Catalog's upgrade wizard or custom solutions for complex scenarios.
- Data Governance & Security: Implement and enforce Unity Catalog's centralized access controls (ACLs, grants) to ensure secure data access and compliance.
- Pipeline Modernization: Refactor existing Scala/Spark data pipelines to integrate seamlessly with Unity Catalog, updating table references and ensuring data integrity during and after migration.
- Testing & Validation: Conduct thorough testing to validate data consistency, performance, and access control policies in the Unity Catalog environment.
- Documentation: Create comprehensive documentation for the migration process, including architecture diagrams, migration scripts, and operational procedures.
- Collaboration: Work closely with data architects, data scientists, and other engineering teams to ensure a smooth transition and adoption of Unity Catalog.
Required Skills & Qualifications:
- Expertise in Scala: Strong proficiency in Scala for data manipulation, Spark development, and building robust data pipelines.
- Databricks Platform: In-depth knowledge of Databricks, including Spark, Delta Lake, and Databricks notebooks.
- Unity Catalog: Hands-on experience with Unity Catalog setup, configuration, and migration strategies.
- Hive Metastore: Solid understanding of Hive Metastore concepts and its integration with data processing frameworks.
- Cloud Platforms: Experience with cloud platforms (e.g., Azure, AWS, GCP) and their data storage services (e.g., ADLS, S3, GCS).
- Data Governance: Familiarity with data governance principles, access control mechanisms, and data security best practices.
- Problem-Solving: Excellent analytical and problem-solving skills to address complex migration challenges.
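As a rough illustration of the pipeline modernization and access control work described above, the sketch below shows one way table references might be rewritten during such a migration. Unity Catalog addresses tables with a three-level name (catalog.schema.table), and the legacy Hive Metastore appears as the `hive_metastore` catalog. Everything else here is an assumption made for illustration: the object name `UcMigrationSketch`, the helper names, the target catalog `main`, and the `analysts` group are hypothetical and not part of the posting.

```scala
// Minimal sketch in plain Scala (no Spark dependency). All names are hypothetical.
object UcMigrationSketch {

  // Qualify a legacy Hive table reference with a target Unity Catalog catalog.
  //   "orders"                      -> "<catalog>.<defaultSchema>.orders"
  //   "sales.orders"                -> "<catalog>.sales.orders"
  //   "hive_metastore.sales.orders" -> "<catalog>.sales.orders"
  // A three-level name in any other catalog is returned unchanged.
  def toUnityCatalogName(
      ref: String,
      targetCatalog: String,
      defaultSchema: String = "default"
  ): String = {
    val parts = ref.split('.').map(_.trim).filter(_.nonEmpty).toList
    parts match {
      case table :: Nil                               => s"$targetCatalog.$defaultSchema.$table"
      case schema :: table :: Nil                     => s"$targetCatalog.$schema.$table"
      case "hive_metastore" :: schema :: table :: Nil => s"$targetCatalog.$schema.$table"
      case _ :: _ :: _ :: Nil                         => ref
      case _ =>
        throw new IllegalArgumentException(s"Unsupported table reference: $ref")
    }
  }

  // Build a SQL GRANT statement giving a principal read access to a table.
  def grantSelect(table: String, principal: String): String =
    s"GRANT SELECT ON TABLE $table TO `$principal`"
}
```

In a notebook, the rewritten name could then be passed to `spark.table(...)` and the generated statement to `spark.sql(...)`. For example, `UcMigrationSketch.toUnityCatalogName("sales.orders", "main")` yields `main.sales.orders`. A real migration would also need to handle views, quoted identifiers, and references embedded in SQL strings, which this sketch does not attempt.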