Delta Lake Architect

4 months ago


Banga Rural, India Thoucentric Full time



As a Big Data Engineer, you will be responsible for designing, developing, and maintaining our big data infrastructure. You will work with large datasets, perform data processing, and support various business functions by creating data pipelines, data processing jobs, and data integration solutions. You will be working in a dynamic and collaborative environment, leveraging your expertise in Hive, Hadoop, and PySpark to unlock valuable insights from our data.


Key Responsibilities:

Data Ingestion and Integration:

Develop and maintain data ingestion processes to collect data from various sources.

Integrate data from different platforms and databases into a unified data lake.


Data Processing:

Create data processing jobs using Hive and PySpark for large-scale data transformation.

Optimize data processing workflows to ensure efficiency and performance.


Data Pipeline Development:

Design and implement ETL pipelines to move data from raw to processed formats.

Monitor and troubleshoot data pipelines, ensuring data quality and reliability.


Data Modeling and Optimization:

Develop data models for efficient querying and reporting using Hive.

Implement performance tuning and optimization strategies for Hadoop and Spark.


Data Governance:

Implement data security and access controls to protect sensitive information.

Ensure compliance with data governance policies and best practices.


Collaboration:

Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and provide data support.




Requirements

Qualifications:

    • Bachelor's degree in Computer Science, Information Technology, or a related field.
    • 8 years of experience in big data engineering and data processing.
    • Proficiency in Hive, Hadoop, Airflow, and PySpark.
    • Strong SQL and NoSQL database experience.
    • Experience with data warehousing and data modeling.
    • Knowledge of data integration, ETL processes, and data quality.
    • Strong problem-solving and troubleshooting skills.


Preferred Qualifications:

    • Experience with cloud-based big data technologies (e.g., AWS, Azure, and GCP).
    • Certification in Hadoop, Hive, or PySpark.



Benefits

What a Consulting role at Thoucentric will offer you:
  • Opportunity to define your own career path, rather than one enforced by a manager.
  • A great consulting environment with a chance to work with Fortune 500 companies and startups alike.
  • A dynamic but relaxed and supportive working environment that encourages personal development.
  • Be part of One Extended Family. We bond beyond work: sports, get-togethers, common interests, etc.
  • Work in a very enriching environment with an Open Culture, Flat Organization, and Excellent Peer Group.
  • Be part of the exciting Growth Story of Thoucentric.


Hive, Hadoop, PySpark, Delta Lake