Data Engineer

6 days ago


Greater Kolkata Area, India Codesmith Full time ₹ 5,00,000 - ₹ 25,00,000 per year

Description
We are seeking a strong Data Engineer with advanced expertise in Databricks and PySpark. The successful candidate will be a key contributor to critical projects, including migrating Palantir data transformation pipelines to Databricks Notebooks, designing and implementing incremental data pipelines, and orchestrating workflows in Azure Databricks.

Key Responsibilities

  • Migrate Palantir data pipelines to Databricks Notebooks, leveraging PySpark for complex transformations.
  • Replace proprietary Palantir libraries with open-source or custom PySpark implementations.
  • Design, build, and maintain incremental data load pipelines to handle dynamic updates from various sources, ensuring scalability and efficiency.
  • Develop robust data ingestion pipelines to load data into the Databricks Bronze layer from relational databases, APIs, and file systems.
  • Implement incremental data transformation workflows to update Silver and Gold layer datasets in near real time, adhering to Delta Lake best practices.
  • Integrate Airflow with Databricks to orchestrate end-to-end workflows, including dependency management, error handling, and scheduling.
  • Understand business and technical requirements, translating them into scalable Databricks solutions.
  • Optimize Spark jobs and queries for performance, scalability, and cost-efficiency in a distributed environment.
  • Implement robust data quality checks, monitoring solutions, and governance frameworks within Databricks.
  • Collaborate with team members on Databricks best practices, reusable solutions, and incremental loading strategies.

Required Qualifications

  • Bachelor's degree in Computer Science, Information Systems, or a related discipline.
  • 6+ years of hands-on experience with Databricks, including expertise in PySpark.
  • Proven experience in incremental data loading techniques into Databricks, leveraging Delta Lake's features (e.g., time travel, MERGE INTO).
  • Strong understanding of data warehousing concepts, including data partitioning and indexing for efficient querying.
  • Solid knowledge of Azure Cloud Services, particularly Azure Databricks and Azure Data Lake Storage.
  • Familiarity with version control systems (e.g., Git) and CI/CD pipelines for data engineering workflows.
  • Excellent analytical and problem-solving skills with a focus on detail-oriented development.

Preferred Qualifications

  • Proficiency in Palantir and experience in migrating Palantir data pipelines to Databricks.
  • Expertise in Airflow integration for workflow orchestration, including designing and managing DAGs.
  • Familiarity with advanced Airflow features, such as SLA monitoring and external task dependencies.
  • Advanced knowledge of Delta Lake optimizations, such as compaction, Z-ordering, and vacuuming.
  • Experience with real-time streaming data pipelines using tools like Kafka or Azure Event Hubs.
  • Experience building, updating, deploying, and fine-tuning ML models.
  • Certifications such as Databricks Certified Associate Developer for Apache Spark or equivalent.
  • Experience in Agile development methodologies.



  • Lead Data Engineer

    6 days ago


    Greater Kolkata Area, India Eucloid Data Solutions Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Job Description: Eucloid is looking for a senior/lead Data Engineer to join our Data Platform team supporting various business applications. The ideal candidate will support development of data infrastructure on Databricks for our clients by participating in activities which may include starting from upstream and downstream technology selection to...

  • Data Engineer

    19 hours ago


    Greater Kolkata Area, India Workmates Core2Cloud Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Responsibilities / Job Description / About the Role: As a Junior/Senior Data Engineer, you'll be taking the lead in designing and maintaining complex data ecosystems. Your experience will be instrumental in optimizing data processes, ensuring data quality, and driving data-driven decision-making within the organization. Architecting and designing complex data systems...

  • Data Engineer

    18 hours ago


    Greater Kolkata Area, India Sundew Full time ₹ 6,00,000 - ₹ 18,00,000 per year

    Sundew is a leading digital transformation firm with an 18-year legacy of excellence. We specialize in digital strategy, application development, and engineering, utilizing MEAN, MERN, and LAMP stacks, with PHP Laravel as our primary proficiency. As we continue to expand, we are seeking a skilled and experienced Data Engineer to design and optimize data...

  • Data Engineer

    15 hours ago


    Greater Kolkata Area, India Patch Infotech Pvt Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Position: Data Engineer. Experience: 5-8 Years. Location: Work From Home. Job Summary: We are seeking a skilled Data Engineer with 5-8 years of experience to join our remote team. The ideal candidate will have extensive experience with AWS Glue and a strong background in building and maintaining robust data pipelines. You will be responsible for designing,...

  • Data Engineer

    6 days ago


    Greater Kolkata Area, India ZenYData Technologies Private Limited Full time ₹ 9,00,000 - ₹ 12,00,000 per year

    We're Hiring – Data Engineer – Google Cloud Platform (GCP) – ZenYData Technologies Private Limited. At the forefront of Data Automation & Data Management in Kolkata, we are on the lookout for passionate, innovative, and experienced Data Engineers ready to take on exciting challenges in Google Cloud Platform (GCP). Job Title: Data Engineer – Google Cloud...

  • Data Engineer

    2 weeks ago


    Greater Kolkata Area, India MatchMove Full time ₹ 9,00,000 - ₹ 12,00,000 per year

    You Will Get To: Design, build, and maintain high-performance data pipelines that integrate large-scale transactional data from our payments platform, ensuring data quality, reliability, and compliance with regulatory requirements. Develop and manage distributed data processing pipelines for both high-volume data streams and batch processing workflows in a...

  • Senior Data Engineer

    13 hours ago


    Greater Kolkata Area, India Lexmark Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Lexmark is now a proud part of Xerox, bringing together two trusted names and decades of expertise into a bold and shared vision. When you join us, you step into a technology ecosystem where your ideas, skills, and ambition can shape what comes next. Whether you're just starting out or leading at the highest levels, this is a place to grow, stretch, and make...

  • Lead Data Engineer

    5 days ago


    Greater Kolkata Area, India Atom Systems Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Description: Name of the position: Data Engineer. Location: Coimbatore / Remote. Number of resources needed: 01. Mode: Contract to Hire. Years of experience: 15+ Years. About the Role: We are seeking a highly skilled and driven Data Engineering Lead to lead our data engineering team. The ideal candidate combines strong leadership and technical expertise with the ability to...

  • Data Engineer

    6 days ago


    Greater Kolkata Area, India HIC Global Solutions Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Key Responsibilities: Design, develop, test, and maintain robust data architectures, pipelines, and ETL processes. Ensure data quality, integrity, and security across systems and workflows. Optimize data systems for performance, scalability, and cost-efficiency. Collaborate with cross-functional teams to gather requirements and enable data-driven analytics and...

  • Data Engineer

    2 weeks ago


    Greater Kolkata Area, India The IT Firm Full time ₹ 5,00,000 - ₹ 8,00,000 per year

    Responsibilities: Design, build, and maintain data pipelines and ETL workflows on Google Cloud Platform. Work with BigQuery, Dataflow, Pub/Sub, Dataproc, and Cloud Storage to enable scalable data solutions. Develop and optimize data models, transformations, and analytics layers. Write efficient Python/SQL scripts for data processing and automation. Collaborate...