PySpark C5

1 week ago


Kolkata, India | Mindtree | Full time

**Responsibilities**:
- Partner with business stakeholders to gather requirements and translate them into technical specifications and process documentation for IT counterparts (onshore and offshore)
- Highly proficient in the architecture and development of an event-driven data warehouse: streaming, batch, data modeling, and storage
- Advanced database knowledge: creating and optimizing SQL queries, stored procedures, functions, partitioning data, indexing, and reading execution plans
- Skilled and experienced in writing and troubleshooting Python/PySpark scripts to generate extracts and to cleanse, conform, and deliver data for consumption
- Expert-level understanding and implementation of ETL architecture: data profiling, process flow, metric logging, and error handling
- Support continuous improvement by investigating and presenting alternatives to processes and technologies to an architectural review board
- Develop and ensure adherence to published system architectural decisions and development standards
- Lead and mentor junior data engineers in their careers to produce higher-quality solutions at a faster velocity through optimization training and code review
- Multi-task across several ongoing projects and daily duties of varying priorities as required
- Interact with global technical teams to communicate business requirements and collaboratively build data solutions

The duties listed above are the essential functions, or fundamental duties, within the job classification. The essential functions of individual positions within the classification may differ. Reasonably related additional duties may be assigned to individual employees consistent with standard departmental policy.

**Qualifications**:
- Bachelor's degree in computer science, MIS, or a related area required, or equivalent experience (industry experience substitutable)
- 10 years of experience in data development
- 3 years of experience in the banking and financial domain
- Expert level in data warehouse design and architecture, dimensional data modeling, and ETL process development
- Advanced-level development in SQL/NoSQL scripting and complex stored procedures (Snowflake, SQL Server, DynamoDB, Neo4j a plus)
- Extremely proficient in Python, PySpark, and Java
- AWS expertise, including Kinesis

Job Requirements: PySpark, Data Processing, ETL Pipeline

**Job Type**:
Full Time

**Location**:
KOLKATA

**Mandatory Skills**:

- PySpark

**Years of Experience**:
11 to 14 Years


