PySpark Dev
5 hours ago
PySpark Dev & QA
Mandatory Skills
Big Data technologies: Hadoop / Big Data (HDFS, Python, Spark SQL, MapReduce) with PySpark
Experience building CI/CD pipelines is required
Outstanding coding, debugging and analytical skills
Experience using Spark APIs to cleanse, explore, aggregate, transform, store and analyse available data
Knowledge of installing, configuring, debugging and troubleshooting Hadoop clusters
Secondary Skills
Knowledge of cloud-based infrastructure and AWS services (EC2, EMR, S3, Lambda, EBS, IAM, Redshift, RDS)
Knowledge of deploying and managing ETL pipelines and of RDBMS technologies (PostgreSQL, MySQL, Oracle, etc.)
Knowledge of data frames, Pandas, data visualization tools & data mining
Knowledge of JIRA, Bitbucket, GitHub
JD
This position requires at least 4 years of experience, with expertise in PySpark, Databricks, and SQL. The candidate must have good experience in extracting and manipulating data from relational databases with advanced SQL, Hive, and Python/PySpark.
Responsibilities
To develop and implement a comprehensive quality assurance strategy for data engineering projects, ensuring that data quality standards are met throughout the data pipeline.
To collaborate with cross-functional teams to define test plans and strategies for validating data pipelines, ETL processes, and transformations using PySpark, Databricks, and SQL.
To design and execute tests using SQL queries and data profiling techniques to validate the accuracy, completeness, and integrity of data stored in various data repositories, including data lakes and databases.
To adhere to formal QA processes, ensuring that the Systems Implementation (SI) team is using industry-accepted best practices.
**About Virtusa**
Teamwork, quality of life, professional and personal development: values that Virtusa is proud to embody. When you join us, you join a team of 30,000 people globally that cares about your growth — one that seeks to provide you with exciting projects, opportunities and work with state of the art technologies throughout your career with us.
Great minds, great potential: it all comes together at Virtusa. We value collaboration and the team environment of our company, and seek to provide great minds with a dynamic place to nurture new ideas and foster excellence.
Virtusa was founded on principles of equal opportunity for all, and so does not discriminate on the basis of race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status or any other basis covered by appropriate law. All employment is decided on the basis of qualifications, merit, and business need.