
PySpark, Hadoop
1 day ago
Job Summary:
Design and implement ETL workflows using AWS Glue, Python, and PySpark (a sketch follows this summary).
Develop and optimize queries using Amazon Athena and Redshift.
Build scalable data pipelines to ingest, transform, and load data from various sources.
Ensure data quality, integrity, and security across AWS services.
Collaborate with data analysts, data scientists, and business stakeholders to deliver data solutions.
Monitor and troubleshoot ETL jobs and cloud infrastructure performance.
Automate data workflows and integrate with CI/CD pipelines.
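To illustrate the Glue/PySpark duty above, here is a minimal sketch of a Glue ETL job script. The catalog database, table, column mappings, and S3 bucket are hypothetical placeholders rather than details from the posting, and the script assumes the AWS Glue job runtime, where the `awsglue` libraries are available.

```python
# Minimal AWS Glue ETL sketch. All database, table, column, and bucket
# names are hypothetical placeholders.
import sys
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Ingest: read from a Glue Data Catalog table (populated e.g. by a crawler over S3).
source = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db",        # hypothetical catalog database
    table_name="orders_raw",  # hypothetical catalog table
)

# Transform: rename and cast columns during the mapping step.
mapped = ApplyMapping.apply(
    frame=source,
    mappings=[
        ("order_id", "string", "order_id", "string"),
        ("amount", "string", "amount", "double"),
        ("order_date", "string", "order_date", "date"),
    ],
)

# Load: write partitioned Parquet back to S3 for downstream Athena/Redshift use.
glue_context.write_dynamic_frame.from_options(
    frame=mapped,
    connection_type="s3",
    connection_options={
        "path": "s3://example-curated-bucket/orders/",  # hypothetical bucket
        "partitionKeys": ["order_date"],
    },
    format="parquet",
)

job.commit()
```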
Required Skills & Qualifications:
Hands-on experience with AWS Glue, Athena, and Redshift.
Strong programming skills in Python and PySpark.
Experience with ETL design, implementation, and optimization.
Familiarity with S3, Lambda, CloudWatch, and other AWS services.
Understanding of data warehousing concepts and performance tuning in Redshift.
Experience with schema design, partitioning, and query optimization in Athena (see the sketch after this list).
Proficiency in version control (Git) and agile development practices.
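As a companion to the Athena requirement above, here is a hedged sketch of issuing a partition-pruned query through boto3; the region, database, table, partition column, and results bucket are all illustrative assumptions. Filtering on the partition column lets Athena prune S3 prefixes instead of scanning the whole table, which is usually the main lever for query cost and latency.

```python
# Hedged sketch: running a partition-pruned Athena query with boto3.
# Region, database, table, and bucket names are hypothetical.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Filtering on the partition column (order_date) lets Athena prune
# S3 prefixes instead of scanning the full table.
query = """
    SELECT order_id, amount
    FROM curated_db.orders
    WHERE order_date = DATE '2024-01-01'
"""

resp = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "curated_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = resp["QueryExecutionId"]

# Poll until the query reaches a terminal state.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

# Print the first page of results (row 0 is the header row).
if state == "SUCCEEDED":
    result = athena.get_query_results(QueryExecutionId=query_id)
    for row in result["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])
```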
**About Virtusa**
Teamwork, quality of life, professional and personal development: values that Virtusa is proud to embody. When you join us, you join a global team of 27,000 people that cares about your growth, one that seeks to provide you with exciting projects, opportunities, and work with state-of-the-art technologies throughout your career with us.
Great minds, great potential: it all comes together at Virtusa. We value collaboration and the team environment of our company, and seek to provide great minds with a dynamic place to nurture new ideas and foster excellence.
Virtusa was founded on principles of equal opportunity for all, and so does not discriminate on the basis of race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status or any other basis covered by appropriate law. All employment is decided on the basis of qualifications, merit, and business need.
-
PySpark QA
3 days ago
Andhra Pradesh, India · Virtusa · Full time
**JOB DESCRIPTION**
**Skill: PySpark QA**
**Role / Tier: Lead Software Engineer / Tier 2**
**Experience: 6 - 9 years**
Primary skills: the Big Data technologies listed below.
Hadoop / Big Data (HDFS, Python, Spark SQL, MapReduce) with PySpark.
Build CI/CD pipelines.
Use Spark APIs to cleanse, explore, aggregate, transform, store, and analyse data (see the sketch below).
Installing, configuring,...
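A minimal sketch of the cleanse / explore / aggregate / transform / store workflow the posting names, assuming hypothetical HDFS paths and column names:

```python
# Hedged sketch of a cleanse/explore/aggregate/store pipeline with the
# Spark DataFrame API; paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cleanse-and-aggregate").getOrCreate()

# Explore: read raw CSV from HDFS and inspect the schema.
raw = spark.read.option("header", True).csv("hdfs:///data/raw/events.csv")
raw.printSchema()

# Cleanse: drop rows missing keys, normalize types, deduplicate.
clean = (
    raw.dropna(subset=["user_id", "event_ts"])
       .withColumn("amount", F.col("amount").cast("double"))
       .dropDuplicates(["user_id", "event_ts"])
)

# Aggregate: daily counts and revenue via the DataFrame API
# (an equivalent Spark SQL query would also work).
daily = (
    clean.groupBy(F.to_date("event_ts").alias("event_date"))
         .agg(F.count("*").alias("events"), F.sum("amount").alias("revenue"))
)

# Store: write partitioned Parquet for downstream analysis.
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "hdfs:///data/curated/daily_events"
)
```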
-
Big Data PySpark
2 days ago
Andhra Pradesh, India · Virtusa · Full time
Overall 10+ years of experience with data warehouse and Hadoop platforms.
Must have experience with Python/PySpark and Hive in Big Data environments (see the sketch below).
Should have strong skills in writing complex SQL queries and a good understanding of data warehouse concepts.
Exposure to migrating a legacy data warehouse platform to Hadoop will be a big...
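A short sketch of PySpark working against Hive tables, as the posting requires; the warehouse schema, tables, and columns are invented for illustration:

```python
# Hedged sketch: PySpark with Hive support in a Big Data environment.
# The warehouse database, tables, and columns are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-etl")
    .enableHiveSupport()  # lets Spark read/write Hive metastore tables
    .getOrCreate()
)

# Complex SQL directly against Hive tables.
result = spark.sql("""
    SELECT c.region,
           SUM(o.amount) AS total_amount,
           COUNT(DISTINCT o.order_id) AS orders
    FROM warehouse.orders o
    JOIN warehouse.customers c ON o.customer_id = c.customer_id
    WHERE o.order_date >= '2024-01-01'
    GROUP BY c.region
""")

# Persist the result as a managed Hive table for downstream consumers.
result.write.mode("overwrite").saveAsTable("warehouse.region_sales")
```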
-
MLSE (Python/PySpark)
2 weeks ago
Noida, Uttar Pradesh, India · Impetus Technologies · Full time
Locations: Noida, Uttar Pradesh; Gurgaon, Haryana; Hyderabad, Telangana; Bangalore, Karnataka; Indore, Madhya Pradesh (all India)
Qualification:
- 6-8 years of good hands-on exposure to Big Data technologies
- PySpark (DataFrame and Spark SQL), Hadoop, and Hive
- Good hands-on experience with Python and Bash scripts
- Good understanding of SQL and...
-
Sr. Data Engineer
3 days ago
Indrapuri Colony, Indore, Madhya Pradesh, India · Mindefy Technologies · Full time · ₹ 3,10,949 per year
Job Title: Data Engineer
Experience: 3–5 years
Location: Indore
Employment Type: Full-time
About the Role: We're looking for a Data Engineer with 3–5 years of experience to join our team. In this role, you'll be responsible for building and managing data pipelines, improving data platforms, and making sure the right data is available for business needs. You...