AWS Glue PySpark Developer

5 days ago


India CIGNEX Full time ₹ 15,00,000 - ₹ 25,00,000 per year

We are looking for an experienced AWS Glue PySpark Developer to design, develop, and optimize ETL pipelines and data processing solutions on AWS. The ideal candidate will have deep expertise in PySpark, AWS Glue, and data engineering best practices, along with hands-on experience in building scalable, high-performance data solutions in the cloud.

Key Responsibilities:

  • Design, build, and maintain scalable ETL pipelines using AWS Glue and PySpark.
  • Work with stakeholders to gather and analyze data requirements and translate them into technical solutions.
  • Develop efficient and reusable PySpark scripts to process large-scale structured and unstructured datasets.
  • Optimize ETL jobs for performance, scalability, and cost-effectiveness in AWS environments.
  • Integrate AWS Glue with other AWS services such as S3, Redshift, RDS, Lambda, Step Functions, and Athena.
  • Implement data quality checks, validation frameworks, and error-handling mechanisms within ETL pipelines.
  • Collaborate with data engineers, analysts, and business teams to ensure data accuracy and consistency.
  • Monitor, debug, and resolve production issues related to Glue jobs and data workflows.
  • Ensure compliance with security, governance, and regulatory requirements for data pipelines.
  • Stay current with AWS and big data ecosystem advancements to continuously improve solutions.

Required Skills:

  • 5-6 years of experience in data engineering/ETL development, with at least 3 years in AWS Glue & PySpark.
  • Strong proficiency in PySpark, Spark SQL, and distributed data processing.
  • Hands-on experience with AWS services: S3, Glue Catalog, Redshift, RDS, Lambda, Step Functions, CloudWatch.
  • Expertise in designing data models, partitioning strategies, and optimizing large datasets.
  • Proficiency in SQL and working with relational as well as NoSQL databases.
  • Experience with version control (Git), CI/CD pipelines, and Agile methodologies.
  • Strong problem-solving skills and ability to debug complex data issues.
  • Excellent communication and collaboration skills.


  • India Matrix USA Full time

    Job Overview: We are seeking an experienced AWS Developer proficient in Python and PySpark to design, develop, and maintain scalable, serverless data processing and workflow automation solutions on AWS. The ideal candidate will build Lambda functions, Step Functions, Glue ETL jobs, and integrate various AWS services to support complex data pipelines and...


  • India Digitrix Software LLP Full time

    Experience: 5 to 8 years. Job description: Python AWS Data Engineer (Python, AWS). Python (core language skill): Backend, Pandas, PySpark (DataFrame API), interacting with AWS (e.g., boto3 for S3, Glue, Lambda). Data Processing: Spark (PySpark), Glue, EMR. AWS Core Services: S3, Glue, Athena, Lambda, Step Functions, EMR. Containerization: Docker...


  • India ThreatXIntel Full time

    Job Description. Company Description: ThreatXIntel is a startup cyber security company focused on protecting businesses and organizations from cyber threats. Our services include cloud security, web and mobile security testing, cloud security assessment, and DevSecOps. We provide customized, affordable solutions to meet the specific needs of our clients,...

