Data Engineer (Python, PySpark, Apache Airflow, NoSQL)
2 months ago
Responsibilities: Build, optimize, and maintain scalable ETL pipelines for data ingestion and processing. Develop and manage workflows using Apache Airflow for scheduling and orchestrating tasks. Work with distributed computing technologies (PySpark) to handle large-scale datasets. Design and implement data architectures that scale with growing business needs. Implement data lake and data warehousing solutions using both structured and unstructured data. Collaborate with data scientists and analytics teams to ensure data quality and availability. Optimize existing data models and pipelines for performance and scalability. Use NoSQL databases (e.g., MongoDB, Cassandra) for large, scalable data storage solutions. Ensure high data integrity, security, and quality through monitoring and validation processes. Write clear documentation and maintain data engineering best practices.
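For illustration only, a minimal sketch of how an Airflow workflow of the kind described here might orchestrate a PySpark ETL job; the DAG id, script path, connection id, and Spark settings are assumptions, not taken from the posting:

```python
# Illustrative sketch: a daily Airflow DAG that submits one PySpark ETL job.
# The DAG id, application path, and Spark connection below are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

default_args = {
    "owner": "data-engineering",
    "retries": 2,                          # rerun a failed task twice
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_etl_pipeline",           # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",            # run once per day
    catchup=False,
    default_args=default_args,
) as dag:
    ingest_and_transform = SparkSubmitOperator(
        task_id="ingest_and_transform",
        application="/opt/jobs/etl_job.py",   # hypothetical PySpark script
        conn_id="spark_default",
        conf={"spark.executor.memory": "4g"},
    )
```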
Skills & Qualifications: Strong proficiency in Python, PySpark, and SQL. Experience working with Apache Airflow for orchestration. Hands-on experience with distributed computing and big data tools (PySpark, Hadoop). Familiarity with cloud platforms (AWS, GCP) and tools like S3, EMR, Lambda, etc. Experience with NoSQL databases (e.g., MongoDB, Cassandra) and relational databases. Strong understanding of data warehousing concepts, ETL processes, and data lake architecture. Experience with data pipeline monitoring, logging, and alerting. Strong knowledge of Docker and containerized environments. Familiarity with DevOps and CI/CD practices for data engineering. Excellent problem-solving, communication, and teamwork skills.
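A short PySpark sketch of the kind of transform these requirements imply, reading from and writing to S3; the bucket paths, column names, and app name are illustrative assumptions:

```python
# Hypothetical example: read raw JSON events from S3, aggregate them, and
# write partitioned Parquet back to a curated zone. Paths and fields are made up.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events_daily_rollup").getOrCreate()

events = spark.read.json("s3a://example-raw-bucket/events/")       # assumed path

daily_counts = (
    events
    .withColumn("event_date", F.to_date("event_ts"))               # assumed column
    .groupBy("event_date", "event_type")
    .agg(F.count("*").alias("event_count"))
)

(daily_counts.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3a://example-curated-bucket/daily_event_counts/"))  # assumed path
```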
About the Company: CuberaTech, founded in 2020, is a data company revolutionizing Big Data Analytics through a data value share paradigm in which users entrust their data to us. Our deployment of deep learning techniques enables us to harness this data, making us a source of the richest zero-party data. Further, by stitching together all the relevant pieces of data from zero-, first-, and second-party sources, we enable advertisers to define and create custom audiences to maximize programmatic ROAS. Website:
pyspark,ci/cd,nosql,cassandra,sql,devops,mongodb,apache airflow,airflow,python,aws,hadoop,docker,apache,gcp
-
Data Engineer (Python, PySpark, Apache Airflow, NoSQL)
2 months ago
Bengaluru, India ConsultBae India Private limited Full time. Data Engineer (Python, PySpark, Apache Airflow, NoSQL). Location: Bengaluru, Karnataka, India (Onsite). Experience: 3-5 years. Responsibilities: Build, optimize, and maintain scalable ETL pipelines for data ingestion and processing. Develop and manage workflows using Apache Airflow for scheduling and orchestrating tasks. Work with distributed computing technologies (PySpark) to...
-
Bengaluru, Karnataka, India ConsultBae India Private limited Full time. About the Role: We are seeking a highly skilled Data Engineer to join our team at CuberaTech. As a Data Engineer, you will be responsible for designing, building, and maintaining large-scale data pipelines and architectures using Python, PySpark, and Apache Airflow. You will work closely with our data scientists and analytics teams to ensure data quality and...
-
Senior Data Engineer
4 weeks ago
Bengaluru, Karnataka, India ConsultBae India Private limited Full time. About the Role: We are seeking a highly skilled Data Engineer to join our team at CuberaTech. As a Data Engineer, you will be responsible for designing, building, and maintaining large-scale data processing systems using Python, PySpark, and Apache Airflow. Key Responsibilities: Design and implement scalable data pipelines using Python and PySpark. Develop and...
-
Data Engineer
2 months ago
Bengaluru, India nTech Workforce Full time. Responsibilities: - Build data collection pipelines that can acquire and handle millions of public data points from various sources using APIs and web extraction techniques at scale. - Build the data cleaning and preprocessing pipelines in the platform leveraging AWS services such as Lambda, S3, EC2, PostgreSQL, Elasticsearch, etc. - Iterative improvements to...
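As a rough illustration of the collection-pipeline pattern described above, a hedged AWS Lambda sketch that pulls one page of public data from an API and lands the raw JSON in S3; the endpoint, bucket, and key layout are assumptions for illustration only:

```python
# Hypothetical Lambda handler: fetch public data over HTTP and store the raw
# response in S3 for downstream cleaning. Endpoint and bucket names are made up.
import json
import urllib.request
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
SOURCE_URL = "https://api.example.com/v1/records?page=1"   # assumed endpoint
RAW_BUCKET = "example-raw-data"                            # assumed bucket

def handler(event, context):
    # Pull one page of data; a real pipeline would paginate and add retries.
    with urllib.request.urlopen(SOURCE_URL, timeout=30) as resp:
        payload = json.loads(resp.read())

    # Partition raw objects by UTC date so later Spark/Glue jobs can prune.
    key = f"records/{datetime.now(timezone.utc):%Y/%m/%d/%H%M%S}.json"
    s3.put_object(Bucket=RAW_BUCKET, Key=key, Body=json.dumps(payload).encode("utf-8"))
    return {"items": len(payload.get("items", [])), "s3_key": key}
```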
-
Principal Data Engineer
3 months ago
Bengaluru, India gethyr Full time. Responsibilities: Data Pipeline Development: Design, implement, and maintain scalable data pipelines using PySpark to process and transform large datasets efficiently. Workflow Orchestration: Develop, schedule, and monitor complex data workflows and ETL processes using Apache Airflow. Data Management: Manage and optimize data storage solutions, ensuring...
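Since the role emphasizes scheduling and monitoring workflows, here is a hedged sketch of an Airflow failure-alerting hook; the notify() function and DAG id are placeholder assumptions standing in for real email/chat alerting, not anything stated in the posting:

```python
# Hypothetical sketch: attach a failure callback so a broken task raises an
# alert. notify() is a stand-in for email/Slack/PagerDuty alerting logic.
import logging
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

log = logging.getLogger(__name__)

def notify(context):
    # Placeholder alert: in practice this would call an email or chat webhook.
    ti = context["task_instance"]
    log.error("Task %s in DAG %s failed on %s", ti.task_id, ti.dag_id, context["ds"])

def transform():
    # Stand-in for the real PySpark/ETL step.
    pass

with DAG(
    dag_id="monitored_etl",                 # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="transform",
        python_callable=transform,
        on_failure_callback=notify,         # fires only when the task fails
    )
```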
-
PySpark Developer
2 months ago
Bengaluru, India Rigel networks Full time. Job Title: PySpark Developer. Location: Bangalore G.P.O., Karnataka. Job Description: Lead Data Engineer. Top Skills: Python, SQL, AWS, Spark. 7+ years relevant work experience in the Data Engineering field. 5+ years of experience working with Hadoop and Big Data processing frameworks (Hadoop, Spark, Hive, Flink, Airflow, etc.). 5+ years of strong experience...
-
Data Engineering Specialist
7 days ago
Bengaluru, Karnataka, India ConsultBae India Private limited Full time. About the Role: CuberaTech is a data company revolutionizing Big Data Analytics through a data value share paradigm. We are looking for an experienced Data Engineering Specialist to join our team in Bengaluru, Karnataka. The ideal candidate will have strong proficiency in Python, PySpark, and SQL, as well as experience working with Apache Airflow for...
-
Azure IoT
5 months ago
Bengaluru, Karnataka, India InnoWave India Full time. As an Azure IoT with Apache Airflow Engineer at InnoWave, you will be a key member of our IoT solutions team, responsible for designing, developing, and maintaining robust and scalable data processing pipelines. You will leverage Azure IoT services and Apache Airflow to build efficient workflows that enable real-time data ingestion, processing, and...
-
Data Architect
2 weeks ago
Bengaluru, Karnataka, India ConsultBae India Private limited Full time. Job Title: Data Architect. About the Role: CuberaTech is seeking a highly skilled Data Architect to join our team. As a Data Architect, you will be responsible for designing and implementing scalable data architectures that meet the growing business needs of our company. You will work closely with our data engineering team to ensure that our data systems are...
-
Senior Data Engineer
4 weeks ago
Bengaluru, Karnataka, India gethyr Full time. About the Role: We are seeking a highly skilled Senior Data Engineer to join our team at Gethyr. As a key member of our data engineering team, you will be responsible for designing, implementing, and maintaining scalable data pipelines using PySpark to process and transform large datasets efficiently. Key Responsibilities: Design and implement scalable data...
-
Data Architect for Scalable ETL Pipelines
3 weeks ago
Bengaluru, Karnataka, India ConsultBae India Private limited Full time. About the Role: As a Data Architect at ConsultBae India Private Limited, you will be responsible for designing and implementing scalable ETL pipelines for data ingestion and processing. You will work with distributed computing technologies like PySpark to handle large-scale datasets and develop workflows using Apache Airflow for scheduling and orchestrating...
-
Data Engineering Lead
4 weeks ago
Bengaluru, Karnataka, India gethyr Full time. Job Title: Principal Data Engineer. About the Role: We are seeking a highly skilled Principal Data Engineer to join our team at Gethyr. As a key member of our data engineering team, you will be responsible for designing, implementing, and maintaining scalable data pipelines using PySpark to process and transform large datasets efficiently. Key...
-
Pyspark Engineer
1 month ago
Bengaluru, India Capgemini Engineering Full time. Responsibilities: Bachelor's degree in Computer Science, Engineering, or a related field. Experience in data engineering or data analysis. Should have experience in ETL pipeline development. Proficiency in the Python programming language, with experience in data manipulation, data structures, and object-oriented programming. Good experience in SQL/Advanced SQL. Strong...
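To make the Python-plus-SQL expectation concrete, a small hedged sketch that registers a DataFrame as a temporary view and queries it with Spark SQL; the table, columns, and sample rows are invented for illustration:

```python
# Hypothetical example of mixing the DataFrame API with Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql_example").getOrCreate()

orders = spark.createDataFrame(
    [(1, "IN", 120.0), (2, "US", 75.5), (3, "IN", 42.0)],
    ["order_id", "country", "amount"],          # invented schema
)
orders.createOrReplaceTempView("orders")

# Advanced-SQL style aggregation combined with a window function.
top_countries = spark.sql("""
    SELECT country,
           SUM(amount)                             AS total_amount,
           RANK() OVER (ORDER BY SUM(amount) DESC) AS revenue_rank
    FROM orders
    GROUP BY country
""")
top_countries.show()
```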
-
Bengaluru, Karnataka, India ScaleneWorks Full time. Job Title: AWS Big Data Engineer. About the Role: We are seeking an experienced AWS Big Data Engineer to join our team at ScaleneWorks. The ideal candidate will have a strong background in coding and a deep understanding of big data technologies, with extensive experience in PySpark, SQL, Spark, and Airflow. Key Responsibilities: Design and develop big data...
-
Airflow Python Developer
2 months ago
Greater Bengaluru Area, India Tata Consultancy Services Full time. Role: Airflow Python Developer. Required Technical Skill Set: Airflow, Kubernetes, Python, SQL. No. of Requirements: 2. Desired Experience Range: 4+ years. Location of Requirement: Bangalore. Desired Competencies (Technical/Behavioral Competency), Must-Have (ideally not more than 3-5): Design and develop scalable pipelines with Apache...
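A hedged sketch of what an Airflow-on-Kubernetes task might look like with the KubernetesPodOperator; the image, namespace, DAG id, and even the import path depend on the installed cncf-kubernetes provider version and are assumptions here:

```python
# Illustrative only: run one pipeline step as a Kubernetes pod from Airflow.
# The import path varies by provider version; this is the recent one.
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="k8s_pipeline",                   # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = KubernetesPodOperator(
        task_id="extract",
        name="extract-pod",
        namespace="data-pipelines",          # assumed namespace
        image="example.registry/etl:latest", # assumed container image
        cmds=["python", "extract.py"],
        get_logs=True,
    )
```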
-
AWS Cloud Data Specialist
3 weeks ago
Bengaluru, Karnataka, India Global Pharma Tek Full time. Key Skills: At Global Pharma Tek, we're seeking a skilled Data Engineer for Cloud Data Platforms to join our team. As a key member of our data engineering team, you will be responsible for designing, building, and maintaining our cloud-based data infrastructure using Informatica, AWS Glue, and dbt. You will also work closely with our data scientists to develop...