Big Data with PySpark
12 hours ago
**Job Title**: Big Data Engineer (PySpark)
**Location**: Bengaluru, India
**Experience**: 5+ years
**Employment Type**: Full-time
**Job Summary**:
We are looking for an experienced Big Data Engineer to design, build, and optimize large-scale data pipelines using PySpark and the wider Hadoop/Spark ecosystem, working closely with data scientists, analysts, and business teams to deliver reliable, high-quality data.
**Key Responsibilities**:
- Design, develop, and maintain scalable **big data pipelines** using **PySpark** and other big data technologies (a brief illustrative sketch follows this list).
- Work with **Hadoop, Spark, Kafka, Hive, and other distributed data processing frameworks**.
- Optimize **ETL workflows** and ensure efficient data processing.
- Implement **data quality checks, monitoring, and validation** to ensure high data integrity.
- Collaborate with **data scientists, analysts, and business teams** to understand requirements and deliver solutions.
- Optimize **Spark performance** by tuning jobs and implementing best practices for distributed computing.
- Manage and process **structured and unstructured data** from multiple sources.
- Work with **cloud platforms** like AWS, Azure, or GCP for big data storage and processing.
- Troubleshoot and debug **performance issues** related to big data systems.
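For context, a minimal PySpark sketch of the kind of pipeline and data-quality work described above might look like the following. This is only an illustration of the responsibilities, not part of the posting; the paths, column names, and quality threshold are assumptions.

```python
# Minimal illustrative PySpark ETL job (hypothetical paths and columns).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw CSV files from a hypothetical landing zone.
raw = spark.read.option("header", True).csv("s3://example-bucket/landing/orders/")

# Transform: cast types, parse dates, and drop rows missing a primary key.
orders = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
       .filter(F.col("order_id").isNotNull())
)

# Data-quality check: fail the job if too many rows were rejected.
total, kept = raw.count(), orders.count()
if total > 0 and kept / total < 0.95:
    raise ValueError(f"Data quality check failed: only {kept}/{total} rows passed")

# Load: write partitioned Parquet to a hypothetical curated zone.
orders.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/orders/"
)
```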
**Required Skills**:
- Strong experience with **PySpark and Spark (RDD, DataFrame, Spark SQL)** (see the short sketch after this list).
- Proficiency in **Hadoop ecosystem** (HDFS, Hive, HBase, Oozie, etc.).
- Experience with **Kafka, Airflow, or other data orchestration tools**.
- Strong **SQL** skills for querying and optimizing data processing.
- Experience with **cloud platforms** (AWS Glue, EMR, Azure Databricks, GCP BigQuery, etc.).
- Proficiency in **Python and Scala** for big data processing.
- Knowledge of **data lake and data warehouse concepts**.
- Experience in **CI/CD pipelines for data engineering** is a plus.
- Strong problem-solving skills and the ability to work in an **agile environment**.
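As a rough illustration of the RDD, DataFrame, and Spark SQL APIs named in the first skill bullet, the sketch below expresses the same aggregation three ways; the data and column names are made up for the example.

```python
# Contrasting the RDD, DataFrame, and Spark SQL APIs on a toy aggregation.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("api-comparison").getOrCreate()
sc = spark.sparkContext

rows = [("alice", 120.0), ("bob", 80.0), ("alice", 40.0)]

# RDD API: low-level key-value aggregation.
rdd_totals = sc.parallelize(rows).reduceByKey(lambda a, b: a + b).collect()

# DataFrame API: declarative, optimized by Catalyst.
df = spark.createDataFrame(rows, ["customer", "amount"])
df_totals = df.groupBy("customer").sum("amount")

# Spark SQL: the same aggregation expressed as a query.
df.createOrReplaceTempView("orders")
sql_totals = spark.sql(
    "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
)

df_totals.show()
sql_totals.show()
print(rdd_totals)
```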
Pay: ₹50,000.00 - ₹100,000.00 per month
Schedule:
- Day shift
**Experience**:
- Big data with PySpark: 6 years (required)
Work Location: Remote
-
Big Data Trainer
2 days ago
Remote, India REGex Software Services Full time. Required: a Big Data Trainer who is an expert in the following topics: Python, introduction to the LINUX operating system and basic LINUX commands, Hadoop (HDFS), Hadoop 2.0 & YARN, Sqoop, Hive programming, PySpark, and ETL. **Job Types**: Part-time, Contractual / Temporary, Freelance. Contract length: 6-8 weeks. Part-time hours: 10-12 per week. **Salary**: ₹500.00 -...
-
Senior Big Data Engineer
1 week ago
Remote, India UMENIT SOLUTIONS LLP Full time ₹ 2,40,000 - ₹ 6,00,000 per year. Job Title: Senior Big Data Engineer. Location: Remote / Hybrid (preferred timezone overlap with [your region]). Type: Full-time / Contract. About the Project: We're building an intelligent data pipeline that brings SAP AO (Analysis Office) data into a central database, which will serve as the foundation for an AI-powered chatbot. This chatbot will help users...
-
Azure Data Factory
6 days ago
Remote, India Innowrap Technologies Full time. **Experience**: 4-8 yrs. **Location**: Remote | Work from home. **Positions Open**: 3. **Primary focus should be**: **Databricks with PySpark** (heavily used in transformation) and **ADF, Synapse, and Delta Lake**. **Responsibilities**: Design and development of data pipelines in Azure Databricks with...
-
Big Data
12 hours ago
Remote, India Technology Next Full time. **Big Data AI/ML Engineer**. **Experience**: 6 to 9 yrs. **Location**: Remote. **Tenure**: 6 months. **Salary**: 80K-90K in hand. **Notice**: Immediate joiner. We are seeking a talented Data Engineer with AI & ML knowledge to join our team. As a Data Engineer or MLOps Engineer, your primary responsibility will be to develop and integrate ML solutions that...
-
Azure Data Engineer
2 weeks ago
Remote, India TESTQ TECHNOLOGIES LTD Full time ₹ 15,00,000 - ₹ 25,00,000 per year. Experience: 5+ years. Job Description: Advanced SQL, Azure Data Factory (ADF), Databricks with PySpark, and Azure Synapse (added advantage). Being fully conversant with big-data processing approaches and schema-on-read methodologies is a must, as is knowledge of Azure Data Factory / Azure Databricks (PySpark) / Azure Data Lake Storage (ADLS Gen 2). Good to have an excellent...
-
Big Data
6 days ago
Remote, India KEG HR Services Full time. **Greetings from KEG HR Services!** Job Role: Big Data & Kubernetes Administrator. Experience: 3 to 5 years. **Job Description**: - Install, configure, and maintain Cloudera Big Data clusters across multiple environments, ensuring optimal performance and resource utilization. - Ensure high availability and manage the performance of data services like...
-
Big Data Architect
2 days ago
Remote, India Databricks Full time. As a **Big Data Architect** in our Professional Services team, you will work with clients on short- to medium-term customer engagements on their big data challenges using the Databricks platform. You will deliver data engineering, data science, and cloud technology projects which require integrating with client systems, training, and other technical tasks to...
-
Data Engineer
4 days ago
Remote, India Pontoon Global Solutions Full time ₹ 50,000 - ₹ 15,00,000 per year. Job Title: Data Engineer. Experience: 5 – 10 years. Work Mode: Hybrid / Remote. Job Description: We are seeking a Data Engineer with strong expertise in PySpark, Scala, and Python. The ideal candidate must have hands-on experience in Apache Spark, as it will be the core technology for large-scale data processing and pipeline development. Key...
-
Architect - Big Data
1 week ago
Remote, India Databricks Full time. CSQ125R88. As an **Architect (Big Data)** in our Professional Services team, you will work with clients on short- to medium-term customer engagements on their **big data challenges using the Databricks platform**. You will deliver data engineering, data science, and cloud technology projects which require integrating with client systems, training, and other...
-
Azure Data Engineer contractual
2 weeks ago
Remote, India Decillion Digital Full time ₹ 15,00,000 - ₹ 25,00,000 per year. Experience: 5+ years. Job Description: Advanced SQL, Azure Data Factory (ADF), Databricks with PySpark, and Azure Synapse (added advantage). Being fully conversant with big-data processing approaches and schema-on-read methodologies is a must, as is knowledge of Azure Data Factory / Azure Databricks (PySpark) / Azure Data Lake Storage (ADLS Gen 2). Good to have excellent...