Big Data with PySpark
4 days ago
**Job Title**: Big Data Engineer (PySpark)
**Location**: Bengaluru, India
**Experience**: 5+ years
**Employment Type**: Full-time
**Job Summary**:
**Key Responsibilities**:
- Design, develop, and maintain scalable **big data pipelines** using **PySpark** and other big data technologies.
- Work with **Hadoop, Spark, Kafka, Hive, and other distributed data processing frameworks**.
- Optimize **ETL workflows** and ensure efficient data processing.
- Implement **data quality checks, monitoring, and validation** to ensure high data integrity.
- Collaborate with **data scientists, analysts, and business teams** to understand requirements and deliver solutions.
- Optimize **Spark performance** by tuning jobs and implementing best practices for distributed computing.
- Manage and process **structured and unstructured data** from multiple sources.
- Work with **cloud platforms** like AWS, Azure, or GCP for big data storage and processing.
- Troubleshoot and debug **performance issues** related to big data systems.
**Required Skills**:
- Strong experience with **PySpark and Spark (RDD, DataFrame, Spark SQL)**.
- Proficiency in **Hadoop ecosystem** (HDFS, Hive, HBase, Oozie, etc.).
- Experience with **Kafka, Airflow, or other data orchestration tools**.
- Strong **SQL** skills for querying and optimizing data processing.
- Experience with **cloud platforms** (AWS Glue, EMR, Azure Databricks, GCP BigQuery, etc.).
- Proficiency in **Python and Scala** for big data processing.
- Knowledge of **data lake and data warehouse concepts**.
- Experience in **CI/CD pipelines for data engineering** is a plus.
- Strong problem-solving skills and the ability to work in an **agile environment**.
Pay: ₹50,000.00 - ₹100,000.00 per month
Schedule:
- Day shift
**Experience**:
- Big data with PySpark: 6 years (required)
Work Location: Remote
-
Big Data Trainer
5 days ago
Remote, India REGex Software Services Full time. Required: Big Data Trainer who is an expert in the following topics: Python; Introduction to LINUX Operating System and basic LINUX commands; Hadoop (HDFS); Hadoop 2.0 & YARN; Sqoop; Hive Programming; PySpark; ETL. **Job Types**: Part-time, Contractual / Temporary, Freelance. Contract length: 6-8 weeks. Part-time hours: 10-12 per week. **Salary**: ₹500.00 -...
-
Big Data Lead Engineer_DGLiger
2 weeks ago
Remote, India Coders Brain Technology Full time ₹ 2,00,00,000 - ₹ 4,00,00,000 per year. Must-have skills: PySpark, SQL, cloud computing (AWS/Azure or any other cloud). JOB DESCRIPTION: ● Work with extremely talented peers in a client environment to build new and enhance/maintain existing PySpark and Python code that generates analytics insights leveraging the big data environment in AWS ● Proficiency in SQL writing, SQL concepts, Data...
-
Big Data
4 days ago
Remote, India Technology Next Full time. **Big Data AI/ML Engineer** **Experience**: 6 to 9 years **Location**: Remote **Tenure**: 6 months **Salary**: 80K-90K in hand **Notice**: Immediate joiner. We are seeking a talented Data Engineer with AI & ML knowledge to join our team. As a Data Engineer or ML Ops Engineer, your primary responsibility will be to develop & integrate ML solutions that...
-
Big Data Architect
5 days ago
Remote, India Databricks Full time. As a **Big Data Architect** in our Professional Services team you will work with clients on short to medium term customer engagements on their big data challenges using the Databricks platform. You will provide data engineering, data science, and cloud technology projects which require integrating with client systems, training, and other technical tasks to...
-
Architect - Big Data
2 weeks ago
Remote, India Databricks Full time. CSQ125R88 As an **Architect (Big Data)** in our Professional Services team you will work with clients on short to medium term customer engagements on their **Big Data challenges using the Databricks platform**. You will provide data engineering, data science, and cloud technology projects which require integrating with client systems, training, and other...
-
Data Engineer
2 weeks ago
Remote, India Pontoon Global Solutions Full time ₹ 50,000 - ₹ 15,00,000 per year. Job Title: Data Engineer. Experience: 5 – 10 years. Work Mode: Hybrid / Remote. Job Description: We are seeking a Data Engineer with strong expertise in PySpark, Scala, and Python. The ideal candidate must have hands-on experience in Apache Spark, as it will be the core technology for large-scale data processing and pipeline development. Key...
-
Big Data Engineer
2 weeks ago
Remote, India Sureminds Solutions Full time ₹ 12,00,000 - ₹ 36,00,000 per year. 5+ years of experience as a Big Data Engineer with a strong focus on Apache Spark and Databricks. Proficiency in Scala for data processing and transformation. Experience with version control systems, particularly Git and GitHub. Strong knowledge of cloud computing platforms, with a preference for experience with Azure. Hands-on experience with Azure Data Factory...
-
Big Data Architect
2 days ago
Remote, India Databricks Full time. CSQ225R24 As a **Resident Solutions Architect** in our Professional Services team you will work with clients on short to medium term customer engagements on their **big data challenges using the Databricks platform**. You will provide data engineering, data science, and cloud technology projects which require integrating with client systems, training, and...
-
Big Data Eng_Roji_Tookiaki
1 week ago
Remote, India Coders Brain Technology Full time. Job Title: Big Data Engineer. Reporting to: Technical Lead. Location: India. Requirements: Our ideal candidate will have the following responsibilities: ● Owning and delivering installation and configuration of AI products. ● Participating in solution architecture discussions to estimate hardware sizing. ● Analysing defects and conducting effective...
-
Data Architect
2 days ago
Remote, India Alphawizz Technologies Pvt. Ltd. Full time. **Job Title**: Data Architect (Cloud | Big Data | Analytics) **Location**: Remote **Experience**: Minimum 10+ years in data architecture, engineering, or related roles **Tech Stack Includes**: AWS Redshift | GCP BigQuery | Azure Synapse | Apache Spark | Airflow | dbt | Hadoop | Snowflake | PostgreSQL | MySQL | MongoDB | Kafka | Flink | Tableau | Power BI |...