Data Engineer II

1 day ago


Hubli, India ClearDemand Full time

Job Summary: Building on the foundation of the SDE-I role, the DE-II position takes on a greater level of responsibility and leadership. You'll play a crucial role in driving the evolution and efficiency of our data collection and analytics platform, which handles terabyte-scale data and billions of data points.

Key Responsibilities:
• Lead the design, development, and optimization of large-scale data pipelines and infrastructure using technologies like Apache Airflow, Spark, Kafka, and more.
• Architect and implement distributed data processing solutions to handle terabyte-scale datasets and billions of records efficiently across multi-region cloud infrastructure (AWS, GCP, DigitalOcean).
• Develop and maintain real-time data processing solutions for high-volume data collection operations using technologies like Spark Streaming and Kafka.
• Optimize data storage strategies using technologies such as Amazon S3, HDFS, and Parquet/Avro file formats for efficient querying and cost management.
• Build and maintain high-quality ETL pipelines, ensuring robust data collection and transformation processes with a focus on scalability and fault tolerance.
• Collaborate with data analysts, researchers, and cross-functional teams to define and maintain data quality metrics, implement robust data validation, and enforce security best practices.
• Mentor junior engineers (SDE-I) and foster a collaborative, growth-oriented environment.
• Participate in technical discussions, contribute to architectural decisions, and proactively identify improvements for scalability, performance, and cost-efficiency.
• Ensure application performance monitoring (APM) is in place, using tools like Datadog, New Relic, or similar to proactively monitor and optimize system performance, detect bottlenecks, and ensure system health.
• Implement effective data partitioning and indexing strategies for performance optimization in distributed databases such as DynamoDB, Cassandra, or HBase.
• Stay current with advancements in data engineering, orchestration tools, and emerging cloud technologies, continually enhancing the platform’s capabilities.

Qualifications & Experience:
• 4-5+ years of hands-on experience with Apache Airflow and other orchestration tools for managing large-scale workflows and data pipelines.
• Expertise in AWS technologies (Athena, AWS Glue, DynamoDB), Apache Spark, PySpark, SQL, and NoSQL databases.
• Experience in designing and managing distributed data processing systems that scale to terabyte and billion-scale datasets using cloud platforms like AWS, GCP, or DigitalOcean.
• Proficiency in web crawling tools and technologies, including Node.js, HTTP protocols, Puppeteer, Playwright, and Chromium, for large-scale data extraction.
• Experience with monitoring and observability tools such as Grafana, Prometheus, and Elasticsearch, and familiarity with monitoring and optimizing resource utilization in distributed systems.
• Strong understanding of infrastructure as code using Terraform, automated CI/CD pipelines with Jenkins, and event-driven architecture with Kafka.
• Experience with data lake architectures and optimizing storage using formats such as Parquet, Avro, or ORC.
• Strong background in optimizing query performance and data processing frameworks (Spark, Flink, or Hadoop) for efficient data processing at scale.
• Knowledge of containerization (Docker, Kubernetes) and orchestration for distributed system deployments.
• Deep experience in designing resilient data systems with a focus on fault tolerance, data replication, and disaster recovery strategies in distributed environments.
• Strong data engineering skills, including ETL pipeline development, stream processing, and distributed systems.
• Excellent problem-solving abilities, with a collaborative mindset and strong communication skills.


  • Data Engineer

    4 weeks ago


    Hubli, India Vriba Solutions Full time

    Role: Data Engineering (Remote). Looking for 5-10 years of experience. • Design, develop & maintain ETL/ELT pipelines • Ingest & transform data from APIs, DBs, files, streams • Build real-time & batch processing solutions • Data validation, quality & cleansing • Translate business needs into data models • Ensure data security, access control & compliance •...

  • Data Engineer

    3 weeks ago


    Hubli, India AS Technology Corporation Full time

    We are seeking skilled and motivated Spark & Databricks Developers to join our dynamic team for a long-term project. The ideal candidate will have strong hands-on experience in Apache Spark, Databricks, and GitHub-based development workflows. Key Responsibilities: Design, develop, and optimize big data pipelines using Apache Spark. Build and maintain scalable...

  • AWS Data Engineer

    3 weeks ago


    Hubli, India Coforge Full time

    AWS Data Engineer. Job Location: Bengaluru. Experience Required: 5+ Years. Mandatory Skills: AWS Services, ETL, ETL Integration, CodePipeline, Jenkins, Glue, EMR, Athena, ECS, EKS, Kubernetes, CloudWatch, Prometheus, Grafana, Python, Shell, or PowerShell. Job Description: We are looking for an experienced AWS Engineer with around 5-8 years of hands-on experience...

  • Data Engineer

    3 weeks ago


    Hubli, India Dautom Full time

    Job Description: Data Engineer (Big Data/Kafka). We are seeking a highly experienced Senior Data Engineer with a deep background in Big Data technologies to join our team. This is a contract role for a major project in the Banking sector. Key Details: Role: Data Engineer. Industry: Banking (Financial Services). Work Location: Remote (India). Contract Duration: 6...

  • Software Engineer

    2 hours ago


    Hubli, India Triple-A Full time

    About Triple-A: Triple-A is a global payment institution licensed in the United States, Europe, and Singapore, enabling businesses worldwide to pay and get paid in both local and digital currencies. We empower businesses to reach over 560M digital currency owners, boost revenue, and optimise costs through stablecoin and cryptocurrency payments, while...

  • Data Scientist

    2 weeks ago


    Hubli, India Whatjobs IN C2 Full time

    Job Summary: We are seeking a highly skilled and analytical Data Scientist with hands-on experience in designing, developing, and deploying data-driven solutions. The ideal candidate will have strong expertise in data analysis, machine learning, and cloud-based model deployment, preferably on Google Cloud Platform (GCP). This role involves working closely...

  • Data Scientist

    1 week ago


    Hubli, India v4c.ai Full time

    Overview: The Data Scientist supports the development and implementation of data models, focusing on Machine Learning, under the supervision of more experienced scientists, contributing to the team’s innovative projects. Job Description: Assist in the development of Machine Learning models and algorithms, contributing to the design and implementation of...

  • Data Consultant

    2 days ago


    Hubli, India Plative Full time

    The Data Consultant plays a key role in designing, implementing, and supporting data integration and analytics solutions across Salesforce, NetSuite, and other enterprise systems. This position focuses primarily (~80%) on Salesforce–NetSuite integration and data migration projects, ensuring seamless data flow across front-office and back-office...

  • Senior Executive- IT

    2 weeks ago


    Hubli, India Lemnisk Full time

    Company Description: Lemnisk is the world’s first real-time marketing automation built on an intelligent and secure Customer Data Platform. It orchestrates 1-to-1 personalization and cross-channel customer journeys at scale, increasing conversions, retention, and growth for enterprises. Lemnisk leverages cutting-edge technology to provide marketers with...
