Data Engineer II
3 weeks ago
Job Summary: Building on the foundation of the SDE-I role, the DE-II position takes on a greater level of responsibility and leadership. You'll play a crucial role in driving the evolution and efficiency of our data collection and analytics platform, which handles terabyte-scale data and billions of data points.

Key Responsibilities:
• Lead the design, development, and optimization of large-scale data pipelines and infrastructure using technologies like Apache Airflow, Spark, Kafka, and more.
• Architect and implement distributed data processing solutions to handle terabyte-scale datasets and billions of records efficiently across multi-region cloud infrastructure (AWS, GCP, DO).
• Develop and maintain real-time data processing solutions for high-volume data collection operations using technologies like Spark Streaming and Kafka.
• Optimize data storage strategies using technologies such as Amazon S3, HDFS, and Parquet/Avro file formats for efficient querying and cost management.
• Build and maintain high-quality ETL pipelines, ensuring robust data collection and transformation processes with a focus on scalability and fault tolerance.
• Collaborate with data analysts, researchers, and cross-functional teams to define and maintain data quality metrics, implement robust data validation, and enforce security best practices.
• Mentor junior engineers (SDE-I) and foster a collaborative, growth-oriented environment.
• Participate in technical discussions, contribute to architectural decisions, and proactively identify improvements for scalability, performance, and cost-efficiency.
• Ensure application performance monitoring (APM) is in place, using tools like Datadog, New Relic, or similar to proactively monitor and optimize system performance, detect bottlenecks, and ensure system health.
• Implement effective data partitioning and indexing strategies for performance optimization in distributed databases such as DynamoDB, Cassandra, or HBase.
• Stay current with advancements in data engineering, orchestration tools, and emerging cloud technologies, continually enhancing the platform's capabilities.

Qualifications & Experience:
• 4-5+ years of hands-on experience with Apache Airflow and other orchestration tools for managing large-scale workflows and data pipelines.
• Expertise in AWS technologies, Athena, AWS Glue, DynamoDB, Apache Spark, PySpark, SQL, and NoSQL databases.
• Experience in designing and managing distributed data processing systems that scale to terabyte and billion-scale datasets using cloud platforms like AWS, GCP, or DigitalOcean.
• Proficiency in web crawling tools and technologies, including Node.js, HTTP protocols, Puppeteer, Playwright, and Chromium, for large-scale data extraction.
• Experience with monitoring and observability tools such as Grafana, Prometheus, and Elasticsearch, and familiarity with monitoring and optimizing resource utilization in distributed systems.
• Strong understanding of infrastructure as code using Terraform, automated CI/CD pipelines with Jenkins, and event-driven architecture with Kafka.
• Experience with data lake architectures and optimizing storage using formats such as Parquet, Avro, or ORC.
• Strong background in optimizing query performance and data processing frameworks (Spark, Flink, or Hadoop) for efficient data processing at scale.
• Knowledge of containerization (Docker, Kubernetes) and orchestration for distributed system deployments.
• Deep experience in designing resilient data systems with a focus on fault tolerance, data replication, and disaster recovery strategies in distributed environments.
• Strong data engineering skills, including ETL pipeline development, stream processing, and distributed systems.
• Excellent problem-solving abilities, with a collaborative mindset and strong communication skills.
-
Data Engineer
2 weeks ago
Hubli, India Sapphire Software Solutions Inc Full time
Hi folks, please check the JD and share your updated resume to my email naresh@sapphiresoftwaresolutions.com and ping me on WhatsApp (+91 970-529-6474) along with your resume. Data Engineer. 100% Remote. 1 year contract. JOB DESCRIPTION: A global law firm with nearly 1,400 lawyers and more than 3,000 employees across 19 offices in the United States, Europe, and Asia is...
-
Data Engineer
1 week ago
Hubli, India Whatjobs IN C2 Full time
Job Title: Data Engineer. Location: Hyderabad / Chennai (Hybrid). Experience: 6–10 Years (STRICTLY). Employment Type: Permanent. Notice Period: Immediate Joiners Only (≤15 days). Skills Required: Python, PySpark, AWS / Snowflake. About the Company: Our client is a global leader in IT and business services, operating in 50+ countries. They specialize in cloud,...
-
Data Engineer
2 weeks ago
Hubli, India People Prime Worldwide Full time
About Company: A leading global information technology, consulting, and business process services organization, the company delivers innovative solutions that enable clients across industries to thrive in the digital era. With a strong focus on technology-driven transformation, it helps enterprises harness the power of cloud, AI, automation, and analytics to...
-
Data Engineer
2 weeks ago
Hubli, India Deloitte Full time
Your potential, unleashed. India's impact on the global economy has increased at an exponential rate and Deloitte presents an opportunity to unleash and realise your potential amongst cutting edge leaders, and organisations shaping the future of the region, and indeed, the world beyond. At Deloitte, your whole self to work, every day. Combine that with our...
-
Senior Data Engineer
1 week ago
Hubli, India EXL Full time
Key Responsibilities: • Develop high-quality, secure, and scalable data pipelines using Spark, Scala/Python/Java on Hadoop or object storage like MinIO. • Leverage technologies and solutions to innovate with increasingly large data sets. • Drive automation and efficiency in data ingestion, data movement and data access workflows by innovation and...
-
Senior Data Engineer
3 weeks ago
Hubli, India Guidanz Inc Full time
About BI Connector: BI Connector is the industry-leading solution for integrating Oracle Fusion Cloud data into modern BI platforms like Power BI, Tableau, and Data Warehouse, without complex ETL. Our Data Architecture and Reporting Solutions team extends this mission by helping enterprises unlock the full value of their Oracle Fusion ERP, SCM, HCM data...
-
Freelance Data Engineer
2 weeks ago
Hubli, India Leading MNC Full time
Looking for a Freelance Data Engineer to join a team of rockstar developers. The candidate should have a minimum of 8+ years of experience. There are multiple openings. If you're looking for a freelance/part-time opportunity (along with your day job) and a chance to work with the top 0.1% of developers in the industry, this one is for you! You will report into...
-
Senior Data Engineer
2 weeks ago
Hubli, India Whatjobs IN C2 Full time
Job Summary: Kogta Financial Ltd. is seeking an experienced and highly skilled ETL & Data Warehouse Developer with strong expertise in AWS data services. As a key member of our data engineering team, you will be responsible for designing, developing, and optimizing ETL pipelines and scalable data warehouse solutions on the AWS platform. The ideal candidate...
-
Senior Data Engineer
4 days ago
Hubli, India Spocto X (A Yubi Company) Full time
About Yubi: Yubi, formerly known as CredAvenue, is re-defining global debt markets by freeing the flow of finance between borrowers, lenders, and investors. We are the world's possibility platform for the discovery, investment, fulfillment, and collection of any debt solution. At Yubi, opportunities are plenty and we equip you with tools to seize them. In March...