
Senior Data Engineer
16 hours ago
We're building a high-performance data intelligence platform using a scalable architecture. We need a senior expert to lead the development of our Entity Resolution and Network Generation services.
This is a technical leadership role where you'll design and implement distributed graph computing solutions for billions of entities and relationships.
Our cloud-native platform leverages:
- Microservices architecture with orchestration
- Apache Spark for distributed processing
- Elasticsearch for real-time search and fuzzy matching
- Scala as the primary development language
Data mesh principles with API-first design ensure seamless integration between entity resolution and network generation.
Job Responsibilities
- Entity Resolution Service:
- Design and implement distributed entity resolution algorithms capable of processing billions of records
- Build blocking strategies (e.g. LSH, canopy clustering) optimized for Spark at scale
- Develop fuzzy matching algorithms leveraging Elasticsearch's capabilities
- Create ML-enhanced matching with explainable AI for match decisions
- Implement incremental resolution supporting real-time and batch modes
- Design APIs for entity lookup with sub-100ms latency requirements
- Network Generation Service:
- Architect distributed graph generation pipelines using GraphX/GraphFrames
- Implement graph analytics algorithms (PageRank, community detection, centrality measures)
- Design storage strategies for multi-billion edge graphs in Parquet/distributed file systems
- Build temporal graph support for time-evolving networks
- Create high-performance graph serving APIs with complex query capabilities
- Optimize graph partitioning to minimize shuffle and maximize locality
- Build Graph Neural Networks (GNNs): Develop GNN models (e.g., GraphSAGE, GATv2) using PyTorch Geometric or DGL to analyze corporate and transaction networks
- Implement Entity Resolution: Design algorithms for fuzzy matching, semantic matching (Sentence-BERT), and clustering to unify entities across heterogeneous data sources
- Create Risk Scoring Models: Combine rule-based, supervised (XGBoost), and unsupervised (Isolation Forest) methods to generate composite risk scores
- Advance Composite AI: Leverage ContexQ's proprietary approach, integrating symbolic AI, vector embeddings, and graph AI for robust entity resolution and network analytics
- Champion Transparency: Integrate SHAP, LIME, and GNNExplainer to provide clear, interpretable explanations for model predictions
- Ensure Fairness: Audit models for bias and fairness, embedding ethical principles into every stage of development
- Technical Expertise:
- 7+ years of experience in distributed computing and big data systems
- 5+ years specifically in entity resolution and graph analytics at scale
- Expert-level Scala programming skills
- Deep experience with Apache Spark, including custom optimizations
- Domain Knowledge:
- Strong understanding of blocking algorithms and their trade-offs
- Experience with probabilistic record linkage and similarity measures
- Expertise in graph algorithms and their distributed implementations
- Systems Design:
- Experience designing microservices architectures
- Track record of building fault-tolerant, scalable systems
Competitive compensation package with flexible remote work arrangements; the latest hardware and cloud resources for development; a Long-Term Incentive Plan (LTIP): 75% of base as a bonus payment at the end of the 4th year in service; and equity potential of up to USD 150K every year.
Interview Process
- Technical screen focusing on distributed systems and graph algorithms
- System design session on entity resolution at scale
- Coding session implementing a graph algorithm in Scala
- Architecture discussion with the team
- Final round with leadership