
Data Intelligence Architect
4 days ago
We are seeking an experienced expert to lead the development of a next-generation data intelligence platform.
This is a hands-on technical leadership role where you will architect and implement distributed graph computing solutions processing billions of entities and relationships.
The cloud-native platform leverages microservices architecture with Kubernetes orchestration, Apache Spark for distributed processing, Elasticsearch for real-time search and fuzzy matching, Scala as the primary development language, and data mesh principles with API-first design.
Entity Resolution Service
You will design and implement distributed entity resolution algorithms capable of processing billions of records.
Build blocking strategies (e.g. LSH, canopy clustering) optimized for Spark at scale.
Develop fuzzy matching algorithms leveraging Elasticsearch's capabilities.
Create ML-enhanced matching with explainable AI for match decisions.
Implement incremental resolution supporting real-time and batch modes.
Design APIs for entity lookup with sub-100ms latency requirements.
Network Generation Service
Architect distributed graph generation pipelines using GraphX/GraphFrames.
Implement graph analytics algorithms (PageRank, community detection, centrality measures).
Design storage strategies for multi-billion edge graphs in Parquet/distributed file systems.
Build temporal graph support for time-evolving networks.
Create high-performance graph serving APIs with complex query capabilities.
Optimize graph partitioning to minimize shuffle and maximize locality.
AI Model Development
You will build Graph Neural Networks (GNNs): Develop GNN models (e.g., GraphSAGE, GATv2) using PyTorch Geometric or DGL to analyze corporate and transaction networks, detecting fraud rings and risk patterns.
Implement Entity Resolution: Design algorithms for fuzzy matching, semantic matching (Sentence-BERT), and clustering to unify entities across heterogeneous data sources (e.g., CSVs, APIs, PDFs).
Create Risk Scoring Models: Combine rule-based, supervised (XGBoost), and unsupervised (Isolation Forest) methods to generate composite risk scores, optimized for both real-time scoring and large-scale processing of data volumes in the trillions.
Advance Composite AI: Leverage ContexQ's proprietary approach, integrating symbolic AI, vector embeddings, and graph AI for robust entity resolution and network analytics.
Explainable AI (XAI)
You will champion transparency by integrating SHAP, LIME, and GNNExplainer to provide clear, interpretable explanations for model predictions, meeting regulatory and ethical standards.
Ensure fairness by auditing models for bias and embedding ethical principles into every stage of development.
Cross-Service Responsibilities
You will ensure seamless integration between entity resolution and network generation.
Design data lineage tracking across both services.
Implement comprehensive monitoring and observability.
Contribute to API design and service contracts.
Optimize for 10x scale growth.
Required Qualifications
Technical Expertise
You should have 7+ years of experience in distributed computing and big data systems.
5+ years specifically in entity resolution and graph analytics at scale.
Expert-level Scala programming skills.
Deep experience with Apache Spark, including custom optimizations.
Production experience with Elasticsearch for search and matching.
Proven track record building systems processing billions of entities/edges.
Domain Knowledge
Strong understanding of blocking algorithms and their trade-offs.
Experience with probabilistic record linkage and similarity measures.
Expertise in graph algorithms and their distributed implementations.
Knowledge of graph storage formats and query optimization.
Understanding of ML applications in entity resolution.
Basic experience with banking compliance (financial crime, fraud).
Systems Design
Experience designing microservices architectures.
Track record of building fault-tolerant, scalable systems.
API design experience with GraphQL or REST.
Performance optimization and capacity planning expertise.
Preferred Qualifications
PhD in Computer Science or related field with focus on graphs/entity resolution.
Contributions to open-source projects (especially Spark, GraphX, Elasticsearch).
Experience with graph databases (Neo4j, Neptune, JanusGraph) or equivalent.
Publications or conference talks on entity resolution or graph analytics.
Experience with real-time stream processing (Kafka, Spark Streaming).
Knowledge of graph neural networks and embedding techniques.
Technical Environment
Languages: Scala (primary), Python, Java.
Big Data: Apache Spark 3.x, Hadoop ecosystem.
Search: Elasticsearch 8.x.
Orchestration: Kubernetes, Docker.
Storage: HDFS/S3/GCS, Parquet.
Monitoring: Prometheus, Grafana, Jaeger.
CI/CD: Modern DevOps practices.
What We're Looking For
We are looking for someone who thinks in distributed systems and can optimize for both latency and throughput.
A technical leader who can make architectural decisions and implement them.
Strong communicator who can explain complex graph concepts to stakeholders.
Self-directed engineer who can own large technical initiatives end-to-end.
Performance-obsessed developer who benchmarks everything.
Impact You'll Make
You will define the architecture for entity resolution serving multiple business domains.
Build the graph intelligence layer powering advanced analytics and ML.
Create systems that will process billions of entities with millisecond latencies.
Establish best practices for graph computing in our organization.
Mentor other engineers on distributed graph algorithms.
Compensation & Benefits
We offer competitive senior/staff-level compensation.
Flexible remote work arrangements.
Latest hardware and cloud resources for development.
Long-Term Incentive Plan (LTIP).
Bonus of 75% of base salary, paid at the end of the fourth year of service.
Equity potential in excess of USD 150K per year.
Interview Process
The interview process includes a technical screen focusing on distributed systems and graph algorithms, a system design session on entity resolution at scale, a coding session implementing a graph algorithm in Scala, an architecture discussion with the team, and a final round with leadership.
To Apply
Please include links to relevant open-source contributions, a brief description of the largest graph system you've built, your approach to a specific entity resolution challenge you've solved, and any publications or talks on graph computing or entity resolution.
We are building something ambitious and need someone who gets excited about processing graphs with billions of nodes and solving entity resolution at unprecedented scale.