Senior Distributed Systems Architect

3 days ago


Lucknow, Uttar Pradesh, India beBeeENTITYRESOLUTION Full time US$ 1,20,000 - US$ 2,00,000

**About the Role**

We're seeking a senior technical leader to spearhead our entity resolution and network generation services. This hands-on role involves architecting and implementing distributed graph computing solutions that process billions of entities and relationships.

**The Platform**

Our cloud-native platform leverages:

Microservices architecture with Kubernetes orchestration

Apache Spark for distributed processing

Elasticsearch for real-time search and fuzzy matching

Scala as the primary development language

Data mesh principles with API-first design

Core Responsibilities

**Entity Resolution Service**

Design and implement distributed entity resolution algorithms capable of processing billions of records

Build blocking strategies (e.g. LSH, canopy clustering) optimized for Spark at scale

Develop fuzzy matching algorithms leveraging Elasticsearch's capabilities

Create ML-enhanced matching with explainable AI for match decisions

Implement incremental resolution supporting real-time and batch modes

Design APIs for entity lookup with sub-100ms latency requirements

**Network Generation Service**

Architect distributed graph generation pipelines using GraphX/GraphFrames

Implement graph analytics algorithms (PageRank, community detection, centrality measures)

Design storage strategies for multi-billion edge graphs in Parquet/distributed file systems

Build temporal graph support for time-evolving networks

Create high-performance graph serving APIs with complex query capabilities

Optimize graph partitioning to minimize shuffle and maximize locality

**AI Model Development**

Build Graph Neural Networks (GNNs): Develop GNN models (e.g., GraphSAGE, GATv2) using PyTorch Geometric or DGL to analyze corporate and transaction networks, detecting fraud rings and risk patterns.

Implement Entity Resolution: Design algorithms for fuzzy matching, semantic matching (Sentence-BERT), and clustering to unify entities across heterogeneous data sources (e.g., CSVs, APIs, PDFs).

Create Risk Scoring Models: Combine rule-based, supervised (XGBoost), and unsupervised (Isolation Forest) methods to generate composite risk scores, optimized for both real-time scoring and large-scale processing of data volumes in the trillions.

Advance Composite AI: Leverage ContexQ's proprietary approach, integrating symbolic AI, vector embeddings, and graph AI for robust entity resolution and network analytics.

**Explainable AI (XAI)**

Champion Transparency: Integrate SHAP, LIME, and GNNExplainer to provide clear, interpretable explanations for model predictions, meeting regulatory and ethical standards.

Ensure Fairness: Audit models for bias and fairness, embedding ethical principles into every stage of development.

**Cross-Service Responsibilities**

Ensure seamless integration between entity resolution and network generation

Design data lineage tracking across both services

Implement comprehensive monitoring and observability

Contribute to API design and service contracts

Optimize for 10x scale growth

Required Qualifications

**Technical Expertise**

7+ years of experience in distributed computing and big data systems

5+ years specifically in entity resolution and graph analytics at scale

Expert-level Scala programming skills

Deep experience with Apache Spark, including custom optimizations

Production experience with Elasticsearch for search and matching

Proven track record building systems processing billions of entities/edges

**Domain Knowledge**

Strong understanding of blocking algorithms and their trade-offs

Experience with probabilistic record linkage and similarity measures

Expertise in graph algorithms and their distributed implementations

Knowledge of graph storage formats and query optimization

Understanding of ML applications in entity resolution

Basic experience with banking compliance (financial crime, fraud)

**Systems Design**

Experience designing microservices architectures

Track record of building fault-tolerant, scalable systems

API design experience with GraphQL or REST

Performance optimization and capacity planning expertise

**Preferred Qualifications**

PhD in Computer Science or related field with focus on graphs/entity resolution

Contributions to open-source projects (especially Spark, GraphX, Elasticsearch)

Experience with graph databases (Neo4j, Neptune, JanusGraph) or equivalent

Publications or conference talks on entity resolution or graph analytics

Experience with real-time stream processing (Kafka, Spark Streaming)

Knowledge of graph neural networks and embedding techniques

**Technical Environment**

Languages: Scala (primary), Python, Java

Big Data: Apache Spark 3.x, Hadoop ecosystem

Search: Elasticsearch 8.x

Orchestration: Kubernetes, Docker

Storage: HDFS/S3/GCS, Parquet

Monitoring: Prometheus, Grafana, Jaeger

CI/CD: Modern DevOps practices

**What We're Looking For**

Someone who thinks in distributed systems and can optimize for both latency and throughput

A technical leader who can make architectural decisions and implement them

Strong communicator who can explain complex graph concepts to stakeholders

Self-directed engineer who can own large technical initiatives end-to-end

Performance-obsessed developer who benchmarks everything

**Impact You'll Make**

Define the architecture for entity resolution serving multiple business domains

Build the graph intelligence layer powering advanced analytics and ML

Create systems that will process billions of entities with millisecond latencies

Establish best practices for graph computing in our organization

Mentor other engineers on distributed graph algorithms

**Compensation & Benefits**

Competitive compensation package

Flexible remote work arrangements

Latest hardware and cloud resources for development

Long-Term Incentive Plan (LTIP)

**Bonus of 75% of base salary, paid at the end of the 4th year of service.**

**Equity potential up to USD 150K every year.**

**Interview Process**

Technical screen focusing on distributed systems and graph algorithms

System design session on entity resolution at scale

Coding session implementing a graph algorithm in Scala

Architecture discussion with the team

Final round with leadership

**To Apply**

Please include:

Links to relevant open-source contributions

Brief description of the largest graph system you've built (nodes/edges scale)

Your approach to a specific entity resolution challenge you've solved

Any publications or talks on graph computing or entity resolution

We're building something ambitious and need someone who gets excited about processing graphs with billions of nodes and solving entity resolution at unprecedented scale. If you've been looking for a role where you can push the boundaries of what's possible with distributed graph computing, we want to talk to you.


