Data Scientist Lead

3 days ago


Bareilly, Uttar Pradesh, India beBeeExpert Full time ₹ 2,00,00,000 - ₹ 2,50,00,000

About the Role

As a senior expert in artificial intelligence and machine learning, you will lead the development of our Entity Resolution and Network Generation services. This is a hands-on technical leadership role where you will architect and implement distributed graph computing solutions to process billions of entities and relationships.

The Platform

  • Microservices architecture with Kubernetes orchestration
  • Apache Spark for distributed processing
  • Elasticsearch for real-time search and fuzzy matching
  • Scala as the primary development language
  • Data mesh principles with API-first design

Key Responsibilities

Entity Resolution Service
  • Design and implement distributed entity resolution algorithms capable of processing large datasets
  • Build blocking strategies (e.g., LSH, canopy clustering) optimized for Spark at scale
  • Develop fuzzy matching algorithms leveraging Elasticsearch's capabilities
  • Create ML-enhanced matching with explainable AI for match decisions
  • Implement incremental resolution supporting real-time and batch modes
  • Design APIs for entity lookup with sub-100ms latency requirements
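For candidates unfamiliar with blocking, the following is a minimal single-machine sketch of MinHash-based LSH blocking, which limits pairwise comparison to records that share a bucket. It is an illustration under stated assumptions, not the service's implementation: the function names, the 3-character shingles, and the 32-hash / 16-band settings are all hypothetical choices, and a production version would run on Spark rather than in plain Python.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of character k-grams of a whitespace/case-normalized string."""
    s = " ".join(text.lower().split())
    if len(s) < k:
        return {s}
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes=32):
    """One minimum per seeded hash function; similar sets share many minima."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def lsh_candidate_pairs(records, num_hashes=32, bands=16):
    """Group records whose signatures agree on at least one band.

    `records` maps a record id to a name string (hypothetical input shape).
    Returns a set of (id_a, id_b) candidate pairs with id_a < id_b.
    """
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for rec_id, name in records.items():
        sig = minhash_signature(shingles(name), num_hashes)
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(rec_id)

    pairs = set()
    for ids in buckets.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


# Illustrative data only.
records = {"a": "Acme Corp", "b": "ACME   corp", "c": "xyz"}
candidates = lsh_candidate_pairs(records)
```

Only the candidate pairs would then go to a more expensive fuzzy or ML-based matcher. More bands with fewer rows each raise recall at the cost of more candidates.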
Network Generation Service
  • Architect distributed graph generation pipelines using GraphX/GraphFrames
  • Implement graph analytics algorithms (PageRank, community detection, centrality measures)
  • Design storage strategies for multi-billion edge graphs in Parquet/distributed file systems
  • Build temporal graph support for time-evolving networks
  • Create high-performance graph serving APIs with complex query capabilities
  • Optimize graph partitioning to minimize shuffle and maximize locality
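As a rough illustration of the graph analytics named above, here is a minimal power-iteration sketch of PageRank on a small in-memory edge list. It is an assumption-laden teaching example: the function name, the damping factor of 0.85, and the fixed iteration count are illustrative, and at the scale described in this role the computation would rely on the PageRank implementations that GraphX and GraphFrames already provide.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Compute PageRank by power iteration over a list of (src, dst) edges.

    Rank held by dangling nodes (no outgoing edges) is spread uniformly,
    so the ranks always sum to 1.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        # Teleport term plus the uniformly redistributed dangling mass.
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Illustrative graph: both "a" and "c" point at "b", and "b" points back at "a".
ranks = pagerank([("a", "b"), ("c", "b"), ("b", "a")])
```

In this toy graph "b" ends up with the highest rank because it receives links from both other nodes, while "c", which has no incoming links, keeps only the teleport share.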
AI Model Development
  • Build Graph Neural Networks (GNNs): Develop GNN models (e.g., GraphSAGE, GATv2) using PyTorch Geometric or DGL to analyze corporate and transaction networks, detecting fraud rings and risk patterns.
  • Implement Entity Resolution: Design algorithms for fuzzy matching, semantic matching (Sentence-BERT), and clustering to unify entities across heterogeneous data sources (e.g., CSVs, APIs, PDFs).
  • Create Risk Scoring Models: Combine rule-based, supervised (XGBoost), and unsupervised (Isolation Forest) methods to generate composite risk scores, optimized for both real-time and large-scale batch processing.
  • Advance Composite AI: Leverage ContexQ's proprietary approach, integrating symbolic AI, vector embeddings, and graph AI for robust entity resolution and network analytics.
Explainable AI (XAI)
  • Champion Transparency: Integrate SHAP, LIME, and GNNExplainer to provide clear, interpretable explanations for model predictions, meeting regulatory and ethical standards.
  • Ensure Fairness: Audit models for bias and fairness, embedding ethical principles into every stage of development.

Additional Responsibilities

  • Ensure seamless integration between entity resolution and network generation
  • Design data lineage tracking across both services
  • Implement comprehensive monitoring and observability
  • Contribute to API design and service contracts
  • Optimize for 10x scale growth
