Evaluating the Trustworthiness of Advanced Language Models

2 days ago


Dindigul, India beBeeArtificial Full time

AI Assurance Expert

In this pivotal role, you will be instrumental in ensuring the trustworthiness and performance of cutting-edge Large Language Models (LLMs), sub-models, and multi-agent systems through comprehensive evaluation strategies.

This involves simulating real-world scenarios and edge cases using Python scripting and synthetic data generation, collaborating closely with Data Science and Engineering teams, and maintaining detailed documentation and prompt libraries.

Key Responsibilities:

  • Designing and executing evaluation strategies to validate LLM performance.
  • Performing manual and automated testing to ensure model reliability.
  • Developing synthetic test data to simulate real-world scenarios.
  • Ensuring boundary condition coverage for robustness.
  • Preparing and maintaining prompt libraries for efficient testing.
  • Evaluating multi-model architectures for optimal performance.
  • Applying and interpreting evaluation metrics to inform development decisions.
  • Documenting test plans and evaluation reports for transparency and reproducibility.

Requirements:

  • 5+ years of experience in quality assurance and AI/ML evaluation.
  • Strong hands-on experience with LLM evaluation techniques.
  • Proficiency in the Python programming language.
  • Deep understanding of AI model architecture and its implications.
  • Familiarity with prompt engineering principles.
  • Experience with AI/ML testing frameworks and tools.
  • Solid grasp of evaluation metrics and their application.
  • Excellent analytical, documentation, and communication skills.
  • Prior experience in QA for AI/ML products is a plus.

Benefits:

This role offers a unique opportunity to contribute to the development of cutting-edge AI technologies and work closely with cross-functional teams. As an AI Assurance Expert, you will have the chance to grow professionally and personally in a dynamic and supportive environment.



  • Dindigul, India beBeeLanguageModeler Full time

    Job Description: This position entails developing and refining large language models for coding in various programming languages, including Bash, Shell, Rust, and SQL. The primary responsibilities include crafting and implementing strategies to train, fine-tune, and evaluate advanced coding models. This involves collaborating with AI researchers and developers...


  • Dindigul, India beBeeEvaluator Full time

    About the Role: We seek skilled professionals to evaluate, review, and compare AI-generated responses in Telugu. This role requires a strong command of the Telugu language, the ability to identify harmful or toxic content, and deep cultural understanding for high-quality model performance. Evaluate AI-generated outputs specifically in Telugu (both native script and...


  • Dindigul, India beBeeGenerativeai Full time

    Job Opportunity for Advanced AI Model Developer. About the Role: We seek a skilled professional to lead our Generative AI team in defining technical direction and best practices. Identify high-impact use cases for AI-driven transformation, execute them, and integrate data for intelligent workflows. Design, build, and deploy advanced generative and large...


  • Dindigul, India beBeeLlmTraining Full time

    Advanced Coding Model Training: We are seeking an experienced professional to train, fine-tune, and evaluate advanced coding models using Bash/Shell/Rust/SQL. Responsibilities include preprocessing data, developing training pipelines, debugging issues, and collaborating with AI researchers and developers to refine model performance. Requirements: Proficiency in...


  • Dindigul, India beBeeEngineering Full time

    As a Lead AI Engineer, you will be responsible for spearheading the development and implementation of advanced AI and machine learning models. This leadership role involves guiding a team of engineers to successfully deploy projects that leverage AI/ML technologies to solve complex problems. The ideal candidate should have hands-on expertise in NLP, Computer...


  • Dindigul, India beBeecontent Full time

    Job Title: Content Evaluator – Malayalam. At our organization, we are seeking a skilled Content Evaluator to evaluate AI-generated content in the Malayalam language. This role is ideal for individuals who possess strong proficiency in both Malayalam and English. The successful candidate will be responsible for evaluating AI-generated outputs in Malayalam,...


  • Dindigul, India beBeeTechnical Full time

    Job Title: Language Specialist. Job Description: We are seeking a skilled Language Specialist to evaluate, annotate, and provide structured feedback on AI-generated content produced by Large Language Models (LLMs). Required Skills and Qualifications: Evaluate LLM outputs for correctness, coherence, and relevance in technical areas such as programming,...


  • Dindigul, India beBeeContentEvaluation Full time

    Content Evaluation Specialist Job Description: Evaluate AI model outputs in Bengali, identifying toxic or harmful content and assessing model performance across multiple datasets. Classify toxicity into hate speech, harassment, abusive language, etc. Provide brief explanations for flagged items where required. Key Qualifications: Proficient in English and...


  • Dindigul, India beBeeEvaluator Full time

    Job Opportunity: We are seeking skilled professionals to evaluate academic content for our clients. Evaluate essays, dissertation papers, and open-ended responses reliably as per established scoring guidelines. Apply client-provided training and guidelines accurately while grading. Maintain consistency, objectivity, and high-quality scoring standards in all...


  • Dindigul, India beBeeMachineLearning Full time

    A highly skilled professional in machine learning and deep learning is required to develop advanced algorithms and models for industry-specific problems. Key responsibilities include collecting and analyzing large datasets, developing machine learning models using Python and SQL, training and mentoring junior team members, and designing and implementing...