AI Evals Engineer

5 days ago


Bengaluru, Karnataka, India Docket Full time ₹ 12,00,000 - ₹ 36,00,000 per year

About The Role

We're hiring an AI Evals Engineer to own the evaluation and observability systems that keep our AI clear, accurate, and trustworthy while closing the loop with customers. You'll design gold-standard test sets, automate offline/online evaluation, trace customer queries end-to-end, and wire quality signals into our product and release process so we can move fast without breaking trust.

What You'll Do

• Build eval pipelines: Implement automated offline evals for key use cases (RAG, agents, chat, extraction), including data ingestion, labeling strategies (human + LLM-as-judge), scoring, and regression suites in CI/CD.
• Define quality metrics: Formalize task-specific and aggregate KPIs (e.g., factuality, faithfulness, toxicity, grounding precision/recall, nDCG/ for retrieval, latency/cost) with clear acceptance bars.
• Own test datasets: Create and maintain golden sets, synthetic variants, adversarial cases, and red-teaming corpora; version them and track drift.
• Production monitoring: Instrument tracing and post-deployment checks; detect regressions using canary traffic, shadow tests, and guardrails; trigger auto-rollbacks or alerts.
• Customer query tracing & triage: Instrument end-to-end traces from a customer question through retrieval, inference, and post-processing. Reproduce and debug customer-reported issues, perform root-cause analysis across ML pipelines, and prioritize fixes.
• Customer feedback loop: Partner with Customer Success & Support to translate tickets and feedback into structured QA signals, updated test cases, and proactive communications back to customers.
• Experimentation: Run A/Bs and prompt/model experiments; ensure statistical rigor (power, MDE, CUPED when needed); analyze and ship recommendations.
• Dashboards & visibility: Build self-serve dashboards and runbooks that surface trends by segment, feature, and model/prompt version.
• Cross-functional partner: Validate new features, escalate quality risks early, and translate customer feedback & support tickets into structured QA signals.
• Process & governance: Establish eval review rituals, documentation, and change management for prompts, tools, model versions, and customer-facing releases.

What You'll Bring

• 2+ years in product analytics, QA, data, or ML, ideally in AI/LLM products.
• Strong Python and SQL; comfort querying production data, reading logs, and debugging flows.
• Hands-on experience designing LLM evals (rubric design, LLM-as-judge, human review, inter-rater reliability).
• Experience with distributed tracing or observability stacks (OpenTelemetry, Honeycomb, Datadog) and debugging end-to-end request paths.
• Familiarity with retrieval metrics, prompt/version control, and basic statistics for experiments.
• Very high agency and ownership: you turn ambiguity into a plan, ship iteratively, and measure impact.

Nice to Have

• Experience with any of: LangSmith / OpenAI Evals / Ragas / DeepEval / Arize / WhyLabs; Airflow/Prefect/dbt; BigQuery/Snowflake; MLflow/W&B; Metabase/Mode/Hex; OpenTelemetry/Trace-based evaluation.
• Safety & red-teaming experience (prompt injection, PII leakage, jailbreaks).
• Knowledge of CI/CD and release gates for prompts/tools/models.
• Prior collaboration with Customer Success or Support teams.

How We'll Measure Success (First 90 Days)

• 30 days: Baseline dashboards live for top journeys; initial golden sets and acceptance criteria published; trace logging in place for all customer-facing endpoints.
• 60 days: CI evals block regressions; first A/Bs shipped with readouts; on-call quality & customer-impact alerting in place; mean time-to-identify (MTTI) for customer issues < 30 min.
• 90 days: Coverage of priority use cases ≥ 80%; measurable quality lift (e.g., +X pts factuality, −Y % escalations); mean time-to-resolution (MTTR) for customer quality tickets reduced by Z %; documented playbook for ongoing evals and customer feedback triage.

(If you're passionate about building trustworthy AI and love digging into both metrics and customer stories, we'd love to meet you.)


  • AI Evals Engineer

    2 weeks ago


    Bengaluru, Karnataka, India Pibit Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Description : About : is transforming the underwriting landscape with Generative AI. Our SaaS solutions help US-based insurance companies make smarter, faster decisions by optimizing underwriting processes, reducing risk, and improving premiums. We're hiring an AI Evals Engineer to lead the systems that measure and maintain our AI's clarity,...


  • Bengaluru, Karnataka, India eeKee AI Full time ₹ 8,00,000 - ₹ 24,00,000 per year

    Company Description: Eekee AI is an AI-driven life coach designed to help employees feel grounded at work and build a lasting sense of purpose. Rooted in Ikigai and Viktor Frankl's logotherapy, Eekee uses daily conversations and psychometric signals to identify strengths and values, suggesting resources like books, courses, and team rituals. Eekee supports HR...

  • AI Engineer

    1 week ago


    Bengaluru, Karnataka, India Plum Benefits Private Limited Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    AI Automation Engineer (Internal AI Lead). Location: Bengaluru (India). About Plum: Plum is re-imagining employee healthcare & insurance benefits for fast-growing Indian businesses. We combine modern insurance products, primary, preventive care and data-driven claims to protect 5,000+ companies and 1 million+ lives today. Our next milestone: 10 million lives by...


  • Bengaluru, Karnataka, India Quash Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    About the Role: We're looking for a hands-on Applied AI Engineer who lives and breathes LLMs. This isn't a research role; you'll be solving real-world problems, shipping features, and building production-ready systems that make our agent smarter every week. What You'll Do: Build LLM-powered systems: prompt chains, retrieval pipelines, eval frameworks. Work...

  • Platform Tech Lead

    2 weeks ago


    Bengaluru, Karnataka, India Sarvam AI Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Platform Tech Lead. Company Overview: is a pioneering generative AI startup headquartered in Bengaluru, India. Our mission is to make generative AI accessible and impactful for Bharat. Founded by a team of AI experts, is dedicated to developing cost-effective, high-performance AI agents tailored for the Indian market, enabling enterprises to tap into new...


  • Bengaluru, Karnataka, India Blue Yonder Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Job Description Scope: We are seeking a highly skilled AI/Prompt Engineer to design, implement, and maintain artificial intelligence (AI) and machine learning (ML) solutions for our organization. The ideal candidate will have a deep understanding of AI and ML technologies, as well as experience with data analysis, software development, and cloud...

  • AI Engineer

    3 days ago


    Bengaluru, Karnataka, India Weekday AI Full time ₹ 60,00,000 - ₹ 80,00,000 per year

    This role is for one of Weekday's clients. Salary range: INR 6-8 LPA. Min Experience: 0 years. Location: Bangalore. JobType: full-time. We are looking for a passionate and motivated AI Engineer to join our growing team. This role is ideal for individuals who are eager to build a career in artificial intelligence and machine learning. You will work...

  • AI Engineer

    3 days ago


    Bengaluru, Karnataka, India Weekday AI Full time ₹ 6,00,000 - ₹ 8,00,000 per year

    This role is for one of Weekday's clients. Salary range: INR 6-8 LPA. Min Experience: 0 years. Location: Bangalore. JobType: full-time. We are looking for a passionate and motivated AI Engineer to join our growing team. This role is ideal for individuals who are eager to build a career in artificial intelligence and machine learning. You will work...

  • AI Engineer

    2 days ago


    Bengaluru, Karnataka, India Weekday AI Full time ₹ 6,00,000 - ₹ 8,00,000 per year

    This role is for one of Weekday's clients. Salary range: INR 6-8 LPA. Min Experience: 0 years. Location: Bangalore. JobType: full-time. We are looking for a passionate and motivated AI Engineer to join our growing team. This role is ideal for individuals who are eager to build a career in artificial intelligence and machine learning. You will work...

  • AI Engineer

    2 weeks ago


    Bengaluru, Karnataka, India Amogha AI Full time ₹ 9,00,000 - ₹ 12,00,000 per year

    About Amogha AI: At Amogha AI, we are building India's first voice-led conversational AI app specifically for mental health, therapy, and emotional well-being. Our mission is to provide a supportive, empathetic listener in your pocket, 24/7. We are creating a next-generation product that understands user context, provides therapy-grade support, and guarantees...