LLM Reliability

3 weeks ago

Sahibzada Ajit Singh Nagar, India XenonStack Full time

ABOUT XENONSTACK

XenonStack is the fastest-growing Data and AI Foundry for Agentic Systems, enabling enterprises to gain real-time and intelligent business insights.

We deliver innovation through:

Agentic Systems for AI Agents → akira.ai
Vision AI Platform → xenonstack.ai
Inference AI Infrastructure for Agentic Systems → nexastack.ai

Our mission is to accelerate the world’s transition to AI + Human Intelligence by making AI agents reliable, explainable, and enterprise-ready.

THE OPPORTUNITY

We are seeking an LLM Reliability & Evaluation Engineer to ensure that large language models (LLMs) and agentic AI systems meet enterprise-grade standards of accuracy, safety, and trustworthiness.

This role focuses on evaluating, benchmarking, and stress-testing LLMs in real-world workflows, and on building frameworks for reliability, robustness, and continuous improvement. If you thrive at the intersection of AI research, applied testing, and responsible deployment, this is the role for you.

KEY RESPONSIBILITIES

Evaluation Frameworks
- Design and implement LLM evaluation pipelines covering accuracy, robustness, safety, and bias.
- Develop automated systems for benchmarking models on enterprise-relevant tasks.
Reliability Engineering
- Conduct stress tests, adversarial testing, and edge-case evaluations.
- Build tools to measure latency, consistency, and error recovery in multi-turn interactions.
Metrics & Monitoring
- Define KPIs such as factual accuracy, hallucination rate, toxicity, and compliance alignment.
- Establish real-time monitoring for drift, anomalies, and performance regressions.
Collaboration & Alignment
- Partner with ML engineers, product managers, and domain experts to align evaluation with business objectives.
- Work with Responsible AI teams to implement ethical, explainable, and compliant evaluation practices.
Continuous Improvement
- Feed insights from evaluation into fine-tuning, RLHF/RLAIF pipelines, and model selection.
- Maintain a central repository of test cases, benchmarks, and evaluation results.
Research & Innovation
- Stay current with state-of-the-art LLM evaluation techniques, from academic benchmarks to applied enterprise metrics.
- Explore automated evaluation using agentic test harnesses and synthetic data generation.

SKILLS & QUALIFICATIONS

Must-Have

3–6 years of experience in AI/ML, NLP, or applied model evaluation.
Strong understanding of LLM architectures, prompt engineering, and failure modes.
Hands-on experience with evaluation frameworks (eval harnesses, Ragas, OpenAI Evals, DeepEval).
Proficiency in Python and libraries like LangChain, LangGraph, LlamaIndex, and Hugging Face.
Experience with vector databases, RAG pipelines, and knowledge graph integration.
Familiarity with bias/fairness testing and Responsible AI frameworks.

Good-to-Have

Experience with reinforcement learning (RLHF, RLAIF) and reward modeling.
Exposure to agentic evaluation frameworks (multi-agent stress testing, synthetic user simulators).
Knowledge of compliance and safety requirements for BFSI, GRC, or SOC use cases.
Contributions to open-source evaluation libraries or research papers.

WHY SHOULD YOU JOIN US?

Agentic AI Product Company – Ensure reliability in cutting-edge AI platforms that are redefining enterprise adoption.
A Fast-Growing Category Leader – Be part of one of the fastest-growing AI Foundries, powering Fortune 500 enterprises with trustworthy AI.
Career Mobility & Growth – Grow into roles such as AI Systems Architect, Responsible AI Engineer, or Reliability Engineering Lead.
Global Exposure – Work on enterprise-scale evaluation challenges across BFSI, Healthcare, Telecom, and GRC.
Create Real Impact – Your evaluations will directly shape production-grade AI agents used in mission-critical systems.
Culture of Excellence – Our values (Agency, Taste, Ownership, Mastery, Impatience, and Customer Obsession) empower you to innovate fearlessly.
Responsible AI First – Join a company that prioritizes trustworthy, explainable, and compliant AI.

XENONSTACK CULTURE – JOIN US & MAKE AN IMPACT

At XenonStack, we believe in shaping the future of intelligent systems. We foster a culture of cultivation built on bold, human-centric leadership principles, where deep work, simplicity, and adoption define everything we do.

Our Cultural Values

Agency – Be self-directed and proactive.
Taste – Sweat the details and build with precision.
Ownership – Take responsibility for outcomes.
Mastery – Commit to continuous learning and growth.
Impatience – Move fast and embrace progress.
Customer Obsession – Always put the customer first.

Our Product Philosophy

Obsessed with Adoption – Making AI accessible, reliable, and enterprise-ready.
Obsessed with Simplicity – Turning complex evaluation challenges into seamless, automated frameworks.

Be part of our mission to accelerate the world’s transition to AI + Human Intelligence by making AI agents not just powerful, but trustworthy and reliable.
