pibit.ai - AI Evals Engineer

2 weeks ago

Bengaluru, India Pibit.ai Full time

Description:

About Pibit.ai:
Pibit.ai is transforming the underwriting landscape with Generative AI. Our SaaS solutions help US-based insurance companies make smarter, faster decisions by optimizing underwriting processes, reducing risk, and improving premiums.

We're hiring an AI Evals Engineer to lead the systems that measure and maintain our AI's clarity, accuracy, and trustworthiness, while directly connecting insights from real customer use. You'll build gold-standard test sets, automate both offline and online evaluations, trace customer interactions end-to-end, and integrate quality signals into our product and release pipelines, enabling us to move quickly while preserving trust.

Position Overview:
As an AI Evals Engineer, you'll build the evaluation, monitoring, and quality infrastructure that ensures our AI systems stay accurate, reliable, and customer-trusted. You'll collaborate closely with ML engineers, product, and customer teams to design gold-standard test sets, automate eval pipelines, trace customer queries, and wire quality signals into our release process. This role is ideal for someone who wants to grow as an applied ML/LLM engineer with a deep focus on evaluation, observability, and continuous improvement.

Key Responsibilities:
- Collaborate with ML and product engineers to design and implement evaluation and observability systems for AI models.
- Build automated offline and online eval pipelines for key use cases (RAG, agents, chat, extraction).
- Develop and maintain gold-standard datasets, synthetic/adversarial test cases, and regression suites in CI/CD.
- Define and track LLM quality metrics such as factuality, grounding precision/recall, latency, and cost.
- Instrument end-to-end tracing of customer queries across retrieval, inference, and post-processing to debug and improve quality.
- Partner with Customer Success and Support teams to translate feedback into structured QA signals and test updates.
- Run and analyze A/B tests and model/prompt experiments, ensuring statistical rigor and measurable improvements.
- Integrate evaluation and monitoring signals into deployment pipelines to prevent regressions and enforce release quality gates.
- Build dashboards and visibility tools that surface model performance trends by feature, prompt, and version.
- Contribute to documentation, evaluation governance, and best practices for model updates and prompt changes.

Technical Requirements:
- LLM & ML: GPT, Claude, Gemini, Mixtral, Llama, Hugging Face OSS models
- LLMOps & Evaluation: OpenAI Evals, LangSmith, LangChain, LangGraph, MLflow, LangFuse, DeepEval, LlamaIndex, SageMaker, AWS Bedrock, Azure AI
- Databases: PostgreSQL, MongoDB, Pinecone, ChromaDB
- Cloud: AWS, Azure
- DevOps & Monitoring: Kubernetes, Docker, OpenTelemetry, Datadog, Honeycomb
- Languages: Python, SQL, JavaScript
- Certifications (Bonus): AWS Machine Learning Specialty, AWS Solutions Architect Professional, Azure Solutions Architect Expert

What You'll Do:
- Build, automate, and maintain LLM evaluation pipelines and gold datasets for our AI products.
- Establish quantitative quality metrics and acceptance criteria for production LLM systems.
- Implement observability and tracing across AI workflows to detect regressions and ensure reliability.
- Work on real-world generative AI and NLP applications, particularly in high-trust domains.
- Collaborate with data, product, and engineering teams to close the loop between customer feedback and model quality.
- Gain hands-on experience with cloud ML infrastructure and modern LLMOps tooling.
- Contribute to improving model accuracy, safety, and trustworthiness through experimentation and data-driven evaluation.

What You Need to Succeed:
- Bachelor's or Master's degree in Computer Science, Machine Learning, or a related field.
- Minimum 2 years of experience in ML, data science, or evaluation/QA roles.
- Strong understanding of ML fundamentals, deep learning, and LLM-based systems.
- Proficiency in Python and SQL for data analysis, model evaluation, and automation.
- Familiarity with LLMOps tools (LangChain, MLflow, SageMaker, etc.) and basic DevOps (Docker, Kubernetes).
- Curiosity, ownership, and a problem-solving mindset; you thrive in ambiguous, high-impact environments.
- Excellent communication and collaboration skills to work cross-functionally and drive quality improvements end-to-end.

Why Join Us:
- Work directly with experienced founders and senior engineers.
- Get hands-on mentorship in advanced ML and LLMOps.
- Be part of a high-energy team that values learning and innovation.
- Contribute to building AI-first products shaping the future of insurance tech.
- Enjoy a culture that celebrates both hard work and growth (ref:hirist.tech)

pibit.ai - Senior Machine Learning Engineer - Python/PyTorch

2 weeks ago

Bengaluru, India Pibit.ai Full time

Description: About Pibit.ai: Pibit.ai is transforming the underwriting landscape with Generative AI. Our SaaS solutions help insurance companies in the US make smarter, faster decisions by optimizing underwriting processes, reducing risk, and improving premiums. As we expand, we're looking for a Senior Machine Learning Engineer to build NLP, CV & LLM based...
pibit.ai - Machine Learning Engineer II - Python/PyTorch

2 weeks ago

Bengaluru, India Pibit.ai Full time

Description: About Pibit.ai: Pibit.ai is transforming the underwriting landscape with Generative AI. Our SaaS solutions help insurance companies in the US make smarter, faster decisions by optimizing underwriting processes, reducing risk, and improving premiums. As we expand, we're looking for a Machine Learning Engineer - 2 to build NLP, CV & LLM based...
Pibit.ai - Associate Product Manager - AI Team

2 weeks ago

Bengaluru, India Pibit.ai Full time

- As an Associate Product Manager - AI at Pibit.ai, you will play a key role in building critical features that power our AI-driven underwriting solutions.
- You will collaborate with machine learning, engineering, design, and business stakeholders to enhance platform features, improve data processing capabilities, and ensure scalability for our growing...
Pibit.ai - Product Manager - AI Team

2 weeks ago

Bengaluru, India Pibit.ai Full time

Bangalore - In Office. Department: Product Management. Reports to: Senior Product Manager. About Pibit.ai: Pibit.ai is a Y Combinator-backed insurtech startup co-founded by IIT Roorkee alumni focused on transforming the underwriting landscape with Generative AI. Our SaaS solutions help insurance companies in the US make smarter, faster decisions by optimizing...
Pibit.ai - Associate Product Manager AI Team

2 weeks ago

Bengaluru, India Pibit.ai Full time

Location: Bangalore - In Office. Department: Product Management. Reports to: Senior Product Manager. About Pibit.ai: Pibit.ai is a Y Combinator-backed insurtech startup co-founded by IIT Roorkee alumni focused on transforming the underwriting landscape with Generative AI. Our SaaS solutions help insurance companies in the US make smarter, faster decisions by...
Pibit.ai - Product Manager - Platform

2 weeks ago

Bengaluru, India Pibit.ai Full time

Bangalore - In Office. Department: Product Management. Reports to: Senior Product Manager. About Pibit.ai: Pibit.ai is a Y Combinator-backed insurtech startup co-founded by IIT Roorkee alumni focused on transforming the underwriting landscape with Generative AI. Our SaaS solutions help insurance companies in the US make smarter, faster decisions by optimizing underwriting...
Pibit.ai - Associate Product Manager - Platform

2 weeks ago

Bengaluru, India Pibit.ai Full time

Location: Bangalore - In Office. Department: Product Management. Reports to: Senior Product Manager. About Pibit.ai: Pibit.ai is a Y Combinator-backed insurtech startup co-founded by IIT Roorkee alumni focused on transforming the underwriting landscape with Generative AI. Our SaaS solutions help insurance companies in the US make smarter, faster decisions by optimizing...
pibit.ai - Software Development Engineer II - Python

3 weeks ago

Bengaluru, India Pibit.ai Full time

Who are we? Pibit.ai is a Y Combinator-backed insurtech startup co-founded by IIT Roorkee alumni focused on transforming the underwriting landscape with Generative AI. Our SaaS offering helps insurance companies in the US make smarter, faster decisions by leveraging advanced AI to optimize underwriting processes, reduce risk, and improve premiums. As we...
Pibit.ai - Senior ML Engineer

22 hours ago

Bengaluru, India Nexthire Full time

Senior Machine Learning Engineer. Location: Onsite - Bengaluru. Company: Pibit.ai (Y Combinator-backed insurtech startup). About Pibit.ai: Pibit.ai is transforming the underwriting landscape with Generative AI. Our SaaS solutions help insurance companies in the US make smarter, faster decisions by optimizing underwriting processes, reducing risk, and improving...
Pibit.ai - AI Evals Engineer

2 weeks ago

Bengaluru, Karnataka, India Pibit Full time ₹ 12,00,000 - ₹ 36,00,000 per year

Description: About Pibit.ai: Pibit.ai is transforming the underwriting landscape with Generative AI. Our SaaS solutions help US-based insurance companies make smarter, faster decisions by optimizing underwriting processes, reducing risk, and improving premiums. We're hiring an AI Evals Engineer to lead the systems that measure and maintain our AI's clarity,...

