AI Evaluation Engineer

1 week ago


Surat, Gujarat, India Atologist Infotech Full time ₹ 8,00,000 - ₹ 12,00,000 per year

Role Summary

We are seeking an AI Evaluation Engineer to join our team and help define how next-generation AI systems are tested, trusted, and improved. In this role, you'll design and implement rigorous quality assurance and evaluation frameworks (combining automated pipelines, human-in-the-loop review, and synthetic data generation) to measure not only our platform reliability but also AI agents' accuracy, safety, and alignment with real-world use cases. You'll work end-to-end across the product lifecycle: writing test case scenarios, building automated tests, managing release test plans, developing dashboards and analysis tools, and translating insights into actionable improvements for both internal teams and clients.

Key Responsibilities

●       Design evaluation frameworks for accuracy, safety, fairness, and alignment with intended use cases.

●       Build and maintain evaluation pipelines that combine automated systems, human-in-the-loop review, and synthetic data generation to test AI agents' performance at scale.

●       Conduct failure mode and edge-case analysis to surface weaknesses, risks, and unexpected behaviors in AI outputs.

●       Develop internal tools and dashboards that make evaluation results transparent, reproducible, and actionable across engineering, research, and client teams.

●       Ensure evaluation datasets are diverse, representative, and high-quality, minimizing bias while capturing real-world complexity.

●       Collaborate with researchers, engineers, and product stakeholders to translate insights into prioritized improvements and product decisions.

●       Treat evaluation as a discipline of testing, applying statistical rigor, reproducibility, and operational reliability across the AI lifecycle.

●       Ensure deployment readiness by stress-testing agents for resilience, safety, and alignment in production-like environments.

●       Own quality assurance: ensure software is built to specification and is reliable, robust, secure, and ready for deployment. Create test cases, test plans, and a bug-reporting process for unit, regression, and user acceptance testing (UAT).

Qualifications & Skills

Required

●       Strong software engineering skills, with proficiency in Python and familiarity with data pipelines, APIs, and evaluation tooling.

●       Solid understanding of the machine learning lifecycle, including model training, testing, and deployment.

●       Experience designing or implementing evaluation metrics, experiment design, or statistical analysis.

●       Exposure to human-in-the-loop workflows, annotation systems, or synthetic data generation.

●       Ability to conduct rigorous failure analysis and translate results into actionable insights.

●       Clear, precise communication skills; able to present evaluation findings to technical and non-technical audiences.

Preferred

●       Quality assurance and testing experience.

●       Experience with LLMs, generative AI systems, or agentic workflows.

●       Familiarity with fairness, bias detection, interpretability, or safety evaluation.

●       Background in building dashboards, monitoring tools, or large-scale observability systems.

●       Prior work with evaluation frameworks, testing suites, or reproducibility practices at scale.

●       Comfort working end-to-end: from scoping evaluation goals to delivering deployment-ready results.

Seniority Levels / Variations

Depending on seniority (e.g., junior vs senior vs staff), responsibilities might scale to include:

●       Owning or leading an evaluation strategy at a product or platform level.

●       Mentoring others or managing QA teams.

●       Architecting evaluation platforms.

●       Setting standards for metrics, tools, and best practices across multiple product lines.

What We Offer / Why Join Us

●       Opportunity to influence AI product quality, fairness, and trust at scale.

●       Working with cutting-edge model architectures and AI tools.

●       Collaborating with top researchers, engineers, and product leaders.

●       A flexible, collaborative environment, with remote work where applicable.

●       Learning opportunities in safety, fairness, interpretability, and evaluation methodologies.

