AI Evaluation Engineer

21 hours ago


Surat, Gujarat, India | Atologist Infotech | Full time | ₹ 6,00,000 - ₹ 8,00,000 per year

Location: Surat, Gujarat (on-site)

Role Summary

We are seeking an AI Evaluation Engineer to join our team and help define how next-generation AI systems are tested, trusted, and improved. In this role, you'll design and implement rigorous quality assurance and evaluation frameworks (combining automated pipelines, human-in-the-loop review, and synthetic data generation) to measure not only our platform's reliability but also AI agents' accuracy, safety, and alignment with real-world use cases. You'll work end-to-end across the product lifecycle: writing test case scenarios, building automated tests, managing release test plans, developing dashboards and analysis tools, and translating insights into actionable improvements for both internal teams and clients.

Key Responsibilities

Design evaluation frameworks for accuracy, safety, fairness, and alignment with intended use cases.

Build and maintain evaluation pipelines that combine automated systems, human-in-the-loop review, and synthetic data generation to test AI agents' performance at scale.

Conduct failure-mode and edge-case analysis to surface weaknesses, risks, and unexpected behaviors in AI outputs.

Develop internal tools and dashboards that make evaluation results transparent, reproducible, and actionable across engineering, research, and client teams.

Ensure evaluation datasets are diverse, representative, and high-quality, minimizing bias while capturing real-world complexity.

Collaborate with researchers, engineers, and product stakeholders to translate insights into prioritized improvements and product decisions.

Treat evaluation as a discipline of testing, applying statistical rigor, reproducibility, and operational reliability across the AI lifecycle.

Ensure deployment readiness by stress-testing agents for resilience, safety, and alignment in production-like environments.

Own quality assurance: ensure software is built to specification and is reliable, robust, secure, and ready for deployment. Create test cases, test plans, and a bug-reporting process for unit, regression, and user acceptance testing (UAT).

Qualifications & Skills
Required

Strong software engineering skills, with proficiency in Python and familiarity with data pipelines, APIs, and evaluation tooling.

Solid understanding of the machine learning lifecycle, including model training, testing, and deployment.

Experience designing or implementing evaluation metrics, experiment design, or statistical analysis.

Exposure to human-in-the-loop workflows, annotation systems, or synthetic data generation.

Ability to conduct rigorous failure analysis and translate results into actionable insights.

Clear, precise communication skills; able to present evaluation findings to technical and non-technical audiences.

Preferred

Quality assurance and testing experience.

Experience with LLMs, generative AI systems, or agentic workflows.

Familiarity with fairness, bias detection, interpretability, or safety evaluation.

Background in building dashboards, monitoring tools, or large-scale observability systems.

Prior work with evaluation frameworks, testing suites, or reproducibility practices at scale.

Comfort working end-to-end: from scoping evaluation goals to delivering deployment-ready results.

Seniority Levels / Variations

Depending on seniority (e.g., junior vs senior vs staff), responsibilities might scale to include:

Owning or leading an evaluation strategy at a product or platform level.

Mentoring others or managing QA teams.

Architecting evaluation platforms.

Setting standards for metrics, tools, and best practices across multiple product lines.

What We Offer / Why Join Us

Opportunity to influence AI product quality, fairness, and trust at scale.

Working with cutting-edge model architectures and AI tools.

Collaborating with top researchers/engineers/product leaders.

Learning opportunities in safety, fairness, interpretability, and evaluation methodologies.

If the above requirements suit your interest, please call us on

or send your resume to


