AI Evaluation Engineer
21 hours ago
Location: Surat, Gujarat (on-site)
Role Summary
We are seeking an AI Evaluation Engineer to join our team and help define how next-generation AI systems are tested, trusted, and improved. In this role, you'll design and implement rigorous quality assurance and evaluation frameworks that combine automated pipelines, human-in-the-loop review, and synthetic data generation to measure both our platform's reliability and our AI agents' accuracy, safety, and alignment with real-world use cases. You'll work end-to-end across the product lifecycle: writing test scenarios, building automated tests, managing release test plans, developing dashboards and analysis tools, and translating insights into actionable improvements for both internal teams and clients.
Key Responsibilities
Design evaluation frameworks for accuracy, safety, fairness, and alignment with intended use cases.
Build and maintain evaluation pipelines that combine automated systems, human-in-the-loop review, and synthetic data generation to test AI agents' performance at scale.
Conduct failure mode and edge-case analysis to surface weaknesses, risks, and unexpected behaviors in AI outputs.
Develop internal tools and dashboards that make evaluation results transparent, reproducible, and actionable across engineering, research, and client teams.
Ensure evaluation datasets are diverse, representative, and high-quality, minimizing bias while capturing real-world complexity.
Collaborate with researchers, engineers, and product stakeholders to translate insights into prioritized improvements and product decisions.
Treat evaluation as a discipline of testing, applying statistical rigor, reproducibility, and operational reliability across the AI lifecycle.
Ensure deployment readiness by stress-testing agents for resilience, safety, and alignment in production-like environments.
Own quality assurance so that software is built to specification and is reliable, robust, secure, and ready for deployment. Create test cases, test plans, and a bug-reporting process for unit, regression, and user acceptance testing (UAT).
Qualifications & Skills
Required
Strong software engineering skills, with proficiency in Python and familiarity with data pipelines, APIs, and evaluation tooling.
Solid understanding of the machine learning lifecycle, including model training, testing, and deployment.
Experience designing or implementing evaluation metrics, experiment design, or statistical analysis.
Exposure to human-in-the-loop workflows, annotation systems, or synthetic data generation.
Ability to conduct rigorous failure analysis and translate results into actionable insights.
Clear, precise communication skills; able to present evaluation findings to technical and non-technical audiences.
Preferred
Quality assurance and testing experience.
Experience with LLMs, generative AI systems, or agentic workflows.
Familiarity with fairness, bias detection, interpretability, or safety evaluation.
Background in building dashboards, monitoring tools, or large-scale observability systems.
Prior work with evaluation frameworks, testing suites, or reproducibility practices at scale.
Comfort working end-to-end: from scoping evaluation goals to delivering deployment-ready results.
Seniority Levels / Variations
Depending on seniority (e.g., junior vs senior vs staff), responsibilities might scale to include:
Owning or leading an evaluation strategy at a product or platform level.
Mentoring others or managing QA teams.
Designing the architecture of evaluation platforms.
Setting standards for metrics, tools, and best practices across multiple product lines.
What We Offer / Why Join Us
Opportunity to influence AI product quality, fairness, and trust at scale.
Working with cutting-edge model architectures and AI tools.
Collaborating with top researchers/engineers/product leaders.
Learning opportunities in safety, fairness, interpretability, and evaluation methodologies.
If the above requirements suit your interest, please call us on
or send your resume to