Senior AI/ML and GPU Performance QA Engineer

2 days ago


Hyderabad, Telangana, India | AMD | Full time | ₹ 10,00,000 - ₹ 25,00,000 per year

Overview:

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

Responsibilities:

AI/ML and GPU Performance QA engineer

We are seeking an experienced Senior Technical Validation Engineer to drive validation and performance engineering for Machine Learning (ML), High-Performance Computing (HPC) frameworks, GPU software stacks, and cluster environments.

This role requires a good understanding of, and hands-on experience with, ROCm, CUDA, GPU architecture, ML frameworks, CI/CD systems, benchmarking, and competitive analysis.

You will lead cross-functional initiatives across validation, automation, test development, performance tuning, and system scalability, ensuring delivery of high-quality, high-performance software for next-generation AI and HPC workloads.

Key Responsibilities

  • Lead validation for ML/AI models: accuracy testing, performance benchmarking, regression, drift detection, A/B testing
  • Test ML frameworks and tooling: PyTorch, Hugging Face, MLflow experiment tracking
  • Validate a wide variety of AI models to ensure correctness in distributed training and inference
  • Perform GPU testing & profiling: ROCm/CUDA validation, performance profiling, memory/thermal analysis, multi-GPU scaling
  • Validate HPC frameworks, distributed runtimes, compilers, and GPU libraries
  • Build scalable CI/CD workflows for ML/HPC validation. Develop automated test pipelines using Docker, Kubernetes, GitHub Actions, Jenkins
  • Validate cloud-based AI workloads on AWS SageMaker, Lambda, and S3
  • Run and validate benchmarks in containerized and virtualized GPU environments
  • Design and implement automated validation pipelines for ML frameworks (e.g., PyTorch, TensorFlow, JAX) across GPU platforms.
  • Develop and maintain benchmarking suites for AI models and HPC workloads, focusing on performance, scalability, and regression detection.
  • Carry out multi-node validation using orchestration tools (e.g., Slurm, MPI, Kubernetes) to simulate real-world distributed training and inference.
  • Collaborate with hardware and software teams to validate GPU hardware platforms (NVIDIA CUDA, AMD ROCm) for ML and HPC readiness.
  • Analyze performance metrics using profiling tools (e.g., Nsight, rocprof, perf) and provide actionable insights.
  • Drive test content development for emerging AI workloads, including LLMs, vision models, and scientific computing benchmarks.
  • Perform bottleneck analysis, hyperparameter validation, and competitive benchmarking
  • Mentor junior engineers and contribute to validation strategy, tooling, and best practices.

Preferred Experience:

  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field.
  • 8+ years of experience in validation engineering, ML infrastructure, or HPC performance testing.
  • Strong hands-on experience with GPU platforms (NVIDIA CUDA, AMD ROCm) and their software ecosystems.
  • Deep understanding of AI model architectures, training/inference workflows, and ML performance bottlenecks.
  • Proven experience with CI/CD systems, Git, Docker, and automated test frameworks.
  • Expertise in multi-node orchestration and distributed system validation.
  • Familiarity with HPC benchmarks (e.g., HPL, HPCG, MLPerf) and AI model benchmarking methodologies.
  • Proficiency in scripting and automation (Python, Bash, YAML) in Linux environments.
  • Strong communication, documentation, and cross-functional collaboration skills.

Qualifications:

Benefits offered are described in AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

