Reinforcement Learning

3 days ago


Taiwan, New Zealand, Australia, India Binance Full time ₹ 1,20,000 - ₹ 1,80,000 per year
Job Description

Position: Reinforcement Learning (RL), Data Scientist/Machine Learning Engineer
Location: Asia / Australia, Brisbane / Australia, Melbourne / Australia, Sydney / Hong Kong / Taiwan, Taipei / New Zealand, Auckland / New Zealand, Wellington
Department: Engineering Data Science/AI
Employment Type: Full-time: Remote
Remote: Yes

Binance is a leading global blockchain ecosystem behind the world's largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 280 million people in 100 countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.

About the Role

You will develop and optimize Reinforcement Learning (RL) models for enterprise-scale applications such as customer service, token reporting, compliance, and Web3 domain reasoning.

You will explore and evaluate advanced algorithms including PPO, GRPO, DPO, RLHF, RLAIF, and Agentic RL to enhance the capabilities of LLMs, VLMs, and Agentic AI at Binance. The role requires a strong theoretical foundation in RL (covering policy optimization, reward modeling, and planning), paired with the engineering skills to build scalable production systems.

You will take full ownership from research through deployment, driving experimentation with systematic evaluation and benchmarking. Collaboration across research, infrastructure, and application teams will be key to delivering impactful AI solutions.

Responsibilities
  • Research and develop state-of-the-art RL algorithms, focusing on large model optimization and alignment techniques.
  • Design and implement RL training pipelines, including environment simulation, data generation, and reward function design.
  • Apply Reinforcement Learning methods to enhance LLM/VLM/Agentic AI capabilities in reasoning, planning, and autonomous decision-making.
  • Collaborate with engineers and researchers to integrate RL solutions into enterprise AI platforms.
  • Monitor model performance in production and continuously improve through iterative training and fine-tuning.
Requirements
  • Master's degree in Computer Science, Applied Mathematics, Machine Learning, or related fields.
  • 5 years of hands-on experience in RL and in optimization of at least one of the following: LLM, VLM, or Agentic AI.
  • Strong coding skills in Python, with experience in ML frameworks and RL libraries.
  • Experience with large-scale distributed training and optimization.
  • Self-driven, ownership mindset, and strong problem-solving skills. Excellent communication skills for cross-functional collaboration.
Why Binance
  • Shape the future with the world's leading blockchain ecosystem.
  • Collaborate with world-class talent in a user-centric global organization with a flat structure.
  • Tackle unique, fast-paced projects with autonomy in an innovative environment.
  • Thrive in a results-driven workplace with opportunities for career growth and continuous learning.
  • Competitive salary and company benefits.
  • Work-from-home arrangement (the arrangement may vary depending on the work nature of the business team).

Binance is committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success.

By submitting a job application, you confirm that you have read and agree to our Candidate Privacy Notice.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.


  • Data Scientist

    2 days ago


    Taiwan, New Zealand, Australia, India Binance Full time ₹ 1,20,000 - ₹ 2,40,000 per year

    Job Description Data Scientist (Reinforcement Learning/Vision Language Model) Location: Asia / Australia, Brisbane / Australia, Melbourne / Australia, Sydney / Hong Kong / Taiwan, Taipei / New Zealand, Auckland / New Zealand, Wellington Department: Engineering Data Science/AI Employment Type: Full-time: Remote Binance is a leading global blockchain...


  • Taiwan, New Zealand, Australia, India Binance Full time US$ 1,50,000 - US$ 2,00,000 per year

    Job Description Fine Tuning/Post Training Data Scientist - RL (GRPO, PPO, RLHF) Location: Asia / Australia, Brisbane / Australia, Melbourne / Australia, Sydney / Hong Kong / Taiwan, Taipei / New Zealand, Auckland / New Zealand, Wellington Department: Engineering Data Science/AI Employment Type: Full-time: Remote About Binance Binance is a leading global...


  • New Zealand, India Binance Full time US$ 1,20,000 - US$ 1,80,000 per year

    Job Description Recommendation System, Data Scientist/Machine Learning Engineer Location: Asia / Australia, Brisbane / Australia, Melbourne / Australia, Sydney / Hong Kong / New Zealand, Auckland / New Zealand, Wellington / Taiwan, Taipei Department: Engineering Data Science/AI Job Type: Full-time: Remote Binance is a leading global...

  • Data Scientist

    1 week ago


    Thailand, Taiwan, Australia, India Binance Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Job Description: Data Scientist (Recommendation Systems) Location: Binance Square Taiwan, Taipei / Thailand, Bangkok / Australia, Brisbane / Australia, Melbourne / Australia, Sydney / Indonesia, Jakarta / Hong Kong / Asia / New Zealand, Auckland / New Zealand, Wellington Department: Engineering Data Science/AI Employment Type: Full-time: Remote Binance is...


  • New Delhi, India People Prime Worldwide Full time

    About Client: Our client is one of the world's fastest-growing AI companies, accelerating the advancement and deployment of powerful AI systems. They help customers in two ways: Working with the world’s leading AI labs to advance frontier model capabilities in thinking, reasoning, coding, agentic behavior, multimodality, multilinguality, STEM and frontier...

  • Principal Engineer

    5 days ago


    Dubai, Taiwan, Australia, India Binance Full time ₹ 1,20,000 - ₹ 2,40,000 per year

    Job Description Principal Engineer (Artificial Intelligence, Backend Development) Locations: Asia / Taiwan, Taipei / UAE, Dubai / Australia, Brisbane / Australia, Melbourne / Australia, Sydney / New Zealand, Auckland / New Zealand, Wellington Department: Engineering Data Science/AI Employment Type: Full-time: Remote Binance is a leading global...


  • New Delhi, India Mercity Full time

    We are seeking a Machine Learning Researcher to join our team. You will be working on cutting-edge research projects, building experimental prototypes, and documenting your findings through technical publications. This is a fully remote position, initially a 3-month research engagement with the potential for extension or full-time offer based on performance...

  • Subject Matter Expert

    3 weeks ago


    New Delhi, India Flex Full time

    To support our extraordinary teams who build great products and contribute to our growth, we're looking to add a Subject Matter Expert - Learning & Development based in Chennai. What a typical day looks like: Facilitate 4 hours of training daily on leadership and soft/behavioral skills. Design and develop or enhance existing training content for Individual...

  • Associate Architect

    3 weeks ago


    New Delhi, India Quantiphi Full time

    Role: Associate Architect - Machine Learning (Gen AI) Experience: 6 to 8 Years Location: Bangalore / Mumbai (Hybrid) Job Summary: We are looking for an experienced Associate Architect - Machine Learning to join our team, focused on building Agentic AI workflows, fine-tuning Large Language Models (LLMs), performing prompt engineering, and applying related...


  • New Delhi, India Quantiphi Full time

    Role: Senior Machine Learning Engineer (GenAI) Required Experience: 3 to 6 years Location: Mumbai / Bangalore / Trivandrum (Hybrid) Job Summary: We are looking for experienced Machine Learning Engineers to join our team, focused on building Agentic AI workflows, fine-tuning Large Language Models (LLMs), performing prompt engineering, and applying related...