Fine Tuning/Post Training Data Scientist

1 day ago


Taiwan, New Zealand, Australia, India | Binance | Full time | US$ 150,000 - US$ 200,000 per year
Job Description

Fine Tuning/Post Training Data Scientist - RL (GRPO, PPO, RLHF)

Location: Asia / Australia, Brisbane / Australia, Melbourne / Australia, Sydney / Hong Kong / Taiwan, Taipei / New Zealand, Auckland / New Zealand, Wellington
Department: Engineering Data Science/AI
Employment Type: Full-time, Remote

About Binance

Binance is a leading global blockchain ecosystem behind the world's largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 280 million people in 100 countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.

About the Role

You will develop and optimize Reinforcement Learning (RL) models for enterprise-scale applications such as customer service, token reporting, compliance, and Web3 domain reasoning.

You will explore and evaluate advanced algorithms including PPO, GRPO, DPO, RLHF, RLAIF, and Agentic RL to enhance the capabilities of LLMs, VLMs, and Agentic AI at Binance. The role requires a strong theoretical foundation in RL (covering policy optimization, reward modeling, and planning), paired with the engineering skills to build scalable production systems.

You will take full ownership from research through deployment, driving experimentation with systematic evaluation and benchmarking. Collaboration across research, infrastructure, and application teams will be key to delivering impactful AI solutions.

Responsibilities
  • Research and develop state-of-the-art RL algorithms, focusing on large model optimization and alignment techniques.
  • Design and implement RL training pipelines, including environment simulation, data generation, and reward function design.
  • Apply Reinforcement Learning methods to enhance LLM/VLM/Agentic AI capabilities in reasoning, planning, and autonomous decision-making.
  • Collaborate with engineers and researchers to integrate RL solutions into enterprise AI platforms.
  • Monitor model performance in production and continuously improve through iterative training and fine-tuning.
Requirements
  • Master's degree in Computer Science, Applied Mathematics, Machine Learning, or a related field.
  • 5 years of hands-on experience in RL and in optimization of at least one of LLM, VLM, or Agentic AI.
  • Strong coding skills in Python, with experience in ML frameworks and RL libraries.
  • Experience with large-scale distributed training and optimization.
  • Self-driven, ownership mindset, and strong problem-solving skills. Excellent communication skills for cross-functional collaboration.
Why Binance
  • Shape the future with the world's leading blockchain ecosystem
  • Collaborate with world-class talent in a user-centric global organization with a flat structure
  • Tackle unique, fast-paced projects with autonomy in an innovative environment
  • Thrive in a results-driven workplace with opportunities for career growth and continuous learning
  • Competitive salary and company benefits
  • Work-from-home arrangement (the arrangement may vary depending on the work nature of the business team)

Binance is committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success.

By submitting a job application, you confirm that you have read and agree to our Candidate Privacy Notice.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.


