Reinforcement Learning LLM


ibm.com
https://www.ibm.com › think › topics › llm-reinforcement-learning
LLM Reinforcement Learning | IBM
Apr 23, 2026 · Reinforcement learning has become an essential tool for improving large language models after pretraining. Methods like RLHF, PPO and DPO help teams optimize outputs by using …
huggingface.co
https://huggingface.co › learn › llm-course
Introduction to Reinforcement Learning and its Role in LLMs · …
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
arxiv.org
https://arxiv.org › abs
[2412.10400] Reinforcement Learning Enhanced LLMs: A Survey
Dec 5, 2024 · In this work, we are going to make a systematic review of the most up-to-date state of knowledge on RL-enhanced LLMs, attempting to consolidate and analyze the rapidly growing …
amazon.com
https://aws.amazon.com › blogs › machine-learning › ...
Reinforcement fine-tuning with LLM-as-a-judge
Apr 30, 2026 · They’re built for each domain through verifiable reward functions that can score LLM generations through a piece of code (Reinforcement Learning with Verifiable Rewards or RLVR) or …
ieee.org
https://ieeexplore.ieee.org › document
Survey on Large Language Model-Enhanced Reinforcement Learning: …
In this survey, we provide a comprehensive review of the existing literature in LLM-enhanced RL and summarize its characteristics compared with conventional RL methods, aiming to clarify the research …
newdigitalpartnership.co.uk
https://www.newdigitalpartnership.co.uk › ...
Mastering LLM Reinforcement Learning: A Comprehensive Tutorial …
Dive into the world of LLM Reinforcement Learning! This tutorial explains RLHF, supervised fine-tuning, reward models, and PPO to align language models with human values. Perfect for AI innovators.
inferless.com
https://www.inferless.com › learn › a-deep-dive-into-reinforcement-learning
Reinforcement Learning for LLMs: RLHF, DPO, and the Future of …
Explore how reinforcement learning transforms LLMs post-training—from RLHF and DPO to cutting-edge RLVR pipelines. Learn how these techniques improve reasoning, alignment, controllability, and …
springer.com
https://link.springer.com › chapter
Reinforcement Learning in Large Language Models (LLMs): The
Sep 23, 2024 · We’ll examine innovative RL-based approaches including reinforcement learning from human feedback (RLHF) and reinforcement learning from AI feedback (RLAIF), which are making …
arxiv.org
https://arxiv.org › abs
Reinforcement Learning for LLM Post-Training: A Survey
Jul 23, 2024 · Large language models (LLMs) trained via pretraining and supervised fine-tuning (SFT) can still produce harmful and misaligned outputs, or struggle in domains like math and coding. …
dataversity.net
https://www.dataversity.net › articles › the-role-of...
The Role of Reinforcement Learning in Enhancing LLM Performance
Jan 8, 2025 · That’s where reinforcement learning (RL) steps in, adding layers of learning and adaptability that take LLMs the extra mile. This blog post explores how reinforcement learning …

