72 articles tagged "cs-CL" — page 1 of 3
Reinforced Attention Learning (arxiv.org)
paper | arXiv

Post-training with Reinforcement Learning (RL) has substantially improved reasoning in Large Language Models (LLMs) via test-time scaling. However, extending this paradigm to Multimodal LLMs (MLLMs) t...