150 articles tagged "cs-LG" — page 2 of 5
Reinforced Attention Learning (arxiv.org)
|paper|arXiv

Post-training with Reinforcement Learning (RL) has substantially improved reasoning in Large Language Models (LLMs) via test-time scaling. However, extending this paradigm to Multimodal LLMs (MLLMs) t...