Solving Spatial Supersensing Without Spatial Supersensing [TOP LAB] (arxiv.org) 2025-11-23 | paper | arXiv | cs-CV, cs-LG
EvoLMM: Self-Evolving Large Multimodal Models with Continuous Rewards (arxiv.org) 2025-11-21 | paper | arXiv | cs-CV
Solving Spatial Supersensing Without Spatial Supersensing [TOP LAB] (arxiv.org) 2025-11-21 | paper | arXiv | cs-CV, cs-LG
Dataset Distillation for Pre-Trained Self-Supervised Vision Models (arxiv.org) 2025-11-21 | paper | arXiv | cs-CV, cs-AI, cs-LG
NoPo-Avatar: Generalizable and Animatable Avatars from Sparse Inputs without Human Poses (arxiv.org) 2025-11-21 | paper | arXiv | cs-CV
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization (arxiv.org) 2025-11-20 | paper | arXiv | cs-CV
In-N-On: Scaling Egocentric Manipulation with in-the-wild and on-task Data (arxiv.org) 2025-11-20 | paper | arXiv | cs-RO, cs-AI, cs-CV
Think Visually, Reason Textually: Vision-Language Synergy in ARC (arxiv.org) 2025-11-20 | paper | arXiv | cs-CV, cs-AI, cs-CL
GEO-Bench-2: From Performance to Capability, Rethinking Evaluation in Geospatial AI [TOP LAB] (arxiv.org) 2025-11-20 | paper | arXiv | cs-CV, cs-AI
UniGen-1.5: Enhancing Image Generation and Editing through Reward Unification in Reinforcement Learning (arxiv.org) 2025-11-19 | paper | arXiv | cs-CV
OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation [TOP LAB] (arxiv.org) 2025-11-18 | paper | arXiv | cs-CV, cs-LG
Alpha Divergence Losses for Biometric Verification [TOP LAB] (arxiv.org) 2025-11-18 | paper | arXiv | cs-CV, cs-AI
Scaling Spatial Intelligence with Multimodal Foundation Models (arxiv.org) 2025-11-18 | paper | arXiv | cs-CV, cs-AI, cs-LG
Synergy vs. Noise: Performance-Guided Multimodal Fusion For Biochemical Recurrence-Free Survival in Prostate Cancer [TOP LAB] (arxiv.org) 2025-11-17 | paper | arXiv | q-bio-QM, cs-CV, cs-LG
Enhancing the Outcome Reward-based RL Training of MLLMs with Self-Consistency Sampling (arxiv.org) 2025-11-16 | paper | arXiv | cs-CV
Depth Anything 3: Recovering the Visual Space from Any Views (arxiv.org) 2025-11-16 | paper | arXiv | cs-CV
Towards Blind and Low-Vision Accessibility of Lightweight VLMs and Custom LLM-Evals [TOP LAB] (arxiv.org) 2025-11-16 | paper | arXiv | cs-CV, cs-CL
Preparation of Fractal-Inspired Computational Architectures for Advanced Large Language Model Analysis [TOP LAB] (arxiv.org) 2025-11-11 | paper | arXiv | cs-LG, cs-CV
GentleHumanoid: Learning Upper-body Compliance for Contact-rich Human and Object Interaction (arxiv.org) 2025-11-09 | paper | arXiv | cs-RO, cs-CV, cs-HC
Learning from Single Timestamps: Complexity Estimation in Laparoscopic Cholecystectomy [TOP LAB] (arxiv.org) 2025-11-09 | paper | arXiv | cs-CV
GentleHumanoid: Learning Upper-body Compliance for Contact-rich Human and Object Interaction (arxiv.org) 2025-11-08 | paper | arXiv | cs-RO, cs-CV, cs-HC
Learning from Single Timestamps: Complexity Estimation in Laparoscopic Cholecystectomy [TOP LAB] (arxiv.org) 2025-11-08 | paper | arXiv | cs-CV
GentleHumanoid: Learning Upper-body Compliance for Contact-rich Human and Object Interaction (arxiv.org) 2025-11-07 | paper | arXiv | cs-RO, cs-CV, cs-HC
Learning from Single Timestamps: Complexity Estimation in Laparoscopic Cholecystectomy [TOP LAB] (arxiv.org) 2025-11-07 | paper | arXiv | cs-CV