The AI Wire

150 articles tagged "cs-LG" — page 2 of 5

Beyond Rewards in Reinforcement Learning for Cyber Defence [TOP LAB](arxiv.org)

2026-02-05|paper|arXiv

Recent years have seen an explosion of interest in autonomous cyber defence agents trained to defend computer networks using deep reinforcement learning. These agents are typically trained in cyber gy...

cs-LG cs-AI

Reinforced Attention Learning (arxiv.org)

2026-02-05|paper|arXiv

Post-training with Reinforcement Learning (RL) has substantially improved reasoning in Large Language Models (LLMs) via test-time scaling. However, extending this paradigm to Multimodal LLMs (MLLMs) t...

cs-CL cs-CV cs-LG

Protein Autoregressive Modeling via Multiscale Structure Generation (arxiv.org)

2026-02-05|paper|arXiv

We present protein autoregressive modeling (PAR), the first multi-scale autoregressive framework for protein backbone generation via coarse-to-fine next-scale prediction. Using the hierarchical nature...

cs-LG cs-AI q-bio-BM

Contrastive Continual Learning for Model Adaptability in Internet of Things (arxiv.org)

2026-02-05|paper|arXiv

Internet of Things (IoT) deployments operate in nonstationary, dynamic environments where factors such as sensor drift, evolving user behavior, and heterogeneous user privacy requirements can affect a...

cs-LG cs-AI

Equilibrium Propagation for Non-Conservative Systems [TOP LAB](arxiv.org)

2026-02-04|paper|arXiv

Equilibrium Propagation (EP) is a physics-inspired learning algorithm that uses stationary states of a dynamical system both for inference and learning. In its original formulation it is limited to co...

cs-LG cs-AI cs-NE

PLATE: Plasticity-Tunable Efficient Adapters for Geometry-Aware Continual Learning (arxiv.org)

2026-02-04|paper|arXiv

We develop a continual learning method for pretrained models that requires no access to old-task data, addressing a practical barrier in foundation model adaptation where pretraining distributi...

cs-LG cs-AI

Investigating Quantum Circuit Designs Using Neuro-Evolution (arxiv.org)

2026-02-04|paper|arXiv

Designing effective quantum circuits remains a central challenge in quantum computing, as circuit structure strongly influences expressivity, trainability, and hardware feasibility. Current approaches...

cs-NE cs-LG

MentisOculi: Revealing the Limits of Reasoning with Mental Imagery [TOP LAB](arxiv.org)

2026-02-03|paper|arXiv

Frontier models are transitioning from multimodal large language models (MLLMs) that merely ingest visual information to unified multimodal models (UMMs) capable of native interleaved generation. This...

cs-AI cs-CV cs-LG

Misconception Diagnosis From Student-Tutor Dialogue: Generate, Retrieve, Rerank [TOP LAB](arxiv.org)

2026-02-03|paper|arXiv

Timely and accurate identification of student misconceptions is key to improving learning outcomes and pre-empting the compounding of student errors. However, this task is highly dependent on the effo...

cs-CL cs-LG

Didactic to Constructive: Turning Expert Solutions into Learnable Reasoning [TOP LAB](arxiv.org)

2026-02-03|paper|arXiv

Improving the reasoning capabilities of large language models (LLMs) typically relies either on the model's ability to sample a correct solution to be reinforced or on the existence of a stronger mode...

cs-LG cs-AI

Personalized Image Generation via Human-in-the-loop Bayesian Optimization [TOP LAB](arxiv.org)

2026-02-03|paper|arXiv

Imagine Alice has a specific image $x^\ast$ in her mind, say, the view of the street in which she grew up during her childhood. To generate that exact image, she guides a generative model with multipl...

cs-CV cs-LG

Reward-free Alignment for Conflicting Objectives (arxiv.org)

2026-02-03|paper|arXiv

Direct alignment methods are increasingly used to align large language models (LLMs) with human preferences. However, many real-world alignment problems involve multiple conflicting objectives, where ...

cs-CL cs-AI cs-LG

VideoGPA: Distilling Geometry Priors for 3D-Consistent Video Generation (arxiv.org)

2026-02-02|paper|arXiv

While recent video diffusion models (VDMs) produce visually impressive results, they fundamentally struggle to maintain 3D structural consistency, often resulting in object deformation or spatial drif...

cs-CV cs-AI cs-LG

EditYourself: Audio-Driven Generation and Manipulation of Talking Head Videos with Diffusion Transformers [TOP LAB](arxiv.org)

2026-02-01|paper|arXiv

Current generative video models excel at producing novel content from text and image prompts, but leave a critical gap in editing existing pre-recorded videos, where minor alterations to the spoken sc...

cs-CV cs-GR cs-LG

SERA: Soft-Verified Efficient Repository Agents [TOP LAB](arxiv.org)

2026-01-30|paper|arXiv

Open-weight coding agents should hold a fundamental advantage over closed-source systems: they can be specialized to private codebases, encoding repository-specific information directly in their weigh...

cs-CL cs-LG cs-SE

Supervised Guidance Training for Infinite-Dimensional Diffusion Models [TOP LAB](arxiv.org)

2026-01-30|paper|arXiv

Score-based diffusion models have recently been extended to infinite-dimensional function spaces, with uses such as inverse problems arising from partial differential equations. In the Bayesian formul...

cs-LG

Evolutionary Strategies lead to Catastrophic Forgetting in LLMs (arxiv.org)

2026-01-30|paper|arXiv

One of the biggest missing capabilities in current AI systems is the ability to learn continuously after deployment. Implementing such continually learning systems has several challenges, one of whic...

cs-LG cs-AI cs-CL

Self-Distillation Enables Continual Learning (arxiv.org)

2026-01-29|paper|arXiv

Continual learning, enabling models to acquire new skills and knowledge without degrading existing capabilities, remains a fundamental challenge for foundation models. While on-policy reinforcement le...

cs-LG

Post-LayerNorm Is Back: Stable, ExpressivE, and Deep (arxiv.org)

2026-01-29|paper|arXiv

Large language model (LLM) scaling is hitting a wall. Widening models yields diminishing returns, and extending context length does not improve fundamental expressivity. In contrast, depth scaling off...

cs-LG cs-CL

POPE: Learning to Reason on Hard Problems via Privileged On-Policy Exploration [TOP LAB](arxiv.org)

2026-01-28|paper|arXiv

Reinforcement learning (RL) has improved the reasoning abilities of large language models (LLMs), yet state-of-the-art methods still fail to learn on many training problems. On hard problems, on-polic...

cs-LG cs-AI cs-CL

ctELM: Decoding and Manipulating Embeddings of Clinical Trials with Embedding Language Models (arxiv.org)

2026-01-28|paper|arXiv

Text embeddings have become an essential part of a variety of language applications. However, methods for interpreting, exploring and reversing embedding spaces are limited, reducing transparency and ...

cs-CL cs-AI cs-LG

Reuse your FLOPs: Scaling RL on Hard Problems by Conditioning on Very Off-Policy Prefixes (arxiv.org)

2026-01-28|paper|arXiv

Typical reinforcement learning (RL) methods for LLM reasoning waste compute on hard problems, where correct on-policy traces are rare, policy gradients vanish, and learning stalls. To bootstrap more e...

cs-LG cs-AI cs-CL

Subword-Based Comparative Linguistics across 242 Languages Using Wikipedia Glottosets (arxiv.org)

2026-01-28|paper|arXiv

We present a large-scale comparative study of 242 Latin and Cyrillic-script languages using subword-based methodologies. By constructing 'glottosets' from Wikipedia lexicons, we introduce a framework ...

cs-CL cs-AI cs-LG

DInf-Grid: A Neural Differential Equation Solver with Differentiable Feature Grids (arxiv.org)

2026-01-16|paper|arXiv

We present a novel differentiable grid-based representation for efficiently solving differential equations (DEs). Widely used architectures for neural solvers, such as sinusoidal neural networks, are ...

cs-LG

Exploring Fine-Tuning for Tabular Foundation Models [TOP LAB](arxiv.org)

2026-01-15|paper|arXiv

Tabular Foundation Models (TFMs) have recently shown strong in-context learning capabilities on structured data, achieving zero-shot performance comparable to traditional machine learning methods. We ...

cs-LG

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning (arxiv.org)

2026-01-15|paper|arXiv

Vision-Language-Action (VLA) tasks require reasoning over complex visual scenes and executing adaptive actions in dynamic environments. While recent studies on reasoning VLAs show that explicit chain-...

cs-CV cs-AI cs-LG

Accelerated Methods with Complexity Separation Under Data Similarity for Federated Learning Problems [TOP LAB](arxiv.org)

2026-01-14|paper|arXiv

Heterogeneity within data distribution poses a challenge in many modern federated learning tasks. We formalize it as an optimization problem involving a computationally heavy composite under data simi...

math-OC cs-LG

PFT: Phonon Fine-tuning for Machine Learned Interatomic Potentials [TOP LAB](arxiv.org)

2026-01-13|paper|arXiv

Many materials properties depend on higher-order derivatives of the potential energy surface, yet machine learned interatomic potentials (MLIPs) trained with a standard loss on energy, force,...

cond-mat-mtrl-sci cs-LG

Manifold limit for the training of shallow graph convolutional neural networks (arxiv.org)

2026-01-12|paper|arXiv

We study the discrete-to-continuum consistency of the training of shallow graph convolutional neural networks (GCNNs) on proximity graphs of sampled point clouds under a manifold assumption. Graph con...

stat-ML cs-LG math-FA
