The AI Wire

A Comprehensive Dataset for Human vs. AI Generated Image Detection [TOP LAB](arxiv.org)

2026-01-05|paper|arXiv

Multimodal generative AI systems like Stable Diffusion, DALL-E, and MidJourney have fundamentally changed how synthetic images are created. These tools drive innovation but also enable the spread of m...

cs-CV cs-AI

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time (arxiv.org)

2026-01-04|paper|arXiv

We present SpaceTimePilot, a video diffusion model that disentangles space and time for controllable generative rendering. Given a monocular video, SpaceTimePilot can independently alter the camera vi...

cs-CV cs-AI cs-RO

AI-Driven Cloud Resource Optimization for Multi-Cluster Environments [TOP LAB](arxiv.org)

2026-01-04|paper|arXiv

Modern cloud-native systems increasingly rely on multi-cluster deployments to support scalability, resilience, and geographic distribution. However, existing resource management approaches remain larg...

cs-DC cs-AI

Video and Language Alignment in 2D Systems for 3D Multi-object Scenes with Multi-Information Derivative-Free Control [TOP LAB](arxiv.org)

2026-01-04|paper|arXiv

Cross-modal systems trained on 2D visual inputs are presented with a dimensional shift when processing 3D scenes. An in-scene camera bridges the dimensionality gap but requires learning a control modu...

cs-CV cs-AI

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time (arxiv.org)

2026-01-03|paper|arXiv

We present SpaceTimePilot, a video diffusion model that disentangles space and time for controllable generative rendering. Given a monocular video, SpaceTimePilot can independently alter the camera vi...

cs-CV cs-AI cs-RO

AI-Driven Cloud Resource Optimization for Multi-Cluster Environments [TOP LAB](arxiv.org)

2026-01-03|paper|arXiv

Modern cloud-native systems increasingly rely on multi-cluster deployments to support scalability, resilience, and geographic distribution. However, existing resource management approaches remain larg...

cs-DC cs-AI

Video and Language Alignment in 2D Systems for 3D Multi-object Scenes with Multi-Information Derivative-Free Control [TOP LAB](arxiv.org)

2026-01-03|paper|arXiv

Cross-modal systems trained on 2D visual inputs are presented with a dimensional shift when processing 3D scenes. An in-scene camera bridges the dimensionality gap but requires learning a control modu...

cs-CV cs-AI

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time (arxiv.org)

2026-01-02|paper|arXiv

We present SpaceTimePilot, a video diffusion model that disentangles space and time for controllable generative rendering. Given a monocular video, SpaceTimePilot can independently alter the camera vi...

cs-CV cs-AI cs-RO

AI-Driven Cloud Resource Optimization for Multi-Cluster Environments [TOP LAB](arxiv.org)

2026-01-02|paper|arXiv

Modern cloud-native systems increasingly rely on multi-cluster deployments to support scalability, resilience, and geographic distribution. However, existing resource management approaches remain larg...

cs-DC cs-AI

Video and Language Alignment in 2D Systems for 3D Multi-object Scenes with Multi-Information Derivative-Free Control [TOP LAB](arxiv.org)

2026-01-02|paper|arXiv

Cross-modal systems trained on 2D visual inputs are presented with a dimensional shift when processing 3D scenes. An in-scene camera bridges the dimensionality gap but requires learning a control modu...

cs-CV cs-AI

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time (arxiv.org)

2026-01-01|paper|arXiv

We present SpaceTimePilot, a video diffusion model that disentangles space and time for controllable generative rendering. Given a monocular video, SpaceTimePilot can independently alter the camera vi...

cs-CV cs-AI cs-RO

AI-Driven Cloud Resource Optimization for Multi-Cluster Environments [TOP LAB](arxiv.org)

2026-01-01|paper|arXiv

Modern cloud-native systems increasingly rely on multi-cluster deployments to support scalability, resilience, and geographic distribution. However, existing resource management approaches remain larg...

cs-DC cs-AI

Video and Language Alignment in 2D Systems for 3D Multi-object Scenes with Multi-Information Derivative-Free Control [TOP LAB](arxiv.org)

2026-01-01|paper|arXiv

Cross-modal systems trained on 2D visual inputs are presented with a dimensional shift when processing 3D scenes. An in-scene camera bridges the dimensionality gap but requires learning a control modu...

cs-CV cs-AI

AI tutoring can safely and effectively support students: An exploratory RCT in UK classrooms [TOP LAB](arxiv.org)

2025-12-31|paper|arXiv

One-to-one tutoring is widely considered the gold standard for personalized education, yet it remains prohibitively expensive to scale. To evaluate whether generative AI might help expand access to th...

cs-CY cs-AI cs-LG

AI tutoring can safely and effectively support students: An exploratory RCT in UK classrooms [TOP LAB](arxiv.org)

2025-12-30|paper|arXiv

One-to-one tutoring is widely considered the gold standard for personalized education, yet it remains prohibitively expensive to scale. To evaluate whether generative AI might help expand access to th...

cs-CY cs-AI cs-LG

A Real-World Evaluation of LLM Medication Safety Reviews in NHS Primary Care [TOP LAB](arxiv.org)

2025-12-27|paper|arXiv

Large language models (LLMs) often match or exceed clinician-level performance on medical benchmarks, yet very few are evaluated on real clinical data or examined beyond headline metrics. We present, ...

cs-AI

A Real-World Evaluation of LLM Medication Safety Reviews in NHS Primary Care [TOP LAB](arxiv.org)

2025-12-26|paper|arXiv

Large language models (LLMs) often match or exceed clinician-level performance on medical benchmarks, yet very few are evaluated on real clinical data or examined beyond headline metrics. We present, ...

cs-AI

A Real-World Evaluation of LLM Medication Safety Reviews in NHS Primary Care [TOP LAB](arxiv.org)

2025-12-25|paper|arXiv

Large language models (LLMs) often match or exceed clinician-level performance on medical benchmarks, yet very few are evaluated on real clinical data or examined beyond headline metrics. We present, ...

cs-AI

LongVideoAgent: Multi-Agent Reasoning with Long Videos (arxiv.org)

2025-12-24|paper|arXiv

Recent advances in multimodal LLMs and systems that use tools for long-video QA point to the promise of reasoning over hour-long episodes. However, many methods still compress content into lossy summa...

cs-AI cs-CV cs-LG

Benchmarking LLMs for Predictive Applications in the Intensive Care Units [TOP LAB](arxiv.org)

2025-12-24|paper|arXiv

With the advent of LLMs, various tasks across the natural language processing domain have been transformed. However, their application in predictive tasks remains less researched. This study compares ...

cs-AI

Graph-Symbolic Policy Enforcement and Control (G-SPEC): A Neuro-Symbolic Framework for Safe Agentic AI in 5G Autonomous Networks [TOP LAB](arxiv.org)

2025-12-24|paper|arXiv

As networks evolve toward 5G Standalone and 6G, operators face orchestration challenges that exceed the limits of static automation and Deep Reinforcement Learning. Although Large Language Model (LLM)...

cs-AI cs-NI

Beyond CLIP: Knowledge-Enhanced Multimodal Transformers for Cross-Modal Alignment in Diabetic Retinopathy Diagnosis [TOP LAB](arxiv.org)

2025-12-23|paper|arXiv

Diabetic retinopathy (DR) is a leading cause of preventable blindness worldwide, demanding accurate automated diagnostic systems. While general-domain vision-language models like Contrastive Language-...

cs-CV cs-AI

Scalably Enhancing the Clinical Validity of a Task Benchmark with Physician Oversight (arxiv.org)

2025-12-23|paper|arXiv

Automating the calculation of clinical risk scores offers a significant opportunity to reduce physician administrative burden and enhance patient care. The current standard for evaluating this capabil...

cs-AI stat-AP

MGRegBench: A Novel Benchmark Dataset with Anatomical Landmarks for Mammography Image Registration [TOP LAB](arxiv.org)

2025-12-22|paper|arXiv

Robust mammography registration is essential for clinical applications like tracking disease progression and monitoring longitudinal changes in breast tissue. However, progress has been limited by the...

cs-CV cs-AI

GenEval 2: Addressing Benchmark Drift in Text-to-Image Evaluation [TOP LAB](arxiv.org)

2025-12-21|paper|arXiv

Automating Text-to-Image (T2I) model evaluation is challenging; a judge model must be used to score correctness, and test prompts must be selected to be challenging for current T2I models but not the ...

cs-CV cs-AI

Grammar-Forced Translation of Natural Language to Temporal Logic using LLMs [TOP LAB](arxiv.org)

2025-12-21|paper|arXiv

Translating natural language (NL) into a formal language such as temporal logic (TL) is integral for human communication with robots and autonomous systems. State-of-the-art approaches decompose the t...

cs-CL cs-AI

GenEval 2: Addressing Benchmark Drift in Text-to-Image Evaluation [TOP LAB](arxiv.org)

2025-12-20|paper|arXiv

Automating Text-to-Image (T2I) model evaluation is challenging; a judge model must be used to score correctness, and test prompts must be selected to be challenging for current T2I models but not the ...

cs-CV cs-AI

Grammar-Forced Translation of Natural Language to Temporal Logic using LLMs [TOP LAB](arxiv.org)

2025-12-20|paper|arXiv

Translating natural language (NL) into a formal language such as temporal logic (TL) is integral for human communication with robots and autonomous systems. State-of-the-art approaches decompose the t...

cs-CL cs-AI

GenEval 2: Addressing Benchmark Drift in Text-to-Image Evaluation [TOP LAB](arxiv.org)

2025-12-19|paper|arXiv

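Automating Text-to-Image (T2I) model evaluation is challenging; a judge model must be used to score correctness, and test prompts must be selected to be challenging for current T2I models but not the ...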

cs-CV cs-AI

Grammar-Forced Translation of Natural Language to Temporal Logic using LLMs [TOP LAB](arxiv.org)

2025-12-19|paper|arXiv

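Translating natural language (NL) into a formal language such as temporal logic (TL) is integral for human communication with robots and autonomous systems. State-of-the-art approaches decompose the t...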

cs-CL cs-AI