The AI Wire

RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video (arxiv.org)

2026-06-01|paper|arxiv

A scalable self-supervised framework synthesizes novel views from real-world video by leveraging ray-based representations without requiring posed multi-view supervision.

Automated Prediction of Postoperative Pancreatic Fistula Using Preoperative Computed Tomography (arxiv.org)

2026-06-01|paper|arxiv

Trains a model on preoperative CT scans to automatically predict which patients will develop pancreatic fistula after surgery, enabling earlier risk stratification.

Preference-Aware Rubric Learning for Personalized Evaluation (arxiv.org)

2026-06-01|paper|arxiv

Learns personalized evaluation rubrics that align automated scoring with individual human evaluator preferences rather than a single universal standard.

The Dynamic-Probabilistic Consistency Gap in Chaotic Surrogate Modeling (arxiv.org)

2026-06-01|paper|arxiv

Quantifies and characterizes the gap between dynamic trajectory accuracy and probabilistic distributional consistency in surrogate models trained on chaotic systems.

Semantic Triplet Restoration: A Novel Protocol for Hierarchical Table Understanding in Large Language Models (arxiv.org)

2026-06-01|paper|arxiv

A protocol restores corrupted or incomplete semantic triplets (subject-relation-object) to train LLMs for hierarchical table structure understanding.

Vision-Language Models Suppress Female Representations Under Ambiguous Input (arxiv.org)

2026-06-01|paper|arxiv

Vision-language models systematically reduce female representation in generated or retrieved outputs when input prompts are gender-ambiguous.

Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization (arxiv.org)

2026-06-01|paper|arxiv

Categorizes attention heads as positional or symbolic and analyzes how RoPE geometry shapes their learning dynamics and their ability to generalize to longer sequences.

Functional Attention: From Pairwise Affinities to Functional Correspondences (arxiv.org)

2026-06-01|paper|arxiv

Attention is reformulated to produce functional correspondences between continuous representations rather than discrete pairwise similarity scores between token pairs.

What Am I Missing? Question-Answering as Hidden State Probing (arxiv.org)

2026-06-01|paper|arxiv

Question-answering tasks are used as probes to reveal what information is encoded or absent in hidden states of language models.

Effective Biological Representation Learning by Masking Gene Expression (arxiv.org)

2026-06-01|paper|arxiv

Masked gene expression pretraining learns biologically meaningful representations by forcing models to reconstruct randomly masked expression values.

Disagreeing Rationales: Rethinking Classification and Explainability Evaluation in Hate Speech Detection (arxiv.org)

2026-06-01|paper|arxiv

Classification labels and rationale-based explanations in hate speech detection are shown to frequently disagree, challenging standard explainability evaluation practices.

What Gets Unmasked First? Trajectory Analysis of Diffusion Models for Graph-to-Text Generation (arxiv.org)

2026-06-01|paper|arxiv

Trajectory analysis of discrete diffusion models reveals which graph elements (nodes, edges, labels) get unmasked earliest during graph-to-text generation.

SPECTRA: Synthetic IR Test Collections with Relevance Oracles and Controlled Distractor Diagnostics (arxiv.org)

2026-06-01|paper|arxiv

SPECTRA generates synthetic information retrieval test collections with ground-truth relevance oracles and controlled distractor documents for diagnostic evaluation of IR systems.

Giving Sensors a Voice: Multimodal JEPA for Semantic Time-Series Embeddings (arxiv.org)

2026-06-01|paper|arxiv

A multimodal Joint Embedding Predictive Architecture encodes heterogeneous sensor time-series into semantically meaningful shared embeddings without requiring labeled data.

Choosing the Lens: Strategic Perspective Activation in Context-Dependent Argumentation (arxiv.org)

2026-06-01|paper|arxiv

A system strategically selects which argumentative perspective to activate based on conversational context to construct more effective context-dependent arguments.

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards (arxiv.org)

2026-06-01|paper|arxiv

LLMs are fine-tuned for long-context reasoning using search agent trajectories as training data, with rubric-based rewards guiding reinforcement learning.

Language Models Learn Constructional Semantics, Not To Mention Syntax: Investigating LM Understanding of Paired-Focus Constructions (arxiv.org)

2026-06-01|paper|arxiv

Probes whether language models encode construction-grammar meaning (e.g., paired-focus constructions like 'not to mention') beyond surface syntax, evaluating grammatical versus semantic understanding.

TunerDiT: Training-free Progressive Steering of Diffusion Transformer for Multi-Event Video Generation (arxiv.org)

2026-06-01|paper|arxiv

Steers a pretrained Diffusion Transformer at inference time using progressive guidance signals to generate videos containing multiple distinct events without any additional training.

Stateful Online Monitoring Catches Distributed Agent Attacks (arxiv.org)

2026-06-01|paper|arxiv

Presents a stateful runtime monitor that detects adversarial attacks on distributed multi-agent systems by tracking agent state sequences rather than inspecting individual messages in isolation.

A Tight Theory of Error Feedback Algorithms in Distributed Optimization (arxiv.org)

2026-06-01|paper|arxiv

Derives tight convergence bounds for error-feedback gradient compression algorithms in distributed optimization, closing gaps between existing upper and lower complexity analyses.

KLIP: localized distribution shift detection via KL-divergence with diffusion priors in Inverse Problems (arxiv.org)

2026-06-01|paper|arxiv

Detects localized distribution shifts in inverse-problem measurements by measuring the KL divergence between observed data and diffusion-model-learned priors over spatial regions.

Lumos-Nexus: Efficient Frequency Bridging with Homogeneous Latent Space for Video Unified Models (arxiv.org)

2026-06-01|paper|arxiv

Unifies video understanding and generation in a single model by bridging low- and high-frequency temporal signals within a shared homogeneous latent space for efficiency.

Domain expertise has always been the real moat (brethorsting.com)

2026-05-31|news|hackernews

Argues that deep domain knowledge remains a defensible competitive advantage over generic AI capabilities in specialized industries.

OpenRouter raises $113M Series B (openrouter.ai)

2026-05-31|news|hackernews

OpenRouter, the unified LLM API routing platform, secured $113 million in Series B funding to scale its infrastructure.

bytedance/deer-flow (70044 stars): An open-source long-horizon SuperAgent harness that researches, codes, and creat (github.com)

2026-05-31|tool|github

ByteDance's open-source DeerFlow agent autonomously conducts multi-step research, writes and executes code, and generates long-form content.

daytonaio/daytona (72515 stars): Daytona is a Secure and Elastic Infrastructure for Running AI-Generated Code (github.com)

2026-05-31|tool|github

Daytona provides sandboxed, scalable infrastructure for safely executing AI-generated code in isolated, production-grade environments.

thedaviddias/Front-End-Checklist (72756 stars): 🗂 The essential checklist for modern web development, for humans and AI agents (github.com)

2026-05-31|tool|github

A structured checklist covering modern front-end development best practices, usable by both human developers and AI coding agents.

thedotmack/claude-mem (79795 stars): Persistent Context Across Sessions for Every Agent – Captures everything your a (github.com)

2026-05-31|tool|github

claude-mem adds persistent memory to Claude-based agents by storing and retrieving context across separate conversation sessions.

infiniflow/ragflow (81578 stars): RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine tha (github.com)

2026-05-31|tool|github

RAGFlow is an open-source RAG engine enabling document ingestion, retrieval, and generation pipelines for building knowledge-grounded AI applications.

google-gemini/gemini-cli (104768 stars): An open-source AI agent that brings the power of Gemini directly into your termi (github.com)

2026-05-31|tool|github

Google's open-source CLI agent integrates Gemini models directly into the terminal for interactive coding, file manipulation, and task automation.
