The AI Wire

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization (huggingface.co)

2026-06-02|model|huggingface

Vision-language models serve as teachers to distill video reasoning capabilities into smaller student models via adaptive optimization at test time.

LongAttnComp: Cross-Family Context Compression for Long-Context Reasoning (huggingface.co)

2026-06-02|model|huggingface

A cross-family context compression method reduces long-context input length for reasoning models by identifying and retaining attended tokens across different model architectures.

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs (huggingface.co)

2026-06-02|model|huggingface

Empirical analysis identifies conditions under which multi-agent reinforcement learning improves LLM-based workflows, examining workflow structure, scale, and policy-sharing strategies.

StressDream: Steering Video World Models for Robust Policy Evaluation and Improvement (huggingface.co)

2026-06-02|model|huggingface

A video world model is steered to synthesize stress-test scenarios for evaluating and improving robustness of learned policies under challenging conditions.

Thinking in Blender: Staged Executable Inverse Graphics with Vision-Language Models (huggingface.co)

2026-06-02|model|huggingface

Vision-language models decompose inverse graphics into staged executable steps within Blender to recover 3D scene structure and attributes from images.

Not only where, But when: Temporal Scheduling for RLVR (huggingface.co)

2026-06-02|model|huggingface

A temporal scheduling strategy for RLVR determines not only which training samples to use but when during training to apply them for optimal reasoning improvement.

3DCodeBench: Benchmarking Agentic Procedural 3D Modeling Via Code (huggingface.co)

2026-06-02|model|huggingface

Introduces a benchmark evaluating AI agents that generate procedural 3D models through code, measuring their ability to produce correct geometric outputs programmatically.

Multi-Agent Computer Use (huggingface.co)

2026-06-02|model|huggingface

Presents a framework where multiple AI agents collaborate to operate computer interfaces, distributing GUI interaction tasks across specialized agents.

Joint Agent Memory and Exploration Learning via Novelty Signals (huggingface.co)

2026-06-02|model|huggingface

Proposes a method that jointly trains agent memory and exploration behavior using novelty-based signals to improve navigation and discovery in unknown environments.

Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning (huggingface.co)

2026-06-02|model|huggingface

Uses unmodified LLMs to score intermediate reasoning steps in math problems at inference time, replacing trained process reward models without any additional training.

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents (huggingface.co)

2026-06-02|model|huggingface

Releases an open framework for training visual web agents with online multi-turn RL, clarifying implementation details that enable agents to learn from live browser interactions.

MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation (huggingface.co)

2026-06-02|model|huggingface

Benchmarks LLM agents on personal productivity tasks by simulating realistic personal data environments, testing performance on real-world applications like calendars and email.

Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked (simonwillison.net)

2026-06-02|news|blog/Simon Willison

Reports that attackers used social engineering prompts to manipulate Meta AI into granting unauthorized access to high-profile Instagram accounts.

Pasted File Editor (simonwillison.net)

2026-06-02|news|blog/Simon Willison

Describes a tool or feature enabling users to directly edit files that have been pasted into an interface, streamlining in-context file modification.

Beyond LLMs: Why Scalable Enterprise AI Adoption Depends on Agent Logic (huggingface.co)

2026-06-02|news|blog/Hugging Face Blog

Argues that enterprise AI scaling bottlenecks stem from agent orchestration logic rather than LLM capability, advocating for purpose-built agent architectures over raw model scaling.

Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains (huggingface.co)

2026-06-02|news|blog/Hugging Face Blog

JetBrains releases Mellum2, a 12-billion-parameter mixture-of-experts language model, likely targeting developer-focused coding and IDE assistance tasks.

Building the infrastructure for the Intelligence Age in Michigan (openai.com)

2026-06-02|news|blog/OpenAI Blog

Announces infrastructure investment in Michigan to build data centers or computing facilities supporting AI workloads as part of a broader national AI build-out.

Our views on AI policy and political advocacy (openai.com)

2026-06-02|news|blog/OpenAI Blog

Articulates OpenAI's official positions on AI governance policy and the boundaries of appropriate political engagement or lobbying activity.

Announcements, Jun 1, 2026: Anthropic confidentially submits draft S-1 to the SEC (anthropic.com)

2026-06-02|news|blog/Anthropic News

Anthropic has filed a confidential draft S-1 registration statement with the SEC, initiating the regulatory process toward a potential public offering.

@xai: Composer 2.5 is now available inside Grok Build.... (x.com)

2026-06-02|news|twitter-bookmarks

xAI has released Composer 2.5, a code/content composition tool, now integrated into the Grok Build development environment.

strace-ui, Bonsai_term, and the TUI renaissance (blog.janestreet.com)

2026-06-02|news|hackernews

A survey or advocacy piece covers the resurgence of terminal user interface tools, highlighting strace-ui and Bonsai_term as examples of the TUI revival.

Show HN: AI Simulation Based on FEP (aic-ai-lab.site)

2026-06-02|news|hackernews

A system simulates agent behavior or cognition using the Free Energy Principle as the computational and theoretical foundation.

1-Bit Bonsai Image 4B Image Generation for Local Devices (prismml.com)

2026-06-01|news|hackernews

A 1-bit quantized 4B-parameter image generation model optimized to run locally on consumer devices with minimal memory and compute.

United Airlines 767 returns to Newark after Bluetooth name sparks alert (simpleflying.com)

2026-06-01|news|hackernews

A United Airlines 767 returned to Newark after a passenger's Bluetooth device name triggered a security alert onboard.

ChatGPT for Google Sheets exfiltrates workbooks (promptarmor.com)

2026-06-01|news|hackernews

A vulnerability in ChatGPT's Google Sheets integration allows malicious prompts to exfiltrate spreadsheet data to external parties.

The Speed of Prototyping in the Age of AI (darylcecile.net)

2026-06-01|news|hackernews

An analysis of how AI tools have dramatically accelerated the software prototyping cycle, reducing time from concept to working demo.

What if remote working, not AI, is to blame for weak junior hiring? (ft.com)

2026-06-01|news|hackernews

An argument that remote work reduced mentorship and visibility for junior employees, explaining weak junior hiring better than AI displacement does.

langchain-ai/langchain (138165 stars): The agent engineering platform. (github.com)

2026-06-01|tool|github

LangChain provides a framework for building LLM-powered agents and chains, abstracting prompt management, tool use, and memory.

open-webui/open-webui (139448 stars): User-friendly AI Interface (Supports Ollama, OpenAI API, ...) (github.com)

2026-06-01|tool|github

Open WebUI delivers a self-hosted browser interface for interacting with local and API-based LLMs including Ollama and OpenAI-compatible endpoints.

langgenius/dify (143346 stars): Production-ready platform for agentic workflow development. (github.com)

2026-06-01|tool|github

Dify provides a production-ready platform for designing, deploying, and managing agentic LLM workflows with built-in orchestration tooling.
