The AI Wire

5155 articles — page 24 of 172

firecrawl/firecrawl (124902 stars): 🔥 Search, scrape, and clean the web for AI agents. (github.com)

2026-05-27|tool|github

Firecrawl provides web search, scraping, and content-cleaning capabilities purpose-built for feeding structured data to AI agents.

langgenius/dify (142791 stars): Production-ready platform for agentic workflow development. (github.com)

2026-05-27|tool|github

Dify is a production-ready platform for designing, deploying, and managing agentic workflows that combine LLMs with tools and data sources.

MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation (huggingface.co)

2026-05-27|model|huggingface

MUSE-Autoskill enables agents to autonomously create, store, manage, and evaluate reusable skills, allowing continuous self-improvement without human intervention.

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding (huggingface.co)

2026-05-27|model|huggingface

LocateAnything accelerates vision-language grounding by decoding bounding boxes in parallel rather than sequentially, improving both speed and localization quality.

Squeezing Capacity from Multimodal Large Language Models for Subject-driven Generation (huggingface.co)

2026-05-27|model|huggingface

A method extracts latent capacity from multimodal LLMs to generate images conditioned on specific subjects without requiring dedicated subject-driven generation architectures.

MRT: Masked Region Transformer for Layered Image Generation and Editing at Scale (huggingface.co)

2026-05-27|model|huggingface

A transformer architecture uses masked region modeling to generate and edit images as separate compositable layers at scale.

Normal Guidance is what Attention Needs (arxiv.org)

2026-05-27|paper|arxiv

Incorporates surface normal vectors as guidance signals into attention mechanisms to improve geometry-aware feature learning in vision models.

Real Images, Worse Judgments: Evaluating Vision-Language Models on Concreteness and Imagery (arxiv.org)

2026-05-27|paper|arxiv

Shows that vision-language models judge concreteness and imagery properties less accurately on real photographs than on synthetic or abstract stimuli.

Towards Controllable Image Generation through Representation-Conditioned Diffusion Models (arxiv.org)

2026-05-27|paper|arxiv

Conditions diffusion-based image generation on learned internal representations to enable fine-grained, controllable synthesis of image content.

China Clamps Down on Overseas Travel for AI Talent at Alibaba, DeepSeek (ibtimes.sg)

2026-05-27|news|reddit/LocalLLaMA

China is restricting international travel for AI researchers employed at Alibaba and DeepSeek, tightening controls on the movement of AI talent abroad.

A rare look inside Qwen 3.7’s open source model release approval process: (i.redd.it)

2026-05-27|news|reddit/LocalLLaMA

An inside account reveals the internal review and approval process Alibaba's Qwen team follows before publicly releasing open-source model weights.

@zxlzr: 🚀 Excited to share our latest progress on scientific idea evaluation!... (x.com)

2026-05-27|news|twitter-bookmarks

The post announces new work on automatically evaluating the quality and novelty of scientific research ideas.

ollama/ollama (172390 stars): Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemm (github.com)

2026-05-27|tool|github

Ollama provides a local runtime for downloading and running large language models, including Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, and Qwen, on personal hardware.

Does Seeing More Mean Knowing More? Mono-Anchored Advantage Normalization for Multi-Source Visual Reasoning (huggingface.co)

2026-05-27|model|huggingface

Mono-Anchored Advantage Normalization examines whether additional visual inputs genuinely improve reasoning, normalizing advantage estimates against a single-source anchor.

Microsoft Copilot Cowork Exfiltrates Files (simonwillison.net)

2026-05-27|news|blog/Simon Willison

A security report details how Microsoft Copilot's Cowork feature can be exploited to exfiltrate files from user environments.

Modeling Agentic Technical Debt and Stochastic Tax: A Standalone Framework for Measurement, Simulation, and Dashboarding (arxiv.org)

2026-05-27|paper|arxiv

Provides a standalone framework that measures, simulates, and visualizes in dashboards the technical debt and stochastic cost accumulation specific to agentic AI systems.

Governed Evolution of Agent Runtimes through Executable Operational Cognition (arxiv.org)

2026-05-27|paper|arxiv

Defines a framework for governing AI agent runtime behavior through formally executable cognitive policies that constrain and direct agent actions.

FinHarness: An Inline Lifecycle Safety Harness for Finance LLM Agents (arxiv.org)

2026-05-27|paper|arxiv

Introduces an inline safety harness that monitors and enforces lifecycle-level constraints on LLM-based finance agents during execution.

From Scores to Gibbs Correctors: Accelerating Uniform-Rate Discrete Diffusion Models (arxiv.org)

2026-05-27|paper|arxiv

Derives Gibbs-correction terms from learned score functions to accelerate sampling in uniform-rate discrete diffusion models without retraining.

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding (arxiv.org)

2026-05-27|paper|arxiv

Achieves fast, high-quality visual grounding by decoding bounding boxes in parallel rather than sequentially, accelerating vision-language object localization.

Qwen3.5 35B A3B uncensored heretic Native MTP Preserved is Out Now With the Full 785 MTPs Preserved and Retained, Available in Safetensors, GGUFs. NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats (huggingface.co)

2026-05-27|news|reddit/LocalLLaMA

An uncensored Qwen3.5 35B A3B variant that preserves native multi-token prediction (all 785 MTPs, per the release) is available in Safetensors, GGUF, NVFP4, NVFP4 GGUF, and GPTQ-Int4 formats.

Turning local agents into self-optimizing agents (i.redd.it)

2026-05-27|news|reddit/LocalLLaMA

A technique converts standard local AI agents into ones that iteratively improve their own behavior through self-optimization feedback loops.

langchain-ai/langchain (137735 stars): The agent engineering platform. (github.com)

2026-05-27|tool|github

LangChain offers a framework and tooling for building, orchestrating, and deploying LLM-powered agents and multi-step reasoning pipelines.

open-webui/open-webui (138814 stars): User-friendly AI Interface (Supports Ollama, OpenAI API, ...) (github.com)

2026-05-27|tool|github

Open WebUI delivers a self-hosted, user-friendly chat interface compatible with locally-run Ollama models and remote OpenAI-compatible APIs.

Significant-Gravitas/AutoGPT (184577 stars): AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our (github.com)

2026-05-27|tool|github

AutoGPT is an open-source platform enabling non-technical users to create and deploy autonomous AI agents without writing code.

Soap2Soap: Long Cinematic Video Remaking via Multi-Agent Collaboration (huggingface.co)

2026-05-27|model|huggingface

Soap2Soap remakes long cinematic videos by coordinating multiple specialized agents that collaboratively handle narrative, style, and temporal consistency across extended sequences.

LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV (huggingface.co)

2026-05-27|model|huggingface

A unified evaluation framework benchmarks minute-scale audio-visual generation across text-to-AV, image-to-AV, and video-to-AV tasks.

Probing Cultural Awareness in LLMs: A Case Study of Cross-Culture Aesthetic Stylistics (arxiv.org)

2026-05-27|paper|arxiv

Evaluates LLMs' knowledge of culturally specific aesthetic and stylistic conventions across multiple cultures to quantify cross-cultural awareness gaps.

Greening AI Inference with Accuracy and Latency-aware User Incentives (arxiv.org)

2026-05-27|paper|arxiv

Designs user incentive schemes that trade off inference accuracy and latency to shift AI workloads toward low-carbon energy availability windows.

Probabilistic Smoothing with Ratio-Monotone Transforms for Global Optimization (arxiv.org)

2026-05-27|paper|arxiv

Applies ratio-monotone transforms to probabilistically smooth non-convex objective landscapes, enabling more reliable convergence in global optimization problems.
