Apple reveals new AI architecture built around Google Gemini models. Apple unveils a new AI architecture with Google Gemini at its core, signaling a deeper partnership for on-device and cloud intelligence. MacRumors

iOSWorld: A Benchmark for Personally Intelligent Phone Agents. Introduces a comprehensive benchmark for evaluating personally intelligent agents on iOS. arXiv

Preserving Plasticity in Continual Learning via Dynamical Isometry. Proposes a method to preserve plasticity in continual learning through dynamical isometry principles. arXiv

SIGA: Self-Evolving Coding-Agent Adapters for Scientific Simulation. Presents a framework for self-evolving adapters that improve coding agents on scientific simulation tasks. arXiv

Discovering Functionally Selective Brain Regions with a Deep Topographic Multimodal Model. Proposes a deep topographic multimodal model for discovering functionally selective brain regions. arXiv

MiMo-v2.5-Pro-UltraSpeed: 1T model at 1000 tokens/second. Xiaomi's latest 1T-parameter model claims 1000 tokens per second inference speed. Xiaomi

FASE: Fast Adaptive Semantic Entropy for Code Quality. Proposes a fast adaptive semantic entropy method for assessing code quality. arXiv

Packed Twin Inference: 2× tokens/sec on MI50. Exploits spare compute to nearly double tokens/sec on consumer AMD hardware. GitHub

llama.cpp CLI Command Builder. Web tool for constructing llama.cpp command-line invocations. llamabuilding.com

Apple bets cheaper AI will woo small developers. Apple positions lower-cost AI offerings to attract indie and small-shop developers. TechCrunch

xAI is looking more like a datacentre REIT than a frontier lab. Analysis argues that xAI's business model increasingly resembles a datacenter real-estate operation rather than a frontier model lab. martinalderson.com

Anthropic: Why has AI advanced faster in coding than biology? New Anthropic science blog post explores the asymmetry of AI progress across domains. X / @AnthropicAI

ArXiv to ban researchers for a year if they submit AI slop. New arXiv policy cracks down on low-quality AI-generated submissions. 404 Media

Hivemind launches a shared brain for coding agents. New approach lets coding agents share knowledge across instances. X / @dr_cintas

JetBrains Mellum 2: a really good and performant model. Community discussion highlights Mellum 2 as a strong local coding model. r/LocalLLaMA

Silx-AI Quasar-Preview: 5M context length. New open-weight model preview offers an exceptionally long context window. r/LocalLLaMA

LocalLLaMA post tier list. Community-curated ranking of recurring post types on the subreddit. r/LocalLLaMA