Top story
Apple reveals new AI architecture built around Google Gemini models. The design puts Google Gemini at its core, signaling a deeper partnership for on-device and cloud intelligence. MacRumors
Research
iOSWorld: A Benchmark for Personally Intelligent Phone Agents. Introduces a comprehensive benchmark for evaluating personally intelligent agents on iOS. arXiv
Preserving Plasticity in Continual Learning via Dynamical Isometry. Proposes a method to preserve plasticity in continual learning through dynamical isometry principles. arXiv
SIGA: Self-Evolving Coding-Agent Adapters for Scientific Simulation. Presents a framework for self-evolving adapters that improve coding agents on scientific simulation tasks. arXiv
Discovering Functionally Selective Brain Regions with a Deep Topographic Multimodal Model. Proposes a deep topographic multimodal model for discovering functionally selective brain regions. arXiv
Tools
MiMo-v2.5-Pro-UltraSpeed: 1T model at 1000 tokens/second. Xiaomi claims its latest 1T-parameter model reaches 1000 tokens per second at inference. Xiaomi
FASE: Fast Adaptive Semantic Entropy for Code Quality. Proposes a fast adaptive semantic entropy method for assessing code quality. arXiv
Packed Twin Inference: 2× tokens/sec on MI50. Exploits spare compute to nearly double tokens/sec on AMD's MI50 GPU. GitHub
llama.cpp CLI Command Builder. Web tool for constructing llama.cpp command-line invocations. llamabuilding.com
Industry
Apple bets cheaper AI will woo small developers. Apple positions lower-cost AI offerings to attract indie and small-shop developers. TechCrunch
xAI is looking more like a datacentre REIT than a frontier lab. Analysis argues xAI's business model increasingly resembles datacenter real estate over frontier model research. martinalderson.com
Anthropic: Why has AI advanced faster in coding than biology? New Anthropic science blog post explores the asymmetry of AI progress across domains. X / @AnthropicAI
ArXiv to ban researchers for a year if they submit AI slop. New arXiv policy cracks down on low-quality AI-generated submissions. 404 Media
Community
Hivemind launches shared-brain for coding agents. New approach lets coding agents share knowledge across instances. X / @dr_cintas
JetBrains Mellum 2: a really good and performant model. Community discussion highlights Mellum 2 as a strong local coding model. r/LocalLLaMA
Silx-AI Quasar-Preview: 5M context length. New open-weight model preview offers an exceptionally long context window. r/LocalLLaMA
LocalLLaMA post tier list. Community-curated ranking of recurring post types on the subreddit. r/LocalLLaMA