The AI Wire

High Signal (4-5)

519 articles — page 17 of 18

World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings (arxiv.org)

2026-03-04|paper|arxiv

What Does Flow Matching Bring To TD Learning? (arxiv.org)

2026-03-04|paper|arxiv

We could be hours (or less than a week) away from true NVFP4 support in Llama.cpp GGUF format 👀 (github.com)

2026-03-04|news|reddit/LocalLLaMA

NanoGPT Slowrun: Language Modeling with Limited Data, Infinite Compute (qlabs.sh)

2026-03-04|news|hackernews

Fast Matrix Multiplication in Small Formats: Discovering New Schemes with an Open-Source Flip Graph Framework (huggingface.co)

2026-03-03|model|huggingface

Qwen3-Coder-Next Technical Report (huggingface.co)

2026-03-03|model|huggingface

Gemini 3.1 Flash-Lite: Built for intelligence at scale (deepmind.google)

2026-03-03|news|blog/Google DeepMind

GPT-5.3 Instant (openai.com)

2026-03-03|news|hackernews

GPT-5.3 Instant System Card (openai.com)

2026-03-03|news|blog/OpenAI Blog

PRX Part 3 — Training a Text-to-Image Model in 24h! (huggingface.co)

2026-03-03|news|blog/Hugging Face Blog

Claude's Cycles [pdf] (www-cs-faculty.stanford.edu)

2026-03-03|news|hackernews

Kling-MotionControl Technical Report (huggingface.co)

2026-03-03|model|huggingface

Chain of World: World Model Thinking in Latent Motion (huggingface.co)

2026-03-03|model|huggingface

GPT-5.3 Instant: Smoother, more useful everyday conversations (openai.com)

2026-03-03|news|blog/OpenAI Blog

Inside the M4 Apple Neural Engine, Part 1: Reverse Engineering (maderix.substack.com)

2026-03-02|news|hackernews

LongRLVR: Long-Context Reinforcement Learning Requires Verifiable Context Rewards (arxiv.org)

2026-03-02|paper|arxiv

Breaking: The small Qwen3.5 models have been dropped (i.redd.it)

2026-03-02|news|reddit/LocalLLaMA

Tool Verification for Test-Time Reinforcement Learning (huggingface.co)

2026-03-02|model|huggingface

Machine Learning (ML) library in Linux kernel (arxiv.org)

2026-03-02|paper|arxiv

Frontier Models Can Take Actions at Low Probabilities (arxiv.org)

2026-03-02|paper|arxiv

Tool Verification for Test-Time Reinforcement Learning (arxiv.org)

2026-03-02|paper|arxiv

Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training (arxiv.org)

2026-03-02|paper|arxiv

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation (huggingface.co)

2026-03-01|model|huggingface

Reverse engineered Apple Neural Engine (ANE) to train Microgpt (i.redd.it)

2026-03-01|news|reddit/LocalLLaMA

[R] Tiny transformers (<100 params) can add two 10-digit numbers to 100% accuracy (github.com)

2026-03-01|news|reddit/MachineLearning

Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators (huggingface.co)

2026-03-01|model|huggingface

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation (arxiv.org)

2026-03-01|paper|arxiv

huggingface/transformers (157238 stars): 🤗 Transformers: the model-definition framework for state-of-the-art machine lear (github.com)

2026-03-01|tool|github

SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching (huggingface.co)

2026-03-01|model|huggingface

dLLM: Simple Diffusion Language Modeling (huggingface.co)

2026-03-01|model|huggingface
