A community-curated collection where users share, discover, and save effective prompts for ChatGPT and other LLMs.
Ollama enables local installation and execution of popular open-weight LLMs including Kimi-K2.6, DeepSeek, Qwen, and others via a simple CLI.
AutoGPT is an open-source platform for building and deploying autonomous AI agents that chain LLM calls to complete multi-step goals.
OpenAI's frontier models and Codex are made available as managed services on AWS infrastructure for enterprise deployment.
Anthropic releases Claude Opus 4.8, an updated iteration of its large-scale Claude Opus model with improved capabilities.
OmniDreams generates real-time photorealistic driving scenarios as a generative world model supporting closed-loop simulation for autonomous vehicle training and evaluation.
A decentralized instruction-tuning framework splits conflicting training instructions across separate models and merges their weights to reduce multi-task interference.
Ψ-Bench evaluates how well conversational AI systems tailor persuasive dialogue strategies to individual user personas and psychological profiles.
A KV cache eviction policy for reasoning models selectively discards cache entries based on their estimated contribution to output value, reducing memory without degrading reasoning quality.
MERIT learns disentangled latent representations of music that separate independent attributes such as melody, rhythm, and timbre to improve audio similarity retrieval.
Linear probes trained to detect deceptive internal states in LLMs are stress-tested for robustness under adversarial pressure, with analysis of how deception organizes geometrically in representation space.
This work combines world models, which handle concrete environment dynamics, with language models, which handle abstract reasoning, and shows that the two approaches are complementary rather than competing.
A local perturbation theory formalizes how policy updates in one domain cause interference in others during multi-domain RL and derives recovery conditions to restore cross-domain performance.
This work analyzes long chain-of-thought training traces in which the final answer is correct but intermediate reasoning steps contain harmful continuations, diagnosing how such traces arise and the risks they pose for training.
ClawHub analyzes malware signals by reconciling disagreements between VirusTotal verdicts, static analysis findings, and SkillSpector detections to improve security assessments.
AutoMedBench evaluates agentic AI systems on automated medical research tasks, benchmarking their ability to autonomously conduct and validate biomedical investigations.
TRON provides rule-verifiable online environments specifically designed for training visual reasoning agents via reinforcement learning with objectively checkable rewards.
A small RL controller guides token sampling decisions of a large language model at test time, improving output quality without retraining the LLM.
Decoupled residual denoising separates content and style pathways in a diffusion model to enable unified image-to-image translation across multiple tasks with fewer training examples.
PaddleOCR-VL-1.6 improves document parsing by targeting previously under-optimized layout regions and applying a progressive post-training strategy to boost recognition accuracy.
micropython-wasm 0.1a0 is an initial alpha release enabling MicroPython to run as a WebAssembly module.
A news item covering the California Brown Pelican, likely reporting on its conservation status, population trends, or ecological observations.
micropython-wasm 0.1a1 is a follow-up alpha release of the MicroPython WebAssembly package, delivering early fixes or improvements over 0.1a0.
datasette-agent-micropython 0.1a0 is an initial alpha plugin integrating MicroPython-based agentic capabilities into the Datasette data exploration tool.
Microsoft announced new MAI frontier models, signaling expanded investment in its own internally developed AI systems beyond existing partnerships.
Holo3.1 is a fast, locally running computer-use agent system that executes GUI tasks on-device without requiring cloud inference.
OpenAI positions Codex as a mainstream productivity tool extending beyond professional developers to broader everyday users.
OpenAI outlines global policy and partnership initiatives aimed at protecting youth safety and expanding educational or economic opportunities.
OpenAI expands Codex integration across diverse professional roles, development tools, and organizational workflows beyond software engineering.