Hugging Face Transformers standardizes model definitions, training, and inference for state-of-the-art NLP and multimodal models across frameworks.
A community-curated repository for sharing and discovering reusable ChatGPT system and user prompts across diverse tasks and personas.
Ollama enables one-command local execution of large language models including Kimi-K2.5, DeepSeek, Qwen, and Gemma on personal hardware.
AutoGPT provides an open platform for building and running autonomous AI agents, targeting accessibility for non-expert users and developers.
Based on available public release notes and news, there are **no clearly documented brand‑new frontier foundation models from Google, Meta, or Microsoft in just the past week** that meet your criteria (new model, significant capabilities, released beyond narrow research prototypes). The most recent major jumps (e.g., new Gemini variants, Llama versions, DeepSeek/Qwen releases) are earlier than this one‑week window, and current search results do not show a fresh model‑class announcement in the la
Your query is “past week,” and OpenAI’s major frontier family steps (GPT‑5.x, o‑series reasoning, open‑weight gpt‑oss models) all fall **earlier than the last 7 days**, based on their own release notes timeline.[1][2][3] Still, since they shape the current frontier landscape:
- **GPT‑5.3 / 5.4 series** (Instant, Thinking, Pro, mini): new flagship work/learning models emphasizing faster web‑integrated reasoning and multi‑step workflows.[1][2][3]
- **o‑series reasoning models (o1, o3, 4.5 rese
Anthropic's Claude Opus 4.8 is a frontier large language model release advancing capability, safety, and instruction-following over prior Claude versions.
Introduces a memory mechanism that selectively retains and retrieves task-relevant information for multimodal agents operating across long interaction sequences.
Automatically generates reusable AI agent skills by distilling knowledge from human experts, reducing manual skill engineering for complex task pipelines.
Provides an automated auditing framework that evaluates and surfaces gaps, redundancies, or failures within the open skill ecosystem available to LLM-based agents.
An autoencoder architecture that takes full input, produces residual outputs, and uses a projection pursuit encoder to learn compact, disentangled latent representations.
A zero-shot speech synthesis system that generates expressive, long-form audio for both monologue and multi-speaker dialogue without speaker-specific training data.
Generates spatially positioned, synchronized audio in a streaming fashion using an autoregressive diffusion transformer that produces multichannel spatial audio in real time.
Systematically evaluates long-form speech generation systems across diverse scenarios including different speaking styles, domains, and acoustic conditions to expose failure modes.
Uses frequency-domain decomposition and sub-frequency manifold traversal to guide a diffusion model for generating temporally coherent and smooth action sequences.
Analyzes when Markov boundary feature selection helps, hurts, or produces mixed results for tabular prediction tasks, clarifying its practical reliability.
Scales human motion generation by conditioning on any combination of input modalities using masked modeling, enabling flexible multimodal control over generated motions.
A general-purpose counting model that estimates the quantity of arbitrary object categories in images based on open-vocabulary or user-specified targets.
Uses on-policy data generated during RLHF training as a self-supervised signal to improve reward model accuracy, addressing reward model degradation caused by policy distribution shift.
Trains agents on open-ended tasks through self-play in which multiple policies co-evolve, generating increasingly challenging and diverse training signals without human-designed curricula.
Evaluates whether vision-language models can reliably abstain from answering spatial questions when the visual input lacks sufficient information to answer them correctly, diagnosing failure modes.
Introduces a benchmark and synthetic trajectory generation method for training GUI agents to recover from their own policy-induced errors during task execution.
An investigation into issues or behavior observed in the pydantic-monty library, likely examining bugs, unexpected functionality, or security concerns.
A personal account arguing that cancelling an AI subscription was the right practical or financial decision, weighing real utility against cost.
Release notes for version 1.0a32 of Datasette, the open-source tool for exploring and publishing SQLite databases, detailing new features or fixes.
A monthly newsletter from May 2026 summarizing recent developments, projects, or curated content relevant to the author's focus area.
Explains that a running process's memory is exposed as a file through interfaces like /proc/<pid>/mem (a virtual procfs entry rather than data stored on disk), illustrating Unix's everything-is-a-file design.
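
As a minimal sketch of the idea, assuming a Linux system with procfs mounted, a process can read its own memory through `/proc/self/mem` using ordinary file operations. The helper name `read_own_memory` and the sample payload are hypothetical, chosen only for illustration; they do not come from the linked article.

```python
import ctypes
import os


def read_own_memory(address, length):
    """Read `length` bytes at `address` from this process via /proc/self/mem.

    Returns None when the procfs file is missing or cannot be read
    (for example on non-Linux systems or in restricted sandboxes).
    """
    path = "/proc/self/mem"
    if not os.path.exists(path):
        return None
    try:
        # Unbuffered binary access: the file offset is a virtual address.
        with open(path, "rb", buffering=0) as mem:
            mem.seek(address)
            return mem.read(length)
    except OSError:
        return None


# Place known bytes in a C buffer so we have a stable address to look up.
payload = b"everything is a file"
buf = ctypes.create_string_buffer(payload)

# Seek to the buffer's address inside the memory "file" and read it back.
snapshot = read_own_memory(ctypes.addressof(buf), len(payload))
print(snapshot)
```

Reading another process's memory this way generally requires ptrace-level permission, so the sketch deliberately stays within the current process.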
NVIDIA releases Cosmos 3, an open multimodal model designed to support physical AI systems by integrating reasoning and action planning across modalities.
Traces the origin and cultural journey of Adriano Celentano's 1972 nonsense-lyric song deliberately composed to mimic American English sounds without meaning.
The Vera Rubin Observatory has detected both very large near-Earth asteroids and failed supernova candidates (stars that collapse without a visible explosion).