Reconstructs simulation-ready, city-scale 3D meshes from multi-view images for use in downstream urban simulation pipelines.
Establishes theoretical characterizations of which languages can be generated in the limit by algorithms constrained to bounded memory resources.
RoboWits introduces a benchmark of creative, open-ended physical problem-solving tasks designed to expose unexpected failure modes in robotic AI systems.
SoundnessBench evaluates whether AI research-idea generation systems can reliably distinguish scientifically valid hypotheses from flawed or unsound ones.
COMPOSE automatically synthesizes novel formal theorem statements by combining citation graphs and structural patterns from existing mathematical literature.
Introduces Trajectory Shapley Value to fairly attribute contributions of federated clients over training trajectories, enabling fairness-aware model aggregation.
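For orientation, the classical Shapley value that this line builds on can be sketched as below. This is a minimal illustration of the plain Shapley value, not the trajectory variant the summary describes; the client names and the additive `utility` function are hypothetical, and the exact enumeration is only practical for a handful of clients.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Exact Shapley values by averaging marginal gains over all orderings.

    `utility` maps a frozenset of players to a number (hypothetical here;
    in a federated setting it might be validation accuracy of a model
    aggregated from that subset of clients).
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        previous = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            current = utility(coalition)
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical additive utility: each client contributes a fixed amount.
weights = {"client_a": 1.0, "client_b": 2.0, "client_c": 3.0}


def utility(coalition):
    return sum(weights[p] for p in coalition)


values = shapley_values(list(weights), utility)
```

With an additive utility each client's Shapley value equals its own weight, which makes the sketch easy to check by hand.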
Fuses RGB, depth, and event-based sensing through dynamics-guided representations to improve robotic perception across varying motion and lighting conditions.
A guide exposes undocumented Claude Code configuration options, giving practitioners finer control over behavior beyond what official documentation covers.
- OpenAI published a **Frontier Governance Framework** explaining how internal safety practices map to emerging regulation and risk‑assessment requirements for **frontier models**.[8]
- In a related cybersecurity post, OpenAI references **GPT‑5.5** as “our smartest and most intuitive model to date,” with strong cybersecurity capabilities, noting it was released *two weeks before* that article.[2]
Learns a unified risk-map framework for autonomous driving that integrates partial observability, aggregating heterogeneous risk signals into a single spatial representation.
Provides a benchmark evaluating speech and audio-language models on child-produced sounds, covering developmental speech characteristics across different childhood age groups.
Adapts the jackknife resampling method to handle temporal dependencies in time series by excluding contiguous windows rather than individual observations.
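The idea of leaving out contiguous windows can be illustrated with the classic delete-a-group jackknife over non-overlapping blocks. This is a minimal sketch under that assumption, not necessarily the method of the summarized work; the function name `block_jackknife` and the choice of the mean as the default estimator are illustrative.

```python
from statistics import mean


def block_jackknife(series, block_len, estimator=mean):
    """Delete-a-group jackknife using contiguous, non-overlapping blocks.

    Returns the leave-one-block-out estimates and the jackknife variance
    estimate. Any trailing remainder shorter than `block_len` is never
    deleted, so it stays in every replicate.
    """
    series = list(series)
    n_blocks = len(series) // block_len if block_len >= 1 else 0
    if n_blocks < 2:
        raise ValueError("need at least two full blocks")

    replicates = []
    for b in range(n_blocks):
        start, stop = b * block_len, (b + 1) * block_len
        kept = series[:start] + series[stop:]  # drop one contiguous window
        replicates.append(estimator(kept))

    center = mean(replicates)
    variance = (n_blocks - 1) / n_blocks * sum(
        (r - center) ** 2 for r in replicates
    )
    return replicates, variance


replicates, variance = block_jackknife([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], block_len=2)
```

Setting `block_len=1` recovers the ordinary leave-one-out jackknife; larger blocks keep short-range dependence intact inside each retained segment.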
Applies matrix completion techniques to heterogeneous treatment-effect estimation, yielding tighter theoretical guarantees than prior methods under weaker assumptions.
Applies a compact vision-language model to time-series anomaly detection, achieving trusted, efficient inference suitable for resource-constrained deployment.
Examines whether physics domain knowledge alone suffices to guide AI-assisted scientific software development, using physicist-supervised workflows as a case study.
Releases version 1.0a31 of Datasette, the open-source tool for exploring and publishing SQLite databases, with incremental fixes or features toward stable 1.0.
Endava, an IT services firm, restructured its engineering workflows by deploying OpenAI Codex agents to automate software development tasks organization-wide.
Plans portrait photography by suggesting aesthetically optimal camera angles and actionable shooting instructions within a reconstructed 3D scene before capture.
Generates PCB schematics by representing circuit designs as semantically grounded code, enabling LLMs to produce structured, meaningful schematic outputs.
A Python package provides reusable utilities for defining, registering, and managing lifecycle hooks that extend or customize Claude Code agent behavior.
Within approximately the last 7 days, there are **no publicly documented releases** that meet all of your criteria of:
- Brand‑new **frontier base models** from OpenAI, Anthropic (beyond Opus 4.8), Google, Meta, or Microsoft.
- Newly released, **high‑capability open‑source base models** with clearly superior benchmarks, substantial new architecture, or paradigm‑shift behaviors.
- Novel architectures (e.g., radically different from transformer‑variants) released as broadly usable models, not
- While not a model, this is a direct indicator of rapidly increasing capital behind **frontier model R&D and training runs** at Anthropic, including successor models beyond Claude Opus 4.8.
- For forecasting **near‑future model releases**, this kind of funding event is a key structural signal in the frontier race.

---
Project Glasswing is an early Anthropic agentic system designed to perform automated security monitoring and threat detection using AI agents.
A user previews Claude Opus 4.8, suggesting it offers notable improvements that users of earlier Opus versions will find impressive.
- **OpenAI Frontier Governance Framework** – OpenAI[8]
- **GPT‑5.5** context (released “two weeks ago” relative to OpenAI’s cyber post)[2]

While not releases, these are the only frontier‑adjacent OpenAI updates in the timeframe.
A community-curated repository for sharing and discovering reusable prompt templates designed for ChatGPT and other conversational AI systems.
A renderer that converts Markdown containing SVG markup into properly displayed vector graphics output.
Releases version 0.25.1 of the llm-anthropic plugin, adding or fixing features for using Anthropic Claude models via the LLM command-line tool.