Reinforcement Learning from Human Feedback (RLHF) remains indispensable for aligning large language models (LLMs) in subjective domains. To enhance robustness, recent work shifts toward Generative Rew...
Adolescent Idiopathic Scoliosis (AIS) is a prevalent spinal deformity whose progression can be mitigated through early detection. Conventional screening methods are often subjective, difficult to scal...
Multimodal large language models (MLLMs) have rapidly advanced, yet their adoption in medicine remains limited by gaps in domain coverage, modality alignment, and grounded reasoning. In this work, we ...
First Opus-class model with a 1M-token context window (beta), adaptive thinking with effort levels, and context compaction for sustained agentic tasks. The agent teams feature enables parallel subtask execution.
Achieves Gemini 3 Pro-class reasoning at Flash-tier latency and cost. Outperforms 2.5 Pro while being 3x faster at less than 1/4 the cost of 3 Pro. 1M token context, 65K output tokens.
OpenAI's first open-weight LLMs since GPT-2 (2019). Apache 2.0 license. Trained with RL and distillation from o3 and frontier internal models. GPT-oss-120B runs on a single 80GB GPU; the 20B model runs on edge devices with 16GB of memory.
- Two interfaces: Editor View (synchronous coding) and Manager View (orchestrate parallel agents across workspaces)
- 11 open-source plugins bundling skills, connectors, slash commands, sub-agents
Native multimodal model trained on 15T tokens mixing visual and textual data from the start. Agent Swarm technology coordinates up to 100 specialized agents simultaneously, reducing execution time by 4.5x for complex workflows.
Cosmos Reason 2 is an open reasoning VLM enabling machines to see, understand, and act in the physical world. GR00T N1.6 is a vision-language-action (VLA) model for humanoid robots integrating egocentric camera streams, robot states, and language instructions into a unified policy.
Flagship reasoning model with adaptive tool-use -- intelligently invokes retrieval and code interpreter on demand during inference. Advanced test-time scaling via RL.
First open-source model supporting three video generation modes in one architecture: multi-subject reference image-to-video, audio-driven avatar generation, and video-to-video editing. Intelligent shot-switching for minute-level durations.
Creates a new model category: the "Large Tabular Model" (LTM). Trained on billions of tabular datasets to natively understand non-linear relationships in structured data, bypassing traditional ETL pipelines.
Foundation models trained on physics data, not text. Walrus learns across 19 fluid dynamics scenarios and 63 physical fields. AION-1 integrates 39 data modalities from astronomical surveys (200M+ observations, ~100TB data).
- "FastAPI of MCP" -- decorator-based server building
- Fetches current, version-specific documentation in real-time
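
The decorator-based server building mentioned in the list above can be illustrated with a small, library-free sketch. Everything here (`ToolServer`, `tool`, `call`, and the `add` function) is a hypothetical stand-in invented for illustration; it is not the API of any real MCP library, only the general registration pattern that decorator-driven frameworks tend to follow.

```python
from typing import Any, Callable, Dict


class ToolServer:
    """Hypothetical stand-in for a decorator-based server (not a real MCP API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Maps a tool name to the plain Python function that implements it.
        self._tools: Dict[str, Callable[..., Any]] = {}

    def tool(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        """Register fn under its own name and return it unchanged."""
        self._tools[fn.__name__] = fn
        return fn

    def call(self, tool_name: str, **kwargs: Any) -> Any:
        """Dispatch a request to a registered tool by name."""
        if tool_name not in self._tools:
            raise KeyError(f"unknown tool: {tool_name}")
        return self._tools[tool_name](**kwargs)


server = ToolServer("demo")


@server.tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


# The decorated function stays directly callable and is also
# reachable through the server's dispatch table.
result = server.call("add", a=2, b=3)
```

The appeal of this pattern is that an ordinary function becomes a served endpoint with a single added line, which is presumably the sense of the FastAPI comparison.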
Multilingual document intelligence model supporting all 22 official Indian languages with OCR, visual language understanding, and semantic document parsing. Uses a state-space architecture rather than a transformer.
The Moore-Penrose Pseudo-inverse (PInv) serves as the fundamental solution for linear systems. In this paper, we propose a natural generalization of PInv to the nonlinear regime in general and to neur...
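
As background for the linear case that this abstract starts from, the sketch below shows the standard property that the pseudo-inverse solution x = A⁺b coincides with the least-squares solution of Ax = b. The matrix `A` and vector `b` are made-up illustrative values, and the code covers only the classical linear setting; it is not the nonlinear generalization the paper proposes.

```python
import numpy as np

# Hypothetical overdetermined linear system: 3 equations, 2 unknowns.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

# Moore-Penrose pseudo-inverse and the solution it induces.
A_pinv = np.linalg.pinv(A)
x_pinv = A_pinv @ b

# Reference solution from a least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# Two of the four Penrose conditions that characterize the pseudo-inverse.
cond_1 = np.allclose(A @ A_pinv @ A, A)            # A A+ A  = A
cond_2 = np.allclose(A_pinv @ A @ A_pinv, A_pinv)  # A+ A A+ = A+
```

When `A` has full column rank, as it does here, the pseudo-inverse reduces to the familiar normal-equations form (AᵀA)⁻¹Aᵀ.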
Large language models (LLMs) are increasingly being used in a zero-shot fashion to assess mental health conditions, yet we have limited knowledge on what factors affect their accuracy. In this study, ...
Flow and diffusion models produce high-quality samples, but adapting them to user preferences or constraints post-training remains costly and brittle, a challenge commonly called reward alignment. We ...
Adapting large pretrained models to new tasks efficiently and continually is crucial for real-world deployment but remains challenging due to catastrophic forgetting and the high cost of retraining. W...
Multi-image spatial reasoning remains challenging for current multimodal large language models (MLLMs). While single-view perception is inherently 2D, reasoning over multiple views requires building a...
Reasoning language models, which generate long chains of thought, dramatically outperform non-reasoning language models on abstract problems. However, the internal model mechanisms that allow this sup...
Recent years have seen an explosion of interest in autonomous cyber defence agents trained to defend computer networks using deep reinforcement learning. These agents are typically trained in cyber gy...
Post-training with Reinforcement Learning (RL) has substantially improved reasoning in Large Language Models (LLMs) via test-time scaling. However, extending this paradigm to Multimodal LLMs (MLLMs) t...
We present protein autoregressive modeling (PAR), the first multi-scale autoregressive framework for protein backbone generation via coarse-to-fine next-scale prediction. Using the hierarchical nature...
Internet of Things (IoT) deployments operate in nonstationary, dynamic environments where factors such as sensor drift, evolving user behavior, and heterogeneous user privacy requirements can affect a...