The AI Wire

5 articles tagged "inference" — page 1 of 1

@Marktechpost: NVIDIA just got a 120B model to cold-start in under 5 seconds on Kubernetes... (x.com)

2026-06-06|news|twitter-bookmarks

NVIDIA demonstrates cold-starting a 120B parameter model in under 5 seconds on Kubernetes.

ollama/ollama (173295 stars): Get up and running with Kimi-K2.6, GLM-5.1, MiniMax, DeepSeek, gpt-oss, Qwen, Ge (github.com)

2026-06-06|tool|github

Ollama adds support for several new models including Kimi-K2.6, GLM-5.1, and MiniMax.

inference local-llm tooling

causalml (github.com)

2026-02-01|tool|GitHub

Uplift modeling and causal inference with machine learning algorithms...

incubation machine-learning causal-inference uplift-modeling

truss (github.com)

2025-12-20|tool|GitHub

Truss is an open-source tool for packaging machine learning models and serving them behind an inference API.

machine-learning artificial-intelligence easy-to-use inference-api

BentoML (github.com)

2025-11-26|tool|GitHub

BentoML is an open-source Python framework designed to simplify the deployment and serving of machine learning models in production environments. It bridges the gap between model development and pr...

model-serving mlops llmops generative-ai