Ollama 0.17 shipped on February 22, and the headline feature isn’t about running models faster - it’s about running agents locally.
The new release adds one-command installation of OpenClaw, the AI coding assistant that competes with Claude Code and Codex. Type ollama launch openclaw and you get a fully configured agentic AI environment running on your own hardware.
This matters because AI agents - systems that can browse files, execute code, and take actions across multiple steps - have been mostly locked to cloud services. Claude Code requires a subscription. OpenAI’s Codex needs API access. Local alternatives existed, but setting them up meant configuring multiple tools, managing dependencies, and debugging integration issues.
Ollama just reduced that to a single command.
What You Get
The OpenClaw integration supports several frontier-class open models out of the box:
- Kimi-K2.5 - The reasoning model that scores 96% on AIME 2025 math benchmarks
- GLM-5 - Zhipu AI’s 744B MoE model with 40B active parameters, trained on Huawei Ascend chips
- Minimax-M2.5 - Minimax’s latest multimodal model
These aren’t toy models. GLM-5 beats Claude Opus 4.5 on Humanity’s Last Exam and scores 77.8% on SWE-bench Verified. Kimi-K2.5 outperforms most proprietary models on mathematical reasoning.
When using cloud models through the integration, OpenClaw also gains web search capability - letting agents pull in current information when needed.
Performance and Memory Improvements
The release also brings practical quality-of-life improvements:
Smarter context lengths. Ollama’s macOS and Windows apps now automatically set context length based on your available VRAM. Previously, users had to manually configure this or risk out-of-memory crashes. Now the tool adapts to your hardware.
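Ollama hasn't published the exact sizing logic, but the idea can be sketched as a simple heuristic: reserve VRAM for the model weights plus some headroom, then spend whatever is left on KV cache. A minimal illustration, assuming made-up constants and a hypothetical pick_context_length helper (this is not Ollama's actual algorithm):

```python
# Rough sketch of VRAM-aware context sizing. All constants here are
# illustrative assumptions, not Ollama's real heuristic.

def pick_context_length(vram_gb: float, weights_gb: float,
                        kv_bytes_per_token: int = 512 * 1024) -> int:
    """Spend leftover VRAM (after weights + a safety margin) on KV cache."""
    margin_gb = 1.0  # headroom for activations, driver, display, etc.
    free_bytes = max(0.0, vram_gb - weights_gb - margin_gb) * 1024**3
    tokens = int(free_bytes // kv_bytes_per_token)
    # Clamp to a sane range and round down to a multiple of 1024.
    return max(2048, min(131072, tokens // 1024 * 1024))

# e.g. a 24 GB GPU holding ~13 GB of quantized weights
print(pick_context_length(24, 13))  # → 20480
```

The point of the sketch is the trade-off it makes visible: on the same GPU, a bigger model leaves less room for context, which is why a fixed manual setting could tip smaller cards into out-of-memory crashes.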
Faster tokenization. The changelog mentions “improved tokenizer performance,” which affects how quickly Ollama can process your prompts before the model even starts generating.
Independent benchmarks reported by WebProNews show further gains: up to 40% faster prompt processing on NVIDIA GPUs and 10-15% on Apple Silicon. Better multi-GPU support and improved KV cache management also landed in this release.
Why This Matters
The AI agent landscape is fragmenting along a clear line: cloud versus local.
On one side, you have Claude Code finding 500 vulnerabilities in open-source projects, Codex writing and testing code autonomously, and enterprise customers paying premium prices for agent capabilities. On the other side, you have users who can’t or won’t send their code to external servers.
Until now, the local side was at a significant capability disadvantage. You could run models locally, but running agents locally meant cobbling together your own tooling.
Ollama 0.17 closes part of that gap. OpenClaw running Kimi-K2.5 or GLM-5 gives you reasoning capabilities that were frontier-tier six months ago. And because everything runs on your hardware, your code never leaves your machine.
The catch: you need the hardware. GLM-5 with 40B active parameters isn’t running on a laptop with 8GB of RAM. The VRAM-aware context length feature helps prevent crashes, but compute requirements for serious agent workloads remain substantial.
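A back-of-the-envelope way to see why: weight memory is roughly parameter count times bytes per weight, and an MoE model needs all of its experts resident even though only 40B parameters are active per token. A quick sketch (the quantization byte counts are approximate averages, and weights_gb is a hypothetical helper, not part of Ollama):

```python
# Back-of-the-envelope VRAM estimate: params * bytes-per-weight.
# Bytes-per-weight values are rough averages for each quantization level.

BYTES_PER_WEIGHT = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}

def weights_gb(params_billions: float, quant: str) -> float:
    """Approximate weight memory in GiB for a given quantization."""
    return params_billions * 1e9 * BYTES_PER_WEIGHT[quant] / 1024**3

# A 744B-parameter MoE model keeps every expert in memory:
print(round(weights_gb(744, "q4")))  # hundreds of GiB even at 4-bit
# A 40B dense model is far more tractable:
print(round(weights_gb(40, "q4")))   # fits on high-end consumer hardware
```

Even at aggressive 4-bit quantization, the full GLM-5 parameter set lands in the hundreds of gigabytes, which is workstation or multi-GPU territory, not laptop territory.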
Getting Started
If you already have Ollama installed, update to 0.17 and run:
ollama launch openclaw
That’s it. Ollama handles downloading the model weights, configuring the agent environment, and setting up the integration.
For new users, the installation process is similarly straightforward - a single installer for macOS or Windows, or a one-liner for Linux.
What This Means
Ollama has evolved from “easy way to run LLMs locally” to “easy way to run AI agents locally.” That’s a significant expansion of what desktop users can do without touching cloud APIs.
The question is whether open models can keep pace with the proprietary agents. Claude Code Security just demonstrated what Claude Opus 4.6 can do for vulnerability detection. OpenAI’s Aardvark is hunting bugs autonomously. The frontier keeps advancing.
But for users who prioritize privacy, or work in environments where code can’t leave the network, Ollama 0.17 makes local agents practical in a way they weren’t before.
Sometimes the most important feature isn’t raw capability - it’s accessibility.