In Part 1, I migrated 208 OpenClaw agent sessions from Anthropic Claude Opus 4.6 ($250/day) to Google Vertex AI MaaS ($5/day) — a 98% cost reduction.
But I wasn't done. Two things nagged at me:
- MiniMax just released M2.5 — a model scoring 80.2% on SWE-Bench Verified and matching Claude Opus 4.6 speed, at 1/68th the output cost. But it wasn't on Vertex AI yet.
- I still wanted Claude Opus 4.6 as a premium fallback — for the rare cases where nothing else would do.
The solution: a multi-provider architecture that mixes Vertex AI, OpenRouter, and Anthropic in a single OpenClaw gateway.
The New Model Hierarchy
| Priority | Model | Provider | Cost (Input / Output per M tokens) | Use Case |
|---|---|---|---|---|
| 1. Primary | MiniMax M2.5 | OpenRouter | $0.20 / $1.10 | Best all-rounder: coding, agents, office work |
| 2. Fallback | GPT-OSS 120B | Vertex AI (us-central1) | $0.09 / $0.19 | Solid reasoning, cheapest option |
| 3. Fallback | Qwen3 80B Thinking | Vertex AI (global) | $0.12 / $0.24 | Deep logical reasoning |
| 4. Fallback | GPT-OSS 20B | Vertex AI (us-central1) | $0.04 / $0.19 | Lightweight heartbeat tasks |
| 5. Last resort | Claude Opus 4.6 | Anthropic | $15.00 / $75.00 | Premium fallback — only if all else fails |
The Cost Math
| Configuration | Daily Cost | Monthly Cost |
|---|---|---|
| Claude Opus 4.6 only | ~$250 | ~$7,500 |
| Vertex AI only (Part 1) | ~$5 | ~$150 |
| Multi-provider with M2.5 primary | ~$7 | ~$210 |
The multi-provider setup costs slightly more than pure Vertex because MiniMax M2.5 output tokens ($1.10/M) are pricier than GPT-OSS ($0.19/M). But M2.5 is dramatically more capable. At $7/day instead of $250/day, that's still a 97% reduction with better quality on complex tasks.
Step 1: Add OpenRouter as a Provider
OpenRouter provides an OpenAI-compatible API to 300+ models. This is key — OpenClaw's openai-completions API type works with it out of the box.
- Sign up at openrouter.ai
- Go to openrouter.ai/keys
- Create a new key
Unlike Vertex AI tokens, OpenRouter keys are permanent. No refresh needed.
Gotcha #9: The Auth-Profiles File
This is the one that nearly broke me. Even after adding OpenRouter to both config files, I kept getting:
FailoverError: No API key found for provider "openrouter".
Auth store: ~/.openclaw/agents/main/agent/auth-profiles.json
It turns out OpenClaw uses a third file for agent-level authentication: auth-profiles.json. This file has its own schema with versioning and profile entries.
{
"version": 1,
"profiles": {
"openrouter:default": {
"type": "api-key",
"provider": "openrouter",
"apiKey": "your-key-here"
}
}
}
Step 2: Add Claude Opus 4.6 as the Premium Fallback
Adding Anthropic requires the same three-file dance, plus one extra gotcha.
Gotcha #10: Anthropic Needs a baseUrl
Every provider in OpenClaw must have a baseUrl. For Anthropic, use:
https://api.anthropic.com
Step 3: Set Up the Full Fallback Chain
{
"agents": {
"defaults": {
"model": {
"primary": "openrouter/minimax/minimax-m2.5",
"fallbacks": [
"vertex-maas/openai/gpt-oss-120b-maas",
"vertex-maas-global/qwen/qwen3-next-80b-a3b-thinking-maas",
"vertex-maas/openai/gpt-oss-20b-maas",
"anthropic/claude-opus-4-6"
]
}
}
}
}
Now in the OpenClaw chat, you can switch models on the fly:
@minimax25— MiniMax M2.5 (primary)@oss120b— GPT-OSS 120B@qwen3think— Qwen3 80B Thinking@oss20b— GPT-OSS 20B@opus— Claude Opus 4.6 (premium)
Why MiniMax M2.5?
MiniMax M2.5 was released on February 12, 2026 and immediately became one of the most interesting models for autonomous agent use:
- Performance: 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, 76.3% on BrowseComp. These are frontier-level scores.
- Architecture: 230B total parameters, but only 10B active (Mixture of Experts). Fast — matching Claude Opus 4.6's speed.
- Agent-native: M2.5 was specifically trained for agentic workflows. It decomposes and plans tasks like a software architect before executing.
- Cost: $0.20/M input, $1.10/M output via OpenRouter. Compare that to Claude Opus 4.6 at $15/$75 per M tokens. That's 75x cheaper on input and 68x cheaper on output.
The Final Architecture
┌──────────────────────────────────────────────────────────────┐
│ Mac Mini │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ OpenClaw Gateway │ │
│ │ 208 Active Sessions · Automatic Failover │ │
│ └──┬──────────────┬──────────────┬───────────────────────┘ │
│ │ │ │ │
│ ┌──┴──────────────┴──────────────┴───────────────────────┐ │
│ │ Token Refresh (30 min) Permanent Keys │ │
│ │ Vertex only OpenRouter + Anthropic │ │
│ └────────────────────────────────────────────────────────┘ │
└─────┼──────────────┼──────────────┼───────────────────────────┘
│ │ │
▼ ▼ ▼
┌──────────┐ ┌───────────┐ ┌───────────┐
│ Vertex │ │ OpenRouter │ │ Anthropic │
│ AI │ │ │ │ │
│GPT-OSS │ │MiniMax │ │Claude │
│120B │ │M2.5 │ │Opus 4.6 │
│$0.09/M │ │$0.20/M │ │$15.00/M │
│Qwen 80B │ │PRIMARY │ │LAST │
│$0.12/M │ │@minimax25 │ │RESORT │
│GPT-OSS │ └───────────┘ │@opus │
│20B │ └───────────┘
│$0.04/M │
└──────────┘
Key Takeaways
- OpenRouter bridges the gap for models not yet on Vertex AI.
- Three files, three purposes.
openclaw.jsonfor the gateway,models.jsonfor the agent,auth-profiles.jsonfor credentials. Miss one and you'll get a different cryptic error. - Permanent keys vs. rotating tokens — OpenRouter and Anthropic use permanent API keys. Only Vertex needs the token refresh dance.
- MiniMax M2.5 is the best value in AI right now. Frontier-level performance at $0.20/$1.10 per M tokens.
- Keep Claude as your safety net. At $15/$75 per M tokens it's expensive for bulk work, but as a last-resort fallback it means your agents never go completely dark.
Total savings: from $7,500/month to roughly $210/month — a 97% reduction — while actually improving the intelligence available to my agents.
Want to build a multi-provider AI architecture for your business? We help companies set up intelligent model routing, cost optimization, and fallback strategies. Book an AI-First Fit Call and let's discuss your setup.
