AI Implementation · February 17, 2026 · 12 min read

Beyond Vertex AI: Building a Multi-Provider OpenClaw Gateway with MiniMax M2.5

I added MiniMax M2.5 and Claude Opus 4.6, and built a multi-provider architecture mixing Vertex AI, OpenRouter, and Anthropic. Here's how I achieved the optimal price-to-intelligence ratio for autonomous agents.

[Image: multi-provider AI architecture diagram showing Google Vertex AI, OpenRouter, and Anthropic]

In Part 1, I migrated 208 OpenClaw agent sessions from Anthropic Claude Opus 4.6 ($250/day) to Google Vertex AI MaaS ($5/day) — a 98% cost reduction.

But I wasn't done. Two things nagged at me:

  1. MiniMax just released M2.5 — a model scoring 80.2% on SWE-Bench Verified and matching Claude Opus 4.6 speed, at 1/68th the output cost. But it wasn't on Vertex AI yet.
  2. I still wanted Claude Opus 4.6 as a premium fallback — for the rare cases where nothing else would do.

The solution: a multi-provider architecture that mixes Vertex AI, OpenRouter, and Anthropic in a single OpenClaw gateway.

The New Model Hierarchy

| Priority | Model | Provider | Cost (input / output per M tokens) | Use case |
|---|---|---|---|---|
| 1. Primary | MiniMax M2.5 | OpenRouter | $0.20 / $1.10 | Best all-rounder: coding, agents, office work |
| 2. Fallback | GPT-OSS 120B | Vertex AI (us-central1) | $0.09 / $0.19 | Solid reasoning, cheapest option |
| 3. Fallback | Qwen3 80B Thinking | Vertex AI (global) | $0.12 / $0.24 | Deep logical reasoning |
| 4. Fallback | GPT-OSS 20B | Vertex AI (us-central1) | $0.04 / $0.19 | Lightweight heartbeat tasks |
| 5. Last resort | Claude Opus 4.6 | Anthropic | $15.00 / $75.00 | Premium fallback — only if all else fails |

The Cost Math

| Configuration | Daily cost | Monthly cost |
|---|---|---|
| Claude Opus 4.6 only | ~$250 | ~$7,500 |
| Vertex AI only (Part 1) | ~$5 | ~$150 |
| Multi-provider with M2.5 primary | ~$7 | ~$210 |

The multi-provider setup costs slightly more than pure Vertex because MiniMax M2.5 output tokens ($1.10/M) are pricier than GPT-OSS ($0.19/M). But M2.5 is dramatically more capable. At $7/day instead of $250/day, that's still a 97% reduction with better quality on complex tasks.
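The percentage can be sanity-checked in a couple of lines (the daily figures are the rough estimates from the table, not metered numbers):

```python
# Rough daily-cost comparison using the approximate figures above.
opus_only = 250.0        # ~$/day, Claude Opus 4.6 only
multi_provider = 7.0     # ~$/day, M2.5 primary + Vertex fallbacks

reduction = (opus_only - multi_provider) / opus_only
print(f"{reduction:.0%} cost reduction")  # → 97% cost reduction
```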

Step 1: Add OpenRouter as a Provider

OpenRouter provides an OpenAI-compatible API to 300+ models. This is key — OpenClaw's openai-completions API type works with it out of the box.

  1. Sign up at openrouter.ai
  2. Go to openrouter.ai/keys
  3. Create a new key

Unlike Vertex AI tokens, OpenRouter keys are permanent. No refresh needed.
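Because the API is OpenAI-compatible, you can hit OpenRouter directly with a standard chat-completions payload — useful for verifying your key before wiring it into the gateway. A minimal sketch using only the standard library; the model slug mirrors the gateway config later in this post and may change as OpenRouter updates its catalog:

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Assemble an OpenAI-style chat-completions request for OpenRouter."""
    payload = {
        "model": "minimax/minimax-m2.5",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("your-openrouter-key", "Say hello")
# Uncomment to actually send the request (needs a real key):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```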

Gotcha #9: The Auth-Profiles File

This is the one that nearly broke me. Even after adding OpenRouter to both config files, I kept getting:

FailoverError: No API key found for provider "openrouter".
Auth store: ~/.openclaw/agents/main/agent/auth-profiles.json

It turns out OpenClaw uses a third file for agent-level authentication: auth-profiles.json. This file has its own schema with versioning and profile entries.

{
  "version": 1,
  "profiles": {
    "openrouter:default": {
      "type": "api-key",
      "provider": "openrouter",
      "apiKey": "your-key-here"
    }
  }
}
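Since a malformed auth-profiles.json only surfaces as that cryptic FailoverError at runtime, a quick local sanity check saves a restart cycle. This sketch validates only the fields shown in the example above; any further schema rules are assumptions:

```python
import json

def check_auth_profiles(raw: str) -> list[str]:
    """Return a list of problems found in an auth-profiles.json payload."""
    problems = []
    data = json.loads(raw)
    if data.get("version") != 1:
        problems.append("unexpected schema version")
    for name, profile in data.get("profiles", {}).items():
        for field in ("type", "provider", "apiKey"):
            if field not in profile:
                problems.append(f"{name}: missing {field}")
    return problems

sample = """
{
  "version": 1,
  "profiles": {
    "openrouter:default": {
      "type": "api-key",
      "provider": "openrouter",
      "apiKey": "your-key-here"
    }
  }
}
"""
print(check_auth_profiles(sample))  # → []
```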

Step 2: Add Claude Opus 4.6 as the Premium Fallback

Adding Anthropic requires the same three-file dance, plus one extra gotcha.

Gotcha #10: Anthropic Needs a baseUrl

Every provider in OpenClaw must have a baseUrl. For Anthropic, use:

https://api.anthropic.com
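For reference, a provider entry with the baseUrl set might look like the fragment below. Beyond baseUrl itself, which is the field the error demands, the surrounding key names are assumptions — the exact openclaw.json provider schema isn't reproduced in this post:

```json
{
  "providers": {
    "anthropic": {
      "baseUrl": "https://api.anthropic.com"
    }
  }
}
```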

Step 3: Set Up the Full Fallback Chain

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openrouter/minimax/minimax-m2.5",
        "fallbacks": [
          "vertex-maas/openai/gpt-oss-120b-maas",
          "vertex-maas-global/qwen/qwen3-next-80b-a3b-thinking-maas",
          "vertex-maas/openai/gpt-oss-20b-maas",
          "anthropic/claude-opus-4-6"
        ]
      }
    }
  }
}
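The behavior this config buys you can be sketched as a simple loop: try the primary, fall through the fallbacks in order, and only fail if every provider fails. This is an illustration of the failover logic, not OpenClaw's actual implementation; `call_model` is a stand-in for a real provider call:

```python
MODEL_CHAIN = [
    "openrouter/minimax/minimax-m2.5",
    "vertex-maas/openai/gpt-oss-120b-maas",
    "vertex-maas-global/qwen/qwen3-next-80b-a3b-thinking-maas",
    "vertex-maas/openai/gpt-oss-20b-maas",
    "anthropic/claude-opus-4-6",
]

def complete(prompt, call_model):
    """Try each model in order; fall through on provider errors."""
    last_error = None
    for model in MODEL_CHAIN:
        try:
            return model, call_model(model, prompt)
        except RuntimeError as err:  # stand-in for provider/rate-limit errors
            last_error = err
    raise RuntimeError("all providers failed") from last_error

# Simulate the primary being down:
def fake_call(model, prompt):
    if model.startswith("openrouter/"):
        raise RuntimeError("openrouter timeout")
    return f"answer from {model}"

used, answer = complete("hello", fake_call)
print(used)  # → vertex-maas/openai/gpt-oss-120b-maas
```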

Now in the OpenClaw chat, you can switch models on the fly:

  • @minimax25 — MiniMax M2.5 (primary)
  • @oss120b — GPT-OSS 120B
  • @qwen3think — Qwen3 80B Thinking
  • @oss20b — GPT-OSS 20B
  • @opus — Claude Opus 4.6 (premium)

Why MiniMax M2.5?

MiniMax M2.5 was released on February 12, 2026, and immediately became one of the most interesting models for autonomous agent use:

  • Performance: 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, 76.3% on BrowseComp. These are frontier-level scores.
  • Architecture: 230B total parameters, but only 10B active (Mixture of Experts). Fast — matching Claude Opus 4.6's speed.
  • Agent-native: M2.5 was specifically trained for agentic workflows. It decomposes and plans tasks like a software architect before executing.
  • Cost: $0.20/M input, $1.10/M output via OpenRouter. Compare that to Claude Opus 4.6 at $15/$75 per M tokens. That's 75x cheaper on input and 68x cheaper on output.
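The price ratios quoted above fall straight out of the published per-token rates:

```python
# Price ratios: Claude Opus 4.6 vs. MiniMax M2.5 (both $/M tokens).
m25_in, m25_out = 0.20, 1.10
opus_in, opus_out = 15.00, 75.00

print(round(opus_in / m25_in), round(opus_out / m25_out))  # → 75 68
```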

The Final Architecture

┌───────────────────────────────────────────────────────────┐
│                         Mac Mini                          │
│  ┌─────────────────────────────────────────────────────┐  │
│  │                  OpenClaw Gateway                   │  │
│  │      208 Active Sessions · Automatic Failover       │  │
│  └─────────────────────────────────────────────────────┘  │
│  ┌─────────────────────────────────────────────────────┐  │
│  │  Token Refresh (30 min)      Permanent Keys         │  │
│  │  Vertex only                 OpenRouter + Anthropic │  │
│  └─────────────────────────────────────────────────────┘  │
└──────────┬───────────────────┬───────────────────┬────────┘
           │                   │                   │
           ▼                   ▼                   ▼
    ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
    │  Vertex AI  │     │ OpenRouter  │     │  Anthropic  │
    │ GPT-OSS 120B│     │ MiniMax M2.5│     │ Claude      │
    │ $0.09/M     │     │ $0.20/M     │     │ Opus 4.6    │
    │ Qwen3 80B   │     │ PRIMARY     │     │ $15.00/M    │
    │ $0.12/M     │     │ @minimax25  │     │ LAST RESORT │
    │ GPT-OSS 20B │     └─────────────┘     │ @opus       │
    │ $0.04/M     │                         └─────────────┘
    └─────────────┘

Key Takeaways

  1. OpenRouter bridges the gap for models not yet on Vertex AI.
  2. Three files, three purposes. openclaw.json for the gateway, models.json for the agent, auth-profiles.json for credentials. Miss one and you'll get a different cryptic error.
  3. Permanent keys vs. rotating tokens — OpenRouter and Anthropic use permanent API keys. Only Vertex needs the token refresh dance.
  4. MiniMax M2.5 is the best value in AI right now. Frontier-level performance at $0.20/$1.10 per M tokens.
  5. Keep Claude as your safety net. At $15/$75 per M tokens it's expensive for bulk work, but as a last-resort fallback it means your agents never go completely dark.

Total savings: from $7,500/month to roughly $210/month — a 97% reduction — while actually improving the intelligence available to my agents.

Want to build a multi-provider AI architecture for your business? We help companies set up intelligent model routing, cost optimization, and fallback strategies. Book an AI-First Fit Call and let's discuss your setup.


About the Author

Levi Brackman

Levi Brackman is the founder of Be AI First, helping companies become AI-first in 6 weeks. He builds and deploys agentic AI systems daily and advises leadership teams on AI transformation strategy.
