The Situation
Anthropic killed third-party OAuth today (April 4, 2026 at 12pm PT). You can no longer use a Claude subscription (Pro/Max) to power OpenClaw. Two options remain for Claude:
- Anthropic API key: pay-per-token ($3/$15 per MTok for Sonnet, $5/$25 for Opus)
- Switch to a different provider entirely
This guide answers: What should you actually do?
Section 1: The Sonnet Replacement
Current role: Fergus's main agent, covering everyday conversation, task management, delegation, Discord responses, voice message handling, tool calls, persona consistency, and multi-agent orchestration.
🥇 GPT-5.4 via OpenAI Codex OAuth: CAN REPLACE SONNET
This is the move. Here's why:
- Quality: AI Top 40 score of 100.0, ranked #1 (Sonnet sits in the #5-6 range). Terminal-Bench 2.0: 75.1% vs Sonnet's 59.1%.
- Tool calling: Has "Tool Search", a dedicated multi-tool orchestration feature. "Slight edge in complex orchestration where orchestrator must track multiple simultaneous sub-task states." This is exactly what Fergus does.
- Session memory: ZooClaw benchmark rated GPT-5.4 "Excellent" for memory reliability after compaction; Sonnet rated "Good".
- OpenClaw community: "5.4 is more than capable of running main agent 95% of the time." Named ZooClaw overall winner above Sonnet AND Opus.
- Access: Codex OAuth via ChatGPT Plus ($20/mo) or Pro ($200/mo). The ONLY subscription-based OAuth that still works in OpenClaw for a top-tier model.
Where GPT-5.4 is WORSE than Sonnet
- Speed: 20-30 tok/s vs Sonnet's 44-63 tok/s. Slower Discord responses are the biggest real-world tradeoff.
- Persona compliance: GPT-5.4 wants to DO work. Sonnet follows "don't do work, delegate" rules more naturally. Needs stronger guardrails.
- Discord formatting: Occasionally generates markdown tables (broken in Discord). Add explicit rules to SOUL.md.
- Verbosity: Tends toward longer responses. Add "be concise" rules.
Verdict: GPT-5.4 CAN replace Sonnet. Stronger on tool calling, session memory, and multi-step orchestration. Weaker on speed, persona compliance, and formatting, all fixable with prompt engineering. Transition will require 1-2 weeks of SOUL.md tuning.
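To get a feel for what the speed gap means in Discord, here is a rough sketch that turns the tok/s figures above into wait time for one reply. The 300-token reply length is an assumption picked for illustration, and the estimate ignores time to first token and network latency.

```python
def reply_seconds(tokens, tok_per_sec):
    """Seconds needed to stream `tokens` tokens at a given decode speed."""
    if tok_per_sec <= 0:
        raise ValueError("tok_per_sec must be positive")
    return tokens / tok_per_sec

REPLY_TOKENS = 300  # assumed length of a typical Discord reply

# (slow, fast) decode speeds in tok/s, as quoted in this guide
SPEEDS = {"GPT-5.4": (20, 30), "Sonnet": (44, 63)}

for model, (slow, fast) in SPEEDS.items():
    best = reply_seconds(REPLY_TOKENS, fast)
    worst = reply_seconds(REPLY_TOKENS, slow)
    print(f"{model}: {best:.1f}-{worst:.1f} s per {REPLY_TOKENS}-token reply")
```

On these assumptions a reply takes roughly 10-15 seconds on GPT-5.4 versus roughly 5-7 seconds on Sonnet, which is why shorter responses help on both fronts.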
🥈 Claude Sonnet 4.6 via Anthropic API Key
Keep Sonnet by moving to API billing. Identical quality. Cost: $3/$15 per MTok (~$50-150/mo for heavy daily use, with the 90% prompt caching discount applied).
Verdict: Obviously works, since it's the same model. The question is whether $50-150/mo variable beats $20/mo flat for GPT-5.4 Plus, which is arguably better.
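The $50-150/mo range depends heavily on token volume and cache hit rate. The sketch below shows the arithmetic using the Sonnet prices and 90% cache discount quoted above. The workload numbers (100 MTok in, 3 MTok out, 80% cached) are made up for illustration, and any cache-write surcharge is ignored, so treat the output as a ballpark only.

```python
def monthly_api_cost(input_mtok, output_mtok, input_price, output_price,
                     cached_fraction=0.0, cache_discount=0.9):
    """Estimate monthly API spend in dollars.

    input_mtok / output_mtok: millions of tokens per month (assumed workload).
    input_price / output_price: dollars per million tokens.
    cached_fraction: share of input tokens served from the prompt cache.
    cache_discount: discount on cached input tokens (0.9 means 90% off).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_mtok * cached_fraction
    uncached = input_mtok - cached
    input_cost = uncached * input_price + cached * input_price * (1 - cache_discount)
    return input_cost + output_mtok * output_price

# Hypothetical heavy-use month on Sonnet pricing ($3 in / $15 out per MTok)
sonnet = monthly_api_cost(100, 3, 3.0, 15.0, cached_fraction=0.8)
print(f"Sonnet via API: ${sonnet:.2f}/mo vs $20.00/mo flat for Plus")
```

With this particular workload the estimate lands at about $129/mo, inside the quoted range. Plug in your own token counts before deciding.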
🥉 Gemini 3.1 Pro via Google API Key: CANNOT FULLY REPLACE
Strong benchmarks (GPQA 94.3%, SWE-bench 80.6%), but there is no subscription OAuth in OpenClaw, Google is suspending accounts that use its API heavily with OpenClaw agents, and persona consistency is "underwhelming out of the box."
Verdict: CANNOT replace Sonnet for this use case. Too risky, no subscription, weak persona handling.
Sonnet Replacement Recommendation:
- Primary: GPT-5.4 via Codex OAuth (ChatGPT Plus $20/mo or Pro $200/mo)
- Fallback: Keep Sonnet via Anthropic API key ($50-150/mo variable)
Section 2: The Opus Replacement
Current role: Dwight, handling deep analysis, strategy, research, complex reasoning, comparison tasks, financial analysis, and product research.
🥇 GPT-5.4 via OpenAI Codex OAuth: CAN REPLACE OPUS (for 80% of tasks)
- AI Top 40 #1 (100.0) vs Opus at #2 (93.2). Leads on 8 of 10 benchmarks.
- Matches Opus on: multi-step reasoning, knowledge work, structured analysis, research synthesis.
- Falls short: Opus has a unique quality for nuanced, creative reasoning: "thinking around corners." GPT-5.4 is more systematic, Opus more intuitive. For the hardest 20% of Dwight tasks, Opus still has an edge.
Verdict: GPT-5.4 CAN replace Opus for most Dwight tasks. For the hardest 20%, you lose some nuance.
🥈 Claude Opus 4.6 via Anthropic API Key: STAYS AS OPUS
Arena #1 (Elo 1,504). Arena Code #1 (Elo 1,548). The acknowledged quality ceiling. Cost: $5/$25 per MTok. Dwight is used selectively, so a realistic monthly cost is $30-80.
Verdict: Opus via API CAN stay as Opus. $30-80/mo for the best reasoning model is reasonable for selective use.
Opus Replacement Recommendation:
- Primary: Keep Opus via Anthropic API key ($30-80/mo), cheap enough for selective Dwight use
- Alternative: Use GPT-5.4 for everything (accepts some quality loss on the hardest 20% of reasoning tasks)
Section 3: Full Model Rankings, April 2026
Ranked by overall capability. Pricing, access method, and value rating for each.
Tier S: Frontier (Best Models Alive)
#1 GPT-5.4 (OpenAI)
AI Top 40: 100.0 · Terminal-Bench: 75.1% · SWE-bench: ~80%
API: $2.50/$15 per MTok · ✅ Codex OAuth subscription ($20-200/mo)
Best all-around agentic model. Tool calling. Computer use. Multi-step execution.
Value: 💵 $20/mo
#2 Claude Opus 4.6 (Anthropic)
Arena: #1 (1,504) · Arena Code: #1 (1,548) · SWE-bench: 79.6%
API: $5/$25 per MTok · ⚠️ API key only (OAuth killed April 4)
Best human-preference quality. Best coding. Best nuanced reasoning. Best persona adherence.
Value: 🟡 Fair
#3 Grok 4 (xAI)
AI Top 40: 86.6 · HLE: 50.7% (#1) · Strong reasoning
API: $3/$15 per MTok · ✅ API key only. No OAuth.
Strongest on Humanity's Last Exam. Good deep reasoning.
Value: 🟡 Fair
#4 Gemini 3.1 Pro (Google)
GPQA: 94.3% (#1) · SWE-bench: 80.6% · ARC-AGI-2: 77.1% (#1)
API: $2/$12 per MTok · ✅ API key. ❌ No OAuth subscription.
⚠️ Risk: Google suspending accounts using API with OpenClaw agents.
Value: 🟢 Great value
Tier A: Near-Frontier
#5 Claude Sonnet 4.6 (Anthropic)
SWE-bench: 79.6% · 44-63 tok/s · Excellent instruction following
API: $3/$15 per MTok (90% cache discount) · ⚠️ API key only (OAuth killed April 4)
Fastest frontier-class model. Best instruction following. Best persona consistency. Best Discord formatting.
Value: 🟢 Excellent
#6 GPT-5.4 Pro (OpenAI)
Enhanced reasoning version
API: $30/$180 per MTok (12x standard) · ✅ API key. Not on Codex OAuth.
Maximum reasoning when cost is no object.
Value: 🔴 Poor value
#7 GPT-5.2 (OpenAI)
Strong general-purpose. "Most reliable instruction following" per OpenAI.
API: $1.75/$14 per MTok · ✅ API key.
Value: 🟢 Excellent
#8 Qwen3-Max (Alibaba)
API: $1.20/$6 per MTok, or Alibaba Coding Plan: $10-50/mo flat
✅ API key + ✅ Alibaba Coding Plan OAuth
⚠️ Quality: Significantly below Sonnet/GPT-5.4. "TV robot" persona compliance.
Value: 🟢 Budget
Tier B: Strong Performers
#9 Grok 4.2 (xAI)
Fast variant. API: $2/$6 per MTok · ✅ API key. 2M context window.
Value: 🟢 Excellent
#10 GPT-5.3-Codex (OpenAI)
400K context. Strong coding specialist. API: $1.75/$14 per MTok
✅ Codex OAuth (subscription), included in Plus/Pro. This is what Cody/Bomb use.
Value: 💵 Included
#11 GPT-5.3-Codex-Spark (OpenAI)
1,000+ tok/s on Cerebras. Ultra-fast coding. Pro subscription only.
✅ Codex OAuth · Separate usage pool from GPT-5.4 (different hardware)
Pro users get GPT-5.4 quota PLUS Spark quota independently, for more total headroom.
Value: Pro only
#12 DeepSeek V3.2 (DeepSeek)
API: $0.28/$0.42 per MTok (27x cheaper than Opus)
⚠️ Privacy: Chinese company. Data handling concerns for business use.
Value: 🟢 Cheapest
#13 Mistral Large 3 (Mistral)
API: $0.50/$1.50 per MTok · European data handling.
Value: 🟢 Excellent
Tier C: Budget / Specialist
#15 Gemma 4 31B (Google, open weights)
AA Intelligence Index: #2 among open models (score 39). MMLU-Pro 85.2%. Native function calling.
✅ Local via Ollama · Free if you have hardware. Apache 2.0 license.
⚠️ Gap: Sonnet has a 43-point advantage on AA Intelligence Index. Good for local fallback, not primary agent.
Value: Free local
#16 Llama 4 Maverick (Meta, open weights)
AI Top 40: #31. 400B total params, MoE. Multimodal.
Free locally · API via OpenRouter/Groq · CANNOT replace Sonnet.
Value: Fair
Section 4: Subscription OAuth in OpenClaw, Full List
Which providers let you pay a flat monthly rate and use it through OpenClaw?
✅ Confirmed Working Subscription OAuth
| Provider | Subscription | Price | Models | Quality for Main Agent? |
| --- | --- | --- | --- | --- |
| OpenAI Codex | ChatGPT Plus | $20/mo | GPT-5.4, GPT-5.3-Codex, GPT-5.4-mini | YES, top tier |
| OpenAI Codex | ChatGPT Pro | $200/mo | All Plus + Spark, 6x limits | YES, top tier + headroom |
| Alibaba Coding Plan | Lite | $10/mo ($3 first mo) | Qwen3.5+, Kimi K2.5, GLM-5, MiniMax M2.5 | No, budget tier |
| Alibaba Coding Plan | Pro | $50/mo ($15 first mo) | Same models, 5x more requests | No, budget tier |
| MiniMax Coding Plan | OAuth | ~$10-20/mo | MiniMax models | No, budget tier |
| Z.AI / GLM Coding Plan | OAuth | ~$10-20/mo | GLM models | No, budget tier |
❌ No Subscription OAuth
- Anthropic: KILLED TODAY (April 4, 2026). API key only going forward.
- Google Gemini: Feature request filed, not implemented. API key only.
- xAI Grok: API key only. No OAuth in OpenClaw.
- Mistral: API key only.
- DeepSeek: API key only.
- Meta Llama: Open weights only (local or hosted).
The reality: OpenAI is the ONLY provider offering top-tier subscription OAuth in OpenClaw.
Section 5: Can GPT-5.4 Do What Brent Does With Sonnet?
✅ Long complex sessions without degrading
ZooClaw benchmark: "Excellent" for memory reliability after compaction, same as Opus and BETTER than Sonnet ("Good").
⚠️ Follow SOUL.md / AGENTS.md persona instructions
YES, WITH WORK. More opinionated than Sonnet. Needs stronger delegation rules, explicit "no markdown tables," explicit "be concise." Set personalityOverlay: "off".
✅ Handle tool calls reliably
BETTER than Sonnet. Terminal-Bench 75.1% vs 59.1%. Designed for agentic tool use.
✅ Multi-agent orchestration
"Instruction-following precision gives it a slight edge in complex orchestration where orchestrator must track multiple simultaneous sub-task states."
⚠️ Format Discord messages properly
YES, WITH RULES. Occasionally generates markdown tables, tends toward longer responses. Fixable with explicit SOUL.md rules.
✅ Flat-rate subscription
The ONLY top-tier model with subscription OAuth in OpenClaw right now.
Bottom line: GPT-5.4 CAN do everything Brent does with Sonnet. It's actually BETTER at tool calling and session memory. Worse at delegation compliance, formatting, and speed, all fixable with prompt engineering. Will take 1-2 weeks of SOUL.md tuning.
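If prompt rules alone don't stop the occasional markdown table, a safety net is to rewrite tables as bullet lists before a message goes out. The sketch below is one way to do that in plain Python. It assumes your stack gives you some place to post-process outgoing text (this guide does not describe such a hook in OpenClaw), and `tables_to_bullets` is a hypothetical helper name.

```python
import re

# Matches a markdown table separator row such as "| --- | :---: |"
SEPARATOR = re.compile(r"^\s*\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)*\|?\s*$")

def split_row(line):
    """Split one table row into trimmed cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]

def tables_to_bullets(text):
    """Rewrite markdown tables in `text` as "- Header: value" bullet lines."""
    lines = text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        is_header = (
            "|" in lines[i]
            and i + 1 < len(lines)
            and SEPARATOR.match(lines[i + 1])
        )
        if not is_header:
            out.append(lines[i])
            i += 1
            continue
        headers = split_row(lines[i])
        i += 2  # skip the header row and the separator row
        while i < len(lines) and "|" in lines[i]:
            cells = split_row(lines[i])
            pairs = [f"{h}: {c}" for h, c in zip(headers, cells)]
            out.append("- " + ", ".join(pairs))
            i += 1
    return "\n".join(out)

sample = "Here:\n| Model | Price |\n| --- | --- |\n| GPT-5.4 | $20 |\n| Opus | API |\nDone"
print(tables_to_bullets(sample))
```

Text outside tables passes through unchanged, so this can sit behind the SOUL.md rules as a second line of defense rather than a replacement for them.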
Section 6: Transition Plan
Option A: All-In OpenAI (Simplest, Cheapest)
$200/mo (Pro) or $20/mo (Plus)
- Main Agent (Fergus): openai-codex/gpt-5.4
- Sub-Agent Dwight: openai-codex/gpt-5.4
- Sub-Agent Scout: openai-codex/gpt-5.4-mini
- Coding (Cody/Bomb): openai-codex/gpt-5.4 (already on this)
Pros
- One provider, one bill, flat rate
- Generous limits
Cons
- No Opus for hardest reasoning
- Slower speed
- Prompt tuning needed
SOUL.md additions needed:
## 🚨 GPT-5.4 SPECIFIC RULES
- NEVER use markdown tables. Use bullet lists instead. Always.
- Keep responses concise. Don't over-explain.
- You are a DISPATCHER. Do NOT do work yourself. This model
  has a tendency to try to do things. Resist it. Delegate EVERYTHING.
- When formatting for Discord: use bold headers + bullet lists.
  No tables. No code blocks for non-code content.
openclaw.json config:
{
  "plugins": {
    "entries": {
      "openai": {
        "config": {
          "personalityOverlay": "off"
        }
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai-codex/gpt-5.4"
      }
    }
  }
}
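Before restarting with the new config, it can be worth checking that the file parses and that the two settings above are what you expect. The sketch below does that with the standard `json` module. The config is embedded as a string to keep the example self-contained; in practice you would read your own openclaw.json, whose location is not given in this guide. `check_config` is a hypothetical helper.

```python
import json

CONFIG_TEXT = """
{
  "plugins": {
    "entries": {
      "openai": {
        "config": {
          "personalityOverlay": "off"
        }
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai-codex/gpt-5.4"
      }
    }
  }
}
"""

def check_config(text):
    """Return a list of problems found in the config text (empty means OK)."""
    config = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    problems = []
    overlay = (config.get("plugins", {}).get("entries", {})
               .get("openai", {}).get("config", {}).get("personalityOverlay"))
    if overlay != "off":
        problems.append(f"personalityOverlay is {overlay!r}, expected 'off'")
    primary = (config.get("agents", {}).get("defaults", {})
               .get("model", {}).get("primary"))
    if primary != "openai-codex/gpt-5.4":
        problems.append(f"primary model is {primary!r}")
    return problems

print(check_config(CONFIG_TEXT) or "config looks as intended")
```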
Option B: Hybrid (Best Quality, Higher Cost)
~$280-430/mo total
- Main Agent (Fergus): openai-codex/gpt-5.4 ($200/mo Pro or $20/mo Plus)
- Sub-Agent Dwight: anthropic/claude-opus-4-6 (API key, $30-80/mo)
- Sub-Agent Scout: openai-codex/gpt-5.4-mini (included)
- Coding (Cody/Bomb): openai-codex/gpt-5.4 (included)
- Fallback: anthropic/claude-sonnet-4-6 (API key, if GPT-5.4 hits limits)
Pros
- Best of both worlds
- Opus for hard problems
- GPT-5.4 for everything else
Cons
- Two providers, two billing methods
- More complexity
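The fallback line in Option B boils down to "try GPT-5.4 first, drop to Sonnet when the subscription hits its limits." OpenClaw presumably handles this itself; the sketch below only illustrates the intended order. `RateLimited`, `call_model`, and `fake_call` are all hypothetical stand-ins, not real OpenClaw or provider APIs.

```python
class RateLimited(Exception):
    """Hypothetical error raised when a provider rejects a call for quota reasons."""

# Order taken from Option B: GPT-5.4 first, Sonnet via API key as fallback.
MODEL_CHAIN = ["openai-codex/gpt-5.4", "anthropic/claude-sonnet-4-6"]

def complete_with_fallback(prompt, call_model, models=MODEL_CHAIN):
    """Try each model in order; return (model, reply) from the first that answers.

    call_model(model, prompt) is a stand-in for whatever actually sends the request.
    """
    last_error = None
    for model in models:
        try:
            return model, call_model(model, prompt)
        except RateLimited as err:
            last_error = err  # remember why this model was skipped, then move on
    raise RuntimeError("every model in the chain was rate limited") from last_error

# Demo with a fake client in which the primary model has hit its limit.
def fake_call(model, prompt):
    if model.startswith("openai-codex/"):
        raise RateLimited(model)
    return f"[{model}] reply to: {prompt}"

used, reply = complete_with_fallback("status check", fake_call)
print(used)
```

Only quota errors trigger the fallback here; other failures propagate, so a bad prompt or auth problem doesn't silently burn API credit on the second provider.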
Option C: Stay on Anthropic API (Minimal Change)
~$100-250/mo variable
- Main Agent (Fergus): anthropic/claude-sonnet-4-6 (API key)
- Sub-Agent Dwight: anthropic/claude-opus-4-6 (API key)
- Coding (Cody/Bomb): openai-codex/gpt-5.4 (Codex OAuth, Plus $20/mo)
Just swap the Anthropic auth from OAuth token to API key. Same SOUL.md works. Monthly cost is variable and could spike.
Section 7: Local Model Option: Gemma 4
Can Gemma 4 replace Sonnet if run locally?
Short answer: No.
- AA Intelligence Index: Gemma 4 scores 39 vs Sonnet's 82+ (43-point gap)
- Gemma 4 is the BEST open/local model, but "best local" is still a tier below frontier cloud models.
Hardware Requirements
- Current Mac Mini M4 16GB: ❌ Can't run 31B. Gemma 4 26B MoE is tight. Not recommended for production.
- Mac Mini M4 Pro 48GB (~$1,800-2,000): ✅ Runs 31B comfortably. Sweet spot.
- Mac Studio M4 Max 64GB+ (~$2,700+): ✅ Can run 70B models at usable speed.
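The hardware tiers above follow from simple arithmetic: weight memory is roughly parameter count times bits per weight, divided by 8. The sketch below runs that for the models mentioned. The 4-bit and 8-bit quantization levels are assumptions (the guide doesn't say which quantization it has in mind), and the numbers cover weights only, not the KV cache, the OS, or anything else competing for unified memory.

```python
def weight_gb(params_billion, bits_per_weight):
    """Approximate size of model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

MODELS = [("Gemma 4 31B", 31), ("Gemma 4 26B MoE", 26), ("70B model", 70)]

for name, params in MODELS:
    q4 = weight_gb(params, 4)
    q8 = weight_gb(params, 8)
    print(f"{name}: ~{q4:.1f} GB at 4-bit, ~{q8:.1f} GB at 8-bit")
```

At 4-bit, 31B comes to about 15.5 GB of weights, which leaves nothing for the OS on a 16GB machine. The 26B MoE comes to about 13 GB, because all experts stay in memory even though only 3.8B parameters are active per token, which fits the "tight" verdict. A 70B model at about 35 GB fits the 64GB tier.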
Best Local Models for OpenClaw, April 2026
- Gemma 4 31B: Best overall. Native function calling. Apache 2.0. Needs 32GB+ RAM.
- Qwen3.5 27B: "Matches GPT-5 Mini." Good tool calling. Needs 32GB+ RAM.
- Gemma 4 26B MoE: Good for its effective size (3.8B active). Best "bang for buck"; can squeeze into 16GB.
- DeepSeek-Coder small variants: Good for local coding agents.
Recommendation: Don't buy hardware for local models as a Sonnet replacement. The quality gap is too large. Local models are great as emergency fallback, privacy-sensitive tasks, or saving money on low-priority Scout-level work. But for the main Fergus agent? Cloud models are the answer.
Final Recommendation: What To Do Right Now
Immediate (Today)
- Set up Anthropic API key in OpenClaw config. This keeps everything working while you transition. Cost: ~$50-150/mo for Sonnet, $30-80/mo for selective Opus.
- Keep Codex OAuth for Cody/Bomb (already working on Plus $20/mo).
This Week
- Test GPT-5.4 as main agent on a non-critical channel. Try it in #fergus for a day.
- Tune SOUL.md for GPT-5.4 quirks (anti-table rules, stronger delegation language, conciseness).
Within 2 Weeks
- If GPT-5.4 works, upgrade to Pro ($200/mo) and go all-in (Option A or B).
- If GPT-5.4 doesn't match Sonnet's persona, stay on Anthropic API (Option C) and accept variable cost.
The Money
- Cheapest good option: GPT-5.4 Pro all-in = $200/mo flat
- Best quality option: GPT-5.4 Pro + Opus API for Dwight = $230-280/mo
- Status quo: Anthropic API for everything = $100-250/mo variable + Plus $20/mo for Codex
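The totals are easier to compare when built from their parts. The sketch below sums the per-component ranges quoted in this guide (Pro $200, Plus $20, Opus API $30-80, Sonnet API $50-150). How the components combine for each option is my reading of the transition plan, not something the guide spells out.

```python
def add_ranges(*ranges):
    """Sum (low, high) monthly cost ranges component by component."""
    return (sum(low for low, _ in ranges), sum(high for _, high in ranges))

# Figures quoted in this guide, in dollars per month.
PRO = (200, 200)
PLUS = (20, 20)
OPUS_API = (30, 80)
SONNET_API = (50, 150)

options = {
    "A: all-in OpenAI (Pro)": add_ranges(PRO),
    "B: Pro + Opus API": add_ranges(PRO, OPUS_API),
    "B with Sonnet fallback at full use": add_ranges(PRO, OPUS_API, SONNET_API),
    "C: Anthropic API only": add_ranges(SONNET_API, OPUS_API),
    "C plus Plus for Codex": add_ranges(SONNET_API, OPUS_API, PLUS),
}

for name, (low, high) in options.items():
    cost = f"${low}/mo" if low == high else f"${low}-{high}/mo"
    print(f"{name}: {cost}")
```

On these numbers, the ~$280-430 quoted for Option B matches Pro plus Opus plus a fully used Sonnet fallback, while $230-280 leaves the fallback out. For Option C, the component ranges reach $100-250 only once the $20 Plus subscription is included, which suggests that figure already counts it.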
The bottom line: GPT-5.4 is the replacement. It's ranked #1 overall. The OpenClaw community endorses it. It's the only top-tier model with subscription access. The only question is whether you can tune your prompts to match Sonnet's persona compliance, and based on everything reviewed here, you can. It just takes a week or two of iteration.
AI Model Replacement Guide · Prepared by Dwight · April 4, 2026 · For Brent's OpenClaw stack