Model Routing
How Hyperflow assigns models to roles.
The thinking/worker split, resolution priority, per-provider defaults, and runtime customization.
The thinking/worker split
Not every task needs the most powerful model. Hyperflow routes work to the right tier based on what the task requires:
- Thinking models — orchestration, review, debugging, architecture decisions. Reasoning over ambiguity.
- Worker models — implementation, search, writing, boilerplate. Execution against a clear spec.
This keeps costs lower and speed higher without sacrificing quality — because every worker output gets a thinking-model review before it lands.
Role routing table
| Role | Tier | Default (Claude Code) | Examples |
|---|---|---|---|
| Orchestrator | Thinking | Opus 4.8 |
Decomposing "build a dashboard" into 5 parallel tasks |
| Reviewer | Thinking | Opus 4.8 |
Checking if a worker's component matches the spec |
| Debugger | Thinking | Opus 4.8 |
Root-causing why tests fail after a refactor |
| Decision-maker | Thinking | Opus 4.8 |
Choosing between REST vs tRPC for a new API |
| Brainstormer | Thinking | Opus 4.8 |
Exploring 2–3 approaches with trade-offs |
| Implementer | Worker | Sonnet 4.6 |
Writing a React component from a clear spec |
| Searcher | Worker | Sonnet 4.6 |
Finding all files that import a specific module |
| Writer | Worker | Sonnet 4.6 |
Creating test files, config files, documentation |
Per-provider defaults
Each supported platform has its own model naming and defaults:
| Provider | Default Thinking | Default Worker |
|---|---|---|
| Claude Code | opus-4-8 |
sonnet-4-6 |
| OpenCode | anthropic/claude-opus-4-8 |
anthropic/claude-sonnet-4-6 |
| Codex | gpt-5.5 with adaptive reasoning |
gpt-5.4 with fast/low reasoning |
| Antigravity | gemini-3-pro |
gemini-3.5-flash |
See providers.html for the full model list per platform, including version pinning, reasoning policy, and dynamic detection.
How it works in practice
When you give Hyperflow a task:
- Config loaded — reads
~/.hyperflow/config.jsonand auto-detects the active provider - Models resolved — thinking and worker models determined via the priority chain (see below)
- Thinking model (orchestrator) receives your request
- Thinking model breaks it into sub-tasks and identifies which can run in parallel
- Thinking model dispatches worker model agents with self-contained prompts
- Worker model agents execute and return results + notes
- Thinking model reviews each output — approves or sends back for fixes
- Thinking model synthesizes learnings and injects them into the next batch
- Repeat until done
For the full chain mechanics, see orchestration.html.
Codex reasoning
Codex uses gpt-5.5 for thinking roles and resolves reasoning by task: low for trivial docs/config checks, medium for normal planning/review, and high for debugging, architecture, security, and final integration. Worker roles use gpt-5.4 with low reasoning for fast mode. Codex defaults never use xhigh.
Model resolution priority
When Hyperflow needs to determine which model to use for a role, it walks this chain — first match wins:
| Priority | Source | Scope | Example |
|---|---|---|---|
| 1 | Per-task inline request | Single task | "Fix this bug using opus-4-8" |
| 2 | Session command | Current session | hyperflow: thinking opus-4-8 |
| 3 | Environment variable | Current session | HYPERFLOW_THINKING_MODEL=opus-4-8 |
| 4 | Role override | Persistent | providers.claude-code.roles.reviewer: "opus-4-8" |
| 5 | Provider tier | Persistent | providers.claude-code.thinking: "opus-4-8" |
| 6 | Global default | Persistent | defaults.thinking: "opus-4-8" |
Customization
Generated config (what the installer writes)
The installer detects every provider on your machine and writes a full block for each one in ~/.hyperflow/config.json — not just the active provider. Each block carries the canonical model list, a default thinking / worker, and the role mapping. You can edit any field; missing fields fall back to per-provider defaults.
"activeProvider": "claude-code",
"defaults": { "thinking": "opus-4-8", "worker": "sonnet-4-6" },
"providers": {
"claude-code": {
"thinking": "opus-4-8",
"worker": "sonnet-4-6",
"models": {
"thinking": ["opus-4-8", "opus-4-7", "opus-4-6", "opus-4-5", "sonnet-4-6"],
"worker": ["sonnet-4-6", "sonnet-4-5", "haiku-4-5"]
},
"roles": {
"orchestrator": "opus-4-8", "reviewer": "opus-4-8", "debugger": "opus-4-8",
"decision-maker": "opus-4-8", "brainstormer": "opus-4-8",
"implementer": "sonnet-4-6", "searcher": "sonnet-4-6", "writer": "sonnet-4-6"
}
},
"opencode": { "thinking": "anthropic/claude-opus-4-8", "worker": "anthropic/claude-sonnet-4-6", "models": { … }, "roles": { … } },
"antigravity": { "thinking": "gemini-3-pro", "worker": "gemini-3.5-flash", "models": { … }, "roles": { … } }
},
"security": { "enabled": true },
"memory": { "compactionThreshold": 300 },
"context": { "windowTokens": 200000, "autoCompactMinPercent": 72 }
}
context.windowTokens and context.autoCompactMinPercent drive the smart PreCompact hook. Automatic compaction is blocked when the transcript estimate is confidently below the threshold; manual /compact and unknown-estimate compaction still pass.
Change default models
The minimal override — works if you only care about thinking / worker globally:
"defaults": {
"thinking": "opus-4-8",
"worker": "sonnet-4-6"
}
}
Override a specific role
Drop the reviewer to the previous Opus to save cost while keeping orchestration on the latest:
"providers": {
"claude-code": {
"thinking": "opus-4-8",
"worker": "sonnet-4-6",
"roles": {
"reviewer": "opus-4-7"
}
}
}
}
Use a cheaper worker for search tasks
"providers": {
"claude-code": {
"thinking": "opus-4-8",
"worker": "sonnet-4-6",
"roles": {
"searcher": "haiku-4-5"
}
}
}
}
Switch models mid-session
hyperflow: worker haiku-4-5
hyperflow: models # verify current config
hyperflow: reset models # revert to config.json defaults
Per-session via environment
For initial setup and the full model catalog, see installation.html and providers.html.