OpenRouter Launches Fusion API, Which Uses A Combination Of Models To Achieve Fable-Like Performance At Half The Price

Anthropic’s Fable and Mythos models have been withdrawn following US export controls, but OpenRouter might have a solution.

The company has launched Fusion, a compound model API that fans a prompt out to a panel of AI models simultaneously, synthesizes their responses, and returns a single answer. On OpenRouter’s internal benchmark of 100 hard research tasks, Fusion’s top configuration matched the performance of Claude Fable 5 — the model that’s now off the table for most users worldwide.

How Fusion Works

When a prompt hits the Fusion API, OpenRouter sends it in parallel to a panel of models, each with web search and bash tools enabled. A judge model then reads every response and maps out where the models agree, where they contradict each other, what they each covered and missed. A synthesizer writes the final answer grounded in that analysis.

OpenRouter says roughly three-quarters of Fusion’s performance gain comes from that synthesis step — combining what the models produced — with the remaining quarter coming from the diversity of models themselves. In other words, the intelligence multiplier is largely in the synthesis layer, not just in running more models.

For developers, the integration is straightforward: call it as a single model slug — openrouter/fusion — or let the model decide when to invoke it by adding a tools parameter. Panel composition is also customizable; developers can pass in their own participant models and synthesizer.

The Benchmark Numbers

OpenRouter tested Fusion on DRACO, Perplexity’s deep research benchmark covering 100 tasks across ten domains including law, medicine, finance, and product comparison. Each task is scored against roughly 39 weighted criteria, with wrong answers carrying negative weight. Verbose but vague answers don’t inflate scores.

The headline result: Fable 5 + GPT-5.5 in fusion came in at approximately 69%, edging out Opus 4.8 + GPT-5.5 + Gemini 3.1 Pro and Opus 4.8 + GPT-5.5 combinations. Solo Claude Fable 5 landed around 65%.

The more striking finding is at the budget end. A panel of Gemini 3 Flash, Kimi K2.6, and DeepSeek V4 Pro — three models well below frontier pricing — fused together outperformed solo GPT-5.5 and solo Claude Opus 4.8, and came within 1% of Fable 5’s score. OpenRouter says that panel costs roughly half what Fable 5 would have.

One thing worth noting about the benchmark setup: when models were given web search access, some started surfacing the DRACO rubric online. OpenRouter caught this, excluded the affected domains across every model with a configuration change, and re-ran the benchmark. The published numbers reflect the clean setup.

Why This Matters Now

The timing is hard to ignore. Anthropic pulled Fable and Mythos last week following a US government export control directive, cutting off access for foreign users globally with little notice. The dispute over whether the underlying jailbreak concern was proportionate is still ongoing — David Sacks has argued that Anthropic was warned and refused to act; Anthropic says the vulnerability was minor and overstated — but the practical outcome remains the same: the models are unavailable.

For teams that had built workflows around Fable-level performance, Fusion offers a way to get comparable results from models that aren’t subject to the same export restrictions. The fact that a budget panel gets within 1% of Fable 5 changes the economics and resilience calculus for anyone who needs to plan around geopolitical risk in their AI stack.

The broader principle OpenRouter is demonstrating — that panels of models can consistently beat individual models, and that smart synthesis matters more than raw model power — also has implications beyond the current news cycle. Frontier labs compete on model capability, but Fusion’s results suggest the combination and synthesis layer is increasingly where performance gaps get closed.