
Air Max: every AI model, working together in one answer

Air Max is AIR's hybrid engine. It routes each message to the best-suited model and, on the hardest tasks, runs more than one model and merges their work into a single answer.

No single AI model is best at everything. One is sharpest at reasoning, another handles enormous documents without losing the thread, another is fastest for quick replies, and another is strongest at live-web research. For years the trade-off was forced on you: pick one model and live with its weak spots. Air Max removes that trade-off. It is not another model; it is a conductor that puts each model to work on the part of your request it does best.

This article explains exactly how Air Max decides what to do with each message, when it uses one model versus several, and why combining models can produce a better answer than any of them alone — while still staying cost-efficient for the everyday questions that don't need the full orchestra.

Two modes, chosen automatically

Air Max works in two modes and switches between them automatically for each message. The first is the Router. For the vast majority of messages (a quick question, a rewrite, an image request, an everyday explanation), Air Max simply picks the single best engine for that specific task and answers with it. That keeps things fast and cheap: one model, one answer, no waste.

The second is the Ensemble. When a message is genuinely hard — deep research, complex analysis, a multi-part strategic ask — Air Max runs more than one model and merges their work. A fast, long-context model drafts broad coverage of the problem; a deep reasoning model then verifies that draft, corrects mistakes, fills gaps, and elevates it into one final, polished answer. You see a single response — but two engines shaped it.
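
The draft-then-verify flow described above can be sketched in a few lines. This is a minimal illustration, not AIR's actual implementation: the `Model` type, the `ensemble_answer` function, the wording of the review prompt, and the stub models are all assumptions made for the example.

```python
from typing import Callable

# Assumption: a "model" is any function from prompt text to response text.
Model = Callable[[str], str]


def ensemble_answer(message: str, drafter: Model, verifier: Model) -> str:
    """Draft with one model, then have a second model verify and finalize."""
    # Step 1: the fast, long-context model produces broad coverage.
    draft = drafter(message)

    # Step 2: the reasoning model receives the draft as material to check.
    # The prompt wording is hypothetical.
    review_prompt = (
        "Treat the draft below as research to verify, not as truth to repeat.\n"
        "Correct mistakes, fill gaps, and return one final answer.\n\n"
        f"Question:\n{message}\n\nDraft:\n{draft}"
    )
    return verifier(review_prompt)


# Stub models so the sketch runs without any API access.
def stub_drafter(prompt: str) -> str:
    return f"[draft for: {prompt}]"


def stub_verifier(prompt: str) -> str:
    return f"[verified answer based on {len(prompt)} chars of context]"


final = ensemble_answer(
    "Compare three market-entry strategies.", stub_drafter, stub_verifier
)
```

In this sketch the user only ever sees `final`, which mirrors the single-response behavior: the draft is an intermediate artifact that the second model consumes.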

How Air Max routes your message

Your message	Mode	What runs
Quick question / rewrite	Router	One fast Flash engine
Everyday explanation	Router	Balanced Flash model
Code / focused analysis	Router	Gemini 3.1 Pro
Deep research / strategy	Ensemble	Gemini 3.1 Pro + Claude Sonnet 5
Document creation	Router	Claude Sonnet 5 (doc-grade)
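
The table above can be read as a simple lookup from task category to mode and engines. The sketch below encodes it that way. The category keys, the `Route` type, and the `route` function are hypothetical names chosen for illustration, and the article does not say how Air Max classifies a message into a category, so that step is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    mode: str                 # "Router" or "Ensemble"
    engines: tuple[str, ...]  # engines that run, in order


# Category keys are hypothetical; engine names are taken from the table.
ROUTES = {
    "quick_question_or_rewrite": Route("Router", ("fast Flash engine",)),
    "everyday_explanation": Route("Router", ("balanced Flash model",)),
    "code_or_focused_analysis": Route("Router", ("Gemini 3.1 Pro",)),
    "deep_research_or_strategy": Route(
        "Ensemble", ("Gemini 3.1 Pro", "Claude Sonnet 5")
    ),
    "document_creation": Route("Router", ("Claude Sonnet 5",)),
}


def route(category: str) -> Route:
    """Look up the mode and engines for an already-classified message."""
    return ROUTES[category]  # raises KeyError for an unknown category


plan = route("deep_research_or_strategy")
```

Only one row uses the Ensemble mode, which matches the point that the multi-model path is the exception rather than the default.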

Why combining models actually helps

Two models help for the same reason two experts help: they catch each other's blind spots. A model that is excellent at breadth may state something confidently that a deeper reasoning model would flag as wrong. When the second model treats the first's output as research to verify rather than truth to repeat, errors get caught before they reach you, and the final answer carries the strengths of both — coverage plus rigor.

This is why the ensemble is reserved for the roughly 20% of tasks that are genuinely hard. On a simple question there is nothing to reconcile, so running a second model would only add cost and latency for no gain. On a complex one, the cross-check is exactly where the quality comes from.

Powerful, but still cost-efficient

The instinct with a 'use every model' feature is to assume it must be expensive. Air Max is designed to be the opposite for normal use. Because it routes most messages to a single, appropriately sized engine, everyday questions cost about what they would on a mid-tier model, not a flagship. The multi-model ensemble, which is the pricier path, only fires when a task is genuinely worth it.

Billing is reconciled against the exact engines that actually ran, so the charge reflects the real work done plus a consistent AIR margin on every message. In practice, Air Max gives you flagship-grade answers on the questions that need them, and cheap, fast answers on the ones that don't, without you ever having to switch engines yourself.
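
To make the billing idea concrete, here is a small arithmetic sketch: sum the cost of the engines that ran for a message, then apply the margin. Every number and name in it is a placeholder assumption (the per-call costs, the 20% margin, and the engine keys), and modeling the margin as a fixed percentage is also an assumption. None of it reflects AIR's real prices.

```python
from decimal import Decimal

# Hypothetical per-call costs in arbitrary currency units (not real prices).
ENGINE_COST = {
    "flash": Decimal("0.001"),
    "pro": Decimal("0.010"),
    "sonnet": Decimal("0.015"),
}

# Hypothetical margin, modeled here as a fixed percentage of engine cost.
MARGIN = Decimal("0.20")


def charge(engines_run: list[str]) -> Decimal:
    """Bill for exactly the engines that ran, plus the margin."""
    base = sum((ENGINE_COST[e] for e in engines_run), Decimal("0"))
    return base * (1 + MARGIN)


router_charge = charge(["flash"])            # single-engine Router message
ensemble_charge = charge(["pro", "sonnet"])  # two-engine Ensemble message
```

Under these placeholder numbers the Ensemble message costs many times more than the Router message, which is why it makes sense to reserve the second engine for tasks that benefit from the cross-check.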