Behind the scenes: how Forese routes a shot to the right model
Twelve providers, dozens of models, and one prompt. A look at how Forese picks the right camera for the job — and what happens when the first choice fails.
Every shot you generate in Forese passes through a single router. You pick a model from the picker, see an estimated cost, and click Generate. What happens next is a four-stage flow we built across Phases 04–08 of the build.
Stage 1 — Estimate + hold
Before we call any provider, we estimate the cost in millicredits, place a credit hold against the org's accounts in priority order (free, then plan, then top-up), and persist the generations row with status pending. If the org can't cover the estimate, we throw insufficient_balance at this stage, so the user sees a clean upgrade CTA rather than a half-done generation.
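To make the priority-ordered hold concrete, here is a minimal sketch of how an estimate might be split across accounts. The names Account, HoldSlice, and placeHold are hypothetical and not taken from the Forese codebase; only the free, plan, top-up order and the insufficient_balance error name come from the description above.

```typescript
// Hypothetical types for illustration; the real schema is not shown in this post.
type AccountKind = "free" | "plan" | "topup";

interface Account {
  kind: AccountKind;
  availableMillicredits: number;
}

interface HoldSlice {
  kind: AccountKind;
  amountMillicredits: number;
}

// Priority order described in the post: free, then plan, then top-up.
const PRIORITY: readonly AccountKind[] = ["free", "plan", "topup"];

// Pure planning function: decides how much to hold on each account.
// It does not mutate the accounts; persisting the hold is left to the caller.
function placeHold(accounts: Account[], estimateMillicredits: number): HoldSlice[] {
  const slices: HoldSlice[] = [];
  let remaining = estimateMillicredits;

  for (const kind of PRIORITY) {
    if (remaining === 0) break;
    const account = accounts.find((a) => a.kind === kind);
    if (!account || account.availableMillicredits <= 0) continue;
    const amount = Math.min(account.availableMillicredits, remaining);
    slices.push({ kind, amountMillicredits: amount });
    remaining -= amount;
  }

  if (remaining > 0) {
    // Fail before any provider is called, so nothing is half-done.
    throw Object.assign(
      new Error(`insufficient_balance: short by ${remaining} millicredits`),
      { code: "insufficient_balance" },
    );
  }
  return slices;
}
```

In this sketch the function only plans the hold. Writing the hold and the pending generations row in one transaction would be a natural next step, but that part is an assumption rather than something the post states.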
Stage 2 — Provider call
The Provider Abstraction Layer (PAL) gives every model the same submit/poll/normalize shape. We call provider.submit(input, ctx), then either wait on wait.forToken (webhook transport) or call provider.poll() on an interval (poll transport). Both paths converge on a normalized result.
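The shared shape could look roughly like the sketch below. The interface, the AcmeRaw payload, the acmeProvider object, and the pollUntilDone helper are all hypothetical and written only for illustration. The three status names come from this post, while the reason values and the choice to treat a poll timeout as retryable are assumptions.

```typescript
// Normalized outcome every provider converges on (status names from the post).
type NormalizedResult =
  | { status: "succeeded"; outputUrl: string }
  | { status: "failed_terminal"; reason: "invalid_input" | "content_policy"; message: string }
  | { status: "failed_retryable"; message: string };

interface SubmitContext {
  generationId: string;
}

// Hypothetical PAL surface: every model implements the same three methods.
interface Provider<Input, Raw> {
  readonly id: string;
  readonly transport: "webhook" | "poll";
  submit(input: Input, ctx: SubmitContext): Promise<{ jobId: string }>;
  poll(jobId: string): Promise<Raw | null>; // null means "still running"
  normalize(raw: Raw): NormalizedResult;
}

// Made-up raw payload for an imaginary provider.
interface AcmeRaw {
  state: "done" | "rejected" | "error";
  url?: string;
  code?: string;
  detail?: string;
}

const acmeProvider: Provider<{ prompt: string }, AcmeRaw> = {
  id: "acme-video",
  transport: "poll",
  async submit(input, ctx) {
    // A real provider would make an HTTP call here; this stub just builds an id.
    return { jobId: `${ctx.generationId}:${input.prompt.length}` };
  },
  async poll(_jobId) {
    return { state: "done", url: "https://example.invalid/out.mp4" };
  },
  normalize(raw) {
    if (raw.state === "done" && raw.url) {
      return { status: "succeeded", outputUrl: raw.url };
    }
    if (raw.state === "rejected") {
      return {
        status: "failed_terminal",
        reason: raw.code === "policy" ? "content_policy" : "invalid_input",
        message: raw.detail ?? "rejected",
      };
    }
    // Anything else (including "done" without a URL) is treated as transient.
    return { status: "failed_retryable", message: raw.detail ?? "transient provider error" };
  },
};

// Poll transport: call poll() on an interval until the provider returns a payload.
async function pollUntilDone<I, R>(
  provider: Provider<I, R>,
  jobId: string,
  intervalMs: number,
  maxAttempts: number,
): Promise<NormalizedResult> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const raw = await provider.poll(jobId);
    if (raw !== null) return provider.normalize(raw);
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  // Assumption: running out of attempts counts as a retryable failure.
  return { status: "failed_retryable", message: "poll timed out" };
}
```

The point of the sketch is that the caller never inspects AcmeRaw directly. Whatever a provider returns, the router only sees a NormalizedResult.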
Stage 3 — Branch on outcome
The normalized result is one of:
succeeded: break out to the success path.
failed_terminal: invalid input or content-policy rejection. No retry; we either charge-and-explain (policy) or refund (input).
failed_retryable: transient. Try the next fallback provider.
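The three-way branch can be sketched as a small pure function. RouterAction and branchOnOutcome are hypothetical names, and the behaviour when no fallback provider is left (a refund here) is an assumption, since the post does not say what happens in that case.

```typescript
// Same normalized shape as described above; the reason values are assumed.
type NormalizedResult =
  | { status: "succeeded"; outputUrl: string }
  | { status: "failed_terminal"; reason: "invalid_input" | "content_policy"; message: string }
  | { status: "failed_retryable"; message: string };

// Hypothetical set of next steps the router can take.
type RouterAction =
  | { kind: "finish_success" }
  | { kind: "charge_and_explain" }
  | { kind: "refund"; why: "invalid_input" | "fallbacks_exhausted" }
  | { kind: "try_fallback"; providerId: string };

function branchOnOutcome(
  result: NormalizedResult,
  remainingFallbacks: readonly string[],
): RouterAction {
  switch (result.status) {
    case "succeeded":
      return { kind: "finish_success" };
    case "failed_terminal":
      // No retry: policy rejections are charged and explained, bad input is refunded.
      return result.reason === "content_policy"
        ? { kind: "charge_and_explain" }
        : { kind: "refund", why: "invalid_input" };
    case "failed_retryable": {
      const next = remainingFallbacks[0];
      // Assumption: with no fallback left, the hold is refunded.
      return next !== undefined
        ? { kind: "try_fallback", providerId: next }
        : { kind: "refund", why: "fallbacks_exhausted" };
    }
  }
}
```

Keeping the decision separate from the side effects makes each branch easy to unit test. The router loop would then call the chosen provider, or settle or refund, based on the returned action.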
Stage 4 — Stitch
On success: download → upload to R2 → create a Mux asset → settle the credit hold against the original estimate. On terminal failure: refund proportionally. At every step, we record a generation_event so you can audit the run later.
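As a rough illustration of the success path and its audit trail, the sketch below runs the four steps in order and records one event per step. The step names, the GenerationEvent fields, and runStitch itself are hypothetical, and stopping at the first failing step is an assumption rather than documented behaviour.

```typescript
// Hypothetical step names mirroring the success path described above.
type StitchStep = "download" | "upload_r2" | "create_mux_asset" | "settle_hold";

interface GenerationEvent {
  generationId: string;
  step: StitchStep;
  outcome: "ok" | "error";
  detail?: string;
}

type StepFn = () => Promise<void>;

const STEP_ORDER: readonly StitchStep[] = [
  "download",
  "upload_r2",
  "create_mux_asset",
  "settle_hold",
];

// Runs each step in order and records an event for every step attempted.
// Returns true if all steps succeeded, false if one failed.
async function runStitch(
  generationId: string,
  steps: Record<StitchStep, StepFn>,
  record: (event: GenerationEvent) => void,
): Promise<boolean> {
  for (const step of STEP_ORDER) {
    try {
      await steps[step]();
      record({ generationId, step, outcome: "ok" });
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      record({ generationId, step, outcome: "error", detail });
      return false; // Assumption: stop here and let the caller decide how to recover.
    }
  }
  return true;
}
```

Because the step functions are injected, the same loop can be exercised in tests with stubs, and the recorded events give a per-step trace for each generation.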
What's different about this design
Most "multi-provider" systems wire each model into the call site. We wire them into the router, so every model speaks the same surface. Adding a new provider is one Zod schema, one submit function, one normalize function — and it gets free retries, free fallback, free billing, free auditing.
If you're curious about the codebase, the relevant files are packages/providers/src/router.ts and packages/trigger/src/tasks/generation/submit-generation.ts. The whole flow is ~600 lines.