DEV Community • 2026-04-17 19:05
The Invisible Orchestrator: Cheap Routing + Expensive Reasoning in Multi-Agent Apps
The Problem
We had four specialist AI agents — math, verbal, data insights, and strategy — each with a different system prompt, RAG namespace, and reasoning style. Every user message needed to land on the right one.
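To make that setup concrete, here is a minimal sketch of how the four specialists might be described in code. Only the four agent roles and the three things that vary between them (system prompt, RAG namespace, reasoning style) come from the text above. The `Specialist` class, the `get_specialist` helper, and every prompt, namespace, and style string are hypothetical placeholders, not the actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specialist:
    """One specialist agent. All field values below are illustrative placeholders."""
    name: str
    system_prompt: str
    rag_namespace: str
    reasoning_style: str


# Hypothetical registry keyed by agent name; the real prompts and namespaces are not given.
SPECIALISTS = {
    s.name: s
    for s in (
        Specialist("math", "You are a math specialist.", "rag-math", "step-by-step"),
        Specialist("verbal", "You are a verbal specialist.", "rag-verbal", "close-reading"),
        Specialist("data_insights", "You are a data insights specialist.", "rag-data", "evidence-first"),
        Specialist("strategy", "You are a strategy specialist.", "rag-strategy", "plan-then-act"),
    )
}


def get_specialist(name: str) -> Specialist:
    """Look up a specialist by name, failing loudly on an unknown route."""
    try:
        return SPECIALISTS[name]
    except KeyError:
        raise ValueError(f"unknown specialist: {name!r}") from None
```

Whatever does the routing only has to produce one of these four keys. Everything agent-specific hangs off the registry entry, so the router itself can stay small.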
The naive solution: run every message through GPT-4o, ask it to pick a specialist, then call that specialist. That added 800–1,200 ms of latency before the user saw a single token. On...