Five Stages from Request to Result
Every complex request follows a structured five-stage pipeline — from initial complexity analysis through planning, execution, validation, and final synthesis.
Discovery
The orchestrator analyzes the incoming request and classifies its complexity. This determines whether the request needs orchestration at all, or can be handled by a single agent.
🔬 Complexity Classification
Requests are classified as SIMPLE, MODERATE, or COMPLEX using LLM analysis against the available sub-agent registry.
🎯 Agent Matching
The classifier suggests which sub-agents are relevant, narrowing the field from 13 agents to only those needed.
⚡ Threshold Gating
If complexity falls below the configured threshold (e.g., MODERATE), the request bypasses orchestration entirely for faster response.
📊 Cost Estimation
Early complexity signal enables the model router to pre-select cost-appropriate models before planning begins.
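The gating logic above can be sketched in a few lines. This is a minimal illustration, not the orchestrator's actual code: the `Complexity` levels mirror the SIMPLE / MODERATE / COMPLEX labels, and the default threshold is an assumption standing in for the configured value.

```python
from enum import IntEnum

class Complexity(IntEnum):
    """Ordered complexity classes, so the gate is a simple comparison."""
    SIMPLE = 1
    MODERATE = 2
    COMPLEX = 3

def needs_orchestration(classified: Complexity,
                        threshold: Complexity = Complexity.MODERATE) -> bool:
    """Requests below the configured threshold bypass orchestration
    entirely and are handled by a single agent."""
    return classified >= threshold
```

Because the classes are ordered, raising or lowering the threshold in config is a one-value change with no branching logic to touch.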
Planning
The task planner decomposes the request into a directed acyclic graph (DAG) of tasks. Each task is assigned to a specific sub-agent with clear instructions and dependency chains.
🗺️ DAG Generation
Claude Opus generates a task graph with explicit dependency edges — enabling maximum parallelism while respecting data flow.
📝 Task Definitions
Each task includes a unique ID, assigned agent, detailed input prompt, and dependency list (depends_on).
🔄 Execution Mode Selection
The planner chooses the optimal execution mode: sequential, parallel, DAG, router, or iterative.
💾 Plan Persistence
The complete plan is persisted to PostgreSQL's orchestrator_plans table — enabling plan review, approval gates, and post-hoc analysis.
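To see how dependency edges translate into parallelism, here is a hedged sketch of layering a task DAG into concurrent "waves" (Kahn-style topological grouping). The `Task` fields follow the definitions above (`id`, agent, prompt, `depends_on`); the helper name and wave approach are illustrative assumptions, not the planner's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    id: str
    agent: str
    prompt: str
    depends_on: list = field(default_factory=list)

def parallel_waves(tasks):
    """Group tasks into waves: every task in a wave has all of its
    dependencies satisfied by earlier waves, so each wave can run
    fully in parallel while data flow is respected."""
    done, waves = set(), []
    remaining = {t.id: t for t in tasks}
    while remaining:
        wave = [t for t in remaining.values() if set(t.depends_on) <= done]
        if not wave:
            raise ValueError("cycle detected: plan is not a DAG")
        waves.append([t.id for t in wave])
        done |= {t.id for t in wave}
        for t in wave:
            del remaining[t.id]
    return waves
```

For a plan where `analyze` and `chart` both depend on `research`, and `report` depends on both, this yields three waves, with the two middle tasks running concurrently.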
Action
Sub-agents are spawned and execute their assigned tasks. Independent tasks run concurrently (up to the configured limit), while dependent tasks wait for their predecessors.
🚀 Concurrent Spawning
Up to 5 sub-agents run simultaneously via asyncio.Semaphore. Each gets its own workspace, conversation history, and MCP tool sandbox.
🔧 MCP Tool Execution
Each sub-agent accesses only the MCP servers listed in its YAML config — Python executor, filesystem, web search, etc.
📈 Token Tracking
Every LLM call and tool execution is logged to subagent_executions with discrete token columns (prompt, completion, total) and duration.
⏱️ Timeout Protection
Per-task timeouts via asyncio.wait_for() prevent runaway sub-agents. Configurable per sub-agent in YAML.
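The concurrency and timeout mechanics above can be sketched together. This is a simplified stand-in, with `fake_subagent` replacing a real sub-agent call, and the timeout value hard-coded where the real system would read it from per-agent YAML; the semaphore bound matches the documented limit of 5.

```python
import asyncio

MAX_CONCURRENT = 5   # matches the documented concurrency limit
TASK_TIMEOUT = 0.2   # assumption: real timeouts come from per-agent YAML

async def fake_subagent(task_id: str, duration: float) -> str:
    """Stand-in for a spawned sub-agent doing work."""
    await asyncio.sleep(duration)
    return f"{task_id}: done"

async def run_wave(tasks):
    """Run independent tasks concurrently, bounded by a semaphore,
    with asyncio.wait_for guarding against runaway sub-agents."""
    sem = asyncio.Semaphore(MAX_CONCURRENT)

    async def guarded(task_id, duration):
        async with sem:
            try:
                return await asyncio.wait_for(
                    fake_subagent(task_id, duration), timeout=TASK_TIMEOUT)
            except asyncio.TimeoutError:
                return f"{task_id}: timed out"

    return await asyncio.gather(*(guarded(t, d) for t, d in tasks))

results = asyncio.run(run_wave([("t1", 0.01), ("t2", 0.01), ("t3", 5.0)]))
```

A task that exceeds its timeout is converted into a failure result rather than blocking the wave, so the evaluator can later flag it for retry.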
Validation
The evaluator sub-agent reviews all outputs against the original request. Each task is scored for completeness, accuracy, and relevance — with an overall pass/fail determination.
✅ Per-Task Scoring
Each task receives a 0.0–1.0 score with specific issue descriptions. Scores are persisted to plan_tasks for analytics.
🔄 Retry on Failure
If evaluation fails, the orchestrator can retry specific tasks up to max_retries times with feedback from the evaluator.
📋 Structured Output
The evaluator returns structured JSON with overall_pass, per-task evaluations, and a summary assessment.
💾 Score Persistence
Evaluation results are written to orchestrator_plans (overall) and plan_tasks (per-task) for plan grading and quality trends.
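The retry-with-feedback loop described above can be sketched as follows. Everything here is illustrative: `execute` and `evaluate` are stand-ins for the sub-agent call and the evaluator's structured verdict, and the 0.7 pass threshold is an assumed value, not a documented one.

```python
PASS_THRESHOLD = 0.7   # assumption: the actual pass bar is configurable

def run_with_retries(task, execute, evaluate, max_retries=2):
    """Execute a task, retrying with the evaluator's issue list fed
    back in, until it passes or retries are exhausted. `evaluate`
    returns a structured verdict like {"score": 0.4, "issues": [...]}."""
    feedback = None
    for _ in range(max_retries + 1):
        output = execute(task, feedback)
        verdict = evaluate(task, output)
        if verdict["score"] >= PASS_THRESHOLD:
            return output, verdict
        feedback = verdict["issues"]   # fed into the next attempt
    return output, verdict             # best effort after retries exhausted
```

Feeding the evaluator's specific issue descriptions back into the retry prompt gives the sub-agent a concrete target, rather than a blind re-run.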
Finalization
The orchestrator synthesizes all sub-agent results into a single, coherent response. The user never sees the orchestration — they receive a polished, unified answer.
🔗 Result Synthesis
All sub-agent outputs are merged into a clear, well-structured response. Failed tasks are acknowledged gracefully.
📎 File Aggregation
Generated files (reports, presentations, charts) from all sub-agents are collected and exposed via the download endpoint.
💰 Cost Summary
Total token usage, model costs, and execution timing are computed and persisted to the plan record.
📊 Observability
The complete execution trace — plan, tasks, evaluations, costs — is available for analytics, debugging, and optimization.
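Aggregating the per-execution token columns into a plan-level cost summary might look like the sketch below. The shape of each execution record follows the discrete columns named earlier (prompt, completion, duration); the price table and its figures are placeholders, not real model rates.

```python
def cost_summary(executions, prices):
    """Roll per-execution token counts up into plan-level totals.
    `prices` maps model name to (prompt, completion) USD per 1K
    tokens; values are illustrative placeholders."""
    totals = {"prompt": 0, "completion": 0, "total": 0,
              "cost_usd": 0.0, "duration_s": 0.0}
    for e in executions:
        p_rate, c_rate = prices[e["model"]]
        totals["prompt"] += e["prompt_tokens"]
        totals["completion"] += e["completion_tokens"]
        totals["total"] += e["prompt_tokens"] + e["completion_tokens"]
        totals["cost_usd"] += (e["prompt_tokens"] / 1000 * p_rate
                               + e["completion_tokens"] / 1000 * c_rate)
        totals["duration_s"] += e["duration_s"]
    return totals
```

Persisting this rollup on the plan record is what makes per-plan cost trends queryable without re-scanning every execution row.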
Structured Process.
Reliable Results.
Five phases ensure every complex request is decomposed, executed, validated, and synthesized with full observability.
Explore Execution Methods →