FAQ

How many states should a typical outbound booking flow have?

Most booking flows need 8–14 states: opening, rapport-build, qualification (2–3 questions), objection handling (2–3 variants), booking confirmation, callback scheduling, and call-end. Fewer than 8 usually means you're collapsing qualification, which tanks conversion. More than 14 usually means you're over-engineering paths that callers never take.

Can I use the LLM to decide state transitions instead of a rule-based flow?

Yes, and for complex conversations it often performs better than exhaustive rule trees. The trade-off: LLM-driven transitions are harder to debug, more expensive per call, and can behave unexpectedly at edge cases. The pattern we use is a hybrid: the LLM handles intent classification and natural-language generation within each state; the state machine handles which state comes next. This keeps costs manageable and keeps the flow auditable.

How do I test a call-flow JSON before going live?

Two approaches. (1) Unit-test each state transition with synthetic intents: write JSON fixtures for every expected caller response and assert the correct next state. (2) Simulate full calls against a mock telephony endpoint (Retell's test mode, or a local VAPI sandbox) with scripted caller responses. We run at least 50 scripted scenarios before any live traffic hits a new flow revision.
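As a rough sketch of the scripted-scenario approach, the Python below walks a list of caller intents through a miniature flow and asserts the path taken. The `walk` helper, the `FLOW` fixture, and the scenario list are illustrative assumptions, not any platform's API; the state and intent names simply mirror the examples in this article.

```python
# Hypothetical miniature flow used only as a test fixture.
FLOW = {
    "entry_state": "opening",
    "states": {
        "opening": {
            "transitions": {
                "yes_proceed": "qualification_q1",
                "busy_callback": "schedule_callback",
                "not_interested": "graceful_exit",
                "default": "clarify_intent",
            }
        },
        "qualification_q1": {
            "transitions": {
                "no_manual": "qualification_q2_pain",
                "not_interested": "graceful_exit",
                "default": "rephrase_q1",
            }
        },
    },
}


def walk(flow, intents):
    """Feed scripted caller intents through the flow; return the states visited."""
    state = flow["entry_state"]
    path = [state]
    for intent in intents:
        transitions = flow["states"].get(state, {}).get("transitions")
        if transitions is None:
            break  # reached a state this fixture flow does not define
        state = transitions.get(intent, transitions["default"])
        path.append(state)
    return path


# Each scenario: (scripted intents, expected path through the flow).
SCENARIOS = [
    (["yes_proceed", "no_manual"],
     ["opening", "qualification_q1", "qualification_q2_pain"]),
    (["not_interested"], ["opening", "graceful_exit"]),
    (["mumble"], ["opening", "clarify_intent"]),  # unknown intent hits default
]

for intents, expected in SCENARIOS:
    assert walk(FLOW, intents) == expected, (intents, walk(FLOW, intents))
```

In practice the scenarios would live in JSON fixture files next to the flow, so a flow change and its expected paths are reviewed together.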

What's the difference between an intent and a state in a voice agent flow?

An intent is what the caller said or meant (e.g. 'interested', 'objecting-price', 'asking-for-human'). A state is where the agent is in the conversation (e.g. 'qualification_q2', 'objection_handling', 'transfer_pending'). Intents trigger state transitions; states determine what the agent says next. Conflating the two is the most common flow-design mistake.

Call-Flow Design for Voice Agents: JSON Blueprints That Ship

A client's outbound booking agent was converting at 23%. Then it dropped to 8% overnight. The LLM hadn't changed. The TTS voice hadn't changed. The STT model hadn't changed. It took four hours to find the issue: someone had edited the flow JSON during a "quick fix" and deleted the three states handling price objections. objection_price, acknowledge_concern, retry_value_prop — gone. Fifteen percentage points of conversion disappeared with them.

The lesson isn't that flow JSON is fragile, though it is. It's that the call flow is the agent's decision-making spine. Everything else — the voice, the model, the latency tuning — serves it. Most teams spend weeks on STT accuracy and an afternoon on flow design. That's backwards.

Why the call flow is where agents succeed or fail

A voice agent's behaviour at runtime is determined by three things: which state it's currently in, which intent it classified from the caller's last utterance, and which next state the transition table maps that pair to. The LLM generates natural-sounding language within each state; the state machine decides the overall shape of the conversation.

When a caller objects to price on the third call of an outbound sequence, you need that objection handled with a specific counter-argument — not a generic "I understand your concerns." That specific response only exists if you designed a handle_price_objection state, with a dedicated prompt and a concrete rebuttal, that the transition logic routes into correctly.

The stack (Twilio, Deepgram, ElevenLabs, Retell) executes whatever the flow dictates. A well-designed stack running a poorly designed flow will fail. A slightly imperfect stack running a well-designed flow will ship.

The anatomy of a production call-flow JSON

Most voice agent platforms accept flows as JSON or YAML. The exact schema varies — Retell AI's flow format, VAPI's workflow schema, and Bland each have their own structures — but the logical architecture is universal: a collection of states, each with a prompt, a set of expected intents, and a transition table mapping intents to next states.

Here's a real example from a UK mortgage broker's outbound pre-qualification campaign:

{
  "version": "2.1",
  "entry_state": "opening",
  "global_context": {
    "agent_name": "Aria",
    "company_name": "{{company_name}}",
    "lead_context": "{{lead_product_interest}}"
  },
  "states": {
    "opening": {
      "prompt": "Hi {{lead_first_name}}, this is Aria calling from {{company_name}}. Do you have 90 seconds? I'm calling about {{lead_context}}.",
      "intents": ["yes_proceed", "busy_callback", "not_interested", "asking_who"],
      "transitions": {
        "yes_proceed": "qualification_q1",
        "busy_callback": "schedule_callback",
        "not_interested": "graceful_exit",
        "asking_who": "clarify_identity",
        "default": "clarify_intent"
      },
      "timeout_state": "re_engage",
      "timeout_seconds": 6
    },
    "qualification_q1": {
      "prompt": "Great. Quick question — are you currently using any automated follow-up for your inbound enquiries, or is it mostly manual at the moment?",
      "intents": ["yes_automated", "no_manual", "partial_automated", "ask_why", "not_interested"],
      "transitions": {
        "yes_automated": "qualification_q2_existing",
        "no_manual": "qualification_q2_pain",
        "partial_automated": "qualification_q2_existing",
        "ask_why": "explain_reason",
        "not_interested": "graceful_exit",
        "default": "rephrase_q1"
      }
    },
    "objection_price": {
      "prompt": "That's a fair point — and I get it. We work with a flat £2,400 setup fee, no monthly licence. For most clients that's less than one month of a junior SDR's time, without the ramp-up. Does that change the picture at all?",
      "intents": ["still_no", "reconsidering", "wants_more_info", "wants_human"],
      "transitions": {
        "still_no": "graceful_exit",
        "reconsidering": "book_call",
        "wants_more_info": "send_case_study",
        "wants_human": "transfer_to_human",
        "default": "re_engage"
      }
    }
  }
}

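The essential properties per state: a prompt (what the agent says on entry), an intents list (what the classifier listens for), a transitions map with a default entry for anything unexpected, and a timeout_state (with timeout_seconds) for caller silence. The example is an excerpt: only opening shows the timeout fields, and the states that transitions point to, such as graceful_exit and clarify_intent, are omitted here. The global_context block is injected by your CRM at call initiation.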
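To make the lookup order concrete, here is a minimal sketch of how a runtime might resolve the next state from a flow shaped like the one above. The `next_state` function is an assumption for illustration; real platforms implement this step internally and their schemas differ.

```python
import json

# A trimmed copy of the "opening" state from the example flow.
FLOW_JSON = """
{
  "entry_state": "opening",
  "states": {
    "opening": {
      "intents": ["yes_proceed", "busy_callback", "not_interested", "asking_who"],
      "transitions": {
        "yes_proceed": "qualification_q1",
        "busy_callback": "schedule_callback",
        "not_interested": "graceful_exit",
        "asking_who": "clarify_identity",
        "default": "clarify_intent"
      },
      "timeout_state": "re_engage",
      "timeout_seconds": 6
    }
  }
}
"""


def next_state(flow, current, intent=None, timed_out=False):
    """Resolve the next state: silence first, then the intent, then default."""
    state = flow["states"][current]
    transitions = state["transitions"]
    if timed_out:
        # Fall back to the default transition if no timeout_state is declared.
        return state.get("timeout_state", transitions["default"])
    return transitions.get(intent, transitions["default"])


flow = json.loads(FLOW_JSON)
print(next_state(flow, "opening", "yes_proceed"))   # qualification_q1
print(next_state(flow, "opening", "gibberish"))     # clarify_intent
print(next_state(flow, "opening", timed_out=True))  # re_engage
```

Because the function is a pure lookup, every transition the agent takes can be logged and replayed, which is what keeps the flow auditable.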

State machines: designing intent-to-action mappings

The hardest part of flow design isn't writing prompts. It's mapping the space of possible caller responses onto a finite set of intents. Get this wrong and your agent falls through to the default transition constantly.

Be specific. "interested" is too broad. Break it into "interested_wants_demo", "interested_wants_pricing", "interested_wants_case_study". The intent classifier — typically a small model or an LLM call with a constrained output schema — can handle the nuance; what it cannot do is infer which specific "interested" the caller meant when you've only given it one option.
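One way to enforce the constrained output is to normalise whatever the classifier returns against the current state's declared intents, and treat anything else as default. The sketch below assumes the classifier hands back a plain string label; `normalise_intent` is a hypothetical helper, and the classifier call itself is not shown.

```python
def normalise_intent(raw_label, allowed_intents):
    """Map a classifier's raw label onto a declared intent, or 'default'."""
    label = raw_label.strip().lower().replace("-", "_").replace(" ", "_")
    return label if label in allowed_intents else "default"


ALLOWED = [
    "interested_wants_demo",
    "interested_wants_pricing",
    "interested_wants_case_study",
]

print(normalise_intent(" Interested-Wants-Demo ", ALLOWED))  # interested_wants_demo
print(normalise_intent("interested", ALLOWED))               # default
```

Logging every label that lands in default is a cheap way to discover which intents the flow is missing.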

Model objections explicitly. For outbound calling, objections are the majority of what you'll hear. Map them individually: objection_price, objection_timing, objection_not_decision_maker, objection_using_competitor. Each deserves its own handling state with a specific rebuttal. Generic "I understand your concern" responses halve your conversion on objection-heavy campaigns.

Always have graceful_exit and transfer_to_human, reachable from any state. Callers who feel trapped are a GDPR risk and a PR risk, and they will never convert. The UK Information Commissioner's Office recommends that automated calling systems provide a clear mechanism to exit, so don't bury it three states deep.
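"Reachable from any state" is a property you can check mechanically. The sketch below runs a breadth-first search from each state and reports which required exits it cannot reach. The function names and the `STATES` fixture are assumptions for illustration.

```python
from collections import deque


def reachable(states, start):
    """Return every state name reachable from `start` via transitions."""
    seen = {start}
    queue = deque([start])
    while queue:
        spec = states.get(queue.popleft(), {})
        targets = set(spec.get("transitions", {}).values())
        if "timeout_state" in spec:
            targets.add(spec["timeout_state"])
        for target in targets - seen:
            seen.add(target)
            queue.append(target)
    return seen


def states_missing_exits(states, required=("graceful_exit", "transfer_to_human")):
    """Map each non-exit state to the required exits it cannot reach."""
    report = {}
    for name in states:
        if name in required:
            continue
        missing = sorted(set(required) - reachable(states, name))
        if missing:
            report[name] = missing
    return report


# Hypothetical flow: only "opening" offers a route to a human.
STATES = {
    "opening": {"transitions": {
        "yes_proceed": "qualification_q1",
        "not_interested": "graceful_exit",
        "wants_human": "transfer_to_human",
        "default": "opening",
    }},
    "qualification_q1": {"transitions": {
        "no_manual": "book_call", "default": "qualification_q1"}},
    "book_call": {"transitions": {"default": "graceful_exit"}},
    "graceful_exit": {"transitions": {}},
    "transfer_to_human": {"transitions": {}},
}

print(states_missing_exits(STATES))
# {'qualification_q1': ['transfer_to_human'], 'book_call': ['transfer_to_human']}
```

Reachability is a weaker guarantee than a one-hop exit. If you want the exits available directly from every state, check each state's own transitions map instead.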

Handling objections, silence, and unexpected inputs

Each objection routes into its own response state. The objection_price state should address price with a specific data point, not empathy. Empathy alone converted at 12% in our A/B tests; empathy plus a concrete comparison converted at 31%.

Silence — the caller said nothing for N seconds — is different from not_interested. Our pattern: a 5-second timeout triggers a soft re-engagement prompt ("Sorry, I think the line may have dropped — still there?"), then a second 4-second timeout routes to graceful_exit. Two attempts, not one, not five.
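The two-attempt pattern fits in a few lines. This is a sketch under the assumption that the runtime counts consecutive silence timeouts and resets the count whenever the caller speaks; `on_silence` and the state names are illustrative.

```python
# Seconds to wait before the first and second silence timeouts.
SILENCE_TIMEOUTS = [5, 4]


def on_silence(silence_count):
    """Decide what to do after a silence timeout fires.

    silence_count is the number of consecutive timeouts so far, including
    this one. Returns (next_state, seconds_to_wait_before_next_timeout).
    """
    if silence_count < len(SILENCE_TIMEOUTS):
        return "re_engage", SILENCE_TIMEOUTS[silence_count]
    return "graceful_exit", None


print(on_silence(1))  # ('re_engage', 4)
print(on_silence(2))  # ('graceful_exit', None)
```

Keeping the timeouts in one list makes the "two attempts" policy a single reviewable value rather than logic scattered across states.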

Unexpected input should route to clarify_intent ("Just to make sure I've got that right — are you saying yes, or would you prefer I called back at a better time?"), not to graceful_exit. Agents that exit on any unrecognised input convert poorly.

Transfer logic: when and how to hand off

Transfer-to-human is a state, not an emergency exit. Design it with the same care as qualification.

Decide the trigger: did the caller request a human explicitly, or did your flow logic determine they're qualified for a live conversation? Explicit requests should be honoured immediately. Flow-triggered transfers can include a brief handoff message before Twilio's <Dial> verb fires.

Pass context before the human picks up. Whatever the agent learned — qualification answers, objections raised, sentiment signals — should arrive at the human agent's screen before the caller does. We use a Redis-backed call-state object, POSTed to a webhook, that populates a browser notification in the CRM. The agent who picks up knows the lead's name, what they want, and what objection they raised — before saying hello.
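A minimal sketch of the call-state object might look like the following. The field names are assumptions, not our production schema, and the Redis write and webhook POST are deliberately left out so the example stays self-contained.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CallState:
    """Hypothetical shape of what the agent has learned during the call."""
    call_id: str
    lead_name: str
    qualification_answers: dict = field(default_factory=dict)
    objections_raised: list = field(default_factory=list)
    transfer_trigger: str = "caller_requested"  # or "flow_qualified"


def handoff_payload(state):
    """Serialise the call state as the JSON body for the CRM webhook."""
    return json.dumps(asdict(state), sort_keys=True)


state = CallState(
    call_id="call-001",
    lead_name="James",
    qualification_answers={"qualification_q1": "no_manual"},
    objections_raised=["objection_price"],
)
payload = handoff_payload(state)
```

Whatever transport you use, send the payload before the transfer is initiated so the context reaches the human's screen first.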

For the full transfer architecture, see Voice Agent Transfer to Human: Designing Handoffs That Don't Lose Deals.

Retry logic and call-end conditions

Rephrase retries: if the intent classifier returns unclear twice on the same question, rephrase with a simpler version. Asking the same question identically three times loses callers.

State loop prevention: track visit counts per state. Three visits to the same state without progression signals a broken path — route to graceful_exit.

Maximum turns: we set a hard cap at 25 turns for most outbound flows. Beyond that, route to graceful_exit regardless of current state. Long wandering calls very rarely convert and often generate complaints.
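Loop prevention and the turn cap can share one guard that sits between the transition lookup and the state entry. The sketch below is a simplification: it counts total visits per state rather than visits without progression, and `FlowGuard` is a hypothetical name.

```python
from collections import Counter

MAX_VISITS_PER_STATE = 3
MAX_TURNS = 25


class FlowGuard:
    """Overrides a proposed transition when a loop or turn limit is hit."""

    def __init__(self):
        self.visits = Counter()
        self.turns = 0

    def check(self, proposed_state):
        """Return the state to enter: the proposed one, or graceful_exit."""
        self.turns += 1
        self.visits[proposed_state] += 1
        if self.turns > MAX_TURNS:
            return "graceful_exit"
        if self.visits[proposed_state] >= MAX_VISITS_PER_STATE:
            return "graceful_exit"
        return proposed_state


guard = FlowGuard()
print(guard.check("rephrase_q1"))  # rephrase_q1
print(guard.check("rephrase_q1"))  # rephrase_q1
print(guard.check("rephrase_q1"))  # graceful_exit (third visit)
```

Create one guard per call. Sharing it across calls would trip the limits almost immediately.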

Version-controlling your flow JSON

The opening story (a "quick fix" deleting three states and costing 15 points of conversion) was a version-control failure, not a design failure, and incidents like it almost always are. Flow JSON needs to be treated like application code: version-controlled, reviewed, and tested before deployment.

The minimal setup: store your flow JSON in a Git repository, require pull-request reviews for any changes to production flows, and use a CI step that runs your synthetic call scenarios before merge. This adds 30 minutes to any flow change and has saved every client who's done it from at least one production incident.
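Alongside the synthetic scenarios, a cheap structural check in CI catches transitions that point at states which no longer exist. The sketch below assumes the flow schema used in this article; `dangling_targets` is an illustrative helper, not part of any platform's tooling.

```python
def dangling_targets(flow):
    """List (state, trigger, target) tuples whose target state is undefined."""
    states = flow["states"]
    problems = []
    if flow.get("entry_state") not in states:
        problems.append(("<flow>", "entry_state", flow.get("entry_state")))
    for name, spec in states.items():
        for trigger, target in spec.get("transitions", {}).items():
            if target not in states:
                problems.append((name, trigger, target))
        timeout_target = spec.get("timeout_state")
        if timeout_target is not None and timeout_target not in states:
            problems.append((name, "timeout_state", timeout_target))
    return problems


# Hypothetical broken flow: objection_price was deleted but is still referenced.
BROKEN = {
    "entry_state": "opening",
    "states": {
        "opening": {"transitions": {
            "objecting_price": "objection_price",
            "default": "graceful_exit",
        }},
        "graceful_exit": {"transitions": {}},
    },
}

problems = dangling_targets(BROKEN)
print(problems)  # [('opening', 'objecting_price', 'objection_price')]
```

In a CI job, a non-empty result would fail the build. Note that this only catches deletions that leave references behind; if the references are deleted too, the scenario tests are what catch the regression.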

Beyond basic version control, tag your flow versions. When you push a new flow to production, record the Git commit SHA, the timestamp, and the person who approved it. When conversion drops, you can diff the current flow against the previous version in under a minute. Without this, you're debugging by memory.
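A plain `git diff` works, but a state-level summary is faster to read during an incident. This is a minimal sketch assuming both versions have already been loaded from JSON into dictionaries; `diff_states` and the two sample flows are illustrative.

```python
def diff_states(previous, current):
    """Summarise which states were removed, added, or changed between versions."""
    prev, curr = previous["states"], current["states"]
    return {
        "removed": sorted(set(prev) - set(curr)),
        "added": sorted(set(curr) - set(prev)),
        "changed": sorted(
            name for name in set(prev) & set(curr) if prev[name] != curr[name]
        ),
    }


# Hypothetical versions reproducing the incident from the opening story.
PREVIOUS = {"states": {
    "opening": {"prompt": "Hi"},
    "objection_price": {"prompt": "..."},
    "acknowledge_concern": {"prompt": "..."},
    "retry_value_prop": {"prompt": "..."},
}}
CURRENT = {"states": {"opening": {"prompt": "Hi"}}}

summary = diff_states(PREVIOUS, CURRENT)
print(summary["removed"])
# ['acknowledge_concern', 'objection_price', 'retry_value_prop']
```

A summary like this in the pull-request description makes a deleted state hard to miss in review.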

For teams running multiple flows in parallel — different flows for different products, regions, or lead scores — use a naming convention that encodes the campaign, version, and date: mortgage_outbound_v4_2026-04.json. Ambiguous filenames like flow_final_v2_ACTUAL.json are a sign that version control has already broken down. We've seen both on the same project — in the same week.
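If you want the convention enforced rather than remembered, a small check can run alongside the other CI steps. The pattern below is one possible reading of the campaign_vN_YYYY-MM.json convention and is an assumption, as is the `parse_flow_filename` helper.

```python
import re

# campaign (lowercase words joined by underscores), version, then year-month.
FLOW_FILENAME = re.compile(
    r"(?P<campaign>[a-z0-9]+(?:_[a-z0-9]+)*)_v(?P<version>\d+)_(?P<date>\d{4}-\d{2})\.json"
)


def parse_flow_filename(filename):
    """Return the campaign, version and date parts, or None if non-conforming."""
    match = FLOW_FILENAME.fullmatch(filename)
    return match.groupdict() if match else None


print(parse_flow_filename("mortgage_outbound_v4_2026-04.json"))
# {'campaign': 'mortgage_outbound', 'version': '4', 'date': '2026-04'}
print(parse_flow_filename("flow_final_v2_ACTUAL.json"))  # None
```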

What changed in 2025–2026: dynamic flows and conditional branching

Two shifts changed what's practical in production flow design.

First, personalised opening at call initiation. Platforms now support dynamic variables injected from CRM data before dialling: lead name, company, last interaction date, deal stage. Agents that open with "Hi James, I see you requested a callback last Tuesday about your mortgage renewal" convert at 2–3× the rate of generic openers. The data needs to arrive in the global_context block before the first state fires.
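Platforms handle the {{variable}} substitution themselves, but the behaviour is worth pinning down in a test: a missing variable should fail loudly before dialling, not be read aloud as a blank. The sketch below is an assumed stand-in for that substitution step, with a hypothetical company name.

```python
import re


def render_prompt(template, context):
    """Replace {{name}} placeholders; raise KeyError if a variable is missing."""
    def substitute(match):
        key = match.group(1)
        if key not in context:
            raise KeyError(f"missing template variable: {key}")
        return str(context[key])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


template = "Hi {{lead_first_name}}, this is Aria calling from {{company_name}}."
context = {"lead_first_name": "James", "company_name": "Acme Mortgages"}
print(render_prompt(template, context))
# Hi James, this is Aria calling from Acme Mortgages.
```

Rendering every prompt in the flow against a sample CRM record before a campaign starts catches misspelled variable names early.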

Second, conditional branching within the flow JSON. Rather than routing every branch decision through an LLM call, platforms now support if/else logic evaluated at transition time — branch on a lead score threshold, a UK postcode prefix, a HubSpot deal stage, or a webhook response. This cuts LLM costs significantly on high-volume campaigns and makes the branching auditable in a way that LLM-driven transitions are not.
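To illustrate what "evaluated at transition time" means, here is a minimal rule evaluator. The rule format (field, op, value, target), the threshold of 70, and the state names are all assumptions for this sketch; each platform defines its own conditional syntax.

```python
import operator

# Supported comparisons; str.startswith covers prefix checks such as postcodes.
OPS = {
    ">=": operator.ge,
    "<": operator.lt,
    "==": operator.eq,
    "startswith": str.startswith,
}


def evaluate_branch(rules, lead, fallback):
    """Return the target of the first rule that holds for this lead."""
    for rule in rules:
        value = lead.get(rule["field"])
        if value is not None and OPS[rule["op"]](value, rule["value"]):
            return rule["target"]
    return fallback


RULES = [
    {"field": "lead_score", "op": ">=", "value": 70, "target": "offer_full_demo"},
    {"field": "postcode", "op": "startswith", "value": "EH", "target": "scotland_path"},
]

print(evaluate_branch(RULES, {"lead_score": 85}, "light_touch_q1"))
# offer_full_demo
print(evaluate_branch(RULES, {"lead_score": 40, "postcode": "SW1A 1AA"}, "light_touch_q1"))
# light_touch_q1
```

Rules are evaluated in order and the first match wins, so the branch taken for any lead can be reproduced from the lead record alone, with no model call involved.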

Some teams argue that fully LLM-driven flows — where the model decides everything, not a state machine — are simpler to build and better at handling conversational complexity. That's true in controlled settings. The counterargument in production: a state machine fails visibly (you can see which transition fired and why), while an LLM-driven flow fails invisibly (the model chose a path you didn't expect and you can't easily reproduce it). At 4,000 calls per week, debuggability matters more than flexibility.

Good / Bad / Ugly: real flow patterns from production

Good: A 12-state outbound qualification flow with explicit objection states (objection_price, objection_timing, objection_competitor), a conditional branch on lead score (low-score leads get a 5-question light-touch path, high-score leads get the full demo offer), a 2-attempt silence handler, and a transfer state that POSTs call context to the CRM before initiating Twilio's <Dial>. Tested across 50+ synthetic call scenarios. Runs at a 28% booking rate on a 4,000-call/week campaign.

Bad: A 6-state flow that handles everything with LLM-generated responses and one long system prompt. Works in demos. In production: the model occasionally books meetings at blocked times, handles objections with generic empathy instead of specific counter-arguments, and routes to graceful_exit when uncertain. Conversion: 11%.

Ugly: A flow with no graceful_exit state. Callers who want to leave trigger clarify_intent on repeat. On one deployment, a caller was caught in a 4-minute re-engagement loop before hanging up. The recording went on Twitter. The campaign paused for 10 days.

For the full production architecture that sits behind the call flow — STT, LLM configuration, TTS streaming, telephony stack — see Voice AI Architecture 2025: A Production Implementation Guide. If you want to see how a 10-agent outbound system was designed end-to-end for a UK client, the Voice AI and Document Analysis case study covers the complete design and delivery.

Book a 30-minute scoping call and we'll map your top three conversation paths before you leave.

Related Reading

Twilio vs Retell vs VAPI: Voice Agent Platform Comparison

Voice AI Architecture: A 2025 Implementation Guide

Voice Agent Transfer-to-Human: Designing Handoffs That Don't Lose Deals

Need a production-grade voice agent designed from scratch?