Streaming UX for Long-Running Agents: SSE, Approvals, and Human-in-the-Loop
Real agents do work that takes minutes. They ask questions. They request more scope. They write files mid-task. A2A Cloud's streaming protocol turns every one of those into a live UI event — so users stay in the loop, not in the dark.
Agent demos look great when they finish in three seconds.
Real agents take *minutes*. They install packages. They generate files. They ask follow-up questions. They request more workspace access. They hand off to other agents. They wait for the user to approve a risky operation.
"Wait for the response and render it" doesn't cut it. Not for real agent work.
A2A Cloud's dashboard is built around a streaming protocol that turns *every* one of those moments into a live UI event. Users see what's happening as it happens, answer questions inline, approve or deny risky steps, and watch files appear in their workspace in real time.
Here's how it works.
One SSE Stream, Every Event Type
The dashboard sends POST /v1/me/chat to the control plane with stream: true. The response is a text/event-stream body, parsed by hand-rolled Web Streams API code in dashboard/src/api.ts (no eventsource library, just fetch + ReadableStream).
Heartbeat comments fire every 15 seconds to keep the connection alive. A [DONE] sentinel marks completion. Everything else is a typed event.
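To make the framing concrete, here is a minimal sketch of manual SSE frame parsing. It is not the code in dashboard/src/api.ts. It assumes each event arrives as JSON on `data:` lines, that the sentinel is sent as `data: [DONE]`, and that heartbeats are standard SSE comment lines starting with `:`. The names `parseSseBuffer`, `ParsedFrame`, and `ParseResult` are hypothetical.

```typescript
// Hypothetical shapes: the real ChatEvent payloads are not specified in this post.
type ParsedFrame =
  | { kind: "event"; data: unknown }
  | { kind: "done" };

interface ParseResult {
  frames: ParsedFrame[];
  rest: string; // incomplete trailing frame, to prepend to the next chunk
}

// Split a text/event-stream buffer into complete frames.
// SSE frames end with a blank line; lines starting with ":" are comments.
function parseSseBuffer(buffer: string): ParseResult {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const parts = normalized.split("\n\n");
  const rest = parts.pop() ?? ""; // the last piece may be a partial frame
  const frames: ParsedFrame[] = [];

  for (const part of parts) {
    const dataLines: string[] = [];
    for (const line of part.split("\n")) {
      if (line.startsWith(":")) continue; // heartbeat comment, nothing to render
      if (line.startsWith("data:")) {
        // The SSE format strips one optional space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length === 0) continue; // comment-only or empty frame
    const payload = dataLines.join("\n");
    if (payload === "[DONE]") {
      frames.push({ kind: "done" }); // assumed sentinel encoding
    } else {
      frames.push({ kind: "event", data: JSON.parse(payload) });
    }
  }
  return { frames, rest };
}
```

In a fetch-based reader, each decoded chunk would be appended to `rest` from the previous call and parsed again, so a frame split across two network chunks is only emitted once it is complete.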
The Event Taxonomy
This is the full ChatEvent union flowing from agents to the UI:
LLM streaming
- delta: token-by-token LLM output
- final: message complete
Tool calls
- tool_call: LLM invoked a tool
- tool_result: tool returned
DAG orchestration
- dag_started, dag_node_started, dag_node_complete, dag_node_skipped, dag_complete
Agent handoff
- agent_handoff: a subagent invocation started
- handoff_complete: subagent finished
- handoff_denied: user denied the handoff
- approval_required: handoff needs sign-off before running
Mid-skill interaction
- agent_question / agent_question_answered
- agent_input_request / agent_input_submitted / agent_input_timeout
- agent_progress
Scope expansion
- scope_request / scope_approval_required / scope_grant / scope_denied
Other
- agent_setup_required: missing consumer config
- thread: session created/resumed
- error: stream failure
That's not a streaming chat. That's an interactive protocol.
Every interesting moment in an agent's life has a corresponding event. The UI knows what to render. The agent knows how to ask. The user knows what's happening.
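One way a client can keep up with a taxonomy this wide is a discriminated union with an exhaustive switch, so the compiler complains whenever a new event type has no handler. The sketch below uses the event names listed above. The `ChatEventType` and `Category` names, the category labels, and the `categorize` helper are assumptions for illustration, and real events would carry payload fields that this post does not specify.

```typescript
// Event names come from the taxonomy above; everything else here is hypothetical.
type ChatEventType =
  | "delta" | "final"
  | "tool_call" | "tool_result"
  | "dag_started" | "dag_node_started" | "dag_node_complete"
  | "dag_node_skipped" | "dag_complete"
  | "agent_handoff" | "handoff_complete" | "handoff_denied" | "approval_required"
  | "agent_question" | "agent_question_answered"
  | "agent_input_request" | "agent_input_submitted" | "agent_input_timeout"
  | "agent_progress"
  | "scope_request" | "scope_approval_required" | "scope_grant" | "scope_denied"
  | "agent_setup_required" | "thread" | "error";

type Category = "llm" | "tool" | "dag" | "handoff" | "interaction" | "scope" | "other";

// Map each event type to the UI area that would render it.
function categorize(type: ChatEventType): Category {
  switch (type) {
    case "delta":
    case "final":
      return "llm";
    case "tool_call":
    case "tool_result":
      return "tool";
    case "dag_started":
    case "dag_node_started":
    case "dag_node_complete":
    case "dag_node_skipped":
    case "dag_complete":
      return "dag";
    case "agent_handoff":
    case "handoff_complete":
    case "handoff_denied":
    case "approval_required":
      return "handoff";
    case "agent_question":
    case "agent_question_answered":
    case "agent_input_request":
    case "agent_input_submitted":
    case "agent_input_timeout":
    case "agent_progress":
      return "interaction";
    case "scope_request":
    case "scope_approval_required":
    case "scope_grant":
    case "scope_denied":
      return "scope";
    case "agent_setup_required":
    case "thread":
    case "error":
      return "other";
    default: {
      // Compile-time exhaustiveness check: adding a new type breaks the build here.
      const unreachable: never = type;
      throw new Error("unhandled event type: " + String(unreachable));
    }
  }
}
```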
Mid-Skill Questions, Inline
When an agent calls ctx.ask(), the stream emits agent_question. The dashboard renders a QuestionCard (with a textarea, send button, and live status) right inside the chat thread.
The user types. Hits enter. The dashboard POSTs to /v1/me/chat/approvals/{approval_id} via answerQuestion(). The control plane wakes the paused skill. The agent resumes with the answer.
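As a rough sketch of what the answer call might look like on the wire: the endpoint path comes from the paragraph above, but the JSON body shape (an `answer` field), the `baseUrl` parameter, and the `buildAnswerRequest` helper are assumptions, not the dashboard's actual answerQuestion() implementation.

```typescript
interface AnswerRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the request for answering a pending question.
// The body field name "answer" is hypothetical.
function buildAnswerRequest(
  baseUrl: string,
  approvalId: string,
  answer: string
): AnswerRequest {
  return {
    // Encode the id so unusual characters cannot change the path.
    url: `${baseUrl}/v1/me/chat/approvals/${encodeURIComponent(approvalId)}`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ answer }),
  };
}
```

The result can be passed straight to fetch as `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping the builder pure makes the request shape easy to test without a network.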
The backend uses PendingActionsStore.wait() with a 60–120s auto-timeout. Agents can't hang forever waiting for a human. They time out cleanly.
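The wait-with-timeout idea can be sketched in a few lines. This is not PendingActionsStore itself: its real signature and internals are not shown in this post, and it is written in TypeScript here only to match the other sketches. `PendingActionsSketch`, `WaitResult`, and the method shapes are hypothetical.

```typescript
type WaitResult<T> = { status: "resolved"; value: T } | { status: "timeout" };

class PendingActionsSketch<T> {
  private resolvers = new Map<string, (value: T) => void>();

  // Wait until resolve(id, ...) is called, or give up after timeoutMs.
  wait(id: string, timeoutMs: number): Promise<WaitResult<T>> {
    return new Promise<WaitResult<T>>((settle) => {
      const timer = setTimeout(() => {
        this.resolvers.delete(id); // late answers will now be rejected
        settle({ status: "timeout" });
      }, timeoutMs);

      this.resolvers.set(id, (value: T) => {
        clearTimeout(timer);
        this.resolvers.delete(id);
        settle({ status: "resolved", value });
      });
    });
  }

  // Returns false if nothing is waiting (already answered or timed out).
  resolve(id: string, value: T): boolean {
    const resolver = this.resolvers.get(id);
    if (!resolver) return false;
    resolver(value);
    return true;
  }
}
```

The caller always gets a result, either the human's answer or an explicit timeout, so a paused skill can fail cleanly instead of waiting indefinitely.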
Need structured input (a form, not free text)? The agent calls a richer API and the dashboard renders an InputRequestCard — a JSON-schema-driven form via the SchemaFields component. Same flow, more structure.
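A schema-driven form usually boils down to mapping each schema property to a widget. The sketch below shows that mapping for a flat object schema. How the real SchemaFields component does it is not described here, so the supported subset, the widget names, and `fieldsFromSchema` are all assumptions.

```typescript
// A small subset of JSON Schema, enough for a flat object form.
interface PropertySchema {
  type: "string" | "number" | "integer" | "boolean";
  title?: string;
  enum?: string[];
}

interface ObjectSchema {
  type: "object";
  properties: Record<string, PropertySchema>;
  required?: string[];
}

type Widget = "text" | "number" | "checkbox" | "select";

interface FieldSpec {
  name: string;
  label: string;
  widget: Widget;
  required: boolean;
}

// Turn a schema into an ordered list of form fields.
function fieldsFromSchema(schema: ObjectSchema): FieldSpec[] {
  const required = new Set(schema.required ?? []);
  const fields: FieldSpec[] = [];
  for (const name of Object.keys(schema.properties)) {
    const prop = schema.properties[name];
    if (prop === undefined) continue;

    let widget: Widget = "text";
    if (prop.enum !== undefined) widget = "select"; // fixed choices win over type
    else if (prop.type === "boolean") widget = "checkbox";
    else if (prop.type === "number" || prop.type === "integer") widget = "number";

    fields.push({
      name,
      label: prop.title ?? name, // fall back to the property key
      widget,
      required: required.has(name),
    });
  }
  return fields;
}
```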
Handoff Approval: User Owns The Call
Some handoffs need approval before they run — financial actions, data exfiltration risk, irreversible operations. The agent declares this, and when the time comes, the stream emits approval_required.
The dashboard renders a HandoffCard with approve/deny buttons. The orchestration halts until the user decides. Click approve → POST to the approval endpoint → subagent runs. Click deny → handoff_denied event, orchestration moves on.
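On the UI side, a card like this can be driven by a tiny state machine over the handoff events. The sketch below assumes that agent_handoff arrives once an approved subagent actually starts, and that finished or denied handoffs never reopen. The status names and both helpers are hypothetical, not the HandoffCard's real logic.

```typescript
type HandoffStatus = "awaiting_approval" | "running" | "complete" | "denied";

type HandoffEventType =
  | "approval_required"
  | "agent_handoff"
  | "handoff_complete"
  | "handoff_denied";

// Compute the card's next status from the current one and an incoming event.
function nextHandoffStatus(
  current: HandoffStatus | null,
  event: HandoffEventType
): HandoffStatus {
  // Terminal states are sticky: a late or duplicate event must not reopen a card.
  if (current === "complete" || current === "denied") return current;
  switch (event) {
    case "approval_required":
      return "awaiting_approval";
    case "agent_handoff":
      return "running";
    case "handoff_complete":
      return "complete";
    case "handoff_denied":
      return "denied";
  }
}

// Approve/deny buttons only make sense while the orchestration is halted.
function canDecide(status: HandoffStatus | null): boolean {
  return status === "awaiting_approval";
}
```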
There's also a global require_approval_for_file_writes policy in ControlPolicy — if set, *every* handoff requires sign-off regardless of declared risk. Operations teams can dial paranoia up to 11.
Scope Approval: Powerful, Bounded
The scope flow is the most interesting one. An agent calls ctx.request_scope() mid-skill, asking for write access to a path the initial grant didn't include.
The control plane evaluates policy:
- Within bounds? Emit scope_grant, mint a new grant, continue.
- Out of bounds? Emit scope_approval_required, show ScopeCard in the UI.
User sees: which agent, which path, which mode (read/write), why. Click grant → expanded grant minted, agent resumes. Click deny → scope_denied, agent gets a clean failure.
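The within-bounds check can be pictured as a prefix match against paths the policy allows without asking. What "bounds" actually means in the control plane is not spelled out here, so the `ScopeBounds` shape, the request fields, and both functions below are assumptions. The sketch only decides which of the two events would be emitted.

```typescript
type ScopeMode = "read" | "write";

interface ScopeRequest {
  agent: string;
  path: string;
  mode: ScopeMode;
  reason: string;
}

// Hypothetical policy shape: path prefixes that may be granted without asking.
interface ScopeBounds {
  read: string[];
  write: string[];
}

type ScopeDecision = "scope_grant" | "scope_approval_required";

// True if path is the prefix itself or sits below it on a segment boundary.
function isWithin(prefix: string, path: string): boolean {
  // Never auto-grant anything that tries to climb out with "..".
  if (path.split("/").indexOf("..") !== -1) return false;
  const base = prefix.replace(/\/+$/, ""); // drop trailing slashes
  return path === base || path.startsWith(base + "/");
}

function evaluateScope(bounds: ScopeBounds, request: ScopeRequest): ScopeDecision {
  const allowed = bounds[request.mode];
  return allowed.some((prefix) => isWithin(prefix, request.path))
    ? "scope_grant"
    : "scope_approval_required";
}
```

Matching on a segment boundary matters: with a bound of /workspace/out, a request for /workspace/output should still go to the user rather than slip through on a plain string prefix.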
This is the human-in-the-loop loop. Privilege escalation requires explicit consent. Every. Time.
File Operations, Visible
Agents write files. Users want to see them.
DAG node events and handoff_complete events carry file_ops arrays — every create, update, delete the agent performed. The FileOpsList component renders them as cards with op counts, clickable previews, and download buttons. Color-coded by op type.
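The op counts on those cards are a simple fold over the file_ops array. A minimal sketch, assuming each entry carries an `op` of create, update, or delete plus a `path` (the field names and `countFileOps` are hypothetical):

```typescript
type FileOpKind = "create" | "update" | "delete";

// Assumed entry shape; the real file_ops payload is not specified in this post.
interface FileOp {
  op: FileOpKind;
  path: string;
}

// Count operations by kind, e.g. to label a card "2 created, 1 deleted".
function countFileOps(ops: FileOp[]): Record<FileOpKind, number> {
  const counts: Record<FileOpKind, number> = { create: 0, update: 0, delete: 0 };
  for (const { op } of ops) {
    counts[op] += 1;
  }
  return counts;
}
```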
No polling. No refresh. The workspace just updates live.
The Stack Is Simple
React 18 + TypeScript 5.5 + Vite. That's it. No external SSE library, just fetch and manual frame parsing. react-markdown + rehype-highlight for message rendering. No state management library beyond React hooks.
The complexity lives in the event taxonomy and the components that handle each type. Not in the framework. That's the right place for it.
Why This Matters
The agent UX problem is the *long-running task* problem. You can't pretend it's a chat. It isn't.
A2A Cloud's streaming protocol treats agents as collaborative processes, not request/response. Every interesting moment surfaces in the UI:
- Progress — *the agent is working*
- Questions — *the agent needs your input*
- Approvals — *the agent needs your consent*
- Scope requests — *the agent needs more access*
- File ops — *the agent produced this*
- Handoffs — *the agent called another agent*
- Errors — *something went wrong, here's what*
The user is never stuck staring at a spinner. They're in the loop. They can answer, approve, redirect, or deny.
For builders: write ctx.ask() and ctx.request_scope() like normal Python. The protocol handles the rest.
For users: see what the agent is doing. Stay in control. Trust grows.
For product teams: this is the UX layer that makes long-running agents feel like a *product*, not a black box.
The Bigger Picture
Most agent UIs are chat UIs that pretend to be agent UIs. A2A Cloud's dashboard is the other thing. It's an interactive control panel for processes that think, ask, request, decide, write, and call other agents.
That's what the agent economy needs. Not better spinners. A protocol where the human and the agent are both first-class actors.
That's shipped.