Start the Viewer
Begin a named session
Use --session so every command in the flow attaches to the same browser context, viewer, and history.
Open the viewer
Opening the viewer prints its URL (for example, http://127.0.0.1:7878/viewer?token=...) and opens it in your default browser. The URL includes a one-time security token, so share it only with collaborators who should have access.
If you need the URL without auto-opening (for example, to pass it to another tool):
Set a goal banner (optional)
Display a visible goal message so collaborators understand what the agent is doing:
Clear it when the step is done:
Drive actions normally
Keep issuing CLI or MCP commands. The viewer reflects every action in real time — clicks, typing, navigation, and failures all appear with target rings, cursor animations, and narration entries.
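Because the viewer URL carries its access token in the query string, a script that passes the URL to another tool may want to confirm the token is present and keep it out of logs. The following is a minimal sketch using only Python's standard library. It assumes the URL shape shown in the Open the viewer step; the function names and the token value are hypothetical placeholders, not part of the tool.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def split_viewer_url(url: str) -> tuple[str, str]:
    """Return (url_without_query, token); raise ValueError if no token is present."""
    parts = urlsplit(url)
    tokens = parse_qs(parts.query).get("token", [])
    if not tokens:
        raise ValueError("viewer URL has no token query parameter")
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return base, tokens[0]


def redact_viewer_url(url: str) -> str:
    """Return a log-safe form of the URL with the token value hidden."""
    base, _token = split_viewer_url(url)
    return f"{base}?token=<redacted>"


# Placeholder token for illustration only (assumption, not a real token).
example = "http://127.0.0.1:7878/viewer?token=example-token"
print(redact_viewer_url(example))  # http://127.0.0.1:7878/viewer?token=<redacted>
```

Logging the redacted form and handing the full URL only to the intended collaborator keeps the token out of shared terminals and CI output.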
Viewer Controls
The viewer exposes controls directly in the browser UI. You can also use keyboard shortcuts.

| Control | Keyboard Shortcut | Effect |
|---|---|---|
| Pause | Space | Blocks the agent before its next narrated action |
| Resume | Space | Allows agent actions to continue |
| Step | → (Right Arrow) | Allows exactly one action, then returns to paused |
| Abort | Esc | Aborts the next queued action |
| Refs overlay | R | Shows or hides bounding boxes and labels for interactive elements |
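
The Pause, Resume, Step, and Abort semantics in the table can be pictured as a small gate that the agent consults before each action. The sketch below is a toy model for reasoning about those semantics, not the viewer's actual implementation. The class and method names are hypothetical, and it assumes that a pending Abort takes priority over the paused state.

```python
from dataclasses import dataclass


@dataclass
class ActionGate:
    """Toy model of the viewer controls (hypothetical, for illustration)."""
    paused: bool = False
    step_budget: int = 0      # actions allowed through while paused
    abort_next: bool = False  # set by Abort, consumed by the next action

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False
        self.step_budget = 0

    def step(self) -> None:
        # Let exactly one action through, then remain paused.
        self.paused = True
        self.step_budget = 1

    def abort(self) -> None:
        self.abort_next = True

    def check(self) -> str:
        """Called before each action; returns 'aborted', 'blocked', or 'run'."""
        if self.abort_next:  # assumption: Abort wins over Pause
            self.abort_next = False
            return "aborted"
        if self.paused:
            if self.step_budget == 0:
                return "blocked"
            self.step_budget -= 1
        return "run"


gate = ActionGate()
gate.pause()
print(gate.check())                # blocked
gate.step()
print(gate.check(), gate.check())  # run blocked
gate.abort()
gate.resume()
print(gate.check(), gate.check())  # aborted run
```

The model shows why Step is useful for debugging: each press releases a single action and the gate falls back to the paused state on its own.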
Human Takeover
Take full manual control of the browser without terminating the session:
Click Take Control in the viewer UI
The agent pauses immediately. The session remains live — cookies, auth state, and page context are all preserved.
Interact manually
Use your mouse and keyboard in the viewer to click, type, scroll, or navigate. All manual actions are captured in the narration history.
In MCP mode, use browser_takeover to pause and browser_release_control to hand back programmatically.
Annotation Tools
Mark important moments directly in the viewer so they appear in the narration log and any exported evidence bundle.
Draw overlays
Highlight regions of the page with freehand drawing to call attention to specific elements.
Add notes
Attach text annotations to the current frame — these become timestamped entries in the narration history.
Mark elements for the agent
Flag elements in the viewer so the agent’s next snapshot includes a prioritized hint for that region.
Narration history
Every annotation streams to the MCP server and appears in the narration log. Retrieve the full log at any time.
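One way to picture the narration history is as an append-only list of timestamped entries, where agent actions, manual takeover actions, and annotations share a single timeline. The sketch below is an assumption about the shape of such a log, not the tool's actual schema. The class names, field names, and the kind labels are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class NarrationEntry:
    timestamp: datetime
    kind: str  # hypothetical labels: "action", "manual", "note", "overlay", "mark"
    text: str


@dataclass
class NarrationLog:
    entries: list[NarrationEntry] = field(default_factory=list)

    def add(self, kind: str, text: str) -> NarrationEntry:
        """Append a new entry stamped with the current UTC time."""
        entry = NarrationEntry(datetime.now(timezone.utc), kind, text)
        self.entries.append(entry)
        return entry

    def of_kind(self, *kinds: str) -> list[NarrationEntry]:
        """Return entries whose kind matches, in the order they were recorded."""
        return [e for e in self.entries if e.kind in kinds]


log = NarrationLog()
log.add("action", "click Submit")
log.add("note", "Reviewer confirmed the attestation text")
log.add("manual", "typed into the search field during takeover")

# Show only the human-authored entries.
for entry in log.of_kind("note", "manual"):
    print(entry.timestamp.isoformat(), entry.kind, entry.text)
```

Keeping every entry type in one ordered list means a reviewer can later filter for human input without losing the surrounding agent actions.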
Review History
Open a history-focused view without issuing new browser actions:
Fast Agent Runs
For agent-only runs where no human needs the cursor lead-in animation, use --no-narration-delay. The narration history is still recorded; only the lead-time sleep is skipped.
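The effect of skipping the delay can be illustrated with a short model: each action is always appended to the history, and only the pause before it is conditional. This is a sketch of the behavior described above, not the tool's code. The narrate function, its parameters, and the 0.2 second lead time are hypothetical.

```python
import time


def narrate(action: str, history: list[str], lead_time: float, skip_delay: bool) -> None:
    """Record the action, then optionally pause so a human can follow the cursor."""
    history.append(action)     # always recorded
    if not skip_delay:
        time.sleep(lead_time)  # lead-in pause, skipped for agent-only runs


history: list[str] = []
start = time.perf_counter()
for action in ["open page", "click Login", "type username"]:
    narrate(action, history, lead_time=0.2, skip_delay=True)
elapsed = time.perf_counter() - start

print(history)
print(f"elapsed: {elapsed:.3f}s")
```

With skip_delay set to False, the same three actions would take at least 0.6 seconds in this model, which is the kind of overhead that adds up over long unattended runs.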
When to Use the Live Viewer
Compliance review
Let a compliance officer watch the agent complete a regulated workflow and annotate the key attestation steps in real time.
Debugging
Pause and step through a failing flow action-by-action to pinpoint exactly where the agent goes wrong.
Training data collection
Record the viewer session — including manual takeovers and annotations — as a rich, labeled training dataset.
User acceptance testing
Let a QA engineer watch the agent run a user story and intervene at any point to verify or correct behavior.
