This guide walks you through your first GSD Browser session. You will start the daemon, navigate to a page, take a snapshot to get versioned element refs, interact with those refs, and capture a screenshot. By the end, you will understand the core CLI workflow and know where to go next for AI agent integration.
Make sure you have installed GSD Browser and that Chrome or Chromium is available on your machine before continuing.

CLI quickstart

1. Start the daemon

The daemon starts automatically on the first browser command, but you can also pre-warm it explicitly:
gsd-browser daemon start
Verify it is running:
gsd-browser daemon health
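In a script, it can help to wait until the health check passes before sending browser commands. The following is a minimal sketch, assuming that gsd-browser daemon health exits with a non-zero status while the daemon is not ready (this guide does not document its exit codes). The names GSD_BROWSER and wait_for_daemon, and the retry parameters, are hypothetical and introduced only for illustration.

```shell
# Command to invoke; overridable so the helper can be exercised with a stub.
GSD_BROWSER="${GSD_BROWSER:-gsd-browser}"

# wait_for_daemon [attempts] [delay_seconds]
# Polls the health check until it succeeds or the attempts run out.
# Assumption: a non-zero exit status means "not healthy yet".
wait_for_daemon() {
  attempts="${1:-10}"
  delay="${2:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$GSD_BROWSER" daemon health >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A typical call would be gsd-browser daemon start followed by wait_for_daemon 10 1, which gives up after roughly ten seconds.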
2. Navigate to a URL

Send the browser to any page. The daemon opens Chrome if it isn’t already running.
gsd-browser navigate https://example.com
You can combine navigation with an instant screenshot:
gsd-browser navigate https://example.com --screenshot
3. Take a snapshot to get element refs

A snapshot scans the current page and assigns stable versioned refs to every interactive element. Refs look like @v1:e1, @v1:e2, and so on. The version prefix ensures stale refs from a previous snapshot are automatically rejected.
gsd-browser snapshot
Example output:
@v1:e1  [a]      "More information..." (href="/more")
@v1:e2  [input]  type="search" placeholder="Search..."
@v1:e3  [button] "Search"
Always take a fresh snapshot after navigation or any DOM change. Old refs become stale when the page state changes. See Snapshots & Refs for a full explanation.
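To make the versioning rule concrete, the sketch below splits a ref into its snapshot version and element index, then checks whether the ref belongs to a given snapshot version. It only illustrates the @v<version>:e<index> shape shown in the example output and is not GSD Browser's own implementation. The helper names is_number, ref_version, ref_index, and ref_is_current are hypothetical, and the sketch assumes that a later snapshot uses a different version number.

```shell
# is_number STRING -> succeeds only for a non-empty string of digits.
is_number() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# ref_version REF -> prints the snapshot version ("1" for @v1:e2).
ref_version() {
  case "$1" in
    @v*:e*) ;;
    *) return 1 ;;          # not shaped like a ref (e.g. a CSS selector)
  esac
  rest="${1#@v}"            # "@v1:e2" -> "1:e2"
  version="${rest%%:*}"     # "1:e2"   -> "1"
  is_number "$version" || return 1
  printf '%s\n' "$version"
}

# ref_index REF -> prints the element index ("2" for @v1:e2).
ref_index() {
  case "$1" in
    @v*:e*) ;;
    *) return 1 ;;
  esac
  index="${1#*:e}"          # "@v1:e2" -> "2"
  is_number "$index" || return 1
  printf '%s\n' "$index"
}

# ref_is_current REF CURRENT_VERSION -> succeeds if the versions match.
ref_is_current() {
  v="$(ref_version "$1")" || return 1
  [ "$v" = "$2" ]
}
```

A script that stores refs between steps could call ref_is_current before reusing one and take a new snapshot when the check fails, rather than waiting for the command to reject the stale ref.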
4. Click an element using its ref

Pass a ref directly to click. GSD Browser resolves it to the live DOM element — no CSS selector guessing required.
gsd-browser click @v1:e1
You can also click by CSS selector when you prefer:
gsd-browser click "#submit-btn"
5. Type into a field

Target an input by its ref and provide the text to type:
gsd-browser type @v1:e2 "hello world"
Add --submit to press Enter after typing and --clear-first to empty the field before typing. The two flags can be combined:
gsd-browser type @v1:e2 "hello world" --clear-first --submit
6. Capture a screenshot

Save the current viewport as an image. Both JPEG and PNG are supported.
gsd-browser screenshot --output page.png --format png
Capture the full scrollable page:
gsd-browser screenshot --output full.png --format png --full-page
7. Stop the daemon

When you are done, stop the daemon and close the browser:
gsd-browser daemon stop
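The steps above can be chained into one script. The sketch below uses only commands shown in this guide and leaves out click and type, because their refs depend on what the snapshot returns for your page. It assumes each command exits with a non-zero status on failure, which this guide does not state. GSD_BROWSER and quickstart_session are hypothetical names.

```shell
# Command to invoke; overridable so the function can be exercised with a stub.
GSD_BROWSER="${GSD_BROWSER:-gsd-browser}"

# quickstart_session URL SCREENSHOT_PATH
# Runs: start daemon -> navigate -> snapshot -> full-page screenshot -> stop.
# Assumption: every command signals failure through its exit status.
quickstart_session() {
  url="$1"
  shot="$2"
  "$GSD_BROWSER" daemon start || return 1
  rc=0
  "$GSD_BROWSER" navigate "$url" &&
    "$GSD_BROWSER" snapshot &&
    "$GSD_BROWSER" screenshot --output "$shot" --format png --full-page ||
    rc=$?
  # Always stop the daemon, even if an earlier step failed.
  "$GSD_BROWSER" daemon stop
  return "$rc"
}
```

For example, quickstart_session https://example.com full.png prints the snapshot refs and writes the screenshot to full.png. Stopping the daemon on every path keeps a failed run from leaving a browser open.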

Useful CLI options

Every gsd-browser command accepts these global flags:
Flag                    Purpose
--json                  Return structured JSON output instead of human-readable text
--session <name>        Target a named session for isolated browser state
--browser-path <path>   Path to a specific Chrome or Chromium binary
--cdp-url <url>         Attach to an already-running Chrome instance
For example, to get JSON output from a snapshot:
gsd-browser snapshot --json
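When several commands should share one named session, a small wrapper avoids repeating the flag. This is a sketch only: it assumes --session can follow the subcommand and its arguments, as --json does in the example above. GSD_BROWSER and in_session are hypothetical names, and the session name used below is made up.

```shell
# Command to invoke; overridable so the wrapper can be exercised with a stub.
GSD_BROWSER="${GSD_BROWSER:-gsd-browser}"

# in_session NAME COMMAND [ARGS...]
# Runs a gsd-browser command with "--session NAME" appended.
# Assumption: global flags are accepted after the subcommand's arguments.
in_session() {
  name="$1"
  shift
  "$GSD_BROWSER" "$@" --session "$name"
}
```

With this in place, in_session checkout navigate https://example.com and in_session checkout snapshot --json both run against the session named checkout.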

AI agent path: MCP server

If you are connecting an AI agent rather than scripting the CLI directly, use the MCP server instead. It exposes the full GSD Browser surface as 50+ discoverable tools, live resources, and executable prompts to any MCP-compatible client.
Start the MCP server on stdio (most clients manage the process for you):
gsd-browser mcp
Or run it as an HTTP server for remote and cloud agents:
export GSD_BROWSER_MCP_AUTH_TOKEN="$(openssl rand -hex 32)"
gsd-browser mcp --http --host 0.0.0.0 --port 8788
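Because --host 0.0.0.0 listens on every network interface, it is worth confirming that the token is actually set before the server starts. The sketch below generates a token and checks its shape. The 64-character hex check simply matches what openssl rand -hex 32 produces and is not a documented GSD Browser requirement. new_token and is_valid_token are hypothetical helper names, and the fallback without openssl is an assumption about your system (it needs /dev/urandom and a head that supports -c).

```shell
# new_token -> prints 32 random bytes as 64 lowercase hex characters.
new_token() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback without openssl: read 32 bytes from the kernel RNG.
    head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
    echo
  fi
}

# is_valid_token STRING -> succeeds only for exactly 64 hex characters.
is_valid_token() {
  [ "${#1}" -eq 64 ] || return 1
  case "$1" in
    *[!0-9a-fA-F]*) return 1 ;;
  esac
  return 0
}
```

One way to use it is to assign GSD_BROWSER_MCP_AUTH_TOKEN="$(new_token)", run is_valid_token "$GSD_BROWSER_MCP_AUTH_TOKEN", and export the variable and start the HTTP server only if the check succeeds.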

MCP Server guide

Client configuration, tool categories, resources, prompts, and remote hosting.

Snapshots & Refs

Deep dive into versioned refs — the core mechanism for reliable agent interaction.

What to explore next

Sessions

Isolate browser state, run parallel agent tasks, and persist authentication across runs.

Snapshots & Refs

Understand versioned refs, snapshot modes, and the action cache.

MCP Server

Connect Cursor, Claude Desktop, or any MCP-compatible agent.