How a snapshot works
Run gsd-browser snapshot on any loaded page. The daemon traverses the live DOM, identifies interactive and semantically meaningful elements, and assigns each one a ref in the format @vN:eM, where N is the snapshot version number and M is the element index within that snapshot.
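The @vN:eM format can be parsed mechanically. The following is a minimal Python sketch, not part of GSD Browser itself; the names Ref and parse_ref are hypothetical and only illustrate how the snapshot version and element index could be extracted from a ref string.

```python
import re
from typing import NamedTuple


class Ref(NamedTuple):
    """Hypothetical container for the two parts of a ref."""
    version: int  # N in @vN:eM (snapshot version)
    element: int  # M in @vN:eM (element index within that snapshot)


# Matches the documented shape @vN:eM with decimal N and M.
_REF_PATTERN = re.compile(r"@v(\d+):e(\d+)")


def parse_ref(text: str) -> Ref:
    """Split a ref string into its snapshot version and element index."""
    match = _REF_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a ref of the form @vN:eM: {text!r}")
    return Ref(version=int(match.group(1)), element=int(match.group(2)))


print(parse_ref("@v2:e5"))  # Ref(version=2, element=5)
```

Keeping the version inside the ref string means a ref can be checked against the current snapshot without any extra bookkeeping on the caller's side.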
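Refs are tied to the snapshot that issued them. Once a later snapshot of a page such as https://example.com has been taken, its refs carry the new version prefix, for example @v2:. Passing @v1:e1 to a command after taking snapshot v2 raises a stale-ref error immediately, before any interaction reaches Chrome. This fail-fast behaviour prevents the subtle wrong-element bugs that plague selector-based automation.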
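The fail-fast check can be pictured as a version comparison that runs before any browser call. This is a sketch under stated assumptions, not the daemon's actual code: StaleRefError and check_ref are hypothetical names, and the rule that a ref is valid only when its version equals the current snapshot version is inferred from the behaviour described above.

```python
import re


class StaleRefError(Exception):
    """Hypothetical stand-in for the daemon's stale-ref error."""


def check_ref(ref: str, current_version: int) -> int:
    """Return the element index if the ref belongs to the current snapshot."""
    match = re.fullmatch(r"@v(\d+):e(\d+)", ref)
    if match is None:
        raise ValueError(f"malformed ref: {ref!r}")
    version, element = int(match.group(1)), int(match.group(2))
    if version != current_version:
        # Fail fast: reject the ref before anything is sent to the browser.
        raise StaleRefError(
            f"{ref} is from snapshot v{version}, current snapshot is v{current_version}"
        )
    return element


print(check_ref("@v2:e1", current_version=2))  # 1
try:
    check_ref("@v1:e1", current_version=2)
except StaleRefError as err:
    print("rejected:", err)
```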
Snapshot modes
The default snapshot captures interactive elements only. Pass --mode to focus on a specific semantic category:
| Mode | What it captures |
|---|---|
| interactive (default) | Buttons, links, inputs, selects, and other actionable elements |
| form | All form fields, labels, and submit controls |
| dialog | Elements inside open dialogs and modals |
| navigation | Nav links, breadcrumbs, and pagination controls |
| errors | Visible error messages, validation feedback, and alerts |
| headings | Page headings (h1–h6) for structural orientation |
| visible_only | Every visible element, not limited to interactive ones |
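One way to picture a mode is as a filter over the elements found during traversal. The sketch below is purely illustrative and is not how GSD Browser is implemented: the element shape (a dict with "tag" and "visible" keys), the tag sets, and the capture function are all assumptions, and only three of the modes from the table are modelled.

```python
from typing import Callable, Dict, List

# Hypothetical element shape: {"tag": str, "visible": bool}
Element = Dict[str, object]

# Assumed tag sets, loosely following the table's descriptions.
INTERACTIVE_TAGS = {"button", "a", "input", "select"}
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

MODE_FILTERS: Dict[str, Callable[[Element], bool]] = {
    "interactive": lambda el: el["tag"] in INTERACTIVE_TAGS,
    "headings": lambda el: el["tag"] in HEADING_TAGS,
    "visible_only": lambda el: bool(el["visible"]),
}


def capture(elements: List[Element], mode: str = "interactive") -> List[Element]:
    """Keep only the elements that the chosen mode would include."""
    keep = MODE_FILTERS[mode]
    return [el for el in elements if keep(el)]


page = [
    {"tag": "h1", "visible": True},
    {"tag": "button", "visible": True},
    {"tag": "p", "visible": True},
    {"tag": "input", "visible": False},
]
print([el["tag"] for el in capture(page)])              # ['button', 'input']
print([el["tag"] for el in capture(page, "headings")])  # ['h1']
```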
A snapshot can also be scoped with --selector.
Interacting with refs
All interaction commands accept either a ref or a CSS selector. Use refs whenever you have a current snapshot; they are more robust than selectors because the daemon validates the version stamp before acting.
Inspecting a specific ref
gsd-browser get-ref returns full metadata for any ref: bounding box, ARIA role and name, element tag, structural signature, and the resolved CSS selector the daemon uses internally.
Best practices for AI agents
Always snapshot before interacting
Read the gsd-browser://latest-snapshot MCP resource (or call browser_snapshot) after every navigation and after any action that changes the DOM. This gives your agent fresh, versioned refs before it attempts any interaction.
Prefer semantic tools first
Use browser_act or browser_find_best for common intents (fill email, click submit, accept cookies, etc.) before dropping down to ref-based interaction. Semantic tools use the action cache automatically and are more resilient to minor DOM changes.
Fall back to refs for precision
When browser_act doesn’t cover your case, or when you need to click a very specific element, take a snapshot and use the resulting refs. Refs give you exact, version-validated targeting.
The action cache and self-healing
Every time an agent successfully maps an intent (like fill_email) to a concrete selector on a given page, GSD Browser can store that mapping in the action cache. On future visits to the same page, the cache is consulted first. If the mapping is still valid, the interaction proceeds immediately without another round-trip to find the element.
The cache is scoped per session: a --session checkout cache never contaminates a --session dashboard cache. Over long-running agent projects, the cache makes interactions progressively faster and more robust, because the system effectively learns the page’s element structure from prior successes.
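The lookup-then-validate behaviour and the separation between sessions can be sketched as a small keyed store. This is an illustration under assumptions rather than GSD Browser's real data structure: the ActionCache class, its method names, the (session, URL, intent) key, and the example URL and selector are all hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical cache key: (session name, page URL, intent)
CacheKey = Tuple[str, str, str]


class ActionCache:
    """Illustrative intent-to-selector cache, scoped by session and page."""

    def __init__(self) -> None:
        self._entries: Dict[CacheKey, str] = {}

    def store(self, session: str, url: str, intent: str, selector: str) -> None:
        """Remember a mapping that worked on this page in this session."""
        self._entries[(session, url, intent)] = selector

    def lookup(
        self,
        session: str,
        url: str,
        intent: str,
        still_valid: Callable[[str], bool],
    ) -> Optional[str]:
        """Return the cached selector if it still resolves, else None."""
        key = (session, url, intent)
        selector = self._entries.get(key)
        if selector is None:
            return None
        if not still_valid(selector):
            # Drop the stale entry so the caller re-resolves and stores a fresh one.
            del self._entries[key]
            return None
        return selector


cache = ActionCache()
url = "https://example.com/pay"  # made-up URL for the example
cache.store("checkout", url, "fill_email", "#email")

on_page = {"#email"}  # selectors that currently resolve on the page
print(cache.lookup("checkout", url, "fill_email", on_page.__contains__))   # #email
print(cache.lookup("dashboard", url, "fill_email", on_page.__contains__))  # None
```

Because the session name is part of the key, an entry stored under one session is never returned for another. An entry that no longer resolves is discarded on lookup, which leaves room for the next successful interaction to store a replacement.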
