feat(browser): add native action commands

main
Peter Steinberger 2025-12-20 00:53:45 +00:00
parent d67bec0740
commit a526d3c1f2
26 changed files with 2589 additions and 1234 deletions

File diff suppressed because one or more lines are too long

View File

@ -97,4 +97,4 @@ git commit -m "Add Clawd workspace"
- Keep heartbeats enabled so the assistant can schedule reminders, monitor inboxes, and trigger camera captures. - Keep heartbeats enabled so the assistant can schedule reminders, monitor inboxes, and trigger camera captures.
- For browser-driven verification, use `clawdis browser` (tabs/status/screenshot) with the clawd-managed Chrome profile. - For browser-driven verification, use `clawdis browser` (tabs/status/screenshot) with the clawd-managed Chrome profile.
- For DOM inspection, use `clawdis browser eval|query|dom|snapshot` (and `--json`/`--out` when you need machine output). - For DOM inspection, use `clawdis browser eval|query|dom|snapshot` (and `--json`/`--out` when you need machine output).
- For advanced actions, use `clawdis browser tool browser_* --args '{...}'` (Playwright MCP parity). - For interactions, use `clawdis browser click|type|hover|drag|select|upload|press|wait|navigate|back|evaluate|run`.

View File

@ -1,5 +1,5 @@
--- ---
summary: "Spec: integrated browser control server + MCP tool dispatch" summary: "Spec: integrated browser control server + action commands"
read_when: read_when:
- Adding agent-controlled browser automation - Adding agent-controlled browser automation
- Debugging why clawd is interfering with your own Chrome - Debugging why clawd is interfering with your own Chrome
@ -98,45 +98,45 @@ Fallback behavior:
the user set the profile color/name once via Chrome UI; it must persist because the user set the profile color/name once via Chrome UI; it must persist because
the `userDataDir` is persistent. the `userDataDir` is persistent.
## Control server contract (proposed) ## Control server contract (current)
Expose a small local HTTP API (and/or gateway RPC surface) so the agent can manage Expose a small local HTTP API (and/or gateway RPC surface) so the agent can manage
state without touching the user's Chrome. state without touching the user's Chrome.
Minimum endpoints/methods (names illustrative): Basics:
- `GET /` status payload (enabled/running/pid/cdpPort/etc)
- `POST /start` start browser
- `POST /stop` stop browser
- `GET /tabs` list tabs
- `POST /tabs/open` open a new tab
- `POST /tabs/focus` focus a tab by id/prefix
- `DELETE /tabs/:targetId` close a tab by id/prefix
- `POST /close` close the current tab (optional targetId in body)
- `browser.status` Inspection:
- returns: `{ enabled, url, running, pid?, version?, chosenBrowser?, userDataDir?, ports: { control, cdp } }` - `GET /screenshot` (CDP screenshot)
- `browser.start` - `POST /screenshot` (Playwright screenshot with ref/element)
- starts the browser-control server + browser (no-op if already running) - `POST /eval` (CDP evaluate)
- `browser.stop` - `GET /query`
- stops the server and closes the clawd browser (best-effort; graceful first, then force if needed) - `GET /dom`
- `browser.tabs.list` - `GET /snapshot` (`aria` | `domSnapshot` | `ai`)
- returns: array of `{ targetId, title, url, isActive, lastFocusedAt? }`
- `browser.tabs.open`
- params: `{ url, newTab?: true }` → returns `{ targetId }`
- `browser.tabs.focus`
- params: `{ targetId }`
- `browser.tabs.close`
- params: `{ targetId }`
- `browser.screenshot`
- params: `{ targetId?, fullPage?: false }` → returns a `MEDIA:` attachment URL (via the existing Clawdis media host)
DOM + inspection (v1): Actions:
- `browser.eval` - `POST /navigate`, `POST /back`
- params: `{ js, targetId?, await?: false }` → returns the CDP `Runtime.evaluate` result (best-effort `returnByValue`) - `POST /resize`
- `browser.query` - `POST /click`, `POST /type`, `POST /press`, `POST /hover`, `POST /drag`, `POST /select`
- params: `{ selector, targetId?, limit? }` → returns basic element summaries (tag/id/class/text/value/href/outerHTML) - `POST /upload` (file chooser modal must be open)
- `browser.dom` - `POST /fill` (JSON field descriptors)
- params: `{ format: "html"|"text", targetId?, selector?, maxChars? }` → returns a truncated dump (`text` field) - `POST /dialog` (alert/confirm/prompt)
- `browser.snapshot` - `POST /wait` (time/text/textGone)
- params: `{ format: "aria"|"domSnapshot", targetId?, limit? }` - `POST /evaluate` (function + optional ref)
- `aria`: simplified Accessibility tree with `backendDOMNodeId` when available (future click/type hooks) - `POST /run` (function(page) → result)
- `domSnapshot`: lightweight DOM walk snapshot (tree-ish, bounded by `limit`) - `GET /console`, `GET /network`
- `POST /trace/start`, `POST /trace/stop`
Nice-to-have (later): - `POST /pdf`
- `browser.click` / `browser.type` / `browser.waitFor` helpers built atop snapshot refs / backend node ids - `POST /verify/element`, `POST /verify/text`, `POST /verify/list`, `POST /verify/value`
- `browser.tool` dispatch that mirrors Playwright MCP tool names for quick feature parity - `POST /mouse/move`, `POST /mouse/click`, `POST /mouse/drag`
- `POST /locator` (generate Playwright locator)
### "Is it open or closed?" ### "Is it open or closed?"
@ -163,54 +163,60 @@ The agent should not assume tabs are ephemeral. It should:
- reuse an existing tab when appropriate (e.g. a persistent "main" tab) - reuse an existing tab when appropriate (e.g. a persistent "main" tab)
- avoid opening duplicate tabs unless asked - avoid opening duplicate tabs unless asked
## Tool dispatch (Playwright MCP parity) ## CLI quick reference (one example each)
Clawdis exposes a generic tool dispatcher for Playwright MCP-style tools: Basics:
- `clawdis browser status`
- `clawdis browser start`
- `clawdis browser stop`
- `clawdis browser tabs`
- `clawdis browser open https://example.com`
- `clawdis browser focus abcd1234`
- `clawdis browser close abcd1234`
`POST /tool` with JSON `{ name: "browser_*", args: { ... }, targetId?: "..." }` Inspection:
- `clawdis browser screenshot`
- `clawdis browser screenshot --full-page`
- `clawdis browser screenshot --ref 12`
- `clawdis browser eval "document.title"`
- `clawdis browser query "a" --limit 5`
- `clawdis browser dom --format text --max-chars 5000`
- `clawdis browser snapshot --format aria --limit 200`
- `clawdis browser snapshot --format ai`
CLI helper: Actions:
`clawdis browser tool browser_* --args '{...}'` - `clawdis browser navigate https://example.com`
- `clawdis browser back`
Supported tool names: - `clawdis browser resize 1280 720`
- `browser_close` - `clawdis browser click 12 --double`
- `browser_resize` - `clawdis browser type 23 "hello" --submit`
- `browser_console_messages` - `clawdis browser press Enter`
- `browser_network_requests` - `clawdis browser hover 44`
- `browser_handle_dialog` - `clawdis browser drag 10 11`
- `browser_evaluate` - `clawdis browser select 9 OptionA OptionB`
- `browser_file_upload` - `clawdis browser upload /tmp/file.pdf`
- `browser_fill_form` - `clawdis browser fill --fields '[{\"ref\":\"1\",\"value\":\"Ada\"}]'`
- `browser_install` (no-op; uses system Chrome/Chromium) - `clawdis browser dialog --accept`
- `browser_press_key` - `clawdis browser wait --text "Done"`
- `browser_type` - `clawdis browser evaluate --fn '(el) => el.textContent' --ref 7`
- `browser_navigate` - `clawdis browser run --code '(page) => page.title()'`
- `browser_navigate_back` - `clawdis browser console --level error`
- `browser_run_code` - `clawdis browser network --include-static`
- `browser_take_screenshot` - `clawdis browser trace-start`
- `browser_snapshot` - `clawdis browser trace-stop`
- `browser_click` - `clawdis browser pdf`
- `browser_drag` - `clawdis browser verify-element --role button --name "Submit"`
- `browser_hover` - `clawdis browser verify-text "Welcome"`
- `browser_select_option` - `clawdis browser verify-list 3 ItemA ItemB`
- `browser_tabs` - `clawdis browser verify-value --ref 4 --type textbox --value hello`
- `browser_wait_for` - `clawdis browser mouse-move --x 120 --y 240`
- `browser_pdf_save` - `clawdis browser mouse-click --x 120 --y 240`
- `browser_start_tracing` - `clawdis browser mouse-drag --start-x 10 --start-y 20 --end-x 200 --end-y 300`
- `browser_stop_tracing` - `clawdis browser locator 77`
- `browser_verify_element_visible`
- `browser_verify_text_visible`
- `browser_verify_list_visible`
- `browser_verify_value`
- `browser_mouse_move_xy`
- `browser_mouse_click_xy`
- `browser_mouse_drag_xy`
- `browser_generate_locator`
Notes: Notes:
- `browser_file_upload` and `browser_handle_dialog` are modal-only; they only - `upload` and `dialog` only work when a file chooser or dialog is present.
work when a file chooser/dialog modal state is present. - `snapshot --format ai` returns Playwright-for-AI markup used for ref-based actions.
- `browser_snapshot` returns a Playwright-for-AI snapshot (use for follow-up actions).
## Security & privacy notes ## Security & privacy notes

View File

@ -1,13 +0,0 @@
---
summary: "Redirect: /clawd.md → /clawd"
permalink: /clawd.md
---
<!-- {% raw %} -->
<script>
window.location.replace("{{ \"/clawd\" | relative_url }}");
</script>
If youre not redirected automatically, go to
<a href="{{ \"/clawd\" | relative_url }}">/clawd</a>.
<!-- {% endraw %} -->

View File

@ -1,11 +0,0 @@
---
summary: "Redirect: mac/browser.md → browser.md"
read_when:
- Adding agent-controlled browser automation
- Debugging why clawd is interfering with your own Chrome
- Implementing browser settings + lifecycle in the macOS app
---
# Browser (macOS app) — moved
This doc moved to `docs/browser.md`.

View File

@ -0,0 +1,287 @@
import type { ScreenshotResult } from "./client.js";
import type { BrowserActionTabResult } from "./client-actions-types.js";
import { fetchBrowserJson } from "./client-fetch.js";
export async function browserNavigate(
baseUrl: string,
opts: { url: string; targetId?: string },
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/navigate`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ url: opts.url, targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserBack(
baseUrl: string,
opts: { targetId?: string } = {},
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/back`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserResize(
baseUrl: string,
opts: { width: number; height: number; targetId?: string },
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/resize`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
width: opts.width,
height: opts.height,
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}
export async function browserClosePage(
baseUrl: string,
opts: { targetId?: string } = {},
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/close`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserClick(
baseUrl: string,
opts: {
ref: string;
targetId?: string;
doubleClick?: boolean;
button?: string;
modifiers?: string[];
},
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/click`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
ref: opts.ref,
targetId: opts.targetId,
doubleClick: opts.doubleClick,
button: opts.button,
modifiers: opts.modifiers,
}),
timeoutMs: 20000,
});
}
export async function browserType(
baseUrl: string,
opts: {
ref: string;
text: string;
targetId?: string;
submit?: boolean;
slowly?: boolean;
},
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/type`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
ref: opts.ref,
text: opts.text,
targetId: opts.targetId,
submit: opts.submit,
slowly: opts.slowly,
}),
timeoutMs: 20000,
});
}
export async function browserPressKey(
baseUrl: string,
opts: { key: string; targetId?: string },
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/press`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ key: opts.key, targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserHover(
baseUrl: string,
opts: { ref: string; targetId?: string },
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/hover`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ ref: opts.ref, targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserDrag(
baseUrl: string,
opts: { startRef: string; endRef: string; targetId?: string },
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/drag`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
startRef: opts.startRef,
endRef: opts.endRef,
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}
export async function browserSelectOption(
baseUrl: string,
opts: { ref: string; values: string[]; targetId?: string },
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/select`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
ref: opts.ref,
values: opts.values,
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}
export async function browserUpload(
baseUrl: string,
opts: { paths?: string[]; targetId?: string } = {},
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/upload`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ paths: opts.paths, targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserFillForm(
baseUrl: string,
opts: { fields: Array<Record<string, unknown>>; targetId?: string },
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/fill`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ fields: opts.fields, targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserHandleDialog(
baseUrl: string,
opts: { accept: boolean; promptText?: string; targetId?: string },
): Promise<{ ok: true; message: string; type: string }> {
return await fetchBrowserJson<{ ok: true; message: string; type: string }>(
`${baseUrl}/dialog`,
{
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
accept: opts.accept,
promptText: opts.promptText,
targetId: opts.targetId,
}),
timeoutMs: 20000,
},
);
}
export async function browserWaitFor(
baseUrl: string,
opts: {
time?: number;
text?: string;
textGone?: string;
targetId?: string;
},
): Promise<BrowserActionTabResult> {
return await fetchBrowserJson<BrowserActionTabResult>(`${baseUrl}/wait`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
time: opts.time,
text: opts.text,
textGone: opts.textGone,
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}
export async function browserEvaluate(
baseUrl: string,
opts: { fn: string; ref?: string; targetId?: string },
): Promise<{ ok: true; result: unknown }> {
return await fetchBrowserJson<{ ok: true; result: unknown }>(
`${baseUrl}/evaluate`,
{
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
function: opts.fn,
ref: opts.ref,
targetId: opts.targetId,
}),
timeoutMs: 20000,
},
);
}
export async function browserRunCode(
baseUrl: string,
opts: { code: string; targetId?: string },
): Promise<{ ok: true; result: unknown }> {
return await fetchBrowserJson<{ ok: true; result: unknown }>(
`${baseUrl}/run`,
{
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ code: opts.code, targetId: opts.targetId }),
timeoutMs: 20000,
},
);
}
export async function browserScreenshotAction(
baseUrl: string,
opts: {
targetId?: string;
fullPage?: boolean;
ref?: string;
element?: string;
type?: "png" | "jpeg";
filename?: string;
},
): Promise<ScreenshotResult & { filename?: string }> {
return await fetchBrowserJson<ScreenshotResult & { filename?: string }>(
`${baseUrl}/screenshot`,
{
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
targetId: opts.targetId,
fullPage: opts.fullPage,
ref: opts.ref,
element: opts.element,
type: opts.type,
filename: opts.filename,
}),
timeoutMs: 20000,
},
);
}

View File

@ -0,0 +1,207 @@
import type {
BrowserActionOk,
BrowserActionPathResult,
} from "./client-actions-types.js";
import { fetchBrowserJson } from "./client-fetch.js";
import type {
BrowserConsoleMessage,
BrowserNetworkRequest,
} from "./pw-session.js";
export async function browserConsoleMessages(
baseUrl: string,
opts: { level?: string; targetId?: string } = {},
): Promise<{ ok: true; messages: BrowserConsoleMessage[]; targetId: string }> {
const q = new URLSearchParams();
if (opts.level) q.set("level", opts.level);
if (opts.targetId) q.set("targetId", opts.targetId);
const suffix = q.toString() ? `?${q.toString()}` : "";
return await fetchBrowserJson<{
ok: true;
messages: BrowserConsoleMessage[];
targetId: string;
}>(`${baseUrl}/console${suffix}`, { timeoutMs: 20000 });
}
export async function browserNetworkRequests(
baseUrl: string,
opts: { includeStatic?: boolean; targetId?: string } = {},
): Promise<{ ok: true; requests: BrowserNetworkRequest[]; targetId: string }> {
const q = new URLSearchParams();
if (opts.includeStatic) q.set("includeStatic", "true");
if (opts.targetId) q.set("targetId", opts.targetId);
const suffix = q.toString() ? `?${q.toString()}` : "";
return await fetchBrowserJson<{
ok: true;
requests: BrowserNetworkRequest[];
targetId: string;
}>(`${baseUrl}/network${suffix}`, { timeoutMs: 20000 });
}
export async function browserStartTracing(
baseUrl: string,
opts: { targetId?: string } = {},
): Promise<BrowserActionOk> {
return await fetchBrowserJson<BrowserActionOk>(`${baseUrl}/trace/start`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserStopTracing(
baseUrl: string,
opts: { targetId?: string } = {},
): Promise<BrowserActionPathResult> {
return await fetchBrowserJson<BrowserActionPathResult>(
`${baseUrl}/trace/stop`,
{
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ targetId: opts.targetId }),
timeoutMs: 20000,
},
);
}
export async function browserPdfSave(
baseUrl: string,
opts: { targetId?: string } = {},
): Promise<BrowserActionPathResult> {
return await fetchBrowserJson<BrowserActionPathResult>(`${baseUrl}/pdf`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserVerifyElementVisible(
baseUrl: string,
opts: { role: string; accessibleName: string; targetId?: string },
): Promise<BrowserActionOk> {
return await fetchBrowserJson<BrowserActionOk>(`${baseUrl}/verify/element`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
role: opts.role,
accessibleName: opts.accessibleName,
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}
export async function browserVerifyTextVisible(
baseUrl: string,
opts: { text: string; targetId?: string },
): Promise<BrowserActionOk> {
return await fetchBrowserJson<BrowserActionOk>(`${baseUrl}/verify/text`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ text: opts.text, targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserVerifyListVisible(
baseUrl: string,
opts: { ref: string; items: string[]; targetId?: string },
): Promise<BrowserActionOk> {
return await fetchBrowserJson<BrowserActionOk>(`${baseUrl}/verify/list`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
ref: opts.ref,
items: opts.items,
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}
export async function browserVerifyValue(
baseUrl: string,
opts: { ref: string; type: string; value?: string; targetId?: string },
): Promise<BrowserActionOk> {
return await fetchBrowserJson<BrowserActionOk>(`${baseUrl}/verify/value`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
ref: opts.ref,
type: opts.type,
value: opts.value,
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}
export async function browserMouseMove(
baseUrl: string,
opts: { x: number; y: number; targetId?: string },
): Promise<BrowserActionOk> {
return await fetchBrowserJson<BrowserActionOk>(`${baseUrl}/mouse/move`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ x: opts.x, y: opts.y, targetId: opts.targetId }),
timeoutMs: 20000,
});
}
export async function browserMouseClick(
baseUrl: string,
opts: { x: number; y: number; button?: string; targetId?: string },
): Promise<BrowserActionOk> {
return await fetchBrowserJson<BrowserActionOk>(`${baseUrl}/mouse/click`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
x: opts.x,
y: opts.y,
button: opts.button,
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}
export async function browserMouseDrag(
baseUrl: string,
opts: {
startX: number;
startY: number;
endX: number;
endY: number;
targetId?: string;
},
): Promise<BrowserActionOk> {
return await fetchBrowserJson<BrowserActionOk>(`${baseUrl}/mouse/drag`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
startX: opts.startX,
startY: opts.startY,
endX: opts.endX,
endY: opts.endY,
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}
export async function browserGenerateLocator(
baseUrl: string,
opts: { ref: string },
): Promise<{ ok: true; locator: string }> {
return await fetchBrowserJson<{ ok: true; locator: string }>(
`${baseUrl}/locator`,
{
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ ref: opts.ref }),
timeoutMs: 20000,
},
);
}

View File

@ -0,0 +1,15 @@
export type BrowserActionOk = { ok: true };
export type BrowserActionTabResult = {
ok: true;
targetId: string;
url?: string;
};
export type BrowserActionPathResult = {
ok: true;
path: string;
targetId: string;
url?: string;
filename?: string;
};

View File

@ -0,0 +1,3 @@
export * from "./client-actions-core.js";
export * from "./client-actions-observe.js";
export * from "./client-actions-types.js";

View File

@ -0,0 +1,67 @@
function unwrapCause(err: unknown): unknown {
if (!err || typeof err !== "object") return null;
const cause = (err as { cause?: unknown }).cause;
return cause ?? null;
}
function enhanceBrowserFetchError(
url: string,
err: unknown,
timeoutMs: number,
): Error {
const cause = unwrapCause(err);
const code =
(cause && typeof cause === "object" && "code" in cause
? String((cause as { code?: unknown }).code ?? "")
: "") ||
(err && typeof err === "object" && "code" in err
? String((err as { code?: unknown }).code ?? "")
: "");
const hint =
"Start (or restart) the Clawdis gateway (Clawdis.app menubar, or `clawdis gateway`) and try again.";
if (code === "ECONNREFUSED") {
return new Error(
`Can't reach the clawd browser control server at ${url} (connection refused). ${hint}`,
);
}
if (code === "ETIMEDOUT" || code === "UND_ERR_CONNECT_TIMEOUT") {
return new Error(
`Can't reach the clawd browser control server at ${url} (timed out after ${timeoutMs}ms). ${hint}`,
);
}
const msg = String(err);
if (msg.toLowerCase().includes("abort")) {
return new Error(
`Can't reach the clawd browser control server at ${url} (timed out after ${timeoutMs}ms). ${hint}`,
);
}
return new Error(
`Can't reach the clawd browser control server at ${url}. ${hint} (${msg})`,
);
}
export async function fetchBrowserJson<T>(
url: string,
init?: RequestInit & { timeoutMs?: number },
): Promise<T> {
const timeoutMs = init?.timeoutMs ?? 5000;
const ctrl = new AbortController();
const t = setTimeout(() => ctrl.abort(), timeoutMs);
let res: Response;
try {
res = await fetch(url, { ...init, signal: ctrl.signal } as RequestInit);
} catch (err) {
throw enhanceBrowserFetchError(url, err, timeoutMs);
} finally {
clearTimeout(t);
}
if (!res.ok) {
const text = await res.text().catch(() => "");
throw new Error(text ? `${res.status}: ${text}` : `HTTP ${res.status}`);
}
return (await res.json()) as T;
}

View File

@ -1,4 +1,5 @@
import { loadConfig } from "../config/config.js"; import { loadConfig } from "../config/config.js";
import { fetchBrowserJson } from "./client-fetch.js";
import { resolveBrowserConfig } from "./config.js"; import { resolveBrowserConfig } from "./config.js";
export type BrowserStatus = { export type BrowserStatus = {
@ -21,11 +22,6 @@ export type BrowserTab = {
type?: string; type?: string;
}; };
export type BrowserToolResponse = {
ok: true;
[key: string]: unknown;
};
export type ScreenshotResult = { export type ScreenshotResult = {
ok: true; ok: true;
path: string; path: string;
@ -117,74 +113,6 @@ export type SnapshotResult =
snapshot: string; snapshot: string;
}; };
function unwrapCause(err: unknown): unknown {
if (!err || typeof err !== "object") return null;
const cause = (err as { cause?: unknown }).cause;
return cause ?? null;
}
function enhanceBrowserFetchError(
url: string,
err: unknown,
timeoutMs: number,
): Error {
const cause = unwrapCause(err);
const code =
(cause && typeof cause === "object" && "code" in cause
? String((cause as { code?: unknown }).code ?? "")
: "") ||
(err && typeof err === "object" && "code" in err
? String((err as { code?: unknown }).code ?? "")
: "");
const hint =
"Start (or restart) the Clawdis gateway (Clawdis.app menubar, or `clawdis gateway`) and try again.";
if (code === "ECONNREFUSED") {
return new Error(
`Can't reach the clawd browser control server at ${url} (connection refused). ${hint}`,
);
}
if (code === "ETIMEDOUT" || code === "UND_ERR_CONNECT_TIMEOUT") {
return new Error(
`Can't reach the clawd browser control server at ${url} (timed out after ${timeoutMs}ms). ${hint}`,
);
}
const msg = String(err);
if (msg.toLowerCase().includes("abort")) {
return new Error(
`Can't reach the clawd browser control server at ${url} (timed out after ${timeoutMs}ms). ${hint}`,
);
}
return new Error(
`Can't reach the clawd browser control server at ${url}. ${hint} (${msg})`,
);
}
async function fetchJson<T>(
url: string,
init?: RequestInit & { timeoutMs?: number },
): Promise<T> {
const timeoutMs = init?.timeoutMs ?? 5000;
const ctrl = new AbortController();
const t = setTimeout(() => ctrl.abort(), timeoutMs);
let res: Response;
try {
res = await fetch(url, { ...init, signal: ctrl.signal } as RequestInit);
} catch (err) {
throw enhanceBrowserFetchError(url, err, timeoutMs);
} finally {
clearTimeout(t);
}
if (!res.ok) {
const text = await res.text().catch(() => "");
throw new Error(text ? `${res.status}: ${text}` : `HTTP ${res.status}`);
}
return (await res.json()) as T;
}
export function resolveBrowserControlUrl(overrideUrl?: string) { export function resolveBrowserControlUrl(overrideUrl?: string) {
const cfg = loadConfig(); const cfg = loadConfig();
const resolved = resolveBrowserConfig(cfg.browser); const resolved = resolveBrowserConfig(cfg.browser);
@ -193,19 +121,27 @@ export function resolveBrowserControlUrl(overrideUrl?: string) {
} }
export async function browserStatus(baseUrl: string): Promise<BrowserStatus> { export async function browserStatus(baseUrl: string): Promise<BrowserStatus> {
return await fetchJson<BrowserStatus>(`${baseUrl}/`, { timeoutMs: 1500 }); return await fetchBrowserJson<BrowserStatus>(`${baseUrl}/`, {
timeoutMs: 1500,
});
} }
export async function browserStart(baseUrl: string): Promise<void> { export async function browserStart(baseUrl: string): Promise<void> {
await fetchJson(`${baseUrl}/start`, { method: "POST", timeoutMs: 15000 }); await fetchBrowserJson(`${baseUrl}/start`, {
method: "POST",
timeoutMs: 15000,
});
} }
export async function browserStop(baseUrl: string): Promise<void> { export async function browserStop(baseUrl: string): Promise<void> {
await fetchJson(`${baseUrl}/stop`, { method: "POST", timeoutMs: 15000 }); await fetchBrowserJson(`${baseUrl}/stop`, {
method: "POST",
timeoutMs: 15000,
});
} }
export async function browserTabs(baseUrl: string): Promise<BrowserTab[]> { export async function browserTabs(baseUrl: string): Promise<BrowserTab[]> {
const res = await fetchJson<{ running: boolean; tabs: BrowserTab[] }>( const res = await fetchBrowserJson<{ running: boolean; tabs: BrowserTab[] }>(
`${baseUrl}/tabs`, `${baseUrl}/tabs`,
{ timeoutMs: 3000 }, { timeoutMs: 3000 },
); );
@ -216,7 +152,7 @@ export async function browserOpenTab(
baseUrl: string, baseUrl: string,
url: string, url: string,
): Promise<BrowserTab> { ): Promise<BrowserTab> {
return await fetchJson<BrowserTab>(`${baseUrl}/tabs/open`, { return await fetchBrowserJson<BrowserTab>(`${baseUrl}/tabs/open`, {
method: "POST", method: "POST",
headers: { "Content-Type": "application/json" }, headers: { "Content-Type": "application/json" },
body: JSON.stringify({ url }), body: JSON.stringify({ url }),
@ -228,7 +164,7 @@ export async function browserFocusTab(
baseUrl: string, baseUrl: string,
targetId: string, targetId: string,
): Promise<void> { ): Promise<void> {
await fetchJson(`${baseUrl}/tabs/focus`, { await fetchBrowserJson(`${baseUrl}/tabs/focus`, {
method: "POST", method: "POST",
headers: { "Content-Type": "application/json" }, headers: { "Content-Type": "application/json" },
body: JSON.stringify({ targetId }), body: JSON.stringify({ targetId }),
@ -240,7 +176,7 @@ export async function browserCloseTab(
baseUrl: string, baseUrl: string,
targetId: string, targetId: string,
): Promise<void> { ): Promise<void> {
await fetchJson(`${baseUrl}/tabs/${encodeURIComponent(targetId)}`, { await fetchBrowserJson(`${baseUrl}/tabs/${encodeURIComponent(targetId)}`, {
method: "DELETE", method: "DELETE",
timeoutMs: 5000, timeoutMs: 5000,
}); });
@ -257,9 +193,12 @@ export async function browserScreenshot(
if (opts.targetId) q.set("targetId", opts.targetId); if (opts.targetId) q.set("targetId", opts.targetId);
if (opts.fullPage) q.set("fullPage", "true"); if (opts.fullPage) q.set("fullPage", "true");
const suffix = q.toString() ? `?${q.toString()}` : ""; const suffix = q.toString() ? `?${q.toString()}` : "";
return await fetchJson<ScreenshotResult>(`${baseUrl}/screenshot${suffix}`, { return await fetchBrowserJson<ScreenshotResult>(
timeoutMs: 20000, `${baseUrl}/screenshot${suffix}`,
}); {
timeoutMs: 20000,
},
);
} }
export async function browserEval( export async function browserEval(
@ -270,7 +209,7 @@ export async function browserEval(
awaitPromise?: boolean; awaitPromise?: boolean;
}, },
): Promise<EvalResult> { ): Promise<EvalResult> {
return await fetchJson<EvalResult>(`${baseUrl}/eval`, { return await fetchBrowserJson<EvalResult>(`${baseUrl}/eval`, {
method: "POST", method: "POST",
headers: { "Content-Type": "application/json" }, headers: { "Content-Type": "application/json" },
body: JSON.stringify({ body: JSON.stringify({
@ -294,9 +233,12 @@ export async function browserQuery(
q.set("selector", opts.selector); q.set("selector", opts.selector);
if (opts.targetId) q.set("targetId", opts.targetId); if (opts.targetId) q.set("targetId", opts.targetId);
if (typeof opts.limit === "number") q.set("limit", String(opts.limit)); if (typeof opts.limit === "number") q.set("limit", String(opts.limit));
return await fetchJson<QueryResult>(`${baseUrl}/query?${q.toString()}`, { return await fetchBrowserJson<QueryResult>(
timeoutMs: 15000, `${baseUrl}/query?${q.toString()}`,
}); {
timeoutMs: 15000,
},
);
} }
export async function browserDom( export async function browserDom(
@ -314,7 +256,7 @@ export async function browserDom(
if (typeof opts.maxChars === "number") if (typeof opts.maxChars === "number")
q.set("maxChars", String(opts.maxChars)); q.set("maxChars", String(opts.maxChars));
if (opts.selector) q.set("selector", opts.selector); if (opts.selector) q.set("selector", opts.selector);
return await fetchJson<DomResult>(`${baseUrl}/dom?${q.toString()}`, { return await fetchBrowserJson<DomResult>(`${baseUrl}/dom?${q.toString()}`, {
timeoutMs: 20000, timeoutMs: 20000,
}); });
} }
@ -331,7 +273,7 @@ export async function browserSnapshot(
q.set("format", opts.format); q.set("format", opts.format);
if (opts.targetId) q.set("targetId", opts.targetId); if (opts.targetId) q.set("targetId", opts.targetId);
if (typeof opts.limit === "number") q.set("limit", String(opts.limit)); if (typeof opts.limit === "number") q.set("limit", String(opts.limit));
return await fetchJson<SnapshotResult>( return await fetchBrowserJson<SnapshotResult>(
`${baseUrl}/snapshot?${q.toString()}`, `${baseUrl}/snapshot?${q.toString()}`,
{ {
timeoutMs: 20000, timeoutMs: 20000,
@ -346,7 +288,7 @@ export async function browserClickRef(
targetId?: string; targetId?: string;
}, },
): Promise<{ ok: true; targetId: string; url: string }> { ): Promise<{ ok: true; targetId: string; url: string }> {
return await fetchJson<{ ok: true; targetId: string; url: string }>( return await fetchBrowserJson<{ ok: true; targetId: string; url: string }>(
`${baseUrl}/click`, `${baseUrl}/click`,
{ {
method: "POST", method: "POST",
@ -360,22 +302,4 @@ export async function browserClickRef(
); );
} }
export async function browserTool( // Actions beyond the basic read-only commands live in client-actions.ts.
baseUrl: string,
opts: {
name: string;
args?: Record<string, unknown>;
targetId?: string;
},
): Promise<BrowserToolResponse> {
return await fetchJson<BrowserToolResponse>(`${baseUrl}/tool`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
name: opts.name,
args: opts.args ?? {},
targetId: opts.targetId,
}),
timeoutMs: 20000,
});
}

View File

@ -21,37 +21,13 @@ const pw = vi.hoisted(() => ({
resizeViewportViaPlaywright: vi.fn().mockResolvedValue(undefined), resizeViewportViaPlaywright: vi.fn().mockResolvedValue(undefined),
runCodeViaPlaywright: vi.fn().mockResolvedValue("ok"), runCodeViaPlaywright: vi.fn().mockResolvedValue("ok"),
selectOptionViaPlaywright: vi.fn().mockResolvedValue(undefined), selectOptionViaPlaywright: vi.fn().mockResolvedValue(undefined),
snapshotAiViaPlaywright: vi
.fn()
.mockResolvedValue({ snapshot: "SNAP" }),
takeScreenshotViaPlaywright: vi
.fn()
.mockResolvedValue({ buffer: Buffer.from("png") }),
typeViaPlaywright: vi.fn().mockResolvedValue(undefined), typeViaPlaywright: vi.fn().mockResolvedValue(undefined),
waitForViaPlaywright: vi.fn().mockResolvedValue(undefined), waitForViaPlaywright: vi.fn().mockResolvedValue(undefined),
})); }));
const screenshot = vi.hoisted(() => ({
DEFAULT_BROWSER_SCREENSHOT_MAX_BYTES: 128,
DEFAULT_BROWSER_SCREENSHOT_MAX_SIDE: 64,
normalizeBrowserScreenshot: vi
.fn()
.mockImplementation(async (buf: Buffer) => ({
buffer: buf,
contentType: "image/png",
})),
}));
const media = vi.hoisted(() => ({
ensureMediaDir: vi.fn().mockResolvedValue(undefined),
saveMediaBuffer: vi.fn().mockResolvedValue({ path: "/tmp/fake.png" }),
}));
vi.mock("../pw-ai.js", () => pw); vi.mock("../pw-ai.js", () => pw);
vi.mock("../screenshot.js", () => screenshot);
vi.mock("../../media/store.js", () => media);
import { handleBrowserToolCore } from "./tool-core.js"; import { handleBrowserActionCore } from "./actions-core.js";
const baseTab = { const baseTab = {
targetId: "tab1", targetId: "tab1",
@ -84,10 +60,12 @@ function createCtx(
ensureBrowserAvailable: vi.fn().mockResolvedValue(undefined), ensureBrowserAvailable: vi.fn().mockResolvedValue(undefined),
ensureTabAvailable: vi.fn().mockResolvedValue(baseTab), ensureTabAvailable: vi.fn().mockResolvedValue(baseTab),
isReachable: vi.fn().mockResolvedValue(true), isReachable: vi.fn().mockResolvedValue(true),
listTabs: vi.fn().mockResolvedValue([ listTabs: vi
baseTab, .fn()
{ targetId: "tab2", title: "Two", url: "https://example.com/2" }, .mockResolvedValue([
]), baseTab,
{ targetId: "tab2", title: "Two", url: "https://example.com/2" },
]),
openTab: vi.fn().mockResolvedValue({ openTab: vi.fn().mockResolvedValue({
targetId: "newtab", targetId: "newtab",
title: "", title: "",
@ -102,15 +80,15 @@ function createCtx(
}; };
} }
async function callTool( async function callAction(
name: string, action: Parameters<typeof handleBrowserActionCore>[0]["action"],
args: Record<string, unknown> = {}, args: Record<string, unknown> = {},
ctxOverride?: Partial<BrowserRouteContext>, ctxOverride?: Partial<BrowserRouteContext>,
) { ) {
const res = createRes(); const res = createRes();
const ctx = createCtx(ctxOverride); const ctx = createCtx(ctxOverride);
const handled = await handleBrowserToolCore({ const handled = await handleBrowserActionCore({
name, action,
args, args,
targetId: "", targetId: "",
cdpPort: 18792, cdpPort: 18792,
@ -124,25 +102,30 @@ beforeEach(() => {
vi.clearAllMocks(); vi.clearAllMocks();
}); });
describe("handleBrowserToolCore", () => { describe("handleBrowserActionCore", () => {
it("dispatches core Playwright tools", async () => { it("dispatches core browser actions", async () => {
const cases = [ const cases = [
{ {
name: "browser_close", action: "close" as const,
args: {}, args: {},
fn: pw.closePageViaPlaywright, fn: pw.closePageViaPlaywright,
expectArgs: { cdpPort: 18792, targetId: "tab1" }, expectArgs: { cdpPort: 18792, targetId: "tab1" },
expectBody: { ok: true, targetId: "tab1", url: baseTab.url }, expectBody: { ok: true, targetId: "tab1", url: baseTab.url },
}, },
{ {
name: "browser_resize", action: "resize" as const,
args: { width: 800, height: 600 }, args: { width: 800, height: 600 },
fn: pw.resizeViewportViaPlaywright, fn: pw.resizeViewportViaPlaywright,
expectArgs: { cdpPort: 18792, targetId: "tab1", width: 800, height: 600 }, expectArgs: {
cdpPort: 18792,
targetId: "tab1",
width: 800,
height: 600,
},
expectBody: { ok: true, targetId: "tab1", url: baseTab.url }, expectBody: { ok: true, targetId: "tab1", url: baseTab.url },
}, },
{ {
name: "browser_handle_dialog", action: "dialog" as const,
args: { accept: true, promptText: "ok" }, args: { accept: true, promptText: "ok" },
fn: pw.handleDialogViaPlaywright, fn: pw.handleDialogViaPlaywright,
expectArgs: { expectArgs: {
@ -154,7 +137,7 @@ describe("handleBrowserToolCore", () => {
expectBody: { ok: true, message: "ok", type: "alert" }, expectBody: { ok: true, message: "ok", type: "alert" },
}, },
{ {
name: "browser_evaluate", action: "evaluate" as const,
args: { function: "() => 1", ref: "1" }, args: { function: "() => 1", ref: "1" },
fn: pw.evaluateViaPlaywright, fn: pw.evaluateViaPlaywright,
expectArgs: { expectArgs: {
@ -166,7 +149,7 @@ describe("handleBrowserToolCore", () => {
expectBody: { ok: true, result: "result" }, expectBody: { ok: true, result: "result" },
}, },
{ {
name: "browser_file_upload", action: "upload" as const,
args: { paths: ["/tmp/file.txt"] }, args: { paths: ["/tmp/file.txt"] },
fn: pw.fileUploadViaPlaywright, fn: pw.fileUploadViaPlaywright,
expectArgs: { expectArgs: {
@ -177,7 +160,7 @@ describe("handleBrowserToolCore", () => {
expectBody: { ok: true, targetId: "tab1" }, expectBody: { ok: true, targetId: "tab1" },
}, },
{ {
name: "browser_fill_form", action: "fill" as const,
args: { fields: [{ ref: "1", value: "x" }] }, args: { fields: [{ ref: "1", value: "x" }] },
fn: pw.fillFormViaPlaywright, fn: pw.fillFormViaPlaywright,
expectArgs: { expectArgs: {
@ -188,14 +171,14 @@ describe("handleBrowserToolCore", () => {
expectBody: { ok: true, targetId: "tab1" }, expectBody: { ok: true, targetId: "tab1" },
}, },
{ {
name: "browser_press_key", action: "press" as const,
args: { key: "Enter" }, args: { key: "Enter" },
fn: pw.pressKeyViaPlaywright, fn: pw.pressKeyViaPlaywright,
expectArgs: { cdpPort: 18792, targetId: "tab1", key: "Enter" }, expectArgs: { cdpPort: 18792, targetId: "tab1", key: "Enter" },
expectBody: { ok: true, targetId: "tab1" }, expectBody: { ok: true, targetId: "tab1" },
}, },
{ {
name: "browser_type", action: "type" as const,
args: { ref: "2", text: "hi", submit: true, slowly: true }, args: { ref: "2", text: "hi", submit: true, slowly: true },
fn: pw.typeViaPlaywright, fn: pw.typeViaPlaywright,
expectArgs: { expectArgs: {
@ -209,7 +192,7 @@ describe("handleBrowserToolCore", () => {
expectBody: { ok: true, targetId: "tab1" }, expectBody: { ok: true, targetId: "tab1" },
}, },
{ {
name: "browser_navigate", action: "navigate" as const,
args: { url: "https://example.com" }, args: { url: "https://example.com" },
fn: pw.navigateViaPlaywright, fn: pw.navigateViaPlaywright,
expectArgs: { expectArgs: {
@ -220,21 +203,21 @@ describe("handleBrowserToolCore", () => {
expectBody: { ok: true, targetId: "tab1", url: baseTab.url }, expectBody: { ok: true, targetId: "tab1", url: baseTab.url },
}, },
{ {
name: "browser_navigate_back", action: "back" as const,
args: {}, args: {},
fn: pw.navigateBackViaPlaywright, fn: pw.navigateBackViaPlaywright,
expectArgs: { cdpPort: 18792, targetId: "tab1" }, expectArgs: { cdpPort: 18792, targetId: "tab1" },
expectBody: { ok: true, targetId: "tab1", url: "about:blank" }, expectBody: { ok: true, targetId: "tab1", url: "about:blank" },
}, },
{ {
name: "browser_run_code", action: "run" as const,
args: { code: "return 1" }, args: { code: "return 1" },
fn: pw.runCodeViaPlaywright, fn: pw.runCodeViaPlaywright,
expectArgs: { cdpPort: 18792, targetId: "tab1", code: "return 1" }, expectArgs: { cdpPort: 18792, targetId: "tab1", code: "return 1" },
expectBody: { ok: true, result: "ok" }, expectBody: { ok: true, result: "ok" },
}, },
{ {
name: "browser_click", action: "click" as const,
args: { args: {
ref: "1", ref: "1",
doubleClick: true, doubleClick: true,
@ -253,7 +236,7 @@ describe("handleBrowserToolCore", () => {
expectBody: { ok: true, targetId: "tab1", url: baseTab.url }, expectBody: { ok: true, targetId: "tab1", url: baseTab.url },
}, },
{ {
name: "browser_drag", action: "drag" as const,
args: { startRef: "1", endRef: "2" }, args: { startRef: "1", endRef: "2" },
fn: pw.dragViaPlaywright, fn: pw.dragViaPlaywright,
expectArgs: { expectArgs: {
@ -265,14 +248,14 @@ describe("handleBrowserToolCore", () => {
expectBody: { ok: true, targetId: "tab1" }, expectBody: { ok: true, targetId: "tab1" },
}, },
{ {
name: "browser_hover", action: "hover" as const,
args: { ref: "3" }, args: { ref: "3" },
fn: pw.hoverViaPlaywright, fn: pw.hoverViaPlaywright,
expectArgs: { cdpPort: 18792, targetId: "tab1", ref: "3" }, expectArgs: { cdpPort: 18792, targetId: "tab1", ref: "3" },
expectBody: { ok: true, targetId: "tab1" }, expectBody: { ok: true, targetId: "tab1" },
}, },
{ {
name: "browser_select_option", action: "select" as const,
args: { ref: "4", values: ["A"] }, args: { ref: "4", values: ["A"] },
fn: pw.selectOptionViaPlaywright, fn: pw.selectOptionViaPlaywright,
expectArgs: { expectArgs: {
@ -284,7 +267,7 @@ describe("handleBrowserToolCore", () => {
expectBody: { ok: true, targetId: "tab1" }, expectBody: { ok: true, targetId: "tab1" },
}, },
{ {
name: "browser_wait_for", action: "wait" as const,
args: { time: 500, text: "ok", textGone: "bye" }, args: { time: 500, text: "ok", textGone: "bye" },
fn: pw.waitForViaPlaywright, fn: pw.waitForViaPlaywright,
expectArgs: { expectArgs: {
@ -299,120 +282,10 @@ describe("handleBrowserToolCore", () => {
]; ];
for (const item of cases) { for (const item of cases) {
const { res, handled } = await callTool(item.name, item.args); const { res, handled } = await callAction(item.action, item.args);
expect(handled).toBe(true); expect(handled).toBe(true);
expect(item.fn).toHaveBeenCalledWith(item.expectArgs); expect(item.fn).toHaveBeenCalledWith(item.expectArgs);
expect(res.body).toEqual(item.expectBody); expect(res.body).toEqual(item.expectBody);
} }
}); });
it("handles screenshots via media storage", async () => {
const { res } = await callTool("browser_take_screenshot", {
type: "jpeg",
ref: "1",
fullPage: true,
element: "main",
filename: "shot.jpg",
});
expect(pw.takeScreenshotViaPlaywright).toHaveBeenCalledWith({
cdpPort: 18792,
targetId: "tab1",
ref: "1",
element: "main",
fullPage: true,
type: "jpeg",
});
expect(media.ensureMediaDir).toHaveBeenCalled();
expect(media.saveMediaBuffer).toHaveBeenCalled();
expect(res.body).toMatchObject({
ok: true,
path: "/tmp/fake.png",
filename: "shot.jpg",
targetId: "tab1",
url: baseTab.url,
});
});
it("handles snapshots with optional file output", async () => {
const { res } = await callTool("browser_snapshot", {
filename: "snapshot.txt",
});
expect(pw.snapshotAiViaPlaywright).toHaveBeenCalledWith({
cdpPort: 18792,
targetId: "tab1",
});
expect(media.ensureMediaDir).toHaveBeenCalled();
expect(media.saveMediaBuffer).toHaveBeenCalledWith(
expect.any(Buffer),
"text/plain",
"browser",
);
expect(res.body).toMatchObject({
ok: true,
path: "/tmp/fake.png",
filename: "snapshot.txt",
targetId: "tab1",
url: baseTab.url,
});
});
it("returns a message for browser_install", async () => {
const { res } = await callTool("browser_install");
expect(res.body).toMatchObject({ ok: true });
});
it("supports browser_tabs actions", async () => {
const ctx = createCtx();
const listRes = createRes();
await handleBrowserToolCore({
name: "browser_tabs",
args: { action: "list" },
targetId: "",
cdpPort: 18792,
ctx,
res: listRes,
});
expect(listRes.body).toMatchObject({ ok: true });
expect(ctx.listTabs).toHaveBeenCalled();
const newRes = createRes();
await handleBrowserToolCore({
name: "browser_tabs",
args: { action: "new" },
targetId: "",
cdpPort: 18792,
ctx,
res: newRes,
});
expect(ctx.ensureBrowserAvailable).toHaveBeenCalled();
expect(ctx.openTab).toHaveBeenCalled();
expect(newRes.body).toMatchObject({ ok: true, tab: { targetId: "newtab" } });
const closeRes = createRes();
await handleBrowserToolCore({
name: "browser_tabs",
args: { action: "close", index: 1 },
targetId: "",
cdpPort: 18792,
ctx,
res: closeRes,
});
expect(ctx.closeTab).toHaveBeenCalledWith("tab2");
expect(closeRes.body).toMatchObject({ ok: true, targetId: "tab2" });
const selectRes = createRes();
await handleBrowserToolCore({
name: "browser_tabs",
args: { action: "select", index: 0 },
targetId: "",
cdpPort: 18792,
ctx,
res: selectRes,
});
expect(ctx.focusTab).toHaveBeenCalledWith("tab1");
expect(selectRes.body).toMatchObject({ ok: true, targetId: "tab1" });
});
}); });

View File

@ -1,8 +1,5 @@
import path from "node:path";
import type express from "express"; import type express from "express";
import { ensureMediaDir, saveMediaBuffer } from "../../media/store.js";
import { import {
clickViaPlaywright, clickViaPlaywright,
closePageViaPlaywright, closePageViaPlaywright,
@ -18,16 +15,9 @@ import {
resizeViewportViaPlaywright, resizeViewportViaPlaywright,
runCodeViaPlaywright, runCodeViaPlaywright,
selectOptionViaPlaywright, selectOptionViaPlaywright,
snapshotAiViaPlaywright,
takeScreenshotViaPlaywright,
typeViaPlaywright, typeViaPlaywright,
waitForViaPlaywright, waitForViaPlaywright,
} from "../pw-ai.js"; } from "../pw-ai.js";
import {
DEFAULT_BROWSER_SCREENSHOT_MAX_BYTES,
DEFAULT_BROWSER_SCREENSHOT_MAX_SIDE,
normalizeBrowserScreenshot,
} from "../screenshot.js";
import type { BrowserRouteContext } from "../server-context.js"; import type { BrowserRouteContext } from "../server-context.js";
import { import {
jsonError, jsonError,
@ -37,8 +27,26 @@ import {
toStringOrEmpty, toStringOrEmpty,
} from "./utils.js"; } from "./utils.js";
type ToolCoreParams = { export type BrowserActionCore =
name: string; | "back"
| "click"
| "close"
| "dialog"
| "drag"
| "evaluate"
| "fill"
| "hover"
| "navigate"
| "press"
| "resize"
| "run"
| "select"
| "type"
| "upload"
| "wait";
type ActionCoreParams = {
action: BrowserActionCore;
args: Record<string, unknown>; args: Record<string, unknown>;
targetId: string; targetId: string;
cdpPort: number; cdpPort: number;
@ -46,20 +54,20 @@ type ToolCoreParams = {
res: express.Response; res: express.Response;
}; };
export async function handleBrowserToolCore( export async function handleBrowserActionCore(
params: ToolCoreParams, params: ActionCoreParams,
): Promise<boolean> { ): Promise<boolean> {
const { name, args, targetId, cdpPort, ctx, res } = params; const { action, args, targetId, cdpPort, ctx, res } = params;
const target = targetId || undefined; const target = targetId || undefined;
switch (name) { switch (action) {
case "browser_close": { case "close": {
const tab = await ctx.ensureTabAvailable(target); const tab = await ctx.ensureTabAvailable(target);
await closePageViaPlaywright({ cdpPort, targetId: tab.targetId }); await closePageViaPlaywright({ cdpPort, targetId: tab.targetId });
res.json({ ok: true, targetId: tab.targetId, url: tab.url }); res.json({ ok: true, targetId: tab.targetId, url: tab.url });
return true; return true;
} }
case "browser_resize": { case "resize": {
const width = toNumber(args.width); const width = toNumber(args.width);
const height = toNumber(args.height); const height = toNumber(args.height);
if (!width || !height) { if (!width || !height) {
@ -76,7 +84,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId, url: tab.url }); res.json({ ok: true, targetId: tab.targetId, url: tab.url });
return true; return true;
} }
case "browser_handle_dialog": { case "dialog": {
const accept = toBoolean(args.accept); const accept = toBoolean(args.accept);
if (accept === undefined) { if (accept === undefined) {
jsonError(res, 400, "accept is required"); jsonError(res, 400, "accept is required");
@ -93,7 +101,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, ...result }); res.json({ ok: true, ...result });
return true; return true;
} }
case "browser_evaluate": { case "evaluate": {
const fn = toStringOrEmpty(args.function); const fn = toStringOrEmpty(args.function);
if (!fn) { if (!fn) {
jsonError(res, 400, "function is required"); jsonError(res, 400, "function is required");
@ -110,7 +118,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, result }); res.json({ ok: true, result });
return true; return true;
} }
case "browser_file_upload": { case "upload": {
const paths = toStringArray(args.paths) ?? []; const paths = toStringArray(args.paths) ?? [];
const tab = await ctx.ensureTabAvailable(target); const tab = await ctx.ensureTabAvailable(target);
await fileUploadViaPlaywright({ await fileUploadViaPlaywright({
@ -121,7 +129,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId }); res.json({ ok: true, targetId: tab.targetId });
return true; return true;
} }
case "browser_fill_form": { case "fill": {
const fields = Array.isArray(args.fields) const fields = Array.isArray(args.fields)
? (args.fields as Array<Record<string, unknown>>) ? (args.fields as Array<Record<string, unknown>>)
: null; : null;
@ -138,15 +146,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId }); res.json({ ok: true, targetId: tab.targetId });
return true; return true;
} }
case "browser_install": { case "press": {
res.json({
ok: true,
message:
"clawd browser uses system Chrome/Chromium; no Playwright install needed.",
});
return true;
}
case "browser_press_key": {
const key = toStringOrEmpty(args.key); const key = toStringOrEmpty(args.key);
if (!key) { if (!key) {
jsonError(res, 400, "key is required"); jsonError(res, 400, "key is required");
@ -161,7 +161,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId }); res.json({ ok: true, targetId: tab.targetId });
return true; return true;
} }
case "browser_type": { case "type": {
const ref = toStringOrEmpty(args.ref); const ref = toStringOrEmpty(args.ref);
const text = toStringOrEmpty(args.text); const text = toStringOrEmpty(args.text);
if (!ref || !text) { if (!ref || !text) {
@ -182,7 +182,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId }); res.json({ ok: true, targetId: tab.targetId });
return true; return true;
} }
case "browser_navigate": { case "navigate": {
const url = toStringOrEmpty(args.url); const url = toStringOrEmpty(args.url);
if (!url) { if (!url) {
jsonError(res, 400, "url is required"); jsonError(res, 400, "url is required");
@ -197,7 +197,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId, ...result }); res.json({ ok: true, targetId: tab.targetId, ...result });
return true; return true;
} }
case "browser_navigate_back": { case "back": {
const tab = await ctx.ensureTabAvailable(target); const tab = await ctx.ensureTabAvailable(target);
const result = await navigateBackViaPlaywright({ const result = await navigateBackViaPlaywright({
cdpPort, cdpPort,
@ -206,7 +206,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId, ...result }); res.json({ ok: true, targetId: tab.targetId, ...result });
return true; return true;
} }
case "browser_run_code": { case "run": {
const code = toStringOrEmpty(args.code); const code = toStringOrEmpty(args.code);
if (!code) { if (!code) {
jsonError(res, 400, "code is required"); jsonError(res, 400, "code is required");
@ -221,73 +221,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, result }); res.json({ ok: true, result });
return true; return true;
} }
case "browser_take_screenshot": { case "click": {
const type = args.type === "jpeg" ? "jpeg" : "png";
const ref = toStringOrEmpty(args.ref) || undefined;
const fullPage = toBoolean(args.fullPage) ?? false;
const element = toStringOrEmpty(args.element) || undefined;
const filename = toStringOrEmpty(args.filename) || undefined;
const tab = await ctx.ensureTabAvailable(target);
const snap = await takeScreenshotViaPlaywright({
cdpPort,
targetId: tab.targetId,
ref,
element,
fullPage,
type,
});
const normalized = await normalizeBrowserScreenshot(snap.buffer, {
maxSide: DEFAULT_BROWSER_SCREENSHOT_MAX_SIDE,
maxBytes: DEFAULT_BROWSER_SCREENSHOT_MAX_BYTES,
});
await ensureMediaDir();
const saved = await saveMediaBuffer(
normalized.buffer,
normalized.contentType ?? `image/${type}`,
"browser",
DEFAULT_BROWSER_SCREENSHOT_MAX_BYTES,
);
res.json({
ok: true,
path: path.resolve(saved.path),
filename,
targetId: tab.targetId,
url: tab.url,
});
return true;
}
case "browser_snapshot": {
const filename = toStringOrEmpty(args.filename) || undefined;
const tab = await ctx.ensureTabAvailable(target);
const snap = await snapshotAiViaPlaywright({
cdpPort,
targetId: tab.targetId,
});
if (filename) {
await ensureMediaDir();
const saved = await saveMediaBuffer(
Buffer.from(snap.snapshot, "utf8"),
"text/plain",
"browser",
);
res.json({
ok: true,
path: path.resolve(saved.path),
filename,
targetId: tab.targetId,
url: tab.url,
});
return true;
}
res.json({
ok: true,
snapshot: snap.snapshot,
targetId: tab.targetId,
url: tab.url,
});
return true;
}
case "browser_click": {
const ref = toStringOrEmpty(args.ref); const ref = toStringOrEmpty(args.ref);
if (!ref) { if (!ref) {
jsonError(res, 400, "ref is required"); jsonError(res, 400, "ref is required");
@ -310,7 +244,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId, url: tab.url }); res.json({ ok: true, targetId: tab.targetId, url: tab.url });
return true; return true;
} }
case "browser_drag": { case "drag": {
const startRef = toStringOrEmpty(args.startRef); const startRef = toStringOrEmpty(args.startRef);
const endRef = toStringOrEmpty(args.endRef); const endRef = toStringOrEmpty(args.endRef);
if (!startRef || !endRef) { if (!startRef || !endRef) {
@ -327,7 +261,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId }); res.json({ ok: true, targetId: tab.targetId });
return true; return true;
} }
case "browser_hover": { case "hover": {
const ref = toStringOrEmpty(args.ref); const ref = toStringOrEmpty(args.ref);
if (!ref) { if (!ref) {
jsonError(res, 400, "ref is required"); jsonError(res, 400, "ref is required");
@ -342,7 +276,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId }); res.json({ ok: true, targetId: tab.targetId });
return true; return true;
} }
case "browser_select_option": { case "select": {
const ref = toStringOrEmpty(args.ref); const ref = toStringOrEmpty(args.ref);
const values = toStringArray(args.values); const values = toStringArray(args.values);
if (!ref || !values?.length) { if (!ref || !values?.length) {
@ -359,59 +293,7 @@ export async function handleBrowserToolCore(
res.json({ ok: true, targetId: tab.targetId }); res.json({ ok: true, targetId: tab.targetId });
return true; return true;
} }
case "browser_tabs": { case "wait": {
const action = toStringOrEmpty(args.action);
const index = toNumber(args.index);
if (!action) {
jsonError(res, 400, "action is required");
return true;
}
if (action === "list") {
const reachable = await ctx.isReachable(300);
if (!reachable) {
res.json({ ok: true, tabs: [] });
return true;
}
const tabs = await ctx.listTabs();
res.json({ ok: true, tabs });
return true;
}
if (action === "new") {
await ctx.ensureBrowserAvailable();
const tab = await ctx.openTab("about:blank");
res.json({ ok: true, tab });
return true;
}
if (action === "close") {
const tabs = await ctx.listTabs();
const targetTab = typeof index === "number" ? tabs[index] : tabs.at(0);
if (!targetTab) {
jsonError(res, 404, "tab not found");
return true;
}
await ctx.closeTab(targetTab.targetId);
res.json({ ok: true, targetId: targetTab.targetId });
return true;
}
if (action === "select") {
if (typeof index !== "number") {
jsonError(res, 400, "index is required");
return true;
}
const tabs = await ctx.listTabs();
const targetTab = tabs[index];
if (!targetTab) {
jsonError(res, 404, "tab not found");
return true;
}
await ctx.focusTab(targetTab.targetId);
res.json({ ok: true, targetId: targetTab.targetId });
return true;
}
jsonError(res, 400, "unknown tab action");
return true;
}
case "browser_wait_for": {
const time = toNumber(args.time); const time = toNumber(args.time);
const text = toStringOrEmpty(args.text) || undefined; const text = toStringOrEmpty(args.text) || undefined;
const textGone = toStringOrEmpty(args.textGone) || undefined; const textGone = toStringOrEmpty(args.textGone) || undefined;

View File

@ -30,7 +30,7 @@ const media = vi.hoisted(() => ({
vi.mock("../pw-ai.js", () => pw); vi.mock("../pw-ai.js", () => pw);
vi.mock("../../media/store.js", () => media); vi.mock("../../media/store.js", () => media);
import { handleBrowserToolExtra } from "./tool-extra.js"; import { handleBrowserActionExtra } from "./actions-extra.js";
const baseTab = { const baseTab = {
targetId: "tab1", targetId: "tab1",
@ -78,11 +78,14 @@ function createCtx(
}; };
} }
async function callTool(name: string, args: Record<string, unknown> = {}) { async function callAction(
action: Parameters<typeof handleBrowserActionExtra>[0]["action"],
args: Record<string, unknown> = {},
) {
const res = createRes(); const res = createRes();
const ctx = createCtx(); const ctx = createCtx();
const handled = await handleBrowserToolExtra({ const handled = await handleBrowserActionExtra({
name, action,
args, args,
targetId: "", targetId: "",
cdpPort: 18792, cdpPort: 18792,
@ -96,11 +99,11 @@ beforeEach(() => {
vi.clearAllMocks(); vi.clearAllMocks();
}); });
describe("handleBrowserToolExtra", () => { describe("handleBrowserActionExtra", () => {
it("dispatches extra Playwright tools", async () => { it("dispatches extra browser actions", async () => {
const cases = [ const cases = [
{ {
name: "browser_console_messages", action: "console" as const,
args: { level: "error" }, args: { level: "error" },
fn: pw.getConsoleMessagesViaPlaywright, fn: pw.getConsoleMessagesViaPlaywright,
expectArgs: { expectArgs: {
@ -111,7 +114,7 @@ describe("handleBrowserToolExtra", () => {
expectBody: { ok: true, messages: [], targetId: "tab1" }, expectBody: { ok: true, messages: [], targetId: "tab1" },
}, },
{ {
name: "browser_network_requests", action: "network" as const,
args: { includeStatic: true }, args: { includeStatic: true },
fn: pw.getNetworkRequestsViaPlaywright, fn: pw.getNetworkRequestsViaPlaywright,
expectArgs: { expectArgs: {
@ -122,14 +125,14 @@ describe("handleBrowserToolExtra", () => {
expectBody: { ok: true, requests: [], targetId: "tab1" }, expectBody: { ok: true, requests: [], targetId: "tab1" },
}, },
{ {
name: "browser_start_tracing", action: "traceStart" as const,
args: {}, args: {},
fn: pw.startTracingViaPlaywright, fn: pw.startTracingViaPlaywright,
expectArgs: { cdpPort: 18792, targetId: "tab1" }, expectArgs: { cdpPort: 18792, targetId: "tab1" },
expectBody: { ok: true }, expectBody: { ok: true },
}, },
{ {
name: "browser_verify_element_visible", action: "verifyElement" as const,
args: { role: "button", accessibleName: "Submit" }, args: { role: "button", accessibleName: "Submit" },
fn: pw.verifyElementVisibleViaPlaywright, fn: pw.verifyElementVisibleViaPlaywright,
expectArgs: { expectArgs: {
@ -141,14 +144,14 @@ describe("handleBrowserToolExtra", () => {
expectBody: { ok: true }, expectBody: { ok: true },
}, },
{ {
name: "browser_verify_text_visible", action: "verifyText" as const,
args: { text: "Hello" }, args: { text: "Hello" },
fn: pw.verifyTextVisibleViaPlaywright, fn: pw.verifyTextVisibleViaPlaywright,
expectArgs: { cdpPort: 18792, targetId: "tab1", text: "Hello" }, expectArgs: { cdpPort: 18792, targetId: "tab1", text: "Hello" },
expectBody: { ok: true }, expectBody: { ok: true },
}, },
{ {
name: "browser_verify_list_visible", action: "verifyList" as const,
args: { ref: "1", items: ["a", "b"] }, args: { ref: "1", items: ["a", "b"] },
fn: pw.verifyListVisibleViaPlaywright, fn: pw.verifyListVisibleViaPlaywright,
expectArgs: { expectArgs: {
@ -160,7 +163,7 @@ describe("handleBrowserToolExtra", () => {
expectBody: { ok: true }, expectBody: { ok: true },
}, },
{ {
name: "browser_verify_value", action: "verifyValue" as const,
args: { ref: "2", type: "textbox", value: "x" }, args: { ref: "2", type: "textbox", value: "x" },
fn: pw.verifyValueViaPlaywright, fn: pw.verifyValueViaPlaywright,
expectArgs: { expectArgs: {
@ -173,14 +176,14 @@ describe("handleBrowserToolExtra", () => {
expectBody: { ok: true }, expectBody: { ok: true },
}, },
{ {
name: "browser_mouse_move_xy", action: "mouseMove" as const,
args: { x: 10, y: 20 }, args: { x: 10, y: 20 },
fn: pw.mouseMoveViaPlaywright, fn: pw.mouseMoveViaPlaywright,
expectArgs: { cdpPort: 18792, targetId: "tab1", x: 10, y: 20 }, expectArgs: { cdpPort: 18792, targetId: "tab1", x: 10, y: 20 },
expectBody: { ok: true }, expectBody: { ok: true },
}, },
{ {
name: "browser_mouse_click_xy", action: "mouseClick" as const,
args: { x: 1, y: 2, button: "right" }, args: { x: 1, y: 2, button: "right" },
fn: pw.mouseClickViaPlaywright, fn: pw.mouseClickViaPlaywright,
expectArgs: { expectArgs: {
@ -193,7 +196,7 @@ describe("handleBrowserToolExtra", () => {
expectBody: { ok: true }, expectBody: { ok: true },
}, },
{ {
name: "browser_mouse_drag_xy", action: "mouseDrag" as const,
args: { startX: 1, startY: 2, endX: 3, endY: 4 }, args: { startX: 1, startY: 2, endX: 3, endY: 4 },
fn: pw.mouseDragViaPlaywright, fn: pw.mouseDragViaPlaywright,
expectArgs: { expectArgs: {
@ -207,7 +210,7 @@ describe("handleBrowserToolExtra", () => {
expectBody: { ok: true }, expectBody: { ok: true },
}, },
{ {
name: "browser_generate_locator", action: "locator" as const,
args: { ref: "99" }, args: { ref: "99" },
fn: pw.generateLocatorForRef, fn: pw.generateLocatorForRef,
expectArgs: "99", expectArgs: "99",
@ -216,7 +219,7 @@ describe("handleBrowserToolExtra", () => {
]; ];
for (const item of cases) { for (const item of cases) {
const { res, handled } = await callTool(item.name, item.args); const { res, handled } = await callAction(item.action, item.args);
expect(handled).toBe(true); expect(handled).toBe(true);
expect(item.fn).toHaveBeenCalledWith(item.expectArgs); expect(item.fn).toHaveBeenCalledWith(item.expectArgs);
expect(res.body).toEqual(item.expectBody); expect(res.body).toEqual(item.expectBody);
@ -224,7 +227,7 @@ describe("handleBrowserToolExtra", () => {
}); });
it("stores PDF and trace outputs", async () => { it("stores PDF and trace outputs", async () => {
const { res: pdfRes } = await callTool("browser_pdf_save"); const { res: pdfRes } = await callAction("pdf");
expect(pw.pdfViaPlaywright).toHaveBeenCalledWith({ expect(pw.pdfViaPlaywright).toHaveBeenCalledWith({
cdpPort: 18792, cdpPort: 18792,
targetId: "tab1", targetId: "tab1",
@ -239,7 +242,7 @@ describe("handleBrowserToolExtra", () => {
}); });
media.saveMediaBuffer.mockResolvedValueOnce({ path: "/tmp/fake.zip" }); media.saveMediaBuffer.mockResolvedValueOnce({ path: "/tmp/fake.zip" });
const { res: traceRes } = await callTool("browser_stop_tracing"); const { res: traceRes } = await callAction("traceStop");
expect(pw.stopTracingViaPlaywright).toHaveBeenCalledWith({ expect(pw.stopTracingViaPlaywright).toHaveBeenCalledWith({
cdpPort: 18792, cdpPort: 18792,
targetId: "tab1", targetId: "tab1",

View File

@ -27,8 +27,23 @@ import {
toStringOrEmpty, toStringOrEmpty,
} from "./utils.js"; } from "./utils.js";
type ToolExtraParams = { export type BrowserActionExtra =
name: string; | "console"
| "locator"
| "mouseClick"
| "mouseDrag"
| "mouseMove"
| "network"
| "pdf"
| "traceStart"
| "traceStop"
| "verifyElement"
| "verifyList"
| "verifyText"
| "verifyValue";
type ActionExtraParams = {
action: BrowserActionExtra;
args: Record<string, unknown>; args: Record<string, unknown>;
targetId: string; targetId: string;
cdpPort: number; cdpPort: number;
@ -36,14 +51,14 @@ type ToolExtraParams = {
res: express.Response; res: express.Response;
}; };
export async function handleBrowserToolExtra( export async function handleBrowserActionExtra(
params: ToolExtraParams, params: ActionExtraParams,
): Promise<boolean> { ): Promise<boolean> {
const { name, args, targetId, cdpPort, ctx, res } = params; const { action, args, targetId, cdpPort, ctx, res } = params;
const target = targetId || undefined; const target = targetId || undefined;
switch (name) { switch (action) {
case "browser_console_messages": { case "console": {
const level = toStringOrEmpty(args.level) || undefined; const level = toStringOrEmpty(args.level) || undefined;
const tab = await ctx.ensureTabAvailable(target); const tab = await ctx.ensureTabAvailable(target);
const messages = await getConsoleMessagesViaPlaywright({ const messages = await getConsoleMessagesViaPlaywright({
@ -54,7 +69,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true, messages, targetId: tab.targetId }); res.json({ ok: true, messages, targetId: tab.targetId });
return true; return true;
} }
case "browser_network_requests": { case "network": {
const includeStatic = toBoolean(args.includeStatic) ?? false; const includeStatic = toBoolean(args.includeStatic) ?? false;
const tab = await ctx.ensureTabAvailable(target); const tab = await ctx.ensureTabAvailable(target);
const requests = await getNetworkRequestsViaPlaywright({ const requests = await getNetworkRequestsViaPlaywright({
@ -65,7 +80,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true, requests, targetId: tab.targetId }); res.json({ ok: true, requests, targetId: tab.targetId });
return true; return true;
} }
case "browser_pdf_save": { case "pdf": {
const tab = await ctx.ensureTabAvailable(target); const tab = await ctx.ensureTabAvailable(target);
const pdf = await pdfViaPlaywright({ const pdf = await pdfViaPlaywright({
cdpPort, cdpPort,
@ -86,7 +101,7 @@ export async function handleBrowserToolExtra(
}); });
return true; return true;
} }
case "browser_start_tracing": { case "traceStart": {
const tab = await ctx.ensureTabAvailable(target); const tab = await ctx.ensureTabAvailable(target);
await startTracingViaPlaywright({ await startTracingViaPlaywright({
cdpPort, cdpPort,
@ -95,7 +110,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true }); res.json({ ok: true });
return true; return true;
} }
case "browser_stop_tracing": { case "traceStop": {
const tab = await ctx.ensureTabAvailable(target); const tab = await ctx.ensureTabAvailable(target);
const trace = await stopTracingViaPlaywright({ const trace = await stopTracingViaPlaywright({
cdpPort, cdpPort,
@ -116,7 +131,7 @@ export async function handleBrowserToolExtra(
}); });
return true; return true;
} }
case "browser_verify_element_visible": { case "verifyElement": {
const role = toStringOrEmpty(args.role); const role = toStringOrEmpty(args.role);
const accessibleName = toStringOrEmpty(args.accessibleName); const accessibleName = toStringOrEmpty(args.accessibleName);
if (!role || !accessibleName) { if (!role || !accessibleName) {
@ -133,7 +148,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true }); res.json({ ok: true });
return true; return true;
} }
case "browser_verify_text_visible": { case "verifyText": {
const text = toStringOrEmpty(args.text); const text = toStringOrEmpty(args.text);
if (!text) { if (!text) {
jsonError(res, 400, "text is required"); jsonError(res, 400, "text is required");
@ -148,7 +163,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true }); res.json({ ok: true });
return true; return true;
} }
case "browser_verify_list_visible": { case "verifyList": {
const ref = toStringOrEmpty(args.ref); const ref = toStringOrEmpty(args.ref);
const items = toStringArray(args.items); const items = toStringArray(args.items);
if (!ref || !items?.length) { if (!ref || !items?.length) {
@ -165,7 +180,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true }); res.json({ ok: true });
return true; return true;
} }
case "browser_verify_value": { case "verifyValue": {
const ref = toStringOrEmpty(args.ref); const ref = toStringOrEmpty(args.ref);
const type = toStringOrEmpty(args.type); const type = toStringOrEmpty(args.type);
const value = toStringOrEmpty(args.value); const value = toStringOrEmpty(args.value);
@ -184,7 +199,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true }); res.json({ ok: true });
return true; return true;
} }
case "browser_mouse_move_xy": { case "mouseMove": {
const x = toNumber(args.x); const x = toNumber(args.x);
const y = toNumber(args.y); const y = toNumber(args.y);
if (x === undefined || y === undefined) { if (x === undefined || y === undefined) {
@ -201,7 +216,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true }); res.json({ ok: true });
return true; return true;
} }
case "browser_mouse_click_xy": { case "mouseClick": {
const x = toNumber(args.x); const x = toNumber(args.x);
const y = toNumber(args.y); const y = toNumber(args.y);
if (x === undefined || y === undefined) { if (x === undefined || y === undefined) {
@ -220,7 +235,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true }); res.json({ ok: true });
return true; return true;
} }
case "browser_mouse_drag_xy": { case "mouseDrag": {
const startX = toNumber(args.startX); const startX = toNumber(args.startX);
const startY = toNumber(args.startY); const startY = toNumber(args.startY);
const endX = toNumber(args.endX); const endX = toNumber(args.endX);
@ -246,7 +261,7 @@ export async function handleBrowserToolExtra(
res.json({ ok: true }); res.json({ ok: true });
return true; return true;
} }
case "browser_generate_locator": { case "locator": {
const ref = toStringOrEmpty(args.ref); const ref = toStringOrEmpty(args.ref);
if (!ref) { if (!ref) {
jsonError(res, 400, "ref is required"); jsonError(res, 400, "ref is required");

View File

@ -0,0 +1,249 @@
import type express from "express";
import type { BrowserRouteContext } from "../server-context.js";
import { handleBrowserActionCore } from "./actions-core.js";
import { handleBrowserActionExtra } from "./actions-extra.js";
import { jsonError, toBoolean, toStringOrEmpty } from "./utils.js";
function readBody(req: express.Request): Record<string, unknown> {
const body = req.body as Record<string, unknown> | undefined;
if (!body || typeof body !== "object" || Array.isArray(body)) return {};
return body;
}
function readTargetId(value: unknown): string {
return toStringOrEmpty(value);
}
function handleActionError(
ctx: BrowserRouteContext,
res: express.Response,
err: unknown,
) {
const mapped = ctx.mapTabError(err);
if (mapped) return jsonError(res, mapped.status, mapped.message);
jsonError(res, 500, String(err));
}
async function runCoreAction(
ctx: BrowserRouteContext,
res: express.Response,
action: Parameters<typeof handleBrowserActionCore>[0]["action"],
args: Record<string, unknown>,
targetId: string,
) {
try {
const cdpPort = ctx.state().cdpPort;
await handleBrowserActionCore({
action,
args,
targetId,
cdpPort,
ctx,
res,
});
} catch (err) {
handleActionError(ctx, res, err);
}
}
async function runExtraAction(
ctx: BrowserRouteContext,
res: express.Response,
action: Parameters<typeof handleBrowserActionExtra>[0]["action"],
args: Record<string, unknown>,
targetId: string,
) {
try {
const cdpPort = ctx.state().cdpPort;
await handleBrowserActionExtra({
action,
args,
targetId,
cdpPort,
ctx,
res,
});
} catch (err) {
handleActionError(ctx, res, err);
}
}
export function registerBrowserActionRoutes(
app: express.Express,
ctx: BrowserRouteContext,
) {
app.post("/navigate", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "navigate", body, targetId);
});
app.post("/back", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "back", body, targetId);
});
app.post("/resize", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "resize", body, targetId);
});
app.post("/close", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "close", body, targetId);
});
app.post("/click", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "click", body, targetId);
});
app.post("/type", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "type", body, targetId);
});
app.post("/press", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "press", body, targetId);
});
app.post("/hover", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "hover", body, targetId);
});
app.post("/drag", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "drag", body, targetId);
});
app.post("/select", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "select", body, targetId);
});
app.post("/upload", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "upload", body, targetId);
});
app.post("/fill", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "fill", body, targetId);
});
app.post("/dialog", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "dialog", body, targetId);
});
app.post("/wait", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "wait", body, targetId);
});
app.post("/evaluate", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "evaluate", body, targetId);
});
app.post("/run", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runCoreAction(ctx, res, "run", body, targetId);
});
app.get("/console", async (req, res) => {
const targetId = readTargetId(req.query.targetId);
const level = toStringOrEmpty(req.query.level);
const args = level ? { level } : {};
await runExtraAction(ctx, res, "console", args, targetId);
});
app.get("/network", async (req, res) => {
const targetId = readTargetId(req.query.targetId);
const includeStatic = toBoolean(req.query.includeStatic) ?? false;
await runExtraAction(ctx, res, "network", { includeStatic }, targetId);
});
app.post("/trace/start", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "traceStart", body, targetId);
});
app.post("/trace/stop", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "traceStop", body, targetId);
});
app.post("/pdf", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "pdf", body, targetId);
});
app.post("/verify/element", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "verifyElement", body, targetId);
});
app.post("/verify/text", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "verifyText", body, targetId);
});
app.post("/verify/list", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "verifyList", body, targetId);
});
app.post("/verify/value", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "verifyValue", body, targetId);
});
app.post("/mouse/move", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "mouseMove", body, targetId);
});
app.post("/mouse/click", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "mouseClick", body, targetId);
});
app.post("/mouse/drag", async (req, res) => {
const body = readBody(req);
const targetId = readTargetId(body.targetId);
await runExtraAction(ctx, res, "mouseDrag", body, targetId);
});
app.post("/locator", async (req, res) => {
const body = readBody(req);
await runExtraAction(ctx, res, "locator", body, "");
});
}

View File

@ -1,10 +1,10 @@
import type express from "express"; import type express from "express";
import type { BrowserRouteContext } from "../server-context.js"; import type { BrowserRouteContext } from "../server-context.js";
import { registerBrowserActionRoutes } from "./actions.js";
import { registerBrowserBasicRoutes } from "./basic.js"; import { registerBrowserBasicRoutes } from "./basic.js";
import { registerBrowserInspectRoutes } from "./inspect.js"; import { registerBrowserInspectRoutes } from "./inspect.js";
import { registerBrowserTabRoutes } from "./tabs.js"; import { registerBrowserTabRoutes } from "./tabs.js";
import { registerBrowserToolRoutes } from "./tool.js";
export function registerBrowserRoutes( export function registerBrowserRoutes(
app: express.Express, app: express.Express,
@ -13,5 +13,5 @@ export function registerBrowserRoutes(
registerBrowserBasicRoutes(app, ctx); registerBrowserBasicRoutes(app, ctx);
registerBrowserTabRoutes(app, ctx); registerBrowserTabRoutes(app, ctx);
registerBrowserInspectRoutes(app, ctx); registerBrowserInspectRoutes(app, ctx);
registerBrowserToolRoutes(app, ctx); registerBrowserActionRoutes(app, ctx);
} }

View File

@ -281,27 +281,4 @@ export function registerBrowserInspectRoutes(
jsonError(res, 500, String(err)); jsonError(res, 500, String(err));
} }
}); });
app.post("/click", async (req, res) => {
const ref = toStringOrEmpty((req.body as { ref?: unknown })?.ref);
const targetId = toStringOrEmpty(
(req.body as { targetId?: unknown })?.targetId,
);
if (!ref) return jsonError(res, 400, "ref is required");
try {
const tab = await ctx.ensureTabAvailable(targetId || undefined);
await clickViaPlaywright({
cdpPort: ctx.state().cdpPort,
targetId: tab.targetId,
ref,
});
res.json({ ok: true, targetId: tab.targetId, url: tab.url });
} catch (err) {
const mapped = ctx.mapTabError(err);
if (mapped) return jsonError(res, mapped.status, mapped.message);
jsonError(res, 500, String(err));
}
});
} }

View File

@ -1,65 +0,0 @@
import type express from "express";
import type { BrowserRouteContext } from "../server-context.js";
import { handleBrowserToolCore } from "./tool-core.js";
import { handleBrowserToolExtra } from "./tool-extra.js";
import { jsonError, toStringOrEmpty } from "./utils.js";
type ToolRequestBody = {
name?: unknown;
args?: unknown;
targetId?: unknown;
};
function toolArgs(value: unknown): Record<string, unknown> {
if (!value || typeof value !== "object" || Array.isArray(value)) return {};
return value as Record<string, unknown>;
}
export function registerBrowserToolRoutes(
app: express.Express,
ctx: BrowserRouteContext,
) {
app.post("/tool", async (req, res) => {
const body = req.body as ToolRequestBody;
const name = toStringOrEmpty(body?.name);
if (!name) return jsonError(res, 400, "name is required");
const args = toolArgs(body?.args);
const targetId = toStringOrEmpty(body?.targetId || args?.targetId);
try {
let cdpPort: number;
try {
cdpPort = ctx.state().cdpPort;
} catch {
return jsonError(res, 503, "browser server not started");
}
const handledCore = await handleBrowserToolCore({
name,
args,
targetId,
cdpPort,
ctx,
res,
});
if (handledCore) return;
const handledExtra = await handleBrowserToolExtra({
name,
args,
targetId,
cdpPort,
ctx,
res,
});
if (handledExtra) return;
return jsonError(res, 400, "unknown tool name");
} catch (err) {
const mapped = ctx.mapTabError(err);
if (mapped) return jsonError(res, mapped.status, mapped.message);
jsonError(res, 500, String(err));
}
});
}

View File

@ -0,0 +1,492 @@
import type { Command } from "commander";
import { resolveBrowserControlUrl } from "../browser/client.js";
import {
browserBack,
browserClick,
browserDrag,
browserEvaluate,
browserFillForm,
browserHandleDialog,
browserHover,
browserNavigate,
browserPressKey,
browserResize,
browserRunCode,
browserSelectOption,
browserType,
browserUpload,
browserWaitFor,
} from "../browser/client-actions.js";
import { danger } from "../globals.js";
import { defaultRuntime } from "../runtime.js";
import type { BrowserParentOpts } from "./browser-cli-shared.js";
async function readStdin(): Promise<string> {
const chunks: string[] = [];
return await new Promise((resolve, reject) => {
process.stdin.setEncoding("utf8");
process.stdin.on("data", (chunk) => chunks.push(chunk));
process.stdin.on("end", () => resolve(chunks.join("")));
process.stdin.on("error", reject);
});
}
async function readFile(path: string): Promise<string> {
const fs = await import("node:fs/promises");
return await fs.readFile(path, "utf8");
}
async function readCode(opts: {
code?: string;
codeFile?: string;
codeStdin?: boolean;
}): Promise<string> {
if (opts.codeFile) return await readFile(opts.codeFile);
if (opts.codeStdin) return await readStdin();
return opts.code ?? "";
}
async function readFields(opts: {
fields?: string;
fieldsFile?: string;
}): Promise<Array<Record<string, unknown>>> {
const payload = opts.fieldsFile
? await readFile(opts.fieldsFile)
: (opts.fields ?? "");
if (!payload.trim()) throw new Error("fields are required");
const parsed = JSON.parse(payload) as unknown;
if (!Array.isArray(parsed)) throw new Error("fields must be an array");
return parsed as Array<Record<string, unknown>>;
}
export function registerBrowserActionInputCommands(
browser: Command,
parentOpts: (cmd: Command) => BrowserParentOpts,
) {
browser
.command("navigate")
.description("Navigate the current tab to a URL")
.argument("<url>", "URL to navigate to")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (url: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserNavigate(baseUrl, {
url,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`navigated to ${result.url ?? url}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("back")
.description("Navigate back in history")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserBack(baseUrl, {
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(
`navigated back to ${result.url ?? "previous page"}`,
);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("resize")
.description("Resize the viewport")
.argument("<width>", "Viewport width", (v: string) => Number(v))
.argument("<height>", "Viewport height", (v: string) => Number(v))
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (width: number, height: number, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
if (!Number.isFinite(width) || !Number.isFinite(height)) {
defaultRuntime.error(danger("width and height must be numbers"));
defaultRuntime.exit(1);
return;
}
try {
const result = await browserResize(baseUrl, {
width,
height,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`resized to ${width}x${height}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("click")
.description("Click an element by ref from an ai snapshot (e.g. 76)")
.argument("<ref>", "Ref id from ai snapshot")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.option("--double", "Double click", false)
.option("--button <left|right|middle>", "Mouse button to use")
.option("--modifiers <list>", "Comma-separated modifiers (Shift,Alt,Meta)")
.action(async (ref: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
const modifiers = opts.modifiers
? String(opts.modifiers)
.split(",")
.map((v: string) => v.trim())
.filter(Boolean)
: undefined;
try {
const result = await browserClick(baseUrl, {
ref,
targetId: opts.targetId?.trim() || undefined,
doubleClick: Boolean(opts.double),
button: opts.button?.trim() || undefined,
modifiers,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
const suffix = result.url ? ` on ${result.url}` : "";
defaultRuntime.log(`clicked ref ${ref}${suffix}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("type")
.description("Type into an element by ai ref")
.argument("<ref>", "Ref id from ai snapshot")
.argument("<text>", "Text to type")
.option("--submit", "Press Enter after typing", false)
.option("--slowly", "Type slowly (human-like)", false)
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (ref: string, text: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserType(baseUrl, {
ref,
text,
submit: Boolean(opts.submit),
slowly: Boolean(opts.slowly),
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`typed into ref ${ref}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("press")
.description("Press a key")
.argument("<key>", "Key to press (e.g. Enter)")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (key: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserPressKey(baseUrl, {
key,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`pressed ${key}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("hover")
.description("Hover an element by ai ref")
.argument("<ref>", "Ref id from ai snapshot")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (ref: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserHover(baseUrl, {
ref,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`hovered ref ${ref}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("drag")
.description("Drag from one ref to another")
.argument("<startRef>", "Start ref id")
.argument("<endRef>", "End ref id")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (startRef: string, endRef: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserDrag(baseUrl, {
startRef,
endRef,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`dragged ${startRef}${endRef}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("select")
.description("Select option(s) in a select element")
.argument("<ref>", "Ref id from ai snapshot")
.argument("<values...>", "Option values to select")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (ref: string, values: string[], opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserSelectOption(baseUrl, {
ref,
values,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`selected ${values.join(", ")}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("upload")
.description("Upload file(s) when a file chooser is open")
.argument("<paths...>", "File paths to upload")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (paths: string[], opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserUpload(baseUrl, {
paths,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`uploaded ${paths.length} file(s)`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("fill")
.description("Fill a form with JSON field descriptors")
.option("--fields <json>", "JSON array of field objects")
.option("--fields-file <path>", "Read JSON array from a file")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const fields = await readFields({
fields: opts.fields,
fieldsFile: opts.fieldsFile,
});
const result = await browserFillForm(baseUrl, {
fields,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`filled ${fields.length} field(s)`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("dialog")
.description("Handle a modal dialog (alert/confirm/prompt)")
.option("--accept", "Accept the dialog", false)
.option("--dismiss", "Dismiss the dialog", false)
.option("--prompt <text>", "Prompt response text")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
const accept = opts.accept ? true : opts.dismiss ? false : undefined;
if (accept === undefined) {
defaultRuntime.error(danger("Specify --accept or --dismiss"));
defaultRuntime.exit(1);
return;
}
try {
const result = await browserHandleDialog(baseUrl, {
accept,
promptText: opts.prompt?.trim() || undefined,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`dialog handled: ${result.type}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("wait")
.description("Wait for time or text conditions")
.option("--time <ms>", "Wait for N milliseconds", (v: string) => Number(v))
.option("--text <value>", "Wait for text to appear")
.option("--text-gone <value>", "Wait for text to disappear")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserWaitFor(baseUrl, {
time: Number.isFinite(opts.time) ? opts.time : undefined,
text: opts.text?.trim() || undefined,
textGone: opts.textGone?.trim() || undefined,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log("wait complete");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("evaluate")
.description("Evaluate a function against the page or a ref")
.option("--fn <code>", "Function source, e.g. (el) => el.textContent")
.option("--ref <id>", "ARIA ref from ai snapshot")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
if (!opts.fn) {
defaultRuntime.error(danger("Missing --fn"));
defaultRuntime.exit(1);
return;
}
try {
const result = await browserEvaluate(baseUrl, {
fn: opts.fn,
ref: opts.ref?.trim() || undefined,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(JSON.stringify(result.result, null, 2));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("run")
.description("Run a Playwright code function (page => ...) ")
.option("--code <code>", "Function source, e.g. (page) => page.title()")
.option("--code-file <path>", "Read function source from a file")
.option("--code-stdin", "Read function source from stdin", false)
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const code = await readCode({
code: opts.code,
codeFile: opts.codeFile,
codeStdin: Boolean(opts.codeStdin),
});
if (!code.trim()) {
defaultRuntime.error(danger("Missing --code (or file/stdin)"));
defaultRuntime.exit(1);
return;
}
const result = await browserRunCode(baseUrl, {
code,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(JSON.stringify(result.result, null, 2));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
}

View File

@ -0,0 +1,379 @@
import type { Command } from "commander";
import { resolveBrowserControlUrl } from "../browser/client.js";
import {
browserConsoleMessages,
browserGenerateLocator,
browserMouseClick,
browserMouseDrag,
browserMouseMove,
browserNetworkRequests,
browserPdfSave,
browserStartTracing,
browserStopTracing,
browserVerifyElementVisible,
browserVerifyListVisible,
browserVerifyTextVisible,
browserVerifyValue,
} from "../browser/client-actions.js";
import { danger } from "../globals.js";
import { defaultRuntime } from "../runtime.js";
import type { BrowserParentOpts } from "./browser-cli-shared.js";
export function registerBrowserActionObserveCommands(
browser: Command,
parentOpts: (cmd: Command) => BrowserParentOpts,
) {
browser
.command("console")
.description("Get recent console messages")
.option("--level <level>", "Filter by level (error, warn, info)")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserConsoleMessages(baseUrl, {
level: opts.level?.trim() || undefined,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(JSON.stringify(result.messages, null, 2));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("network")
.description("Get recent network requests")
.option("--include-static", "Include static assets", false)
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserNetworkRequests(baseUrl, {
includeStatic: Boolean(opts.includeStatic),
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(JSON.stringify(result.requests, null, 2));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("trace-start")
.description("Start Playwright tracing")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserStartTracing(baseUrl, {
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log("trace started");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("trace-stop")
.description("Stop tracing and save a trace.zip")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserStopTracing(baseUrl, {
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`trace: ${result.path}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("pdf")
.description("Save page as PDF")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserPdfSave(baseUrl, {
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`PDF: ${result.path}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("verify-element")
.description("Verify element visible by role + name")
.option("--role <role>", "ARIA role")
.option("--name <text>", "Accessible name")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
if (!opts.role || !opts.name) {
defaultRuntime.error(danger("--role and --name are required"));
defaultRuntime.exit(1);
return;
}
try {
const result = await browserVerifyElementVisible(baseUrl, {
role: opts.role,
accessibleName: opts.name,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log("element visible");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("verify-text")
.description("Verify text is visible")
.argument("<text>", "Text to find")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (text: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserVerifyTextVisible(baseUrl, {
text,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log("text visible");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("verify-list")
.description("Verify list items under a ref")
.argument("<ref>", "Ref id from ai snapshot")
.argument("<items...>", "List items to verify")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (ref: string, items: string[], opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserVerifyListVisible(baseUrl, {
ref,
items,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log("list visible");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("verify-value")
.description("Verify a form control value")
.option("--ref <ref>", "Ref id from ai snapshot")
.option("--type <type>", "Input type (textbox, checkbox, slider, etc)")
.option("--value <value>", "Expected value")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
if (!opts.ref || !opts.type) {
defaultRuntime.error(danger("--ref and --type are required"));
defaultRuntime.exit(1);
return;
}
try {
const result = await browserVerifyValue(baseUrl, {
ref: opts.ref,
type: opts.type,
value: opts.value,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log("value verified");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("mouse-move")
.description("Move mouse to viewport coordinates")
.option("--x <n>", "X coordinate", (v: string) => Number(v))
.option("--y <n>", "Y coordinate", (v: string) => Number(v))
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
if (!Number.isFinite(opts.x) || !Number.isFinite(opts.y)) {
defaultRuntime.error(danger("--x and --y are required"));
defaultRuntime.exit(1);
return;
}
try {
const result = await browserMouseMove(baseUrl, {
x: opts.x,
y: opts.y,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log("mouse moved");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("mouse-click")
.description("Click at viewport coordinates")
.option("--x <n>", "X coordinate", (v: string) => Number(v))
.option("--y <n>", "Y coordinate", (v: string) => Number(v))
.option("--button <left|right|middle>", "Mouse button")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
if (!Number.isFinite(opts.x) || !Number.isFinite(opts.y)) {
defaultRuntime.error(danger("--x and --y are required"));
defaultRuntime.exit(1);
return;
}
try {
const result = await browserMouseClick(baseUrl, {
x: opts.x,
y: opts.y,
button: opts.button?.trim() || undefined,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log("mouse clicked");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("mouse-drag")
.description("Drag by viewport coordinates")
.option("--start-x <n>", "Start X", (v: string) => Number(v))
.option("--start-y <n>", "Start Y", (v: string) => Number(v))
.option("--end-x <n>", "End X", (v: string) => Number(v))
.option("--end-y <n>", "End Y", (v: string) => Number(v))
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
if (
!Number.isFinite(opts.startX) ||
!Number.isFinite(opts.startY) ||
!Number.isFinite(opts.endX) ||
!Number.isFinite(opts.endY)
) {
defaultRuntime.error(
danger("--start-x, --start-y, --end-x, --end-y are required"),
);
defaultRuntime.exit(1);
return;
}
try {
const result = await browserMouseDrag(baseUrl, {
startX: opts.startX,
startY: opts.startY,
endX: opts.endX,
endY: opts.endY,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log("mouse dragged");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("locator")
.description("Generate a Playwright locator for a ref")
.argument("<ref>", "Ref id from ai snapshot")
.action(async (ref: string, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserGenerateLocator(baseUrl, { ref });
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(result.locator);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
}

View File

@ -0,0 +1,48 @@
export const browserCoreExamples = [
"clawdis browser status",
"clawdis browser start",
"clawdis browser stop",
"clawdis browser tabs",
"clawdis browser open https://example.com",
"clawdis browser focus abcd1234",
"clawdis browser close abcd1234",
"clawdis browser screenshot",
"clawdis browser screenshot --full-page",
"clawdis browser screenshot --ref 12",
'clawdis browser eval "document.title"',
'clawdis browser query "a" --limit 5',
"clawdis browser dom --format text --max-chars 5000",
"clawdis browser snapshot --format aria --limit 200",
"clawdis browser snapshot --format ai",
];
export const browserActionExamples = [
"clawdis browser navigate https://example.com",
"clawdis browser back",
"clawdis browser resize 1280 720",
"clawdis browser click 12 --double",
'clawdis browser type 23 "hello" --submit',
"clawdis browser press Enter",
"clawdis browser hover 44",
"clawdis browser drag 10 11",
"clawdis browser select 9 OptionA OptionB",
"clawdis browser upload /tmp/file.pdf",
'clawdis browser fill --fields \'[{"ref":"1","value":"Ada"}]\'',
"clawdis browser dialog --accept",
'clawdis browser wait --text "Done"',
"clawdis browser evaluate --fn '(el) => el.textContent' --ref 7",
"clawdis browser run --code '(page) => page.title()'",
"clawdis browser console --level error",
"clawdis browser network --include-static",
"clawdis browser trace-start",
"clawdis browser trace-stop",
"clawdis browser pdf",
'clawdis browser verify-element --role button --name "Submit"',
'clawdis browser verify-text "Welcome"',
"clawdis browser verify-list 3 ItemA ItemB",
"clawdis browser verify-value --ref 4 --type textbox --value hello",
"clawdis browser mouse-move --x 120 --y 240",
"clawdis browser mouse-click --x 120 --y 240",
"clawdis browser mouse-drag --start-x 10 --start-y 20 --end-x 200 --end-y 300",
"clawdis browser locator 77",
];

View File

@ -0,0 +1,273 @@
import type { Command } from "commander";
import {
browserDom,
browserEval,
browserQuery,
browserScreenshot,
browserSnapshot,
resolveBrowserControlUrl,
} from "../browser/client.js";
import { browserScreenshotAction } from "../browser/client-actions.js";
import { danger } from "../globals.js";
import { defaultRuntime } from "../runtime.js";
import type { BrowserParentOpts } from "./browser-cli-shared.js";
async function readStdin(): Promise<string> {
const chunks: string[] = [];
return await new Promise((resolve, reject) => {
process.stdin.setEncoding("utf8");
process.stdin.on("data", (chunk) => chunks.push(chunk));
process.stdin.on("end", () => resolve(chunks.join("")));
process.stdin.on("error", reject);
});
}
async function readTextFromSource(opts: {
js?: string;
jsFile?: string;
jsStdin?: boolean;
}): Promise<string> {
if (opts.jsFile) {
const fs = await import("node:fs/promises");
return await fs.readFile(opts.jsFile, "utf8");
}
if (opts.jsStdin) {
return await readStdin();
}
return opts.js ?? "";
}
export function registerBrowserInspectCommands(
browser: Command,
parentOpts: (cmd: Command) => BrowserParentOpts,
) {
browser
.command("screenshot")
.description("Capture a screenshot (MEDIA:<path>)")
.argument("[targetId]", "CDP target id (or unique prefix)")
.option("--full-page", "Capture full scrollable page", false)
.option("--ref <ref>", "ARIA ref from ai snapshot")
.option("--element <selector>", "CSS selector for element screenshot")
.option("--type <png|jpeg>", "Output type (default: png)", "png")
.option("--filename <name>", "Preferred output filename")
.action(async (targetId: string | undefined, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const advanced = Boolean(opts.ref || opts.element || opts.filename);
const result = advanced
? await browserScreenshotAction(baseUrl, {
targetId: targetId?.trim() || undefined,
fullPage: Boolean(opts.fullPage),
ref: opts.ref?.trim() || undefined,
element: opts.element?.trim() || undefined,
filename: opts.filename?.trim() || undefined,
type: opts.type === "jpeg" ? "jpeg" : "png",
})
: await browserScreenshot(baseUrl, {
targetId: targetId?.trim() || undefined,
fullPage: Boolean(opts.fullPage),
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`MEDIA:${result.path}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("eval")
.description("Run JavaScript in the active tab")
.argument("[js]", "JavaScript expression")
.option("--js-file <path>", "Read JavaScript from a file")
.option("--js-stdin", "Read JavaScript from stdin", false)
.option("--target-id <id>", "CDP target id (or unique prefix)")
.option("--await", "Await promise result", false)
.action(async (js: string | undefined, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const source = await readTextFromSource({
js,
jsFile: opts.jsFile,
jsStdin: Boolean(opts.jsStdin),
});
if (!source.trim()) {
defaultRuntime.error(danger("Missing JavaScript input."));
defaultRuntime.exit(1);
return;
}
const result = await browserEval(baseUrl, {
js: source,
targetId: opts.targetId?.trim() || undefined,
awaitPromise: Boolean(opts.await),
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(JSON.stringify(result.result, null, 2));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("query")
.description("Query selector matches")
.argument("<selector>", "CSS selector")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.option("--limit <n>", "Max matches (default: 20)", (v: string) =>
Number(v),
)
.action(async (selector: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserQuery(baseUrl, {
selector,
targetId: opts.targetId?.trim() || undefined,
limit: Number.isFinite(opts.limit) ? opts.limit : undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(JSON.stringify(result.matches, null, 2));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("dom")
.description("Dump DOM (html or text) with truncation")
.option("--format <html|text>", "Output format (default: html)", "html")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.option("--selector <css>", "Optional CSS selector to scope the dump")
.option(
"--max-chars <n>",
"Max characters (default: 200000)",
(v: string) => Number(v),
)
.option("--out <path>", "Write output to a file")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
const format = opts.format === "text" ? "text" : "html";
try {
const result = await browserDom(baseUrl, {
format,
targetId: opts.targetId?.trim() || undefined,
maxChars: Number.isFinite(opts.maxChars) ? opts.maxChars : undefined,
selector: opts.selector?.trim() || undefined,
});
if (opts.out) {
const fs = await import("node:fs/promises");
await fs.writeFile(opts.out, result.text, "utf8");
if (parent?.json) {
defaultRuntime.log(
JSON.stringify({ ok: true, out: opts.out }, null, 2),
);
} else {
defaultRuntime.log(opts.out);
}
return;
}
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(result.text);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("snapshot")
.description("Capture an AI-friendly snapshot (aria, domSnapshot, or ai)")
.option(
"--format <aria|domSnapshot|ai>",
"Snapshot format (default: aria)",
"aria",
)
.option("--target-id <id>", "CDP target id (or unique prefix)")
.option("--limit <n>", "Max nodes (default: 500/800)", (v: string) =>
Number(v),
)
.option("--out <path>", "Write snapshot to a file")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
const format =
opts.format === "domSnapshot"
? "domSnapshot"
: opts.format === "ai"
? "ai"
: "aria";
try {
const result = await browserSnapshot(baseUrl, {
format,
targetId: opts.targetId?.trim() || undefined,
limit: Number.isFinite(opts.limit) ? opts.limit : undefined,
});
if (opts.out) {
const fs = await import("node:fs/promises");
if (result.format === "ai") {
await fs.writeFile(opts.out, result.snapshot, "utf8");
} else {
const payload = JSON.stringify(result, null, 2);
await fs.writeFile(opts.out, payload, "utf8");
}
if (parent?.json) {
defaultRuntime.log(
JSON.stringify({ ok: true, out: opts.out }, null, 2),
);
} else {
defaultRuntime.log(opts.out);
}
return;
}
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
if (result.format === "ai") {
defaultRuntime.log(result.snapshot);
return;
}
if (result.format === "domSnapshot") {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
const nodes = "nodes" in result ? result.nodes : [];
defaultRuntime.log(
nodes
.map((n) => {
const indent = " ".repeat(Math.min(20, n.depth));
const name = n.name ? ` "${n.name}"` : "";
const value = n.value ? ` = "${n.value}"` : "";
return `${indent}- ${n.role}${name}${value}`;
})
.join("\n"),
);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
}

View File

@ -0,0 +1,183 @@
import type { Command } from "commander";
import {
browserCloseTab,
browserFocusTab,
browserOpenTab,
browserStart,
browserStatus,
browserStop,
browserTabs,
resolveBrowserControlUrl,
} from "../browser/client.js";
import { browserClosePage } from "../browser/client-actions.js";
import { danger, info } from "../globals.js";
import { defaultRuntime } from "../runtime.js";
import type { BrowserParentOpts } from "./browser-cli-shared.js";
export function registerBrowserManageCommands(
browser: Command,
parentOpts: (cmd: Command) => BrowserParentOpts,
) {
browser
.command("status")
.description("Show browser status")
.action(async (_opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const status = await browserStatus(baseUrl);
if (parent?.json) {
defaultRuntime.log(JSON.stringify(status, null, 2));
return;
}
defaultRuntime.log(
[
`enabled: ${status.enabled}`,
`running: ${status.running}`,
`controlUrl: ${status.controlUrl}`,
`cdpPort: ${status.cdpPort}`,
`browser: ${status.chosenBrowser ?? "unknown"}`,
`profileColor: ${status.color}`,
].join("\n"),
);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("start")
.description("Start the clawd browser (no-op if already running)")
.action(async (_opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
await browserStart(baseUrl);
const status = await browserStatus(baseUrl);
if (parent?.json) {
defaultRuntime.log(JSON.stringify(status, null, 2));
return;
}
defaultRuntime.log(info(`🦞 clawd browser running: ${status.running}`));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("stop")
.description("Stop the clawd browser (best-effort)")
.action(async (_opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
await browserStop(baseUrl);
const status = await browserStatus(baseUrl);
if (parent?.json) {
defaultRuntime.log(JSON.stringify(status, null, 2));
return;
}
defaultRuntime.log(info(`🦞 clawd browser running: ${status.running}`));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("tabs")
.description("List open tabs")
.action(async (_opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const tabs = await browserTabs(baseUrl);
if (parent?.json) {
defaultRuntime.log(JSON.stringify({ tabs }, null, 2));
return;
}
if (tabs.length === 0) {
defaultRuntime.log("No tabs (browser closed or no targets).");
return;
}
defaultRuntime.log(
tabs
.map(
(t, i) =>
`${i + 1}. ${t.title || "(untitled)"}\n ${t.url}\n id: ${t.targetId}`,
)
.join("\n"),
);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("open")
.description("Open a URL in a new tab")
.argument("<url>", "URL to open")
.action(async (url: string, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const tab = await browserOpenTab(baseUrl, url);
if (parent?.json) {
defaultRuntime.log(JSON.stringify(tab, null, 2));
return;
}
defaultRuntime.log(`opened: ${tab.url}\nid: ${tab.targetId}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("focus")
.description("Focus a tab by target id (or unique prefix)")
.argument("<targetId>", "Target id or unique prefix")
.action(async (targetId: string, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
await browserFocusTab(baseUrl, targetId);
if (parent?.json) {
defaultRuntime.log(JSON.stringify({ ok: true }, null, 2));
return;
}
defaultRuntime.log(`focused tab ${targetId}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("close")
.description("Close a tab (target id optional)")
.argument("[targetId]", "Target id or unique prefix (optional)")
.action(async (targetId: string | undefined, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
if (targetId?.trim()) {
await browserCloseTab(baseUrl, targetId.trim());
} else {
await browserClosePage(baseUrl);
}
if (parent?.json) {
defaultRuntime.log(JSON.stringify({ ok: true }, null, 2));
return;
}
defaultRuntime.log("closed tab");
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
}

View File

@ -0,0 +1 @@
export type BrowserParentOpts = { url?: string; json?: boolean };

View File

@ -1,24 +1,15 @@
import type { Command } from "commander"; import type { Command } from "commander";
import { import { danger } from "../globals.js";
browserClickRef,
browserCloseTab,
browserDom,
browserEval,
browserFocusTab,
browserOpenTab,
browserQuery,
browserScreenshot,
browserSnapshot,
browserStart,
browserStatus,
browserStop,
browserTabs,
browserTool,
resolveBrowserControlUrl,
} from "../browser/client.js";
import { danger, info } from "../globals.js";
import { defaultRuntime } from "../runtime.js"; import { defaultRuntime } from "../runtime.js";
import { registerBrowserActionInputCommands } from "./browser-cli-actions-input.js";
import { registerBrowserActionObserveCommands } from "./browser-cli-actions-observe.js";
import {
browserActionExamples,
browserCoreExamples,
} from "./browser-cli-examples.js";
import { registerBrowserInspectCommands } from "./browser-cli-inspect.js";
import { registerBrowserManageCommands } from "./browser-cli-manage.js";
export function registerBrowserCli(program: Command) { export function registerBrowserCli(program: Command) {
const browser = program const browser = program
@ -31,24 +22,10 @@ export function registerBrowserCli(program: Command) {
.option("--json", "Output machine-readable JSON", false) .option("--json", "Output machine-readable JSON", false)
.addHelpText( .addHelpText(
"after", "after",
` `\nExamples:\n ${[...browserCoreExamples, ...browserActionExamples].join("\n ")}\n`,
Examples:
clawdis browser status
clawdis browser start
clawdis browser tabs
clawdis browser open https://example.com
clawdis browser screenshot # emits MEDIA:<path>
clawdis browser screenshot <targetId> --full-page
clawdis browser eval "location.href"
clawdis browser query "a" --limit 5
clawdis browser dom --format text --max-chars 5000
clawdis browser snapshot --format aria --limit 200
clawdis browser snapshot --format ai
clawdis browser click 76
clawdis browser tool browser_file_upload --args '{"paths":["/tmp/file.txt"]}'
`,
) )
.action(() => { .action(() => {
browser.outputHelp();
defaultRuntime.error( defaultRuntime.error(
danger('Missing subcommand. Try: "clawdis browser status"'), danger('Missing subcommand. Try: "clawdis browser status"'),
); );
@ -58,425 +35,8 @@ Examples:
const parentOpts = (cmd: Command) => const parentOpts = (cmd: Command) =>
cmd.parent?.opts?.() as { url?: string; json?: boolean }; cmd.parent?.opts?.() as { url?: string; json?: boolean };
browser registerBrowserManageCommands(browser, parentOpts);
.command("status") registerBrowserInspectCommands(browser, parentOpts);
.description("Show browser status") registerBrowserActionInputCommands(browser, parentOpts);
.action(async (_opts, cmd) => { registerBrowserActionObserveCommands(browser, parentOpts);
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const status = await browserStatus(baseUrl);
if (parent?.json) {
defaultRuntime.log(JSON.stringify(status, null, 2));
return;
}
defaultRuntime.log(
[
`enabled: ${status.enabled}`,
`running: ${status.running}`,
`controlUrl: ${status.controlUrl}`,
`cdpPort: ${status.cdpPort}`,
`browser: ${status.chosenBrowser ?? "unknown"}`,
`profileColor: ${status.color}`,
].join("\n"),
);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("start")
.description("Start the clawd browser (no-op if already running)")
.action(async (_opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
await browserStart(baseUrl);
const status = await browserStatus(baseUrl);
if (parent?.json) {
defaultRuntime.log(JSON.stringify(status, null, 2));
return;
}
defaultRuntime.log(info(`🦞 clawd browser running: ${status.running}`));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("stop")
.description("Stop the clawd browser (best-effort)")
.action(async (_opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
await browserStop(baseUrl);
const status = await browserStatus(baseUrl);
if (parent?.json) {
defaultRuntime.log(JSON.stringify(status, null, 2));
return;
}
defaultRuntime.log(info(`🦞 clawd browser running: ${status.running}`));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("tabs")
.description("List open tabs")
.action(async (_opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const tabs = await browserTabs(baseUrl);
if (parent?.json) {
defaultRuntime.log(JSON.stringify({ tabs }, null, 2));
return;
}
if (tabs.length === 0) {
defaultRuntime.log("No tabs (browser closed or no targets).");
return;
}
defaultRuntime.log(
tabs
.map(
(t, i) =>
`${i + 1}. ${t.title || "(untitled)"}\n ${t.url}\n id: ${t.targetId}`,
)
.join("\n"),
);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("open")
.description("Open a URL in a new tab")
.argument("<url>", "URL to open")
.action(async (url: string, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const tab = await browserOpenTab(baseUrl, url);
if (parent?.json) {
defaultRuntime.log(JSON.stringify(tab, null, 2));
return;
}
defaultRuntime.log(`opened: ${tab.url}\nid: ${tab.targetId}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("focus")
.description("Focus a tab by target id (or unique prefix)")
.argument("<targetId>", "Target id or unique prefix")
.action(async (targetId: string, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
await browserFocusTab(baseUrl, targetId);
if (parent?.json) {
defaultRuntime.log(JSON.stringify({ ok: true }, null, 2));
return;
}
defaultRuntime.log(`focused tab ${targetId}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("close")
.description("Close a tab by target id (or unique prefix)")
.argument("<targetId>", "Target id or unique prefix")
.action(async (targetId: string, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
await browserCloseTab(baseUrl, targetId);
if (parent?.json) {
defaultRuntime.log(JSON.stringify({ ok: true }, null, 2));
return;
}
defaultRuntime.log(`closed tab ${targetId}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("screenshot")
.description("Capture a screenshot (MEDIA:<path>)")
.argument("[targetId]", "CDP target id (or unique prefix)")
.option("--full-page", "Capture full scrollable page", false)
.action(async (targetId: string | undefined, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserScreenshot(baseUrl, {
targetId: targetId?.trim() || undefined,
fullPage: Boolean(opts.fullPage),
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`MEDIA:${result.path}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("eval")
.description("Run JavaScript in the active tab")
.argument("<js>", "JavaScript expression")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.option("--await", "Await promise result", false)
.action(async (js: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserEval(baseUrl, {
js,
targetId: opts.targetId?.trim() || undefined,
awaitPromise: Boolean(opts.await),
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(JSON.stringify(result.result, null, 2));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("query")
.description("Query selector matches")
.argument("<selector>", "CSS selector")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.option("--limit <n>", "Max matches (default: 20)", (v: string) =>
Number(v),
)
.action(async (selector: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserQuery(baseUrl, {
selector,
targetId: opts.targetId?.trim() || undefined,
limit: Number.isFinite(opts.limit) ? opts.limit : undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(JSON.stringify(result.matches, null, 2));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("dom")
.description("Dump DOM (html or text) with truncation")
.option("--format <html|text>", "Output format (default: html)", "html")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.option("--selector <css>", "Optional CSS selector to scope the dump")
.option(
"--max-chars <n>",
"Max characters (default: 200000)",
(v: string) => Number(v),
)
.option("--out <path>", "Write output to a file")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
const format = opts.format === "text" ? "text" : "html";
try {
const result = await browserDom(baseUrl, {
format,
targetId: opts.targetId?.trim() || undefined,
maxChars: Number.isFinite(opts.maxChars) ? opts.maxChars : undefined,
selector: opts.selector?.trim() || undefined,
});
if (opts.out) {
const fs = await import("node:fs/promises");
await fs.writeFile(opts.out, result.text, "utf8");
if (parent?.json) {
defaultRuntime.log(
JSON.stringify({ ok: true, out: opts.out }, null, 2),
);
} else {
defaultRuntime.log(opts.out);
}
return;
}
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(result.text);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("snapshot")
.description("Capture an AI-friendly snapshot (aria, domSnapshot, or ai)")
.option(
"--format <aria|domSnapshot|ai>",
"Snapshot format (default: aria)",
"aria",
)
.option("--target-id <id>", "CDP target id (or unique prefix)")
.option("--limit <n>", "Max nodes (default: 500/800)", (v: string) =>
Number(v),
)
.option("--out <path>", "Write snapshot to a file")
.action(async (opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
const format =
opts.format === "domSnapshot"
? "domSnapshot"
: opts.format === "ai"
? "ai"
: "aria";
try {
const result = await browserSnapshot(baseUrl, {
format,
targetId: opts.targetId?.trim() || undefined,
limit: Number.isFinite(opts.limit) ? opts.limit : undefined,
});
if (opts.out) {
const fs = await import("node:fs/promises");
if (result.format === "ai") {
await fs.writeFile(opts.out, result.snapshot, "utf8");
} else {
const payload = JSON.stringify(result, null, 2);
await fs.writeFile(opts.out, payload, "utf8");
}
if (parent?.json) {
defaultRuntime.log(
JSON.stringify({ ok: true, out: opts.out }, null, 2),
);
} else {
defaultRuntime.log(opts.out);
}
return;
}
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
if (result.format === "ai") {
defaultRuntime.log(result.snapshot);
return;
}
if (result.format === "domSnapshot") {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
const nodes = "nodes" in result ? result.nodes : [];
defaultRuntime.log(
nodes
.map((n) => {
const indent = " ".repeat(Math.min(20, n.depth));
const name = n.name ? ` "${n.name}"` : "";
const value = n.value ? ` = "${n.value}"` : "";
return `${indent}- ${n.role}${name}${value}`;
})
.join("\n"),
);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("click")
.description("Click an element by ref from an ai snapshot (e.g. 76)")
.argument("<ref>", "Ref id from ai snapshot")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (ref: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
try {
const result = await browserClickRef(baseUrl, {
ref,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(`clicked ref ${ref} on ${result.url}`);
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
browser
.command("tool")
.description("Call a Playwright MCP-style browser tool by name")
.argument("<name>", "Tool name (browser_*)")
.option("--args <json>", "JSON arguments for the tool")
.option("--target-id <id>", "CDP target id (or unique prefix)")
.action(async (name: string, opts, cmd) => {
const parent = parentOpts(cmd);
const baseUrl = resolveBrowserControlUrl(parent?.url);
let args: Record<string, unknown> = {};
if (opts.args) {
try {
args = JSON.parse(String(opts.args));
} catch (err) {
defaultRuntime.error(
danger(`Invalid JSON for --args: ${String(err)}`),
);
defaultRuntime.exit(1);
}
}
try {
const result = await browserTool(baseUrl, {
name,
args,
targetId: opts.targetId?.trim() || undefined,
});
if (parent?.json) {
defaultRuntime.log(JSON.stringify(result, null, 2));
return;
}
defaultRuntime.log(JSON.stringify(result, null, 2));
} catch (err) {
defaultRuntime.error(danger(String(err)));
defaultRuntime.exit(1);
}
});
} }