SKILL: Playwright-Python Bridge (Universal Literate Note)
Overview
The Playwright Bridge gives the agent high-fidelity web browsing by driving a headless Chromium instance from Python. This lets the agent read JavaScript-heavy applications whose content a plain HTTP client cannot render.
Phase A: Demand (PRD)
1. Purpose
Enable the agent to "see" and "read" the modern web by executing JavaScript and waiting for network idle states.
2. Success Criteria
- Interaction: Can navigate to any URL and wait for full page rendering.
- Extraction: Can retrieve inner text from any CSS selector.
- Vision: Can take base64-encoded screenshots of rendered pages.
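The Vision criterion implies that whoever consumes a screenshot must turn the base64 string back into image bytes. A minimal sketch of that step follows. It assumes the screenshots are PNG (Playwright's default image type; the note itself does not state the format), and `decode_screenshot` is a hypothetical helper name, not part of the bridge.

```python
import base64

# The eight-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(payload):
    """Decode a base64 screenshot string and check that it looks like a PNG.

    Assumption: the bridge returns PNG data. Raises ValueError if the
    payload is not valid base64 or does not start with the PNG signature.
    """
    data = base64.b64decode(payload, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return data
```

The returned bytes can be written to a file opened in binary mode or handed to an image library.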
Phase B: Blueprint (PROTOCOL)
1. Architectural Intent
Uses a "JSON Bridge" over standard I/O. The Lisp kernel executes a standalone Python script, passing parameters via `stdin` and receiving structured results via `stdout`.
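The Python side of this protocol might look roughly like the sketch below. It is not the actual `scripts/browser-bridge.py`, whose contents are not shown in this note. The function names `render_page`, `handle_request`, and `main` are hypothetical; the response keys `status`, `content`, `screenshot_base64`, and `message` are inferred from the fields the Lisp tool reads. Playwright is imported lazily so the protocol logic can be exercised without a browser installed.

```python
import base64
import json
import sys


def render_page(url, action, selector="body"):
    """Drive headless Chromium through Playwright's sync API (needs Playwright)."""
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Wait until network activity settles so JS-rendered content exists.
            page.goto(url, wait_until="networkidle")
            if action == "screenshot":
                png = page.screenshot()
                encoded = base64.b64encode(png).decode("ascii")
                return {"status": "success", "screenshot_base64": encoded}
            return {"status": "success", "content": page.inner_text(selector)}
        finally:
            browser.close()


def handle_request(raw, renderer=render_page):
    """Turn one JSON request string into one JSON response string."""
    try:
        req = json.loads(raw)
        action = req.get("action", "extract_text")
        if action not in ("extract_text", "screenshot"):
            raise ValueError(f"unknown action: {action}")
        result = renderer(req["url"], action, req.get("selector") or "body")
    except Exception as exc:
        # Report failures inside the JSON envelope instead of crashing.
        result = {"status": "error", "message": f"{type(exc).__name__}: {exc}"}
    return json.dumps(result)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read one request from stdin and write one response to stdout."""
    stdout.write(handle_request(stdin.read()))

# A standalone script would finish by calling main().
```

Reporting errors inside the JSON envelope, rather than through the exit status, lets the caller distinguish "the page failed to load" from "the bridge process itself could not run".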
2. Semantic Interfaces
- `(:target :tool :action :call :tool "browser" :args (:url "…" :action "extract_text"))`
Phase D: Build (Implementation)
Package Context
(in-package :org-agent)
Bridge Script Path
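Resolves the location of the Python bridge script relative to the project root, which is taken from the `PROJECT_ROOT` environment variable or, if that is unset, the current working directory.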
(defun get-browser-bridge-path ()
  "Returns the absolute path to the Python browser bridge script."
  (let ((root (or (uiop:getenv "PROJECT_ROOT") (uiop:native-namestring (uiop:getcwd)))))
    (merge-pathnames "scripts/browser-bridge.py" (uiop:ensure-directory-pathname root))))
Execution Wrapper (execute-browser-command)
Invokes the Python bridge and parses its JSON output.
(defun execute-browser-command (args)
  "Invokes the Playwright Python bridge with the provided arguments."
  (let* ((script-path (get-browser-bridge-path))
         (json-input (cl-json:encode-json-to-string args)))
    (handler-case
        ;; A non-zero exit status signals UIOP:SUBPROCESS-ERROR, handled below.
        (let ((output (uiop:run-program (list "python3" (uiop:native-namestring script-path))
                                        :input (make-string-input-stream json-input)
                                        :output :string
                                        :error-output :string)))
          (cl-json:decode-json-from-string output))
      (error (c)
        ;; Return an alist, the same shape cl-json decodes to, so callers can use ASSOC.
        (list (cons :status "error")
              (cons :message (format nil "Bridge Execution Failed: ~a" c)))))))
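When debugging, it can help to exercise the bridge outside the Lisp kernel. The sketch below mirrors the wrapper's contract in Python: JSON in on `stdin`, JSON out on `stdout`, and any failure folded into an error result. `run_bridge` is a hypothetical helper; unlike the Lisp wrapper it uses the current interpreter rather than `python3` and adds a timeout, both of which are assumptions of this sketch.

```python
import json
import subprocess
import sys


def run_bridge(script_path, args, timeout=60):
    """Send `args` as JSON on stdin and parse the JSON reply from stdout."""
    try:
        proc = subprocess.run(
            [sys.executable, script_path],
            input=json.dumps(args),
            capture_output=True,
            text=True,
            timeout=timeout,  # assumption: not present in the Lisp wrapper
            check=True,       # a non-zero exit raises CalledProcessError
        )
        return json.loads(proc.stdout)
    except (subprocess.SubprocessError, OSError, ValueError) as exc:
        # Covers non-zero exits, timeouts, launch failures, and invalid JSON.
        return {"status": "error", "message": f"Bridge Execution Failed: {exc}"}
```

A call such as `run_bridge("scripts/browser-bridge.py", {"url": "…", "action": "extract_text"})` then returns a dictionary with the same `status` field the Lisp tool inspects.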
Cognitive Tool: Browser
Register the high-fidelity browsing tool with the harness.
(def-cognitive-tool :browser
  "High-fidelity web browsing via Playwright (Chromium). Supports JS rendering."
  ((:url :type :string :description "The target URL")
   (:action :type :string :description "Action to perform: 'extract_text' or 'screenshot'")
   (:selector :type :string :description "Optional CSS selector (default: 'body')"))
  :body (lambda (args)
          (let ((result (execute-browser-command args)))
            (if (string= (cdr (assoc :status result)) "success")
                ;; cl-json decodes the JSON key "screenshot_base64" as :SCREENSHOT--BASE64.
                (or (cdr (assoc :content result))
                    (cdr (assoc :screenshot--base64 result))
                    "Success (no content returned)")
                (format nil "BROWSER ERROR: ~a" (cdr (assoc :message result)))))))
Registration: Skill
(defskill :skill-playwright
  :priority 150
  :trigger (lambda (ctx) (declare (ignore ctx)) nil) ; Passive tool provider
  :neuro nil
  :symbolic (lambda (action ctx) (declare (ignore ctx)) action))