org-agent-contrib/org-skill-playwright.org

SKILL: Playwright-Python Bridge (Universal Literate Note)

Overview

The Playwright Bridge gives the agent high-fidelity web browsing by driving a headless Chromium instance from a Python script. This lets the agent work with JavaScript-heavy applications that plain HTTP clients cannot render.

Phase A: Demand (PRD)

1. Purpose

Enable the agent to "see" and "read" the modern web by executing JavaScript and waiting for network idle states.

2. Success Criteria

  • Interaction: Can navigate to any URL and wait for full page rendering.
  • Extraction: Can retrieve inner text from any CSS selector.
  • Vision: Can take base64-encoded screenshots of rendered pages.
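The three criteria map onto a handful of Playwright calls. The following is a minimal sketch, assuming the bridge uses Playwright's synchronous Python API. The function names `run_action` and `browse` are hypothetical, and the result keys (`status`, `content`, `screenshot_base64`, `message`) are inferred from the Lisp side of this note rather than taken from the actual bridge script.

```python
import base64


def run_action(page, url, action, selector="body"):
    """Perform one browser action on an already-open Playwright page."""
    # Interaction: load the page and wait until network activity settles.
    page.goto(url, wait_until="networkidle")
    if action == "extract_text":
        # Extraction: rendered inner text of the element matching the selector.
        return {"status": "success", "content": page.inner_text(selector)}
    if action == "screenshot":
        # Vision: PNG bytes, base64-encoded so they survive a JSON round trip.
        png = page.screenshot(full_page=True)
        return {
            "status": "success",
            "screenshot_base64": base64.b64encode(png).decode("ascii"),
        }
    return {"status": "error", "message": f"Unknown action: {action}"}


def browse(url, action, selector="body"):
    """Launch headless Chromium, run one action, and always close the browser."""
    # Imported lazily so run_action can be exercised without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            return run_action(browser.new_page(), url, action, selector)
        finally:
            browser.close()
```

Keeping `run_action` separate from the browser lifecycle means the action logic can be checked against a stand-in page object, while `browse` needs a real Playwright install with Chromium available.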

Phase B: Blueprint (PROTOCOL)

1. Architectural Intent

Uses a "JSON Bridge" over standard I/O. The Lisp kernel executes a standalone Python script, passing parameters via `stdin` and receiving structured results via `stdout`.
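The Python half of this exchange can be sketched as follows. This is an illustration under stated assumptions, not the contents of the actual bridge script: `handle_request`, `main`, and the `perform` callback are hypothetical names, and the required keys are assumed from the tool parameters declared later in this note.

```python
import json
import sys


def handle_request(raw, perform):
    """Decode one JSON request, run it, and return one JSON response string."""
    try:
        args = json.loads(raw)
        if not isinstance(args, dict):
            raise ValueError("request must be a JSON object")
        missing = [key for key in ("url", "action") if key not in args]
        if missing:
            raise ValueError(f"missing required key(s): {', '.join(missing)}")
        # `perform` is a hypothetical callable that does the actual browsing.
        result = perform(args["url"], args["action"], args.get("selector", "body"))
    except Exception as exc:
        # Report every failure in-band so the caller always receives valid JSON.
        result = {"status": "error", "message": str(exc)}
    return json.dumps(result)


def main(perform):
    """Serve one request per process: read all of stdin, write one JSON document."""
    sys.stdout.write(handle_request(sys.stdin.read(), perform))
    return 0
```

Because the caller decodes the whole of `stdout` as a single JSON document, a script built this way should send any logging or diagnostics to `stderr` instead.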

2. Semantic Interfaces

  • `(:target :tool :action :call :tool "browser" :args (:url "…" :action "extract_text"))`

Phase D: Build (Implementation)

Package Context

(in-package :org-agent)

Bridge Script Path

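Resolves the location of the Python bridge script relative to the project root. The root is taken from the `PROJECT_ROOT` environment variable, falling back to the current working directory.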

(defun get-browser-bridge-path ()
  "Returns the absolute path to the Python browser bridge script."
  (let ((root (or (uiop:getenv "PROJECT_ROOT") (uiop:native-namestring (uiop:getcwd)))))
    (merge-pathnames "scripts/browser-bridge.py" (uiop:ensure-directory-pathname root))))

Execution Wrapper (execute-browser-command)

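Invokes the Python bridge as a subprocess, sends the arguments as JSON on `stdin`, and parses the JSON it prints on `stdout`. Failures are returned as an error result instead of being signalled.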

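(defun execute-browser-command (args)
  "Invokes the Playwright Python bridge with ARGS and returns the decoded JSON as an alist.
On failure, returns an alist with :status \"error\" and a :message."
  (let* ((script-path (get-browser-bridge-path))
         (json-input (cl-json:encode-json-to-string args)))
    (handler-case
        ;; RUN-PROGRAM signals an error on a non-zero exit status; the handler below catches it.
        (let ((output (uiop:run-program (list "python3" (uiop:native-namestring script-path))
                                        :input (make-string-input-stream json-input)
                                        :output :string
                                        :error-output :string)))
          (cl-json:decode-json-from-string output))
      (error (c)
        ;; Return an alist, matching the shape of decoded JSON, so callers can use ASSOC.
        (list (cons :status "error")
              (cons :message (format nil "Bridge Execution Failed: ~a" c)))))))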

Cognitive Tool: Browser

Register the high-fidelity browsing tool with the harness.

(def-cognitive-tool :browser
  "High-fidelity web browsing via Playwright (Chromium). Supports JS rendering."
  ((:url :type :string :description "The target URL")
   (:action :type :string :description "Action to perform: 'extract_text' or 'screenshot'")
   (:selector :type :string :description "Optional CSS selector (default: 'body')"))
  :body (lambda (args)
          (let ((result (execute-browser-command args)))
            ;; EQUAL tolerates a missing or non-string status; STRING= would signal on one.
            (if (equal (cdr (assoc :status result)) "success")
                (or (cdr (assoc :content result))
                    ;; With cl-json's default key mapping, an underscore in a JSON key
                    ;; (e.g. "screenshot_base64") decodes to a double hyphen.
                    (cdr (assoc :screenshot--base64 result))
                    "Success (no content returned)")
                (format nil "BROWSER ERROR: ~a" (cdr (assoc :message result)))))))

Registration: Skill

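(defskill :skill-playwright
  :priority 150
  ;; The trigger always returns NIL, so this skill never activates on its own;
  ;; it exists only to provide the :browser tool.
  :trigger (lambda (ctx) (declare (ignore ctx)) nil) ; Passive tool provider
  :neuro nil
  ;; The symbolic handler passes the action through unchanged.
  :symbolic (lambda (action ctx) (declare (ignore ctx)) action))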