#+TITLE: SKILL: Web Research Agent (Universal Literate Note)
#+ID: skill-web-research
#+STARTUP: content
#+FILETAGS: :web:research:internet:psf:
* Overview
The *Web Research Agent* is the system's bridge to the internet. It fetches web content through pluggable engines (Lynx, Curl, and a headless-browser =:vision= engine) and synthesizes it into summaries, enabling real-time research and fact verification.
* Phase A: Demand (PRD)
:PROPERTIES:
:STATUS: FROZEN
:END:
** 1. Purpose
Define the interfaces for internet information retrieval and synthesis.
** 2. User Needs
- *Connectivity:* Pluggable engines (Lynx, Curl) for fetching URLs.
- *Synthesis:* Neural transformation of raw content into factual summaries.
- *Efficiency:* Default to text-only engines to minimize overhead.
- *Search Integration:* Automatic DuckDuckGo routing for general queries.
** 3. Success Criteria
*** TODO Engine Fetching Verification (Lynx/Curl)
*** TODO URL vs Query Routing Logic
*** TODO Neural Synthesis Formatting Accuracy
* Phase B: Blueprint (PROTOCOL)
:PROPERTIES:
:STATUS: SIGNED
:END:
** 1. Architectural Intent
This phase defines the interfaces for web I/O and content synthesis. The source of truth is the live internet, as retrieved through local CLI browser engines.
** 2. Semantic Interfaces
#+begin_src lisp
(defun trigger-skill-web-research (context)
  "Triggers on :delegation :target-skill :web.")

(defun web-fetch (url &optional engine)
  "Dispatches fetch request to CLI engines.")

(defun neuro-skill-web-research (context)
  "Neural selection of engine and synthesis of fetched content.")
#+end_src
* Phase D: Build (Implementation)
** Browser Engines
#+begin_src lisp :tangle projects/org-skill-web-research/src/research-logic.lisp
(defun fetch-with-lynx (url)
  "Return the rendered text of URL as dumped by Lynx."
  ;; The command is passed as a list, so no shell is involved and quotes
  ;; or metacharacters in URL cannot alter the command line.
  (uiop:run-program (list "lynx" "-dump" "-nolist" url)
                    :output :string :ignore-error-status t))

(defun fetch-with-curl (url)
  "Return the raw response body of URL, following redirects."
  (uiop:run-program (list "curl" "-sL" url)
                    :output :string :ignore-error-status t))

(defun vision-browse (url)
  "Uses a headless browser (Node/Playwright) to fetch text and a screenshot."
  (let* ((proj-dir (or (uiop:getenv "PROJECTS_DIR") "projects/"))
         (script-path (format nil "~aorg-skill-web-research/src/browse.js" proj-dir)))
    (handler-case
        ;; Empty or malformed output makes the decoder signal an error,
        ;; which is reported through the (:error ...) plist below.
        (cl-json:decode-json-from-string
         (uiop:run-program (list "node" script-path url)
                           :output :string :ignore-error-status t))
      (error (c)
        (list :error (format nil "Vision Browse Failure: ~a" c))))))

(defun web-fetch (url &optional engine)
  "Fetch URL with ENGINE (:curl or :vision); any other value uses Lynx."
  (case engine
    (:curl (fetch-with-curl url))
    (:vision (vision-browse url))
    (t (fetch-with-lynx url))))
#+end_src
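The dispatch rule in =web-fetch= (any engine other than =:curl= or =:vision= falls back to Lynx) can be checked without network access by separating engine selection from execution. The sketch below is illustrative only: =resolve-engine= is a hypothetical helper and is not part of the tangled source.
#+begin_src lisp
;; Hypothetical helper: mirrors the CASE in web-fetch, but returns the
;; engine keyword instead of running an external program.
(defun resolve-engine (&optional engine)
  "Return the engine keyword that web-fetch would dispatch to for ENGINE."
  (case engine
    ((:curl :vision) engine)
    (t :lynx)))

(resolve-engine :curl)     ; => :CURL
(resolve-engine :vision)   ; => :VISION
(resolve-engine)           ; => :LYNX
(resolve-engine :firefox)  ; => :LYNX (unknown engines use the default)
#+end_src
Falling back to Lynx for unknown engines matches the *Efficiency* need of defaulting to a text-only engine.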
** Neuro-Cognitive Intelligence
#+begin_src lisp :tangle projects/org-skill-web-research/src/research-logic.lisp
(defun neuro-skill-web-research (context)
  "Neural stage for multi-modal web research.
Uses the :vision engine when the payload sets :vision-p (visual details or
JS-heavy sites); otherwise fetches with :curl."
  (let* ((payload (getf context :payload))
         (url (getf payload :url))
         (query (getf payload :query))
         (prefer-vision (getf payload :vision-p)))
    (if url
        (let* ((engine (if prefer-vision :vision :curl))
               (content (web-fetch url engine))
               ;; Text engines return a string. VISION-BROWSE returns either
               ;; decoded JSON (an alist under cl-json's defaults, assumed to
               ;; carry a text field) or an (:error ...) plist.
               (text (cond ((stringp content) content)
                           ((eq (first content) :error) (second content))
                           (t (cdr (assoc :text content))))))
          (format nil "
I fetched the following content from ~a using ~a:
---
~a
---
TASK:
If a screenshot was provided (as base64), it will be analyzed by the multimodal layer.
Summarize the key information or answer the original query: ~a
" url engine text query))
        ;; If no URL, we might need to search first
        (format nil "No URL provided for research. Query: ~a" query))))
#+end_src
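The no-URL branch only reports the query, while the User Needs call for automatic DuckDuckGo routing. One possible way to close that gap is to turn the query into a search URL and hand it to =web-fetch=. The sketch below covers only the URL construction. =utf8-octets=, =percent-encode=, =search-url=, and =route-target= are hypothetical names, the =https://html.duckduckgo.com/html/= endpoint is an assumption to verify before use, and the encoder assumes =char-code= returns Unicode code points (true on SBCL and CCL, but implementation-dependent).
#+begin_src lisp
(defun utf8-octets (code)
  "Return the UTF-8 octets for the Unicode code point CODE as a list."
  (flet ((tail (shift) (logior #x80 (logand (ash code shift) #x3F))))
    (cond ((< code #x80) (list code))
          ((< code #x800) (list (logior #xC0 (ash code -6)) (tail 0)))
          ((< code #x10000)
           (list (logior #xE0 (ash code -12)) (tail -6) (tail 0)))
          (t (list (logior #xF0 (ash code -18))
                   (tail -12) (tail -6) (tail 0))))))

(defun percent-encode (string)
  "Percent-encode STRING for use in a URL query component.
Only ASCII letters, digits, and the four unreserved marks pass through."
  (with-output-to-string (out)
    (loop for ch across string
          do (if (find ch "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")
                 (write-char ch out)
                 (dolist (octet (utf8-octets (char-code ch)))
                   (format out "%~2,'0X" octet))))))

(defun search-url (query)
  "Build a DuckDuckGo search URL for QUERY (endpoint is an assumption)."
  (format nil "https://html.duckduckgo.com/html/?q=~a" (percent-encode query)))

(defun route-target (payload)
  "Return the URL to fetch for PAYLOAD: its :url if present, otherwise a
search URL built from its :query, otherwise NIL."
  (let ((url (getf payload :url))
        (query (getf payload :query)))
    (cond (url url)
          (query (search-url query))
          (t nil))))

(route-target '(:url "https://example.org"))
;; => "https://example.org"
(route-target '(:query "lisp & web"))
;; => "https://html.duckduckgo.com/html/?q=lisp%20%26%20web"
#+end_src
With a helper like this, the neural stage could call =web-fetch= on the routed URL instead of returning the "No URL provided" message, which would also give the *URL vs Query Routing Logic* criterion something concrete to test.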
** Trigger Perception
#+begin_src lisp :tangle projects/org-skill-web-research/src/research-logic.lisp
(defun trigger-skill-web-research (context)
  "Return true for delegation events whose target skill is :web."
  (let ((type (getf context :type))
        (payload (getf context :payload)))
    (and (eq type :EVENT)
         (eq (getf payload :sensor) :delegation)
         (eq (getf payload :target-skill) :web))))
#+end_src
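To make the expected context shape concrete, the sketch below builds a few sample contexts and runs them through the trigger. The plist layout is inferred from the keys the trigger and neural stage read; =make-delegation-context= is a hypothetical helper, and the trigger is repeated only so the snippet can be evaluated on its own.
#+begin_src lisp
;; Repeated from the tangled block so this snippet is self-contained.
(defun trigger-skill-web-research (context)
  (let ((type (getf context :type))
        (payload (getf context :payload)))
    (and (eq type :EVENT)
         (eq (getf payload :sensor) :delegation)
         (eq (getf payload :target-skill) :web))))

;; Hypothetical helper: build a delegation event aimed at TARGET, with
;; any extra payload keys (:url, :query, :vision-p) appended.
(defun make-delegation-context (target &rest payload-extras)
  (list :type :EVENT
        :payload (list* :sensor :delegation
                        :target-skill target
                        payload-extras)))

(trigger-skill-web-research
 (make-delegation-context :web :url "https://example.org"))
;; => T

(trigger-skill-web-research (make-delegation-context :shell))
;; => NIL (delegation to a different skill)

(trigger-skill-web-research
 '(:type :REQUEST :payload (:sensor :delegation :target-skill :web)))
;; => NIL (not an :EVENT)
#+end_src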
* Registration
#+begin_src lisp
(defskill :skill-web-research
  :priority 80
  :trigger #'trigger-skill-web-research
  :neuro #'neuro-skill-web-research
  :symbolic #'verify-skill-web-research)
#+end_src