amr/hermes-brain

Hermes 47ca8689fc gbrain: sync converted org-mode brain files

2026-06-01 03:03:11 +00:00

Pass as a Service — Hosting Economics

Assumptions
Unit cost per idle instance
Packing density
Infrastructure cost breakdown (100K users)
Pricing and margin
Scaling inflection points
Cloud vs colo
Architecture
Why this works
Addressable market
Price ladder

Assumptions

Stage 2 instance: one SBCL Lisp process with Gate + PDS + environment in one address space
User brings own LLM API key — no AI token cost to the provider
Containerized: Docker image running on cloud VMs (AWS spot instances)
Instances are embarrassingly parallel — no cross-instance coordination

Unit cost per idle instance

At rest: ~500 MB-1 GB RAM. Active peaks at 2-3 GB. CPU at rest negligible.

Packing density

AWS r6a instances (AMD, good price-to-RAM):

r6a.4xlarge (128 GB RAM): ~$0.23/hr spot, ~80 instances per VM
r6a.8xlarge (256 GB RAM): ~$0.45/hr spot, ~160 instances per VM

Cost per user per month at ~80 instances per 128 GB VM: ~$1.50 for compute.
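
The packing arithmetic can be checked in a few lines. This is a minimal sketch, not part of the original analysis: the function names and the 730 hours-per-month constant are assumptions, while the spot prices and instance counts are the ones quoted above.

```python
HOURS_PER_MONTH = 730  # assumption: average hours per month (8760 / 12)


def memory_per_instance_gb(vm_ram_gb: float, instances_per_vm: int) -> float:
    """RAM budget each instance gets when a VM is packed evenly."""
    return vm_ram_gb / instances_per_vm


def compute_cost_per_user(spot_price_per_hour: float, instances_per_vm: int) -> float:
    """Monthly VM cost divided across the instances packed onto it."""
    return spot_price_per_hour * HOURS_PER_MONTH / instances_per_vm


# Figures quoted above for the two r6a sizes.
for name, ram_gb, price, density in [
    ("r6a.4xlarge", 128, 0.23, 80),
    ("r6a.8xlarge", 256, 0.45, 160),
]:
    budget = memory_per_instance_gb(ram_gb, density)
    cost = compute_cost_per_user(price, density)
    print(f"{name}: {budget:.1f} GB/instance, ${cost:.2f}/user/month")
```

Under these assumptions each instance gets about 1.6 GB, which sits between the idle and active footprints, so the density relies on most instances being idle at any moment. The same inputs give roughly $2.05-2.10 per user per month, somewhat above the ~$1.50 quoted; the quoted figure would correspond to about 110 instances per VM at the same spot price, or to a lower effective hourly rate.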

Infrastructure cost breakdown (100K users)

Component	Detail	$/user/month
Compute	r6a spot, 80 instances/VM	~$1.50
Storage	10 GB EBS gp3 per user	~$0.80
Egress	Light protocol usage	~$0.50
Relay	K8s, stateless web service	~$0.50
Infra subtotal		~$3.30

Overhead
Engineering	4-5 people	~$0.80-1.60
Support	2-3 people	~$0.40-0.80
Overhead subtotal		~$1.20-2.40

Total		~$4.50-5.70

Pricing and margin

At $10/user/month:

Cost: ~$4.50-5.70/user/month
Margin: 43-55%
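
The totals and the margin range follow directly from the breakdown. The sketch below reproduces them; the dictionary layout and function names are illustrative assumptions, and the dollar figures are the per-user values from the 100K-user table.

```python
# Per-user monthly costs from the 100K-user breakdown (USD).
INFRA = {"compute": 1.50, "storage": 0.80, "egress": 0.50, "relay": 0.50}
OVERHEAD_LOW = {"engineering": 0.80, "support": 0.40}
OVERHEAD_HIGH = {"engineering": 1.60, "support": 0.80}


def total_cost(overhead: dict) -> float:
    """Infrastructure subtotal plus one overhead scenario."""
    return sum(INFRA.values()) + sum(overhead.values())


def margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of price."""
    return (price - cost) / price


PRICE = 10.0
low = total_cost(OVERHEAD_LOW)
high = total_cost(OVERHEAD_HIGH)
print(f"cost ${low:.2f}-{high:.2f}")
print(f"margin {margin(PRICE, high):.0%}-{margin(PRICE, low):.0%}")
```

Margin is expressed as a fraction of price, which is the convention that yields the 43-55% range above.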

Scaling inflection points

Users	Provider cost/user/mo	Margin at $10/mo
1K	$25-40+	Negative
10K	$8-12	about -20% to +20%
50K	$5-7	~30-50%
100K	$4.50-5.70	~43-55%
1M	$2-4	~60-80%

10K-20K users is the crossover to positive unit economics. Below that, the team overhead dominates.
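
To see where the crossover sits, the table rows can be turned into margin bands. This is a sketch under stated assumptions: the helper names are hypothetical, and the open-ended "$25-40+" cost for 1K users is represented by 40 as a stand-in upper bound.

```python
# (users, cost_low, cost_high) in USD per user per month, from the table above.
ROWS = [
    (1_000, 25.0, 40.0),  # "$25-40+": 40 is a stand-in upper bound
    (10_000, 8.0, 12.0),
    (50_000, 5.0, 7.0),
    (100_000, 4.50, 5.70),
    (1_000_000, 2.0, 4.0),
]
PRICE = 10.0


def margin_band(price, cost_low, cost_high):
    """Return (worst, best) margin as fractions of price."""
    return (price - cost_high) / price, (price - cost_low) / price


def first_always_profitable(rows, price):
    """Smallest listed scale whose worst-case cost is below the price."""
    for users, _cost_low, cost_high in rows:
        if cost_high < price:
            return users
    return None


for users, lo, hi in ROWS:
    worst, best = margin_band(PRICE, lo, hi)
    print(f"{users:>9,} users: {worst:+.0%} to {best:+.0%}")
print("always profitable from:", first_always_profitable(ROWS, PRICE))
```

At 10K users the band straddles zero, and by 50K even the worst case is positive, which is consistent with a crossover somewhere in the 10K-20K range.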

Cloud vs colo

At small scale (under 10K users): AWS wins. No hardware risk, no colo contract, elastic.

At large scale (100K+ users): Colo is 2-5x cheaper per instance. The AWS premium comes from lower packing density: the hypervisor adds overhead and memory cannot be overcommitted.

The crossover comes at roughly 50K-100K users, where the savings are large enough to justify dedicated ops staff for colo.

Architecture

Relay on Kubernetes (stateless web service, standard pattern). Instances are Docker containers on raw VMs — one container = one SBCL Lisp process + volume mount for PDS. No orchestration magic needed for the instance layer.

The hardest operational problems: port mapping at scale (reverse proxy in front of VM pools) and PDS data persistence on VM failure (EBS snapshots or NFS-backed volumes).
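
One way to keep port mapping tractable is a deterministic slot scheme that the reverse proxy can compute or cache. The sketch below is only an illustration of that idea, not the project's actual design: `Backend`, `backend_for`, `routing_table`, and the base port are hypothetical, and the density of 80 is the figure quoted earlier.

```python
from dataclasses import dataclass

INSTANCES_PER_VM = 80  # packing density quoted earlier
BASE_PORT = 20000      # hypothetical start of the per-instance port range


@dataclass(frozen=True)
class Backend:
    vm_index: int  # which VM in the pool hosts the container
    port: int      # host port published by that container


def backend_for(slot: int) -> Backend:
    """Map a global instance slot number to a VM and a host port."""
    if slot < 0:
        raise ValueError("slot must be non-negative")
    vm_index, local = divmod(slot, INSTANCES_PER_VM)
    return Backend(vm_index=vm_index, port=BASE_PORT + local)


def routing_table(user_slots: dict[str, int]) -> dict[str, Backend]:
    """Build the lookup a reverse proxy would consult for each user."""
    return {user: backend_for(slot) for user, slot in user_slots.items()}


table = routing_table({"alice": 0, "bob": 81})
print(table["alice"], table["bob"])
```

A fixed slot-to-port rule means the proxy needs to store only one integer per user. It does not address the second problem, PDS persistence: when a VM fails, the slot can be reassigned, but the volume still has to be restored or remounted.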

Why this works

Three things make the unit economics viable early:

Zero AI token cost (user brings own API key)
The Gate runs even without an LLM — caches common decisions, declines to reason when no key is configured. Not a degraded product, just a non-AI mode.
Docker-on-large-VM packing approaches bare-metal packing density on cloud by avoiding the overhead of one VM per instance.

Addressable market

AI chat vs AI agents: a gap of about two orders of magnitude.

Category	Users (Jun 2026)	Notes
ChatGPT (chatbot)	900M weekly active	Mostly text generation
AI agent users (all tools)	5-10M	Actions, tools, environment control
Ratio	~100:1	Not 1000:1 as of mid-2026

Agent users are roughly 1% of chatbot users today. If agent adoption follows the same growth curve as chatbots but lags by 18-24 months:

Year	Est. agent users	Users at 0.1% capture	MRR at $10/mo	Annual gross profit at 50% margin
2026	5-10M	5K-10K	$50K-100K	$300K-600K
2027	50-100M	50K-100K	$500K-1M	$3M-6M
2028	300-500M	300K-500K	$3M-5M	$18M-30M

These figures are conservative: they assume 0.1% capture of the agent market at $10/month, with no AI tokens included.
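
The projection is a straight chain of multiplications, sketched below. The function and constant names are illustrative assumptions; the capture rate, price, and margin are the values used in the table.

```python
CAPTURE = 0.001  # 0.1% of agent users
PRICE = 10       # USD per user per month
MARGIN = 0.50    # margin assumed in the table's last column


def project(agent_users: int) -> tuple[int, int, int]:
    """Return (captured users, monthly revenue, annual margin dollars)."""
    users = round(agent_users * CAPTURE)
    mrr = users * PRICE
    annual_margin = round(mrr * 12 * MARGIN)
    return users, mrr, annual_margin


for year, agent_users in [(2026, 5_000_000), (2027, 50_000_000), (2028, 300_000_000)]:
    users, mrr, annual = project(agent_users)
    print(f"{year}: {users:,} users, ${mrr:,} MRR, ${annual:,} per year after costs")
```

The last column of the table equals MRR x 12 x 0.5, so it is what remains after costs at a 50% margin; top-line annual revenue would be twice that figure.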

Passepartout is not just an AI agent. It's a social protocol, verified computing environment, and knowledge system. It competes on more than agent UX. Even a fraction of the growing agent market funds the infrastructure.

Price ladder

The most important constraint: the price users will bear must cover real infrastructure cost at whatever scale you're at. Two tiers solve for both growth and unit economics.

Self-hosted tier (growth engine):

User downloads the image, runs on own hardware or $5-10/mo VPS
Brings own API key for LLM access
Provider cost: ~$0.50-1/user/month (relay + routing)
Zero per-user compute or storage cost to provider
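Negative margin, but small per user: relay and routing are the only expense, so the tier scales to millions without provisioning compute or storage
This is the wedge: it proves the protocol, builds the network, and costs the provider only the relay share per user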

Hosted tier (revenue engine):

Provider-managed container, user brings API key
Packing density drives cost:
- At small scale (<5K hosted users): cost = $20-25/user/month
- At mature scale (50K+): cost = $5-7/user/month

Phase	Scale	Hosted cost/mo	Charged price	Margin
Bootstrap	<5K	$20-25	$25-30	10-20%
Break-even	5-20K	$10-15	$20-25	40-50%
Mature	50K+	$5-7	$15-20	60-70%
Commodity	500K+	$2-4	$10-15	75-85%

Pricing strategy:

Never price below cost — ramp pricing down as infrastructure efficiency improves
First 5K hosted users are enthusiasts and early adopters who pay a premium ($25-30)
When costs drop below $10/user/month, you have room to price at $10-15 and open a much wider funnel
Self-hosted grows the network regardless of whether the hosted tier succeeds
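
The "never price below cost" rule can be expressed as a price derived from cost and a target margin. This is a hypothetical sketch, not the document's pricing model: `price_for` and the target margins passed to it are assumptions chosen to land near the ladder above.

```python
def price_for(cost: float, target_margin: float) -> float:
    """Price that yields target_margin (as a fraction of price) at the given cost.

    With target_margin in [0, 1) the result is never below cost.
    """
    if not 0 <= target_margin < 1:
        raise ValueError("target_margin must be in [0, 1)")
    if cost < 0:
        raise ValueError("cost must be non-negative")
    return cost / (1 - target_margin)


# Illustrative (phase, cost, target margin) triples; the margins are assumptions.
for phase, cost, target in [
    ("Bootstrap", 20.0, 0.20),
    ("Break-even", 10.0, 0.50),
    ("Mature", 5.0, 0.65),
]:
    print(f"{phase}: cost ${cost:.2f} -> price ${price_for(cost, target):.2f}")
```

As cost per user falls, the same function produces a lower price at an equal or higher margin, which is the ramp described above.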

What the price buys:

Persistent Passepartout environment (shell, editor, browser, agent in one image)
Social protocol identity (DID, PDS, encrypted messaging)
The Gate verifying every action
Org data you own, in a format you own
AI tokens are NOT included — user brings own API key

Buyer profile:

Already spends $10-200/month on LLM API keys
Values a verified, persistent environment over ephemeral chatbot sessions
Wants to own data and identity
Can't or won't self-host
Developer, researcher, or knowledge worker

The non-obvious constraint: The addressable market at $25-30/month is narrower (0.1-0.5% of agent users) than at $10-15/month (1-5%). The hockey stick in user growth depends on infrastructure costs dropping far enough to price at consumer-friendly levels without burning capital.
