chore: initial infrastructure docs scaffold

Hermes
2026-05-15 07:07:19 +00:00
commit 26e95185e3

infrastructure.org

@@ -0,0 +1,168 @@
#+TITLE: Infrastructure Documentation — gharbeia.net
#+AUTHOR: Amr Gharbeia
#+DATE: 2026-05-15
* Architecture
** Hosts
- =production-1= (10.10.10.201) :: Docker host, runs all services
- Hermes Agent :: Management/automation host
** Network
- Docker network =networking= (172.28.10.0/24)
- Proxmox VLANs: 1/10/20/30/40/50
- Services VLAN: 10.10.10.0/24
- Domain: gharbeia.net via Cloudflare (orange cloud/proxied)
** External Access Architecture
Cloudflare (edge, orange cloud)
  └─ Cloudflare Tunnel "home" (cloudflared on production-1)
       └─ Traefik (entrypoint=tunnel, port 8081)
            ├─ Authentik Forward Auth (external routers)
            ├─ gharbeia-site (nginx)
            ├─ jellyfin (SSO via plugin + OIDC)
            ├─ gitea (native OIDC)
            └─ *.gharbeia.net services
** Internal Access Architecture
LAN client (browser)
  └─ Traefik (entrypoint=secureweb, port 443)
       ├─ Authentik Forward Auth (internal.yaml routers)
       ├─ gharbeia-site (public, no auth)
       ├─ jellyfin (SSO via plugin)
       └─ *.gharbeia.net services
Service-to-service / automation / cross-VLAN
  └─ Traefik (entrypoint=internal, port 8083, NO auth)
       └─ Same routing as secureweb, from =traefik-internal-noauth.yaml=
Key distinction: port =:443= serves browsers and humans behind Authentik auth;
port =:8083= serves runners, automated tooling, and services on other VLANs, with no auth.
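The split between the two internal entrypoints can be sketched as a small URL helper. This is an illustration only: the function name ~service_url~ and the client labels ~human~ and ~automation~ are hypothetical, while the ports and schemes are the ones listed in this document.

#+begin_src python
# Ports and schemes come from the entrypoints described in this document.
# The "human"/"automation" labels and the function name are hypothetical.
ENTRYPOINTS = {
    "human": ("https", 443),       # secureweb: TLS + Authentik Forward Auth
    "automation": ("http", 8083),  # internal: plain HTTP, no Authentik auth
}
DEFAULT_PORTS = {"https": 443, "http": 80}


def service_url(host: str, client: str) -> str:
    """Return the base URL a given kind of client should use for a service."""
    scheme, port = ENTRYPOINTS[client]
    # Omit the port when it is the scheme's default, as browsers do.
    netloc = host if port == DEFAULT_PORTS[scheme] else f"{host}:{port}"
    return f"{scheme}://{netloc}"


print(service_url("git.gharbeia.net", "human"))
print(service_url("git.gharbeia.net", "automation"))
#+end_src

The second call produces the same form as the Gitea runner URL recorded in the logbook (=http://git.gharbeia.net:8083=).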
* Services
** gharbeia-site (Static Website)
- Container: =gharbeia-site= (nginx:stable-alpine3.17-perl)
- Purpose: Landing page for gharbeia.net
- Docroot: =/docker/appdata/gharbeia-site/html=
- Nginx config: =/docker/appdata/gharbeia-site/nginx.conf=
- Traefik router: =gharbeia-site= on entrypoints =tunnel= and =secureweb=
- ~www.gharbeia.net~ → 301 redirect → ~gharbeia.net~ (handled by nginx)
- Both domains appear in the Traefik router rule: ~Host(`gharbeia.net`) || Host(`www.gharbeia.net`)~
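As a minimal sketch of how a multi-host router rule of this shape could be generated by tooling, the helper below joins hostnames into a Traefik =Host= matcher expression. The function name ~host_rule~ is hypothetical; only the resulting rule string is taken from this document.

#+begin_src python
def host_rule(*hosts: str) -> str:
    """Build a Traefik router rule that matches any of the given hostnames."""
    # Traefik expects each hostname wrapped in backticks inside Host(...).
    return " || ".join(f"Host(`{host}`)" for host in hosts)


print(host_rule("gharbeia.net", "www.gharbeia.net"))
#+end_src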
** Cloudflare Tunnel "home"
- Container: =cloudflared= (cloudflare/cloudflared:latest)
- Config: =/docker/compose/cloudflared-config.yml= (local, unused at runtime)
- Runtime: =docker compose up -d cloudflared= with =--token= (remote config from dashboard)
- Local config is IGNORED when running with =--token= — ingress rules come from Cloudflare Zero Trust dashboard's public hostname configuration
- DNS CNAME records must point to =<tunnel-uuid>.cfargotunnel.com=
- Tunnel UUID: =c29295c5-946a-4ddf-bdfe-7eafcd74faa3=
*** Public Hostnames (Cloudflare Dashboard)
These must be added in Cloudflare Zero Trust > Networks > Tunnels > home > Public Hostnames:
- *.gharbeia.net → https://traefik:443
- gharbeia.net → https://traefik:443 (must be explicit, wildcard doesn't cover root)
- www.gharbeia.net → https://traefik:443
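The reason the root domain needs its own entry can be illustrated with a small matcher. This is a simplified model and not Cloudflare's actual matching code: it assumes a wildcard covers any name with at least one extra label in front of the suffix, and the function names are hypothetical.

#+begin_src python
def covered_by(hostname: str, pattern: str) -> bool:
    """Return True if pattern (an exact name or '*.suffix') covers hostname."""
    hostname = hostname.lower().rstrip(".")
    pattern = pattern.lower().rstrip(".")
    if not pattern.startswith("*."):
        return hostname == pattern
    suffix = pattern[1:]  # e.g. ".gharbeia.net", leading dot included
    # The apex "gharbeia.net" does not end with ".gharbeia.net", so it fails here.
    return hostname.endswith(suffix) and len(hostname) > len(suffix)


def uncovered(hostnames, patterns):
    """List hostnames that no pattern covers, in input order."""
    return [h for h in hostnames if not any(covered_by(h, p) for p in patterns)]


print(uncovered(["gharbeia.net", "www.gharbeia.net"], ["*.gharbeia.net"]))
#+end_src

With only the wildcard configured, the sketch reports the root domain as uncovered, which is why it is listed explicitly above.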
*** DNS Records
gharbeia.net:
- CNAME → c29295c5-946a-4ddf-bdfe-7eafcd74faa3.cfargotunnel.com (proxied)
- MX → in1-smtp.messagingengine.com
- MX → in2-smtp.messagingengine.com
- TXT → v=spf1 include:spf.messagingengine.com ?all
www.gharbeia.net:
- CNAME → c29295c5-946a-4ddf-bdfe-7eafcd74faa3.cfargotunnel.com (proxied)
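A consistency check over these records could look like the sketch below. It assumes the CNAME records have already been fetched into a plain name-to-target mapping (how they are fetched is out of scope here); the function name ~stale_cnames~ and the all-zero placeholder UUID in the example are hypothetical.

#+begin_src python
# UUID of the "home" tunnel, as recorded in this document.
TUNNEL_UUID = "c29295c5-946a-4ddf-bdfe-7eafcd74faa3"


def stale_cnames(records: dict, tunnel_uuid: str = TUNNEL_UUID) -> list:
    """Return the names whose CNAME target is not the expected tunnel."""
    expected = f"{tunnel_uuid}.cfargotunnel.com"
    return sorted(
        name
        for name, target in records.items()
        if target.rstrip(".").lower() != expected
    )


# Hypothetical input: the root record points at some other tunnel.
records = {
    "gharbeia.net": "00000000-0000-0000-0000-000000000000.cfargotunnel.com",
    "www.gharbeia.net": f"{TUNNEL_UUID}.cfargotunnel.com",
}
print(stale_cnames(records))
#+end_src

Any name the function returns is a candidate for the kind of mismatch described in the Error 1033 logbook entry.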
* Authentication
** Authentik (IdP)
- Provides SSO for all services
- Two modes: Forward Auth (proxy-level) and OIDC (service-level)
- External tunnel traffic: Forward Auth on all routers in compose labels
- Internal LAN: Forward Auth on all routers in internal.yaml
- Exceptions: Jellyfin (SSO plugin), Gitea (native OIDC)
** Gitea — Native OIDC
- Configured in Gitea → Site Administration → Authentication Sources
- Authentik OIDC provider registered
- Works with native Gitea clients (no browser redirect needed)
** Jellyfin — SSO-Auth Plugin v4.0.0.4
- Plugin: SSO-Auth (via Jellyfin plugin catalog)
- Authentik OIDC provider created, redirect URI: ~https://jellyfin.gharbeia.net/sso/OID/redirect/Authentik~
- Scope mapping sends ~groups~ claim in OpenID token
- Plugin configured via API (=docker cp= XML into container)
- SSO button added to login page via Jellyfin branding config
- No Forward Auth — Jellyfin handles auth itself via plugin
* Traefik Configuration
** Source of Truth
- Compose labels: =/docker/compose/docker-compose.yaml= and companion files
- Internal routers (auth): =/docker/compose/traefik-internal.yaml=
- Internal routers (noauth): =/docker/compose/traefik-internal-noauth.yaml=
- Runtime copy: =/docker/appdata/traefik/internal.yaml= and =internal-noauth.yaml=
- Deploy: =restart.sh= copies all yaml files to =/docker/appdata/traefik/= then =docker compose up -d traefik=
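The copy step of the deploy can be sketched as follows. This is not the contents of =restart.sh=, which are not shown here; it only models the naming convention listed above (source files named =traefik-internal*.yaml= lose the =traefik-= prefix at runtime) and leaves out the =docker compose up -d traefik= step. The function name and the choice to copy only the two router files are assumptions.

#+begin_src python
import shutil
from pathlib import Path


def deploy_router_files(compose_dir: str, runtime_dir: str) -> list:
    """Copy traefik-internal*.yaml into runtime_dir without the 'traefik-' prefix."""
    runtime = Path(runtime_dir)
    runtime.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(compose_dir).glob("traefik-internal*.yaml")):
        # traefik-internal-noauth.yaml -> internal-noauth.yaml
        dst = runtime / src.name.removeprefix("traefik-")
        shutil.copyfile(src, dst)
        copied.append(dst.name)
    return copied
#+end_src

Called as ~deploy_router_files("/docker/compose", "/docker/appdata/traefik")~, it would produce the two runtime files named in this section. ~str.removeprefix~ requires Python 3.9 or later.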
** Entrypoints
- =tunnel= (port 8081) :: Cloudflare Tunnel traffic (external)
- =secureweb= (port 443) :: Internal LAN traffic, TLS, with Authentik Forward Auth
- =internal= (port 8083) :: Internal service-to-service, NO Authentik auth, HTTP only
** External Routers (tunnel entrypoint)
- Defined in compose labels
- All behind Authentik Forward Auth middleware (except Jellyfin, Gitea)
** Internal Routers — Authenticated
- Defined in =traefik-internal.yaml= (=internal.yaml= at runtime)
- All on entrypoint =secureweb= (port 443, TLS)
- All behind Authentik Forward Auth middleware
- Common middleware chain: =auth@file, security-headers@file, traefik-bouncer@file=
** Internal Routers — Authless
- Defined in =traefik-internal-noauth.yaml= (=internal-noauth.yaml= at runtime)
- All on entrypoint =internal= (port 8083, HTTP)
- No Authentik Forward Auth (no =authentik-forwardauth@file= middleware)
- Same services as the authenticated routers, same backend URLs
- Common middleware chain: =security-headers@file, traefik-bouncer@file=
- Used for: runners, cross-VLAN tooling, service-to-service API calls
Services already authless on secureweb (Gitea, gharbeia-site) also have
noauth copies for consistency.
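The relationship between the two router sets can be sketched as a transformation: each authless router is a copy of an authenticated one with the auth middleware removed and the entrypoint switched. The sketch assumes routers are already loaded as dictionaries using Traefik's dynamic-configuration keys (=rule=, =entryPoints=, =middlewares=, =service=, =tls=). The =-noauth= name suffix, the ~whoami~ example router, and the function name are hypothetical, and the middleware name follows the chain documented for the authenticated routers.

#+begin_src python
import copy

AUTH_MIDDLEWARE = "auth@file"  # as listed in the authenticated middleware chain


def to_noauth(routers: dict) -> dict:
    """Derive authless copies of routers for the 'internal' entrypoint."""
    result = {}
    for name, router in routers.items():
        clone = copy.deepcopy(router)  # leave the authenticated router untouched
        clone["entryPoints"] = ["internal"]
        clone["middlewares"] = [
            m for m in router.get("middlewares", []) if m != AUTH_MIDDLEWARE
        ]
        clone.pop("tls", None)  # the internal entrypoint is HTTP only
        result[f"{name}-noauth"] = clone
    return result


# Hypothetical authenticated router used only for illustration.
authed = {
    "whoami": {
        "rule": "Host(`whoami.gharbeia.net`)",
        "entryPoints": ["secureweb"],
        "middlewares": ["auth@file", "security-headers@file", "traefik-bouncer@file"],
        "service": "whoami",
        "tls": {},
    }
}
print(to_noauth(authed)["whoami-noauth"]["middlewares"])
#+end_src

Generating the authless file from the authenticated one this way would keep rules and backend services identical between the two, at the cost of adding a generation step to the deploy.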
* LOGBOOK
** [2026-05-15 Fri 06:10] Static site launched
- Set up gharbeia.net and www.gharbeia.net with nginx container
- Tunnel + Traefik wiring
- www → root 301 redirect in nginx config
- Traefik router on both tunnel and secureweb entrypoints
** [2026-05-15 Fri 06:38] Internal authless entrypoint + domain migration
- Added Traefik =internal= entrypoint (port 8083) for authless service-to-service traffic
- Created =/docker/compose/traefik-internal-noauth.yaml= with 28 router copies
- Copied to runtime: =/docker/appdata/traefik/internal-noauth.yaml=
- Exposed port 8083 on traefik container (~INTERNAL_PORT_TRAEFIK=8083~)
- Added to =restart.sh= deployment: ~sudo cp traefik-internal-noauth.yaml $FOLDER_FOR_DATA/traefik/internal-noauth.yaml~
- Unbound already resolves =*.gharbeia.net= → =10.10.10.201= via =local-zone redirect=; no changes needed
- Docker containers already inherit this DNS through embedded DNS (127.0.0.11) → host DNS → Unbound
- Updated Gitea runner: ~GITEA_INSTANCE_URL: http://10.10.10.201:3001~ → ~http://git.gharbeia.net:8083~
- Verified: runner registers and communicates through domain-based URL
- Gitea config already used domains: ~ROOT_URL = https://git.gharbeia.net/~
- Gluetun extra_hosts kept as-is (safety net for VPN namespace DNS leaks)
- Architecture: browsers use =:443= (secureweb, with auth); services/automation use =:8083= (internal, no auth)
** [2026-05-15 Fri 06:18] Error 1033 on gharbeia.net
- Problem: CNAME for ~gharbeia.net~ pointed to the old tunnel (2cd53dc4-...), not the "home" tunnel (c29295c5-...)
- www.gharbeia.net worked because its CNAME was correct
- The www → root redirect then sent clients to gharbeia.net, where Cloudflare tried the old tunnel and returned error 1033
- Fix: Updated the CNAME DNS record via the Cloudflare API (DNS token)
- The on-disk =cloudflared-config.yml= had correct entries but is not used at runtime (the tunnel runs with =--token=)
- Lesson: Bare domain DNS must point to the same tunnel UUID as subdomains
- Lesson: The local cloudflared-config.yml is decorative when running with =--token=