Robot U Site Agent Guide

Purpose

This repository contains the Robot U community site.

It is a thin application layer over Forgejo:

  • Forgejo is the source of truth for authentication, public content repos, and issue-backed discussions.
  • This app provides the web UI, course/lesson browsing, markdown rendering, and ICS calendar ingestion.
  • The current live Forgejo instance is https://aksal.cloud.

Stack

  • Backend: FastAPI
  • Frontend: Preact + TypeScript + Vite
  • Python tooling: uv, ruff
  • Frontend tooling: bun, Biome

Important Files

  • app.py: FastAPI app and SPA/static serving
  • live_prototype.py: live payload assembly for courses, lessons, discussions, and events
  • prototype_cache.py: server-side cache for the public Forgejo content payload
  • update_events.py: in-process SSE broker for content update notifications
  • forgejo_client.py: Forgejo API client
  • calendar_feeds.py: ICS/webcal feed loading and parsing
  • settings.py: env-driven runtime settings
  • frontend/src/App.tsx: client routes and page composition
  • frontend/src/MarkdownContent.tsx: safe markdown renderer used in lessons and discussions
  • scripts/start.sh: main startup command for local runs
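prototype_cache.py above holds the public Forgejo content payload behind a TTL. A minimal sketch of that shape (hypothetical class and method names; the real module may differ):

```python
import time
from typing import Any, Callable


class TTLCache:
    """Single-value TTL cache for an expensive payload fetch.

    A TTL of 0 disables caching entirely (every call refetches),
    matching the FORGEJO_CACHE_TTL_SECONDS=0 behavior described below.
    """

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._value: Any = None
        self._expires_at: float = 0.0

    def get_or_fetch(self, fetch: Callable[[], Any]) -> Any:
        now = time.monotonic()
        if self.ttl > 0 and self._value is not None and now < self._expires_at:
            return self._value  # still fresh
        self._value = fetch()
        self._expires_at = now + self.ttl
        return self._value

    def invalidate(self) -> None:
        # Called after writes that change the payload,
        # e.g. a successful discussion reply.
        self._value = None
        self._expires_at = 0.0
```

Invalidation after successful replies keeps the cached payload from serving stale discussion threads for up to a full TTL window.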

Repo Layout Notes

  • The root repository is the site application.
  • examples/quadrature-encoder-course/ is a separate nested git repo used as sample content. It is intentionally ignored by the root repo and should stay that way.

First-Time Setup

Python

python3 -m venv .venv
.venv/bin/pip install -r requirements.txt

Frontend

cd frontend
~/.bun/bin/bun install

Environment

scripts/start.sh loads runtime configuration in layers: the shell environment, then .env, then .env.local.
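The layering can be pictured as below. This is a sketch, not the script's actual logic: the exact precedence lives in scripts/start.sh, and here later files are assumed to override earlier ones, with variables already exported in the shell winning overall.

```python
import os


def parse_env_file(path: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blanks and # comments."""
    values: dict[str, str] = {}
    try:
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
    except FileNotFoundError:
        pass  # both files are optional
    return values


def load_settings() -> dict[str, str]:
    # Assumed precedence: .env is the base, .env.local overrides it,
    # and already-exported shell variables override both.
    settings = parse_env_file(".env")
    settings.update(parse_env_file(".env.local"))
    settings.update({k: v for k, v in os.environ.items() if k in settings})
    return settings
```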

Recommended local flow:

cp .env.example .env

Useful variables:

  • FORGEJO_BASE_URL=https://aksal.cloud
  • APP_BASE_URL=http://kacper-dev-pod:8800
  • AUTH_SECRET_KEY=...
  • AUTH_COOKIE_SECURE=false
  • CORS_ALLOW_ORIGINS=http://kacper-dev-pod:8800
  • FORGEJO_OAUTH_CLIENT_ID=...
  • FORGEJO_OAUTH_CLIENT_SECRET=...
  • FORGEJO_OAUTH_SCOPES=openid profile
  • FORGEJO_TOKEN=...
  • FORGEJO_GENERAL_DISCUSSION_REPO=Robot-U/general_forum
  • FORGEJO_WEBHOOK_SECRET=...
  • FORGEJO_CACHE_TTL_SECONDS=60.0
  • CALENDAR_FEED_URLS=webcal://...
  • HOST=0.0.0.0
  • PORT=8800

Notes:

  • Browser sign-in uses Forgejo OAuth/OIDC. APP_BASE_URL must match the URL opened in the browser, CORS_ALLOW_ORIGINS should include that origin, and the Forgejo OAuth app must include /api/auth/forgejo/callback under that base URL.
  • Browser OAuth requests only identity scopes. The backend stores the resulting Forgejo token in an encrypted HttpOnly cookie and may use it only after enforcing public-repository checks for writes.
  • FORGEJO_TOKEN is optional and should be treated as a read-only local fallback for the public content cache. Browser sessions and API token calls may write issues/comments only after verifying the target repo is public.
  • /api/prototype uses a server-side cache for public Forgejo content. FORGEJO_CACHE_TTL_SECONDS=0 disables it; successful discussion replies invalidate it.
  • General discussion creation requires FORGEJO_GENERAL_DISCUSSION_REPO. Linked discussions are created in the content repo and include canonical app URLs in the Forgejo issue body.
  • Forgejo webhooks should POST to /api/forgejo/webhook; when FORGEJO_WEBHOOK_SECRET is set, the backend validates Forgejo/Gitea-style HMAC headers.
  • API clients can query with Authorization: token ... or Authorization: Bearer ....
  • CALENDAR_FEED_URLS is optional and accepts comma-separated webcal:// or https:// ICS feeds.
  • Do not commit .env, .env.local, or .env.proxmox.
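The webhook validation mentioned above can be sketched like this. It is a sketch, not the app's actual code, and it assumes the Gitea/Forgejo convention of sending the hex HMAC-SHA256 of the raw request body in the X-Gitea-Signature / X-Forgejo-Signature header:

```python
import hashlib
import hmac


def verify_forgejo_signature(raw_body: bytes, secret: str, signature: str) -> bool:
    """Check a Gitea/Forgejo-style webhook signature.

    Computes HMAC-SHA256 of the raw body with the shared secret and
    compares it to the header value in constant time.
    """
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Validating against the raw body bytes matters: re-serializing parsed JSON before hashing will produce a different digest and reject valid deliveries.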

Main Start Command

Use this for the normal local app flow:

./scripts/start.sh

What it does:

  1. Loads .env and .env.local if present.
  2. Builds the frontend with bun.
  3. Starts FastAPI with uvicorn.

Override host/port when needed:

HOST=0.0.0.0 PORT=8800 ./scripts/start.sh

Deployment Commands

Bootstrap Forgejo Actions SSH clone credentials:

export FORGEJO_API_TOKEN=...
./scripts/bootstrap_ci_clone_key.py

Bootstrap or rotate the Forgejo Actions LXC deploy credentials:

export FORGEJO_API_TOKEN=...
./scripts/bootstrap_lxc_deploy_key.py

Validate production environment before starting:

./scripts/check_deploy_config.py

Container deployment:

docker compose up --build -d
curl -fsS http://127.0.0.1:8800/health

Non-container production start after building frontend/dist:

HOST=0.0.0.0 PORT=8000 ./scripts/run_prod.sh

Current Proxmox Deployment

Current app host:

  • Proxmox node: proxmox
  • LXC VMID: 108
  • LXC hostname: robotu-app
  • LXC IP: 192.168.1.220/24
  • LXC gateway: 192.168.1.2
  • LXC DNS: 192.168.1.2
  • SSH target: root@192.168.1.220
  • App directory on LXC: /opt/robot-u-site
  • Public runtime URL: https://discourse.onl
  • Internal app URL: http://192.168.1.220:8800
  • Compose service: robot-u-site
  • Container port mapping: host 8800 to container 8000
  • Reverse proxy: LXC 102 routes discourse.onl to 192.168.1.220:8800
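The port mapping above corresponds to a compose entry of roughly this shape (illustrative fragment only; the real docker-compose.yml may differ):

```yaml
services:
  robot-u-site:
    build: .
    ports:
      - "8800:8000"   # host 8800 -> container 8000
    env_file:
      - .env
```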

The local .env.proxmox file contains Proxmox credentials and LXC settings. It is ignored by git and must not be printed, committed, or copied into the app container.

The deployed app uses /opt/robot-u-site/.env on the LXC. That file contains Forgejo OAuth settings, AUTH_SECRET_KEY, optional FORGEJO_TOKEN for the server-side public content cache, calendar feeds, and the deployed APP_BASE_URL. Treat it as secret material and do not print values.

The current deployed OAuth redirect URI is:

https://discourse.onl/api/auth/forgejo/callback

Forgejo OAuth sign-in from the public URL requires that exact callback URL to be allowed in the Forgejo OAuth app.

Important deployment notes:

  • The LXC was initially created with gateway/DNS 192.168.1.1, but this network uses 192.168.1.2. If package installs hang or outbound network fails, check ip route and /etc/resolv.conf first.
  • Proxmox persistent LXC config was updated so net0 uses gw=192.168.1.2, and nameserver is 192.168.1.2.
  • Docker inside the unprivileged LXC requires Proxmox features nesting=1,keyctl=1; those are set on the current container.
  • Ubuntu package installs were made reliable by adding /etc/apt/apt.conf.d/99force-ipv4 with Acquire::ForceIPv4 "true";.
  • The current LXC has 512MiB memory and 512MiB swap. It runs the app, but large builds or future services may need more memory.
  • FORGEJO_TOKEN is needed server-side if anonymous Forgejo API discovery returns no content. Without that token, /api/prototype can return zero courses/posts/discussions even though the app is healthy.

Useful checks:

ssh root@192.168.1.220 'cd /opt/robot-u-site && docker compose ps'
curl -fsS http://192.168.1.220:8800/health
curl -fsS https://discourse.onl/health
curl -fsS https://discourse.onl/api/prototype

Manual redeploy to the current LXC:

ssh root@192.168.1.220 'mkdir -p /opt/robot-u-site'
rsync -az --delete \
  --exclude='.git/' \
  --exclude='.venv/' \
  --exclude='__pycache__/' \
  --exclude='.pytest_cache/' \
  --exclude='.ruff_cache/' \
  --exclude='.env' \
  --exclude='.env.*' \
  --exclude='frontend/node_modules/' \
  --exclude='frontend/dist/' \
  --exclude='frontend/.vite/' \
  --exclude='examples/quadrature-encoder-course/' \
  ./ root@192.168.1.220:/opt/robot-u-site/
ssh root@192.168.1.220 'cd /opt/robot-u-site && ./scripts/check_deploy_config.py && docker compose up --build -d'
curl -fsS http://192.168.1.220:8800/health

Do not overwrite /opt/robot-u-site/.env during rsync. Update it deliberately when runtime config changes.

Current production env notes:

  • /opt/robot-u-site/.env should use APP_BASE_URL=https://discourse.onl.
  • AUTH_COOKIE_SECURE=true is required for the public HTTPS site.
  • CORS_ALLOW_ORIGINS=https://discourse.onl is the current public origin.
  • A pre-domain backup exists on the app LXC at /opt/robot-u-site/.env.backup.20260415T101957Z.

CI state:

  • .forgejo/workflows/ci.yml runs on docker.
  • The workflow checks PRs targeting main and pushes to main; deployment is explicitly gated to push events where github.ref == refs/heads/main.
  • The check job manually installs CI_REPO_SSH_KEY, clones git@aksal.cloud:Robot-U/robot-u-site.git, installs uv and Bun, then runs Python and frontend checks.
  • The deploy job runs after check on pushes to main, installs DEPLOY_SSH_KEY, clones the repo, rsyncs it to root@192.168.1.220:/opt/robot-u-site/, rebuilds Docker Compose, and checks /health.
  • Forgejo branch protection on main should block direct pushes and require the CI / check (pull_request) status check before PR merge.
  • The repo has a read-only deploy key and matching Forgejo Actions secret for CI clone.
  • The app LXC has a CI deploy public key in root's authorized_keys, and the matching private key is stored in the Forgejo Actions secret DEPLOY_SSH_KEY.
  • scripts/bootstrap_lxc_deploy_key.py recreates or rotates the LXC deploy key. It uses FORGEJO_API_TOKEN, appends the generated public key to the LXC user's authorized_keys, verifies SSH, and stores the generated private key in DEPLOY_SSH_KEY.
  • The deploy rsync excludes .env and .env.*, so production runtime secrets and backups on /opt/robot-u-site are preserved.
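The check/deploy gating described above corresponds to a workflow condition of roughly this shape (illustrative fragment, not the exact contents of .forgejo/workflows/ci.yml):

```yaml
jobs:
  check:
    runs-on: docker
    # install CI_REPO_SSH_KEY, clone, install uv + Bun, run quality checks
  deploy:
    needs: check
    runs-on: docker
    # deploy only on direct pushes to main, never on PR events
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    # install DEPLOY_SSH_KEY, rsync to the LXC, rebuild compose, check /health
```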

Reverse Proxy LXC 102

The reverse proxy host is Proxmox LXC 102:

  • LXC hostname: reverse-proxy
  • LXC IP: 192.168.1.203/24
  • Gateway: 192.168.1.2
  • Main jobs: nginx reverse proxy, LiteLLM proxy, and custom Porkbun DDNS script
  • nginx service: nginx.service
  • LiteLLM service: litellm.service
  • Porkbun service: porkbun-ddns.service
  • Robot U public site: discourse.onl
  • Robot U nginx config: /etc/nginx/sites-available/discourse.onl
  • Robot U certificate: /etc/letsencrypt/live/discourse.onl/
  • Robot U upstream: http://192.168.1.220:8800

Do not bundle unrelated maintenance. If asked to update LiteLLM, do not change nginx or Porkbun DNS config unless explicitly requested. As of the last LiteLLM update, porkbun-ddns.service was failed and was intentionally left untouched.

The discourse.onl nginx site was created on April 15, 2026 following the existing aksal.cloud pattern:

nginx -t && systemctl reload nginx
certbot --nginx -d discourse.onl --redirect --non-interactive

Certbot issued a Let's Encrypt certificate expiring on July 14, 2026. Validate the route with:

curl -fsS https://discourse.onl/health
curl -fsS -o /tmp/discourse-home.html -w '%{http_code} %{content_type}\n' https://discourse.onl/

curl -I https://discourse.onl/ returns 405 because the FastAPI app does not handle HEAD; use GET-based checks instead.

The discourse.onl Porkbun DDNS copy is intentionally separate from the existing aksal.* setup:

  • Script directory: /opt/porkbun-ddns-discourse-onl
  • Service user/group: porkbun-discourse:porkbun-discourse
  • Service: porkbun-ddns-discourse-onl.service
  • Timer: porkbun-ddns-discourse-onl.timer
  • Managed records: A discourse.onl and A *.discourse.onl
  • Current managed IP as of setup: 64.30.74.112

The discourse.onl copy of updateDNS.sh was patched locally to make Porkbun curl calls use --fail and stronger retries, preventing transient 503 HTML bodies from being concatenated with JSON. A PR with the same fix was opened against the upstream Porkbun DDNS repo: https://aksal.cloud/Amargius_Commons/porkbun_ddns_script/pulls/1.

Direct SSH to root@192.168.1.203, litellm@192.168.1.203, or root@192.168.1.200 may not work from this workspace. If SSH fails, use the Proxmox API credentials in the ignored .env.proxmox file to open a Proxmox node terminal and run pct exec 102 -- ....

Proxmox API terminal access pattern:

  1. Read .env.proxmox; never print credentials.
  2. POST /api2/json/access/ticket with the Proxmox username/password.
  3. POST /api2/json/nodes/proxmox/termproxy using the returned ticket and CSRF token.
  4. Connect to wss://<proxmox-host>:8006/api2/json/nodes/proxmox/vncwebsocket?port=<port>&vncticket=<ticket>.
  5. Send binary login payload root@pam:<term-ticket>\n; expect OK.
  6. Send shell commands through the xterm websocket protocol: command payloads are framed as 0:<byte-length>:<command>, followed by 0:1:\n.
  7. Prefer adding a unique sentinel to each command so the runner can detect completion instead of treating websocket read timeouts as command failure.
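Steps 5 through 7 can be sketched in Python. The helpers below only build the payloads; the websocket plumbing is omitted, and this is a sketch of the framing convention described above, not a tested client:

```python
def login_payload(username: str, term_ticket: str) -> bytes:
    """Step 5: binary login payload sent after the websocket opens."""
    return f"{username}:{term_ticket}\n".encode()


def frame_command(command: str) -> list[bytes]:
    """Step 6: frame a shell command for the xterm websocket protocol.

    Each payload is '0:<byte-length>:<data>'; the command is followed by
    a separately framed newline.
    """
    data = command.encode()
    return [
        b"0:" + str(len(data)).encode() + b":" + data,
        b"0:1:\n",
    ]


def with_sentinel(command: str, sentinel: str) -> str:
    """Step 7: append a unique sentinel so the runner can detect command
    completion in the output stream instead of treating websocket read
    timeouts as failure."""
    return f"{command}; echo {sentinel}"
```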

Useful discovery commands from the Proxmox node shell:

pct status 102
pct config 102
pct exec 102 -- bash -lc 'hostname; systemctl list-units --type=service --all --no-pager | grep -Ei "lite|llm|nginx|porkbun|dns"'
pct exec 102 -- bash -lc 'systemctl status litellm --no-pager; systemctl cat litellm --no-pager'

LiteLLM current layout:

  • Service unit: /etc/systemd/system/litellm.service
  • Service user/group: litellm:litellm
  • Working directory: /opt/litellm/
  • Virtualenv: /opt/litellm/venv
  • Config file: /opt/litellm/config.yaml
  • Service command: /opt/litellm/venv/bin/litellm --config /opt/litellm/config.yaml --port 4000
  • Local liveliness check: http://127.0.0.1:4000/health/liveliness
  • Local readiness check: http://127.0.0.1:4000/health/readiness

LiteLLM update checklist:

  1. Inspect current state and versions.
pct exec 102 -- bash -lc '/opt/litellm/venv/bin/python -m pip show litellm; curl -fsS -m 5 http://127.0.0.1:4000/health/liveliness'
  2. Back up config and installed package set.
pct exec 102 -- bash -lc 'set -euo pipefail; stamp=$(date -u +%Y%m%dT%H%M%SZ); mkdir -p /opt/litellm/backups; cp -a /opt/litellm/config.yaml /opt/litellm/backups/config.yaml.$stamp; /opt/litellm/venv/bin/python -m pip freeze > /opt/litellm/backups/pip-freeze.$stamp.txt; chown -R litellm:litellm /opt/litellm/backups'
  3. Stop LiteLLM before upgrading. Container 102 has only 512MiB RAM and tends to use swap; stopping the proxy keeps pip from competing with the running process.
pct exec 102 -- bash -lc 'systemctl stop litellm; systemctl is-active litellm || true'
  4. Upgrade pip and LiteLLM as the litellm user.
pct exec 102 -- bash -lc 'set -euo pipefail; runuser -u litellm -- /opt/litellm/venv/bin/python -m pip install --upgrade pip; runuser -u litellm -- /opt/litellm/venv/bin/python -m pip install --upgrade "litellm[proxy]"'
  5. Restart and verify.
pct exec 102 -- bash -lc 'set -euo pipefail; systemctl start litellm; sleep 8; systemctl is-active litellm; /opt/litellm/venv/bin/python -m pip show litellm | sed -n "1,8p"; curl -fsS -m 10 http://127.0.0.1:4000/health/liveliness; echo; curl -fsS -m 10 http://127.0.0.1:4000/health/readiness; echo; /opt/litellm/venv/bin/python -m pip check; systemctl show litellm -p ActiveState -p SubState -p NRestarts -p MainPID -p ExecMainStatus --no-pager'

After the April 15, 2026 update, LiteLLM was upgraded from 1.81.15 to 1.83.7, /health/liveliness returned "I'm alive!", /health/readiness reported db=connected, and pip check reported no broken requirements. Startup logs may briefly print Unable to connect to DB. DATABASE_URL found in environment, but prisma package not found.; treat readiness and the Prisma process/import check as the source of truth before deciding it is an actual failure.

Development Commands

Backend only

.venv/bin/python -m uvicorn app:app --reload

Frontend only

cd frontend
~/.bun/bin/bun run dev

Frontend production build

cd frontend
~/.bun/bin/bun run build

Quality Checks

Run both before pushing:

./scripts/check_python_quality.sh
./scripts/check_frontend_quality.sh

Product/Data Model Background

  • Public non-fork repos are scanned.
  • A repo with /lessons/ is treated as a course repo.
  • A repo with /blogs/ is treated as a post repo.
  • Lessons are discovered from lessons/<chapter>/<lesson>/.
  • Each lesson folder is expected to contain one markdown file plus optional assets.
  • Frontmatter is used when present for title and summary.
  • Discussions are loaded from Forgejo issues and comments.
  • Issue bodies are scanned for canonical post/lesson URLs and Forgejo file URLs to connect discussions back to content.
  • Calendar events are loaded from ICS feeds, not managed in-app.

UI Expectations

  • The UI should not expose Forgejo as a user-facing implementation detail unless necessary for debugging.
  • Course cards should open course pages.
  • Lesson rows should open lesson pages.
  • Discussion pages should focus on one thread at a time.
  • Markdown should render as readable content, not raw source.

Push Workflow

The site source repo currently lives at:

  • git@aksal.cloud:Robot-U/robot-u-site.git

Typical push flow:

git status
git add ...
git commit -m "..."
git push origin main