ImaGen #8: imagen.jobs queue + imagen worker subcommand (write path for flexsiebels viewer)
Goal
Let users (m, via the flexsiebels owner-mode UI) trigger image generations from a web form instead of the CLI. Async job-queue architecture: flexsiebels INSERTs a row into `imagen.jobs`; a persistent `imagen worker` daemon on mRiver picks it up via Postgres NOTIFY (+ 5s safety poll), runs the existing generation pipeline, writes the result into `imagen.images` (same path as #7), and updates the job status. flexsiebels polls the job id and renders the result when the status flips to `done`.
Joint plan negotiated head-to-head with paul (flexsiebels/head): mai messages 1626 / 1627 / 1628. m's ask: 2026-05-11 10:08 ("can we make it that I can also prompt imagen from the website?!").
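The job lifecycle this flow implies can be sketched in Go as a small status type with a table of allowed transitions. This is a minimal illustration, assuming the four status values used in this issue (`pending`, `running`, `done`, `failed`); the names `Status` and `CanTransition` are hypothetical and not taken from the repository.

```go
package main

import "fmt"

// Status mirrors the values of imagen.jobs.status used in this issue.
// The type and helper names are hypothetical, for illustration only.
type Status string

const (
	Pending Status = "pending"
	Running Status = "running"
	Done    Status = "done"
	Failed  Status = "failed"
)

// allowed lists the forward transitions pending → running → done|failed.
var allowed = map[Status][]Status{
	Pending: {Running},
	Running: {Done, Failed},
}

// CanTransition reports whether a job may move from one status to another.
func CanTransition(from, to Status) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Pending, Running)) // true
	fmt.Println(CanTransition(Running, Done))    // true
	fmt.Println(CanTransition(Done, Running))    // false
}
```

Keeping the transitions in one table makes it easy for both the worker and a polling UI to agree on which states are terminal.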
Why a queue, not a sync HTTP API
- `pending → running → done|failed` state machine matches what the flexsiebels UI wants to render (pending cards that flip to rendered images).
Scope
1. Schema migration
Apply via `mcp__supabase__apply_migration`. Paul's sketch verbatim:
RLS mirroring `imagen.images`: `owner_user_id = auth.uid()` for SELECT + INSERT (lets flexsiebels' authenticated user insert + read their own jobs). Worker writes (status updates, `image_id` link, error) go through the service role, which bypasses RLS.
Grants: USAGE on `imagen` already done in #7; add DML on `imagen.jobs` for authenticated + service_role per the #7 pattern.
PostgREST exposure is already done from #7 (`imagen` is in `pgrst.db_schemas`). New tables are picked up automatically; no NOTIFY pgrst reload is needed.
2. Postgres NOTIFY trigger
A trigger on `imagen.jobs` INSERT that issues `NOTIFY imagen_jobs, <job_id>`. The worker LISTENs on this channel for low-latency pickup. Use `pg_notify('imagen_jobs', NEW.id::text)` in an AFTER INSERT trigger.
3. `imagen worker` subcommand
New `cmd/imagen/worker.go`. Long-running daemon:
- `internal/cloud/` client pattern (service-role).
- `LISTEN imagen_jobs` channel via pgx or libpq. On notification: try to claim the job (`UPDATE ... SET status='running', started_at=now() WHERE id=$1 AND status='pending' RETURNING *`). The UPDATE-returning pattern means concurrent workers can't double-claim.
- `pending` rows older than 5s that no LISTEN delivered (handles NOTIFY drops + cold start).
- `imagen generate` uses internally: backend dispatch, prompt enrichment, output writer, cloud-sync from #7). All of #7's output handling runs as-is and produces the `imagen.images` row.
- `UPDATE imagen.jobs SET status='done', completed_at=now(), image_id=$1 WHERE id=$2`.
- `UPDATE imagen.jobs SET status='failed', completed_at=now(), error=$1 WHERE id=$2`. Capture the error message but not secrets / stack traces; caller-facing summary only.
4. systemd unit
New `scripts/imagen-worker.service` for mRiver:
Install via `systemctl --user enable imagen-worker.service` on mRiver. Document in `docs/setup-worker-mriver.md`.
Environment file path: must contain `SUPABASE_URL`, `SUPABASE_SERVICE_ROLE_KEY`, and any backend creds (REPLICATE_API_TOKEN if Replicate is enabled). Note: never commit the .env file itself; only the systemd unit references the path.
5. Tests
- `cmd/imagen/worker_test.go`: unit tests for the claim/release logic, status transitions, NOTIFY parsing. Mocked `pgx` connection. No real Postgres in CI.
- `internal/cloud/cloud_test.go` (existing): no changes; cloud-sync stays the same.
- `imagen.images` row lands and job status flips to `done`. Run only when `IMAGEN_WORKER_INTEGRATION=1`.
6. Smoke test
With the worker running (`systemctl --user start imagen-worker`), INSERT a test job via mcp__supabase:
Expected within ~10s: job status `done`, `image_id` populated, `imagen.images` row + Storage object + signed URL all green.
Acceptance criteria
- `imagen.jobs` table exists in msupabase per the schema above, RLS + indexes in place, INSERT NOTIFY trigger on the channel `imagen_jobs`.
- `imagen worker` subcommand built into the CLI; `systemctl --user start imagen-worker.service` on mRiver runs it as a daemon.
- `pending` row into `imagen.jobs` triggers a generation within ~10s (~5s LISTEN latency budget + ~8s FLUX schnell render).
- `pending → running → done` with `started_at` / `completed_at` set, and the `image_id` links to a real `imagen.images` row written by the worker via the same cloud-sync path from #7.
- `failed` with error text; worker process restart picks up where it left off (no orphaned `running` jobs).
- `go build ./... && go test ./...` clean.
Out of scope
- `imagen.images` → `flexsiebels.images`): still deferred from #7's spec.
Refs
Workflow
Coder/gitster role. Phasing:
- `imagen worker` subcommand + tests + systemd unit + env-file template.
Head reviews + merges --no-ff into main + comments + applies `done` label.
Phase 1+2+3 done end-to-end. Commit 2758c5a on branch mai/hermes/issue-8-imagen-8-imagen.
Phase 1 — schema
Migration `imagen_jobs_init` applied to msupabase:
- `imagen.jobs` per the spec (UUID PK, owner_user_id NOT NULL FK to `auth.users`, status CHECK, `image_id` FK to `imagen.images` ON DELETE SET NULL)
- `imagen_jobs_status_pending_idx` (partial on pending) + `imagen_jobs_owner_recent_idx`
- `uid()`: same shape as `imagen.images` from "ImaGen #7: cloud-sync (Supabase Storage + imagen.images schema) for the flexsiebels viewer"
- `authenticated` + `service_role`
- `imagen_jobs_notify_insert` → `pg_notify('imagen_jobs', NEW.id::text)`
- `imagen` schema already exposed by PostgREST from #7; new table auto-picks up
Phase 2 — `imagen worker` subcommand
- `internal/worker` is a DB-agnostic loop over a `Queue` interface. Job-scoped contexts derive from `Background()` so SIGTERM lets the in-flight generation finish, leaving no half-state on shutdown. Drains the whole pending backlog on every wake. `ResetStaleRunning()` at startup unsticks rows left over from a crash. Eight unit tests cover done / failed / missing-id / drain / NOTIFY-wake / shutdown / transient-error paths against a fake queue (no real Postgres in CI).
- `cmd/imagen/worker.go` ships the pgx implementation (single dedicated conn for LISTEN + UPDATE) plus `workerPipeline` reusing `buildBackend` + `attachUsageSink` + `prompt.Apply` + `buildWriter` + `maybeCloudSync`. Per-job `owner_user_id` overrides the env fallback so multi-user worlds attribute correctly.
- `maybeCloudSync` now returns `(*cloud.SyncResult, error)` so the worker can link `imagen.jobs.image_id` to the inserted `imagen.images` row. CLI `generate` keeps its stderr summary unchanged.
- `scripts/imagen-worker.service` (systemd `--user`) + `scripts/imagen-worker.env.example`. EnvironmentFile lives in `~/.dotfiles/.env.imagen-worker` and is never committed.
- `docs/setup-worker-mriver.md` walks through install + the spec's SQL-INSERT smoke; `docs/architecture.md` grows an "async write path" section.
Phase 3 — end-to-end smoke
Two passes, both green:
- `TestWorker_Integration_EndToEnd` (env-guarded by `IMAGEN_WORKER_INTEGRATION=1`) inserts one real job into `imagen.jobs`, lets the worker claim + process via the mock backend, asserts `status='done'` + linked `image_id`, then cleans up. 0.54s.
- `./bin/imagen worker --poll-interval=2s` in the background, INSERTed a job via `mcp__supabase`, watched pickup + done within ~74ms (NOTIFY round-trip ~7ms). Failure path verified by inserting a job with `backend='nonexistent-backend'` → `status='failed'`, full error text in the `error` column. SIGTERM exits cleanly. Test rows deleted afterwards.
Build + tests
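For readers unfamiliar with the env guard used by the integration test, the opt-in check is commonly written along the following lines. This is only a sketch of the general pattern: the helper name `shouldRunIntegration` is hypothetical, and the actual test in the repository may be structured differently.

```go
package main

import (
	"fmt"
	"os"
)

// integrationEnv is the opt-in variable named in this issue.
const integrationEnv = "IMAGEN_WORKER_INTEGRATION"

// shouldRunIntegration reports whether the variable is set to exactly "1".
// The lookup function is injected so the check can be exercised without
// touching the real process environment. (Hypothetical helper.)
func shouldRunIntegration(getenv func(string) string) bool {
	return getenv(integrationEnv) == "1"
}

func main() {
	// In a _test.go file the same check would typically guard the test body:
	//   if !shouldRunIntegration(os.Getenv) { t.Skip("set IMAGEN_WORKER_INTEGRATION=1 to run") }
	if !shouldRunIntegration(os.Getenv) {
		fmt.Println("skipping integration test")
		return
	}
	fmt.Println("running integration test against a real database")
}
```

With a guard like this, a plain `go test ./...` in CI never needs a real Postgres, while a developer can opt in locally by exporting the variable.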
Acceptance criteria
- `imagen.jobs` table + RLS + indexes + INSERT NOTIFY trigger live: yes.
- `imagen worker` subcommand built + run as a daemon (systemd unit in `scripts/`): yes.
- `pending → running → done` transitions with timestamps, `image_id` links into a real `imagen.images` row written through #7's cloud-sync path: yes.
- `status='failed'` with error text: yes. `ResetStaleRunning` covers worker-restart resilience for rows left in `running`.
- `go build ./... && go test ./...` clean: yes.
Out of scope (carried through from spec)
- `FOR UPDATE SKIP LOCKED`, so it's cheap to add later)
- `imagen.images` → `flexsiebels.images`
Deployment note
The systemd unit + env template are committed but not yet installed on mRiver. m can do `systemctl --user enable --now imagen-worker.service` per `docs/setup-worker-mriver.md` when ready, or leave it parked until the flexsiebels-side UI lands.
Merged into main + worker live on mRiver
Branch `mai/hermes/issue-8-imagen-8-imagen` merged via `--no-ff` at `dbe1704`. Pushed to origin/main.
`2758c5a`
Acceptance criteria
- `imagen.jobs` table + RLS + indexes + NOTIFY trigger: `imagen_jobs_init` applied; `imagen_jobs_notify_insert` trigger calls `pg_notify('imagen_jobs', NEW.id::text)` on AFTER INSERT
- `imagen worker` subcommand running as systemd on mRiver: `~/.config/systemd/user/imagen-worker.service` installed, enabled, active; logs: `imagen worker: ready (poll-interval 5s job-timeout 5m0s)`
- `aa41e87e-e80c-445d-9ed0-597cd7c6f3ed` end-to-end in 8.3s (INSERT 10:25:37.558 → done 10:25:45.852); NOTIFY latency <5ms
- `image_id=19397d4c-2e76-442c-8e2b-2b8742bb1544`, Storage object `2026-05-11/a-tiny-owl-wearing-wire-rim-glasses-phot-4824635221023530683.png`
- `ResetStaleRunning` on worker startup flips orphaned `running` rows back to `pending`; backend errors set `status='failed'` with error text
- `internal/worker/worker_test.go` covers claim logic, status transitions, NOTIFY parsing; `worker_integration_test.go` (env-guarded by `IMAGEN_WORKER_INTEGRATION=1`) for real DB path
- `go build ./... && go test ./...` clean
Architecture summary (for future reference)
Operations
- `systemctl --user status imagen-worker`
- `journalctl --user -u imagen-worker -f`
- `systemctl --user restart imagen-worker`
- `docs/setup-worker-mriver.md`
- `~/.dotfiles/.env.imagen-worker`
Coordinated with flexsiebels-side
`1a062b7` (paul, knuth): `/imagine/new` form route, POST/GET jobs API endpoints behind requireAuth, pending-card polling UI on `/imagine`. Awaiting Dokploy redeploy for the visual smoke at https://flexsiebels.de/imagine.
Cross-link