ImaGen #8: imagen.jobs queue + imagen worker subcommand (write path for flexsiebels viewer)

Open · opened 2026-05-11 08:12:24 +00:00 by mAi (Collaborator) · 2 comments

Goal

Let users (m, via the flexsiebels owner-mode UI) trigger image generations from a web form instead of the CLI. Async job-queue architecture: flexsiebels INSERTs a row into imagen.jobs, a persistent imagen worker daemon on mRiver picks it up via Postgres NOTIFY (+ 5s safety poll), runs the existing generation pipeline, writes the result into imagen.images (same path as #7), and updates the job status. flexsiebels polls the job-id and renders the result when status flips to done.

Joint plan negotiated head-to-head with paul (flexsiebels/head) — mai messages 1626 / 1627 / 1628. m's ask: 2026-05-11 10:08 ("can we make it that I can also prompt imagen from the website?!").

Why a queue, not a sync HTTP API

  • Generations take 8-30s (FLUX schnell on mRock ~8s, FLUX dev / Replicate up to 30s). Holding a SvelteKit request open that long is bad UX and forecloses batch / scheduled / retry runs.
  • An async queue gives us batch capability for free, robust retries, and the ability to add scheduled jobs later.
  • A pending → running → done|failed state machine matches what the flexsiebels UI wants to render (pending cards that flip to rendered images).
  • Polling at 2s intervals from flexsiebels is cheap, obvious, debuggable. No WebSocket / Supabase Realtime dependency.
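
As a rough illustration of the state machine in the third bullet, the Go sketch below encodes the allowed transitions and the condition under which a poller can stop. The type and function names are made up for this example and are not taken from the ImaGen or flexsiebels code.

```go
package main

// Status mirrors the four values a job can hold: pending, running, done, failed.
type Status string

const (
	StatusPending Status = "pending"
	StatusRunning Status = "running"
	StatusDone    Status = "done"
	StatusFailed  Status = "failed"
)

// transitions lists the forward moves of the pending -> running -> done|failed machine.
var transitions = map[Status][]Status{
	StatusPending: {StatusRunning},
	StatusRunning: {StatusDone, StatusFailed},
}

// CanTransition reports whether a job may move from one status to another.
func CanTransition(from, to Status) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

// IsTerminal reports whether a poller can stop asking about this job.
func IsTerminal(s Status) bool {
	return s == StatusDone || s == StatusFailed
}
```

Under this sketch a client would keep polling every 2s until IsTerminal returns true, then render either the image or the error.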

Scope

1. Schema migration

Apply via mcp__supabase__apply_migration. Paul's sketch verbatim:

CREATE TABLE imagen.jobs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  owner_user_id UUID NOT NULL REFERENCES auth.users(id),
  prompt TEXT NOT NULL,
  backend TEXT NOT NULL,            -- 'flux-schnell-local' | 'flux-schnell-replicate' | ...
  model TEXT,
  width INT, height INT,
  steps INT,
  seed BIGINT,                      -- NULL = random
  style TEXT,                       -- optional preset/tag, free-form for now
  status TEXT NOT NULL DEFAULT 'pending',
  error TEXT,
  image_id UUID REFERENCES imagen.images(id) ON DELETE SET NULL,
  created_at TIMESTAMPTZ DEFAULT now(),
  started_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ,
  CHECK (status IN ('pending','running','done','failed'))
);
CREATE INDEX imagen_jobs_status_pending_idx ON imagen.jobs(created_at) WHERE status='pending';
CREATE INDEX imagen_jobs_owner_recent_idx ON imagen.jobs(owner_user_id, created_at DESC);

RLS mirroring imagen.images: owner_user_id = auth.uid() for SELECT + INSERT (lets flexsiebels' authenticated user insert + read their own jobs). Worker writes (status updates, image_id link, error) via service-role which bypasses RLS.

Grants: USAGE on imagen already done in #7; add DML on imagen.jobs for authenticated + service_role per the #7 pattern.

PostgREST exposure is already done from #7 (imagen is in pgrst.db_schemas). New tables auto-pick-up — no NOTIFY pgrst reload needed.

2. Postgres NOTIFY trigger

A trigger on imagen.jobs INSERT that issues NOTIFY imagen_jobs, <job_id>. The worker LISTENs on this channel for low-latency pickup. Use pg_notify('imagen_jobs', NEW.id::text) in an AFTER INSERT trigger.
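
On the receiving side, the notification payload is just text, so the worker may want to validate it before using it as a job id. The sketch below is one possible check, assuming the payload is the canonical lowercase text form of a UUID as produced by NEW.id::text; ParseJobID is a hypothetical helper name, not a function from the ImaGen codebase.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidRe matches the canonical 8-4-4-4-12 hexadecimal UUID text form.
var uuidRe = regexp.MustCompile(`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)

// ParseJobID normalizes a NOTIFY payload and rejects anything that is not a UUID.
func ParseJobID(payload string) (string, error) {
	id := strings.ToLower(strings.TrimSpace(payload))
	if !uuidRe.MatchString(id) {
		return "", fmt.Errorf("unexpected NOTIFY payload %q", payload)
	}
	return id, nil
}
```

A malformed payload can simply be logged and dropped; the safety poll still finds the pending row.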

3. imagen worker subcommand

New cmd/imagen/worker.go. Long-running daemon:

  • Connects to msupabase via the existing internal/cloud/ client pattern (service-role).
  • LISTEN imagen_jobs channel via pgx or libpq. On notification: try to claim the job (UPDATE ... SET status='running', started_at=now() WHERE id=$1 AND status='pending' RETURNING *) — the UPDATE-returning pattern means concurrent workers can't double-claim.
  • 5s safety poll fallback: every 5s, also scan for any pending rows older than 5s that no notification delivered (handles NOTIFY drops + cold start).
  • For each claimed job: run it through the existing imagen generation pipeline (the same code path that imagen generate uses internally: backend dispatch, prompt enrichment, output writer, cloud-sync from #7). All of #7's output handling runs as-is and produces the imagen.images row.
  • On success: UPDATE imagen.jobs SET status='done', completed_at=now(), image_id=$1 WHERE id=$2.
  • On failure: UPDATE imagen.jobs SET status='failed', completed_at=now(), error=$1 WHERE id=$2. Capture the error message but not secrets / stack traces — caller-facing summary only.
  • Graceful shutdown on SIGTERM: finish the current job (don't leave half-state), then exit.
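
The bullets above can be sketched as a small loop that is independent of the database driver. This is a minimal sketch under stated assumptions, not the planned implementation: the Queue interface, Generate type, MemQueue fake, and demo function are all hypothetical names, and the real worker would back Queue with pgx and feed the wake channel from LISTEN.

```go
package main

import (
	"context"
	"time"
)

// Job holds the imagen.jobs columns this sketch needs.
type Job struct{ ID, Prompt string }

// Queue is a hypothetical abstraction over imagen.jobs.
type Queue interface {
	// ClaimNext flips one pending row to running; ok is false when nothing is pending.
	ClaimNext(ctx context.Context) (job Job, ok bool, err error)
	MarkDone(ctx context.Context, jobID, imageID string) error
	MarkFailed(ctx context.Context, jobID, summary string) error
}

// Generate stands in for the existing pipeline and returns the imagen.images id.
type Generate func(ctx context.Context, job Job) (imageID string, err error)

// Drain claims and processes pending jobs until the queue is empty or ctx is cancelled.
func Drain(ctx context.Context, q Queue, gen Generate) error {
	for ctx.Err() == nil {
		// Jobs run on a context that shutdown does not cancel,
		// so a claimed job always reaches done or failed.
		jobCtx := context.Background()
		job, ok, err := q.ClaimNext(jobCtx)
		if err != nil || !ok {
			return err
		}
		if imageID, genErr := gen(jobCtx, job); genErr != nil {
			err = q.MarkFailed(jobCtx, job.ID, genErr.Error())
		} else {
			err = q.MarkDone(jobCtx, job.ID, imageID)
		}
		if err != nil {
			return err
		}
	}
	return nil
}

// Run drains on startup, then again on every NOTIFY wake-up and every poll tick.
func Run(ctx context.Context, q Queue, gen Generate, wake <-chan struct{}, poll time.Duration) error {
	ticker := time.NewTicker(poll)
	defer ticker.Stop()
	for {
		if err := Drain(ctx, q, gen); err != nil {
			return err
		}
		select {
		case <-ctx.Done():
			return nil
		case <-wake: // a NOTIFY arrived on imagen_jobs
		case <-ticker.C: // safety poll
		}
	}
}

// MemQueue is an in-memory Queue used only to exercise the loop.
type MemQueue struct {
	Pending []Job
	Status  map[string]string
}

func (q *MemQueue) ClaimNext(context.Context) (Job, bool, error) {
	if len(q.Pending) == 0 {
		return Job{}, false, nil
	}
	job := q.Pending[0]
	q.Pending = q.Pending[1:]
	q.Status[job.ID] = "running"
	return job, true, nil
}

func (q *MemQueue) MarkDone(_ context.Context, jobID, _ string) error {
	q.Status[jobID] = "done"
	return nil
}

func (q *MemQueue) MarkFailed(_ context.Context, jobID, _ string) error {
	q.Status[jobID] = "failed"
	return nil
}

// demo runs two jobs, the second of which fails, and returns the final statuses.
func demo() map[string]string {
	q := &MemQueue{Pending: []Job{{ID: "a"}, {ID: "b"}}, Status: map[string]string{}}
	gen := func(_ context.Context, j Job) (string, error) {
		if j.ID == "b" {
			return "", context.DeadlineExceeded // stand-in for a backend error
		}
		return "img-" + j.ID, nil
	}
	if err := Drain(context.Background(), q, gen); err != nil {
		panic(err)
	}
	return q.Status
}
```

Keeping the loop behind an interface is what lets the unit tests run against a fake queue with no real Postgres in CI.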

4. systemd unit

New scripts/imagen-worker.service for mRiver:

[Unit]
Description=ImaGen worker (consumes imagen.jobs queue)
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/home/m/dev/ImaGen/bin/imagen worker
Restart=on-failure
RestartSec=5
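# Env file holds the service-role key and the msupabase URL.
# (systemd has no inline comments; a trailing "# ..." would become part of the path.)
EnvironmentFile=/home/m/.dotfiles/.env.imagen-worker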
# User= is omitted: a systemd --user unit already runs as the invoking user (m).

[Install]
WantedBy=default.target

Install via systemctl --user enable imagen-worker.service on mRiver. Document in docs/setup-worker-mriver.md.

The environment file must contain SUPABASE_URL, SUPABASE_SERVICE_ROLE_KEY, and any backend credentials (REPLICATE_API_TOKEN if Replicate is enabled). Never commit the .env file itself; only its path appears in the repo, via the systemd unit.

5. Tests

  • cmd/imagen/worker_test.go: unit tests for the claim/release logic, status transitions, NOTIFY parsing. Mocked pgx connection. No real Postgres in CI.
  • internal/cloud/cloud_test.go (existing): no changes; cloud-sync stays the same.
  • Integration test (env-guarded): one real job-row INSERT, worker picks it up, verifies imagen.images row lands and job status flips to done. Run only when IMAGEN_WORKER_INTEGRATION=1.

6. Smoke test

With the worker running (systemctl --user start imagen-worker), INSERT a test job via mcp__supabase:

INSERT INTO imagen.jobs (owner_user_id, prompt, backend, width, height)
VALUES ('ac6c9501-3757-4a6d-8b97-2cff4288382b', 'a tiny owl wearing wire-rim glasses, photo', 'flux-schnell-local', 1024, 1024);

Expected within ~10s: job status done, image_id populated, imagen.images row + Storage object + signed URL all green.

Acceptance criteria

  1. imagen.jobs table exists in msupabase per the schema above, RLS + indexes in place, INSERT NOTIFY trigger on the channel imagen_jobs.
  2. imagen worker subcommand built into the CLI; systemctl --user start imagen-worker.service on mRiver runs it as a daemon.
  3. Inserting a pending row into imagen.jobs triggers a generation within ~10s (~5s LISTEN latency budget + ~8s FLUX schnell render).
  4. Job rows transition pending → running → done with started_at / completed_at set, and the image_id links to a real imagen.images row written by the worker via the same cloud-sync path from #7.
  5. Failure paths: backend unreachable → job status failed with error text; worker process restart picks up where it left off (no orphaned running jobs).
  6. Unit tests cover claim logic and status transitions; one env-guarded integration test exercises the full path.
  7. go build ./... && go test ./... clean.
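
For criterion 5, the error text written to the job row should stay a caller-facing summary without secrets or stack traces, as required in the worker scope. One possible way to derive such a summary is sketched below; SummarizeError and its parameters are hypothetical, and the list of secret values is assumed to come from the worker's own environment.

```go
package main

import "strings"

// SummarizeError reduces a raw error message to a single caller-facing line.
// It keeps only the first line (dropping any stack trace), masks known secret
// values, and truncates the result to maxLen runes when maxLen is positive.
func SummarizeError(msg string, secrets []string, maxLen int) string {
	if i := strings.IndexAny(msg, "\r\n"); i >= 0 {
		msg = msg[:i]
	}
	for _, s := range secrets {
		if s != "" {
			msg = strings.ReplaceAll(msg, s, "[redacted]")
		}
	}
	msg = strings.TrimSpace(msg)
	if r := []rune(msg); maxLen > 0 && len(r) > maxLen {
		msg = string(r[:maxLen]) + "..."
	}
	return msg
}
```

The result would be the value bound to error=$1 in the failure UPDATE; the full error can still go to the worker's own log.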

Out of scope

  • The flexsiebels-side form + polling endpoints + pending-card UI — that's m/flexsiebels.de#65 (paul's side, knuth implementing).
  • Job cancellation / kill switch — separate follow-up if m wants it.
  • Concurrent workers / multi-mRiver scale-out — single worker is fine for v1.
  • Job-history retention policies — leave forever for now; m can prune via SQL later if needed.
  • Scheduled / recurring jobs — separate future issue once the queue ships.
  • Promotion path (imagen.images → flexsiebels.images), still deferred from #7's spec.

Refs

  • Joint design: mai messages 1626 (head → paul, scoping) + 1627 (paul → head, A2 chosen + schema + worker model + UX) + 1628 (head → paul, confirmation).
  • m's ask: 2026-05-11 10:08 ("can we make it that I can also prompt imagen from the website?!").
  • m/flexsiebels.de#65 — sibling issue, write-path UI on the flexsiebels side. paul filing it in parallel.
  • ImaGen#7 — data plane and cloud-sync that this issue reuses verbatim.
  • ImaGen#6: superseded; this issue + #7 + flexsiebels.de#64 + flexsiebels.de#65 together replace it.

Workflow

Coder/gitster role. Phasing:

  • Phase 1: schema migration + NOTIFY trigger + RLS + grants. Ping head with DONE-PHASE-1 so paul/knuth can start scaffolding flexsiebels' INSERT path against the real schema.
  • Phase 2: imagen worker subcommand + tests + systemd unit + env-file template.
  • Phase 3: one end-to-end smoke via SQL INSERT → worker pickup → image row.

Head reviews + merges --no-ff into main + comments + applies done label.

mAi self-assigned this 2026-05-11 08:12:30 +00:00
Comment by mAi (Author, Collaborator):

Phase 1+2+3 done end-to-end. Commit 2758c5a on branch mai/hermes/issue-8-imagen-8-imagen.

Phase 1 — schema

Migration imagen_jobs_init applied to msupabase:

  • imagen.jobs per the spec (UUID PK, owner_user_id NOT NULL FK to auth.users, status CHECK, image_id FK to imagen.images ON DELETE SET NULL)
  • imagen_jobs_status_pending_idx (partial on pending) + imagen_jobs_owner_recent_idx
  • RLS owner-scoped SELECT + INSERT via uid(), same shape as imagen.images from #7
  • Grants: SELECT/INSERT/UPDATE/DELETE to authenticated + service_role
  • AFTER INSERT trigger imagen_jobs_notify_insert → pg_notify('imagen_jobs', NEW.id::text)
  • imagen schema already exposed by PostgREST from #7 — new table auto-picks up

Phase 2 — imagen worker subcommand

  • internal/worker is a DB-agnostic loop over a Queue interface. Job-scoped contexts derive from Background() so SIGTERM lets the in-flight generation finish — no half-state on shutdown. Drains the whole pending backlog on every wake. ResetStaleRunning() at startup unsticks rows left over from a crash. Eight unit tests cover done / failed / missing-id / drain / NOTIFY-wake / shutdown / transient-error paths against a fake queue (no real Postgres in CI).
  • cmd/imagen/worker.go ships the pgx implementation (single dedicated conn for LISTEN + UPDATE) plus workerPipeline reusing buildBackend + attachUsageSink + prompt.Apply + buildWriter + maybeCloudSync. Per-job owner_user_id overrides the env fallback so multi-user worlds attribute correctly.
  • maybeCloudSync now returns (*cloud.SyncResult, error) so the worker can link imagen.jobs.image_id to the inserted imagen.images row. CLI generate keeps its stderr summary unchanged.
  • scripts/imagen-worker.service (systemd --user) + scripts/imagen-worker.env.example. EnvironmentFile lives in ~/.dotfiles/.env.imagen-worker and is never committed.
  • docs/setup-worker-mriver.md walks through install + the spec's SQL-INSERT smoke; docs/architecture.md grows an "async write path" section.
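
The shutdown behaviour described in the first bullet can be pictured with two separate contexts: one that SIGTERM cancels and that gates claiming, and one per job that only a timeout can end. The sketch below is an illustration under that reading; the function names are hypothetical and the code is not copied from internal/worker.

```go
package main

import (
	"context"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// LoopContext is cancelled on SIGTERM or Ctrl-C. The loop checks it before
// claiming, so no new job starts once shutdown has been requested.
func LoopContext() (context.Context, context.CancelFunc) {
	return signal.NotifyContext(context.Background(), syscall.SIGTERM, os.Interrupt)
}

// NewJobContext derives from context.Background(), not from the loop context,
// so shutdown does not abort an in-flight generation; only the timeout bounds it.
func NewJobContext(timeout time.Duration) (context.Context, context.CancelFunc) {
	return context.WithTimeout(context.Background(), timeout)
}

// ShouldClaim reports whether the loop may still claim another job.
func ShouldClaim(loopCtx context.Context) bool {
	return loopCtx.Err() == nil
}

// demoShutdown simulates a shutdown request arriving mid-job and reports
// whether the job context is still alive and whether the loop may claim again.
func demoShutdown() (jobAlive, mayClaim bool) {
	loopCtx, stop := context.WithCancel(context.Background()) // stand-in for LoopContext
	jobCtx, cancel := NewJobContext(5 * time.Minute)
	defer cancel()
	stop() // shutdown requested while the job is running
	return jobCtx.Err() == nil, ShouldClaim(loopCtx)
}
```

With this split, the in-flight job still gets to write done or failed before the process exits, which is what avoids half-state rows.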

Phase 3 — end-to-end smoke

Two passes, both green:

  1. TestWorker_Integration_EndToEnd (env-guarded by IMAGEN_WORKER_INTEGRATION=1) inserts one real job into imagen.jobs, lets the worker claim + process via the mock backend, asserts status='done' + linked image_id, then cleans up. 0.54s.
  2. Operational smoke against the built binary: started ./bin/imagen worker --poll-interval=2s in the background, INSERTed a job via mcp__supabase, watched pickup + done within ~74ms (NOTIFY round-trip ~7ms). Failure path verified by inserting a job with backend='nonexistent-backend' → status='failed', full error text in the error column. SIGTERM exits cleanly. Test rows deleted afterwards.

Build + tests

go build ./...   # clean
go vet ./...     # clean
go test ./...    # ok across all 7 packages

Acceptance criteria

  1. imagen.jobs table + RLS + indexes + INSERT NOTIFY trigger live — yes.
  2. imagen worker subcommand built + run as a daemon (systemd unit in scripts/) — yes.
  3. INSERT → done within ~10s budget — measured 74ms with mock backend.
  4. pending → running → done transitions with timestamps, image_id links into a real imagen.images row written through #7's cloud-sync path — yes.
  5. Failure path → status='failed' with error text — yes. ResetStaleRunning covers worker-restart resilience for rows left in running.
  6. Unit tests cover claim + transitions; env-guarded integration test exercises the full path — yes.
  7. go build ./... && go test ./... clean — yes.

Out of scope (carried through from spec)

  • flexsiebels-side form + polling endpoints + pending-card UI — m/flexsiebels.de#65
  • Job cancellation / kill switch — separate follow-up
  • Concurrent workers / multi-mRiver scale-out (the claim already uses FOR UPDATE SKIP LOCKED, so it's cheap to add later)
  • Job-history retention
  • Scheduled / recurring jobs
  • Promotion imagen.images → flexsiebels.images

Deployment note

The systemd unit + env template are committed but not yet installed on mRiver — m can do systemctl --user enable --now imagen-worker.service per docs/setup-worker-mriver.md when ready, or leave it parked until the flexsiebels-side UI lands.

mAi added the needs-review label 2026-05-11 08:24:02 +00:00
Comment by mAi (Author, Collaborator):

Merged into main + worker live on mRiver

Branch mai/hermes/issue-8-imagen-8-imagen merged via --no-ff at dbe1704. Pushed to origin/main.

  • Implementation commit: 2758c5a
  • 1,205 insertions across 13 files

Acceptance criteria

| # | Check | Result |
|---|-------|--------|
| 1 | imagen.jobs table + RLS + indexes + NOTIFY trigger | ok - migration imagen_jobs_init applied; imagen_jobs_notify_insert trigger calls pg_notify('imagen_jobs', NEW.id::text) on AFTER INSERT |
| 2 | imagen worker subcommand running as systemd on mRiver | ok - ~/.config/systemd/user/imagen-worker.service installed, enabled, active; logs: imagen worker: ready (poll-interval 5s job-timeout 5m0s) |
| 3 | Pending row triggers generation within ~10s | ok - smoke job aa41e87e-e80c-445d-9ed0-597cd7c6f3ed end-to-end in 8.3s (INSERT 10:25:37.558 → done 10:25:45.852); NOTIFY latency <5ms |
| 4 | Status transitions pending→running→done; image_id links to real imagen.images | ok - smoke job's image_id = 19397d4c-2e76-442c-8e2b-2b8742bb1544, Storage object 2026-05-11/a-tiny-owl-wearing-wire-rim-glasses-phot-4824635221023530683.png |
| 5 | Failure paths + crash recovery | ok - ResetStaleRunning on worker startup flips orphaned running rows back to pending; backend errors set status='failed' with error text |
| 6 | Unit tests + env-guarded integration test | ok - 332-line internal/worker/worker_test.go covers claim logic, status transitions, NOTIFY parsing; worker_integration_test.go (env-guarded by IMAGEN_WORKER_INTEGRATION=1) for real DB path |
| 7 | go build ./... && go test ./... clean | ok - all packages pass |

Architecture summary (for future reference)

flexsiebels (owner UI)  --INSERT-->  imagen.jobs  --pg_notify-->  imagen worker (mRiver)
  ^                                       |                            |
  |                                       v                            v
  +-poll-- GET .../jobs/<id> <--UPDATE--+                       backend.Generate
                                                                       |
                                                                       v
                                                          imagen.images + Storage
                                                                       |
                                                                       v
                                                       flexsiebels signed-URL render

Operations

  • Worker daemon: systemctl --user status imagen-worker
  • Logs: journalctl --user -u imagen-worker -f
  • Restart after binary update: systemctl --user restart imagen-worker
  • Setup doc: docs/setup-worker-mriver.md
  • Env file (gitignored, 0600): ~/.dotfiles/.env.imagen-worker

Coordinated with flexsiebels-side

  • m/flexsiebels.de#65 merged at 1a062b7 (paul, knuth): /imagine/new form route, POST/GET jobs API endpoints behind requireAuth, pending-card polling UI on /imagine. Awaiting Dokploy redeploy for the visual smoke at https://flexsiebels.de/imagine.

Cross-link

  • ImaGen#7 - data plane (cloud-sync to imagen.images), reused verbatim by the worker
  • ImaGen#6 - superseded by #7 + #8 + flexsiebels.de#64 + flexsiebels.de#65 (full v2 stack now shipped)
  • m's ask: 2026-05-11 10:08 ("can we make it that I can also prompt imagen from the website?!")
mAi added done and removed needs-review labels 2026-05-11 08:26:53 +00:00