ImaGen/CLAUDE.md
mAi 2758c5a500 mAi: #8 - imagen.jobs queue + worker subcommand (flexsiebels write path)
Async write path for the flexsiebels owner-mode UI: flexsiebels INSERTs into
imagen.jobs, the worker on mRiver claims pending rows via LISTEN/NOTIFY +
5s safety poll, runs the same generate pipeline imagen generate uses, and
writes the result through internal/cloud into imagen.images.

- Schema migration imagen_jobs_init: table + status CHECK + two indexes +
  owner-scoped RLS + grants + AFTER INSERT trigger publishing on the
  imagen_jobs channel via pg_notify.
- internal/worker: DB-agnostic loop over a Queue interface. Drains the
  whole pending backlog on each wake. Job-scoped contexts are derived
  from Background so SIGTERM lets the in-flight generation finish (no
  half-state). ResetStaleRunning at startup unsticks rows left over from
  a previous crash. Eight unit tests cover the done / failed / missing-id /
  drain / NOTIFY-wake / shutdown / transient-error paths against a fake
  queue (no real Postgres in CI).
- cmd/imagen/worker.go: pgx-backed Queue (one dedicated conn for LISTEN +
  UPDATE), plus the workerPipeline that reuses buildBackend +
  attachUsageSink + prompt.Apply + buildWriter + maybeCloudSync. The
  per-job owner_user_id overrides the env-level fallback so each row in
  imagen.images is attributed correctly.
- maybeCloudSync now returns (*cloud.SyncResult, error) so the worker can
  link imagen.jobs.image_id to the inserted imagen.images row. The CLI
  generate path keeps printing its stderr summary unchanged.
- scripts/imagen-worker.service + .env.example for the systemd --user unit
  on mRiver. EnvironmentFile lives in ~/.dotfiles and is never committed.
- docs/setup-worker-mriver.md walks through installation + the spec's
  SQL-INSERT smoke; docs/architecture.md grows an "async write path"
  section.
- worker_integration_test.go (env-guarded by IMAGEN_WORKER_INTEGRATION=1)
  drives one real job through the full pipeline against msupabase using
  the mock backend, then verifies imagen.images + Storage object landed
  and the row flipped to done with image_id linked. Verified end-to-end:
  pickup latency ~7ms, total 74ms, failure path captures error text.
2026-05-11 10:23:33 +02:00


# ImaGen — Project Instructions
ImaGen is a model-agnostic image-generation framework. It has a single
opinionated CLI (`imagen`) that dispatches to whichever backend the user
configured — local FLUX on mRock via ComfyUI today, Replicate or DALL-E
tomorrow, something else next year. The framework owns plumbing (config,
output, naming, sidecars, prompt enrichment); each adapter owns the schema
and lifecycle of its own block in `~/.config/imagen.yaml`.
## Architecture
```
cmd/imagen/        CLI shell — generate, worker, backends, config, serve
internal/backend/  Backend interface + Registry + Mock reference impl
internal/prompt/   Style preset registry (embedded styles.yaml)
internal/output/   Filename templating, image writer, JSON sidecar
internal/config/   YAML loader, validation, sample generator
internal/cloud/    Supabase Storage + imagen.images writer
internal/usage/    mai.imagen_usage cost-tracking sink
internal/worker/   imagen.jobs queue consumer (DB-agnostic via Queue interface)
internal/server/   HTTP stub (not implemented yet — follow-up issue)
scripts/           imagen-worker.service + env template, ComfyUI scripts
docs/              architecture.md, usage.md, setup-worker-mriver.md
```
Data flow for `imagen generate`:
1. Parse flags, load config (`internal/config`).
2. Resolve the requested **instance name** to a config block, then the block's
`type` to a registered constructor in `backend.Default`.
3. Apply style preset (`internal/prompt`) to the prompt.
4. Call `backend.Generate(ctx, Request)`. The adapter returns a `*Result`
with an image stream + metadata.
5. Stream to disk via `internal/output`. If `write_metadata_json` is on, a
sidecar `<image>.json` is written next to it.
## Backend contract
```go
type Backend interface {
	Name() string
	Generate(ctx context.Context, req Request) (*Result, error)
}
```
`Request` carries the cross-backend fields (prompt, negative, size, steps,
seed, style preset, free-form `BackendOpts`). `Result` returns the image
bytes via an `io.ReadCloser`, the MIME type, and a metadata map (model name,
seed actually used, latency, cost-estimate, …).
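
A minimal sketch of how these two types might be shaped, for orientation only. The field names and types below are assumptions derived from the description above, not the definitions in `internal/backend`, and `fakeResult` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// Request sketches the cross-backend fields listed above.
// Field names and types are assumptions, not the real definitions.
type Request struct {
	Prompt      string
	Negative    string
	Width       int
	Height      int
	Steps       int
	Seed        int64
	Style       string
	BackendOpts map[string]any // free-form, interpreted by the adapter only
}

// Result sketches what an adapter hands back to the framework.
type Result struct {
	Image    io.ReadCloser  // image bytes; the caller is responsible for Close
	MIME     string         // for example "image/png"
	Metadata map[string]any // model name, seed actually used, latency, ...
}

// fakeResult (hypothetical) wraps an in-memory byte slice in a Result.
func fakeResult(data []byte, seed int64) *Result {
	return &Result{
		Image:    io.NopCloser(bytes.NewReader(data)),
		MIME:     "image/png",
		Metadata: map[string]any{"seed": seed},
	}
}

func main() {
	req := Request{Prompt: "a cat in a fishbowl", Width: 1024, Height: 1024, Seed: 42}
	res := fakeResult([]byte("not-a-real-png"), req.Seed)
	defer res.Image.Close()

	// Stream the image out, as the output package would when writing to disk.
	data, err := io.ReadAll(res.Image)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(len(data), res.MIME, res.Metadata["seed"])
}
```

Returning an `io.ReadCloser` rather than a byte slice lets the framework stream large images to disk without buffering them fully in memory.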
## Adding a new adapter
1. Create `internal/backend/<adapter>.go` (e.g. `comfyui.go`). Define a struct
that holds whatever the adapter needs (HTTP client, model id, token).
2. Add a constructor `func New<Adapter>(name string, cfg map[string]any) (Backend, error)`.
Read fields from `cfg` — that map is the adapter's own block from
`imagen.yaml` minus the `type:` key. Resolve secrets from env vars
(`api_token_env`, `api_key_env`) — never accept tokens inline.
3. Implement `Name()` (return the user-facing instance name) and
`Generate(ctx, Request)`.
4. In `init()` call `Register("<type-name>", New<Adapter>)`.
5. Blank-import the package (`import _ "…"`) from `cmd/imagen/main.go` if it
lives in a separate package, so the `init()` runs.
6. Add a smoke test under `internal/backend/<adapter>_test.go`. Network tests
should be guarded by `testing.Short()` or an env var.
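
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions: `echoBackend`, `NewEcho`, the `model` config key, and the local `registry` map are hypothetical stand-ins so the example compiles on its own, and the real `Register` signature in `internal/backend` may differ.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Minimal stand-ins for the internal/backend types so the sketch compiles alone.
type Request struct{ Prompt string }
type Result struct{ Metadata map[string]any }

type Backend interface {
	Name() string
	Generate(ctx context.Context, req Request) (*Result, error)
}

// Constructor and Register imitate the registry; the real signatures may differ.
type Constructor func(name string, cfg map[string]any) (Backend, error)

var registry = map[string]Constructor{}

func Register(typeName string, c Constructor) { registry[typeName] = c }

// echoBackend is a hypothetical adapter that only echoes the prompt back.
type echoBackend struct {
	name  string
	model string
}

// NewEcho reads the adapter's own config block (the map minus `type:`).
func NewEcho(name string, cfg map[string]any) (Backend, error) {
	model, ok := cfg["model"].(string)
	if !ok || model == "" {
		return nil, errors.New(`echo: config key "model" is required`)
	}
	return &echoBackend{name: name, model: model}, nil
}

// Name returns the user-facing instance name, not the adapter type.
func (b *echoBackend) Name() string { return b.name }

func (b *echoBackend) Generate(ctx context.Context, req Request) (*Result, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // honour cancellation before doing any work
	}
	return &Result{Metadata: map[string]any{"model": b.model, "prompt": req.Prompt}}, nil
}

// init registers the adapter under its type name, as in step 4.
func init() { Register("echo", NewEcho) }

func main() {
	b, err := registry["echo"]("my-echo", map[string]any{"model": "demo"})
	if err != nil {
		fmt.Println("construct failed:", err)
		return
	}
	res, err := b.Generate(context.Background(), Request{Prompt: "a cat"})
	if err != nil {
		fmt.Println("generate failed:", err)
		return
	}
	fmt.Println(b.Name(), res.Metadata["model"])
}
```

Validating the config block inside the constructor means a misconfigured instance fails at startup rather than on the first `Generate` call.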
## Config
`~/.config/imagen.yaml` (override with `--config`). Top-level keys:
- `default_backend` — instance name used when `--backend` is omitted.
- `output.directory` / `output.naming` / `output.write_metadata_json`.
- `backends:` — map of instance-name → `{type, …adapter-specific…}`.

The framework parses `type` and stuffs the rest into `BackendSpec.Raw`. The
adapter is free to define any schema it likes inside its block.
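
One way to picture the split between `type` and the adapter-owned remainder is the sketch below. The `BackendSpec` field layout and the `splitSpec` helper are assumptions for illustration; the actual loader in `internal/config` may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// BackendSpec mirrors the idea described above: Type is parsed by the
// framework, Raw is handed to the adapter untouched. The layout is assumed.
type BackendSpec struct {
	Type string
	Raw  map[string]any
}

// splitSpec (hypothetical) pulls `type` out of one entry under `backends:`
// and copies every other key into Raw without interpreting it.
func splitSpec(block map[string]any) (BackendSpec, error) {
	t, ok := block["type"].(string)
	if !ok || t == "" {
		return BackendSpec{}, errors.New(`backend block needs a non-empty string "type"`)
	}
	raw := make(map[string]any, len(block))
	for k, v := range block {
		if k != "type" {
			raw[k] = v
		}
	}
	return BackendSpec{Type: t, Raw: raw}, nil
}

func main() {
	// Stands in for one decoded entry of the `backends:` map.
	block := map[string]any{
		"type":          "replicate",
		"api_token_env": "REPLICATE_API_TOKEN",
	}
	spec, err := splitSpec(block)
	if err != nil {
		fmt.Println("invalid backend block:", err)
		return
	}
	fmt.Println(spec.Type, len(spec.Raw))
}
```

Because the framework never inspects `Raw`, adding a new adapter-specific key requires no change outside that adapter.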
## Credentials
Never hardcode. Always reference env-var names from the config:
```yaml
flux-dev-replicate:
  type: replicate
  api_token_env: REPLICATE_API_TOKEN
```
The adapter then calls `os.Getenv("REPLICATE_API_TOKEN")` at construction and
fails fast if it is unset. Tokens never go through `imagen.yaml` in plaintext.
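
A minimal sketch of that fail-fast lookup, assuming the adapter receives its config block as a `map[string]any`. The helper name `tokenFromEnv` is hypothetical and not part of the codebase.

```go
package main

import (
	"fmt"
	"os"
)

// tokenFromEnv (hypothetical) resolves a secret indirectly: cfg[key] holds the
// NAME of an environment variable, never the token itself.
func tokenFromEnv(cfg map[string]any, key string) (string, error) {
	envName, ok := cfg[key].(string)
	if !ok || envName == "" {
		return "", fmt.Errorf("config key %q must name an environment variable", key)
	}
	token := os.Getenv(envName)
	if token == "" {
		// Fail at construction time instead of on the first API call.
		return "", fmt.Errorf("environment variable %s is not set", envName)
	}
	return token, nil
}

func main() {
	cfg := map[string]any{"api_token_env": "REPLICATE_API_TOKEN"}
	if _, err := tokenFromEnv(cfg, "api_token_env"); err != nil {
		fmt.Println("refusing to start:", err)
		return
	}
	fmt.Println("token resolved")
}
```

The error messages mention only the variable name, so the token value never reaches logs.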
## How the `/imagine` skill calls into imagen
The skill (issue #4) wraps `imagen generate` and post-processes the path it
prints on stdout. Slash-command surface area:
```
/imagine "a cat in a fishbowl" --style blog-header --size 1024x1024
```
The skill resolves to `imagen generate "<prompt>" --backend <default> …` and
returns the image path so otto can attach it to a chat reply.
## References
- mAi project conventions: `~/.m/docs/msystem.md`
- Backend follow-ups: ImaGen issues #2 (ComfyUI on mRock), #3 (Replicate), #4 (skill)
- mRock GPU: NVIDIA RTX 4070 Ti SUPER, 16 GB VRAM, runs Ollama + F5-TTS
## House rules
- No technical debt. No TODOs in landed code. If something can't be done now,
open an issue.
- All user-facing strings: ASCII or proper Unicode (Umlaute), never `ae/oe/ue`.
- Tests live next to the package they cover (`*_test.go`). No `tests/` dir.
- `go build ./...` and `go test ./...` must be clean before any commit.
- Run `task build` (or `make build`) for the full build; both call into
`go build -o bin/imagen ./cmd/imagen`.