ImaGen #1: bootstrap repo + framework skeleton (Backend interface, CLI shell, config, output writer) #1

Open
opened 2026-05-08 12:30:04 +00:00 by mAi · 1 comment

Goal

Bootstrap the ImaGen repo with a minimal, model-agnostic framework that other components (skill, adapters, agents) can build on. After this issue lands, the repo has: a CLI shell, a Backend interface, a config loader, an output writer, plus an in-repo CLAUDE.md so future workers have context.

m's framing (PWA 2026-05-08 14:24, translated from German): "The project would be 'mImaGen' and should be as model-agnostic as possible for us. So we build the framework and can easily swap models later. But I do need a good go-to place for otto or other agents to create images."

Skill name: /imagine (decided). Project ID: imagen. Repo: m/ImaGen. Memory group: imagen.

Scope

1. Repo skeleton (Go)

m/ImaGen/
├── cmd/imagen/             # CLI entry point
│   └── main.go
├── internal/
│   ├── backend/            # Backend interface + registry
│   │   ├── backend.go      # Backend interface
│   │   └── registry.go     # name → Backend constructor
│   ├── prompt/             # Prompt enrichment, style presets
│   │   └── prompt.go
│   ├── output/             # File-saving, naming, metadata sidecar
│   │   └── output.go
│   ├── config/             # YAML config loader
│   │   └── config.go
│   └── server/             # Optional HTTP server (for non-Go callers)
│       └── server.go
├── docs/
│   ├── architecture.md     # Backend interface, where adapters plug in
│   └── usage.md            # CLI examples, config samples
├── CLAUDE.md
├── README.md
├── go.mod
├── go.sum
└── .gitignore

2. Backend interface (the contract every adapter implements)

package backend

import (
    "context"
    "io"
)

// Request is the cross-backend request shape. Adapters translate it
// to whatever their target API expects.
type Request struct {
    Prompt         string
    NegativePrompt string         // optional, ignored by backends that don't support it
    Width, Height  int
    Steps          int            // optional; backend default if 0
    Seed           int64          // optional; random if 0
    Style          string         // optional preset name (resolved by prompt package)
    BackendOpts    map[string]any // backend-specific overrides
}

// Result is what the backend produces.
type Result struct {
    ImageReader io.ReadCloser  // encoded image bytes (PNG or JPEG); the caller must Close it
    MimeType    string         // e.g. "image/png"
    Metadata    map[string]any // model name, seed actually used, latency, cost-estimate, …
}

// Backend is the interface every adapter satisfies.
type Backend interface {
    Name() string
    Generate(ctx context.Context, req Request) (*Result, error)
}

3. CLI shape

# Generate one image
imagen generate "a cat in a fishbowl" \
  --backend flux-schnell-local \
  --size 1024x1024 \
  --output /tmp/cat.png

# List available backends + their status (config check, reachability)
imagen backends

# Render a config sample
imagen config init > ~/.config/imagen.yaml

Backends are looked up via the registry (internal/backend/registry.go). When a backend isn't implemented yet, the CLI returns a clean "backend X not registered, available: …" error.
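
The lookup and its error path could be sketched roughly as follows. This is an assumption about the shape of registry.go, not its actual contents: the `Constructor` signature, the `Registry` methods, and the exact error wording beyond the phrase quoted above are hypothetical, and `Backend` is reduced to `Name()` to keep the example short.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Backend is reduced to Name() for this sketch.
type Backend interface{ Name() string }

// Constructor builds a backend from its own config sub-block (assumed shape).
type Constructor func(cfg map[string]any) (Backend, error)

// Registry maps a backend name to its constructor and is safe for concurrent use.
type Registry struct {
	mu    sync.RWMutex
	ctors map[string]Constructor
}

func NewRegistry() *Registry {
	return &Registry{ctors: make(map[string]Constructor)}
}

func (r *Registry) Register(name string, c Constructor) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ctors[name] = c
}

// Names returns the registered names in sorted order, so error text is stable.
func (r *Registry) Names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.ctors))
	for n := range r.ctors {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// New constructs the named backend or reports which names are available.
func (r *Registry) New(name string, cfg map[string]any) (Backend, error) {
	r.mu.RLock()
	c, ok := r.ctors[name]
	r.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("backend %q not registered, available: %s",
			name, strings.Join(r.Names(), ", "))
	}
	return c(cfg)
}

type mock struct{}

func (mock) Name() string { return "mock" }

func main() {
	reg := NewRegistry()
	reg.Register("mock", func(map[string]any) (Backend, error) { return mock{}, nil })

	if _, err := reg.New("flux-schnell-local", nil); err != nil {
		fmt.Println(err) // backend "flux-schnell-local" not registered, available: mock
	}
}
```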

4. Config

~/.config/imagen.yaml:

default_backend: flux-schnell-local

output:
  directory: ~/Pictures/imagen
  naming: "{date}-{slug}-{seed}.png"
  write_metadata_json: true     # alongside each image, a .json sidecar with prompt + backend + seed + latency

backends:
  flux-schnell-local:
    type: comfyui
    base_url: http://mrock:8188
    model: flux1-schnell.safetensors
    default_steps: 4

  flux-dev-replicate:
    type: replicate
    api_token_env: REPLICATE_API_TOKEN
    model: black-forest-labs/flux-dev
    default_steps: 28

  dalle3:
    type: openai
    api_key_env: OPENAI_API_KEY
    model: dall-e-3

Adapters get only their own sub-block at construction. The framework doesn't know what comfyui or replicate adapters need internally — that's the adapter's contract.
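
As an illustration of that boundary, an adapter might turn its raw sub-block into a typed config and resolve the credential from the named environment variable. The sketch below is hypothetical: `replicateConfig` and `parseReplicate` are invented names, and it assumes the YAML decoder hands integers over as Go `int` inside a `map[string]any`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// replicateConfig is a hypothetical typed view of the adapter's own sub-block.
type replicateConfig struct {
	Model        string
	DefaultSteps int
	Token        string // resolved from the environment, never stored in the YAML
}

// parseReplicate validates the raw sub-block. getenv is injected so the
// function can be tested without touching the real environment.
func parseReplicate(raw map[string]any, getenv func(string) string) (replicateConfig, error) {
	var cfg replicateConfig

	model, _ := raw["model"].(string)
	if model == "" {
		return cfg, errors.New("replicate: model is required")
	}
	envName, _ := raw["api_token_env"].(string)
	if envName == "" {
		return cfg, errors.New("replicate: api_token_env is required")
	}
	token := getenv(envName)
	if token == "" {
		return cfg, fmt.Errorf("replicate: environment variable %s is empty", envName)
	}
	// Assumption: the decoder yields int for YAML integers; 0 means "backend default".
	steps, _ := raw["default_steps"].(int)

	return replicateConfig{Model: model, DefaultSteps: steps, Token: token}, nil
}

func main() {
	raw := map[string]any{
		"type":          "replicate",
		"api_token_env": "REPLICATE_API_TOKEN",
		"model":         "black-forest-labs/flux-dev",
		"default_steps": 28,
	}
	cfg, err := parseReplicate(raw, os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("model:", cfg.Model, "steps:", cfg.DefaultSteps)
}
```

Keeping the parsing inside the adapter means the framework only needs to pass the map through, which matches the "adapter's contract" wording above.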

5. Output writer

internal/output/:

  • Resolve filename from naming template (placeholders: {date}, {time}, {slug} from prompt, {seed}, {backend}).
  • Write image bytes to disk.
  • If write_metadata_json is true, write <filename>.json with the full Request + Result.Metadata + ISO timestamp.
  • Return final paths.

6. Style presets (prompt enrichment)

internal/prompt/styles.yaml:

styles:
  photo:           "photorealistic, sharp focus, natural lighting"
  illustration:    "digital illustration, clean lines, vibrant colors"
  diagram:         "minimal technical diagram, isometric, white background, line-art"
  sketch:          "rough pencil sketch, hand-drawn, monochrome"
  blog-header:     "wide aspect, conceptual, soft palette, editorial illustration"

When --style photo is passed, the preset string is appended to the prompt before sending to the backend. m can extend this list later.
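
A minimal sketch of that enrichment step, assuming the presets are already loaded into a map and that prompt and preset are joined with a comma and a space (the issue does not specify the separator). `applyStyle` is a hypothetical name, and only two of the presets are copied in.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// styles mirrors two entries of styles.yaml; the real list is loaded from the file.
var styles = map[string]string{
	"photo":  "photorealistic, sharp focus, natural lighting",
	"sketch": "rough pencil sketch, hand-drawn, monochrome",
}

// applyStyle appends the preset text to the prompt. An empty style leaves
// the prompt untouched; an unknown style is an error listing valid names.
func applyStyle(prompt, style string) (string, error) {
	if style == "" {
		return prompt, nil
	}
	suffix, ok := styles[style]
	if !ok {
		names := make([]string, 0, len(styles))
		for n := range styles {
			names = append(names, n)
		}
		sort.Strings(names)
		return "", fmt.Errorf("unknown style %q, available: %s", style, strings.Join(names, ", "))
	}
	return prompt + ", " + suffix, nil
}

func main() {
	out, err := applyStyle("a cat in a fishbowl", "photo")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // a cat in a fishbowl, photorealistic, sharp focus, natural lighting
}
```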

7. CLAUDE.md (in-repo)

Document: project goal, architecture overview, how to add a new backend (one paragraph), how the skill calls it, the config file location, where credentials live (always env vars, never hardcoded). Reference ~/.m/docs/msystem.md.

8. Tests

  • internal/output/ unit tests for naming + sidecar writing
  • internal/backend/ mock backend for end-to-end CLI smoke test
  • No backend-network tests in this issue (those land with each adapter)

9. README.md

Short: what it is, why it exists, "see CLAUDE.md for design", install instructions (go install ...), one-liner example.

Acceptance criteria

  1. go build ./... is clean.
  2. imagen backends lists registered backends (only the mock+stub at this point), exits 0.
  3. imagen generate "test prompt" --backend mock --output /tmp/x.png writes a real (mock-generated) PNG and a sidecar JSON.
  4. ~/.config/imagen.yaml parses cleanly via imagen config validate.
  5. CLAUDE.md describes how to add a new adapter — short enough that a worker can read it before opening #2 (ComfyUI adapter).

Out of scope (intentional)

  • Any real backend implementation — those are #2 (ComfyUI), #3 (Replicate). This issue ships a Mock backend only.
  • The /imagine skill — that's #4.
  • HTTP server beyond a stub — wire it up in a follow-up issue once the CLI works end-to-end.
  • Image post-processing (cropping, watermarking) — out of scope for v0.
  • Cost-tracking — that lands with #3 since only API backends bill.

Refs

  • mai project: imagen (registered)
  • Memory group: imagen
  • Repo: m/ImaGen
  • mRock specs (for #2): NVIDIA RTX 4070 Ti SUPER, 16 GB VRAM, runs Ollama + F5-TTS already
  • m's voice via PWA 2026-05-08 13:47 + 14:01 + 14:24 (this is the spec)

Workflow

Coder role. End shift with make build + imagen generate "test" --backend mock smoke test. After this lands and merges, m or otto/head assigns mAi to #2, #3, #4 (which are blocked-on-#1 today).

mAi self-assigned this 2026-05-08 12:30:04 +00:00

Bootstrap landed on m/ImaGen main

Framework skeleton merged. The repo now has:

  • internal/backend/: Backend interface, thread-safe Registry, Mock reference adapter (deterministic gradient PNG)
  • internal/config/: YAML loader, validation, sample generator. Adapters get only their own sub-block via BackendSpec.Raw
  • internal/output/: filename templating ({date}/{time}/{slug}/{seed}/{backend}/{ext}), JSON sidecar
  • internal/prompt/: embedded styles.yaml, style-preset suffix application
  • internal/server/: 501 stub (HTTP surface lands in a follow-up)
  • cmd/imagen/: generate / backends / config (init|validate|path) / serve / version
  • Tests across all four packages
  • CLAUDE.md, README.md, docs/architecture.md, docs/usage.md, Makefile

Acceptance criteria

| # | Check | Result |
|---|-------|--------|
| 1 | go build ./... clean | ✅ |
| 2 | imagen backends lists registered + flags non-compiled types | ✅ |
| 3 | imagen generate "test prompt" --backend mock --output /tmp/x.png writes PNG + sidecar | ✅ 1024×1024 PNG, 34.5 KB, sidecar with prompt/seed/backend/timestamp/metadata |
| 4 | imagen config init ↔ imagen config validate round-trip | ✅ output: OK — 4 backend(s) defined, default="mock" |
| 5 | CLAUDE.md documents how to add an adapter | ✅ six-step recipe in §"Adding a new adapter" |

go test ./...: 4 packages, all green.

What's next

Unblocks the three follow-up issues. Each one drops a single file under internal/backend/ plus an anonymous import in cmd/imagen/main.go:

  • #2 ComfyUI on mRock (comfyui.go)
  • #3 Replicate (replicate.go)
  • #4 /imagine skill — wraps imagen generate

Refs

  • Bootstrap commit: https://mgit.msbls.de/m/ImaGen/commit/237270b
  • Merge commit: https://mgit.msbls.de/m/ImaGen/commit/2049091
  • Branch: mai/bohr/issue-211-bootstrap (preserved for history; merged via --no-ff)

Mirrored from m/mAi#211 (closed). The original bootstrap issue moved here; this is the canonical paper trail on the code repo.
mAi added the done label 2026-05-08 14:29:51 +00:00
Reference: m/ImaGen#1