diff --git a/.gitignore b/.gitignore
index eac5259..8813897 100644
--- a/.gitignore
+++ b/.gitignore
@@ -7,3 +7,4 @@
 .env.local
 /imagen
 /coverage.txt
+/.m/
diff --git a/docs/setup-comfyui-mrock.md b/docs/setup-comfyui-mrock.md
new file mode 100644
index 0000000..3577044
--- /dev/null
+++ b/docs/setup-comfyui-mrock.md
@@ -0,0 +1,181 @@
+# ComfyUI on mRock — install + ops
+
+ImaGen's `flux-schnell-local` backend talks to ComfyUI on mRock at
+`http://mrock:8188` (Tailscale-internal). This document is the reproducible
+install path from a clean mRock state.
+
+mRock runs Arch Linux + systemd with an NVIDIA RTX 4070 Ti SUPER (16 GB
+VRAM). Ollama is already a native systemd service, so ComfyUI follows the
+same pattern (native Python venv + systemd unit) instead of Docker — Docker
+on mRock has no `nvidia` runtime configured, and adding one is more invasive
+than another systemd unit.
+
+## Prerequisites on mRock
+
+- Python via `uv` (already installed).
+- NVIDIA driver new enough for CUDA 12.4 (>= 550; mRock has 595 today). Check with
+  `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.
+- ~35 GB free on `/home` for the model files.
+- `ollama.service` running on port 11434 — coexistence notes below.
+
+## 1. Clone ComfyUI + Python venv
+
+```bash
+mkdir -p ~/dev && cd ~/dev
+git clone --depth 1 https://github.com/comfyanonymous/ComfyUI.git comfyui
+cd comfyui
+uv venv --python 3.12 .venv
+source .venv/bin/activate  # under fish: source .venv/bin/activate.fish
+
+# PyTorch CUDA 12.4 wheels — match the system driver
+uv pip install --no-cache torch torchvision torchaudio \
+  --index-url https://download.pytorch.org/whl/cu124
+
+uv pip install --no-cache -r requirements.txt
+```
+
+Verify CUDA is wired up:
+
+```bash
+.venv/bin/python -c \
+  "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.get_device_name(0))"
+# expected: 2.6.0+cu124 True NVIDIA GeForce RTX 4070 Ti SUPER
+```
+
+## 2. Models — FLUX.1 schnell
+
+The Black-Forest-Labs primary repo (`black-forest-labs/FLUX.1-schnell`) is
+**gated** — `curl` against it without an HF token returns HTTP 401. We pull
+the weights from ungated mirrors of the same Apache-2.0 release.
+
+| File | Where it goes | Source |
+|------|---------------|--------|
+| `flux1-schnell.safetensors` (~23.8 GB, fp16) | `models/unet/` | `Comfy-Org/flux1-schnell` |
+| `ae.safetensors` (~335 MB) | `models/vae/` | `sirorable/flux-ae-vae` |
+| `clip_l.safetensors` (~246 MB) | `models/clip/` | `comfyanonymous/flux_text_encoders` |
+| `t5xxl_fp8_e4m3fn.safetensors` (~4.9 GB) | `models/clip/` | `comfyanonymous/flux_text_encoders` |
+
+```bash
+cd ~/dev/comfyui/models
+
+curl -fL -o unet/flux1-schnell.safetensors \
+  https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell.safetensors
+curl -fL -o vae/ae.safetensors \
+  https://huggingface.co/sirorable/flux-ae-vae/resolve/main/ae.safetensors
+curl -fL -o clip/clip_l.safetensors \
+  https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
+curl -fL -o clip/t5xxl_fp8_e4m3fn.safetensors \
+  https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors
+```
+
+If an HF token is configured later (`~/.cache/huggingface/token`), the
+official `black-forest-labs/FLUX.1-schnell` files are byte-identical and their
+URLs can be swapped in.
+
+## 3. systemd unit
+
+Drop `/etc/systemd/system/comfyui.service`:
+
+```ini
+[Unit]
+Description=ComfyUI image generation server
+Documentation=https://github.com/comfyanonymous/ComfyUI
+After=network-online.target
+Wants=network-online.target
+
+[Service]
+Type=simple
+User=m
+Group=m
+WorkingDirectory=/home/m/dev/comfyui
+ExecStart=/home/m/dev/comfyui/.venv/bin/python /home/m/dev/comfyui/main.py \
+    --listen 0.0.0.0 --port 8188 \
+    --output-directory /home/m/dev/comfyui/output \
+    --temp-directory /home/m/dev/comfyui/temp
+Restart=on-failure
+RestartSec=5
+TimeoutStopSec=30
+NoNewPrivileges=true
+PrivateTmp=true
+LimitNOFILE=65535
+
+[Install]
+WantedBy=multi-user.target
+```
+
+Then:
+
+```bash
+sudo systemctl daemon-reload
+sudo systemctl enable --now comfyui.service
+systemctl status comfyui.service
+```
+
+The service binds `0.0.0.0:8188`. Tailscale's WireGuard perimeter is the only
+access control, so do **not** expose port 8188 to the public internet.
+
+## 4. Health check
+
+```bash
+curl -fsS --max-time 5 http://mrock:8188/system_stats | jq '.devices[0]'
+# expected: name "cuda:0 NVIDIA GeForce RTX 4070 Ti SUPER ...", vram_total ~16 GB
+```
+
+`imagen backends` (from a host with the ImaGen CLI installed) should also
+report `flux-schnell-local: ok`.
+
+## 5. VRAM coexistence with Ollama
+
+mRock has 16 GB VRAM total. Ollama parks ~8 GB resident for its current
+model. FLUX schnell's fp16 weights, loaded with `weight_dtype=fp8_e4m3fn` (the
+default the adapter requests), need roughly 10–12 GB peak for a 1024×1024
+generation. Together that is 18–20 GB, so concurrent Ollama + FLUX will OOM.
+
+Two practical options:
+
+- **Stop Ollama before generating** — `sudo systemctl stop ollama` frees
+  the GPU, run the generation, `sudo systemctl start ollama` afterwards.
+  Adequate while we don't have many concurrent users.
+- **Move Ollama off mRock** — when ImaGen is in regular use, push Ollama to
+  another host so the GPU is dedicated. Tracked separately.
+
+Either way, the decision lives with whoever operates the box; the adapter does not try
+to manage Ollama.
+
+## 6. Smoke test (direct, without the imagen CLI)
+
+```bash
+# 1) Submit a workflow (run from the repo root)
+curl -fsS --max-time 30 -X POST -H 'Content-Type: application/json' \
+  -d @scripts/flux-schnell-poc.json \
+  http://mrock:8188/prompt
+# returns: {"prompt_id": "...", "number": ..., "node_errors": {}}
+
+# 2) Poll history until the prompt completes
+PID=...   # prompt_id from the response above
+until curl -fsS http://mrock:8188/history/$PID | jq -e ".\"$PID\".status.completed == true" >/dev/null; do
+  sleep 1
+done
+
+# 3) Pull the image (node "9" is the SaveImage node in the workflow)
+NAME=$(curl -fsS http://mrock:8188/history/$PID \
+  | jq -r ".\"$PID\".outputs[\"9\"].images[0].filename")
+curl -fsS "http://mrock:8188/view?filename=$NAME&type=output" -o /tmp/cat.png
+file /tmp/cat.png   # PNG image data, 1024 x 1024
+```
+
+The full ImaGen smoke test will live in [usage.md](usage.md) once the Go adapter
+ships.
+
+## Troubleshooting
+
+- **`vram_free` < 6 GB in `/system_stats`**: another GPU process is holding
+  memory. Usually Ollama (`sudo systemctl stop ollama`).
+- **Workflow returns `node_errors` with `Required input is missing` for
+  CLIPLoader**: text encoder filenames don't match step 2. Check that
+  `clip_l.safetensors` and `t5xxl_fp8_e4m3fn.safetensors` are in
+  `models/clip/`, not `models/text_encoders/`.
+- **`Access to model … is restricted`** during a model pull: the request is
+  hitting the gated primary repo. Use the ungated mirror URLs from step 2.
+- **Service won't start**: check `journalctl -u comfyui --since '5 min ago'`.
+  A common cause is a stale Python environment; re-run step 1.
diff --git a/scripts/comfyui.service b/scripts/comfyui.service
new file mode 100644
index 0000000..9344ec5
--- /dev/null
+++ b/scripts/comfyui.service
@@ -0,0 +1,24 @@
+[Unit]
+Description=ComfyUI image generation server
+Documentation=https://github.com/comfyanonymous/ComfyUI
+After=network-online.target
+Wants=network-online.target
+
+[Service]
+Type=simple
+User=m
+Group=m
+WorkingDirectory=/home/m/dev/comfyui
+ExecStart=/home/m/dev/comfyui/.venv/bin/python /home/m/dev/comfyui/main.py \
+    --listen 0.0.0.0 --port 8188 \
+    --output-directory /home/m/dev/comfyui/output \
+    --temp-directory /home/m/dev/comfyui/temp
+Restart=on-failure
+RestartSec=5
+TimeoutStopSec=30
+NoNewPrivileges=true
+PrivateTmp=true
+LimitNOFILE=65535
+
+[Install]
+WantedBy=multi-user.target
diff --git a/scripts/download-flux-schnell.sh b/scripts/download-flux-schnell.sh
new file mode 100755
index 0000000..4739c17
--- /dev/null
+++ b/scripts/download-flux-schnell.sh
@@ -0,0 +1,39 @@
+#!/bin/bash
+# Download FLUX.1 schnell + accompanying VAE/text encoders into a ComfyUI tree.
+# Uses ungated mirrors — the official Black-Forest-Labs repo is gated and
+# requires an HF token. See docs/setup-comfyui-mrock.md.
+
+set -euo pipefail
+
+ROOT="${1:-$HOME/dev/comfyui/models}"
+
+if [ ! -d "$ROOT" ]; then
+  echo "models root $ROOT does not exist — pass it as the first argument" >&2
+  exit 1
+fi
+
+mkdir -p "$ROOT/unet" "$ROOT/vae" "$ROOT/clip"
+
+CKPT="https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell.safetensors"
+VAE="https://huggingface.co/sirorable/flux-ae-vae/resolve/main/ae.safetensors"
+CLIP_L="https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors"
+T5="https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors"
+
+dl() {
+  local url=$1 dest=$2
+  if [ -s "$dest" ]; then
+    echo "skip $dest (already present)"
+    return
+  fi
+  echo "downloading $url -> $dest"
+  # Fetch into a .part file so an interrupted download is resumed, not skipped as present.
+  curl -L --fail --retry 3 --retry-delay 5 -C - -o "$dest.part" "$url"
+  mv "$dest.part" "$dest"
+}
+
+dl "$CKPT" "$ROOT/unet/flux1-schnell.safetensors"
+dl "$VAE" "$ROOT/vae/ae.safetensors"
+dl "$CLIP_L" "$ROOT/clip/clip_l.safetensors"
+dl "$T5" "$ROOT/clip/t5xxl_fp8_e4m3fn.safetensors"
+
+echo "done"
diff --git a/scripts/flux-schnell-poc.json b/scripts/flux-schnell-poc.json
new file mode 100644
index 0000000..b0b5f44
--- /dev/null
+++ b/scripts/flux-schnell-poc.json
@@ -0,0 +1,87 @@
+{
+  "prompt": {
+    "6": {
+      "class_type": "CLIPTextEncode",
+      "inputs": {
+        "text": "a small fishbowl with a cat staring out, photo, soft light",
+        "clip": ["11", 0]
+      }
+    },
+    "8": {
+      "class_type": "VAEDecode",
+      "inputs": {
+        "samples": ["31", 0],
+        "vae": ["10", 0]
+      }
+    },
+    "9": {
+      "class_type": "SaveImage",
+      "inputs": {
+        "filename_prefix": "imagen-poc",
+        "images": ["8", 0]
+      }
+    },
+    "10": {
+      "class_type": "VAELoader",
+      "inputs": {
+        "vae_name": "ae.safetensors"
+      }
+    },
+    "11": {
+      "class_type": "DualCLIPLoader",
+      "inputs": {
+        "clip_name1": "t5xxl_fp8_e4m3fn.safetensors",
+        "clip_name2": "clip_l.safetensors",
+        "type": "flux"
+      }
+    },
+    "12": {
+      "class_type": "UNETLoader",
+      "inputs": {
+        "unet_name": "flux1-schnell.safetensors",
+        "weight_dtype": "fp8_e4m3fn"
+      }
+    },
+    "13": {
+      "class_type": "CLIPTextEncode",
+      "inputs": {
+        "text": "",
+        "clip": ["11", 0]
+      }
+    },
+    "27": {
+      "class_type": "EmptySD3LatentImage",
+      "inputs": {
+        "width": 1024,
+        "height": 1024,
+        "batch_size": 1
+      }
+    },
+    "30": {
+      "class_type": "ModelSamplingFlux",
+      "inputs": {
+        "model": ["12", 0],
+        "max_shift": 1.15,
+        "base_shift": 0.5,
+        "width": 1024,
+        "height": 1024
+      }
+    },
+    "31": {
+      "class_type": "KSampler",
+      "inputs": {
+        "model": ["30", 0],
+        "seed": 1234567,
+        "steps": 4,
+        "cfg": 1.0,
+        "sampler_name": "euler",
+        "scheduler": "simple",
+        "denoise": 1.0,
+        "positive": ["6", 0],
+        "negative": ["13", 0],
+        "latent_image": ["27", 0]
+      }
+    }
+  },
+  "client_id": "imagen-poc-001"
+}