Restyle while a game is running: ComfyUI --lowvram profile (coexist with untracked GPU VRAM) #16

Open
opened 2026-06-07 09:10:38 +00:00 by mAi · 0 comments
Collaborator

Context

#15 landed the GPU-lease integration: restyle now goes through the mGPUmanager broker, evicts the idle AI services (mvoice/whisper/ollama), and runs FLUX with the GPU lock held across the whole cycle. Verified live.

But when an untracked GPU app (e.g. Baldur's Gate 3, ~3 GB) is running, FLUX (~13 GB) still does not fit even after evicting every managed consumer (13 + 3 + 1 reserved = 17 GB > 16 GB). The broker correctly returns a clean insufficient_vram instead of OOMing, but no image is produced. So restyle-while-gaming is still blocked.
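The budget arithmetic can be written out as a small Go sketch. The figures are the rough ones quoted in this issue; the `fits` and `headroom` helpers are hypothetical names for illustration, and the broker's real admission check may differ.

```go
package main

import "fmt"

// Rough figures from this issue, in GB. These are illustrative constants,
// not values read from the broker.
const (
	totalVRAM     = 16 // total VRAM implied by the "> 16 GB" comparison
	reserved      = 1  // the "1 reserved" term in the issue's sum
	fluxFootprint = 13 // FLUX at its normal VRAM mode
	untrackedGame = 3  // e.g. a running game the broker cannot see or evict
)

// fits reports whether a model of modelGB can be resident next to
// untrackedGB of VRAM that no managed consumer owns.
// Assumption: a simple sum-against-total check.
func fits(modelGB, untrackedGB int) bool {
	return modelGB+untrackedGB+reserved <= totalVRAM
}

// headroom returns the largest model footprint that would still fit.
func headroom(untrackedGB int) int {
	return totalVRAM - reserved - untrackedGB
}

func main() {
	fmt.Println("FLUX fits with game running:", fits(fluxFootprint, untrackedGame))
	fmt.Println("FLUX fits with no game:", fits(fluxFootprint, 0))
	fmt.Println("max footprint next to the game (GB):", headroom(untrackedGame))
}
```

Under these numbers, a low-vram profile would need to stay at or below roughly 12 GB resident to be granted alongside the game.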

Goal

Let restyle coexist with a running game by shrinking FLUX's VRAM footprint, trading speed for fit.

Approach (from #15 design doc section 6)

ComfyUI picks its VRAM mode at process start, so this is a launch-flag / second-profile decision, not per-request. Options to evaluate:

  1. A dedicated --lowvram ComfyUI instance on mRock (offloads weights to system RAM; small VRAM footprint, slower) + a second imagen.yaml backend pointing at it with a lower vram_resident_mib.
  2. Selection policy: try the normal flux-schnell-local; on insufficient_vram from the broker, automatically retry against the low-vram profile. (The client already surfaces insufficient_vram as a distinct error, ErrBrokerInsufficientVRAM, so a retry hook is feasible.)
  3. Register the low-vram ComfyUI as its own broker consumer with a realistic small vram_resident_mib so the lease can actually grant alongside the game.
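The selection policy in option 2 could look roughly like the sketch below. Only `ErrBrokerInsufficientVRAM` and the backend name `flux-schnell-local` come from this issue; the `Backend` type, `RestyleWithFallback`, and the name `flux-schnell-lowvram` are hypothetical, and the sentinel error is redeclared locally so the example compiles by itself.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBrokerInsufficientVRAM mirrors the client error named in this issue.
// It is redeclared here only to keep the sketch self-contained.
var ErrBrokerInsufficientVRAM = errors.New("broker: insufficient_vram")

// Backend is a hypothetical stand-in for one imagen.yaml backend entry.
type Backend struct {
	Name    string
	Restyle func(prompt string) (string, error)
}

// RestyleWithFallback tries the backends in order. It moves on to the next
// backend only when the broker reports insufficient VRAM; any other error
// is returned immediately so real failures are not masked by a retry.
func RestyleWithFallback(prompt string, backends ...Backend) (string, error) {
	var lastErr error = errors.New("no backends configured")
	for _, b := range backends {
		img, err := b.Restyle(prompt)
		if err == nil {
			return img, nil
		}
		if !errors.Is(err, ErrBrokerInsufficientVRAM) {
			return "", fmt.Errorf("%s: %w", b.Name, err)
		}
		lastErr = fmt.Errorf("%s: %w", b.Name, err)
	}
	return "", lastErr
}

func main() {
	// Simulated backends: the normal profile is refused by the broker,
	// the low-vram profile succeeds.
	normal := Backend{Name: "flux-schnell-local", Restyle: func(string) (string, error) {
		return "", ErrBrokerInsufficientVRAM
	}}
	low := Backend{Name: "flux-schnell-lowvram", Restyle: func(p string) (string, error) {
		return "image for " + p, nil
	}}

	img, err := RestyleWithFallback("watercolor", normal, low)
	fmt.Println(img, err)
}
```

Listing the normal profile first keeps the fast path as the default. The slower low-vram instance is used only when the broker has already refused the full-size lease.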

Out of scope

  • The core lease integration (#15, done).
  • Tracking arbitrary game VRAM in the broker (not feasible — games aren't managed consumers).

Refs

  • #15 (GPU lease), design doc docs/design-broker-gpu-lease.md section 6
  • mGPUmanager consumers.yaml (would gain a low-vram comfyui consumer)
Reference: m/ImaGen#16