Scan stack: split a multi-page scan into separate documents (barcode or blank-page separator) #1
Request
m (2026-05-15, PWA voice):
(EN: "We should be able to scan a whole stack of papers in one ADF run and have it split automatically into multiple separate documents in Paperless. Today every letter requires its own scan job. We can use blank pages or marked separator pages between letters.")
Multiple letters in one ADF run, automatic split into N Paperless documents.
Status
Paperless-ngx has a built-in barcode splitter:
PAPERLESS_CONSUMER_ENABLE_BARCODES=true plus optionally PAPERLESS_CONSUMER_BARCODE_SCANNER (zbar | pyzbar | zxing). When enabled, Paperless scans each page on consume for an ASN barcode or a defined separator barcode. If found, it splits the PDF at that page and creates a separate document for each chunk.
Docs: https://docs.paperless-ngx.com/usage/#barcodes
The default separator is normally a sheet with a PATCHT patch-T barcode (ISO standard for document separators). Configurable.
Implementation options
Option A — Paperless built-in barcode splitter (recommended)
- PAPERLESS_CONSUMER_ENABLE_BARCODES=true in the compose env.
- Scanner: zbar (default, lightweight).
- inbox/ → mover pushes to toprocess/ after the mtime check → Paperless consumes, finds the patch-T pages, auto-splits into N documents → Paperless-AI classifies each individually.
No code change on our side, only compose config + printed separator sheets.
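The mtime check in the flow above is not spelled out in this issue. A minimal Python sketch of one common approach, assuming the mover treats a file as complete once its modification time is at least a few seconds old. The function name is_stable and the 10-second threshold are hypothetical and not taken from mover.sh:

```python
import os
import time


def is_stable(path, min_age_seconds=10.0, now=None):
    """Return True if `path` was last modified at least `min_age_seconds` ago.

    Assumption: an old enough mtime means the scanner has finished writing
    the file. The threshold is illustrative, not the mover's real value.
    """
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(path)) >= min_age_seconds
```

A mover loop would call is_stable on each PDF in inbox/ and move only the files for which it returns True, leaving half-written scans for the next pass.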
Option B — Blank-page detection in the mover
In the mover (infra/mdms-mover/mover.sh): before the mv inbox→toprocess step, analyse each PDF, treat near-empty pages (< 1% ink coverage) as separators, split with pdftk or qpdf.
More flexible but more error-prone.
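To make the blank-page rule concrete, here is a small Python sketch of the decision logic only. It assumes each page has already been rendered to a flat list of grayscale pixel values (0 = black, 255 = white). The function names, the darkness cutoff of 128, and the choice to drop separator pages are illustrative assumptions. Rendering the pages and cutting the PDF with pdftk or qpdf are left out:

```python
def ink_coverage(gray_pixels, dark_below=128):
    """Fraction of pixels darker than `dark_below` (0 = black, 255 = white)."""
    if not gray_pixels:
        return 0.0
    dark = sum(1 for p in gray_pixels if p < dark_below)
    return dark / len(gray_pixels)


def split_on_blank_pages(coverages, threshold=0.01):
    """Group 0-based page indices into documents.

    Pages whose coverage is below `threshold` (1% by default) act as
    separators and are dropped. Consecutive separators do not produce
    empty documents.
    """
    documents, current = [], []
    for index, coverage in enumerate(coverages):
        if coverage < threshold:
            if current:
                documents.append(current)
                current = []
        else:
            current.append(index)
    if current:
        documents.append(current)
    return documents


# Seven pages: two blank separators in a row between the 2nd and 3rd document.
groups = split_on_blank_pages([0.12, 0.08, 0.002, 0.15, 0.0, 0.0, 0.3])
# groups == [[0, 1], [3], [6]]
```

A mostly empty final page of a letter would also fall below the threshold and be treated as a separator, which is one reason this option is more error-prone than a printed barcode.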
Option C — Paperless-AI semantic splitting
Recommendation
Start with Option A — Paperless's built-in feature is mature, documented, deterministic. Printing patch-T separators is seconds of effort on scan days.
If Option A turns out to be annoying in practice (m hates inserting separator sheets), follow up with Option B or C.
Scope of this issue
- PAPERLESS_CONSUMER_ENABLE_BARCODES already set anywhere? If not, which compose file owns this? (~/paperless/docker-compose.yml on mDock per the live deployment.)
- Test scan (via inbox/).
- Separator PDF in infra/paperless/; m prints them and reuses.
- docs/strategy.md: new section "Multi-page scan + automatic splitting" with instructions on how m uses the separator sheets.
Out of scope
Role: gitster (compose edit, test scan, generate separator PDF, doc update, commit, issue comment).
Done: Option A (Paperless built-in patch-T splitter)
What is live (mDock)
~/paperless/docker-compose.yml, webserver service:
Container recreated with docker compose up -d webserver; env vars verified via docker exec ... env | grep -i barcode. Default separator string PATCHT, scanner pyzbar, DPI 300: all fine at their defaults.
Backup of the old compose file: ~/paperless/docker-compose.yml.bak.before-barcodes-20260516T134549.
What is committed
m/mDMS, branch mai/hermes/issue-1-scan-stack-multi (commit 061ea42):
- infra/paperless/generate_separator.py: generator with uv inline dependencies (reportlab + python-barcode)
- infra/paperless/separator-patchT.pdf: printable A4 separator page, Code 128 encoding PATCHT, visible header + footer (print 10–20 copies and keep them next to the scanner)
- docs/strategy.md: new section "Multi-page scan + automatic splitting (Barcode-Separator)"
m/paperless source of truth (commit 8c1ca3f): the same two env vars in docker-compose.yml on main.
Test (2026-05-16, live)
Constructed stack:
CODE128: PATCHT (in the paperless container with pdf2image + pyzbar, i.e. the same stack as the splitter itself)
Dropped into /mnt/mdms/toprocess/mdms-issue1-test-stack.pdf. Paperless log:
DB verification:
→ 6 input pages, 4 output pages (2+1+1). Separator pages cleanly discarded (DELETE_PAGES=true). Test documents deleted via the Django shell after verification (as the paperless user, so the FileLock works).
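The page arithmetic of this test can be reproduced with a short Python sketch. It is not Paperless's own implementation, only an illustration of splitting at separator pages and discarding them. The order of the test stack (2 pages, separator, 1 page, separator, 1 page) is an assumption consistent with the 2+1+1 result, and the function name is hypothetical:

```python
SEPARATOR = "PATCHT"


def split_at_separators(page_barcodes, separator=SEPARATOR):
    """Split 0-based page indices at separator pages, discarding the separators.

    `page_barcodes` holds, per page, the decoded barcode text or None.
    """
    documents, current = [], []
    for index, value in enumerate(page_barcodes):
        if value == separator:
            if current:
                documents.append(current)
            current = []
        else:
            current.append(index)
    if current:
        documents.append(current)
    return documents


# Assumed layout of the 6-page test stack (not confirmed by the issue text).
stack = [None, None, "PATCHT", None, "PATCHT", None]
docs = split_at_separators(stack)
page_counts = [len(d) for d in docs]
# docs == [[0, 1], [3], [5]], page_counts == [2, 1, 1], 4 output pages in total
```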
Workflow for m
- Print infra/paperless/separator-patchT.pdf (10–20 copies).
- ~/mDMS/inbox/ (SMB) → mdms-mover → Paperless splits automatically.
Out of scope (follow-ups, if the patch-T pages get annoying)
- Blank-page detection in infra/mdms-mover/ (source still lives in m/otto per CLAUDE.md, migration pending). False positives on duplex back sides.
Drift flag
The docker-compose.yml in m/paperless has drifted heavily from the live ~/paperless/docker-compose.yml on mDock:
- /home/m/data/paperless/... (repo) vs. /mnt/mdms/paperless/... + /mnt/mdms/toprocess (live)
- :latest (repo) vs. :2.20.6 (live)
- dokploy-network in the repo, absent from the live compose
- clusterzx/paperless-ai:latest (repo) vs. custom mdock/paperless-ai:3.0.9-restrict-patch with local build (live)
The barcode env vars are present in both. The broader drift is out of scope for this issue; it would merit its own issue in the m/paperless repo once someone uses the compose file there as the deployment source.
Branch: https://mgit.msbls.de/m/mDMS/src/branch/mai/hermes/issue-1-scan-stack-multi