Show progress bar for full-page OCR #512

Open
iRonin wants to merge 1 commit into datalab-to:master from iRonin:progress-bar-full-page-ocr

iRonin commented Jun 2, 2026

Closes #511.

Show a progress bar for full-page OCR

What

Add a tqdm progress bar to chat_completions_batch (the dispatch used by
full-page OCR / surya_ocr), gated by the existing settings.DISABLE_TQDM.

Why

The recognition / full-page path was the only batched path with no progress
indication; detection, layout and the OCR-error model already use tqdm. On a
multi-page PDF, surya_ocr appears to hang because nothing is printed until the
whole job finishes. In full-page mode the batch is one item per page, so
total=len(batch) is the page count and each completed item is a finished page.

Change

-    with ThreadPoolExecutor(max_workers=max_workers) as executor:
-        return list(executor.map(_process, batch))
+    with ThreadPoolExecutor(max_workers=max_workers) as executor:
+        return list(
+            tqdm(
+                executor.map(_process, batch),
+                total=len(batch),
+                desc="Recognizing Text",
+                disable=settings.DISABLE_TQDM,
+            )
+        )

(plus the two imports: from surya.settings import settings, from tqdm import tqdm)
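
For anyone who wants to try the pattern outside surya, here is a minimal standalone sketch of the same wrapping. It is not the surya code: `_process`, `run_batch` and the module-level `DISABLE_TQDM` flag are hypothetical stand-ins for the real request function, `chat_completions_batch` and `settings.DISABLE_TQDM`, and the guarded import only exists so the sketch still runs where tqdm is not installed.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    from tqdm import tqdm
except ImportError:
    # Fallback so the sketch still runs without tqdm: no bar, same results.
    def tqdm(iterable, **kwargs):
        return iterable

# Hypothetical stand-in for settings.DISABLE_TQDM.
DISABLE_TQDM = False


def _process(item):
    # Hypothetical stand-in for one OCR request (one page in full-page mode).
    return item * 2


def run_batch(batch, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in input order; tqdm ticks once per
        # yielded result, and list() drains the iterator inside the `with`.
        return list(
            tqdm(
                executor.map(_process, batch),
                total=len(batch),
                desc="Recognizing Text",
                disable=DISABLE_TQDM,
            )
        )


results = run_batch([1, 2, 3, 4, 5, 6])
```

Because `list(...)` consumes the iterator before the executor context exits, the return value and its ordering are the same as without the bar.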

Notes

  • Respects DISABLE_TQDM, so users who set it keep silent output.
  • executor.map preserves the existing result ordering; tqdm advances as the
    in-order results are yielded, so the bar can lag behind a slow early item
    even when later items have already finished. (Could use as_completed for
    finer-grained updates, but that would require re-sorting outputs by
    metadata and is a larger change; happy to do that instead if preferred.)
  • chat_completions_batch is shared with the per-block fallback path, so the
    bar counts requests generically; in the dominant full-page path that is pages.
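
The as_completed alternative mentioned in the notes could look roughly like the sketch below. This is an illustration under assumptions, not part of this PR: instead of re-sorting by metadata it writes each result into a slot keyed by the item's input index, and `_process`, `run_batch_as_completed` and the `on_done` callback are hypothetical names (in practice `on_done` would be something like a tqdm bar's update call).

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def _process(item):
    # Hypothetical stand-in for one request; earlier items are made slower
    # so that completion order differs from input order.
    time.sleep(0.05 * (3 - item))
    return item * 10


def run_batch_as_completed(batch, max_workers=4, on_done=None):
    results = [None] * len(batch)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Remember each future's input position so order can be restored.
        future_to_index = {
            executor.submit(_process, item): index
            for index, item in enumerate(batch)
        }
        for future in as_completed(future_to_index):
            index = future_to_index[future]
            results[index] = future.result()
            if on_done is not None:
                on_done(index)  # progress hook fires on every completion
    return results


completion_order = []
results = run_batch_as_completed([0, 1, 2], on_done=completion_order.append)
```

The progress hook fires as soon as any request finishes, while `results` still comes back in input order. The trade-off is the extra index bookkeeping, which is why the PR sticks with executor.map.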

Tested

surya-ocr 0.20.0, llamacpp backend, Apple Silicon (MPS), 6-page scanned PDF,
SURYA_INFERENCE_PARALLEL=4:

Recognizing Text:   0%|          | 0/6 [00:00<?, ?it/s]
Recognizing Text:  17%|█▋        | 1/6 [01:17<06:28, 77.77s/it]
Recognizing Text: 100%|██████████| 6/6 [02:12<00:00, 22.01s/it]

Output (results.json / markdown) unchanged.

github-actions Bot commented Jun 2, 2026

Contributor

CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅

The recognition / full-page OCR path (surya_ocr) ran silently, even though
detection and layout already show tqdm bars. In full-page mode the batch is
one item per page, so total=len(batch) is the page count and each completed
item is a finished page.

Wrap the executor with tqdm, gated by the existing settings.DISABLE_TQDM,
matching the convention in surya/detection and surya/layout.

iRonin force-pushed the progress-bar-full-page-ocr branch from 7089d24 to b7d594c on June 2, 2026 15:50

iRonin commented Jun 2, 2026

Author

I have read the CLA Document and I hereby sign the CLA

github-actions Bot added a commit that referenced this pull request Jun 2, 2026

Development

Successfully merging this pull request may close these issues.

No progress indication for full-page OCR (surya_ocr)
