📄 Document Readability Scorer

Pre-screen documents before expensive OCR/LLM inference. Upload a document image and get a readability score with detailed signal breakdown. Adjust weights to calibrate for your specific pipeline.
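As a rough illustration of the pre-screening idea, the sketch below routes documents by score before any expensive step runs. The names `prescreen` and `score_fn` and the default threshold of 0.5 are hypothetical and do not come from this tool; `score_fn` stands in for whatever returns a readability score in [0, 1].

```python
from typing import Callable, Iterable


def prescreen(
    paths: Iterable[str],
    score_fn: Callable[[str], float],  # hypothetical: returns a score in [0, 1]
    threshold: float = 0.5,            # assumption: tune this for your pipeline
) -> tuple[list[str], list[str]]:
    """Split documents into (accepted, rejected) by readability score."""
    accepted: list[str] = []
    rejected: list[str] = []
    for path in paths:
        if score_fn(path) >= threshold:
            accepted.append(path)   # worth sending to OCR/LLM inference
        else:
            rejected.append(path)   # skip, rescan, or route to manual review
    return accepted, rejected


# Usage with made-up scores standing in for the real scorer.
fake_scores = {"invoice.png": 0.82, "blurry_scan.png": 0.31}
ok, skipped = prescreen(fake_scores, fake_scores.__getitem__)
```

Only the documents in `ok` would then go on to OCR or LLM inference.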

How it works

The scorer extracts 7 independent signals from the image and combines them into a single readability score (0–1):

| Signal | What it measures | Method |
|---|---|---|
| Sharpness | Is the text sharp/blurry? | Laplacian variance + FFT high-freq energy |
| Contrast | Is text distinguishable from background? | RMS + Michelson contrast |
| Noise | How clean is the image? | Immerkær noise estimation |
| Text Presence | Is there text on the page? | MSER regions + Sobel edge density |
| Brightness | Is exposure appropriate? | Mean brightness + saturation analysis |
| Entropy | Is there information content? | Shannon entropy |
| Learned IQA | ML-based quality score | CLIP-IQA via pyiqa library |
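To make a few of these methods concrete, here is a minimal pure-Python sketch of four of the simpler measurements: Laplacian variance, RMS contrast, Michelson contrast, and Shannon entropy. It is not the scorer's own code. It assumes a grayscale image given as rows of 8-bit values, all function names are illustrative, and it returns raw measurements only. How the tool maps raw values onto 0–1 signal scores is not described here, so that step is left out.

```python
import math
from collections import Counter
from statistics import pvariance

Gray = list[list[int]]  # rows of 8-bit grayscale pixel values (0-255)


def laplacian_variance(img: Gray) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Higher values mean more fine detail (sharper). Needs at least 3x3 pixels.
    """
    h, w = len(img), len(img[0])
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
        - 4 * img[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def rms_contrast(img: Gray) -> float:
    """Standard deviation of pixel intensities after scaling to [0, 1]."""
    pixels = [p / 255 for row in img for p in row]
    return math.sqrt(pvariance(pixels))


def michelson_contrast(img: Gray) -> float:
    """(max - min) / (max + min); taken as 0 for an all-black image."""
    pixels = [p for row in img for p in row]
    lo, hi = min(pixels), max(pixels)
    return (hi - lo) / (hi + lo) if hi + lo else 0.0


def shannon_entropy(img: Gray) -> float:
    """Entropy in bits of the intensity histogram (0 for a flat image)."""
    pixels = [p for row in img for p in row]
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


# Two toy 4x4 images: a uniform grey patch and a black/white checkerboard.
flat = [[128] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
```

On `flat`, the Laplacian variance, RMS contrast, and entropy are all zero. On `checker`, every measure is strictly positive, which is the direction each signal is meant to capture.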

💡 Calibration: Adjust the weight sliders to match your pipeline's sensitivity. For example, if your OCR handles blur well but fails on low contrast, increase the contrast weight.

⚖️ Signal Weights (auto-normalized to sum to 1.0)
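The sketch below shows one way auto-normalization and score combination could work. It assumes the final score is a weighted average of per-signal scores that each lie in [0, 1]. The function names, the equal-weight fallback when every weight is zero, and all numeric values are illustrative assumptions, not taken from the tool.

```python
def normalize_weights(weights: dict[str, float]) -> dict[str, float]:
    """Scale non-negative weights so that they sum to 1.0."""
    total = sum(weights.values())
    if total <= 0:
        # Assumption: fall back to equal weights if every slider is at 0.
        return {name: 1 / len(weights) for name in weights}
    return {name: w / total for name, w in weights.items()}


def readability_score(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-signal scores (each assumed to be in [0, 1])."""
    norm = normalize_weights(weights)
    return sum(norm[name] * signals[name] for name in signals)


# Made-up signal scores for a sharp but low-contrast scan.
signals = {
    "sharpness": 0.9, "contrast": 0.2, "noise": 0.8, "text_presence": 0.9,
    "brightness": 0.7, "entropy": 0.8, "learned_iqa": 0.6,
}
equal = {name: 1.0 for name in signals}
contrast_heavy = {**equal, "contrast": 3.0}  # pipeline that is sensitive to contrast

score_equal = readability_score(signals, equal)
score_contrast_heavy = readability_score(signals, contrast_heavy)
```

Under these assumptions, raising the contrast weight pulls the score of the low-contrast scan down, which is the calibration effect the sliders are meant to provide.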

🧠 Learned IQA Metric

⬆️ Upload a document to get started