Document Readability Scorer
Pre-screen documents before expensive OCR/LLM inference. Upload a document image to get a readability score with a detailed per-signal breakdown. Adjust the signal weights to calibrate the score for your specific pipeline.
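As a rough illustration of pre-screening, the sketch below splits documents into those worth sending to OCR and those to reject or re-scan. It assumes each document already has a readability score between 0 and 1; the function name `route_documents`, the file names, the scores, and the `0.5` threshold are all hypothetical and would need tuning for a real pipeline.

```python
def route_documents(scores: dict[str, float], threshold: float = 0.5):
    """Split documents by readability score (threshold is a hypothetical default)."""
    # Documents at or above the threshold go on to OCR/LLM inference.
    accepted = [name for name, score in scores.items() if score >= threshold]
    # Everything else is held back, e.g. for re-scanning or manual review.
    rejected = [name for name, score in scores.items() if score < threshold]
    return accepted, rejected


# Made-up scores for two made-up files.
accepted, rejected = route_documents({"invoice.png": 0.82, "fax_scan.png": 0.31})
```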
How it works
The scorer extracts 7 independent signals from the image and combines them into a single readability score (0–1):
| Signal | What it measures | Method |
|---|---|---|
| Sharpness | Is the text sharp/blurry? | Laplacian variance + FFT high-freq energy |
| Contrast | Is text distinguishable from background? | RMS + Michelson contrast |
| Noise | How clean is the image? | Immerkær noise estimation |
| Text Presence | Is there text on the page? | MSER regions + Sobel edge density |
| Brightness | Is exposure appropriate? | Mean brightness + saturation analysis |
| Entropy | Is there information content? | Shannon entropy |
| Learned IQA | ML-based quality score | CLIP-IQA via pyiqa library |
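To make a few of the classical signals in the table concrete, here is a minimal NumPy sketch of Laplacian variance (sharpness), RMS and Michelson contrast, and Shannon entropy on an 8-bit grayscale array. It is not the scorer's actual implementation: the function names are illustrative, the FFT, noise, text-presence, and learned components are left out, and the mapping from these raw values to a 0–1 signal is not shown because the page does not specify it.

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour Laplacian; larger values suggest sharper edges."""
    g = gray.astype(np.float64)
    # Discrete Laplacian on interior pixels: four neighbours minus 4x the centre.
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


def rms_contrast(gray: np.ndarray) -> float:
    """Standard deviation of intensities after scaling to [0, 1]."""
    return float((gray.astype(np.float64) / 255.0).std())


def michelson_contrast(gray: np.ndarray) -> float:
    """(max - min) / (max + min); defined as 0 for an all-black image."""
    g = gray.astype(np.float64)
    lo, hi = g.min(), g.max()
    return float((hi - lo) / (hi + lo)) if hi + lo > 0 else 0.0


def shannon_entropy(gray: np.ndarray) -> float:
    """Entropy in bits of the 256-bin intensity histogram (expects uint8 input)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # skip empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())


# Synthetic test images: a uniform grey page and a 1-pixel checkerboard.
flat = np.full((32, 32), 128, dtype=np.uint8)
checker = ((np.indices((32, 32)).sum(axis=0) % 2) * 255).astype(np.uint8)

for name, img in [("flat", flat), ("checker", checker)]:
    print(name, laplacian_variance(img), rms_contrast(img),
          michelson_contrast(img), shannon_entropy(img))
```

The uniform page scores zero on every measure, while the checkerboard scores high on sharpness and contrast, which is the direction a readable, high-contrast page of text should move in.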
Calibration: Adjust the weight sliders to match your pipeline's sensitivity. For example, if your OCR handles blur well but fails on low contrast, increase the contrast weight.
Signal Weights (auto-normalized to sum to 1.0)
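One plausible reading of "auto-normalized" is that each raw slider value is divided by the sum of all slider values, and the final score is the weighted sum of the per-signal scores. The sketch below assumes exactly that; the page does not state the combination formula, so the linear form, the function names, the signal keys, and the example values are all assumptions. It also assumes every signal is already scaled to 0–1 with higher meaning better (so a high "noise" value means a clean image).

```python
def normalize_weights(weights: dict[str, float]) -> dict[str, float]:
    """Rescale raw slider values so they sum to 1.0."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: w / total for name, w in weights.items()}


def readability_score(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of per-signal scores (assumed linear combination)."""
    norm = normalize_weights(weights)
    return sum(norm[name] * signals[name] for name in norm)


# Made-up signal values for a sharp but low-contrast scan.
signals = {
    "sharpness": 0.9, "contrast": 0.2, "noise": 0.8, "text_presence": 0.9,
    "brightness": 0.8, "entropy": 0.7, "learned_iqa": 0.6,
}

equal_weights = {name: 1.0 for name in signals}
# Pipeline that fails on low contrast: raise the contrast weight.
contrast_heavy = {**equal_weights, "contrast": 3.0}

score_equal = readability_score(signals, equal_weights)
score_contrast_heavy = readability_score(signals, contrast_heavy)
```

With these made-up numbers, raising the contrast weight lowers the overall score for the low-contrast scan, which is the calibration effect described above.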
Learned IQA Metric
Upload a document to get started