Detecting skew angles in scanned documents is a common but surprisingly tricky problem in document image processing. Most tools rely on spatial-domain techniques such as Hough transforms or edge-based heuristics to find the dominant text orientation. jdeskew takes a different path: it works in the frequency domain, radially projecting the Fourier magnitude spectrum and picking the angle where the projection peaks. The approach is elegant and, going by the project’s benchmarks, holds up well across noisy and varied content.
frequency-domain skew estimation with adaptive radial projection
jdeskew implements a skew estimation method based on Adaptive Radial Projection on the Fourier Magnitude Spectrum, originally published at ICIP 2022. The core idea is to transform the document image into the frequency domain using a Fourier transform. The magnitude spectrum of this transform contains orientation information about the document’s text and layout.
The method radially projects the Fourier magnitude spectrum, summing magnitudes along rays at different angles, and looks for the angle where the projection peaks. Regularly spaced text lines concentrate spectral energy along the direction perpendicular to the lines, so that peak indicates the dominant skew orientation. Spatial methods, by contrast, try to detect lines or edges directly.
Under the hood, jdeskew operates primarily on grayscale images, performs a fast Fourier transform (FFT), computes the magnitude spectrum, and then applies this radial projection to estimate skew. The output angle can then be used to rotate the image back to a corrected orientation.
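To make that pipeline concrete, here is a minimal NumPy sketch of the idea: centre-crop to a square, take the FFT magnitude, sum it along rays near the vertical frequency axis, and return the angle with the largest sum. It is not jdeskew’s implementation. The function names (`estimate_skew_fft`, `synthetic_lines`), the fixed ±15° search range, the 0.1° step, the nearest-neighbour sampling, and the sign convention are all assumptions made for illustration, and the adaptive part of the published method is left out.

```python
import numpy as np


def estimate_skew_fft(gray, angle_max=15.0, step=0.1):
    """Simplified, non-adaptive radial projection on the FFT magnitude.

    Returns the estimated tilt of the text lines in degrees. Positive means
    the lines slope downward to the right in array (row, column) coordinates;
    this sign convention is an assumption of the sketch.
    """
    h, w = gray.shape
    n = min(h, w)
    top, left = (h - n) // 2, (w - n) // 2
    # Centre-crop to a square so one frequency bin means the same on both axes.
    patch = gray[top:top + n, left:left + n].astype(float)
    patch -= patch.mean()  # remove the DC term

    # Magnitude spectrum with the zero frequency moved to the centre.
    magnitude = np.abs(np.fft.fftshift(np.fft.fft2(patch)))

    c = n // 2
    radii = np.arange(1, c)
    count = int(round(2 * angle_max / step)) + 1
    angles = np.linspace(-angle_max, angle_max, count)
    theta = np.deg2rad(angles)[:, None]

    # Horizontal text lines put their energy on the vertical frequency axis,
    # so each candidate ray is measured from that axis.
    rows = np.rint(c + radii * np.cos(theta)).astype(int)
    cols = np.rint(c - radii * np.sin(theta)).astype(int)
    scores = magnitude[rows, cols].sum(axis=1)
    return float(angles[np.argmax(scores)])


def synthetic_lines(n, skew_deg, period=16.0):
    """Hypothetical test pattern: thin bright lines tilted by skew_deg."""
    ys, xs = np.mgrid[0:n, 0:n]
    a = np.deg2rad(skew_deg)
    t = -xs * np.sin(a) + ys * np.cos(a)  # signed distance across the lines
    return np.exp(-0.5 * (((t % period) - period / 2) / 1.5) ** 2)


if __name__ == "__main__":
    page = synthetic_lines(512, skew_deg=3.0)
    print("estimated skew:", estimate_skew_fft(page))
```

Nearest-neighbour sampling keeps the sketch short, but it limits how finely angles can be separated at small radii, which is one reason larger inputs tend to give better estimates.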
The repo ships as a Python package with a minimal API focused on two main functions: get_angle for estimating the skew angle, and rotate to deskew the image accordingly. It also provides Docker and Cog (Replicate) deployment options for easier integration and scalability.
The architecture is straightforward but effective. The calculations happen mostly in NumPy and OpenCV, leveraging FFT efficiently. The repo includes Jupyter notebooks for experimentation and benchmarking, showing the method’s performance on the DISE 2021 dataset, which aggregates several document image collections like DISEC 2013 and RVL-CDIP.
technical strengths, tradeoffs, and benchmark performance
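The standout technical strength of jdeskew is its use of frequency-domain analysis for skew detection. Because the spectrum summarizes the whole page at once, the estimate tends to be less sensitive to text density, font size, or layout variations than spatial heuristics that depend on finding individual lines or edges.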
The code is surprisingly clean and focused. The radial projection implementation is concise and optimized, making the core skew detection logic easy to follow. The repo’s minimal API keeps the developer experience simple when integrating into larger OCR or document preprocessing pipelines.
One important tradeoff is input image resolution. The repo benchmarks configurations from 1024 up to 4096 pixels. Higher-resolution inputs generally yield more accurate angle estimates but cost more computation time and memory. The 3072-pixel configuration strikes the best balance, achieving an Average Error Deviation (AED) of 0.07 degrees and outperforming baselines such as LRDE-EPITA-a, which has an AED of 0.14.
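A rough back-of-envelope sketch of that tradeoff follows. It rests on assumptions that do not come from the repo: the image is resized to an n × n square, the finest angle the spectrum can distinguish is about one frequency bin at the outermost radius (n / 2 bins from the centre), and the FFT output is held as a full complex128 array. The helper name `spectrum_cost` is hypothetical.

```python
import math


def spectrum_cost(n):
    """Return (angular step in degrees, spectrum size in MiB) for an n x n image."""
    # Angle subtended by a one-bin sideways shift at radius n / 2.
    step_deg = math.degrees(math.atan(1.0 / (n / 2)))
    # A full complex128 spectrum stores 16 bytes per pixel.
    mib = n * n * 16 / 2**20
    return step_deg, mib


for n in (1024, 2048, 3072, 4096):
    step_deg, mib = spectrum_cost(n)
    print(f"{n:>4} px: ~{step_deg:.3f} deg per bin, ~{mib:.0f} MiB spectrum")
```

Under these assumptions the angular step shrinks linearly with n while memory grows quadratically, which is consistent with a mid-range setting being the practical sweet spot.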
The repo reports DISEC-style metrics on DISE 2021: AED (the average absolute angular error, in degrees), TOP80 (the same average over the best 80% of cases), CE (the share of estimates counted as correct, so higher is better), and WE (the worst error, in degrees). jdeskew’s top configuration scores AED 0.07, TOP80 0.04, CE 0.86, and WE 1.13, which are solid improvements over the baselines it is compared against.
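For readers who want to compute comparable numbers on their own data, here is a small sketch. It assumes the DISEC 2013-style reading of the metrics: AED as the mean absolute error, TOP80 as the mean over the best 80% of cases, CE as the fraction of estimates within a tolerance, and WE as the worst error. The 0.1° tolerance, the function name `skew_metrics`, and the sample angles are assumptions for illustration, not values taken from the repo.

```python
import numpy as np


def skew_metrics(predicted, truth, correct_tol=0.1):
    """Summarize skew estimation errors (all angles in degrees)."""
    err = np.abs(np.asarray(predicted, dtype=float) - np.asarray(truth, dtype=float))
    ranked = np.sort(err)
    keep = max(1, (4 * len(ranked)) // 5)  # best 80% of cases
    return {
        "AED": float(err.mean()),
        "TOP80": float(ranked[:keep].mean()),
        "CE": float((err <= correct_tol).mean()),
        "WE": float(err.max()),
    }


# Made-up predictions against a ground truth of zero skew.
predicted = [0.05, -0.02, 0.30, 0.08, 1.10]
truth = [0.0, 0.0, 0.0, 0.0, 0.0]
print(skew_metrics(predicted, truth))
```

A single bad page dominates WE and pulls AED up, while TOP80 and CE stay comparatively stable, which is why the metrics are usually read together.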
The evaluation on a comprehensive dataset underlines the method’s robustness across document types and noise conditions. However, the frequency domain approach requires careful preprocessing and parameter tuning, especially for image resolution and FFT windowing.
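On windowing: a finite image is implicitly cut off at its borders, and that sharp cut-off leaks energy along the horizontal and vertical frequency axes, which can bias a radial projection toward 0°. One common remedy is to taper the image with a 2-D Hann window before the FFT. Whether and how jdeskew windows its input is not shown here; this is a generic sketch, and `hann_windowed` is a hypothetical helper name.

```python
import numpy as np


def hann_windowed(gray):
    """Subtract the mean, then taper a 2-D image toward zero at its borders."""
    h, w = gray.shape
    # Separable 2-D Hann window built from two 1-D windows.
    window = np.outer(np.hanning(h), np.hanning(w))
    centred = gray.astype(float) - gray.mean()
    return centred * window


rng = np.random.default_rng(0)
page = rng.random((64, 48))
tapered = hann_windowed(page)
print(tapered.shape, float(np.abs(tapered[0]).max()))
```

The tapered array can be passed to the FFT in place of the raw image; the cost is that content near the page edges contributes less to the estimate.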
Another note is that the repo depends on typical Python scientific stack libraries (NumPy, OpenCV) and uses Jupyter notebooks for demos, which means it’s easy to experiment with but not necessarily optimized for embedded or mobile environments.
quick start with pip and docker
The repo provides clear installation instructions:
pip install jdeskew
For containerized environments, you can build the Docker image yourself:
# build
DOCKER_BUILDKIT=1 docker build -t jdeskew .
Together, these two options cover local development and containerized deployment.
A minimal usage example looks like this:
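import cv2
from jdeskew import get_angle, rotate
image = cv2.imread("scan.png")  # placeholder path; loads the scan as a NumPy array
angle = get_angle(image)  # estimated skew angle
deskewed_image = rotate(image, angle)  # corrected image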
This keeps the integration low friction.
who should consider jdeskew
If you’re working on document analysis pipelines, OCR preprocessing, or any task where skew correction is a bottleneck, jdeskew offers a robust alternative to traditional spatial methods. Its frequency domain approach reduces false detections caused by complex layouts or noise.
That said, it’s not a silver bullet. The method’s reliance on image resolution and FFT parameters means it requires some tuning for best results. Also, computing a full-page FFT adds overhead compared to simple heuristics, so it might not be ideal for real-time or resource-constrained environments.
The repo is well suited for researchers and engineers who want a clean, well-documented implementation of an advanced skew estimation technique. The minimal API and Docker support ease adoption, while the included benchmarks provide confidence in its accuracy.
In practice, jdeskew can slot into existing pipelines to improve deskewing quality, especially when dealing with documents that challenge edge-based methods. For anyone interested in frequency domain image processing or document analysis, it’s worth exploring.
Overall, jdeskew nails the tradeoff between accuracy and complexity in skew estimation. The code is accessible, the approach is solid, and the benchmark results speak for themselves.
→ GitHub Repo: phamquiluan/jdeskew ⭐ 166 · Jupyter Notebook