2023-10-10 02:41:47 +03:00

SUMMARY
This is a collection of tools that helps me digitize books.
In particular, it helps assemble a pile of unordered page scans into a book
with the correct page order, mainly by using OCR and text (page number)
recognition.
I use it to prepare my book releases for torrents.
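
The ordering idea can be illustrated with a minimal sketch. This is not the
actual code of these tools. The function names are hypothetical, and it
assumes that the page number stands alone on the first or last non-empty
line of the OCR output (a running header or footer). Scans without a
recognizable number are kept at the end, in their original order, for manual
placement.

```python
import re


def guess_page_number(text):
    """Return the page number found in OCR text, or None.

    Assumption: the number is alone on the first or last non-empty line,
    optionally wrapped in a few non-digit characters such as "- 12 -".
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return None
    # Check the footer first, then the header.
    for candidate in (lines[-1], lines[0]):
        match = re.fullmatch(r"\D{0,3}(\d{1,4})\D{0,3}", candidate)
        if match:
            return int(match.group(1))
    return None


def order_pages(ocr_by_file):
    """Sort scan names by recognized page number; unnumbered scans go last."""
    numbered, unnumbered = [], []
    for name, text in ocr_by_file.items():
        number = guess_page_number(text)
        if number is None:
            unnumbered.append(name)
        else:
            numbered.append((number, name))
    numbered.sort()
    return [name for _, name in numbered] + unnumbered


def ocr_scans(paths):
    """OCR each scan with pytesseract (needs the tesseract binary installed)."""
    # Imported lazily so the pure helpers above work without these packages.
    import pytesseract
    from PIL import Image

    return {str(p): pytesseract.image_to_string(Image.open(p)) for p in paths}
```

Typical use would be order_pages(ocr_scans(paths)), where paths is a list of
image files. Real scans usually need more care than this (roman numerals,
misread digits, missing numbers on title pages), so the result is best
treated as a first guess to review by hand.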

SYSTEM REQUIREMENTS
In theory it should work on any system that supports Python 3.9+ and has the
required dependencies installed, though the code might need minor
modifications.
Tested only on FreeBSD 13.

DEPENDENCIES
System utilities:
- tesseract
- pdftoppm
Python packages:
- pytesseract
- Pillow
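
Whether the dependencies listed above are present can be checked with a
small script. This is a sketch, not part of the tools themselves, and the
function name missing_dependencies is hypothetical. It only relies on the
Python standard library. Note that the Pillow package is imported under the
module name PIL.

```python
import importlib.util
import shutil

REQUIRED_TOOLS = ("tesseract", "pdftoppm")
REQUIRED_MODULES = ("pytesseract", "PIL")  # Pillow is imported as PIL


def missing_dependencies(tools=REQUIRED_TOOLS, modules=REQUIRED_MODULES):
    """Return the names of tools not on PATH and modules that cannot be found."""
    missing = [tool for tool in tools if shutil.which(tool) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing
```

Calling missing_dependencies() with no arguments returns an empty list when
everything is installed, and otherwise the names that still need to be
installed.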

AUTHORS
rootless (c) 2023

LICENSE
BSD-2-Clause