SUMMARY

This is a collection of tools that helps me digitize books.

In particular, it helps assemble a bunch of unordered page scans into a book
with the correct page order, mainly by using OCR and text (number) recognition.

I use it to prepare my book releases on torrents.

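As an illustration of the page-ordering idea described in the summary, here
is a minimal sketch. It is not the code from this collection. It assumes the
OCR text of each scan is already available as a string (for example from
pytesseract.image_to_string), and the function names, the regular expression
and the "number on the first or last line" heuristic are all hypothetical.

```python
import re
from typing import Dict, List, Optional

# Hypothetical heuristic: a page number is a line consisting only of
# 1-4 digits, optionally wrapped in dashes (e.g. "- 12 -").
PAGE_NUMBER_LINE = re.compile(r"^\s*-?\s*(\d{1,4})\s*-?\s*$")


def guess_page_number(ocr_text: str) -> Optional[int]:
    """Return the page number found on the last or first non-empty line."""
    lines = [line for line in ocr_text.splitlines() if line.strip()]
    if not lines:
        return None
    # Page numbers usually sit in the footer or header, so check the
    # last line first, then the first line.
    for candidate in (lines[-1], lines[0]):
        match = PAGE_NUMBER_LINE.match(candidate)
        if match:
            return int(match.group(1))
    return None


def order_scans(ocr_by_file: Dict[str, str]) -> List[str]:
    """Sort scan names by recognized page number; unnumbered scans go last."""
    numbered = []
    unnumbered = []
    for name, text in ocr_by_file.items():
        page = guess_page_number(text)
        if page is None:
            unnumbered.append(name)
        else:
            numbered.append((page, name))
    return [name for _, name in sorted(numbered)] + sorted(unnumbered)
```

Scans without a recognizable number are kept at the end so they can be
placed by hand instead of being silently dropped.
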
SYSTEM REQUIREMENTS

In theory, it should work on any system that supports Python 3.9+ and has the
required dependencies, but it might need some minor modifications to the code.

Tested only on FreeBSD 13.

DEPENDENCIES

System utilities:

- tesseract
- pdftoppm

Python packages:

- pytesseract
- Pillow

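To check whether the requirements listed above are present, something like
the following sketch may help. It is not part of this collection, and the
function name is hypothetical. It only tests that the executables are on
PATH and that the modules can be found. It does not verify versions other
than that of Python itself.

```python
import importlib.util
import shutil
import sys
from typing import List

SYSTEM_UTILITIES = ["tesseract", "pdftoppm"]
# Import names; note that the Pillow package is imported as "PIL".
PYTHON_MODULES = ["pytesseract", "PIL"]
MIN_PYTHON = (3, 9)


def find_missing_dependencies() -> List[str]:
    """Return the names of requirements that could not be found."""
    missing = []
    if sys.version_info < MIN_PYTHON:
        missing.append("Python 3.9+")
    for tool in SYSTEM_UTILITIES:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            missing.append(tool)
    for module in PYTHON_MODULES:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(module)
    return missing


if __name__ == "__main__":
    missing = find_missing_dependencies()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```
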
AUTHORS

rootless (c) 2023


LICENSE

BSD-2-Clause