Page Scanning Wizard

This is a simple Python script to assist in processing book images. Features include:
Rotate (90 degree and precise)
Keystone
Rename/renumber
Crop

I use it because it's designed for rapid, "good enough" work. This is mostly a personal backup, but if it's useful to you, go ahead.

GPL v3

Download here Version 1.3.

Changes: Added Keystone button. (Use -/+ to change which way the keystone works.) Keystoning is very slow.

Version 1.2.

Changes: left/right/up/down grabbers for cropping. Shows original filename. Added warnings button.

Known issues: Precise rotation and keystone don't always apply immediately; this is a tkinter problem when a textbox is attached to a variable. If rotating by 1 degree, just set it to "1." and it'll work right.

More details:

You'll need Python, plus the Python libraries json, PIL, numpy, and tkinter. (json is part of the standard library, and PIL is provided by the Pillow package.)
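A quick way to check that the libraries are importable before launching is sketched below. This is not part of the script; the helper name missing_modules is made up for illustration.

```python
import importlib.util

# Module names taken from the requirements listed above.
REQUIRED = ["json", "PIL", "numpy", "tkinter"]


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be found in this Python install."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries were found.")
```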

The first thing to do once you open the program is to use the "Browse" button (upper-right corner) to open a file in the directory you want to work on. (Any file will do.) This loads all the files in that directory into the list on the right.

All operations are saved to a ".page_scanner_wizard.json" file located in the directory you're working in. That file is updated whenever you modify anything, so there's no need to save your work. The program does not actually process any images until you hit "Export", at which point it applies the requested operations and writes the results to an "output" subdirectory. The original files are never modified.
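The save-on-every-change idea can be sketched roughly as follows. Only the file name comes from the description above; the layout of the JSON (one entry per image filename) and the function names are assumptions made for illustration, not the script's actual format.

```python
import json
from pathlib import Path

SETTINGS_NAME = ".page_scanner_wizard.json"  # file name from the description above


def load_settings(directory):
    """Return the saved per-page settings, or an empty dict if none exist yet."""
    path = Path(directory) / SETTINGS_NAME
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def update_page(directory, filename, **changes):
    """Record changes for one page and write the settings file back at once."""
    settings = load_settings(directory)
    # Hypothetical layout: {"scan001.jpg": {"rotate": 90, ...}, ...}
    settings.setdefault(filename, {}).update(changes)
    path = Path(directory) / SETTINGS_NAME
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return settings
```

Because only this small file is rewritten on each change, the images themselves stay untouched until export.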

The upper radio buttons are for 90 degree rotation. The textbox to the right accepts arbitrary rotations. "Rotate following pages" rotates all pages following the one you're working on.

Keystone corrects for a page that was scanned at an angle. It squishes the left or right side in by a percentage. Use positive or negative amounts to change which side is squished. This function is very slow. "Keystone following pages" applies the same amount of keystoning to the following pages.
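To make the geometry concrete, here is a minimal sketch of where the four page corners could end up for a given percentage. It assumes that a positive amount squishes the left side, a negative amount squishes the right side, and that the squished edge stays vertically centred; the script itself may use a different convention. The function name is hypothetical.

```python
def keystone_corners(width, height, percent):
    """Return destination corners (UL, UR, LR, LL) after squishing one side.

    Assumption: positive percent shortens the left edge, negative the right.
    The shortened edge loses |percent| % of the height, split top and bottom.
    """
    inset = height * abs(percent) / 100.0 / 2.0
    left = inset if percent > 0 else 0.0
    right = inset if percent < 0 else 0.0
    return [
        (0.0, left),                       # upper left
        (float(width), right),             # upper right
        (float(width), height - right),    # lower right
        (0.0, height - left),              # lower left
    ]


# A 1000x2000 page with the left edge squished by 10%:
print(keystone_corners(1000, 2000, 10))
```

Corners like these describe a four-point perspective warp; resampling every pixel through such a warp is one plausible reason the operation is slow.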

Next is the page number textbox. This can accept arbitrary text; any numbers in it are treated as the page number. This is also where you choose the output format; PNG and JPG are supported. "+Page Number" is the amount added to the page number for each following page. The default is 2, which is appropriate if you scanned all the (e.g.) left pages, then all the right. Finally, once you have this set right, "Renumber following pages" renumbers the following pages.
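The renumbering can be pictured with a small sketch. It assumes the first run of digits in the text is the page number and that the surrounding text and zero padding are carried over unchanged; both function names are made up for illustration.

```python
import re


def split_page_name(name):
    """Split 'n0010' into ('n', 10, 4, ''): prefix, number, digit width, suffix."""
    match = re.search(r"\d+", name)
    if match is None:
        return None
    digits = match.group()
    return name[:match.start()], int(digits), len(digits), name[match.end():]


def renumber_following(names, current, step=2):
    """Return a copy of names where pages after `current` count up by `step`."""
    parts = split_page_name(names[current])
    if parts is None:
        return list(names)  # no number to count from, so change nothing
    prefix, number, width, suffix = parts
    result = list(names)
    for offset, index in enumerate(range(current + 1, len(names)), start=1):
        result[index] = f"{prefix}{number + offset * step:0{width}d}{suffix}"
    return result


# Left-hand pages only, so each following page is two numbers later:
print(renumber_following(["p0001", "x", "y", "z"], 0))
# → ['p0001', 'p0003', 'p0005', 'p0007']
```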

The Page Insert button is like "Renumber following pages", but it renumbers based on the previous page, ignoring the current page. If you have something that doesn't have a proper page number you can change its name then hit this button.

Cropping is controlled by the eight circles in the preview image. Just drag and drop them. Once set, you can "Crop following pages" if you want.

"Not New series" is how you keep the "following pages" buttons from applying to all pages. Click the button to set the page as a new series. All operations that apply to multiple pages will stop if they encounter a new series. E.g., if you have all the left pages, then all the right pages, and need to fix the cropping on the left pages, you can set this on the first right page to keep "Crop following pages" from overwriting the crops you set on the right pages.
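In code terms, the "following pages" buttons can be thought of as a loop that stops at the first page flagged as a new series. The sketch below assumes each page is a dict of settings with a hypothetical "new_series" flag; the script's real data layout may differ.

```python
def apply_to_following(pages, current, key):
    """Copy pages[current][key] to later pages, stopping at a new series.

    Returns the number of pages that were changed.
    """
    value = pages[current][key]
    changed = 0
    for page in pages[current + 1:]:
        if page.get("new_series"):
            break  # leave this page and everything after it untouched
        page[key] = value
        changed += 1
    return changed


pages = [
    {"crop": (0, 0, 100, 100)},                     # first left page, just fixed
    {"crop": None},                                 # another left page
    {"crop": (5, 5, 90, 90), "new_series": True},   # first right page
    {"crop": (5, 5, 90, 90)},                       # another right page
]
print(apply_to_following(pages, 0, "crop"))  # → 1
```

Only the second left page picks up the new crop; the right pages keep theirs.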

Export does the actual work of rotating and cropping. Again, this does not modify the original files. This may take a while, especially if you're exporting PNGs. Once started, it changes to show progress.

The program does not keep the actual images in memory when editing or exporting, so I don't think there are any realistic limits on the size of the directory it can process.

"Show warnings" prints any issues to the console. Right now it detects repeated filenames and missing pages. Missing pages may go undetected if you have multiple page number series: if you have an n0010 page but no m0010 page, it won't detect the missing m0010 page, because it will see the n0010 page.
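A rough sketch of checks like these, including the blind spot with multiple series, is shown below. It assumes page numbers are pooled across all names regardless of prefix, which matches the behaviour described but is not taken from the script's source; the function name is hypothetical.

```python
import re
from collections import Counter


def find_warnings(names):
    """Return (repeated names, missing page numbers) for a list of output names."""
    repeated = sorted(name for name, count in Counter(names).items() if count > 1)
    numbers = set()
    for name in names:
        match = re.search(r"\d+", name)  # first run of digits is the page number
        if match:
            numbers.add(int(match.group()))
    missing = []
    if numbers:
        missing = [n for n in range(min(numbers), max(numbers) + 1) if n not in numbers]
    return repeated, missing


print(find_warnings(["p0001", "p0001", "p0003"]))
# → (['p0001'], [2])

# m0010 is absent, but n0010 supplies the number 10, so nothing is reported:
print(find_warnings(["n0009", "n0010", "n0011", "m0009", "m0011"]))
# → ([], [])
```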


Version 1.1.

Changes: minor bugfixes, and the "Page Insert" button.

Version 1.0