Scanning multi-page documents without duplex mode (but with a feeder scanner) on Linux (cross-merging PDF files)
typedcode
Posted on January 16, 2022
If one wants to digitise multi-page documents but only has a feeder scanner that does not support duplex scanning, this post is for you.
Small documents that are printed on both sides are a pain to scan into one single file, but it is possible.
For example, one could scan each page individually and combine everything using pdfunite, pdftk, or any other tool that supports joining PDF documents.
If the document has 20 or more pages, this method is very time-consuming.
If one has a feeder scanner, here is what one could do:
Terminology
In this tutorial, A refers to the front side of a sheet and B to the back side. The number in front of the letter is the sheet number.
For example, 4B is the back side of the 4th sheet (= page 8).
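The relation between a sheet label and its page number can be written down as a small shell helper. This is only an illustrative sketch: the function name label_to_page is hypothetical and not part of the original script. It assumes labels of the form "number followed by A or B".

```shell
# Hypothetical helper: convert a sheet label such as "4B" into a page number.
# Front side A of sheet k is page 2k-1, back side B of sheet k is page 2k.
label_to_page() {
  sheet=${1%?}          # everything except the last character, e.g. "4"
  side=${1#"$sheet"}    # the last character, "A" or "B"
  case $side in
    A) echo $((2 * sheet - 1)) ;;
    B) echo $((2 * sheet)) ;;
    *) echo "unknown side: $side" >&2; return 1 ;;
  esac
}

label_to_page 4B   # prints 8
label_to_page 1A   # prints 1
```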
HowTo
- One puts all the sheets into the feeder scanner and scans the A side of every sheet. The result will be a file with the pages 1A, 2A, 3A, ..., nA. We will call this file front_pages.pdf
- Now one turns the stack of sheets over so that the last sheet is on top, puts the stack back into the feeder scanner, and scans the back side of every sheet. The result will be a file containing the B side of every sheet in reverse order: nB, (n-1)B, (n-2)B, ..., 3B, 2B, 1B. We will call this file back_pages_reverse.pdf
The last thing to do is to merge the pages. Because back_pages_reverse.pdf is in reverse order, the two files must be interleaved like this:
- front_pages[1]
- back_pages_reverse[n]
- front_pages[2]
- back_pages_reverse[n-1]
...
- front_pages[n-1]
- back_pages_reverse[2]
- front_pages[n]
- back_pages_reverse[1]
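To make the pattern concrete, the interleaving order can be printed for a given number of sheets. The following is a minimal sketch, not the author's script: the function name merge_order is made up for illustration, and A and B stand for front_pages.pdf and back_pages_reverse.pdf respectively.

```shell
# Hypothetical helper: print the merge order for n sheets.
# A<i> is page i of the front scan, B<j> is page j of the reversed back scan.
merge_order() {
  n=$1
  i=1
  order=""
  while [ "$i" -le "$n" ]; do
    # Front page i is followed by its back side, found at position n-i+1
    order="$order A$i B$((n - i + 1))"
    i=$((i + 1))
  done
  echo "${order# }"   # drop the leading space
}

merge_order 3   # prints: A1 B3 A2 B2 A3 B1
```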
This can be done with a simple bash script:
Prerequisites: The script uses pdftk, so pdftk must be installed first.
Example of installing pdftk on Fedora or Debian based systems:
#fedora
sudo dnf install pdftk
#debian
sudo apt-get install pdftk
Script for creating the merged PDF file
crossmergereverse.sh
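#!/bin/bash
# Usage: crossmergereverse.sh front_pages.pdf back_pages_reverse.pdf output.pdf
# Number of pages in the front scan (the back scan is expected to have the same count)
numPages=$(pdftk "$1" dump_data | awk '/^NumberOfPages:/ {print $2}')
pages=()
for ((i = 1; i <= numPages; i++)); do
  # The back side of sheet i sits at position n-i+1 in the reversed back scan
  bindex=$((numPages - i + 1))
  pages+=("A$i" "B$bindex")
done
# Quoting the file names keeps paths with spaces intact
pdftk A="$1" B="$2" cat "${pages[@]}" output "$3"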
Running the script with bash (it uses bash-only syntax, so plain sh may fail on systems where sh is not bash) like this:
bash crossmergereverse.sh front_pages.pdf back_pages_reverse.pdf completeDocument.pdf
will create a new document completeDocument.pdf with all the pages in the correct order: 1A, 1B, 2A, 2B, ..., nA, nB
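The merge only works if both scans contain the same number of pages, which is easy to get wrong if the feeder pulls two sheets at once. A possible check before merging is sketched below. The function names page_count and check_same_page_count are hypothetical; the sketch relies only on the NumberOfPages line that pdftk dump_data prints.

```shell
# Hypothetical helper: print the page count of a PDF using pdftk.
page_count() {
  pdftk "$1" dump_data | awk '/^NumberOfPages:/ {print $2}'
}

# Hypothetical helper: fail with a message if the two scans differ in length.
check_same_page_count() {
  front=$(page_count "$1")
  back=$(page_count "$2")
  if [ "$front" != "$back" ]; then
    echo "page count mismatch: $1 has $front, $2 has $back" >&2
    return 1
  fi
}
```

One could call check_same_page_count front_pages.pdf back_pages_reverse.pdf and run the merge only if it succeeds.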
The script crossmergereverse.sh can be found here. The repository also contains a script for a simple cross-merge of PDF documents.
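As an alternative sketch, pdftk also has a shuffle operation that interleaves page ranges, and a range such as Bend-1 walks a file from its last page to its first. Assuming the installed pdftk build supports shuffle, both merge variants fit in one line each. The function names below are made up for illustration and are not the scripts from the repository.

```shell
# Sketch: interleave the front pages with the reversed back pages.
# "A Bend-1" takes A in order and B from its last page down to its first.
crossmerge_reverse_shuffle() {
  pdftk A="$1" B="$2" shuffle A Bend-1 output "$3"
}

# Sketch: simple cross-merge for two files that are both in ascending order.
crossmerge_shuffle() {
  pdftk A="$1" B="$2" shuffle A B output "$3"
}
```

For the files from this tutorial the call would be crossmerge_reverse_shuffle front_pages.pdf back_pages_reverse.pdf completeDocument.pdf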
Why back_pages_reverse.pdf?
The script for merging the documents would be simpler if there were two documents ordered 1A, 2A, ..., nA and 1B, 2B, ..., nB. But to achieve that, one would have to reverse the order of the entire stack by hand before scanning the back sides. Flipping the stack once and letting the script undo the reversal means less manual work.