For the past month, I’ve programmed in nothing but Matlab. Kinda sad, because I don’t even care for Matlab. I prefer Python.
I was going through a store called Tuesday Morning to look at the stuff for sale. It’s a bit of a junk store. They had a DVD for sale containing every Fantastic Four comic book on PDF from 1961 to 2004 for only $15. Being a comic fan, I had to get it.
Each issue is scanned into PDF format, with every two pages of the paper issue combined into one page of the PDF. I didn’t like this, so I decided to write a Python script to cut each page of the PDF down the middle into two pages and stitch the document back together. That way, one page of a comic book equals one page of PDF.
I use the handy pyPdf library, which does all of the magic in this script. To execute the script:
$ python splitPages.py InputDocument.pdf OutputDocument.pdf
(Since the comics are intellectual property owned by Marvel, I’m not going to post screenshots.)
from pyPdf import PdfFileWriter, PdfFileReader
import sys
print "Reading", sys.argv[1]
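output = PdfFileWriter()
# Open the same file twice so the left and right halves come from separate
# page objects; cropping one must not change the other.
left = PdfFileReader(file(sys.argv[1], "rb"))
right = PdfFileReader(file(sys.argv[1], "rb"))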
left.decrypt('')
right.decrypt('')
pages = left.getNumPages()
for i in range(0, pages):
    # Grab the left page
    p = left.getPage(i)
    p.mediaBox.upperRight = (
        p.mediaBox.getUpperRight_x() / 2,
        p.mediaBox.getUpperRight_y()
    )
    output.addPage(p)
    # Grab the right page
    p = right.getPage(i)
    p.mediaBox.upperLeft = (
        p.mediaBox.getUpperRight_x() / 2,
        p.mediaBox.getUpperRight_y()
    )
    output.addPage(p)
print "Writing", sys.argv[2]
outputStream = file(sys.argv[2], "wb")
output.write(outputStream)
outputStream.close()
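
The cropping arithmetic in the loop is easier to see without any PDF library involved. Here is a minimal sketch using plain tuples. The helper name split_box and the 1224 x 792 page size are made up for illustration and are not part of pyPdf. One difference from the script: the script halves the upper-right x, which assumes the page's lower-left corner sits at x = 0. This sketch takes the midpoint between the left and right edges, so it should also cope with a box that starts somewhere else.

```python
def split_box(llx, lly, urx, ury):
    """Split one two-page spread into a left box and a right box.

    Each box is a (llx, lly, urx, ury) tuple in PDF points:
    lower-left x, lower-left y, upper-right x, upper-right y.
    """
    # Midpoint between the left and right edges of the spread.
    mid_x = (llx + urx) / 2.0
    left_box = (llx, lly, mid_x, ury)    # keep everything left of the middle
    right_box = (mid_x, lly, urx, ury)   # keep everything right of the middle
    return left_box, right_box


# Hypothetical spread: two US Letter pages side by side (1224 x 792 points).
left_box, right_box = split_box(0, 0, 1224, 792)
print(left_box)
print(right_box)
```

The left box keeps its lower-left corner and pulls the right edge in to the middle. The right box keeps its upper-right corner and pushes the left edge out to the middle. That is what assigning mediaBox.upperRight and mediaBox.upperLeft does in the script.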