
On 25/10/12 18:33, Tim Connors wrote:
On Tue, 16 Oct 2012, Jason White wrote:
How can I most effectively compress scanned page images in PDF files without unduly degrading the visual quality?
I've been trying ImageMagick on a file that I want to compress, but so far without achieving much compression.
The ImageMagick identify command, applied to one of the original pages, shows:

  PDF 595x842 595x842+0+0 16-bit Bilevel DirectClass 63.2KB 0.000u 0:00.009
I've been experimenting a little with the -compress, -density and -quality options of the convert command, but without as much progress as I would prefer. In most cases the output is larger than the input.
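Since identify reports the pages as Bilevel, CCITT Group 4 (fax) compression may be worth a try - it usually does much better than JPEG or Flate on black-and-white text. Something like this, untested here, with page.pdf and page-g4.pdf as placeholder filenames:

  # rasterise at 300dpi, force 1-bit output, recompress with Group 4
  convert -density 300 page.pdf -monochrome -compress Group4 page-g4.pdf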
It has been forever and a half, but astro-ph (and other arXiv preprint servers) only ever accepted submissions on the order of ~1MB. This really sucks when you have 10 diagrams of 200,000 particles each. I believe my way around it was to use jpeg2ps to generate PostScript images with embedded JPEGs, since the PostScript protocol mandates a JPEG decoder.
But I'm not sure my memory and my poorly documented ("This program does... (author has been too lazy to update this)") shell scripts serve me correctly, as I had been thinking this was for PDFs and not PostScript images.
Seems it's not in Debian, but Google returns some promising links. Probably not DFSG-free - a lot of LaTeX stuff isn't.
Source is at http://www.pdflib.com/download/free-software/jpeg2ps/ and hasn't been altered since 2002.

To go from there to PDF, just use Ghostscript's ps2pdf:

  ./jpeg2ps nesrin.jpg | ps2pdf - > nesrin.pdf

Interestingly, the PDF is a little smaller than the original JPEG.

JPEG is not always the best compression for scans of text - a GIF or PNG of a black/white image may be smaller and without edge artifacts. gif2ps is part of the giflib package. libpng is the core of a number of converters - see http://www.libpng.org/pub/png/pngapcv.html
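If you want to test that claim on an actual scan, something like this would show whether a 1-bit PNG comes out smaller than the JPEG (again untested; filenames are placeholders):

  # render the page as a black/white PNG and compare file sizes
  convert -density 300 page.pdf -monochrome page.png
  ls -l page.png page.jpg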