
Jason White <jason@jasonjgw.net> writes:
> How can I most effectively compress scanned page images in PDF files without unduly degrading the visual quality?
Do you have access to the documents that built the PDF, i.e. the foo.tex and foo-1.jpg? If so it should be easy -- just deal with the images before they enter the PDF: jpegoptim -m75, pngcrush, etc. I don't know how best to compress embedded vector images -- they're usually embedded as PDF (instead of EPS), but I guess you would do path simplification on the source in inkscape or whatever...
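To make that concrete, here is a sketch of the "fix the sources, then rebuild" approach. Filenames are hypothetical, and the quality/effort settings are just starting points to experiment with:

```shell
# Cap JPEG quality at 75 and strip metadata (jpegoptim edits in place).
jpegoptim -m75 --strip-all foo-1.jpg

# pngcrush writes an optimized copy; -brute tries many filter/strategy combos.
pngcrush -brute foo-2.png foo-2-crushed.png

# Then rebuild the PDF from the slimmed-down sources.
pdflatex foo.tex
```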
> I've been trying it with ImageMagick and a file that I want to compress, but so far without achieving much compression.
AFAICT ImageMagick operates on PDFs by calling gs (Ghostscript) to do all the work. You could ask #imagemagick on freenode.
> The ImageMagick identify command, applied to one of the original pages, shows:
>   PDF 595x842 595x842+0+0 16-bit Bilevel DirectClass 63.2KB 0.000u 0:00.009
> I've been experimenting a little with the -compress, -density and -quality options of the convert command, but without as much progress as I would prefer. In most cases the output is larger than the input.
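Given that identify reports the pages as Bilevel, one combination worth trying is Group4 (CCITT fax) compression, which is designed for 1-bit scans. A sketch, with placeholder filenames and a rasterization density you'd want to tune:

```shell
# -density sets the dpi at which the PDF pages are rasterized;
# -monochrome forces 1-bit output; Group4 is the fax codec for bilevel data.
convert -density 300 in.pdf -monochrome -compress Group4 out.pdf
```

Note that too low a -density will make the text fuzzy, and too high will bloat the file again, so it pays to bisect on a single page first.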
I don't think ImageMagick is the best tool for this. However, I did recently have success improving scanned receipts (which the scanner gave as JPEG-in-a-PDF) by extracting the pages with pdftoimage, then using ImageMagick to reduce each image to a quarter of its size and convert it to a monochrome PNG. Don't forget +repage when you resize.
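The workflow above might look something like this. I'm assuming poppler's pdftoppm as the extraction step (that, or pdfimages, is presumably what "pdftoimage" refers to); filenames and the 25% resize factor are illustrative:

```shell
# Rasterize each page to a PNG: page-1.png, page-2.png, ...
pdftoppm -r 300 -png scan.pdf page

# Shrink to a quarter size, reset the canvas (+repage), force monochrome.
for f in page-*.png; do
  convert "$f" -resize 25% +repage -monochrome "small-$f"
done

# Reassemble into a PDF, using Group4 compression for the bilevel images.
convert small-page-*.png -compress Group4 out.pdf
```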