Batch PDF Image Extractor

Extract images from PDF files with this free Linux script, Batch PDF Image Extractor. It works with a single PDF or with a whole batch of them. Download batch-pdf-image-extractor.sh from my GitHub repo, place the bash script in a directory that contains one or more PDFs to process, then run it to extract the images from those PDFs.

Instructions

Download batch-pdf-image-extractor.sh from GitHub.
Copy batch-pdf-image-extractor.sh to a directory that contains PDF files.
Make batch-pdf-image-extractor.sh executable. Either right-click the file and edit its properties, or run chmod +x batch-pdf-image-extractor.sh
Run the script with bash batch-pdf-image-extractor.sh, or left-click the file batch-pdf-image-extractor.sh
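
For anyone who prefers the terminal, the copy and chmod steps above can be sketched as a small helper. This is only an illustration: install_script is a hypothetical name that is not part of the script, and the paths in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only. install_script is a hypothetical helper, not part of
# batch-pdf-image-extractor.sh.

# install_script SRC DEST_DIR
# Copy the downloaded script next to the PDFs, make it executable,
# and print the path of the installed copy.
install_script() {
    local src="$1" dest_dir="$2" target
    target="$dest_dir/$(basename -- "$src")"
    cp -- "$src" "$target" || return 1
    chmod +x -- "$target" || return 1
    printf '%s\n' "$target"
}

# Usage (placeholder paths):
#   script=$(install_script ~/Downloads/batch-pdf-image-extractor.sh ~/Documents/scans)
#   cd ~/Documents/scans && bash "$script"
```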

What Will Happen

The PDF image extractor script will look for PDF files in the active directory.
Each file will be processed and any images it contains will be extracted.
Extracted images will be converted to PNG format.
Images will be moved into an ‘images’ directory.
PDF files will be moved into a ‘pdf’ directory.
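
The sequence above can be approximated in a few lines of bash. This is a minimal sketch, not the script itself: extract_and_sort is a hypothetical function name, and using ImageMagick's convert for the PNG conversion is an assumption, since the text does not say which converter the script calls. It relies on pdfimages writing files named <root>-NNN.pbm or <root>-NNN.ppm when no output format option is given.

```shell
#!/usr/bin/env bash
# Sketch only, not the author's script. Assumes pdfimages (poppler) and
# ImageMagick's convert are on PATH; extract_and_sort is a hypothetical name.

# extract_and_sort [DIR]
# Extract images from every PDF in DIR (default: current directory),
# convert them to PNG, and sort the results into images/ and pdf/.
extract_and_sort() {
    local dir="${1:-.}" pdf base img
    mkdir -p "$dir/images" "$dir/pdf"
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue      # no PDFs found: the glob stays literal
        base=$(basename -- "$pdf" .pdf)
        # Writes images/<base>-000.ppm, images/<base>-001.pbm, and so on.
        # If extraction fails, leave the PDF where it is and move on.
        pdfimages "$pdf" "$dir/images/$base" || continue
        for img in "$dir/images/$base"-*.pbm "$dir/images/$base"-*.ppm; do
            [ -e "$img" ] || continue
            # Convert to PNG, then delete the intermediate bitmap.
            convert "$img" "${img%.*}.png" && rm -- "$img"
        done
        mv -- "$pdf" "$dir/pdf/"
    done
}

# Usage: extract_and_sort /path/to/pdfs
```

A PDF that fails to extract stays in the working directory in this sketch, which makes it easy to spot the files that need manual handling.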

Configs

The script has three optional configuration variables. These control image conversion and whether processed PDF files and their images are moved or copied into subdirectories. The default settings suit most workflows, i.e., process the PDF(s), convert images to PNG, then move the files into the subdirectories ‘images’ and ‘pdfs’.

The configs are at the top of the script file:

##
#
# Configs
#
##

extensions=( tiff tif pbm ppm ) # List the output image extensions that should be converted to a different format
format='png' # State the format images with $extensions should be converted to.
organize='move' # 'copy' or 'move' all files into subdirectories organised by extension type. Leave empty for no organization.

## END Configs
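
To illustrate how settings like these might be used, here is a rough sketch. The helper names needs_conversion and organize_file are hypothetical and do not come from the script. The subdirectory is named after the file extension, following the comment on the organize variable, which is an assumption about the script's behaviour. The list uses pbm, the extension pdfimages gives to monochrome bitmaps.

```shell
#!/usr/bin/env bash
# Sketch only: one way the three settings could drive behaviour.
# needs_conversion and organize_file are hypothetical helpers.

extensions=( tiff tif pbm ppm )
format='png'
organize='move'

# needs_conversion FILE: succeed if FILE's extension is listed in $extensions,
# meaning it should be converted to $format.
needs_conversion() {
    local ext="${1##*.}" e
    for e in "${extensions[@]}"; do
        [ "$e" = "$ext" ] && return 0
    done
    return 1
}

# organize_file FILE: copy or move FILE into a subdirectory named after
# its extension, or do nothing when $organize is empty.
organize_file() {
    local file="$1" name ext dir
    name=$(basename -- "$file")
    ext="${name##*.}"
    dir="$(dirname -- "$file")/$ext"
    case "$organize" in
        move) mkdir -p "$dir" && mv -- "$file" "$dir/" ;;
        copy) mkdir -p "$dir" && cp -- "$file" "$dir/" ;;
        '')   return 0 ;;
        *)    echo "unknown organize mode: $organize" >&2; return 1 ;;
    esac
}
```

With organize='copy' the originals stay where they are, which is the safer choice when trying the script on files that have no backup.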

Requirements

Batch PDF Image Extractor requires pdfimages to be installed. The script will check for the pdfimages program and prompt for its installation if not found.
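
A dependency check of this kind is usually built on the shell's command -v builtin. The sketch below shows the general idea and is not copied from the script; require_cmd is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch only: a generic dependency check. require_cmd is a hypothetical helper.

# require_cmd NAME [PACKAGE]
# Succeed if NAME is on PATH; otherwise print an install hint and fail.
require_cmd() {
    local cmd="$1" pkg="${2:-$1}"
    if command -v "$cmd" >/dev/null 2>&1; then
        return 0
    fi
    echo "$cmd not found. Install the '$pkg' package and run the script again." >&2
    return 1
}

# Usage: pdfimages ships in the poppler-utils package on Debian and Ubuntu;
# other distributions may use a different package name.
#   require_cmd pdfimages poppler-utils || exit 1
```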

Known Issues

Any issues with pdfimages will also be evident in Batch PDF Image Extractor.

Images fail to extract from some PDF files. I don’t know why yet and will look into it when I get time. In my experience, if one image fails to extract from a PDF file then no images will extract from that file. As a workaround, I use GIMP to convert the pages of these PDFs to images, then extract the images manually.

Script Author

Batch PDF Image Extractor is written, copyrighted and maintained by Lee Hodson of JournalXtra, VR51 and WP Service Masters fame.

Donations

Development takes time and skill. I offer this software for free. Donations are welcome.

Download Batch PDF Image Extractor