Extract images from PDF files with this free Linux Batch PDF Image Extractor script. It works with single or multiple PDF files and is easy to use: download batch-pdf-image-extractor.sh from my GitHub repo, place the bash script in a directory that contains one or more PDFs to process, then run the script to extract images from the PDF docs.
Instructions
- Download batch-pdf-image-extractor.sh from GitHub.
- Copy batch-pdf-image-extractor.sh to a directory that contains PDF files
- Make batch-pdf-image-extractor.sh executable. Either right-click to edit file properties or run
chmod +x batch-pdf-image-extractor.sh
- Run the script with
bash batch-pdf-image-extractor.sh
or just left-click the file batch-pdf-image-extractor.sh
What Will Happen
- The PDF image extractor script will look for PDF files in the active directory
- Each file will be processed and any images it contains will be extracted
- Extracted images will be converted to PNG format
- Images will be moved into an ‘images’ directory
- PDF files will be moved into a ‘pdf’ directory
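The steps above can be sketched roughly as follows. This is not the code of batch-pdf-image-extractor.sh, only a minimal illustration of the same workflow. It assumes pdfimages (Poppler) and ImageMagick's convert are installed; the use of ImageMagick for the PNG conversion and the function name extract_and_sort are assumptions made for the example.

```shell
# Illustrative sketch only, not the script's actual code.
# Assumes pdfimages (Poppler) and ImageMagick's convert are on the PATH.

# Hypothetical helper. The body runs in a subshell, so its variables stay local.
extract_and_sort() (
  dir="${1:-.}"
  for pdf in "$dir"/*.pdf; do
    [ -e "$pdf" ] || continue            # the glob matched nothing
    base="$(basename "$pdf" .pdf)"
    mkdir -p "$dir/images" "$dir/pdf"
    # By default pdfimages writes <base>-NNN.ppm or <base>-NNN.pbm files
    pdfimages "$pdf" "$dir/images/$base" || continue
    # Convert the raw output to PNG, then remove the originals
    for img in "$dir/images/$base"-*.ppm "$dir/images/$base"-*.pbm; do
      [ -e "$img" ] || continue
      convert "$img" "${img%.*}.png" && rm -- "$img"
    done
    # Move the processed PDF out of the way
    mv -- "$pdf" "$dir/pdf/"
  done
)
```

Calling extract_and_sort with no argument would process the current directory. A PDF that pdfimages cannot read is left where it is.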
Configs
The script has three optional configuration variables. These control which extracted image types are converted, the format they are converted to, and whether processed PDF files and their images are moved or copied into subdirectories. The default settings suit most workflows, i.e. process the PDF(s), convert images to PNG, then move files to the subdirectories ‘images’ and ‘pdf’.
The configs are at the top of the script file:
##
#
# Configs
#
##

extensions=( tiff tif pbm ppm ) # List the output image extensions that should be converted to a different format
format='png' # State the format images with $extensions should be converted to.
organize='move' # 'copy' or 'move' all files into subdirectories organised by extension type. Leave empty for no organization.

## END Configs
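To make the organize setting more concrete, here is a small sketch of how a value of 'move', 'copy' or empty could drive the final filing step. It is an illustration under assumptions, not the script's own logic: the function name organize_file is hypothetical, and the subdirectory is simply named after the file's extension, as the config comment describes.

```shell
organize='move'   # 'copy', 'move', or empty for no organization

# Hypothetical helper: file "$1" goes into a subdirectory named after its extension
organize_file() {
  file="$1"
  ext="${file##*.}"                      # text after the last dot, e.g. png
  target="$(dirname "$file")/$ext"
  case "$organize" in
    move) mkdir -p "$target" && mv -- "$file" "$target/" ;;
    copy) mkdir -p "$target" && cp -- "$file" "$target/" ;;
    '')   : ;;                           # empty setting: leave the file in place
    *)    echo "Unknown organize value: $organize" >&2; return 1 ;;
  esac
}
```

With organize='move', calling organize_file on report-000.png would place it in a png subdirectory next to where it was. With organize left empty, nothing is moved.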
Requirements
Batch PDF Image Extractor requires pdfimages, which is part of the Poppler utilities (the poppler-utils package on Debian and Ubuntu). The script will check for the pdfimages program and prompt for its installation if it is not found.
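A dependency check of this kind is usually a few lines of shell. The sketch below shows one common way to do it with command -v; it is not copied from the script, and the function name require_program is hypothetical.

```shell
# Hypothetical helper: succeed only when the named program is on the PATH
require_program() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "$1 is required but was not found. Please install it and run again." >&2
    return 1
  fi
}

# Usage: stop early when pdfimages is missing
# require_program pdfimages || exit 1
```

The usage line is left commented out so the snippet can be pasted into a terminal without closing the session when pdfimages is absent.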
Known Issues
Any issues with pdfimages will also be evident in Batch PDF Image Extractor.
Images fail to extract from some PDF files. I don’t yet know why and will look into it when I get time. In my experience, if one image fails to extract from a PDF file then no images will extract from that file. As a workaround, I use GIMP to convert the pages of these PDFs to images, then extract the images manually.
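One possible way to narrow down a failing PDF is to ask pdfimages what it can see before extracting anything. The sketch below is a diagnostic idea, not part of the script, and the function name count_embedded_images is hypothetical. It relies on the -list option of Poppler's pdfimages, which prints two header lines followed by one line per embedded image.

```shell
# Hypothetical helper: count the embedded images pdfimages reports for one PDF.
# The first two lines of "pdfimages -list" output are a header and a ruler.
count_embedded_images() {
  pdfimages -list "$1" | tail -n +3 | wc -l | tr -d ' '
}

# Usage:
# count_embedded_images problem.pdf
```

A count of 0 may indicate that the pages contain vector graphics or text rather than embedded raster images, which pdfimages does not extract. It does not explain every failure.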
Script Author
Batch PDF Image Extractor is written, copyrighted and maintained by Lee Hodson of JournalXtra, VR51 and WP Service Masters fame.
Donations
Development takes time and skill. I offer this software for free. Donations are welcome.