Reducing PDF file size is a common task, especially when large documents have to be saved, shared, or uploaded. For example, some e-filing courts in the United States require PDF files to be smaller than 35 MB. In the healthcare industry, electronic medical records are typically kept below a limit of 10 MB to 20 MB for easier transmission and storage.
In this article, we will explore how to reduce a PDF file size in Python with three tools: Spire.PDF for Python, Aspose.PDF for Python, and PyPDF2. Because each approach is a short script, you can also apply it to many files in batch.
Spire.PDF for Python: Reduce a PDF File Size Efficiently
The first tool we will check out is Spire.PDF for Python, a library for handling PDF documents. It lets you perform a wide range of PDF-related tasks, including creating, editing, converting, and compressing PDF files, all within Python. Its compression API is small and easy to follow, so both beginners and experienced developers can use it to make a PDF smaller in a few lines of code. You can install Spire.PDF for Python from PyPI using the pip command: pip install Spire.Pdf
How to Make PDF Smaller: Compress Images
In general, the excessive size of PDF documents is mainly due to high-resolution images, embedded fonts, or annotations. In this guide, we will focus on optimizing images and fonts as the primary methods to compress PDF files and decrease their size effectively. In this part, we’ll discuss how to reduce PDF file size by adjusting the size of images.
Create a PdfCompressor object.
Get the optimization options through the PdfCompressor.OptimizationOptions property.
Set the image quality with the SetImageQuality() method.
Enable image resizing with SetResizeImages(True), and enable image compression with SetIsCompressImage(True).
Compress the PDF and save the resulting document as a new PDF.
Here is a code example of reducing PDF file size by setting the image quality to medium:
from spire.pdf import *
from spire.pdf.common import *
# Create a PdfCompressor object
compressor = PdfCompressor("/input/test.pdf")
# Get the compression options object
compression_options = compressor.OptimizationOptions
# Set the image quality to medium
compression_options.SetImageQuality(ImageQuality.Medium)
# Enable image resizing
compression_options.SetResizeImages(True)
# Apply compression to these images
compression_options.SetIsCompressImage(True)
# Compress the PDF file and save the result to a new file
compressor.CompressToFile("/output/SpireCompress_images.pdf")
How to Reduce a PDF File Size: Optimizing Fonts
If you find that the current PDF file is still too large, another optimization method is to adjust the embedded fonts in the document. When creating a PDF, using multiple fonts for layout and aesthetics can increase the file size. If retaining high-quality images is crucial, optimizing fonts can further reduce the document size.
Spire.PDF offers the OptimizationOptions.SetIsCompressFonts() and OptimizationOptions.SetIsUnembedFonts() methods to compress or unembed the fonts in a document. Note that an unembedded font has to be available on the reader's system, so the document may render differently where it is missing. The detailed steps are as follows.
Create a PdfCompressor instance and open the PDF document.
Access the optimization options using the PdfCompressor.OptimizationOptions property.
Enable font compression with SetIsCompressFonts(True), or font unembedding with SetIsUnembedFonts(True).
Compress the PDF and save the resulting document.
Below is the code example of compressing a PDF file by optimizing fonts:
from spire.pdf import *
from spire.pdf.common import *
# Create a PdfCompressor object and load the PDF document to compress
compressor = PdfCompressor("/input/test.pdf")
# Configure the compression options to optimize fonts in the PDF
compression_options = compressor.OptimizationOptions
# Enable font compression
compression_options.SetIsCompressFonts(True)
# Or enable font unembedding
#compression_options.SetIsUnembedFonts(True)
# Compress the PDF file and save the compressed document to file
compressor.CompressToFile("/output/SpireCompress_fonts.pdf")
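The introduction mentions batch processing, which none of the examples show directly. A minimal sketch of one way to do it follows, using only the standard library. The names compress_folder and compress_one are hypothetical helpers introduced here for illustration, not part of Spire.PDF; compress_one stands for any function that takes a source path and a destination path, such as a wrapper around the PdfCompressor calls above.

```python
from pathlib import Path
from typing import Callable, Dict, Tuple


def compress_folder(
    input_dir: str,
    output_dir: str,
    compress_one: Callable[[str, str], None],
) -> Dict[str, Tuple[int, int]]:
    """Run compress_one(src, dst) on every *.pdf file directly in input_dir.

    Returns {file name: (original size in bytes, output size in bytes)}.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    sizes = {}
    # sorted() only makes the processing order predictable
    for src in sorted(Path(input_dir).glob("*.pdf")):
        dst = out / src.name
        compress_one(str(src), str(dst))
        sizes[src.name] = (src.stat().st_size, dst.stat().st_size)
    return sizes


# Example wiring (assumes Spire.PDF is installed; the calls are the ones
# used in the font example above):
#
# def spire_compress(src, dst):
#     compressor = PdfCompressor(src)
#     compressor.OptimizationOptions.SetIsCompressFonts(True)
#     compressor.CompressToFile(dst)
#
# sizes = compress_folder("/input", "/output", spire_compress)
```

Passing the compression step in as a function keeps the loop independent of the library, so the same helper could be reused with the other tools in this article. The "*.pdf" pattern does not descend into subfolders and, on case-sensitive file systems, does not match names such as REPORT.PDF.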
Aspose.PDF: How to Reduce a PDF File Size by Removing Embedded Fonts
The second Python PDF size reducer that we will explore is Aspose.PDF for Python. It enables users to handle PDF documents without needing MS Office or Adobe Acrobat Automation. You can install Aspose.PDF for Python with the command: pip install aspose-pdf
Based on testing, although Aspose offers direct PDF file compression and image optimization methods, they don't always guarantee a reduction in file size and may even result in a larger file. Therefore, this article will guide you through using Aspose to remove embedded fonts as an effective way to reduce a PDF file size.
Import modules that are required in this task.
Open the PDF document from the file path.
Create an OptimizationOptions object and enable font unembedding by setting its unembed_fonts attribute to True.
Optimize the document with these options to reduce the size of the PDF.
Save the document as a new file.
Compare the size before and after compressing and print it out.
Here is the code example of shrinking the size of a PDF by removing embedded fonts:
import aspose.pdf as ap
import os
# Open document
document = ap.Document("/input/test.pdf")
# Set unembedFonts option
optimizeOptions = ap.optimization.OptimizationOptions()
optimizeOptions.unembed_fonts = True
# Optimize PDF document using OptimizationOptions
document.optimize_resources(optimizeOptions)
# Save the updated document
document.save("/output/compresspdf_fonts.pdf")
# Compare the size of the original file and the compressed one
file_stats_1 = os.stat("/input/test.pdf")
file_stats_2 = os.stat("/output/compresspdf_fonts.pdf")
print(
    "Original file size: {}. Reduced file size: {}".format(
        file_stats_1.st_size, file_stats_2.st_size
    )
)
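Since an optimization pass is not guaranteed to shrink a file, the size comparison above can be turned into a small safeguard. The sketch below uses only the standard library; keep_smaller is a hypothetical helper name introduced here, not an Aspose API. It reports the percentage saved and falls back to a copy of the original whenever the output is not actually smaller.

```python
import os
import shutil


def keep_smaller(original_path: str, compressed_path: str) -> float:
    """Make sure compressed_path is never larger than the original.

    If the compressed file is not smaller, it is overwritten with a copy
    of the original. Returns the percentage saved (0.0 if the original
    was kept). The two paths must point to different files.
    """
    original_size = os.path.getsize(original_path)
    compressed_size = os.path.getsize(compressed_path)

    if original_size == 0 or compressed_size >= original_size:
        # No gain: keep the original content under the output name
        shutil.copyfile(original_path, compressed_path)
        return 0.0

    return 100.0 * (original_size - compressed_size) / original_size
```

For the example above, this could be called as keep_smaller("/input/test.pdf", "/output/compresspdf_fonts.pdf") right after document.save().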
PyPDF2: How to Compress PDF Quickly
PyPDF2 is another useful Python PDF size reducer that allows you to directly reduce a PDF file size with ease. Note that PyPDF2 3.0.x is the last release under that name and development continues in the pypdf package; the example below targets PyPDF2 as published. To install it, use the command: pip install PyPDF2
If you are not a super-user (a system administrator/root), you can also install PyPDF2 for your current user only: pip install --user PyPDF2
Check the steps and the example below, and get your work done.
Import the needed modules.
Create a PdfReader object and read the PDF file to be compressed.
Create an object of the PdfWriter class.
Loop through pages in the document.
Compress the content on the page using the page.compress_content_streams() method.
Add compressed pages to the PdfWriter.
Save the resulting document to the disk.
Below is the code example of reducing a PDF file size directly with PyPDF2:
from PyPDF2 import PdfReader, PdfWriter
# Create a PdfReader object to read the existing PDF
reader = PdfReader("/input/test.pdf")
# Create a PdfWriter object to write the modified PDF
writer = PdfWriter()
# Iterate through each page in the PDF document
for page in reader.pages:
    # Compress the content of the page (this process is CPU intensive)
    page.compress_content_streams()
    # Add the compressed page to the PdfWriter
    writer.add_page(page)
# Save the modified PDF to a new file
output_file = "/output/compresspdf.pdf"
with open(output_file, "wb") as f:
    writer.write(f)
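The PyPDF2 documentation describes compress_content_streams() as applying the FlateDecode filter to a page's content streams, which is lossless zlib (DEFLATE) compression. The sketch below illustrates that idea with the standard zlib module only; it does not call PyPDF2, and the sample bytes are a made-up content stream rather than data from a real PDF.

```python
import zlib

# Illustrative, made-up page content: PDF drawing operators are plain text
# and often repetitive, which is the kind of data DEFLATE handles well.
raw_stream = b"BT /F1 12 Tf 72 720 Td (Hello, PDF) Tj ET\n" * 200

# Compress at the highest level, then restore to confirm nothing is lost
compressed = zlib.compress(raw_stream, 9)
restored = zlib.decompress(compressed)

print(len(raw_stream), "bytes before,", len(compressed), "bytes after")
print("lossless:", restored == raw_stream)

# Running the compressor again over already-compressed data gains little
second_pass = zlib.compress(compressed, 9)
print("second pass:", len(second_pass), "bytes")
```

This also suggests why results vary between documents: pages whose streams are already compressed, or files dominated by images, leave little for this method to remove.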
Conclusion
In this article, we've explored three tools for reducing a PDF file size in Python. We began with Spire.PDF for Python, focusing on how to compress images and fonts to achieve smaller PDF files. Then, we used Aspose.PDF to shrink a document by removing embedded fonts. Lastly, we looked at PyPDF2 and its quick approach of compressing page content streams. By utilizing these tools and techniques, you can make a PDF smaller with only a few lines of code.