Optimizing your PDF files
Portable Document Format (PDF) is a file format for representing documents in a
manner independent of the application software, hardware, and operating system
used to create them and of the output device on which they are to be displayed
or printed. A PDF document consists of a collection of objects that together
describe the appearance of one or more pages, possibly accompanied by additional
interactive elements and higher-level application data. A PDF file contains the
objects making up a PDF document along with associated structural information,
all represented as a single self-contained sequence of bytes.
PDF files are sometimes large and contain redundancies that many readers do not need, which matters most when a reader has to download the file from the Web. A PDF file whose size has been reduced in this way is often called an optimized file. PDF optimization is often overlooked when creating PDF files for the Web. Although PDFs have become quite popular on the Web, many PDFs used on web sites are designed for high-quality print output and are not optimized for the Web. Even PDFs designed for Web use can have a weight problem, weighed down with excess fonts, change histories, and unoptimized images and forms. Optimizing PDF files for the Web can significantly shrink their size and speed up their display, which saves bandwidth, reduces user frustration, and lets the files be distributed efficiently.
To optimize PDFs for minimum file size while still maintaining accessibility and search engine visibility, we can do two things. First, apply compression algorithms such as LZW, JPEG, JBIG2, Flate, RunLength, JPEG2000, and CCITT. Second, reduce the complexity of the document (for example, the number of fonts, forms, images, and multimedia elements), which ultimately determines how large the resulting PDF file will be.
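As a rough illustration of why recompressing streams helps, the following Python sketch applies zlib's deflate compression (the algorithm behind the Flate filter) to a repetitive byte stream. The sample stream and the function names are made up for illustration; this is not how Advanced PDF Tools works internally.

```python
import zlib


def flate_compress(stream: bytes, level: int = 9) -> bytes:
    """Compress a byte stream with zlib/deflate at the given level."""
    return zlib.compress(stream, level)


def flate_decompress(data: bytes) -> bytes:
    """Recover the original bytes; Flate is lossless."""
    return zlib.decompress(data)


# Hypothetical page content: the same text-drawing operators repeated.
sample_stream = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 200

packed = flate_compress(sample_stream)
ratio = len(packed) / len(sample_stream)
print(f"{len(sample_stream)} bytes -> {len(packed)} bytes ({ratio:.1%})")
```

Highly repetitive streams like this one shrink dramatically. Streams that are already compressed, such as JPEG image data, usually gain little from a second pass.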
To create small PDF files, consider the main factors: image resolution, image type (bitmap or vector), the number of fonts used and how they are embedded, the PDF version, and the level of compression.
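To see why image resolution matters so much, here is a small worked calculation in Python. It estimates the uncompressed size of a full-page RGB bitmap at print and screen resolutions. The page size and resolutions are example values chosen for illustration, and real files will be smaller once compression is applied.

```python
def uncompressed_image_bytes(width_in: float, height_in: float,
                             dpi: int, bytes_per_pixel: int = 3) -> int:
    """Raw bitmap size: pixel count times bytes per pixel (3 for 8-bit RGB)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


# A letter-size page (8.5 x 11 inches) at a print and a screen resolution.
print_size = uncompressed_image_bytes(8.5, 11, 300)
screen_size = uncompressed_image_bytes(8.5, 11, 72)

print(f"300 dpi: {print_size} bytes")
print(f" 72 dpi: {screen_size} bytes")
print(f"ratio:   {print_size / screen_size:.1f}x")
```

Because the pixel count grows with the square of the resolution, the 300 dpi bitmap is about 17 times larger than the 72 dpi one before any compression.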
Many of the PDFs we have seen are not fully optimized for the Web. We recommend two main ways to optimize an existing file:
Reducing the file size
Linearizing the PDF file for fast web view
We can use Advanced PDF Tools to remove redundant or unnecessary objects from the PDF file and recompress its data streams (a stream is a sequence of data values) to reduce the file size, and also to linearize the PDF file.
The redundant objects include:
Metadata
Metadata is general information about a PDF file, such as the document's title, author, and creation and modification dates. It is distinct from the document's content or structure and is intended to assist in cataloguing and searching for documents in external databases.
JavaScript
A JavaScript action causes a script to be compiled and executed by the JavaScript interpreter. Depending on the nature of the script, various interactive form fields in the document may update their values or change their visual appearances.
Thumbnail
A PDF document can define thumbnail images representing the contents of its pages in miniature form. A viewer application can display these images on the screen, allowing the user to navigate to a page by clicking its thumbnail image.
Embedded file
A file embedded inside a PDF is included purely for convenience and need not be directly processed by any PDF consumer application. A PDF can refer to external files through the file system or a URL to a remote location, and it can also carry the contents of a file, possibly a binary file, inside the document itself.
Bookmark
A PDF document may optionally display a document outline or bookmark on the screen, allowing the user to navigate interactively from one part of the document to another. The bookmark consists of a tree-structured hierarchy of outline items, which serve as a visual table of contents to display the document's structure to the user. The user can interactively open and close individual items by clicking them with the mouse. When an item is open, its immediate children in the hierarchy become visible on the screen; each child may in turn be open or closed, selectively revealing or hiding further parts of the hierarchy. When an item is closed, all of its descendants in the hierarchy are hidden. Clicking the text of any visible item activates the item, causing the viewer application to jump to a destination or trigger an action associated with the item.
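The open and closed behaviour described above can be sketched with a small tree in Python. This is only a conceptual model: the class and function names are hypothetical and do not correspond to the PDF file syntax or to any particular viewer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OutlineItem:
    title: str
    is_open: bool = False
    children: List["OutlineItem"] = field(default_factory=list)


def visible_titles(items: List[OutlineItem], depth: int = 0) -> List[str]:
    """Return the indented titles of the items currently shown on screen."""
    shown = []
    for item in items:
        shown.append("  " * depth + item.title)
        # Children are revealed only when their parent is open.
        if item.is_open:
            shown.extend(visible_titles(item.children, depth + 1))
    return shown


outline = [
    OutlineItem("Chapter 1", is_open=True, children=[
        OutlineItem("Section 1.1", children=[OutlineItem("Detail 1.1.1")]),
        OutlineItem("Section 1.2"),
    ]),
    OutlineItem("Chapter 2", children=[OutlineItem("Section 2.1")]),
]

for line in visible_titles(outline):
    print(line)
```

Here "Chapter 1" is open, so its two sections are listed, while "Detail 1.1.1" and "Section 2.1" stay hidden because their parents are closed.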
Comment
Users can add comments to their PDF files. These comments can be saved, summarized, and searched like the rest of the file's content.
Private data
A PDF document allows an application to embed private data for its own use. Applications associate this private data with a document, page, or form. Such private data can convey information meaningful to the application that produces it.
Named destination
A destination may be referred to indirectly by means of a name object (PDF 1.1) or a string (PDF 1.2). This capability is especially useful when the destination is located in another PDF document.
All form actions
Form actions are among the standard action types that PDF supports.
There are four special types of form actions:
Submit-form actions transmit the names and values of selected interactive form fields to a specified uniform resource locator (URL), presumably the address of a Web server that will process them and send back a response.
Reset-form actions reset selected interactive form fields to their default values.
Import-data actions import Forms Data Format (FDF) data into the document's interactive form from a specified file.
JavaScript actions (PDF 1.3) cause a script to be compiled and executed by the JavaScript interpreter.
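To make the first two action types more concrete, here is a minimal Python model of a form. It assumes the submitted data is URL-encoded, which is only one possible format, and every name in it is hypothetical. It models the idea only and is not code that runs inside a PDF viewer.

```python
from dataclasses import dataclass
from typing import Dict, Iterable
from urllib.parse import urlencode


@dataclass
class FormField:
    value: str
    default: str


def submit_form_body(fields: Dict[str, FormField], selected: Iterable[str]) -> str:
    """Submit-form: encode the selected field names and values for a web server."""
    return urlencode([(name, fields[name].value) for name in selected])


def reset_form(fields: Dict[str, FormField], selected: Iterable[str]) -> None:
    """Reset-form: restore the selected fields to their default values."""
    for name in selected:
        fields[name].value = fields[name].default


fields = {
    "name": FormField(value="Ada Lovelace", default=""),
    "country": FormField(value="UK", default="US"),
}

body = submit_form_body(fields, ["name", "country"])
reset_form(fields, ["country"])
print(body)
print(fields["country"].value)
```

Only the fields named in the selection are affected, which mirrors the "selected interactive form fields" wording in the descriptions above.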
Optimize For Fast Web View
Fast Web View restructures a PDF document for page-at-a-time downloading (byte-serving) from web servers. This option compresses text and line art, regardless of what you have selected as compression settings on the Images panel. This makes for faster access and viewing when downloading the file from the web or a network. With page-at-a-time downloading, the web server sends only the requested page, rather than the entire PDF document. This is especially important with large documents that can take a long time to download from a server.
Check with your webmaster to make sure that the web server software you use supports page-at-a-time downloading. To ensure that the PDF documents on your website appear in older browsers, you may also want to create HTML links (versus ASP scripts or the POST method) to the PDF documents and keep path names (or URLs) to the files under 256 characters.
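If you want a quick way to tell whether a file has already been saved for Fast Web View, the following Python sketch may help. It relies on the fact that a linearized PDF declares a /Linearized entry in a dictionary near the start of the file. Treat it as a heuristic: it does not validate the file, and the two byte strings are hand-written fragments, not complete PDFs.

```python
def looks_linearized(pdf_bytes: bytes) -> bool:
    """Heuristic: look for a /Linearized key in the first 1024 bytes."""
    head = pdf_bytes[:1024]
    return head.startswith(b"%PDF-") and b"/Linearized" in head


# Abridged, hand-written file beginnings used only to exercise the check.
linearized_head = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 54321 /N 3 >>\nendobj\n"
plain_head = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

print(looks_linearized(linearized_head))
print(looks_linearized(plain_head))
```

For a real file, read the first 1024 bytes in binary mode and pass them to looks_linearized.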
Using Advanced PDF Tools
Advanced PDF Tools (GUI version) provides many settings for reducing the size of Adobe PDF files. Whether you use all of these settings or only a few depends on how you intend to use the files and on the essential properties a file must have. In most cases, the default settings are appropriate for maximum efficiency: they save space by removing some embedded fonts, compressing images, and removing items from the file that are no longer needed.
Before you optimize a file, it's a good idea to audit the file's space usage to get a report of the total number of bytes used for specific document elements, including fonts, images, bookmarks, forms, named destinations, and comments, as well as the total file size. The results are reported both in bytes and as a percentage of the total file size. The space audit results may give you ideas about where best to reduce file size. It is best to experiment with various settings before making changes that can't be undone.
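The arithmetic behind such a report is simple, as the following Python sketch shows. The byte counts are invented for illustration and are not measurements from a real file, and the function name is hypothetical.

```python
from typing import Dict


def audit_report(usage_bytes: Dict[str, int]) -> Dict[str, float]:
    """Map each element type to its percentage share of the total size."""
    total = sum(usage_bytes.values())
    if total == 0:
        return {name: 0.0 for name in usage_bytes}
    return {name: 100.0 * size / total for name, size in usage_bytes.items()}


# Hypothetical byte counts per element type.
usage = {"images": 600_000, "fonts": 250_000, "bookmarks": 50_000, "other": 100_000}
report = audit_report(usage)

# List the largest contributors first.
for name, share in sorted(report.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {usage[name]:>8} bytes  {share:5.1f}%")
```

In this made-up example the images account for most of the file, so image compression settings would be the first place to look.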
You can also use the pdfcompress command line tool (part of Advanced PDF Tools Command Line) to remove redundant information. For example, you can use the command
pdfcompress -i C:\input.pdf -o C:\output.pdf
and a configuration file "compress.ini" to choose which objects to remove and whether to recompress streams, e.g.,
[data]
removemetadata=1
removejavascript=1
removethumb=1
removecomment=1
removeembeddedfile=1
removebookmarks=1
removeprivatedata=1
removenamesdestination=1
removeform=1
compressstream=1
Please read the pdfcompress command line user manual for more usage information.
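If you process many files, you may prefer to generate the configuration file from a script. The Python sketch below writes a "compress.ini" text with the keys listed above and builds the command line shown earlier. Two things are assumptions: that writing 0 switches an option off (only the value 1 appears above), and the helper names themselves. How pdfcompress locates the configuration file is not covered here, so check the user manual before relying on it.

```python
import configparser
import io
from typing import Dict, List


def build_compress_ini(options: Dict[str, bool]) -> str:
    """Render a [data] section with key=1 or key=0 lines (0 is an assumption)."""
    parser = configparser.ConfigParser()
    parser["data"] = {key: "1" if enabled else "0" for key, enabled in options.items()}
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


def build_command(input_pdf: str, output_pdf: str) -> List[str]:
    """Argument list matching the command shown above; it is not executed here."""
    return ["pdfcompress", "-i", input_pdf, "-o", output_pdf]


# Keep bookmarks and forms, remove the rest, and recompress streams.
options = {
    "removemetadata": True,
    "removejavascript": True,
    "removethumb": True,
    "removecomment": True,
    "removeembeddedfile": True,
    "removebookmarks": False,
    "removeprivatedata": True,
    "removenamesdestination": True,
    "removeform": False,
    "compressstream": True,
}

ini_text = build_compress_ini(options)
command = build_command(r"C:\input.pdf", r"C:\output.pdf")
print(ini_text)
print(" ".join(command))
```

Save ini_text to "compress.ini" and run the command with your usual tooling once you have confirmed the settings against the manual.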
Copyright © 2000-2006 by VeryPDF.com, Inc.