Document management - Electronic document file format for long-term preservation - Part 1: Use of PDF 1.4 (PDF/A-1)
This standard defines a format (PDF/A) for the long-term archiving of electronic documents and is based on the PDF Reference Version 1.4 from Adobe Systems Inc. (implemented in Adobe Acrobat 5).
PDF/A is in fact a subset of PDF, leaving out PDF features not suited to long-term archiving. This is similar to the definition of the PDF/X subset for the printing and graphic arts.
In addition, the standard places requirements on software products that read PDF/A files. A "conforming reader" must follow certain rules, including adhering to color management guidelines, using embedded fonts for rendering, and making annotation content available to users.
Description
The standard does not define an archiving strategy or the goals of an archiving system. It identifies a "profile" for electronic documents that ensures the documents can be reproduced in exactly the same way in years to come. A key element of this reproducibility is the requirement for PDF/A documents to be 100% self-contained. All of the information necessary for displaying the document in the same manner every time is embedded in the file. This includes, but is not limited to, all content (text, raster images and vector graphics), fonts, and color information. A PDF/A document is not permitted to rely on information from external sources (e.g. font programs and hyperlinks).
Other key elements of PDF/A conformance include:
- Audio and video content are forbidden.
- JavaScript and executable file launches are prohibited.
- All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering. This also applies to the so-called PostScript standard fonts such as Times or Helvetica.
- Color spaces must be specified in a device-independent manner.
- Encryption is disallowed.
- Use of standards-based metadata is mandated.
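As a rough illustration of the restrictions listed above, the following Python sketch scans the raw bytes of a PDF for name tokens that usually accompany the forbidden features. The function name `suspicious_features` and the token-to-label table are assumptions made for this example. It is only a heuristic: it cannot see inside compressed streams, it can report false positives, and it is not a substitute for a real PDF/A validator.

```python
import re

# PDF name tokens that typically accompany features the list above rules out.
# The labels on the right are illustrative wording, not terms from the standard.
SUSPECT_NAMES = {
    b"/Encrypt": "encryption",
    b"/JavaScript": "JavaScript action",
    b"/JS": "JavaScript action",
    b"/Launch": "launch action",
    b"/Sound": "audio content",
    b"/Movie": "video content",
}


def suspicious_features(pdf_bytes):
    """Return the set of labels whose name token appears in the raw bytes."""
    found = set()
    for name, label in SUSPECT_NAMES.items():
        # Require that the name is not followed by another alphanumeric
        # character, so that /JS does not match a longer name such as /JSON.
        if re.search(re.escape(name) + rb"(?![A-Za-z0-9])", pdf_bytes):
            found.add(label)
    return found
```

An empty result does not prove conformance; it only means none of these particular tokens were seen in the uncompressed parts of the file.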
Conformance levels and versions
The standard specifies two levels of compliance for PDF files:
- PDF/A-1a - Level A compliance in Part 1
- PDF/A-1b - Level B compliance in Part 1
PDF/A-1b has the objective of ensuring reliable reproduction of the visual appearance of the document. PDF/A-1a includes all the requirements of PDF/A-1b and additionally requires that document structure be included, with the objective of ensuring that document content can be searched and repurposed.
At the time of writing, a new version, "PDF/A-2", is under development. It is expected to be based on the PDF Reference Version 1.6.
Identification
A PDF/A document can be identified as such through PDF/A-specific metadata located in the "http://www.aiim.org/pdfa/ns/id/" namespace. However, claiming to be PDF/A and being so are not necessarily the same:
- A PDF document can be PDF/A-compliant, except for its lack of PDF/A metadata. This may happen for instance with documents that were generated before the definition of the PDF/A standard, by authors aware of features that present long-term preservation issues.
- A PDF document can be identified as PDF/A, but may incorrectly contain PDF features not allowed in PDF/A; hence, documents which claim to be PDF/A-compliant should be tested for PDF/A conformance.
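The identification step described above can be sketched in Python as follows. The sketch assumes the XMP metadata packet has already been extracted from the PDF as an XML string, and that the claim is expressed through `part` and `conformance` properties in the namespace given above. The helper names `_prop` and `claimed_pdfa_level` are hypothetical. The function reports only what the file claims, which, as noted, is not the same as being conformant.

```python
import xml.etree.ElementTree as ET

PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"  # namespace named in the text above
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def _prop(desc, local_name):
    """Read a property written either as an attribute or as a child element."""
    key = f"{{{PDFAID_NS}}}{local_name}"
    value = desc.get(key)
    if value is None:
        child = desc.find(key)
        value = child.text if child is not None else None
    return value.strip() if value else None


def claimed_pdfa_level(xmp_xml):
    """Return a string such as 'PDF/A-1b' if the metadata claims PDF/A, else None."""
    root = ET.fromstring(xmp_xml)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        part = _prop(desc, "part")
        if part:
            conformance = _prop(desc, "conformance") or ""
            return f"PDF/A-{part}{conformance.lower()}"
    return None
```

A result of `None` covers the first case in the list above (a file that may be compliant but carries no PDF/A metadata), while a non-empty result should still be followed by a conformance test.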
Drawbacks
As a PDF/A document must embed all fonts that it uses, a PDF/A file will often be bigger than an equivalent PDF file that does not have the fonts embedded. This may be undesirable when archiving large numbers of small files that all use the same fonts, since a separate copy of each font will be embedded in each file.
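The storage cost described above is simple arithmetic, shown here as a short Python sketch. The function name and the figures in the usage note are purely illustrative assumptions, not measurements.

```python
def redundant_font_bytes(num_files, font_bytes_per_file):
    """Bytes spent on duplicate font copies when every file embeds the same fonts.

    One copy would be needed in any case, so only the remaining
    num_files - 1 copies are counted as redundant.
    """
    return max(num_files - 1, 0) * font_bytes_per_file
```

For example, under the assumed figures of 10,000 small files that each embed 60 KiB of identical font data, `redundant_font_bytes(10_000, 60 * 1024)` gives 614,338,560 bytes of duplicated font data across the archive.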
Background
PDF/A was originally a new joint activity between NPES - The Association for Suppliers of Printing, Publishing and Converting Technologies, and the Association for Information and Image Management, International (AIIM International) to develop an international standard that defines the use of the Portable Document Format (PDF) for archiving and preserving documents. The goal was to address the growing need to electronically archive documents in a way that will ensure preservation of their contents over an extended period of time, and will further ensure that those documents will be able to be retrieved and rendered with a consistent and predictable result in the future. This need exists in a growing number of international government and industry segments, including legal systems, libraries, newspapers, regulated industries, and others.