3.6 Document Structure
A PDF document can be regarded as a hierarchy of objects contained in the
body section of a PDF file. At the root of the hierarchy is the document’s
catalog dictionary (see Section 3.6.1, “Document Catalog”). Most of the objects
in the hierarchy are dictionaries. For example, each page of the document is
represented by a page object, a dictionary that includes references to the page’s
contents and other attributes, such as its thumbnail image (Section 8.2.3,
“Thumbnail Images”) and any annotations (Section 8.4, “Annotations”)
associated with it. The individual page objects are tied together in a structure
called the page tree (described in Section 3.6.2, “Page Tree”), which in turn is
specified by an indirect reference in the document catalog. Parent, child, and
sibling relationships within the hierarchy are defined by dictionary entries whose
values are indirect references to other dictionaries. Figure 3.5 illustrates the
structure of the object hierarchy.
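The hierarchy described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not how any particular PDF library represents a file: the Ref class, the objects table, and the helper names resolve and iter_pages are hypothetical, and the dictionary keys (Type, Pages, Kids, Parent, Contents, Count) are taken from the catalog and page tree entries defined in the sections that follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for an indirect reference such as `2 0 R`."""
    num: int


# Hypothetical object table: object number -> object.
objects = {
    1: {"Type": "Catalog", "Pages": Ref(2)},                      # root of the hierarchy
    2: {"Type": "Pages", "Kids": [Ref(3), Ref(4)], "Count": 2},   # page tree node
    3: {"Type": "Page", "Parent": Ref(2), "Contents": Ref(5)},    # page object
    4: {"Type": "Page", "Parent": Ref(2), "Contents": Ref(6)},    # page object
    5: "content stream for page 1",
    6: "content stream for page 2",
}


def resolve(value):
    """Follow an indirect reference; return direct values unchanged."""
    return objects[value.num] if isinstance(value, Ref) else value


def iter_pages(node):
    """Walk the page tree depth-first, yielding page objects in order."""
    node = resolve(node)
    if node["Type"] == "Page":
        yield node
    else:
        for kid in node["Kids"]:
            yield from iter_pages(kid)


catalog = objects[1]
pages = list(iter_pages(catalog["Pages"]))
```

Here the parent and child relationships are expressed only through Ref values, which mirrors the point that relationships within the hierarchy are dictionary entries whose values are indirect references.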
Note: The data structures described in this section, particularly the catalog and
page dictionaries, combine entries describing document structure with ones dealing
with the detailed semantics of documents and pages. All entries are listed here, but
many of their descriptions are deferred to subsequent chapters.
3.6.1 Document Catalog
The root of a document’s object hierarchy is the catalog dictionary, located by
means of the Root
entry in the trailer of the PDF file (see Section 3.4.4, “File Trailer”). The
catalog contains references to other objects defining the document’s contents,
outline, article threads (PDF 1.1), named destinations, and other attributes.
In addition, it contains information about how the document
should be displayed on the screen, such as whether its outline and thumbnail
page images should be displayed automatically and whether some location other
than the first page should be shown when the document is opened. Table 3.25
shows the entries in the catalog dictionary.
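As a rough sketch of how a reader might locate the catalog and inspect its display settings, the following uses a hypothetical object table rather than a real parser. The names Ref, get_catalog, and opening_view are assumptions made for illustration. The keys PageMode and OpenAction are catalog entries from the PDF specification; OpenAction is simplified here to a destination array, although it may also be an action dictionary.

```python
from typing import NamedTuple


class Ref(NamedTuple):
    """Hypothetical stand-in for an indirect reference such as `1 0 R`."""
    num: int


# Hypothetical parsed file: object table plus trailer dictionary.
objects = {
    1: {
        "Type": "Catalog",
        "Pages": Ref(2),
        "Outlines": Ref(3),
        "PageMode": "UseOutlines",        # show the outline when opened
        "OpenAction": [Ref(5), "Fit"],    # open at the second page
    },
    2: {"Type": "Pages", "Kids": [Ref(4), Ref(5)], "Count": 2},
    3: {"Type": "Outlines", "Count": 0},
    4: {"Type": "Page", "Parent": Ref(2)},
    5: {"Type": "Page", "Parent": Ref(2)},
}
trailer = {"Size": 6, "Root": Ref(1)}


def get_catalog(trailer, objects):
    """Locate the catalog through the trailer's Root entry."""
    catalog = objects[trailer["Root"].num]
    if catalog.get("Type") != "Catalog":
        raise ValueError("Root does not reference a catalog dictionary")
    return catalog


def opening_view(catalog):
    """Summarize how the document asks to be displayed when opened."""
    page_mode = catalog.get("PageMode", "UseNone")  # UseNone is the default
    action = catalog.get("OpenAction")
    return {
        "show_outline": page_mode == "UseOutlines",
        "show_thumbnails": page_mode == "UseThumbs",
        # Assumes the simplified destination-array form of OpenAction.
        "open_at_object": action[0].num if action else None,
    }


catalog = get_catalog(trailer, objects)
view = opening_view(catalog)
```

When OpenAction is absent, open_at_object is None, which corresponds to the default behavior of showing the first page.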