CHAPTER 3
Syntax
This chapter covers everything about the syntax of PDF at the object, file, and
document level. It sets the stage for subsequent chapters, which describe how the
contents of a PDF file are interpreted as page descriptions, interactive
navigational aids, and application-level logical structure.
PDF syntax is best understood by thinking of it in four parts, as shown in Figure
• Objects. A PDF document is a data structure composed from a small set of
basic types of data objects. Section 3.1, “Lexical Conventions,” describes the
character set used to write objects and other syntactic elements. Section 3.2,
type, the stream object.
• File structure. The PDF file structure determines how objects are stored in a
PDF file, how they are accessed, and how they are updated. This structure is
independent of the semantics of the objects. Section 3.4, “File Structure,”
describes the file structure. Section 3.5, “Encryption,” describes a file-level
mechanism for protecting a document’s contents from unauthorized access.
• Document structure. The PDF document structure specifies how the basic
object types are used to represent components of a PDF document: pages, fonts,
annotations, and so forth. Section 3.6, “Document Structure,” describes the
overall document structure; later chapters address the detailed semantics of the
components.
• Content streams. A PDF content stream contains a sequence of instructions
describing the appearance of a page or other graphical entity. These instructions,
while also represented as objects, are conceptually distinct from the objects that