

                                 CHAPTER 3


                                 3   Syntax

This chapter covers everything about the syntax of PDF at the object, file, and
document level. It sets the stage for subsequent chapters, which describe how the
contents of a PDF file are interpreted as page descriptions, interactive
navigational aids, and application-level logical structure.

PDF syntax is best understood by thinking of it in four parts, as shown in Figure
3.1:

• Objects. A PDF document is a data structure composed from a small set of
  basic types of data objects. Section 3.1, “Lexical Conventions,” describes the
  character set used to write objects and other syntactic elements. Section 3.2,
  “Objects,” describes the syntax and essential properties of the objects. Section
  3.2.7, “Stream Objects,” provides complete details of the most complex data
  type, the stream object.
• File structure. The PDF file structure determines how objects are stored in a
  PDF file, how they are accessed, and how they are updated. This structure is
  independent of the semantics of the objects. Section 3.4, “File Structure,”
  describes the file structure. Section 3.5, “Encryption,” describes a file-level
  mechanism for protecting a document’s contents from unauthorized access.
• Document structure. The PDF document structure specifies how the basic
  object types are used to represent components of a PDF document: pages, fonts,
  annotations, and so forth. Section 3.6, “Document Structure,” describes the
  overall document structure; later chapters address the detailed semantics of the
  components.
• Content streams. A PDF content stream contains a sequence of instructions
  describing the appearance of a page or other graphical entity. These instructions,
  while also represented as objects, are conceptually distinct from the objects that



