CHAPTER 10
884
Document Interchange
• Structure types. Structure types define the meaning of structure elements,
  such as paragraphs, headings, articles, and tables.
• Structure attributes. Structure attributes preserve styling information used
  by the authoring application in laying out content on the page.
A Tagged PDF document must also contain a mark information dictionary with a
value of true for the Marked entry.
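As a rough illustration of this requirement, the following sketch checks a document catalog that has already been parsed into plain Python dictionaries. The dictionary model and the function name is_marked_tagged_pdf are assumptions made for illustration only; the key names MarkInfo and Marked are those of the catalog's mark information dictionary. The check is necessary but not sufficient: it does not inspect the structure tree or the page content.

```python
def is_marked_tagged_pdf(catalog: dict) -> bool:
    """Return True if the catalog's mark information dictionary has Marked set to true.

    `catalog` is a hypothetical, already-parsed document catalog in which
    PDF dictionaries are Python dicts and PDF booleans are Python bools.
    """
    mark_info = catalog.get("MarkInfo")
    if not isinstance(mark_info, dict):
        # No mark information dictionary at all.
        return False
    # A missing Marked entry is treated as false.
    return mark_info.get("Marked", False) is True
```

For example, is_marked_tagged_pdf({"MarkInfo": {"Marked": True}}) is true, while a catalog with no MarkInfo entry fails the check.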
Note: The types and attributes defined for Tagged PDF are intended to provide a
set of standard fallback roles and minimum guaranteed attributes to enable
consumer applications to perform operations such as those mentioned above.
Producer applications are free to define additional structure types as long as
they also provide a role mapping to the nearest equivalent standard types. They
may also define additional structure attributes using any of the available
extension mechanisms.
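The role mapping mentioned in the note can be sketched as a lookup that follows a producer-defined type through the role map until it reaches a standard type. This is a minimal sketch under stated assumptions: the role map is modeled as a plain Python dict, STANDARD_TYPES lists only a small subset of the standard structure types, the custom names BodyText and Para are hypothetical, and standard types are treated as terminal.

```python
# Small illustrative subset of the standard structure types (not the full list).
STANDARD_TYPES = frozenset({"Document", "Art", "Sect", "P", "H", "H1", "H2", "Table"})


def resolve_role(struct_type, role_map, standard_types=STANDARD_TYPES):
    """Follow role-map entries from struct_type to a standard structure type.

    Returns the standard type name, or None if the chain ends at an
    unmapped custom type or loops back on itself.
    """
    seen = set()
    current = struct_type
    while current not in standard_types:
        if current in seen or current not in role_map:
            # Cycle in the role map, or no fallback role was provided.
            return None
        seen.add(current)
        current = role_map[current]
    return current
```

With the hypothetical role map {"BodyText": "Para", "Para": "P"}, a consumer that does not recognize BodyText can fall back to treating it as the standard type P.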
10.7.1 Tagged PDF and Page Content
Like all PDF documents, a Tagged PDF document consists of a sequence of
self-contained pages, each of which is described by one or more page content
streams (including any subsidiary streams such as form XObjects and annotation
appearances). Tagged PDF defines some further conventions for organizing and
marking content streams so that additional information can be derived from them:
• Distinguishing between the author’s original content and artifacts of the
  layout process (see “Real Content and Artifacts” on page 885)
• Specifying a content order to guide the layout process if the page content
  must be reflowed (see “Page Content Order” on page 889)
• Representing text in a form from which a Unicode representation and
  information about font characteristics can be unambiguously derived (see
  “Extraction …”)
• Representing word breaks unambiguously (see “Identifying Word Breaks”)
• Marking text with information for making it accessible to users with visual
  impairments (see Section 10.8, “Accessibility Support”)