CHAPTER 10
Document Interchange
HTML file retrieved from the URL <http://www.adobe.com/> has been converted to three pages in the PDF file. The entry for that URL in the URLS name tree points to a page set containing the three pages. Similarly, the IDS name tree contains an entry pointing to the same page set, associated with the digital identifier calculated from the HTML source (the string shown in the figure as 904B…1EA2).
[Figure labels: Document catalog; Name dictionary; URLS name tree (entry http://www.adobe.com/); IDS name tree (entry 904B…1EA2); Page set; three Page objects]
FIGURE 10.1 Simple Web Capture file structure
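The relationships in Figure 10.1 can be sketched in a few lines of Python. This is only an illustrative model: plain dicts and lists stand in for the name dictionary, the two name trees, and the page set. The helper names (find_by_url, find_by_id) and the keys "kind", "pages", and "index" are hypothetical and are not defined by the PDF file format. The digital identifier is left elided exactly as it appears in the figure.

```python
# Illustrative model only: plain Python objects stand in for PDF dictionaries
# and name trees. The helper names and dict keys below are assumptions made
# for this sketch, not part of the PDF file format.

# Three page objects produced from the single HTML source file.
pages = [{"Type": "Page", "index": i} for i in range(3)]

# One page set groups the pages generated from that source.
page_set = {"kind": "page set", "pages": pages}

# The digital identifier is elided in the figure, so it stays elided here.
DIGITAL_ID = "904B…1EA2"

# Name trees modeled as dicts: both entries refer to the same page set object.
name_dictionary = {
    "URLS": {"http://www.adobe.com/": page_set},
    "IDS": {DIGITAL_ID: page_set},
}


def find_by_url(names, url):
    """Return the content set registered under a URL, or None."""
    return names["URLS"].get(url)


def find_by_id(names, digital_id):
    """Return the content set registered under a digital identifier, or None."""
    return names["IDS"].get(digital_id)


by_url = find_by_url(name_dictionary, "http://www.adobe.com/")
by_id = find_by_id(name_dictionary, DIGITAL_ID)
```

In this sketch, looking up the content by its URL or by its digital identifier yields the very same page set object, which mirrors the two name tree entries pointing to one page set in the figure.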
Entries in the URLS and IDS name trees may refer to an array of content sets instead of just a single content set. The content sets need not have the same subtype, but may include both page sets and image sets. In Figure 10.2, for example, a GIF file has been retrieved from a URL (<http://www.adobe.com/getacro.gif>)