SECTION 3.3 Filters
3.3.3 LZWDecode and FlateDecode Filters
The LZWDecode and (in PDF 1.2) FlateDecode filters have much in common and
are discussed together in this section. They decode data that has been encoded
using the LZW or Flate data compression method, respectively:
• LZW (Lempel-Ziv-Welch) is a variable-length, adaptive compression method
that has been adopted as one of the standard compression methods in the
Tag Image File Format (TIFF) standard. Details on LZW encoding follow in the
next section.
• The Flate method is based on the public-domain zlib/deflate compression
method, which is a variable-length Lempel-Ziv adaptive compression method
cascaded with adaptive Huffman coding. It is fully defined in Internet RFCs
1950, ZLIB Compressed Data Format Specification, and 1951, DEFLATE
Compressed Data Format Specification (see the Bibliography).
Both of these methods compress either binary data or ASCII text but (like all
compression methods) always produce binary data, even if the original data was
text.
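As a rough illustration of the Flate round trip, the sketch below uses Python's standard zlib module, which reads and writes the RFC 1950 format cited above. The helper names flate_encode and flate_decode and the sample text are hypothetical, introduced only for this example. No predictor function is applied.

```python
import zlib

def flate_encode(data: bytes, level: int = 9) -> bytes:
    """Compress data into a zlib (RFC 1950) stream. Hypothetical helper name."""
    return zlib.compress(data, level)

def flate_decode(encoded: bytes) -> bytes:
    """Recover the original bytes from a zlib (RFC 1950) stream."""
    return zlib.decompress(encoded)

# Assumed sample input: repetitive ASCII text, chosen so that it compresses well.
sample = b"BT /F1 12 Tf (Hello) Tj ET\n" * 50

encoded = flate_encode(sample)
restored = flate_decode(encoded)

# The input is pure ASCII, but the encoded stream is binary:
# it contains byte values outside the 7-bit ASCII range.
has_non_ascii = any(b > 127 for b in encoded)

print(len(sample), len(encoded), restored == sample, has_non_ascii)
```

Because the encoded bytes are binary even for text input, a consumer that needs a 7-bit-safe representation would have to layer an ASCII encoding on top of the compressed data.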
The LZW and Flate compression methods can discover and exploit many
patterns in the input data, whether the data is text or images. As described later,
both filters support optional transformation by a predictor function, which
improves the compression of sampled image data. Because of its cascaded
adaptive Huffman coding, Flate-encoded output is usually much more compact
than LZW-encoded output for the same input. Flate and LZW decoding speeds
are comparable, but Flate encoding is considerably slower than LZW encoding.
Usually, both Flate and LZW encodings compress their input substantially.
However, in the worst case (in which no pair of adjacent characters appears
twice), Flate encoding expands its input by no more than 11 bytes or a factor of
1.003 (whichever is larger), plus the effects of algorithm tags added by PNG
predictors. For LZW encoding, the best case (all zeros) provides a compression
approaching 1365 : 1 for long files, but the worst-case expansion is at least a factor
of 1.125 (each 8-bit input byte emitted as a 9-bit code), which can increase to
nearly 1.5 (12-bit codes) in some implementations, plus the effects of PNG tags
as with Flate encoding.
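The Flate worst-case bound can be checked empirically. The sketch below is a minimal experiment, not a proof: it assumes Python's zlib module as the Flate encoder, uses seeded pseudo-random bytes as a stand-in for incompressible input, and does not model PNG predictor tags. The helper name flate_overhead is hypothetical.

```python
import random
import zlib

def flate_overhead(data: bytes, level: int = 9) -> int:
    """Return how many bytes longer the zlib stream is than the input."""
    return len(zlib.compress(data, level)) - len(data)

# Pseudo-random bytes contain almost no repeated sequences to exploit,
# so they approximate the worst case described in the text.
rng = random.Random(0)  # fixed seed so the experiment is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(100_000))

noise_overhead = flate_overhead(noise)
noise_ratio = (len(noise) + noise_overhead) / len(noise)

# For very short inputs the fixed per-stream overhead dominates,
# which is where the "11 bytes" part of the bound applies.
tiny_overhead = flate_overhead(b"a")

print(f"noise: +{noise_overhead} bytes, ratio {noise_ratio:.5f}")
print(f"tiny:  +{tiny_overhead} bytes")
```

With this encoder the expansion on the noise input should stay below the factor of 1.003, and the one-byte input should grow by no more than 11 bytes. Other Flate encoders may produce different overheads within the same bound.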