3.1.8.2 Compressing Data
The sender first inserts the uncompressed data into its local history buffer at the position indicated by HistoryOffset. The compressor then scans the newly added uncompressed data to be sent and produces as output a sequence of literals (bytes to be sent uncompressed) and copy-tuples, each of which consists of a <copy-offset, length-of-match> pair.
The copy-offset component of the copy-tuple is an index into HistoryBuffer, counted backward from the current byte being compressed toward the start of the buffer, that identifies where a match to the data to be sent begins. The length-of-match component is the length of that match in bytes and MUST be larger than 2 (sections 3.1.8.4.1.2.2 and 3.1.8.4.2.2.2). If the resulting data is not smaller than the original bytes (that is, expansion instead of compression results), the result is a flush and the data is sent uncompressed, so that no more data is ever sent than the original uncompressed bytes.
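The fallback rule in the preceding paragraph can be sketched as follows. This is an illustration only: `select_payload` and its boolean result are hypothetical names that do not appear in the protocol, `encoded` is assumed to be the fully encoded literal and copy-tuple stream for `original`, and the flush that accompanies the fallback is not modeled.

```python
from typing import Tuple


def select_payload(original: bytes, encoded: bytes) -> Tuple[bytes, bool]:
    """Return (payload, is_compressed) for one block of data.

    Hypothetical helper: `encoded` is assumed to be the complete
    encoded form of `original`.
    """
    # Send the encoded form only when it is strictly smaller.
    if len(encoded) < len(original):
        return encoded, True
    # Otherwise send the original bytes so the data never grows.
    return original, False
```

Note that an encoded stream of equal size is also rejected, because the text requires the result to be smaller than the original.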
In this way the compressor aims to reduce the size of data that needs to be transmitted. For example, consider the following string.
0         1         2         3         4
012345678901234567890123456789012345678901234567890
for.whom.the.bell.tolls,.the.bell.tolls.for.thee!
The compressor produces the following:
for.whom.the.bell.tolls,<16,15>.<40,4><19,3>e!
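The <16,15> tuple is the compression of '.the.bell.tolls', <40,4> is the compression of 'for.', and <19,3> is the compression of 'the'. For instance, <16,15> is produced when the compressor reaches position 24, so the match begins at position 24 - 16 = 8 and runs for 15 bytes.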
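The worked example above can be reproduced with a brute-force sketch. This is not the algorithm any real implementation is known to use: `tokenize`, `Token`, and `MIN_MATCH` are hypothetical names, the history buffer is assumed to be empty before the string is inserted, matches are found by exhaustive search, and ties are assumed to resolve to the smallest copy-offset. A conforming compressor is free to choose different matches.

```python
from typing import List, Tuple, Union

# A token is a literal byte value, or a (copy_offset, length_of_match) pair.
Token = Union[int, Tuple[int, int]]

MIN_MATCH = 3  # length-of-match must be larger than 2


def tokenize(data: bytes) -> List[Token]:
    """Greedy longest-match tokenizer over a history that starts empty."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(data):
        best_len, best_off = 0, 0
        # Try every earlier start position, nearest first, so that
        # equal-length matches keep the smallest copy-offset.
        for src in range(pos - 1, -1, -1):
            length = 0
            # The match is allowed to run past `pos` (an overlapping match).
            while (pos + length < len(data)
                   and data[src + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - src
        if best_len >= MIN_MATCH:
            tokens.append((best_off, best_len))
            pos += best_len
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens
```

Calling `tokenize(b"for.whom.the.bell.tolls,.the.bell.tolls.for.thee!")` yields 24 literals followed by `(16, 15)`, the literal `.`, `(40, 4)`, `(19, 3)`, and the literals `e` and `!`, which matches the output shown above.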
The expansion of a copy-tuple MUST use a "replicating copy". A replicating copy is implemented using the following pseudocode.
SrcPtr = HistoryPtr - CopyOffset;
while (LengthOfMatch > 0)
{
    *HistoryPtr = *SrcPtr;
    SrcPtr = SrcPtr + 1;
    HistoryPtr = HistoryPtr + 1;
    LengthOfMatch = LengthOfMatch - 1;
}
For example, consider the following compressed stream.
Xcd<2,4>YZ
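Because the length-of-match (4) exceeds the copy-offset (2), the copy reads bytes that it has itself just written. Using a replicating copy, this is correctly decompressed to the following.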
XcdcdcdYZ
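The replicating copy can be sketched in Python as follows, under some simplifying assumptions: `expand` and `Token` are hypothetical names, tokens are supplied already decoded (a literal byte value or a (copy-offset, length-of-match) pair) rather than in the bit-level encoding, and the history buffer is modeled as a `bytearray` that starts empty.

```python
from typing import Iterable, Tuple, Union

# A token is a literal byte value, or a (copy_offset, length_of_match) pair.
Token = Union[int, Tuple[int, int]]


def expand(tokens: Iterable[Token]) -> bytes:
    """Expand literals and copy-tuples into the original bytes."""
    history = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            copy_offset, length = token
            src = len(history) - copy_offset
            # Replicating copy: move one byte at a time, so bytes written
            # earlier in this same copy can serve as source bytes.
            for _ in range(length):
                history.append(history[src])
                src += 1
        else:
            history.append(token)
    return bytes(history)
```

For the stream `Xcd<2,4>YZ`, `expand([ord("X"), ord("c"), ord("d"), (2, 4), ord("Y"), ord("Z")])` returns `b"XcdcdcdYZ"`. A block copy such as `history += history[src:src + length]` would find only two source bytes at that point and would produce `XcdcdYZ`, which is why the byte-by-byte loop is needed.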
Literals and copy-tuples are encoded using the scheme described in section 3.1.8.4.1 or 3.1.8.4.2 (the scheme used depends on whether RDP 4.0 or 5.0 bulk compression is being used).