How to identify the number of tokens in a TokenSequence (MS-OVBA)

Parth Gupta 180 Reputation points
2023-05-24T10:48:31.5133333+00:00

Hi,

With reference to MS-OVBA: https://video2.skills-academy.com/en-us/openspecs/office_file_formats/ms-ovba/575462ba-bf67-4190-9fac-c275523c75fc

Section 2.4.1.1.7 Token Sequence

It is mentioned that: "The number of Tokens in the final TokenSequence MUST be greater than or equal to 1. The number of Tokens in the final TokenSequence MUST be less than or equal to eight."

My question is: suppose I am trying to decompress a CompressedContainer and, while following the decompression algorithm, I reach the last TokenSequence in the CompressedChunkData.

How can I then know the number of tokens in that TokenSequence?

From my understanding, a TokenSequence has a FlagByte, which specifies the token types (a Literal token if the corresponding bit is 0b0, a Copy token otherwise). So if I read only the FlagByte of the last TokenSequence, I can't know the number of tokens, right?

Are there special values for tokens that represent an empty token?

Thanks

Office Open Specifications

Accepted answer
  1. Tom Jebo 1,991 Reputation points Microsoft Employee
    2023-05-24T23:00:33.0933333+00:00

    Hi @Parth Gupta,

    If you follow the decompression algorithm, you'll see that it uses the number of compressed bytes in the chunk to determine where the final TokenSequence ends.

    For example, 2.4.1.3.4 Decompressing a TokenSequence shows the algorithm and the test here:

    IF CompressedCurrent is LESS THAN CompressedEnd THEN

    is what determines whether the next byte(s) in the chunk are actually a token or whether we've reached the end of the compressed data.

    The FlagByte only tells what kind of token is present but does not tell whether a byte is actually a token in the compressed chunks collection.
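As a rough illustration of that end-of-data test, here is a minimal Python sketch of decompressing the token sequences of a single compressed chunk. It is not taken from the spec or from the repo linked in this answer. The function name `decompress_token_sequences` is hypothetical, the input is assumed to be the CompressedChunkData bytes with the chunk header already stripped, and `len(data)` stands in for CompressedEnd. The copy token unpacking reflects my reading of the MS-OVBA rules and should be checked against the spec before relying on it.

```python
def decompress_token_sequences(data: bytes) -> bytes:
    """Decompress the token sequences of one compressed chunk.

    Assumption: `data` holds only the CompressedChunkData of a single
    compressed chunk, so len(data) plays the role of CompressedEnd.
    """
    out = bytearray()   # decompressed bytes of this chunk only
    pos = 0             # plays the role of CompressedCurrent
    end = len(data)     # plays the role of CompressedEnd

    while pos < end:
        flag_byte = data[pos]
        pos += 1
        for bit in range(8):
            # The key test: stop as soon as the compressed data runs out.
            # This is how a final TokenSequence can hold fewer than 8 tokens.
            if pos >= end:
                break
            if (flag_byte >> bit) & 1 == 0:
                # Literal token: one byte copied as-is.
                out.append(data[pos])
                pos += 1
            else:
                # Copy token: 16 bits, little-endian.
                if pos + 2 > end:
                    raise ValueError("truncated copy token")
                token = int.from_bytes(data[pos:pos + 2], "little")
                pos += 2
                # Number of offset bits depends on how much has been
                # decompressed so far in this chunk (minimum of 4 bits).
                bit_count = max((len(out) - 1).bit_length(), 4)
                length_mask = 0xFFFF >> bit_count
                length = (token & length_mask) + 3
                offset = (token >> (16 - bit_count)) + 1
                if offset > len(out):
                    raise ValueError("copy token points before chunk start")
                start = len(out) - offset
                # Byte-by-byte copy so overlapping runs repeat correctly.
                for i in range(length):
                    out.append(out[start + i])
    return bytes(out)


# Final TokenSequence with only 4 tokens: three literals and one copy token.
print(decompress_token_sequences(bytes([0x08, 0x61, 0x62, 0x63, 0x00, 0x20])))
```

In the usage line, the FlagByte 0x08 marks the fourth token as a copy token, and the loop stops after it because no compressed bytes remain, not because of anything in the FlagByte itself.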

    Here is a repo with working decompression code that follows the algorithm; you can use it as a reference if needed:

    https://github.com/tomjebo/compvba.git

    Hope this helps.

    Best regards,
    Tom Jebo
    Microsoft Open Specifications Support
