Previous post does not contain the revision mark.
Here is a link to the extracted section.
https://1drv.ms/w/s!AvKjnW8J-ArBgfhx-vvx9k5V-t7xMQ?e=yCpFUm
Thanks to @Alexander P, who provided the suggested updates!
Hi!
I am trying to implement the V3 Content Normalized Data function.
When appending the buffer of the references the algorithm says:
IF next USHORT = REFERENCENAME.Id THEN
...
END IF
IF next USHORT = REFERENCENAME.Reserved THEN
...
END IF
This function is not reading anything, so what is the next USHORT?
Any help appreciated!
Hello Alexander,
Thank you for posting your question. One of our engineers will respond soon.
Best Regards,
Jeff McCashland
Microsoft Open Specifications
I am currently implementing this function too (what a coincidence :D).
I was struggling there too, made some assumptions and tried multiple ways.
Unfortunately, it's very hard to find an error with no sample data to validate the intermediate hashes against.
@Hung-Chun Yu
Do you think it would be possible to share some sample sets (vbaProject.bin + resulting V3ContentNormalizedData + resulting FormsNormalizedData + resulting NormalizeProjectStream)? This would help developers find errors themselves and interpret the spec correctly.
Hi!
I just need to know what the next USHORT means. To me, a next USHORT would be something to read (a 2-byte value), not something to write. But this algorithm is supposed to describe the creation of the buffer! FormsNormalizedData is not the problem, because it was also used in the agile signature.
Cheers
Alex
Hi @SvenSo ,
what are your thoughts on the two "IF next USHORT..." statements?
Hi @SvenSo
I will ask the Product Group whether there are sample sets that we can share with the public.
@Alexander P and @SvenSo
Here is what I found out about what "next USHORT" means. The following is not official yet; it is still under review by the feature owner and the Product Group.
//Peek (but do not read) the next USHORT in the stream.
IF next USHORT = 0x0016 THEN
APPEND Buffer WITH REFERENCENAME.Id (section 2.3.4.2.2.2)
APPEND Buffer WITH REFERENCENAME.SizeOfName (section 2.3.4.2.2.2)
APPEND Buffer WITH REFERENCENAME.Name (section 2.3.4.2.2.2)
END IF
//Peek (but do not read) the next USHORT in the stream (this may be the same USHORT as was peeked above if the above block was skipped)
IF next USHORT = 0x003E THEN
APPEND Buffer WITH REFERENCENAME.Reserved (section 2.3.4.2.2.2)
APPEND Buffer WITH REFERENCENAME.SizeOfNameUnicode (section 2.3.4.2.2.2)
APPEND Buffer WITH REFERENCENAME.NameUnicode (section 2.3.4.2.2.2)
END IF
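For implementers who do parse the stream while building the buffer, the peek-then-append logic above could be sketched in Python as follows. This is only an illustration, not official code: it assumes the input is the already decompressed dir stream, that values are little-endian, and that each field is appended in its on-disk byte form. The helper names peek_ushort, read_field, and append_reference_name are hypothetical.

```python
import io
import struct

REFERENCENAME_ID = 0x0016        # value given in the reply above
REFERENCENAME_RESERVED = 0x003E  # value given in the reply above


def peek_ushort(stream):
    """Return the next little-endian USHORT without consuming it (None at end of stream)."""
    pos = stream.tell()
    raw = stream.read(2)
    stream.seek(pos)  # rewind so the value is only peeked, not read
    if len(raw) < 2:
        return None
    return struct.unpack("<H", raw)[0]


def read_field(stream):
    """Consume one 2-byte id, 4-byte size, and `size` data bytes; return them as raw bytes."""
    header = stream.read(6)
    if len(header) < 6:
        raise ValueError("truncated record header")
    size = struct.unpack("<I", header[2:6])[0]
    data = stream.read(size)
    if len(data) < size:
        raise ValueError("truncated record data")
    return header + data


def append_reference_name(stream, buffer):
    """Append the optional REFERENCENAME parts to buffer, mirroring the two IF blocks."""
    if peek_ushort(stream) == REFERENCENAME_ID:
        buffer.extend(read_field(stream))  # Id, SizeOfName, Name
    if peek_ushort(stream) == REFERENCENAME_RESERVED:
        buffer.extend(read_field(stream))  # Reserved, SizeOfNameUnicode, NameUnicode
```

If neither value is found, the stream position and the buffer are left unchanged, which matches the remark that the second peek may see the same USHORT as the first.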
Hi @SvenSo
I got your sample request via dochelp. It will be a while before official Microsoft samples are ready.
Thanks to Alex, who is willing to share his samples with the public.
Here are the shared links NormalizedData_V3-8.bin and NormalizedData_V3-9.bin
Dear @Alexander P and @Hung-Chun Yu
Thanks to these sample files and the updated documentation, I was able to fix the implementation within hours.
Thanks very much for your help!
Best regards,
Sven
Hello @Hung-Chun Yu ,
I think I now know what you mean, but the explanation you gave is not sufficient for someone new to the documentation to fully understand the specs.
But I remembered a conversation with an MS employee about the agile signature and the missing part of the documentation that is now called FormsNormalizedData. He gave me similar pseudocode to get the data, but he spoke of reading streams, in contrast to the records that are mentioned in the algorithm.
I will try to explain what I mean.
The V3 Content Normalized Data speaks of records. These records are read when the VBA project is read (while the file is being opened). So these records now reside in memory, and from then on you only work with those records! So when implementing the algorithm, you do not read any stream; you just work with the records you read when opening the file.
When working with these in-memory records, you never have a "next USHORT".
Your algorithm only makes sense if you actually read the dir stream while creating the content buffer. Maybe that is what you meant by "PARAMETERS Storage as VBA Storage..." at the beginning of the function. But the storage contains more than one stream!
In my opinion it would make sense either to replace the text "PARAMETERS Storage..." with "PARAMETERS dir-Stream (section 2.3.4.2)" or to simply write "start reading the dir stream".
I will try to finalize the hash generation this weekend and get back to you if I have more information.
Cheers and a nice weekend
Alex
Hi!
I have implemented the algorithm and the cryptographic digest is not matching.
Your explanation for the next USHORT just means: if there is a REFERENCENAME record within the REFERENCECONTROL record, then write it down.
I've prepared a small Excel file with a short macro. It has just the standard references, and I've logged the V3ContentNormalizedData, the FormsNormalizedData and the ProjectNormalizedData.
The FormsNormalizedData should be OK, because it works, when I just attach the legacy and the agile signatures.
The bad thing is - I cannot upload a ZIP file with all those files. :-(
Cheers Alex
That is excellent feedback; I will bring it to the feature owner. Do you remember the name of the Microsoft person who helped you?
@ AlexanderP-7851
You can share a link to the file via Dropbox, OneDrive, or Google Drive. Or you can email the zip file to dochelp at microsoft.com and ask in the body of the email that it be forwarded to Hung-Chun Yu. Looking forward to the file.
Thank you for pointing out that the existing spec is somewhat misleading.
As for the customer’s comments about “records in memory” vs “stream”, this is a misreading of the documentation. All our project hashing happens at the storage / stream level and does not require loading the project into the VBA runtime, so all our documentation regarding signing is based on the storage/stream format and has nothing to do with the in-memory representation (which is an implementation detail anyway). In fact, all our format documentation for all of Office documents the on-disk format, not any in-memory representation.
The Product Group is working on a sample that we can share with implementers, which should help people understand how it works.
The problem is: you are assuming that others do it the same way. For the FormsNormalizedData this would be the best way to implement it, because you just copy the stream buffers.
For the V3 Content Normalized Data there is more than just reading. Some values are skipped. Others are transformed (MBCS to wide char), and you always speak of records. How someone treats those records must be left to the one implementing the algorithm. In my case, I read the whole project structure when opening the Office file. I need to do that, because my users can read, manipulate, and sign code. So I do not want to parse or read streams, or compress/decompress code, more than once. Doing it several times would be inefficient. When you document everything in pseudocode that creates a buffer, you should not assume that there is a "next anything". The better way in this case would be: If the ReferenceControl contains a NameRecordExtended record (section 2.3.4.2.2.2) Write ReferenceName.Id, ReferenceName.SizeOfName, ...
Then your algorithm would be consistent.
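To illustrate the record-based wording suggested above, here is a minimal Python sketch. It assumes the project was already parsed into in-memory records; the classes ReferenceName and ReferenceControl and the function append_name_record are hypothetical stand-ins for whatever structures an implementation uses, and the Id/Reserved values 0x0016 and 0x003E are the ones quoted earlier in this thread.

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferenceName:
    name: bytes          # Name field, MBCS bytes as stored in the file
    name_unicode: bytes  # NameUnicode field, UTF-16LE bytes as stored in the file


@dataclass
class ReferenceControl:
    # None when the REFERENCECONTROL record has no NameRecordExtended
    name_record_extended: Optional[ReferenceName] = None


def append_name_record(buffer: bytearray, control: ReferenceControl) -> None:
    """Write the name record fields only if the record is present."""
    rec = control.name_record_extended
    if rec is None:
        return
    # Id (2 bytes), SizeOfName (4 bytes), Name
    buffer.extend(struct.pack("<HI", 0x0016, len(rec.name)) + rec.name)
    # Reserved (2 bytes), SizeOfNameUnicode (4 bytes), NameUnicode
    buffer.extend(struct.pack("<HI", 0x003E, len(rec.name_unicode)) + rec.name_unicode)
```

Here the presence check replaces the "next USHORT" peek, so no stream position is involved at all.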
Cheers
Alex
Hi @Alexander P
Thank you for your suggestion.
New insights
I made some tests with the writing of the ReferenceControl/ReferenceOriginal records.
Here your documentation - in contrast to the description of ReferenceRegistered - is OK. The LibIds are not transformed to wide char.
The documentation of ReferenceProject is also OK.
But my next test included a userform. As your documentation says, there was no change to FormsNormalizedData. But if I now add the FormsNormalizedData, the signature is considered changed.
I will zip the sample and send it to you.
Everything is OK, but you can omit the function with the package normalized data. It is not used. Only the corrected Project Normalized Data is needed.
Package Normalized Package Data
FUNCTION NormalizePackageStream
PARAMETERS Stream AS stream
RETURNS array of bytes
DECLARE Buffer AS array of bytes
SET Buffer TO resizable array of bytes
FOR EACH property in ProjectProperties (section 2.3.1.1)
IF property is NOT ProjectId (section 2.3.1.2) OR ProjectDocModule (section 2.3.1.4) OR ProjectProtectionState (section 2.3.1.15) OR ProjectPassword (section 2.3.1.16) OR ProjectVisibilityState (section 2.3.1.17) OR ProjectPackage THEN
APPEND Buffer WITH property name
APPEND Buffer WITH property value
END IF
END FOR
APPEND the string “Host Extender Info” to Buffer
APPEND HostExtenderRef without NWLN to Buffer
END FUNCTION
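For readers who want to experiment, the pseudocode above could be sketched in Python roughly like this. It is a hedged illustration only, not a verified implementation: it assumes the PROJECT stream is passed in as raw bytes with CRLF line endings, that the excluded properties appear under the names ID, Document, CMG, DPB, GC and Package, and that no further transformation of the values is required. The function name normalize_package_stream and the constant EXCLUDED are hypothetical.

```python
# Assumed on-disk names of ProjectId, ProjectDocModule, ProjectProtectionState,
# ProjectPassword, ProjectVisibilityState and ProjectPackage.
EXCLUDED = {b"ID", b"Document", b"CMG", b"DPB", b"GC", b"Package"}


def normalize_package_stream(project_stream: bytes) -> bytes:
    """Sketch of the pseudocode above, working on the raw PROJECT stream bytes."""
    properties = bytearray()
    host_extender_refs = bytearray()
    section = None  # None while still in the leading ProjectProperties part
    for line in project_stream.split(b"\r\n"):
        if not line:
            continue
        if line.startswith(b"[") and line.endswith(b"]"):
            section = line[1:-1]  # e.g. b"Host Extender Info" or b"Workspace"
        elif section is None:
            name, sep, value = line.partition(b"=")
            if sep and name not in EXCLUDED:
                properties += name + value  # property name, then property value
        elif section == b"Host Extender Info":
            host_extender_refs += line  # HostExtenderRef without its newline
    return bytes(properties) + b"Host Extender Info" + bytes(host_extender_refs)
```

Working on bytes avoids having to guess the project's code page in this sketch; a real implementation should be checked against shared sample data.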
Hi!
I've just sent you an email with the zip file and the information about the MS employee I've been talking to.
Cheers
Alex
File received. I will post here when I get an update.
Thank you very much
Hungchun Yu
Microsoft Open Specifications
Hi @Alexander P
Please let us know whether the sample code that was emailed to you helped.
Hungchun Yu
Microsoft Open Specifications
Hi @Alexander P
Thank you for the updated sample. I have shared it with the product group and will send you an update via email.
Hi @Alexander P
Thank you for the great suggestions on the MS-OVBA specification and for pointing out the issues you have uncovered.
Microsoft is working on reviewing and incorporating the suggestions for future revisions.
In the meantime, could you kindly share the samples you worked on (with your company's private information removed) with other implementers? If you can provide a shared link in your reply, we would greatly appreciate it.
It would also help if you could share what you learned and the pitfalls other implementers can avoid, as that could speed up their implementations.