SAX2 C++ Common Notices

 

This topic describes the language-specific behavior that is common to all SAX interfaces when you work in Microsoft® Visual C++®.

Strings

In this implementation, every string passed to a handler callback consists of two parts: a pointer to a wchar_t buffer and the length of the string. This applies to strings passed directly, such as the element name for the startElement method, and to strings passed indirectly, such as those provided by the ISAXAttributes and ISAXLocator interfaces. Strings returned from these components are owned by the components, not by the calling process. Therefore, their memory should not be released, freed, or deleted. A string represented as a buffer pointer and a length might not be zero-terminated.

When a string consists of both a pointer to the buffer and a length, the length is always the correct length of the string, whether or not the string is zero-terminated. For input parameters, strings follow the traditional zero-terminated Unicode string format, for example, rdr.putBase(L"https://microsoft.com/"). The content of a string passed to a handler is not guaranteed to retain its value at the next call to a handler.
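Because handler strings might not be zero-terminated and might not keep their value across calls, a handler that needs a string later typically copies it by using the supplied length. The following is a minimal sketch of that idea in portable C++. The helper name copySaxString and its parameter names are hypothetical and are not part of MSXML.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copies a (pointer, length) string into an owning
// std::wstring. It relies only on the length, never on a terminator, and it
// does not free the source buffer because the parser component owns it.
std::wstring copySaxString(const wchar_t* pwch, int cch) {
    assert(cch >= 0);  // a negative length would indicate a caller bug
    if (pwch == nullptr || cch == 0) {
        return std::wstring();
    }
    return std::wstring(pwch, static_cast<std::size_t>(cch));
}
```

A handler such as startElement could call this helper on each name it receives and keep the resulting std::wstring after the callback returns.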

After any successful getProperty call, the caller should free the returned memory. The MSXML Simple API for XML (SAX2) property strings are the exception to the general COM rule that system-allocated objects are allocated by the data source and freed by the consumer.

Declaration Conflicts

The Microsoft® XML Core Services (MSXML) implementation of SAX2 is nonvalidating. As a result, a validation error does not occur if the same element is declared twice, with different definitions, in an internal DTD. When element/entity declaration conflicts occur, the first declaration takes precedence.
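An application that records declarations itself may want to follow the same first-declaration-wins rule. The following sketch assumes a hypothetical DeclRegistry class (not an MSXML type) that could be filled from a declaration handler callback. It only illustrates the precedence rule and does not show how MSXML stores declarations internally.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry that keeps the first declaration seen for each name
// and ignores later, conflicting ones.
class DeclRegistry {
public:
    // Returns true if the declaration was stored, or false if an earlier
    // declaration with the same name already exists and was kept.
    bool declare(const std::wstring& name, const std::wstring& model) {
        assert(!name.empty());
        // std::map::emplace leaves an existing entry untouched.
        return decls_.emplace(name, model).second;
    }

    // Returns the stored content model, or nullptr if the name is unknown.
    const std::wstring* lookup(const std::wstring& name) const {
        auto it = decls_.find(name);
        return it == decls_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::wstring, std::wstring> decls_;
};
```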

Skipped Entities

If an entity is declared in an external DTD, MSXML SAX2 reports it as a skipped entity, and the value of the undeclared (skipped) entity defaults to empty. However, MSXML SAX2 does not report entities in attributes; they are silently skipped.

Features

Features handled and recognized by SAXXMLReader are:

  • "exhaustive-errors"

  • "http://xml.org/sax/features/external-general-entities"

  • "http://xml.org/sax/features/external-parameter-entities"

  • "http://xml.org/sax/features/lexical-handler/parameter-entities"

  • "http://xml.org/sax/features/namespaces"

  • "http://xml.org/sax/features/namespace-prefixes"

  • "preserve-system-identifiers"

  • "schema-validation"

  • "server-http-request"

Properties

Properties handled and recognized by SAXXMLReader are:

  • "http://xml.org/sax/properties/lexical-handler"

  • "http://xml.org/sax/properties/declaration-handler"

  • "http://xml.org/sax/properties/dom-node"

  • "schemas"

  • "schema-declaration-handler"

  • "charset"

  • "xmldecl-encoding"

  • "xmldecl-version"

  • "xmldecl-standalone"

Return Codes

Every handler can return either S_OK to continue or an error code other than S_OK. If a handler returns a code other than S_OK, parsing is aborted and that code is returned by the parse, parseURL, or resume method.
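The abort behavior can be pictured with a small simulation. This sketch does not use MSXML: the HResult alias, the kOk and kFail constants (stand-ins for S_OK and E_FAIL), and the parseAll driver are all hypothetical names chosen so that the code compiles without the Windows SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Stand-ins for HRESULT, S_OK, and E_FAIL (assumed values, for illustration).
using HResult = std::int32_t;
const HResult kOk = 0;
const HResult kFail = static_cast<HResult>(0x80004005u);

// Hypothetical driver: passes each element name to the handler and stops at
// the first result other than kOk, returning that code to the caller.
// *delivered receives the number of callbacks that returned kOk.
HResult parseAll(const std::vector<std::wstring>& names,
                 const std::function<HResult(const std::wstring&)>& onStart,
                 std::size_t* delivered) {
    assert(delivered != nullptr);
    *delivered = 0;
    for (const auto& name : names) {
        const HResult hr = onStart(name);
        if (hr != kOk) {
            return hr;  // abort: the handler's code becomes the result
        }
        ++*delivered;
    }
    return kOk;
}
```

In this sketch, a handler that returns kFail for the second of three names causes parseAll to return kFail after only one successful callback, and the third name is never delivered.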

Most parser methods can return general error codes such as the following.

E_OUTOFMEMORY
Out of memory.

E_INVALIDARG
Input parameter is invalid.

E_FAIL
General failure.

In addition to checking these return codes, each application is responsible for checking for and handling a return value of NULL, and should not supply NULL as a return value where it is not appropriate. Otherwise, the application might cause an access violation at run time.