Code Page Specification for Flat File Schemas

Overview

The value of the Code Page property is used to create an encoding object for the disassembly and assembly of flat file documents. This encoding object allows the flat file parser to convert the native encoding of an inbound flat file document into the normalized UTF-8 encoding that Microsoft BizTalk Server uses internally. It also allows the flat file serializer to convert the internal UTF-8 encoding back into the native encoding of the flat file document.
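The round trip described above can be illustrated with a minimal Python sketch. It is not BizTalk code: the `CODE_PAGE` constant and both function names are hypothetical, and Windows-1252 is only an assumed example of a Code Page value.

```python
# Hypothetical stand-in for the encoding object created from the Code Page
# property. Windows-1252 is an assumed example value, not a default.
CODE_PAGE = "cp1252"


def to_internal_utf8(native_bytes: bytes, code_page: str = CODE_PAGE) -> bytes:
    """Parser direction: native flat file bytes -> internal UTF-8 bytes."""
    return native_bytes.decode(code_page).encode("utf-8")


def to_native(utf8_bytes: bytes, code_page: str = CODE_PAGE) -> bytes:
    """Serializer direction: internal UTF-8 bytes -> native flat file bytes."""
    return utf8_bytes.decode("utf-8").encode(code_page)


# "café" as it would appear in a Windows-1252 flat file (é is byte 0xE9).
inbound = b"caf\xe9"
internal = to_internal_utf8(inbound)   # é becomes the two bytes 0xC3 0xA9
outbound = to_native(internal)         # back to the single byte 0xE9
```

If the code page does not match the file's real encoding, the decode step either fails or silently produces the wrong characters, which is why the property value matters in both directions.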

The setting of the Code Page property plays an important, but not exclusive, role in determining the character encoding scheme used by your flat file business documents. You must consider how inbound flat file messages are interpreted by the flat file disassembler as well as how the flat file assembler will encode characters as outbound messages are translated into flat file format.

Character encoding

Several factors determine how character encoding is handled for a given instance message:

  • When disassembling a flat file instance message, the following algorithm is used to determine and preserve encoding information:

    1. If the Charset in the Message body part is set, its value is used.

    2. Otherwise, if the envelope (or document) schema specifies a code page using the Code Page property, its value is used.

    3. Otherwise, if a byte order mark is present, the encoding it indicates is used.

    4. Otherwise, assume UTF-8.

  • When assembling a flat file instance message, the following algorithm is used to determine the character set to use for encoding:

    1. If the XMLNorm.TargetCharset message context property is set, its value is used.

    2. Otherwise, if the TargetCharset assembler (design-time) property is set, its value is used.

    3. Otherwise, if the envelope (or document) schema specifies a code page using the Code Page property, its value is used.

    4. Otherwise, if the SourceCharset message context property is set, its value is used.

    5. Otherwise, use UTF-8.
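Both precedence chains above can be sketched as simple fallback functions. This is a minimal Python illustration, not the BizTalk implementation: every function and parameter name is hypothetical, the returned charset labels are Python codec names, and an empty string is assumed to mean "not set".

```python
import codecs
from typing import Optional

# BOM signatures. UTF-32 LE is checked before UTF-16 LE because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def detect_bom(data: bytes) -> Optional[str]:
    """Return the encoding indicated by a leading byte order mark, if any."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def resolve_disassembly_charset(
    body_part_charset: Optional[str],
    schema_code_page: Optional[str],
    data: bytes,
) -> str:
    """Disassembly order: body part Charset, schema Code Page, BOM, UTF-8."""
    return body_part_charset or schema_code_page or detect_bom(data) or "utf-8"


def resolve_assembly_charset(
    context_target_charset: Optional[str],    # XMLNorm.TargetCharset
    assembler_target_charset: Optional[str],  # TargetCharset design-time property
    schema_code_page: Optional[str],          # Code Page schema property
    context_source_charset: Optional[str],    # SourceCharset context property
) -> str:
    """Assembly order: first value that is set wins, else UTF-8."""
    return (
        context_target_charset
        or assembler_target_charset
        or schema_code_page
        or context_source_charset
        or "utf-8"
    )
```

One consequence the sketch makes visible: during disassembly a schema Code Page value takes precedence over a byte order mark, so a document that starts with a UTF-16 mark is still read with the schema's code page when one is specified.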

See Also

Considerations When Creating Flat File Message Schemas

Code Page (Node Property of Flat File Schemas) in the UI guidance and developers API namespace reference