Understanding XML Format Files

Microsoft SQL Server 2005 and later versions provide an XML schema that defines syntax for writing XML format files. XML format files must adhere to this schema, which is defined in the XML Schema Definition Language (XSDL). XML format files are only supported when SQL Server tools are installed together with SQL Server Native Client. 

Note

SQL Server 2005 and later versions support two types of format files. In contrast, Microsoft SQL Server 2000 and earlier versions supported only a single type of format file. To distinguish the pre-existing type from XML format files, the pre-existing type of format file is called a non-XML format file.

You can use XML format files to bulk import data into tables or non-partitioned views and to bulk export data. XML format files are an alternative to non-XML format files, which are supported in SQL Server 2000 and earlier versions (and remain supported). XML format files are more flexible and powerful than non-XML format files. XML format files are easy to understand and create. Furthermore, they are human readable, making it easy to understand how data is interpreted during bulk operations. The XML encoding clearly describes the data types and data elements of the data file and also the mapping between data elements and table columns.

An XML format file can be enhanced yet remain compatible with its earlier versions. Furthermore, the clarity of XML encoding facilitates the creation of multiple format files for a given data file. This is useful if you have to map all or some of the data fields to columns in different tables or views.

The bcp command allows you to automatically generate an XML format file for a table; for more information, see bcp Utility.

Structure of XML Format Files

Like a non-XML format file, an XML format file defines the format and structure of the data fields in a data file and maps those data fields to columns in a single target table.

An XML format file possesses two main components, <RECORD> and <ROW>:

  • <RECORD> describes the data as it is stored in the data file.

    Each <RECORD> element contains a set of one or more <FIELD> elements. These elements correspond to fields in the data file. The basic syntax is as follows:

    <RECORD>

       <FIELD .../> [ ...n ]

    </RECORD>

    Each <FIELD> element describes the contents of a specific data field. A field can only be mapped to one column in the table. Not all fields need to be mapped to columns.

    A field in a data file can be either of fixed/variable length or character terminated. A field value can be represented as: a character (using single-byte representation), a wide character (using Unicode two-byte representation), native database format, or a file name. If a field value is represented as a file name, the file name points to the file that contains the value of a BLOB column in the target table.

  • <ROW> describes how to construct data rows from a data file when the data from the file is imported into a SQL Server table.

    A <ROW> element contains a set of <COLUMN> elements. These elements correspond to table columns. The basic syntax is as follows:

    <ROW>

       <COLUMN .../> [ ...n ]

    </ROW>

    Each <COLUMN> element can be mapped to only one field in the data file. The order of the <COLUMN> elements in the <ROW> element defines the order in which they are returned by the bulk operation. The XML format file assigns each <COLUMN> element a local name that has no relationship to the column in the target table of a bulk import operation.

Additional Topics