When Should I Use SAX?

As an event-based parser that processes documents serially, the Simple API for XML (SAX) is an excellent alternative to the Document Object Model (DOM) in the following situations.

When your documents are large

Perhaps the biggest advantage of SAX is that it requires significantly less memory than the DOM to process an XML document. Because SAX reports the document as a stream of events instead of building a tree, the parser's memory consumption does not increase with the size of the file; only the state your own handlers keep adds to it. For example, a 100 kilobyte (KB) document can occupy up to 1 megabyte (MB) of memory using the DOM; the same document requires significantly less memory when using SAX. If you must process large documents, SAX is the better alternative, particularly if you do not need to change the contents of the document.

When you need to abort parsing

Because SAX allows you to abort processing at any time, you can use it to create applications that fetch particular data. For example, you can create an application that searches for a part in inventory. When the application finds the part, it returns the part number and availability, and then stops processing.
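
The inventory search described above can be sketched with Python's standard `xml.sax` module. This is a minimal illustration, not a definitive implementation: the `<part>` element, its `number` and `availability` attributes, and the names `PartFound`, `PartFinder`, and `find_part` are all assumptions invented for the example. Python's SAX binding has no dedicated abort call, so the handler stops the parse by raising an exception; other SAX implementations signal an abort differently.

```python
import io
import xml.sax


class PartFound(Exception):
    """Raised by the handler to stop parsing once the part is located."""


class PartFinder(xml.sax.ContentHandler):
    """Looks for a <part> element with a given number (hypothetical schema)."""

    def __init__(self, wanted_number):
        super().__init__()
        self.wanted_number = wanted_number
        self.elements_seen = 0

    def startElement(self, name, attrs):
        self.elements_seen += 1
        if name == "part" and attrs.get("number") == self.wanted_number:
            # Raising here unwinds out of the parser, so the rest of the
            # document is never reported to the handler.
            raise PartFound(attrs.get("availability"))


def find_part(xml_bytes, wanted_number):
    """Return (availability, elements_seen); availability is None if absent."""
    handler = PartFinder(wanted_number)
    try:
        xml.sax.parse(io.BytesIO(xml_bytes), handler)
    except PartFound as found:
        return found.args[0], handler.elements_seen
    return None, handler.elements_seen


# Invented sample data for illustration only.
INVENTORY = b"""<inventory>
  <part number="A-100" availability="in stock"/>
  <part number="B-200" availability="backordered"/>
  <part number="C-300" availability="in stock"/>
</inventory>"""

if __name__ == "__main__":
    print(find_part(INVENTORY, "B-200"))
```

The `elements_seen` counter is only there to make the early exit observable: when the part is found, elements after it are not counted.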

When you want to retrieve small amounts of information

For many XML-based solutions, it is not necessary to read the entire document into memory to achieve the desired results. For example, if you want to scan data for relevant news about a particular stock, it is inefficient to load the unrelated data as well. With SAX, your application can scan the data for news related only to the stock symbols you indicate, and then create a slimmed-down document structure to pass along to a news service. Because only a small percentage of the document is ever retained, the savings in system resources are significant.
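
As a rough sketch of this kind of scan, the following Python handler keeps only the news items whose symbol is in a requested set and ignores everything else as it streams past. The feed layout (`<item symbol="...">` elements containing a headline), the placeholder symbols, and the names `StockNewsFilter` and `scan_news` are assumptions made for the example.

```python
import xml.sax


class StockNewsFilter(xml.sax.ContentHandler):
    """Collects headlines for selected symbols (hypothetical feed layout)."""

    def __init__(self, symbols):
        super().__init__()
        self.symbols = set(symbols)
        self.matches = []     # (symbol, headline) pairs that were kept
        self._symbol = None   # symbol of the <item> being read, if wanted
        self._text = []

    def startElement(self, name, attrs):
        if name == "item" and attrs.get("symbol") in self.symbols:
            self._symbol = attrs.get("symbol")
            self._text = []

    def characters(self, content):
        # Text is buffered only while inside an item we care about.
        if self._symbol is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "item" and self._symbol is not None:
            self.matches.append((self._symbol, "".join(self._text).strip()))
            self._symbol = None


def scan_news(xml_bytes, symbols):
    """Return the (symbol, headline) pairs for the requested symbols."""
    handler = StockNewsFilter(symbols)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches


# Invented sample data; AAA and BBB are placeholder symbols.
FEED = b"""<news>
  <item symbol="AAA">Quarterly results announced</item>
  <item symbol="BBB">Unrelated story</item>
  <item symbol="AAA">New product ships</item>
</news>"""

if __name__ == "__main__":
    for symbol, headline in scan_news(FEED, {"AAA"}):
        print(symbol, headline)
```

The handler's memory use depends on the number of matching items, not on the size of the feed.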

When you want to create a new document structure

In some cases, you might want to use SAX to create a data structure using only high-level objects, such as stock symbols and news, and then combine the data from this XML file with other news sources. Rather than build a DOM structure with low-level elements, attributes, and processing instructions, you can build the document structure more efficiently and quickly using SAX.
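
One way to picture this is a handler that turns SAX events directly into application objects, which can then be merged with data from elsewhere. The sketch below assumes a hypothetical `<story>` layout with `<symbol>` and `<headline>` children; the `Story` class and the helpers `parse_stories` and `combine_by_symbol` are illustrative names, not part of any SAX API.

```python
from dataclasses import dataclass
import xml.sax


@dataclass
class Story:
    """A high-level object; no element or attribute nodes are retained."""
    symbol: str
    headline: str
    source: str


class StoryBuilder(xml.sax.ContentHandler):
    """Builds Story objects from <story> elements (hypothetical layout)."""

    def __init__(self, source):
        super().__init__()
        self.source = source
        self.stories = []
        self._fields = {}
        self._buf = []

    def startElement(self, name, attrs):
        if name == "story":
            self._fields = {}
        self._buf = []  # start collecting text for the new element

    def characters(self, content):
        self._buf.append(content)

    def endElement(self, name):
        if name in ("symbol", "headline"):
            self._fields[name] = "".join(self._buf).strip()
        elif name == "story":
            self.stories.append(Story(self._fields.get("symbol", ""),
                                      self._fields.get("headline", ""),
                                      self.source))
        self._buf = []


def parse_stories(xml_bytes, source):
    """Parse one feed into a list of Story objects tagged with its source."""
    handler = StoryBuilder(source)
    xml.sax.parseString(xml_bytes, handler)
    return handler.stories


def combine_by_symbol(*story_lists):
    """Group stories from any number of sources under their stock symbol."""
    combined = {}
    for stories in story_lists:
        for story in stories:
            combined.setdefault(story.symbol, []).append(story)
    return combined


# Invented sample data for illustration only.
SAMPLE = b"""<news>
  <story><symbol>AAA</symbol><headline>Earnings released</headline></story>
  <story><symbol>BBB</symbol><headline>Plant opens</headline></story>
</news>"""
```

Calling `combine_by_symbol(parse_stories(SAMPLE, "feed-a"), other_stories)` merges the parsed stories with a list obtained from another source, without an intermediate DOM tree for either.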

When you cannot afford the DOM overhead

For large documents and for large numbers of documents, SAX provides a more efficient method for parsing XML data. For example, consider a remote procedure call (RPC) that returns 10 MB of data to a middle-tier server to be passed to a client. Using SAX, the data can be processed using a small input buffer, a small work buffer, and a small output buffer. Using the DOM, the data structure is constructed in memory, requiring a 10 MB work buffer and at least a 10 MB output buffer for the formatted XML data.
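
The small-buffer pattern can be illustrated with the incremental interface of Python's `xml.sax` parser, which accepts the document one chunk at a time. In this sketch the function name `relay` and the 4096-byte default chunk size are assumptions; `XMLGenerator` is the standard-library handler that writes incoming events back out as XML, standing in here for whatever formatting the middle tier performs.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


def relay(source, sink, chunk_size=4096):
    """Stream XML from a binary source to a text sink, one chunk at a time.

    Only the current input chunk and the parser's internal state are held
    in memory; events are written to the sink as they are reported.
    """
    parser = xml.sax.make_parser()
    parser.setContentHandler(XMLGenerator(sink, encoding="utf-8"))
    while True:
        chunk = source.read(chunk_size)  # small input buffer
        if not chunk:
            break
        parser.feed(chunk)               # events flow straight to the sink
    parser.close()                       # signals end of document


if __name__ == "__main__":
    # Invented sample data: 1000 small records standing in for a large reply.
    rows = "".join('<row id="%d"/>' % i for i in range(1000))
    reply = ("<rows>" + rows + "</rows>").encode("utf-8")
    out = io.StringIO()
    relay(io.BytesIO(reply), out, chunk_size=64)
    print(len(out.getvalue()))
```

In a real middle tier the source and sink would be network streams, so neither the full input nor the full output would need to be held at once.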