Support for Multiple Languages

Note

Indexing Service is no longer supported as of Windows XP and is unavailable for use as of Windows 8. Instead, use Windows Search for client-side search and Microsoft Search Server Express for server-side search.

 

Because IIS can search documents in many languages, multilingual indexing and querying are standard features of Indexing Service. The query system is designed with localization in mind: it is completely modular and can dynamically load and unload language-specific utilities such as word breakers, stemmers, and normalizers. These language resource components are available for several languages.

Indexing Service can index multilingual documents and switch between languages as required (for example, index an English paragraph, index a French paragraph, and switch back to English). All index information is stored as Unicode characters, and all queries are converted to Unicode before they are processed.
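The following Python sketch illustrates the idea of switching word breakers per chunk and keeping everything in Unicode. It is not Indexing Service code: the two toy breakers, the `BREAKERS` table, and the `index_document` and `decode_query` helpers are hypothetical names invented for this illustration, and real language components are far more sophisticated.

```python
import re
from collections import defaultdict

# Toy word breakers (assumptions for illustration only).
def break_english(text):
    # Keep apostrophes inside words, so "don't" stays one term.
    return re.findall(r"[^\W_]+(?:'[^\W_]+)*", text.lower())

def break_french(text):
    # Split on apostrophes, so "l'index" becomes "l" and "index".
    return re.findall(r"[^\W_]+", text.lower())

# Hypothetical registry of language-specific breakers, keyed by language tag.
BREAKERS = {"en": break_english, "fr": break_french}

def index_document(doc_id, chunks, index):
    """Index (language_tag, text) chunks, switching breakers chunk by chunk."""
    for lang, text in chunks:
        for term in BREAKERS[lang](text):
            index[term].add(doc_id)  # terms are Python str values, i.e. Unicode

def decode_query(raw, codepage):
    """Convert a query received as bytes to Unicode before it is processed."""
    return raw.decode(codepage)

index = defaultdict(set)
index_document(
    "memo.txt",
    [
        ("en", "Don't delete the index."),
        ("fr", "L'index est prêt."),
        ("en", "Back to English."),
    ],
    index,
)
```

Because the stored terms are Unicode, the query term `prêt` finds the document whether it arrives encoded as Windows-1252 or as UTF-8, provided it is decoded with the matching code page first.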

Indexing Service does not distinguish between languages once the words have been entered into the index. As a result, a query can return documents written in a language different from the language of the query. In many cases this is appropriate. For example, a query for Windows 95 will return a French document that contains the English phrase Windows 95. In other cases it is not: some words, known as homographs, are spelled the same in two or more languages but have very different meanings. Indexing Service cannot tell these cases apart because it does not perform any language translation.
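A minimal sketch of this behavior, assuming a toy inverted index that records terms but not the language they came from. The `tokenize`, `build_index`, and `search` helpers and the two sample documents are invented for this illustration and do not reflect the actual index format.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Toy tokenizer: lowercase runs of letters and digits.
    return re.findall(r"[^\W_]+", text.lower())

def build_index(docs):
    index = defaultdict(set)
    for doc_id, (lang, text) in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)  # the language tag is deliberately not stored
    return index

def search(index, query):
    """Return documents containing every query term (not a true phrase match)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

docs = {
    "en.txt": ("en", "Join the chat about Windows 95."),
    "fr.txt": ("fr", "Le chat dort pendant l'installation de Windows 95."),
}
index = build_index(docs)

windows_hits = search(index, "Windows 95")  # both documents, which is desirable
chat_hits = search(index, "chat")           # both documents, despite different meanings
```

The query for `chat` matches the English document (a conversation) and the French one (a cat), because nothing in the index distinguishes the two senses.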

Finally, mixing languages can yield unpredictable results. For example, if you set the multilanguage form to German and query for English words, the results will likely differ from those returned when the identical query is posted with the language set to English. This is because Indexing Service uses the German linguistics modules to analyze the query field and determine which words and phrases to search for (that is, it tries to perform German word breaking on English text). The German word breaker assumes German grammar when it breaks text into words, so it often produces incorrect word breaks on non-German text.
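The effect of analyzing a query with the wrong language module can be sketched as follows. This is a deliberately simplified model: it assumes the German breaker splits compound words using a tiny made-up lexicon (`GERMAN_PARTS`), which may not match how the real German word breaker behaves. All function names here are hypothetical.

```python
import re

# Tiny made-up lexicon standing in for German compound knowledge (assumption).
GERMAN_PARTS = {"arm", "band", "hand", "schuh"}

def break_english(text):
    # Toy English breaker: lowercase runs of letters and digits.
    return re.findall(r"[^\W_]+", text.lower())

def split_compound(word):
    """Return the lexicon parts if the word is built entirely from them, else None."""
    if not word:
        return []
    for i in range(len(word), 0, -1):  # try the longest prefix first
        if word[:i] in GERMAN_PARTS:
            rest = split_compound(word[i:])
            if rest is not None:
                return [word[:i]] + rest
    return None

def break_german(text):
    # Toy German breaker: same splitting, plus decompounding of known compounds.
    terms = []
    for word in break_english(text):
        parts = split_compound(word)
        terms.extend(parts if parts else [word])
    return terms

BREAKERS = {"en": break_english, "de": break_german}

def matches(indexed_terms, query, query_lang):
    """True if every term produced by the chosen breaker is in the index."""
    return all(term in indexed_terms for term in BREAKERS[query_lang](query))

# An English document, indexed with the English breaker.
indexed = set(break_english("A reflective armband for runners"))

english_query_hit = matches(indexed, "armband", "en")  # query term: "armband"
german_query_hit = matches(indexed, "armband", "de")   # query terms: "arm", "band"
```

With the language set to English, the query term `armband` matches the indexed term. With the language set to German, the toy breaker splits the same query into `arm` and `band`, neither of which is in the index, so the identical query returns nothing.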

This section contains: