Exchange 2013: eDiscovery Changes

With the release of Exchange 2013, several changes affect eDiscovery, whether you are placing In-Place Holds or running litigation queries that export to the Discovery Mailbox. Most notably, eDiscovery (Exchange Search) no longer supports Advanced Query Syntax (AQS); it now uses Keyword Query Language (KQL). KQL is supported in the SearchQuery parameter (the Keywords box in the Exchange Admin Center). Outlook, however, still uses AQS.

Using KQL, we can run searches that directly serve eDiscovery and save time, money, and resources, without having to bring in a third party to process the data.

For example, if I query for any messages that have only a Word document as an attachment, I get the two messages I expect to find.

If I perform the same query but, this time, define a subject or keyword I'm after, the messages are excluded because the primary rule hasn't been met.

If I perform a third query with words that exist in the document (but not in the document name), these documents are returned by my query as well.
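The three queries above can be approximated from the Exchange Management Shell. What follows is only a sketch: the search names, the source mailbox, and the KQL strings are placeholders, not the exact queries used above. The `attachment:docx` form in particular is an assumption about how to match Word attachments and may need adjusting for your data. EstimateOnly is used so that nothing is copied to a Discovery Mailbox.

```powershell
# Run in the Exchange Management Shell with Discovery Management rights.
# Every name and query string here is a placeholder for illustration.
$queries = [ordered]@{
    'Lab-WordAttachment'        = 'attachment:docx'
    'Lab-WordAttachmentSubject' = 'attachment:docx AND subject:"Binaries Test"'
    'Lab-DocumentKeyword'       = '"quarterly forecast"'   # words inside the document, not its name
}

foreach ($name in $queries.Keys) {
    # Create an estimate-only search; no messages are copied anywhere.
    New-MailboxSearch -Name $name -SourceMailboxes 'Administrator' `
        -SearchQuery $queries[$name] -EstimateOnly | Out-Null

    # The search does not run until it is started.
    Start-MailboxSearch -Identity $name
}

# Searches run asynchronously, so re-run this until Status shows the search has finished.
foreach ($name in $queries.Keys) {
    Get-MailboxSearch -Identity $name | Format-List Name, Status, ResultNumberEstimate
}
```

Comparing the ResultNumberEstimate values across the three searches should show the same narrowing and widening described above, assuming the placeholder queries match your test messages.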

There is a limit on the number of mailboxes that can be searched at one time: 5,000*. A query against more mailboxes than this returns the following error: An unknown error occurred on the search server. Please contact your administrator for assistance. The message from the search server is 'The search exceeded the maximum number of mailboxes that can be searched at a time. Please try searching less than 5000 mailboxes.'.

*The maximum number of mailboxes that you can search can be changed in on-premises Exchange 2013. You can use the Set-ThrottlingPolicy cmdlet with the DiscoveryMaxMailboxes parameter to do so, but this may negatively affect performance.
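As a rough illustration of the footnote, the commands below inspect the current discovery limits and then raise DiscoveryMaxMailboxes on one policy. 'DiscoveryPolicy' is a hypothetical policy name that is assumed to exist already, and 7500 is an arbitrary example value. Whether a higher value actually takes effect is not confirmed here, so treat this as something to test in a lab first.

```powershell
# Show the discovery-related limits on every throttling policy.
Get-ThrottlingPolicy | Format-List Name, ThrottlingPolicyScope, Discovery*

# 'DiscoveryPolicy' is a hypothetical, pre-existing policy; 7500 is an example value.
Set-ThrottlingPolicy -Identity 'DiscoveryPolicy' -DiscoveryMaxMailboxes 7500

# Confirm the stored value after the change.
Get-ThrottlingPolicy -Identity 'DiscoveryPolicy' | Format-List Name, DiscoveryMaxMailboxes
```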

Because Exchange now uses the FAST Search index, we can query for which documents haven't been indexed and why. For example, if I want to find the documents where the document parser encountered a processing error, I would run the following command in the Exchange Management Shell:

Get-FailedContentIndexDocuments Administrator -ErrorCode 7 | FT -AutoSize

DocID Database                Mailbox       Subject           Description
----- --------                -------       -------           -----------
3462  LAB-NAEX15-01 Store 002 Administrator Binaries Test     The document parser encountered a processing error.
3464  LAB-NAEX15-01 Store 002 Administrator FW: Binaries Test The document parser encountered a processing error.

Using this I can see what, precisely, caused the document to not be indexed:

$errorSevens = Get-FailedContentIndexDocuments Administrator -ErrorCode 7
$errorSevens[0].AdditionalInfo
301002 Error parsing document 'exchange://localhost/Attachment/34eb02b4-3bc6-4163-a40d-2587faa9e0db/135d5536-d180-4198-9ba8-574b53df8206/e08d777e-e710-4407-a53d-1f57a4a58d79/a654efa1-bb87-426a-aaca-9866be733ccd/438086667654.0/System.Data.dll'. Document has an undetectable format and will not be parsed.
301002 Error parsing document 'exchange://localhost/Attachment/34eb02b4-3bc6-4163-a40d-2587faa9e0db/135d5536-d180-4198-9ba8-574b53df8206/e08d777e-e710-4407-a53d-1f57a4a58d79/a654efa1-bb87-426a-aaca-9866be733ccd/438086667654.1/mscorlib.dll'. Document has an undetectable format and will not be parsed.
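Indexing with [0] fails with "Cannot index into a null array" when no documents match. A slightly more defensive variant, sketched here with the same mailbox and error code, wraps the result in an array and walks every failed document instead of only the first. It relies only on the DocID, Subject, and AdditionalInfo properties shown above.

```powershell
# @() guarantees an array, even when the cmdlet returns nothing or a single object.
$errorSevens = @(Get-FailedContentIndexDocuments -Identity Administrator -ErrorCode 7)

if ($errorSevens.Count -eq 0) {
    Write-Output 'No documents failed with error code 7.'
}
else {
    foreach ($doc in $errorSevens) {
        # Print a short header, then the parser detail for that document.
        '{0} (DocID {1})' -f $doc.Subject, $doc.DocID
        $doc.AdditionalInfo
        ''
    }
}
```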

In this case, the documents are binaries that were attached to the email to test another issue. FAST Search cannot reverse-engineer binaries, so it is safe to assume that these files aren't necessary for my eDiscovery purposes. Microsoft publishes a list of the file formats that Exchange FAST Search can index.

The error code enumerations are as follows:

0 - No problems.
1 - An error has occurred.
2 - A timeout has occurred.
3 - The message was not processed in a timely manner.
4 - The mailbox was offline.
5 - The attachment limit was reached.
6 - The item is only partially indexed.
7 - The document parser encountered a processing error.
8 - The document annotations aren't valid.
9 - The document is suspected of being unable to be processed.
10 - The document processing failed due to a Rights Management error.
11 - The Store Session is not available.
12 - The mailbox is quarantined.
13 - The mailbox is locked.
14 - The operation is not supported.
15 - Search can't sign in to the mailbox.
16 - Body conversion failed.
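To see which of these codes actually occur for a mailbox, the list above can be turned into a lookup table and checked one code at a time. This is a minimal sketch, assuming the same Administrator mailbox as before. It calls the cmdlet once per code, which may be slow on a large mailbox.

```powershell
# Lookup table built from the error code list above.
$errorCodes = @{
    0  = 'No problems.'
    1  = 'An error has occurred.'
    2  = 'A timeout has occurred.'
    3  = 'The message was not processed in a timely manner.'
    4  = 'The mailbox was offline.'
    5  = 'The attachment limit was reached.'
    6  = 'The item is only partially indexed.'
    7  = 'The document parser encountered a processing error.'
    8  = "The document annotations aren't valid."
    9  = 'The document is suspected of being unable to be processed.'
    10 = 'The document processing failed due to a Rights Management error.'
    11 = 'The Store Session is not available.'
    12 = 'The mailbox is quarantined.'
    13 = 'The mailbox is locked.'
    14 = 'The operation is not supported.'
    15 = "Search can't sign in to the mailbox."
    16 = 'Body conversion failed.'
}

# Count failed documents per error code and report only the codes that occur.
foreach ($code in 1..16) {
    $failed = @(Get-FailedContentIndexDocuments -Identity Administrator -ErrorCode $code)
    if ($failed.Count -gt 0) {
        [pscustomobject]@{
            ErrorCode = $code
            Count     = $failed.Count
            Meaning   = $errorCodes[$code]
        }
    }
}
```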

Comments

  • Anonymous
    January 01, 2003
    Hi, Solvetech! Apologies for the late reply.

    You're probably getting an error because [0] retrieves the first element (index zero) of the returned object. If there is no object (i.e., if it is 'null'), there is nothing to index. Most commonly, this results in an error like "Cannot index into a null array."

    The '.AdditionalInfo' part reads the 'AdditionalInfo' property of the object returned at index zero.
  • Anonymous
    January 01, 2003
    Hey, Chris! Thanks! :)
  • Anonymous
    April 16, 2014
    Hey John, long time no see. nice article :)
  • Anonymous
    May 03, 2015
    Here are some pointers on searches with EWS:

    Exchange 2013 has indexed fields and many fields automatically
  • Anonymous
    December 28, 2015
    You reference this article in another post about the max number of mailboxes that can be searched. It appears that, regardless of what you set the throttling policy to be, the max number is 5000. Period. This article gives hope that more than 5000 can be searched by the statement: "*The maximum number of mailboxes that you can search can be changed in on-premises Exchange 2013. You can use the Set-ThrottlingPolicy command with the DiscoveryMaxMailboxes parameter to do so but this may come at a negative impact to performance." Can you confirm that no more than 5000 mbxs can be searched regardless of what the throttling policy is set at? Thanks!
    • Anonymous
      June 20, 2016
      Is it 10.000 maximum mailboxes now? I managed to search 8-9.000 but tried to increase the Throttling Policy's DiscoveryMaxMailboxes to 50.000 and search around 20.000 but received an error that the maximum number is 10.000.
    • Anonymous
      November 15, 2016
      Also wondering if there is a hard limit of 5000. I've adjusted throttling policy settings and it seems to have no effect.