TDR debuggability improvements

To aid TDR (timeout detection and recovery) analysis, the OS historically called the kernel-mode driver's DxgkddiCollectDbgInfo callback to allow the driver to write its own payload into the TDR report that the system uploads from the customer machine.

The TDR debug improvements described in this article are available starting in Windows 11, version 24H2 (WDDM 3.2). Graphics driver developers should be familiar with GPU timeout detection and recovery in Windows, as described in "Timeout detection and recovery" and "TDR in Windows 8 and later".

DDI changes

DxgkddiCollectDbgInfo2

DxgkddiCollectDbgInfo2 is added as a TDR debug extension. This callback allows the OS to pass more detailed information about the root cause of the TDR to the kernel-mode driver (KMD). The KMD, in turn, can save state that is relevant to the part of the GPU responsible for the TDR.

DxgkddiCollectDbgInfo2 is a superset of the existing DxgkddiCollectDbgInfo.

  • A WDDM 3.2 driver isn't required to implement DxgkddiCollectDbgInfo2. If the driver doesn't implement it, the OS calls DxgkddiCollectDbgInfo.

  • If the KMD does implement DxgkddiCollectDbgInfo2, the OS invokes it instead of DxgkddiCollectDbgInfo in all cases.

The DRIVER_INITIALIZATION_DATA structure is extended to include a pointer to DxgkddiCollectDbgInfo2.
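The selection rule above can be sketched as follows. This is only an illustration of the documented behavior, not the OS implementation: every type and function name here (SKETCH_ARGS_V1, SKETCH_ARGS_V2, SKETCH_CALLBACKS, DispatchCollectDbgInfo, and the two sample callbacks) is a hypothetical stand-in, and the real structures carry more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the real WDDM argument types. */
typedef struct SKETCH_ARGS_V1 { unsigned Reason; } SKETCH_ARGS_V1;
typedef struct SKETCH_ARGS_V2 { unsigned Reason; unsigned TdrType; } SKETCH_ARGS_V2;

typedef long (*SKETCH_COLLECT_V1)(void *ctx, const SKETCH_ARGS_V1 *args);
typedef long (*SKETCH_COLLECT_V2)(void *ctx, const SKETCH_ARGS_V2 *args);

/* Hypothetical slice of a driver callback table. */
typedef struct SKETCH_CALLBACKS {
    SKETCH_COLLECT_V1 CollectDbgInfo;   /* legacy callback */
    SKETCH_COLLECT_V2 CollectDbgInfo2;  /* optional; NULL when not implemented */
} SKETCH_CALLBACKS;

/* Sample callbacks that record which one ran in the int that ctx points to. */
long SampleLegacy(void *ctx, const SKETCH_ARGS_V1 *args)
{
    (void)args;
    *(int *)ctx = 1;
    return 0;
}

long SampleExtended(void *ctx, const SKETCH_ARGS_V2 *args)
{
    (void)args;
    *(int *)ctx = 2;
    return 0;
}

/* Prefers the extended callback whenever it is present; otherwise falls back
 * to the legacy one. Returns 2, 1, or 0 to say which callback (if any) ran. */
int DispatchCollectDbgInfo(const SKETCH_CALLBACKS *cb, void *ctx,
                           const SKETCH_ARGS_V2 *args)
{
    assert(cb != NULL && args != NULL);

    if (cb->CollectDbgInfo2 != NULL) {
        cb->CollectDbgInfo2(ctx, args);
        return 2;
    }
    if (cb->CollectDbgInfo != NULL) {
        /* The legacy callback only receives the fields it already knows. */
        SKETCH_ARGS_V1 legacy = { args->Reason };
        cb->CollectDbgInfo(ctx, &legacy);
        return 1;
    }
    return 0;
}
```

In this sketch, a table that fills in both pointers never has its legacy callback invoked, which mirrors the "instead of DxgkddiCollectDbgInfo in all cases" rule.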

DXGKARG_COLLECTDBGINFO2

The OS passes the added DXGKARG_COLLECTDBGINFO2 structure to DxgkddiCollectDbgInfo2.

The layout of DXGKARG_COLLECTDBGINFO2 is backwards compatible with the existing DXGKARG_COLLECTDBGINFO structure to allow the DxgkDdiCollectDbgInfo2 implementation to reuse existing DxgkDdiCollectDbgInfo helpers as needed. For this reason, the Reason, pBuffer, BufferSize, and pExtension fields have the same semantics.

The following additional fields are in DXGKARG_COLLECTDBGINFO2 but not in DXGKARG_COLLECTDBGINFO:

  • TdrType
  • TdrPayloadSize
  • TdrPayload

For some TDR types, the OS provides additional information in the TdrPayload buffer, which is TdrPayloadSize bytes long. TdrPayload can be NULL, and the driver is expected to handle this case without crashing.

When the payload isn't NULL, it can be cast to a structure that corresponds to the TDR type. The OS might grow these structures in a backwards compatible manner, adding new fields at the end. The driver must check TdrPayloadSize before accessing TdrPayload fields to make sure the OS implements the desired payload version or later.
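A minimal sketch of the size check follows. The payload structure SKETCH_TDR_PAYLOAD and its fields are hypothetical and exist only to show the pattern; the real payload structures and their fields are defined by the WDDM headers. The sketch assumes a payload whose first version had two fields and whose later version appended a third.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload layout, for illustration only. */
typedef struct SKETCH_TDR_PAYLOAD {
    uint32_t NodeOrdinal;    /* present in the first version */
    uint32_t EngineOrdinal;  /* present in the first version */
    uint64_t LastFenceId;    /* appended in a later version */
} SKETCH_TDR_PAYLOAD;

/* True when `field` lies entirely within the first `size` bytes of `type`. */
#define SKETCH_HAS_FIELD(type, field, size) \
    ((size) >= offsetof(type, field) + sizeof(((type *)0)->field))

/* What the driver managed to extract from the payload. */
typedef struct SKETCH_TDR_INFO {
    int      HasNode;
    uint32_t NodeOrdinal;
    int      HasFence;
    uint64_t LastFenceId;
} SKETCH_TDR_INFO;

/* Reads only the fields that the supplied payload is large enough to contain.
 * A NULL payload is handled by leaving every Has* flag clear. */
void ReadTdrPayload(const void *payload, size_t payloadSize, SKETCH_TDR_INFO *out)
{
    const SKETCH_TDR_PAYLOAD *p = (const SKETCH_TDR_PAYLOAD *)payload;

    assert(out != NULL);
    out->HasNode = 0;
    out->NodeOrdinal = 0;
    out->HasFence = 0;
    out->LastFenceId = 0;

    if (p == NULL) {
        return;
    }
    if (SKETCH_HAS_FIELD(SKETCH_TDR_PAYLOAD, NodeOrdinal, payloadSize)) {
        out->NodeOrdinal = p->NodeOrdinal;
        out->HasNode = 1;
    }
    if (SKETCH_HAS_FIELD(SKETCH_TDR_PAYLOAD, LastFenceId, payloadSize)) {
        out->LastFenceId = p->LastFenceId;
        out->HasFence = 1;
    }
}
```

Checking each field against the supplied size, rather than comparing against sizeof the whole structure, lets a driver built against newer headers keep working on an OS that supplies only the older, shorter payload.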

The memory that TdrPayload points to is valid only during the DxgkddiCollectDbgInfo2 call. The driver shouldn't keep a pointer to TdrPayload after DxgkddiCollectDbgInfo2 returns.
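One way to respect this lifetime rule is to copy the bytes the driver needs into driver-owned storage before the callback returns. The sketch below assumes a fixed-size snapshot buffer; SKETCH_TDR_SNAPSHOT, SKETCH_MAX_PAYLOAD, and SnapshotTdrPayload are hypothetical names, and the 256-byte capacity is an arbitrary choice for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_PAYLOAD 256  /* arbitrary capacity chosen for this sketch */

/* Driver-owned storage that outlives the callback. */
typedef struct SKETCH_TDR_SNAPSHOT {
    size_t        Size;                       /* number of valid bytes in Bytes */
    unsigned char Bytes[SKETCH_MAX_PAYLOAD];
} SKETCH_TDR_SNAPSHOT;

/* Copies up to SKETCH_MAX_PAYLOAD bytes of the payload into the snapshot and
 * returns the number of bytes copied. The caller keeps the snapshot, never
 * the original payload pointer. */
size_t SnapshotTdrPayload(SKETCH_TDR_SNAPSHOT *snap, const void *payload,
                          size_t payloadSize)
{
    size_t n;

    assert(snap != NULL);
    snap->Size = 0;

    if (payload == NULL || payloadSize == 0) {
        return 0;  /* nothing to copy; the snapshot stays empty */
    }

    /* Truncate rather than overflow if the OS supplies a larger payload. */
    n = payloadSize < sizeof(snap->Bytes) ? payloadSize : sizeof(snap->Bytes);
    memcpy(snap->Bytes, payload, n);
    snap->Size = n;
    return n;
}
```

After the copy, later changes to or reuse of the OS buffer don't affect the snapshot, so the driver can examine it after the callback has returned.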

Starting in WDDM 3.2, the following payload structures are added as possible payloads for TdrPayload to point to.