NDKPI Deferred Processing Scheme
There are many cases where an NDK consumer will post a chain of initiator requests to the queue pair (QP). For example, a consumer could post a number of fast register requests followed by a send request. The performance for such request patterns may be improved if the chain of requests is queued to the QP and then indicated to the hardware for processing as a batch, rather than indicating each request in the chain to the hardware, one by one.
The NDK_OP_FLAG_DEFER flag value can be used for this purpose with the following request types:
- NdkBind (NDK_FN_BIND)
- NdkFastRegister (NDK_FN_FAST_REGISTER)
- NdkInvalidate (NDK_FN_INVALIDATE)
- NdkRead (NDK_FN_READ)
- NdkSend (NDK_FN_SEND)
- NdkSendAndInvalidate (NDK_FN_SEND_AND_INVALIDATE)
- NdkWrite (NDK_FN_WRITE)
The presence of the flag is a hint to the NDK provider that it may defer indicating the request to hardware for processing, but the provider may process the new request at any time.
The presence of the NDK_OP_FLAG_DEFER flag on an initiator request does not change the NDK provider's existing responsibilities with respect to generating completions. A call to the initiator request that returns a failure status must not result in a completion being queued to the CQ for the failed request. Conversely, a call that returns a success status must eventually result in a completion being queued to the CQ as long as the consumer follows the additional requirements listed below.
Beyond the existing NDK requirements, two more requirements (one for the provider and one for the consumer) must be observed to prevent a situation in which requests are successfully posted to the QP with the NDK_OP_FLAG_DEFER flag but are never indicated to the hardware for processing:
- When returning a failure status from a call to an initiator request, the provider must guarantee that all requests that were previously submitted with the NDK_OP_FLAG_DEFER flag are indicated to the hardware for processing.
- The consumer guarantees that, in the absence of an inline failure, all initiator request chains will be terminated by an initiator request that does not set the NDK_OP_FLAG_DEFER flag.
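The consumer-side requirement can be illustrated with a minimal sketch in plain C. It does not use the real NDK headers: `SKETCH_FLAG_DEFER` is a stand-in for NDK_OP_FLAG_DEFER (its value here is arbitrary), and `sketch_request`, `sketch_post_chain`, `sketch_log`, and `sketch_record_post` are hypothetical names introduced only for illustration. The `post` callback stands in for a call such as NdkFastRegister or NdkSend.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for NDK_OP_FLAG_DEFER; the real value comes from the NDK headers. */
#define SKETCH_FLAG_DEFER 0x1u

/* Hypothetical request descriptor: post() stands in for a call such as
 * NdkFastRegister or NdkSend and returns 0 on success, nonzero on failure. */
typedef struct sketch_request {
    int (*post)(void *context, unsigned flags);
    void *context;
} sketch_request;

/* Posts a chain of requests. Every request except the last sets the defer
 * flag; the last one clears it, so the chain is always terminated.
 * On an inline failure the remaining requests are not posted.
 * Returns the number of requests posted successfully; *status receives the
 * status of the last post attempt (0 if every post succeeded). */
size_t sketch_post_chain(const sketch_request *chain, size_t count, int *status)
{
    assert(chain != NULL || count == 0);
    assert(status != NULL);

    *status = 0;
    for (size_t i = 0; i < count; i++) {
        unsigned flags = (i + 1 < count) ? SKETCH_FLAG_DEFER : 0u;
        *status = chain[i].post(chain[i].context, flags);
        if (*status != 0) {
            return i; /* inline failure: stop posting the rest of the chain */
        }
    }
    return count;
}

/* Test double: records the flags passed on each call and fails on the call
 * whose zero-based index equals fail_at. */
typedef struct sketch_log {
    unsigned flags[8];
    size_t calls;
    size_t fail_at; /* use a value of 8 or more for "never fail" */
} sketch_log;

int sketch_record_post(void *context, unsigned flags)
{
    sketch_log *log = context;
    size_t index = log->calls++;
    if (index < 8) {
        log->flags[index] = flags;
    }
    return (index == log->fail_at) ? -1 : 0;
}
```

In this sketch the helper never needs a separate "flush" call: either the final request clears the flag, or an inline failure occurs and the provider-side requirement takes over.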
For example, consider a case where a consumer has a chain of two fast register requests and a send that it needs to post to the QP:
- The consumer posts the first fast register with the NDK_OP_FLAG_DEFER flag and NdkFastRegister returns STATUS_SUCCESS.
- The consumer posts the second fast register, again with the NDK_OP_FLAG_DEFER flag set, but this time NdkFastRegister returns a failure status. In this case, the consumer will not post the send request.
- When returning the inline failure for the second call to NdkFastRegister, the NDK provider makes sure that all previously unindicated requests (the first fast register in this case) are indicated to the hardware for processing.
- Because the first call to NdkFastRegister succeeded, a completion for that request must be queued to the CQ.
- Because the second call to NdkFastRegister failed inline, a completion for that request must not be queued to the CQ.
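The provider side of this scenario can be modeled with a small mock in plain C. This is only a sketch under stated assumptions, not how any real provider is implemented: `mock_qp`, `mock_post`, `mock_flush`, and `mock_expected_completions` are hypothetical names, `MOCK_FLAG_DEFER` is a stand-in for NDK_OP_FLAG_DEFER with an arbitrary value, and the mock always defers when the flag is set, although a real provider may process a deferred request at any time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for NDK_OP_FLAG_DEFER; the real value comes from the NDK headers. */
#define MOCK_FLAG_DEFER 0x1u

/* Hypothetical model of a QP that batches deferred requests. */
typedef struct mock_qp {
    size_t pending;   /* accepted with the defer flag, not yet indicated */
    size_t indicated; /* handed to the (imaginary) hardware */
} mock_qp;

/* Indicates every queued request to the hardware as one batch. */
static void mock_flush(mock_qp *qp)
{
    qp->indicated += qp->pending;
    qp->pending = 0;
}

/* Posts one request. `accept` simulates whether the call succeeds.
 * Returns 0 on success and -1 on an inline failure. */
int mock_post(mock_qp *qp, unsigned flags, bool accept)
{
    assert(qp != NULL);

    if (!accept) {
        /* Provider requirement: before returning a failure, indicate all
         * previously deferred requests. The failed request itself is never
         * queued, so it can never produce a completion. */
        mock_flush(qp);
        return -1;
    }

    qp->pending++;
    if ((flags & MOCK_FLAG_DEFER) == 0) {
        /* A request without the defer flag terminates the chain. */
        mock_flush(qp);
    }
    return 0;
}

/* In this model, each indicated request eventually yields one completion. */
size_t mock_expected_completions(const mock_qp *qp)
{
    return qp->indicated;
}
```

Replaying the example against this mock: the first deferred post succeeds and leaves one request pending; the second post fails, which flushes the pending request; the send is never posted. The model then expects exactly one completion, for the first fast register.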