Bug in NDISPROT 6001.18001?

Does NDISPROT 6001.18001 have a bug relating to cancel operations?

Looking at 60\sys\send.c:320, in NdisProtCancelWrite, the pOpenContext->Lock is released. While it had the lock, it might have found that the IRP being cancelled still exists in the PendedWrites list in the pOpenContext. It then proceeds to read the CancelId from the DriverContext of the IRP and use that to cancel the outstanding NET_BUFFER_LISTs.

However, in 60\sys\send.c:410 the pOpenContext->Lock is acquired by NdisprotSendComplete, and then removes the IRP from its list. Further down, without locks or conditionals, it does IoCompleteRequest on the IRP.

If IoCompleteRequest executes and completes before the other thread can read the CancelId from the IRP, that value or the storage for the IRP itself would be invalid.

I am assuming that two threads can reach those two lines at the same time; I believe this is possible, even with the test in NdisprotCancelWrite for a NULL pOpenContext (DriverContext[0]) while holding the CancelSpinLock. NdisprotSendComplete also grabs the CancelSpinLock, and then sets DriverContext[0] to NULL, but it does so without checking to see if the cancel routine has already been started, etc.

I believe a fix for this is to move 60\sys\send.c:326 (CancelId = pIrp->Tail.Overlay.DriverContext[2]:wink: up into where the IRP is discovered in the pOpenContext PendedWrites list, whlie the pOpenContextLock is still held. This would mean that the CancelId value is guaranteed to be good, and then when the code falls through to NdisCancelSendNetBufferLists it does not have to reference the IRP (also, referencing the IRP in the DEBUGP statement would be fixed as well).

Is this a real bug, or is there some other reason this can’t happen?

I’ve demonstrated in my code that I can definitely come to a point where:

* Thread 1 is in NdisprotCancelWrite, it has the pOpenContext->Lock, and is iterating through the list of IRP’s and has found the IRP has not yet been removed.
* Thread 2 is in NdisprotSendComplete and has already set DriverContext[0] and DriverContext[1] to NULL, but has not yet acquired the pOpenContext->Lock and removed the IRP from the PendedWrites list.

If from here Thread 1 releases the lock, and then Thread 2 gets switched in and executes all the way to IoCompleteRequest, after which Thread 1 would then attempt to reference pIrp->Tail, which may be free’d memory by that time.

Please feel free to correct me if I’ve missed something…