CM_PROB_DRIVER_FAILED_PRIOR_UNLOAD Issue (code 38)

Hello.

I have a strange problem which appears only on Windows 10 (1703/1709 tested), and not on Windows 7 (no Windows 8 tests so far). I have a WDM bus driver (which I maintain, but did not write from scratch) which enumerates one AVSTREAM child (audio/MIDI). In certain cases, I see the code CM_PROB_DRIVER_FAILED_PRIOR_UNLOAD listed for the parent bus driver in the SETUPAPI log file (and in Device Manager).

There is a FAQ for debugging this:
“Debugging a Failed Driver Unload”
https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/debugging-a-failed-driver-unload

After a few debug sessions, I have found out that the OBJECT_HEADER of the PDO created by my bus driver has a PointerCount > 0 after the device has been unplugged, resulting in the bus device driver instance being pinned in memory.
I can usually reproduce the issue as follows:

a) Plug in device.

b) Start a MIDI input/output application, which opens handles to the AVSTREAM child.

c) Unplug the device and replug it while keeping the application open. I receive surprise removal only, as the application does not close its handles. Doing multiple unplug/replug cycles in this step (while keeping the application open) makes the problem easier to reproduce.

d) Close application, unplug and replug the device, and then the symptom in the device manager appears (code 38). The original child PDO has a PointerCount > 0.

I have seen that my bus device receives surprise removal for the PDO and reports the PDO as missing by calling IoInvalidateDeviceRelations(BusRelations) (I tried adding TargetDeviceRelation as well). I do not see query device relations (BusRelations) being sent to the bus device afterwards, probably because it too has been invalidated by the real bus driver. After the application is closed, I do receive the remove device on the PDO, and I do call IoDeleteDevice on it.

What is strange is that I do not see how I could be responsible for this PointerCount issue on the PDO. The debugging technique described at the above link involves setting a logging break on access on the OBJECT_HEADER of the suspect object, and dumping the stacks of all the reference/dereference calls. All the calls which involve my device driver are either query device relations (BusRelations) on the parent or query device relations (TargetDeviceRelation) on the child (excluding a few other balanced cases). In each query device relations call, I return a referenced device object, as the rules of PnP stipulate. It’s as if the caller in some cases does not dereference the returned PDO.
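The bookkeeping this technique requires can be partly automated. Below is a minimal sketch, assuming the logged stacks were saved as text in the same layout as the stacks pasted later in this thread (frame number, two addresses, `module!symbol+offset`), that each full stack starts at frame `00`, and that the symbol in the top frame tells a reference from a dereference. The helper names `parse_stacks`, `delta` and `net_by_caller` are hypothetical and not part of any debugger tooling.

```python
import re
from collections import Counter

# One frame line in the layout used in this thread, e.g.
# "00 fffffa85656976f0 fffff800aa4d886e nt!ObfDereferenceObject+0x29"
FRAME_RE = re.compile(
    r"^\s*([0-9a-f]{2,})\s+[0-9a-f`]+\s+[0-9a-f`]+\s+(\S+?)(?:\+0x[0-9a-f]+)?\s*$",
    re.I,
)

def parse_stacks(text):
    """Split a log into stacks; each stack is a list of symbols, top frame first."""
    stacks, current = [], []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if not m:
            continue  # skip anything that is not a frame line
        index, symbol = m.group(1), m.group(2)
        if index == "00" and current:
            stacks.append(current)  # frame 00 starts a new stack
            current = []
        current.append(symbol)
    if current:
        stacks.append(current)
    return stacks

def delta(stack):
    """+1 for a reference, -1 for a dereference, 0 otherwise.

    Assumption: the top frame's symbol name identifies the operation.
    """
    top = stack[0].lower()
    if "dereference" in top:
        return -1
    if "reference" in top:
        return 1
    return 0

def net_by_caller(stacks, depth=1):
    """Sum the deltas, keyed by the frame `depth` levels below the top frame."""
    totals = Counter()
    for stack in stacks:
        if len(stack) > depth:
            totals[stack[depth]] += delta(stack)
    return totals

# The top three frames of the dereference stack quoted later in this thread.
SAMPLE = """\
00 fffffa85656976f0 fffff800aa4d886e nt!ObfDereferenceObject+0x29
01 fffffa8565697730 fffff800aa4d8d6a nt!PiSwPdoAssociationFree+0x12
02 fffffa8565697760 fffff800aa4d8ea5 nt!PiSwRemovePdoAssociation+0x36
"""

for caller, net in net_by_caller(parse_stacks(SAMPLE)).items():
    print(caller, net)
```

For the sample, the single dereference is attributed to nt!PiSwPdoAssociationFree with a net count of -1. On a full log, callers whose net count stays above zero are the first candidates for an unbalanced reference; raising `depth` groups the stacks by a frame further from the top.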

Note that when the child PDO is registered as surprise removed, I will never return it in QDR, either BusRelations for the parent, or TargetDeviceRelation for the child (and I see no such query anyway after surprise removal has been received).

Could anyone think of what may be happening here (am I responsible for this, and if so, how?), or supply any input?
Perhaps the issue is not the bus driver's handling of the PDO, but something in the KS/AVSTREAM layer (whether in my minidriver or not). I am deleting my filter factories, and the AVSTREAM FDO is deleted and freed. WinDBG indicates that the AVSTREAM child’s module is unloaded.

Thank you,
Philip Lukidis

Just to add: it seems that I have found an interesting deviation in comparing a failing and working case.

I have compared two cases. The first is the PROBLEM CASE: the device is unplugged/replugged while the application is still open, then the application is closed, and finally the device is unplugged/replugged again (see my last post; call this “case 1”). The second is the case in which there is no application and I surprise remove the device (call this “case 2”, aka “the GOOD CASE”).

I admit I have not compared every reference/dereference between “case 1” and “case 2”. However, at the point in “case 1” where the application is still open and still accessing my device, and in “case 2” with no application open, the PointerCount of the leaked PDO is 6 in both cases (the KS layer still references the device even when it is not used by a client application). I did not mention in my original post that in “case 1” the final PointerCount of the leaked PDO is 3 after the device has been finally unplugged and the application closed, as opposed to 0 in “case 2”, where there is no problem.

So I compared the sequence of references/dereferences from that point onward (just before the FIRST surprise removal). I found that I can account for the 3 missing dereferences; they all originate from this base stack frame:

09 fffffa8565697980 fffff800aa3e88bd nt!PnpDeleteLockedDeviceNodes+0xb3
0a fffffa85656979f0 fffff800aa3e87cf nt!PipRemoveDevicesInRelationList+0x8d
0b fffffa8565697a40 fffff800a9e4a4d5 nt!PnpDelayedRemoveWorker+0xff
0c fffffa8565697a80 fffff800a9f25c07 nt!ExpWorkerThread+0xf5
0d fffffa8565697b10 fffff800a9f8bcc6 nt!PspSystemThreadStartup+0x47
0e fffffa8565697b60 0000000000000000 nt!KiStartSystemThread+0x16

From this stack frame, “case 2” makes three additional calls to nt!ObfDereferenceObject (I have pasted only 1 of the 3 such calls below; note that nt!PiSwProcessParentRemoveIrp is NEVER called in “case 1”):

00 fffffa85656976f0 fffff800aa4d886e nt!ObfDereferenceObject+0x29
01 fffffa8565697730 fffff800aa4d8d6a nt!PiSwPdoAssociationFree+0x12
02 fffffa8565697760 fffff800aa4d8ea5 nt!PiSwRemovePdoAssociation+0x36
03 fffffa8565697790 fffff800aa4d80e6 nt!PiSwUnassociateDeviceObject+0x21
04 fffffa85656977c0 fffff800aa415340 nt!PiSwDestroyDeviceObject+0x16
05 fffffa85656977f0 fffff800aa268abb nt!PiSwProcessParentRemoveIrp+0x1ac5f0
06 fffffa8565697820 fffff800a9e22a93 nt!IopRemoveDevice+0xbb
07 fffffa85656978e0 fffff800aa267d66 nt!PnpRemoveLockedDeviceNode+0x1ab
08 fffffa8565697940 fffff800aa267a93 nt!PnpDeleteLockedDeviceNode+0x4e
09 fffffa8565697980 fffff800aa3e88bd nt!PnpDeleteLockedDeviceNodes+0xb3
0a fffffa85656979f0 fffff800aa3e87cf nt!PipRemoveDevicesInRelationList+0x8d
0b fffffa8565697a40 fffff800a9e4a4d5 nt!PnpDelayedRemoveWorker+0xff
0c fffffa8565697a80 fffff800a9f25c07 nt!ExpWorkerThread+0xf5
0d fffffa8565697b10 fffff800a9f8bcc6 nt!PspSystemThreadStartup+0x47
0e fffffa8565697b60 0000000000000000 nt!KiStartSystemThread+0x16

So it seems that nt!PipRemoveDevicesInRelationList through nt!PnpRemoveLockedDeviceNode in “case 2” for some reason calls nt!IopRemoveDevice 3 more times than in “case 1” (or at least I never see nt!PiSwProcessParentRemoveIrp being called in “case 1”).
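This kind of side-by-side comparison can also be scripted. The following is a minimal sketch, assuming the stacks for each case were saved as text in the same layout as the stacks above; it counts how often each symbol appears in each log and reports the symbols whose counts differ. The names `symbol_counts` and `compare_cases` and the file names in the closing comment are hypothetical.

```python
import re
from collections import Counter

# Matches one frame line, e.g.
# "06 fffffa8565697820 fffff800a9e22a93 nt!IopRemoveDevice+0xbb"
SYMBOL_RE = re.compile(
    r"^\s*[0-9a-f]{2,}\s+[0-9a-f`]+\s+[0-9a-f`]+\s+(\S+?)(?:\+0x[0-9a-f]+)?\s*$",
    re.I,
)

def symbol_counts(log_text):
    """Count how many logged frames mention each symbol (offsets stripped)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = SYMBOL_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

def compare_cases(good_log, bad_log):
    """Return {symbol: (good_count, bad_count)} for symbols whose counts differ."""
    good, bad = symbol_counts(good_log), symbol_counts(bad_log)
    return {
        symbol: (good[symbol], bad[symbol])
        for symbol in sorted(set(good) | set(bad))
        if good[symbol] != bad[symbol]
    }

# Example call (file names are placeholders):
# diff = compare_cases(open("case2_good.log").read(), open("case1_bad.log").read())
```

A symbol that shows up only in the good log would be reported with a zero count on the bad side, which makes a code path that is missing in one case stand out without reading every stack by hand. Comparing raw frame counts is a coarse heuristic; it says where the cases diverge, not why.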

Please note that for both “case 1” and “case 2” nt!PipRemoveDevicesInRelationList through nt!PnpRemoveLockedDeviceNode DO send the remove device PnP IRP for the PDO, and in both “case 1” and “case 2”, my driver does call IoDeleteDevice on this problematic PDO.

Could anyone explain why the two cases diverge in this way? Frankly, at this moment, I have no idea how I could have provoked this.

Thank you,
Philip Lukidis

From: Philip Lukidis
Sent: April 3, 2018 12:27 PM
To: ‘xxxxx@lists.osr.com’
Subject: CM_PROB_DRIVER_FAILED_PRIOR_UNLOAD Issue (code 38)
