Which driver failure would cause all disk access to cease without system panic?

Hello all,

This one is a bit of a toughie to debug, so I thought I’d ask and see if anyone has any advice from past experience.

I’m running into an issue on a beta build of Windows 10 (1048, hence my earlier email about public symbols) wherein some driver in the disk subsystem fails, causing all attempts to read unpaged data to hang. The system slowly grinds to a halt as whatever was paged to memory requires access to the disk and is unable to proceed. The symptoms vary each time: sometimes I can get far enough to get Task Manager running, sometimes I can’t. No BSOD occurs here without a full driver verifier configured.

Now what makes this hard to debug is that no crash dumps are written to the disk. I can’t remote debug because I can only reproduce this on a laptop, and only with full driver verification enabled (the only way I can get it to BSOD… sometimes). Regular boot mode causes a myriad of other BSODs in core drivers (network, touchpad, graphics). In safe mode, with only the necessary MS drivers loaded, I can trigger a BSOD when this happens, but no network debugging is available there.

The only filters I have loaded are EhStorClass and partmgr for the disk class and I’ve tried both an MS-provided and OEM-provided driver for my (NVME) SCSIAdapter device driver, but still get the disk access failure. I’m using the generic DiskDrive driver for the disk itself.

Any clues? I’m happy to provide whatever additional info I can.

The paging path is blocked by paging itself: something in the paging I/O path either called into a pageable code section or touched pageable data.

Run !stacks and see what the threads are blocked at.

Thanks for the reply, Alex.

I’m not exactly sure what I’m looking for, to be honest. As luck would have it, I ended up running into this right in the middle of a !stacks call and it seemed that pretty much everything was stuck on nt!IoRemoveIoCompletion+0x8d (presumably unable to continue the normal stack progression to nt!KeRemoveQueueEx and beyond).

The StorageKD extension can be useful in these cases as it will show you the
state of the storage IRPs:

https://msdn.microsoft.com/en-US/library/windows/hardware/dn997250(v=vs.85).aspx

I usually dump the system log in these cases also. If there’s a hardware
failure sometimes you see disk retry errors:

!wmitrace.logdump EventLog-System

-scott
OSR
@OSRDrivers

Hey Scott, I appreciate your chiming in.

I’ve been keeping my eye on the system log and there’s been nothing there. I’ll try the storage extensions next time this happens (if the system doesn’t crash before then).

You may need to run
!stacks 2

!stacks without arguments filters out “insignificant” wait reasons, which could include waits for paging.

Alex, do you know whether, when debugging locally, the output of !stacks is a snapshot or real-time? For example, on my machine !stacks takes over 10 minutes to finish. Is the output of the later entries 10 minutes old, or is it consistent with the system state at the time it appears on screen?

I ran into the hang earlier today and discovered that the !storagekd.* commands will hang in the debugger when I’m experiencing this issue. Just *busy* and no response before the machine completely froze a few seconds later. Haven’t been able to run !stacks 2 during the hang (but did manage to hang it after calling !stacks but before the !stacks call finished - hence my question above).

OK, let me just say this: I hate this. Turns out I can’t run !stacks either once this issue has kicked in. The debugger just goes *busy* and twiddles its thumbs while my PC begins to thrash in the throes of its upcoming and now inevitable death.

Maybe related question: why would a driver verifier violation in BTHPORT.SYS fail to write the dump to disk (remaining stuck at 0%)? Other BSODs dump to disk OK and I don’t see what BTHPORT.SYS would have to do with the storport subsystem.

By definition, when you break in with a remote kernel debugger, the OS state is frozen (except for KDNET activity). The debugger doesn’t make any effort to take a snapshot.

DO NOT USE LOCAL DEBUGGER FOR THIS.

Thanks, Alex. As mentioned above, remote debugging is a bit difficult due to the nature of the situation. I’m going to try via USB 3 as soon as my male-to-male cable comes in.

Progress, however: I got really lucky and was able to execute !storclass xxxx 2 just after this happened without it hanging. I found a host of 0x28 read and 0x2a write failures with SRB status 0x04, which is something. It’s not the physical drive, because yanking the disk and sticking it in another PC does not exhibit this problem. I suppose in this case it is the NVMe driver that is translating the hardware failure to a code 0x04 SRB failure, though I’m not sure what underlying hardware error could be responsible for this condition.

It happens with both the MSFT and the Samsung NVMe drivers, so combined with the SRB failure it’s making me suspect the hardware. I think a clean install of Windows on a new partition on the same drive may be in order to see if it exhibits the same?

So I was fully expecting this to be a hardware problem but a clean installation of Windows 10 15011 revealed no instabilities until upgraded to 15048 at which point they resumed. *sigh*

My USB debug cable comes in tomorrow.

Did the !storclass command show you the sense data as well? SRB status of 4
is just the generic “SRB_STATUS_ERROR”. The SCSI status and sense data
should have more detail. You can even translate them back into the NVMe
failure using the SCSI to NVMe spec (see section 7):

http://www.nvmexpress.org/wp-content/uploads/NVM-Express-SCSI-Translation-Reference-1_1-Gold.pdf

Not to say that’s necessarily going to provide much more interesting info,
but the more details we can find the better.

-scott
OSR
@OSRDrivers
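
For anyone following along, here is a minimal sketch of how the SrbStatus byte splits into a status code plus two flag bits, which is why 0x04 on its own says so little. The constants are the SRB_STATUS_* values from the WDK header srb.h (only a subset is listed); the helper name decode_srb_status is made up for this illustration.

```python
# Flag bits that can be OR-ed into SrbStatus (values from the WDK srb.h).
SRB_STATUS_QUEUE_FROZEN = 0x40
SRB_STATUS_AUTOSENSE_VALID = 0x80

# Subset of the SRB_STATUS_* codes from srb.h; not an exhaustive list.
SRB_STATUS_NAMES = {
    0x00: "SRB_STATUS_PENDING",
    0x01: "SRB_STATUS_SUCCESS",
    0x02: "SRB_STATUS_ABORTED",
    0x03: "SRB_STATUS_ABORT_FAILED",
    0x04: "SRB_STATUS_ERROR",
    0x05: "SRB_STATUS_BUSY",
    0x06: "SRB_STATUS_INVALID_REQUEST",
    0x08: "SRB_STATUS_NO_DEVICE",
    0x09: "SRB_STATUS_TIMEOUT",
    0x0A: "SRB_STATUS_SELECTION_TIMEOUT",
    0x0B: "SRB_STATUS_COMMAND_TIMEOUT",
    0x0E: "SRB_STATUS_BUS_RESET",
}


def decode_srb_status(raw):
    """Hypothetical helper: return (name, autosense_valid, queue_frozen)."""
    flags = SRB_STATUS_AUTOSENSE_VALID | SRB_STATUS_QUEUE_FROZEN
    code = raw & ~flags & 0xFF  # same masking as the SRB_STATUS() macro
    name = SRB_STATUS_NAMES.get(code, "unknown (0x%02x)" % code)
    return (
        name,
        bool(raw & SRB_STATUS_AUTOSENSE_VALID),
        bool(raw & SRB_STATUS_QUEUE_FROZEN),
    )


if __name__ == "__main__":
    print(decode_srb_status(0x04))
```

If the autosense bit is not set, as with a plain 0x04, the sense buffer was not filled in by the miniport, so all-zero sense data is not surprising in that case.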

Hi Scott,

Unfortunately I don’t think it was anything useful. Basically, the failed requests were all:

Opcode: 2a/28
SRB: 04
SCSI Status: 0
Sense Code: 00000
Sector: random
Timestamp: +/- 0.09 seconds apart
Retried

I know 0x04 is just a generic HBA/driver failure, but at least it meant that it wasn’t a regular failing disk timeout error. Beyond that, it’s not very useful…

With my debug USB cable in hand, I was *finally* able to get some real data out of this. Sorry for wasting everyone’s time with the pathetic attempts at local debugging earlier in this thread.

Unfortunately, I don’t have the symbols for this: there are no symbols for 15055 as of yet. I’ve contacted WinDbgFb and we’ll see if we hear back. I may have to debug this on a clean install of a different build that triggers the issue yet has the debug symbols available… *sigh*

Anyway, this is what I get:

0: kd> !wmitrace.logdump EventLog-System

| |
| NT symbols are not available |
| reduced functionality |

(WmiTrace) LogDump for Logger Id 0x0c
Found Buffers: 2 Messages: 2, sorting entries
[0]0000.0000:: 131339913007908456 [Microsoft-Windows-StorPort/Bus reset /OpCodeBusReset]Bus reset occured on storport adapter (Port Number: 1)
[0]0000.0000:: 131339913307864363 [Microsoft-Windows-StorPort/None /Info]A request timed out for Storport Device (Port = 1, Path = 0, Target = 0, Lun = 0).
Corresponding Class Disk Device Guid is {0d61954a-3e28-2ff4-3116-b0b7bdd7c44b}.
Total of 2 Messages from 2 Buffers
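
As an aside, the two 18-digit timestamps in the log above look like Windows FILETIME values (100 ns ticks since 1601-01-01 UTC). That is an assumption, not something the output states. Under that assumption, a small sketch like the following converts them and measures the gap between the two events; the function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000


def filetime_to_utc(ticks):
    """Convert FILETIME ticks to an aware UTC datetime (microsecond precision)."""
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def filetime_delta_seconds(start_ticks, end_ticks):
    """Seconds elapsed between two FILETIME values."""
    return (end_ticks - start_ticks) / TICKS_PER_SECOND


if __name__ == "__main__":
    bus_reset = 131339913007908456   # first log entry above
    timed_out = 131339913307864363   # second log entry above
    print(filetime_to_utc(bus_reset).isoformat())
    print(filetime_to_utc(timed_out).isoformat())
    print(filetime_delta_seconds(bus_reset, timed_out))
```

Read this way, the two entries are roughly 30 seconds apart, which happens to be close to the 30s timeout shown for the RESET LUN request further down.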

0: kd> !storclass


0: kd> !storadapter
STORPORT adapters:
==================
Driver Object Extension State
-----------------------------------------------------------------
\Driver\secnvme ffffc883f4bc2050 ffffc883f4bc21a0 Working
\Driver\storahci ffffc883f4bb5050 ffffc883f4bb51a0 Working

0: kd> !storadapter ffffc883f4bc2050
ADAPTER
DeviceObj : ffffc883f4bc2050 AdapterExt: ffffc883f4bc21a0 DriverObj : ffffc883f4ba58d0
DeviceState : Working
LowerDO ffffc883f4b59620 PhysicalDO ffffc883f4b59840
SlowLock Free RemLock -666
SystemPowerState: Working AdapterPowerState D0 Full Duplex
Bus 4 Slot 0 DMA ffffc883f4bd0dd0 Interrupt 0000000000000000
Allocated ResourceList ffffc883f4bc6960
Translated ResourceList ffffc883f4ba78f0
Gateway: Outstanding 0 Lower 1024 High 1024
PortConfigInfo ffffc883f4bc22d0
HwInit ffffc883f4ba83b0 HwDeviceExt ffffc883f2c6b010 (10024 bytes)
SrbExt 4560 bytes LUExt 0 bytes

Normal Logical Units:
Product SCSI ID Object Extension Pnd Out Ct State
----------------------------------------------------------------------------------------
NVMe Samsung SS 0 0 0 ffffc883f4b9e060 ffffc883f4b9e1b0 35 0 0 Working

Zombie Logical Units:
Product SCSI ID Object Extension Pnd Out Ct State
--------------------------------------------------------------------------------------

0: kd> !storunit ffffc883f4b9e060
DO ffffc883f4b9e060 Ext ffffc883f4b9e1b0 Adapter ffffc883f4bc21a0 Working
Vendor: NVMe Product: Samsung SSD 950 SCSI ID: (0, 0, 0)
Claimed Enumerated
SlowLock Free RemLock 38 PageCount 2
QueueTagList: ffffc883f4b9e2b0 Outstanding: Head 0000000000000000 Tail 0000000000000000 Timeout 0 (Ticking Down)
DeviceQueue ffffc883f4b9e340 Depth: 254 Status: Not Frozen PauseCount: 1 BusyCount: 0
IO Gateway: Busy Count 0 Pause Count 0
Requests: Outstanding 0 Device 35 ByPass 0

[Device-Queued Requests]

IRP SRB Type SRB XRB Command MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ffffc88400fd6bb0 [STORAGE] ffffc88400fd6db0 n/a SCSI/WRITE (10) ffffda80a7131ea0 n/a 65s
ffffc883ff39a010 [STORAGE] ffffc88401285a10 n/a SCSI/WRITE (10) ffffda80a0b27750 n/a 65s
ffffc8840106a4f0 [STORAGE] ffffc8840106a6f0 n/a SCSI/READ (10) ffffc883f46b83a0 n/a 65s
ffffc88400fdaa20 [STORAGE] ffffc883fecde200 n/a SCSI/WRITE (10) ffffc883f492a200 n/a 65s
ffffc884013dfd40 [STORAGE] ffffc884013dff40 n/a SCSI/WRITE (10) ffffc8840172b010 n/a 65s
ffffc883f317ae50 [STORAGE] ffffc883ff400bb0 n/a SCSI/WRITE (10) ffffc88401716100 n/a 65s
ffffc883fbe95460 [STORAGE] ffffc883fd84b7b0 n/a SCSI/WRITE (10) ffffc8840174e930 n/a 65s
ffffc884012637b0 [STORAGE] ffffc883ff0b4970 n/a SCSI/WRITE (10) ffffda80a16be180 n/a 65s
ffffc883fec4e450 [STORAGE] ffffc883ff725d10 n/a SCSI/WRITE (10) ffffc884013daa40 n/a 65s
ffffc883ff28e480 [STORAGE] ffffc883fe157960 n/a SCSI/READ (10) ffffc883f4404550 n/a 65s
ffffc88400ecfd40 [STORAGE] ffffc88400ecff40 n/a SCSI/WRITE (10) ffffda809fd21180 n/a 65s
ffffc883fd363010 [STORAGE] ffffc883fe02c1c0 n/a SCSI/WRITE (10) ffffda80a0f9b180 n/a 65s
ffffc88400f0c100 [STORAGE] ffffc88401344a00 n/a SCSI/READ (10) ffffc883f3189b10 n/a 65s
ffffc88401371ea0 [STORAGE] ffffc88401343a00 n/a SCSI/WRITE (10) ffffc883ff6d4510 n/a 65s
ffffc88400f99790 [STORAGE] ffffc883ffb6a7e0 n/a SCSI/READ (10) ffffc883ff1da8c0 n/a 65s
ffffc884013875e0 [STORAGE] ffffc884012171e0 n/a SCSI/WRITE (10) ffffc883fe2df640 n/a 65s
ffffc883fe5ee730 [STORAGE] ffffc883ff319c50 n/a SCSI/READ (10) ffffc883f4160d20 n/a 65s
ffffc883fb218e50 [STORAGE] ffffc883f45abd40 n/a SCSI/WRITE (10) ffffc883f3d6fb00 n/a 65s
ffffc8840159b6f0 [STORAGE] ffffc883f3cbdcd0 n/a SCSI/WRITE (10) ffffc883f42c6200 n/a 65s
ffffc883fed71a30 [STORAGE] ffffc883ff3737a0 n/a SCSI/WRITE (10) ffffc883f3907620 n/a 65s
ffffc88401553780 [STORAGE] ffffc883fed97d80 n/a SCSI/READ (10) ffffc883ff1355d0 n/a 65s
ffffc88401a11c00 [STORAGE] ffffc883fd5d5320 n/a SCSI/READ (10) ffffc883fdbdcdb0 n/a 65s
ffffc883f424bea0 [STORAGE] ffffc883ff5d23d0 n/a SCSI/WRITE (10) ffffc883f3c538a0 n/a 65s
ffffc883f447e870 [STORAGE] ffffc883ff458c10 n/a SCSI/READ (10) ffffc883ff712110 n/a 65s
ffffc884019d4d80 [STORAGE] ffffc88401791860 n/a SCSI/READ (10) ffffc883ff307d90 n/a 65s
ffffc883fb8d4b70 [STORAGE] ffffc883f3b501c0 n/a SCSI/WRITE (10) ffffc883fd30e7e0 n/a 65s
ffffc883ff196b10 [STORAGE] ffffc883f45ea010 n/a SCSI/READ (10) ffffc883ff00fde0 n/a 65s
ffffc883f436b2f0 [STORAGE] ffffc883fdcd7b30 n/a SCSI/READ (10) ffffc883fefd4f50 n/a 65s
ffffc884019baa70 [STORAGE] ffffc883f3ad4230 n/a SCSI/WRITE (10) ffffc88401233390 n/a 65s
ffffc88401737010 [STORAGE] ffffc883fe38c1b0 n/a SCSI/WRITE (10) ffffda80a027f270 n/a 65s
ffffc883fb8ef770 [STORAGE] ffffc883f4921430 n/a SCSI/WRITE (10) ffffda80a6ecd270 n/a 65s
ffffc88400f86460 [STORAGE] ffffc883fdd20010 n/a SCSI/READ (10) ffffc883fb2257d0 n/a 65s
ffffc883f38cf630 [STORAGE] ffffc883fde17310 n/a SCSI/WRITE (10) ffffc883fae5ab30 n/a 65s
ffffc883f42d1570 [STORAGE] ffffc883fdee4140 n/a SCSI/WRITE (10) ffffc883f43cbf40 n/a 65s
ffffc883fdbc12b0 [STORAGE] ffffc883fe244ae0 n/a SCSI/WRITE (10) ffffc883f3daa900 n/a 65s

[Bypass-Queued Requests]

IRP SRB Type SRB XRB Command MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------

[Outstanding Requests]

IRP SRB Type SRB XRB Command MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ffffc8840137bee0 [STORAGE] ffffc88401b863b0 ffffc883f57ab010 RESET LUN 0000000000000000 0000000000000000 30s

[Completed Requests]

IRP SRB Type SRB XRB Command MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ERROR: 1 counted requests > 0 outstanding requests

0: kd> !storsrb ffffc88401b863b0
SRB is a STORAGE request block (SRB_EX)
SRB EX 0xffffc88401b863b0 Function 28 Version 1, Signature 53524258, SrbStatus: 0x00[Pending], SrbFunction 0x20 [RESET LUN]
Address Type is BTL8

No SrbExData

4: kd> !storsrb ffffc88401285a10
SRB is a STORAGE request block (SRB_EX)
SRB EX 0xffffc88401285a10 Function 28 Version 1, Signature 53524258, SrbStatus: 0x00[Pending], SrbFunction 0x00 [EXECUTE SCSI]
Address Type is BTL8

SRB_EX Data Type [SrbExDataTypeScsiCdb16]
[EXECUTE SCSI] SRB_EX: 0xffffc88401285aa0 OriginalRequest: 0xffffc883ff39a010 DataBuffer/Length: 0x0000000000000000 / 0x00001000
PTL: (0, 0, 0) CDB: 2A 00 05 E3 FF E0 00 00 08 00 00 00 00 00 00 00 OpCode: SCSI/WRITE (10)

4: kd> !storsrb ffffc8840106a6f0
SRB is a STORAGE request block (SRB_EX)
SRB EX 0xffffc8840106a6f0 Function 28 Version 1, Signature 53524258, SrbStatus: 0x00[Pending], SrbFunction 0x00 [EXECUTE SCSI]
Address Type is BTL8

SRB_EX Data Type [SrbExDataTypeScsiCdb16]
[EXECUTE SCSI] SRB_EX: 0xffffc8840106a780 OriginalRequest: 0xffffc8840106a4f0 DataBuffer/Length: 0xffffc88401a4cdc0 / 0x00000200
PTL: (0, 0, 0) CDB: 28 00 1E 5F 15 DF 00 00 01 00 00 00 00 00 00 00 OpCode: SCSI/READ (10)

4: kd> !storsrb ffffc883fe157960
SRB is a STORAGE request block (SRB_EX)
SRB EX 0xffffc883fe157960 Function 28 Version 1, Signature 53524258, SrbStatus: 0x00[Pending], SrbFunction 0x00 [EXECUTE SCSI]
Address Type is BTL8

SRB_EX Data Type [SrbExDataTypeScsiCdb16]
[EXECUTE SCSI] SRB_EX: 0xffffc883fe1579f0 OriginalRequest: 0xffffc883ff28e480 DataBuffer/Length: 0x0000000000000000 / 0x00008000
PTL: (0, 0, 0) CDB: 28 00 02 48 1F D8 00 00 40 00 00 00 00 00 00 00 OpCode: SCSI/READ (10)
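
For reference, the CDB bytes printed by !storsrb can be decoded by hand. The sketch below assumes the standard 10-byte READ(10)/WRITE(10) layout (opcode in byte 0, big-endian LBA in bytes 2 to 5, big-endian transfer length in bytes 7 to 8) and a 512-byte logical block size; the block size is an assumption, and decode_rw10 is a made-up helper name.

```python
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_rw10(cdb_hex, block_size=512):
    """Decode a READ(10)/WRITE(10) CDB given as a hex string such as '28 00 ...'."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is ignored
    if len(cdb) < 10:
        raise ValueError("need at least 10 CDB bytes")
    opcode = cdb[0]
    if opcode not in OPCODES:
        raise ValueError("not a READ(10)/WRITE(10) opcode: 0x%02x" % opcode)
    lba = int.from_bytes(cdb[2:6], "big")     # starting logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # transfer length in blocks
    return {
        "op": OPCODES[opcode],
        "lba": lba,
        "blocks": blocks,
        "bytes": blocks * block_size,  # relies on the assumed block size
    }


if __name__ == "__main__":
    for cdb in (
        "2A 00 05 E3 FF E0 00 00 08 00 00 00 00 00 00 00",
        "28 00 1E 5F 15 DF 00 00 01 00 00 00 00 00 00 00",
        "28 00 02 48 1F D8 00 00 40 00 00 00 00 00 00 00",
    ):
        print(decode_rw10(cdb))
```

With a 512-byte block size, the decoded byte counts come out to 0x1000, 0x200 and 0x8000, matching the Length values shown in the three !storsrb outputs above.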

4: kd> dt storport!_EXTENDED_REQUEST_BLOCK 0xffffc883fe1579f0
+0x000 Signature : 0x40
+0x008 Pool : 0x00000000000a1200 _NPAGED_LOOKASIDE_LIST
+0x010 OwnedMdl : 0y0
+0x010 RemoveFromEventQueue : 0y0
+0x010 State : 0y010
+0x010 RemappedSenseInfo : 0y1
+0x010 CompatSrbInUse : 0y0
+0x010 SrbActivateComponent : 0y1
+0x011 DoExtraAdapterDereference : 0y0
+0x011 DoExtraUnitDereference : 0y1
+0x011 AbortInProgress : 0y0
+0x011 ByPassPausedGateway : 0y0
+0x011 Reserved : 0y1110
+0x012 InitiatingProcessor : _PROCESSOR_NUMBER
+0x018 InitiatingToken : 0x0000d81f48020028 _STARTIO_TOKEN
+0x020 CompletedLink : _SLIST_ENTRY
+0x030 PendingLink : _STOR_EVENT_QUEUE_ENTRY
+0x068 Mdl : 0x0000000000001000 _MDL
+0x070 SgList : 0x0000000000132053 _SCATTER_GATHER_LIST
+0x078 RemappedSgListMdl : (null)
+0x080 RemappedSgList : 0x0000377c08fc1648 _SCATTER_GATHER_LIST
+0x088 DataInMdl : 0x0000000100000001 _MDL
+0x090 DoubleBufferedMdl : 0x0000000000000001 _MDL
+0x098 DataInSgList : 0x000000000000003c _SCATTER_GATHER_LIST
+0x0a0 Irp : 0x0000000100000000 _IRP
+0x0a8 Srb : (null)
+0x0b0 SrbData : <unnamed-tag>
+0x0d8 Adapter : 0xffffc883f703ebe0 _RAID_ADAPTER_EXTENSION
+0x0e0 Unit : 0x0000000100000001 _RAID_UNIT_EXTENSION
+0x0e8 ScatterGatherBuffer : [424] ""
+0x290 CompletionRoutine : 0xffffc883f4782c80 void +ffffc883f4782c80
+0x298 u :
+0x2b0 RequestWaitDuration : 0xc
+0x2b8 RequestStartTimeStamp : _LARGE_INTEGER 0x8000000
+0x2c0 RequestAfterBuildIoTimeStamp : _LARGE_INTEGER 0xffffc883f3e99540<br> +0x2c8 RequestAfterStartIoTimeStamp : _LARGE_INTEGER 0xffffc883f42a9080
+0x2d0 RequestMiniportDuration : 0x1c
+0x2d8 ActivityId : _GUID {00000019-0000-0000-0000-000000000000}
+0x2e8 CompatSrbBufferSize : 0
+0x2ec Component : 0
+0x2f0 OriginalSrb : (null)
+0x2f8 CompatSrbBuffer : (null)
+0x300 ParentIrp : (null)
+0x308 AbortStatus : 0n0
+0x310 CryptoKeyInfo : (null)

4: kd> dt storport!_EXTENDED_REQUEST_BLOCK 0xffffc88401285aa0
+0x000 Signature : 0x40
+0x008 Pool : 0x00000000000a1200 _NPAGED_LOOKASIDE_LIST
+0x010 OwnedMdl : 0y0
+0x010 RemoveFromEventQueue : 0y0
+0x010 State : 0y010
+0x010 RemappedSenseInfo : 0y1
+0x010 CompatSrbInUse : 0y1
+0x010 SrbActivateComponent : 0y0
+0x011 DoExtraAdapterDereference : 0y1
+0x011 DoExtraUnitDereference : 0y0
+0x011 AbortInProgress : 0y0
+0x011 ByPassPausedGateway : 0y1
+0x011 Reserved : 0y1000
+0x012 InitiatingProcessor : _PROCESSOR_NUMBER
+0x018 InitiatingToken : 0x0000e0ffe305002a _STARTIO_TOKEN
+0x020 CompletedLink : _SLIST_ENTRY
+0x030 PendingLink : _STOR_EVENT_QUEUE_ENTRY
+0x068 Mdl : 0x0000003f0000003f _MDL
+0x070 SgList : 0x000001c100000000 _SCATTER_GATHER_LIST
+0x078 RemappedSgListMdl : (null)
+0x080 RemappedSgList : (null)
+0x088 DataInMdl : 0xffffc884012adc18 _MDL
+0x090 DoubleBufferedMdl : 0xffffc88401285b30 _MDL
+0x098 DataInSgList : 0xffffc88401285b30 _SCATTER_GATHER_LIST
+0x0a0 Irp : (null)
+0x0a8 Srb : (null)
+0x0b0 SrbData : <unnamed-tag>
+0x0d8 Adapter : (null)
+0x0e0 Unit : 0x0000000000000001 _RAID_UNIT_EXTENSION
+0x0e8 ScatterGatherBuffer : [424] ""
+0x290 CompletionRoutine : (null)
+0x298 u :
+0x2b0 RequestWaitDuration : 0
+0x2b8 RequestStartTimeStamp : _LARGE_INTEGER 0x0
+0x2c0 RequestAfterBuildIoTimeStamp : _LARGE_INTEGER 0x0
+0x2c8 RequestAfterStartIoTimeStamp : _LARGE_INTEGER 0x0
+0x2d0 RequestMiniportDuration : 0
+0x2d8 ActivityId : _GUID {00000000-0000-0000-0000-000000000000}
+0x2e8 CompatSrbBufferSize : 0
+0x2ec Component : 0
+0x2f0 OriginalSrb : (null)
+0x2f8 CompatSrbBuffer : (null)
+0x300 ParentIrp : (null)
+0x308 AbortStatus : 0n19422656
+0x310 CryptoKeyInfo : 0x000000000badca11 _STOR_CRYPTO_KEY_INFO

4: kd> dx -r1 (*((storport!_MDL *)0x3f0000003f))
(*((storport!_MDL *)0x3f0000003f)) [Type: _MDL]
[+0x000] Next : Unable to read memory at Address 0x3f0000003f
[+0x008] Size : Unable to read memory at Address 0x3f00000047
[+0x00a] MdlFlags : Unable to read memory at Address 0x3f00000049
[+0x010] Process : Unable to read memory at Address 0x3f0000004f
[+0x018] MappedSystemVa : Unable to read memory at Address 0x3f00000057
[+0x020] StartVa : Unable to read memory at Address 0x3f0000005f
[+0x028] ByteCount : Unable to read memory at Address 0x3f00000067
[+0x02c] ByteOffset : Unable to read memory at Address 0x3f0000006b

Nothing suspicious about the LUN reset XRB (except for the fact that it never finishes?), whatever went wrong happened before this:

0: kd> dt storport!_EXTENDED_REQUEST_BLOCK 0xffffc883f57ab010
+0x000 Signature : 0x1f2e3d4c
+0x008 Pool : (null)
+0x010 OwnedMdl : 0y0
+0x010 RemoveFromEventQueue : 0y1
+0x010 State : 0y011
+0x010 RemappedSenseInfo : 0y0
+0x010 CompatSrbInUse : 0y0
+0x010 SrbActivateComponent : 0y0
+0x011 DoExtraAdapterDereference : 0y0
+0x011 DoExtraUnitDereference : 0y0
+0x011 AbortInProgress : 0y0
+0x011 ByPassPausedGateway : 0y0
+0x011 Reserved : 0y0000
+0x012 InitiatingProcessor : _PROCESSOR_NUMBER
+0x018 InitiatingToken : (null)
+0x020 CompletedLink : _SLIST_ENTRY
+0x030 PendingLink : _STOR_EVENT_QUEUE_ENTRY
+0x068 Mdl : (null)
+0x070 SgList : (null)
+0x078 RemappedSgListMdl : (null)
+0x080 RemappedSgList : (null)
+0x088 DataInMdl : (null)
+0x090 DoubleBufferedMdl : (null)
+0x098 DataInSgList : (null)
+0x0a0 Irp : 0xffffc8840137bee0 _IRP
+0x0a8 Srb : 0xffffc88401b863b0 _SCSI_REQUEST_BLOCK
+0x0b0 SrbData : <unnamed-tag>
+0x0d8 Adapter : 0xffffc883f4bc21a0 _RAID_ADAPTER_EXTENSION
+0x0e0 Unit : 0xffffc883f4b9e1b0 _RAID_UNIT_EXTENSION
+0x0e8 ScatterGatherBuffer : [424] ""
+0x290 CompletionRoutine : 0xfffff80acd88ca80 void storport!RaidUnitCompleteResetRequest+0
+0x298 u :
+0x2b0 RequestWaitDuration : 0
+0x2b8 RequestStartTimeStamp : _LARGE_INTEGER 0x000000050c00caf5
+0x2c0 RequestAfterBuildIoTimeStamp : _LARGE_INTEGER 0x0
+0x2c8 RequestAfterStartIoTimeStamp : _LARGE_INTEGER 0x0
+0x2d0 RequestMiniportDuration : 0
+0x2d8 ActivityId : _GUID {00000000-0000-0000-0000-000000000000}
+0x2e8 CompatSrbBufferSize : 0x90
+0x2ec Component : 0
+0x2f0 OriginalSrb : (null)
+0x2f8 CompatSrbBuffer : 0xffffc883f57ac600 Void
+0x300 ParentIrp : (null)
+0x308 AbortStatus : 0n0
+0x310 CryptoKeyInfo : (null)

I can’t read any of the IRPs due to the lack of symbols. Is it normal for some of the SCSI/{READ,WRITE} requests to have DataBuffer/Length be 0x00 / 0xSomeValue? The pool for both reads and writes with DataBuffer zero was _NPAGED_LOOKASIDE_LIST. Or is/was that a null dereference?

I should add - so I wasn’t able to run !storclass because of the missing symbols; I know I can see a list of failed requests there. Should !storunit have shown me the same?

Running the target and then breaking in minutes later shows the same pending requests (despite the timeout having long since been exceeded), and their SRB status is all PENDING (none are FAILED). Digging in reveals, as expected, the sense data to be all zeros, since the requests never failed as far as the kernel/device/driver/something is concerned.

If a request times out, StorPort will try to reset the unit and make it
abort all in-progress I/O requests. If the reset doesn’t complete then
that’s bad business: usually this means that the hardware has ceased
responding (especially if you’re seeing this with two different drivers for
the same device).

Random guess, but do you have the disk configured to power down when idle
(!popolicy will tell you, but that requires symbols)? I’d try shutting that
off and see if that helps at all.

-scott
OSR
@OSRDrivers

wrote in message news:xxxxx@ntdev…

With my debug USB cable in hand, I was *finally* able to get some real data
out of this. Sorry for wasting everyone’s time with the pathetic attempts at
local debugging earlier in this thread.

Unfortunately, I don’t have the symbols for this (no symbols for 15055 as
of yet, I’ve contacted WinDbgFb and we’ll see if we hear back. I may have to
debug this on a clean install of a different build that triggers the issue
yet has the debug symbols available… *sigh*).

Anyway, this is what I get:

0: kd> !wmitrace.logdump EventLog-System

| |
| NT symbols are not available |
| reduced functionality |

(WmiTrace) LogDump for Logger Id 0x0c
Found Buffers: 2 Messages: 2, sorting entries
[0]0000.0000:: 131339913007908456 [Microsoft-Windows-StorPort/Bus reset
/OpCodeBusReset]Bus reset occured on storport adapter (Port Number: 1)
[0]0000.0000:: 131339913307864363 [Microsoft-Windows-StorPort/None /Info]A
request timed out for Storport Device (Port = 1, Path = 0, Target = 0, Lun =
0).
Corresponding Class Disk Device Guid is
{0d61954a-3e28-2ff4-3116-b0b7bdd7c44b}.
Total of 2 Messages from 2 Buffers

0: kd> !storclass


0: kd> !storadapter
STORPORT adapters:
==================
Driver Object Extension State
-----------------------------------------------------------------
\Driver\secnvme ffffc883f4bc2050 ffffc883f4bc21a0 Working
\Driver\storahci ffffc883f4bb5050 ffffc883f4bb51a0 Working

0: kd> !storadapter ffffc883f4bc2050
ADAPTER
DeviceObj : ffffc883f4bc2050 AdapterExt: ffffc883f4bc21a0 DriverObj :
ffffc883f4ba58d0
DeviceState : Working
LowerDO ffffc883f4b59620 PhysicalDO ffffc883f4b59840
SlowLock Free RemLock -666
SystemPowerState: Working AdapterPowerState D0 Full Duplex
Bus 4 Slot 0 DMA ffffc883f4bd0dd0 Interrupt 0000000000000000
Allocated ResourceList ffffc883f4bc6960
Translated ResourceList ffffc883f4ba78f0
Gateway: Outstanding 0 Lower 1024 High 1024
PortConfigInfo ffffc883f4bc22d0
HwInit ffffc883f4ba83b0 HwDeviceExt ffffc883f2c6b010 (10024 bytes)
SrbExt 4560 bytes LUExt 0 bytes

Normal Logical Units:
Product SCSI ID Object Extension Pnd
Out Ct State
----------------------------------------------------------------------------------------
NVMe Samsung SS 0 0 0 ffffc883f4b9e060 ffffc883f4b9e1b0 35
0 0 Working

Zombie Logical Units:
Product SCSI ID Object Extension Pnd
Out Ct State
--------------------------------------------------------------------------------------

0: kd> !storunit ffffc883f4b9e060
DO ffffc883f4b9e060 Ext ffffc883f4b9e1b0 Adapter ffffc883f4bc21a0
Working
Vendor: NVMe Product: Samsung SSD 950 SCSI ID: (0, 0, 0)
Claimed Enumerated
SlowLock Free RemLock 38 PageCount 2
QueueTagList: ffffc883f4b9e2b0 Outstanding: Head 0000000000000000
Tail 0000000000000000 Timeout 0 (Ticking Down)
DeviceQueue ffffc883f4b9e340 Depth: 254 Status: Not Frozen
PauseCount: 1 BusyCount: 0
IO Gateway: Busy Count 0 Pause Count 0
Requests: Outstanding 0 Device 35 ByPass 0

[Device-Queued Requests]

IRP SRB Type SRB XRB Command
MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ffffc88400fd6bb0 [STORAGE] ffffc88400fd6db0 n/a SCSI/WRITE (10)
ffffda80a7131ea0 n/a 65s
ffffc883ff39a010 [STORAGE] ffffc88401285a10 n/a SCSI/WRITE (10)
ffffda80a0b27750 n/a 65s
ffffc8840106a4f0 [STORAGE] ffffc8840106a6f0 n/a SCSI/READ (10)
ffffc883f46b83a0 n/a 65s
ffffc88400fdaa20 [STORAGE] ffffc883fecde200 n/a SCSI/WRITE (10)
ffffc883f492a200 n/a 65s
ffffc884013dfd40 [STORAGE] ffffc884013dff40 n/a SCSI/WRITE (10)
ffffc8840172b010 n/a 65s
ffffc883f317ae50 [STORAGE] ffffc883ff400bb0 n/a SCSI/WRITE (10)
ffffc88401716100 n/a 65s
ffffc883fbe95460 [STORAGE] ffffc883fd84b7b0 n/a SCSI/WRITE (10)
ffffc8840174e930 n/a 65s
ffffc884012637b0 [STORAGE] ffffc883ff0b4970 n/a SCSI/WRITE (10)
ffffda80a16be180 n/a 65s
ffffc883fec4e450 [STORAGE] ffffc883ff725d10 n/a SCSI/WRITE (10)
ffffc884013daa40 n/a 65s
ffffc883ff28e480 [STORAGE] ffffc883fe157960 n/a SCSI/READ (10)
ffffc883f4404550 n/a 65s
ffffc88400ecfd40 [STORAGE] ffffc88400ecff40 n/a SCSI/WRITE (10)
ffffda809fd21180 n/a 65s
ffffc883fd363010 [STORAGE] ffffc883fe02c1c0 n/a SCSI/WRITE (10)
ffffda80a0f9b180 n/a 65s
ffffc88400f0c100 [STORAGE] ffffc88401344a00 n/a SCSI/READ (10)
ffffc883f3189b10 n/a 65s
ffffc88401371ea0 [STORAGE] ffffc88401343a00 n/a SCSI/WRITE (10)
ffffc883ff6d4510 n/a 65s
ffffc88400f99790 [STORAGE] ffffc883ffb6a7e0 n/a SCSI/READ (10)
ffffc883ff1da8c0 n/a 65s
ffffc884013875e0 [STORAGE] ffffc884012171e0 n/a SCSI/WRITE (10)
ffffc883fe2df640 n/a 65s
ffffc883fe5ee730 [STORAGE] ffffc883ff319c50 n/a SCSI/READ (10)
ffffc883f4160d20 n/a 65s
ffffc883fb218e50 [STORAGE] ffffc883f45abd40 n/a SCSI/WRITE (10)
ffffc883f3d6fb00 n/a 65s
ffffc8840159b6f0 [STORAGE] ffffc883f3cbdcd0 n/a SCSI/WRITE (10)
ffffc883f42c6200 n/a 65s
ffffc883fed71a30 [STORAGE] ffffc883ff3737a0 n/a SCSI/WRITE (10)
ffffc883f3907620 n/a 65s
ffffc88401553780 [STORAGE] ffffc883fed97d80 n/a SCSI/READ (10)
ffffc883ff1355d0 n/a 65s
ffffc88401a11c00 [STORAGE] ffffc883fd5d5320 n/a SCSI/READ (10)
ffffc883fdbdcdb0 n/a 65s
ffffc883f424bea0 [STORAGE] ffffc883ff5d23d0 n/a SCSI/WRITE (10)
ffffc883f3c538a0 n/a 65s
ffffc883f447e870 [STORAGE] ffffc883ff458c10 n/a SCSI/READ (10)
ffffc883ff712110 n/a 65s
ffffc884019d4d80 [STORAGE] ffffc88401791860 n/a SCSI/READ (10)
ffffc883ff307d90 n/a 65s
ffffc883fb8d4b70 [STORAGE] ffffc883f3b501c0 n/a SCSI/WRITE (10)
ffffc883fd30e7e0 n/a 65s
ffffc883ff196b10 [STORAGE] ffffc883f45ea010 n/a SCSI/READ (10)
ffffc883ff00fde0 n/a 65s
ffffc883f436b2f0 [STORAGE] ffffc883fdcd7b30 n/a SCSI/READ (10)
ffffc883fefd4f50 n/a 65s
ffffc884019baa70 [STORAGE] ffffc883f3ad4230 n/a SCSI/WRITE (10)
ffffc88401233390 n/a 65s
ffffc88401737010 [STORAGE] ffffc883fe38c1b0 n/a SCSI/WRITE (10)
ffffda80a027f270 n/a 65s
ffffc883fb8ef770 [STORAGE] ffffc883f4921430 n/a SCSI/WRITE (10)
ffffda80a6ecd270 n/a 65s
ffffc88400f86460 [STORAGE] ffffc883fdd20010 n/a SCSI/READ (10)
ffffc883fb2257d0 n/a 65s
ffffc883f38cf630 [STORAGE] ffffc883fde17310 n/a SCSI/WRITE (10)
ffffc883fae5ab30 n/a 65s
ffffc883f42d1570 [STORAGE] ffffc883fdee4140 n/a SCSI/WRITE (10)
ffffc883f43cbf40 n/a 65s
ffffc883fdbc12b0 [STORAGE] ffffc883fe244ae0 n/a SCSI/WRITE (10)
ffffc883f3daa900 n/a 65s

[Bypass-Queued Requests]

IRP SRB Type SRB XRB Command
MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------

[Outstanding Requests]

IRP SRB Type SRB XRB Command
MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ffffc8840137bee0 [STORAGE] ffffc88401b863b0 ffffc883f57ab010 RESET LUN
0000000000000000 0000000000000000 30s

[Completed Requests]

IRP SRB Type SRB XRB Command
MDL SGList Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ERROR: 1 counted requests > 0 outstanding requests

0: kd> !storsrb ffffc88401b863b0
SRB is a STORAGE request block (SRB_EX)
SRB EX 0xffffc88401b863b0 Function 28 Version 1, Signature 53524258,
SrbStatus: 0x00[Pending], SrbFunction 0x20 [RESET LUN]
Address Type is BTL8

No SrbExData

4: kd> !storsrb ffffc88401285a10
SRB is a STORAGE request block (SRB_EX)
SRB EX 0xffffc88401285a10 Function 28 Version 1, Signature 53524258,
SrbStatus: 0x00[Pending], SrbFunction 0x00 [EXECUTE SCSI]
Address Type is BTL8

SRB_EX Data Type [SrbExDataTypeScsiCdb16]
[EXECUTE SCSI] SRB_EX: 0xffffc88401285aa0 OriginalRequest:
0xffffc883ff39a010 DataBuffer/Length: 0x0000000000000000 / 0x00001000
PTL: (0, 0, 0) CDB: 2A 00 05 E3 FF E0 00 00 08 00 00 00 00 00 00 00
OpCode: SCSI/WRITE (10)

4: kd> !storsrb ffffc8840106a6f0
SRB is a STORAGE request block (SRB_EX)
SRB EX 0xffffc8840106a6f0 Function 28 Version 1, Signature 53524258,
SrbStatus: 0x00[Pending], SrbFunction 0x00 [EXECUTE SCSI]
Address Type is BTL8

SRB_EX Data Type [SrbExDataTypeScsiCdb16]
[EXECUTE SCSI] SRB_EX: 0xffffc8840106a780 OriginalRequest:
0xffffc8840106a4f0 DataBuffer/Length: 0xffffc88401a4cdc0 / 0x00000200
PTL: (0, 0, 0) CDB: 28 00 1E 5F 15 DF 00 00 01 00 00 00 00 00 00 00
OpCode: SCSI/READ (10)

4: kd> !storsrb ffffc883fe157960
SRB is a STORAGE request block (SRB_EX)
SRB EX 0xffffc883fe157960 Function 28 Version 1, Signature 53524258,
SrbStatus: 0x00[Pending], SrbFunction 0x00 [EXECUTE SCSI]
Address Type is BTL8

SRB_EX Data Type [SrbExDataTypeScsiCdb16]
[EXECUTE SCSI] SRB_EX: 0xffffc883fe1579f0 OriginalRequest:
0xffffc883ff28e480 DataBuffer/Length: 0x0000000000000000 / 0x00008000
PTL: (0, 0, 0) CDB: 28 00 02 48 1F D8 00 00 40 00 00 00 00 00 00 00
OpCode: SCSI/READ (10)

4: kd> dt storport!_EXTENDED_REQUEST_BLOCK 0xffffc883fe1579f0
+0x000 Signature : 0x40
+0x008 Pool : 0x00000000000a1200 _NPAGED_LOOKASIDE_LIST
+0x010 OwnedMdl : 0y0
+0x010 RemoveFromEventQueue : 0y0
+0x010 State : 0y010
+0x010 RemappedSenseInfo : 0y1
+0x010 CompatSrbInUse : 0y0
+0x010 SrbActivateComponent : 0y1
+0x011 DoExtraAdapterDereference : 0y0
+0x011 DoExtraUnitDereference : 0y1
+0x011 AbortInProgress : 0y0
+0x011 ByPassPausedGateway : 0y0
+0x011 Reserved : 0y1110
+0x012 InitiatingProcessor : _PROCESSOR_NUMBER
+0x018 InitiatingToken : 0x0000d81f48020028 _STARTIO_TOKEN
+0x020 CompletedLink : _SLIST_ENTRY
+0x030 PendingLink : _STOR_EVENT_QUEUE_ENTRY
+0x068 Mdl : 0x0000000000001000 _MDL
+0x070 SgList : 0x0000000000132053 _SCATTER_GATHER_LIST
+0x078 RemappedSgListMdl : (null)
+0x080 RemappedSgList : 0x0000377c08fc1648 _SCATTER_GATHER_LIST
+0x088 DataInMdl : 0x0000000100000001 _MDL
+0x090 DoubleBufferedMdl : 0x0000000000000001 _MDL
+0x098 DataInSgList : 0x000000000000003c _SCATTER_GATHER_LIST
+0x0a0 Irp : 0x0000000100000000 _IRP
+0x0a8 Srb : (null)
+0x0b0 SrbData : <unnamed-tag>
+0x0d8 Adapter : 0xffffc883f703ebe0 _RAID_ADAPTER_EXTENSION
+0x0e0 Unit : 0x0000000100000001 _RAID_UNIT_EXTENSION
+0x0e8 ScatterGatherBuffer : [424] ""
+0x290 CompletionRoutine : 0xffffc883f4782c80 void +ffffc883f4782c80
+0x298 u :
+0x2b0 RequestWaitDuration : 0xc
+0x2b8 RequestStartTimeStamp : _LARGE_INTEGER 0x8000000
+0x2c0 RequestAfterBuildIoTimeStamp : _LARGE_INTEGER 0xffffc883f3e99540
+0x2c8 RequestAfterStartIoTimeStamp : _LARGE_INTEGER 0xffffc883f42a9080
+0x2d0 RequestMiniportDuration : 0x1c
+0x2d8 ActivityId : _GUID {00000019-0000-0000-0000-000000000000}
+0x2e8 CompatSrbBufferSize : 0
+0x2ec Component : 0
+0x2f0 OriginalSrb : (null)
+0x2f8 CompatSrbBuffer : (null)
+0x300 ParentIrp : (null)
+0x308 AbortStatus : 0n0
+0x310 CryptoKeyInfo : (null)

4: kd> dt storport!_EXTENDED_REQUEST_BLOCK 0xffffc88401285aa0
+0x000 Signature : 0x40
+0x008 Pool : 0x00000000000a1200 _NPAGED_LOOKASIDE_LIST
+0x010 OwnedMdl : 0y0
+0x010 RemoveFromEventQueue : 0y0
+0x010 State : 0y010
+0x010 RemappedSenseInfo : 0y1
+0x010 CompatSrbInUse : 0y1
+0x010 SrbActivateComponent : 0y0
+0x011 DoExtraAdapterDereference : 0y1
+0x011 DoExtraUnitDereference : 0y0
+0x011 AbortInProgress : 0y0
+0x011 ByPassPausedGateway : 0y1
+0x011 Reserved : 0y1000
+0x012 InitiatingProcessor : _PROCESSOR_NUMBER
+0x018 InitiatingToken : 0x0000e0ffe305002a _STARTIO_TOKEN
+0x020 CompletedLink : _SLIST_ENTRY
+0x030 PendingLink : _STOR_EVENT_QUEUE_ENTRY
+0x068 Mdl : 0x0000003f0000003f _MDL
+0x070 SgList : 0x000001c100000000 _SCATTER_GATHER_LIST
+0x078 RemappedSgListMdl : (null)
+0x080 RemappedSgList : (null)
+0x088 DataInMdl : 0xffffc884012adc18 _MDL
+0x090 DoubleBufferedMdl : 0xffffc88401285b30 _MDL
+0x098 DataInSgList : 0xffffc88401285b30 _SCATTER_GATHER_LIST
+0x0a0 Irp : (null)
+0x0a8 Srb : (null)
+0x0b0 SrbData : <unnamed-tag>
+0x0d8 Adapter : (null)
+0x0e0 Unit : 0x0000000000000001 _RAID_UNIT_EXTENSION
+0x0e8 ScatterGatherBuffer : [424] ""
+0x290 CompletionRoutine : (null)
+0x298 u :
+0x2b0 RequestWaitDuration : 0
+0x2b8 RequestStartTimeStamp : _LARGE_INTEGER 0x0
+0x2c0 RequestAfterBuildIoTimeStamp : _LARGE_INTEGER 0x0
+0x2c8 RequestAfterStartIoTimeStamp : _LARGE_INTEGER 0x0
+0x2d0 RequestMiniportDuration : 0
+0x2d8 ActivityId : _GUID {00000000-0000-0000-0000-000000000000}
+0x2e8 CompatSrbBufferSize : 0
+0x2ec Component : 0
+0x2f0 OriginalSrb : (null)
+0x2f8 CompatSrbBuffer : (null)
+0x300 ParentIrp : (null)
+0x308 AbortStatus : 0n19422656
+0x310 CryptoKeyInfo : 0x000000000badca11 _STOR_CRYPTO_KEY_INFO

4: kd> dx -r1 (*((storport!_MDL *)0x3f0000003f))
(*((storport!_MDL *)0x3f0000003f)) [Type: _MDL]
[+0x000] Next : Unable to read memory at Address 0x3f0000003f
[+0x008] Size : Unable to read memory at Address 0x3f00000047
[+0x00a] MdlFlags : Unable to read memory at Address 0x3f00000049
[+0x010] Process : Unable to read memory at Address 0x3f0000004f
[+0x018] MappedSystemVa : Unable to read memory at Address 0x3f00000057
[+0x020] StartVa : Unable to read memory at Address 0x3f0000005f
[+0x028] ByteCount : Unable to read memory at Address 0x3f00000067
[+0x02c] ByteOffset : Unable to read memory at Address 0x3f0000006b

Nothing suspicious about the LUN reset XRB (except for the fact that it never finishes?), whatever went wrong happened before this:

0: kd> dt storport!_EXTENDED_REQUEST_BLOCK 0xffffc883f57ab010
+0x000 Signature : 0x1f2e3d4c
+0x008 Pool : (null)
+0x010 OwnedMdl : 0y0
+0x010 RemoveFromEventQueue : 0y1
+0x010 State : 0y011
+0x010 RemappedSenseInfo : 0y0
+0x010 CompatSrbInUse : 0y0
+0x010 SrbActivateComponent : 0y0
+0x011 DoExtraAdapterDereference : 0y0
+0x011 DoExtraUnitDereference : 0y0
+0x011 AbortInProgress : 0y0
+0x011 ByPassPausedGateway : 0y0
+0x011 Reserved : 0y0000
+0x012 InitiatingProcessor : _PROCESSOR_NUMBER
+0x018 InitiatingToken : (null)
+0x020 CompletedLink : _SLIST_ENTRY
+0x030 PendingLink : _STOR_EVENT_QUEUE_ENTRY
+0x068 Mdl : (null)
+0x070 SgList : (null)
+0x078 RemappedSgListMdl : (null)
+0x080 RemappedSgList : (null)
+0x088 DataInMdl : (null)
+0x090 DoubleBufferedMdl : (null)
+0x098 DataInSgList : (null)
+0x0a0 Irp : 0xffffc8840137bee0 _IRP
+0x0a8 Srb : 0xffffc88401b863b0 _SCSI_REQUEST_BLOCK
+0x0b0 SrbData : <unnamed-tag>
+0x0d8 Adapter : 0xffffc883f4bc21a0 _RAID_ADAPTER_EXTENSION
+0x0e0 Unit : 0xffffc883f4b9e1b0 _RAID_UNIT_EXTENSION
+0x0e8 ScatterGatherBuffer : [424] ""
+0x290 CompletionRoutine : 0xfffff80acd88ca80 void storport!RaidUnitCompleteResetRequest+0
+0x298 u :
+0x2b0 RequestWaitDuration : 0
+0x2b8 RequestStartTimeStamp : _LARGE_INTEGER 0x000000050c00caf5
+0x2c0 RequestAfterBuildIoTimeStamp : _LARGE_INTEGER 0x0
+0x2c8 RequestAfterStartIoTimeStamp : _LARGE_INTEGER 0x0
+0x2d0 RequestMiniportDuration : 0
+0x2d8 ActivityId : _GUID {00000000-0000-0000-0000-000000000000}
+0x2e8 CompatSrbBufferSize : 0x90
+0x2ec Component : 0
+0x2f0 OriginalSrb : (null)
+0x2f8 CompatSrbBuffer : 0xffffc883f57ac600 Void
+0x300 ParentIrp : (null)
+0x308 AbortStatus : 0n0
+0x310 CryptoKeyInfo : (null)
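
One thing that may be worth checking in the dumps of 0xffffc883fe1579f0 and 0xffffc88401285aa0: the value dt shows at +0x018 (InitiatingToken) looks like the first eight CDB bytes of the same request read as a little-endian integer, and both dumps show Signature 0x40 where the reset XRB shows 0x1f2e3d4c. That would be consistent with those addresses (the ones !storsrb prints after "SRB_EX:") pointing at the SRB's extended data rather than at an XRB, in which case the odd Mdl/SgList/Irp values would simply be misinterpreted bytes. This is an inference from the output above, not something verified against storport internals. A minimal Python sketch of the byte comparison (the helper name is made up for illustration):

```python
# Sketch: compare the "+0x018 InitiatingToken" values from the two dt dumps
# with the CDB bytes that !storsrb printed for the same requests.

def le_bytes(value: int, width: int = 8) -> str:
    """Render an integer as the bytes it occupies in little-endian memory."""
    return " ".join(f"{b:02X}" for b in value.to_bytes(width, "little"))

# (value shown at +0x018 by dt, first eight CDB bytes shown by !storsrb)
PAIRS = [
    (0x0000D81F48020028, "28 00 02 48 1F D8 00 00"),  # READ (10) request
    (0x0000E0FFE305002A, "2A 00 05 E3 FF E0 00 00"),  # WRITE (10) request
]

for token, cdb_prefix in PAIRS:
    in_memory = le_bytes(token)
    print(f"{token:#018x} -> {in_memory} | same as CDB prefix: {in_memory == cdb_prefix}")
```

If the comparison holds, the XRB for each request would have to be located some other way before its Mdl and SgList fields can be trusted.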

I can’t read any of the IRPs due to the lack of symbols. Is it normal for
some of the SCSI/{READ,WRITE} requests to have DataBuffer/Length be 0x00 /
0xSomeValue? The pool for both reads and writes with DataBuffer zero was
_NPAGED_LOOKASIDE_LIST. Or is/was that a null dereference?
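
On the DataBuffer/Length question, one sanity check that needs no symbols is to decode the CDBs by hand and compare the transfer length with the Length that !storsrb reports. In a 10-byte READ/WRITE CDB, bytes 2-5 hold the big-endian LBA and bytes 7-8 the big-endian block count. The sketch below assumes 512-byte logical blocks (an assumption, the drive may report a different size); the function name is hypothetical.

```python
# Sketch: decode the READ(10)/WRITE(10) CDBs printed by !storsrb and check
# that block count * block size equals the reported DataBuffer length.
BLOCK_SIZE = 512  # assumption: 512-byte logical blocks

OPCODES = {0x28: "READ (10)", 0x2A: "WRITE (10)"}

def decode_cdb10(cdb_hex: str) -> dict:
    """Decode a 10-byte SCSI READ/WRITE CDB given as space-separated hex."""
    cdb = bytes.fromhex(cdb_hex)  # trailing zero padding is ignored below
    if cdb[0] not in OPCODES:
        raise ValueError(f"not a READ(10)/WRITE(10) opcode: {cdb[0]:#04x}")
    blocks = int.from_bytes(cdb[7:9], "big")
    return {
        "op": OPCODES[cdb[0]],
        "lba": int.from_bytes(cdb[2:6], "big"),
        "blocks": blocks,
        "bytes": blocks * BLOCK_SIZE,
    }

# (CDB as printed by !storsrb, Length as printed by !storsrb)
SAMPLES = [
    ("2A 00 05 E3 FF E0 00 00 08 00 00 00 00 00 00 00", 0x1000),
    ("28 00 1E 5F 15 DF 00 00 01 00 00 00 00 00 00 00", 0x0200),
    ("28 00 02 48 1F D8 00 00 40 00 00 00 00 00 00 00", 0x8000),
]

for cdb_hex, reported in SAMPLES:
    d = decode_cdb10(cdb_hex)
    print(f"{d['op']:<10} lba={d['lba']:#010x} blocks={d['blocks']:>3} "
          f"bytes={d['bytes']:#06x} matches reported length: {d['bytes'] == reported}")
```

Under that assumption all three lengths agree with their CDBs, including the two requests whose DataBuffer is zero, so the lengths at least look well-formed. It does not say whether a zero DataBuffer is expected here.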

Hi Scott,

Thanks for checking in! You can make that 3 different drivers - I tried the Intel NVMe drivers and was this close to calling it a success when it happened again. I reverted to 15048 and have symbols; I did a multi-hour WinDbg session yesterday and poked around looking for sense data, to no avail.

Automatic shutdown of the disk was enabled; I thought it might be that but I didn’t explore it further since it was occurring both at idle and peak load. Disabling it didn’t help. This is a laptop, fwiw. I thought dynamic PCI-E power management may have been a culprit, this being NVMe and all, but alas that too was a no-go.

It’s definitely not the drive: I yanked it again and stuck it in another machine for 72 hours of normal usage and had no issues. Put it back and this all started again. If it were a SATA device, we could blame the SATA controller, but being an NVMe device it’s really just a bus leading straight to the PCIe lines… It makes me suspect something like active state power management, but the Dell BIOS doesn’t expose any such settings.

Mini-rant: I absolutely hate buying “power user hardware” for this reason. Entry-level motherboards/devices get so many more eyeballs and so much more QA that BIOS issues get sorted out ASAP, whereas some of the so-called “enterprise” gear (like this Precision laptop) is much less thoroughly vetted.

I’ve already updated both the BIOS and the drive’s firmware, neither helped.

Curiouser and curiouser… except it’s getting to be more than I can bear.

A replacement device came in today. Different processor and gpu, but same motherboard and everything else. Same behavior. I am so exasperated with this thing…

Anyway, the first time it froze up on me with the new device was preceded by a period of intense UI lag (seconds for typed characters to appear on-screen in any application, missed keystrokes, etc) which prompted me to run a WPA session. I had 2 second delays in calls to storport.sys - but of course, no symbols so that was useless.

Back when I had symbols, the longest delays in storport were due to TRIM calls, but they were more on the order of 2ms rather than 2s.

At this point, it’s not a hardware issue and it’s not a 3rd party driver issue (a clean install exhibits the same behavior, and the Intel/Samsung/MSFT NVMe controller drivers all behave identically). Maybe I didn’t test long enough, but I never ran into this on W10 15011 and have experienced it since upgrading to 15048-15061. The same physical disk runs fine (running the same install of W10) in another machine for weeks on end.

I’ve ordered another model NVMe disk (but same manufacturer, perhaps a mistake on my end…) to see if that could be the issue. I’m not sure where to turn at this point. The constantly missing symbols are a real PITA, but I just keep upgrading from 15048 (which has published symbols) to the latest fast ring build each time a new build comes out in hopes of experiencing something different. I guess the storport delays above count as that (running 15061).

FWIW, I’m observing some mysterious hangups (slowdowns and timeouts) on an Intel NUC box running Windows 7, booted from an Intel NVMe stick, with the Microsoft NVMe driver.

Alex, your reply is worth more than you could imagine. It’s honestly great knowing (or at least, suspecting) I’m not alone in this.

Is this a Kaby Lake NUC?