Isolation filter related crash

It doesn’t happen every time, but I’ve been getting system crashes when my isolation filter is loaded and the system resumes from hibernate. My filter isn’t in the call stack, so I’m not sure how to debug the problem. The only thing I can think of is that I may be missing a FileObject which then makes its way down the stack, but the call stack doesn’t show that. Here’s the !analyze -v output from a Windows 8.1 system.

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: ffffe000c2286ee0, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000001, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: fffff80373b3db5a, address which referenced memory

Debugging Details:

DUMP_CLASS: 1

DUMP_QUALIFIER: 402

BUILD_VERSION_STRING: 9600.18505.amd64fre.winblue_ltsb.160930-0600

SYSTEM_MANUFACTURER: Gateway

SYSTEM_PRODUCT_NAME: NE56R

SYSTEM_SKU: NE56R_0649_V2.01

SYSTEM_VERSION: V2.01

BIOS_VENDOR: Gateway

BIOS_VERSION: V2.01

BIOS_DATE: 08/06/2012

BASEBOARD_MANUFACTURER: Gateway

BASEBOARD_PRODUCT: EG50_HC_HR

BASEBOARD_VERSION: Type2 - Board Version

DUMP_TYPE: 0

BUGCHECK_P1: ffffe000c2286ee0

BUGCHECK_P2: 2

BUGCHECK_P3: 1

BUGCHECK_P4: fffff80373b3db5a

WRITE_ADDRESS: ffffe000c2286ee0 Nonpaged pool

CURRENT_IRQL: 2

FAULTING_IP:
nt!MiClearFilePointer+62
fffff803`73b3db5a 48832000 and qword ptr [rax],0

CPU_COUNT: 2

CPU_MHZ: 704

CPU_VENDOR: GenuineIntel

CPU_FAMILY: 6

CPU_MODEL: 2a

CPU_STEPPING: 7

CPU_MICROCODE: 6,2a,7,0 (F,M,S,R) SIG: 28'00000000 (cache) 28'00000000 (init)

DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT

BUGCHECK_STR: AV

PROCESS_NAME: System

TRAP_FRAME: ffffd001e3d307e0 -- (.trap 0xffffd001e3d307e0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=ffffe000c2286ee0 rbx=0000000000000000 rcx=0000000080000000
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff80373b3db5a rsp=ffffd001e3d30970 rbp=0000000000000000
r8=0000000000000000 r9=0000000000000001 r10=0000007ffffffff8
r11=0000098000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl zr na po nc
nt!MiClearFilePointer+0x62:
fffff80373b3db5a 48832000 and qword ptr [rax],0 ds:ffffe000c2286ee0=????????????????
Resetting default scope

LAST_CONTROL_TRANSFER: from fffff80373bdbee9 to fffff80373bd03a0

STACK_TEXT:
ffffd001e3d30698 fffff80373bdbee9 : 000000000000000a ffffe000c2286ee0 0000000000000002 0000000000000001 : nt!KeBugCheckEx
ffffd001e3d306a0 fffff80373bda73a : 0000000000000001 0000000000000000 ffffe000c2701800 fffff80300000003 : nt!KiBugCheckDispatch+0x69
ffffd001e3d307e0 fffff80373b3db5a : 0000000000000001 0000000000000000 0000000000000001 fffff80373b1851c : nt!KiPageFault+0x23a
ffffd001e3d30970 fffff80373b3b3d5 : ffffe000c1ea9750 ffffe000c1ea9750 0000000000000002 ffffe000c1ea97c8 : nt!MiClearFilePointer+0x62
ffffd001e3d309a0 fffff80373b3b321 : ffffe000c1ea97c8 0000000000000000 ffffe000c1ea9750 00001f8000000001 : nt!MiCheckForControlAreaDeletion+0x45
ffffd001e3d309d0 fffff80373b3af54 : fffffa800481c230 ffffd001e3d30b10 ffffd001e3600000 fffffa8003f481e0 : nt!MiDereferenceControlAreaPfn+0x95
ffffd001e3d30a10 fffff80373b97783 : fffffa800481c230 fffffa800481c230 0000000000000000 fffff80373dd7288 : nt!MiRestoreTransitionPte+0x20c
ffffd001e3d30b50 fffff80373b975a9 : fffffa8004816a70 fffff80373dd72d8 0000000000000000 000000000018078d : nt!MiRemoveLowestPriorityStandbyPage+0x1b7
ffffd001e3d30be0 fffff80373b973f1 : fffff80373b973e4 ffffe000be9a4880 fffff80373d55858 ffffe000be9a4880 : nt!MiPurgeTransitionList+0x81
ffffd001e3d30c20 fffff80373ac0d6f : fffff80373f02d78 ffffe000be9a49c0 ffffe000be9a4880 0000000000000000 : nt!MiFinishResume+0xd
ffffd001e3d30c50 fffff80373ab2f34 : fffff80373dd6e02 ffffe000be9a4880 0000000000000080 ffffe000be9a4880 : nt!ExpWorkerThread+0x69f
ffffd001e3d30d00 fffff80373bd69c6 : ffffd001e8241180 ffffe000be9a4880 ffffe000c27f3880 0000000000000000 : nt!PspSystemThreadStartup+0x58
ffffd001e3d30d60 0000000000000000 : ffffd001e3d31000 ffffd001e3d2b000 0000000000000000 0000000000000000 : nt!KiStartSystemThread+0x16

Is there anything you can decipher from the FO that is being torn down?
Is it the same one every time?
If yes, then maybe you can investigate it.
Also check the defrag paths in your filter. FSCTL_MOVE_FILE, for example,
carries a file handle in MOVE_FILE_DATA, and that handle could potentially be
passed down the stack and ruin everything.
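
One way to picture the FSCTL_MOVE_FILE concern: the input buffer carries a handle rather than a file object, so the file system resolves it on its own and may end up holding a file object that belongs to the filter. What follows is a minimal user-mode model in C, not driver code. Every name in it (`model_move_file_data`, `model_handle_pair`, `model_find_pair`, `model_pre_move_file`) is hypothetical and handles are plain integers. In a real minifilter the equivalent check would presumably sit in the IRP_MJ_FILE_SYSTEM_CONTROL pre-operation callback and work on the real MOVE_FILE_DATA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for MOVE_FILE_DATA: only the handle field matters here. */
typedef struct {
    uintptr_t file_handle;   /* models MOVE_FILE_DATA.FileHandle */
} model_move_file_data;

/* Hypothetical table mapping upper (filter-owned) handles to the
 * lower handles the filter opened on the real file system. */
typedef struct {
    uintptr_t upper_handle;
    uintptr_t lower_handle;
} model_handle_pair;

/* Returns the matching pair, or NULL if the handle is not one of ours. */
static const model_handle_pair *
model_find_pair(const model_handle_pair *table, size_t count, uintptr_t handle)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].upper_handle == handle) {
            return &table[i];
        }
    }
    return NULL;
}

/* Models the pre-operation step: if the embedded handle is filter-owned,
 * swap in the lower handle before the request goes down the stack.
 * Returns true when a swap happened. */
static bool
model_pre_move_file(model_move_file_data *input,
                    const model_handle_pair *table, size_t count)
{
    assert(input != NULL);
    const model_handle_pair *pair =
        model_find_pair(table, count, input->file_handle);
    if (pair == NULL) {
        return false;            /* not ours: pass through untouched */
    }
    input->file_handle = pair->lower_handle;
    return true;
}
```

Swapping the handle is only one possible policy; failing the request for filter-owned files would be another. The point of the model is that the handle inside the buffer has to be inspected at all.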

Cheers,
Gabriel.
www.kasardia.com

On Thu, Mar 16, 2017 at 5:12 AM, wrote:

Bercea. G.

Thanks Gabe. I found that the FO is one of mine and that MiClearFilePointer is touching the SectionObjectPointers. When I try to examine the address the SOP pointer refers to, the memory isn’t valid, hence the crash.

Because this is one of my FOs, my filter must have already released the memory for the SOP structure, or else this wouldn’t crash. In my filter, the SOP release is only done in the STREAM_CONTEXT cleanup routine. I was under the impression that context cleanup is called sometime after IRP_MJ_CLOSE. If that’s the case, I don’t understand how or why my FO got to this point. If it already made it through CLOSE, there shouldn’t be any more references, correct? With this being on the hibernation resume code path, are there any specific FSCTLs I may be missing that would allow one of my FOs to sneak through? I would have expected the NtfsDecode BSOD instead of this error, though.

If you haven’t already, I’d fire up Driver Verifier, because this sort of
situation is made for "!verifier 80".

> Because this is one of my FO’s, this means that my filter has already
> released the memory for the SOP structure or else this wouldn’t crash.

As a sanity check, can you navigate from your FsContext to your SOPs without
issue?

> In my filter, the SOP release is only done during the STREAM_CONTEXT
> cleanup routine.

Which one, upper or lower? What are you pinning the lower file objects
with? I could see a situation in which a dismount causes FLTMGR to do
a forced teardown of the STREAM_CONTEXT: your file object will still be
around but the stream context won’t be.
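
The forced-teardown scenario suggests that freeing the SOP from context cleanup alone may be too early. Here is a minimal user-mode sketch in C of one alternative, assuming the SOP lives in a reference-counted per-stream block on which the stream context and every upper file object each hold a reference. All names (`model_stream`, `model_stream_create`, `model_stream_ref`, `model_stream_release`) are hypothetical, and a real driver would use interlocked operations and pool allocations instead of a plain counter and `malloc`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-stream block that owns the SOP storage. */
typedef struct {
    long  refs;          /* stream context + one per upper file object */
    void *sop;           /* stands in for SECTION_OBJECT_POINTERS */
    bool *freed_flag;    /* observation hook: set when the block is released */
} model_stream;

/* Creates the block with one reference, held by the stream context. */
static model_stream *model_stream_create(bool *freed_flag)
{
    model_stream *s = malloc(sizeof(*s));
    if (s == NULL) {
        return NULL;
    }
    s->sop = calloc(3, sizeof(void *));   /* three pointer-sized fields */
    if (s->sop == NULL) {
        free(s);
        return NULL;
    }
    s->refs = 1;
    s->freed_flag = freed_flag;
    *freed_flag = false;
    return s;
}

/* Called when an upper file object starts pointing at this stream. */
static void model_stream_ref(model_stream *s)
{
    assert(s->refs > 0);
    s->refs++;
}

/* Called from context cleanup AND from each file object's final close.
 * The SOP storage goes away only when the last holder lets go. */
static void model_stream_release(model_stream *s)
{
    assert(s->refs > 0);
    if (--s->refs == 0) {
        *s->freed_flag = true;
        free(s->sop);
        free(s);
    }
}
```

With this shape, a forced context teardown only drops the context's own reference, so a file object that is still alive keeps its SOP memory valid until it is closed.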

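> If you haven’t I’d fire up verifier

The offending address is not found with !verifier 80.

> As a sanity check can you navigate from your fscontext to your SOPs without issue?

The FsContext is pointing to invalid memory. That too is not found with !verifier 80. FsContext2 does point to valid memory, but it looks like it was already reallocated by something else.

> Which one? upper or lower?

Upper.

> What are you pinning the lower file objects with.

Another file on the same volume.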

I haven’t been able to recreate this crash, and again, it doesn’t make a lot of sense to me how it’s happening. I was thinking that I might have a reference counting issue, but I would assume verifier would pick that up.

Related to that (your point about the forced teardown made me think of it): should an isolation filter be allowed to unload? If an app opens a file and an SFO is created, but then the driver is unloaded and FLTMGR forces an unload, then as soon as that app does any operation on the outstanding FO, it’ll BSOD with the NtfsDecode error. Or will the driver not unload until all FOs are closed (which can take a long time)?