Analyst's Perspective: x64 Trap Frames

Analyst's Perspective is a NEW column focusing on Windows kernel debugging and problem analysis topics.

I'm not quite sure how I haven't been burned by this before, but this week I analyzed an x64 crash that didn't make much sense. The information displayed in the bugcheck simply didn't match up with the information displayed in the trap frame:

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:

Arg1: fffffbbc5f00001a, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, bitfield :
bit 0 : value 0 = read operation, 1 = write operation

TRAP_FRAME: fffff8000415fcc0 -- (.trap 0xfffff8000415fcc0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.

rax=0000013c5f000000 rbx=0000000000000000 rcx=fffffbbc5f000000
rdx=fffff80001654000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8000175e080 rsp=fffff8000415fe50 rbp=0000000000000002
r8=0000000000000002 r9=0000000000000000 r10=fffff800017cc6b0
r11=0000007d30034dd0 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl nz na po nc
nt!MmFreeContiguousMemory+0x110:
fffff800`0175e080 mov al,byte ptr [rsi+1Ah] ds:00000000`0000001a=??

Notice how the bad memory address from the bugcheck information is 0xfffffbbc5f00001a, but the faulting instruction shows RSI+1A as the bad reference with RSI equal to zero. If that were correct, the first parameter to the bugcheck should have been 0x000000000000001a. So, where did 0xfffffbbc5f00001a come from?

Disassembling the block of code around the invalid memory reference shows that the compiler copied RSI into RCX (mov rcx,rsi) immediately before the faulting instruction:

nt!MmFreeContiguousMemory+0xc7:
xor ecx,ecx
call nt!MiDeferredUnlockPages (fffff800`016c6010)
mov rax,7FFFFFFFF8h
mov rcx,0FFFFFA8000000000h
mov r11,rdi
shr r11,9
and r11,rax
mov rax,0FFFFF68000000000h
mov rax,qword ptr [r11+rax]
shr rax,0Ch
and rax,rbx
lea rax,[rax+rax*2]
shl rax,4
lea rsi,[rcx+rax]
mov rcx,rsi
mov al,byte ptr [rsi+1Ah]

So I decided to check RCX and see if it contained a value that matched what was reported in the bugcheck code:

kd> ?@rcx+1a
Evaluate expression: -4688510451686 = fffffbbc`5f00001a
kd> .bugcheck
Bugcheck code 0000000A
Arguments fffffbbc`5f00001a 00000000`00000002 00000000`00000000 fffff800`0175e080
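
The debugger arithmetic above is easy to double-check outside the debugger. The following Python sketch (the names are mine and purely illustrative, not part of any debugger API) adds the 0x1A displacement to the RCX value shown in the trap frame, compares the result with the first bugcheck argument, and shows why the debugger prints a negative decimal: it treats the 64-bit result as a signed two's-complement integer.

```python
MASK64 = (1 << 64) - 1

def to_signed64(value):
    """Interpret a 64-bit unsigned value as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value & (1 << 63) else value

rcx_from_trap_frame = 0xFFFFFBBC5F000000  # rcx= value in the .trap output
displacement = 0x1A                       # from: mov al,byte ptr [rsi+1Ah]
bugcheck_arg1 = 0xFFFFFBBC5F00001A        # Arg1: memory referenced

# Same calculation as "?@rcx+1a", wrapped to 64 bits.
referenced = (rcx_from_trap_frame + displacement) & MASK64

print(f"{to_signed64(referenced)} = {referenced:016x}")
print("matches Arg1:", referenced == bugcheck_arg1)
```

The first line printed carries the same two numbers as the debugger's "Evaluate expression" output, without the backtick separator.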

Not surprisingly, RCX did indeed have the bad address. So why was RSI zero in the trap frame?

The answer lies in this warning that I have been merrily ignoring since it started appearing long ago:

NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.

The documentation made no mention of it, so I assumed it was there for some rare edge case that wasn't even worth mentioning (memory corruption, etc.). But, in reality, this represents a major issue for those of us analyzing crash dumps.

As it turns out, the trap frame generation code in the x64 versions of Windows simply does not save the contents of the non-volatile registers. The idea is that any code that runs after the trap frame generation will properly handle saving and restoring the registers in its own frame, making the saving of the registers in the trap frame an unnecessary step in a hot path in the kernel.

Unfortunately, this means that you should expect the information in the trap frame to be entirely incorrect for any register that is non-volatile. The volatile registers are preserved, however, so the values of RAX, RCX, RDX, R8-R11, and XMM0-XMM5 are always valid in the trap frame. With any other register, you're on your own, and the values should be viewed as stack garbage that happens to occupy the slack space in the structure.
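
As a rough illustration of that rule, here is a small Python sketch that splits the register lines of a .trap display into values you can trust and values you should treat as suspect. It is a hypothetical helper of my own, not a debugger feature. The volatile set follows the list above plus RAX, which is also volatile in the Microsoft x64 calling convention. RIP and RSP are deliberately skipped because the rule above does not cover them, and the XMM registers do not appear in this display.

```python
import re

# Integer registers treated as volatile (caller-saved) on x64 Windows.
VOLATILE = {"rax", "rcx", "rdx", "r8", "r9", "r10", "r11"}

# Matches rax..rdx, rsi, rdi, rbp and r8..r15 followed by a 64-bit hex value.
# rip and rsp are intentionally not matched.
REGISTER = re.compile(r"\b(r(?:[a-d]x|[sd]i|bp|\d{1,2}))=([0-9a-fA-F]{16})\b")

def split_trap_frame(dump_text):
    """Return (trusted, suspect) dicts mapping register name to value."""
    trusted, suspect = {}, {}
    for name, value in REGISTER.findall(dump_text):
        target = trusted if name in VOLATILE else suspect
        target[name] = int(value, 16)
    return trusted, suspect

# Register lines copied from the trap frame shown earlier in the article.
DUMP = """
rax=0000013c5f000000 rbx=0000000000000000 rcx=fffffbbc5f000000
rdx=fffff80001654000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8000175e080 rsp=fffff8000415fe50 rbp=0000000000000002
r8=0000000000000002 r9=0000000000000000 r10=fffff800017cc6b0
r11=0000007d30034dd0 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
"""

trusted, suspect = split_trap_frame(DUMP)
print("trust:  ", ", ".join(sorted(trusted)))
print("suspect:", ", ".join(sorted(suspect)))
```

For the dump in this article, RSI lands in the suspect group, which is exactly the register that led the analysis astray.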

This represents a fundamental change in the way that we approach using trap frames when dealing with x64 dumps. Instead of being presented with the exact state of the CPU at the time of the trap, we're now dealing with a partial view with some registers valid and others garbage. I highly suggest leaving a note on your desk that lists the volatile registers so you know who to trust. And if you need the contents of one of the non-volatile registers, be prepared to retrieve it indirectly through a volatile register or grovel through the stack looking for the last save of the register contents.
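
To make the "retrieve it indirectly" advice concrete, here is a minimal Python sketch that recovers the real RSI for this particular crash in two independent ways, using only volatile registers from the trap frame and a constant visible in the disassembly. The variable names are mine, and the reasoning assumes that nothing between the lea and the fault modified RAX or RCX, which matches the listing shown earlier.

```python
MASK64 = (1 << 64) - 1

# Volatile registers taken from the trap frame, so they can be trusted.
rax = 0x0000013C5F000000
rcx = 0xFFFFFBBC5F000000

# Constant from the disassembly: mov rcx,0FFFFFA8000000000h
base = 0xFFFFFA8000000000

# Route 1: "mov rcx,rsi" ran just before the fault, so RSI equals RCX.
rsi_from_copy = rcx

# Route 2: "lea rsi,[rcx+rax]" ran while RCX still held the constant.
# RAX is unchanged since then because the faulting "mov al,..." never completed.
rsi_from_lea = (base + rax) & MASK64

# Both routes must agree, otherwise one of the assumptions is wrong.
assert rsi_from_copy == rsi_from_lea

print(f"recovered rsi    = {rsi_from_copy:016x}")
print(f"faulting address = {(rsi_from_copy + 0x1A) & MASK64:016x}")
```

Having two routes that agree is a useful sanity check. When only one is available, the recovered value deserves a bit more suspicion.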

Until the next problem...

Snoone

Analyst's Perspective is a column by OSR consulting associate, Scott Noone. When he's not root-causing complex kernel issues, he's leading the development and instruction of OSR's Kernel Debugging seminar. Comments or suggestions for this or future Analyst's Perspective article submissions can be addressed to ap@osr.com.

 

User Comments

"Excellent!"
Grappling with these issues today, so very helpful - thanks!

06-Oct-09, Lyndon Clarke

