The following is an analysis that I wrote up based on a crash dump provided by
a former Kernel Debugging and Crash Analysis for Windows student. I thought the
information might be of use to others, so I'm sharing it here.
2: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 9dc8f09d, The address that the exception occurred at
Arg3: 93dc3964, Trap Frame
Arg4: 00000000
Debugging Details:
------------------
FAULTING_IP:
win32k!xxxProcessEventMessage+36e
9dc8f09d mov edx,dword ptr [edx]
TRAP_FRAME: 93dc3964 -- (.trap 0xffffffff93dc3964)
ErrCode = 00000000
eax=fe863cd0 ebx=88c68048 ecx=00004110 edx=00000000 esi=93dc3aac edi=fe4ba860
eip=9dc8f09d esp=93dc39d8 ebp=93dc39fc iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206
win32k!xxxProcessEventMessage+0x36e:
9dc8f09d mov edx,dword ptr [edx] ds:0023:00000000=????????
Resetting default scope
DEFAULT_BUCKET_ID: WIN7_DRIVER_FAULT
BUGCHECK_STR: 0x8E
PROCESS_NAME: AcroRd32.exe
CURRENT_IRQL: 0
LAST_CONTROL_TRANSFER: from 82eed68e to 82f17e98
STACK_TEXT:
93dc34d4 82eed68e 0000008e c0000005 9dc8f09d nt!KeBugCheckEx+0x1e
93dc38f4 82e774e6 93dc3910 00000000 93dc3964 nt!KiDispatchException+0x1ac
93dc395c 82e7749a 93dc39fc 9dc8f09d badb0d00 nt!CommonDispatchException+0x4a
93dc3988 9dcacbc7 00000000 00000000 00000023 nt!Kei386EoiHelper+0x192
93dc39fc 9dcb51c9 fe4ba860 fe042458 0e3effc3 win32k!PostInputMessage+0x126
93dc3b4c 9dce7f50 fe4ba860 93dc3be4 00000000 win32k!xxxScanSysQueue+0x291
93dc3bb4 9dcdcf0e 93dc3be4 000025ff 00000000 win32k!xxxRealInternalGetMessage+0x32e
93dc3c18 82e768fa 002efca0 00000000 00000000 win32k!NtUserPeekMessage+0x3f
93dc3c18 77d97094 002efca0 00000000 00000000 nt!KiFastCallEntry+0x12a
WARNING: Frame IP not in any known module. Following frames may be wrong.
002efc40 00000000 00000000 00000000 00000000 0x77d97094
The bugcheck is a vanilla NULL pointer deref, where EDX
contains the NULL value:
2: kd> r
Last set context:
eax=fe863cd0 ebx=88c68048 ecx=00004110 edx=00000000 esi=93dc3aac edi=fe4ba860
eip=9dc8f09d esp=93dc39d8 ebp=93dc39fc iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206
win32k!xxxProcessEventMessage+0x36e:
9dc8f09d 8b12 mov edx,dword ptr [edx] ds:0023:00000000=????????
In the interest of finding out where EDX came from,
unassemble the faulting function:
2: kd> uf win32k!xxxProcessEventMessage
Flow analysis was incomplete, some code may be missing
win32k!xxxProcessEventMessage:
9dc8ed2f mov edi,edi
9dc8ed31 push ebp
9dc8ed32 mov ebp,esp
9dc8ed34 sub esp,18h
9dc8ed37 push esi
9dc8ed38 mov esi,dword ptr [ebp+0Ch]
9dc8ed3b push edi
9dc8ed3c push offset win32k!CleanEventMessage (9dca6f32)
9dc8ed41 lea eax,[ebp-18h]
9dc8ed44 push eax
9dc8ed45 push esi
9dc8ed46 call win32k!PushW32ThreadLock (9dcf3483)
9dc8ed4b mov eax,dword ptr [esi+30h]
9dc8ed4e mov edi,dword ptr [ebp+8]
9dc8ed51 mov ecx,dword ptr [edi+0BCh]
9dc8ed57 and eax,3FFFFFFFh
9dc8ed5c dec eax
9dc8ed5d mov dword ptr [ebp+0Ch],ecx
9dc8ed60 cmp eax,10h
9dc8ed63 ja win32k!xxxProcessEventMessage+0x567 (9dc8f296)
win32k!xxxProcessEventMessage+0x3a:
9dc8ed69 push ebx
9dc8ed6a jmp dword ptr win32k!xxxProcessEventMessage+0x578 (9dc8f2a5)[eax*4]
win32k!xxxProcessEventMessage+0x567:
9dc8f296 lea eax,[ebp-18h]
9dc8f299 push eax
9dc8f29a call win32k!PopW32ThreadLock (9dca90fc)
9dc8f29f pop edi
9dc8f2a0 pop esi
9dc8f2a1 leave
9dc8f2a2 ret 8
Pretty short function, but our faulting instruction is not
included! The trouble is this line:
9dc8ed6a jmp dword ptr win32k!xxxProcessEventMessage+0x578 (9dc8f2a5)[eax*4]
We most likely jumped through this instruction, an indirect jump through a
table that WinDbg isn't following with the uf command. The trouble is that the
path taken depends on the value of EAX at the time of the jump. If we dump this
area out with dps we should expect to see an array of code pointers, one per
jump target:
2: kd> dps 9dc8f2a5
9dc8f2a5 9dc8edb8 win32k!xxxProcessEventMessage+0x89
9dc8f2a9 9dc8efd7 win32k!xxxProcessEventMessage+0x2a8
9dc8f2ad 9dc8ee23 win32k!xxxProcessEventMessage+0xf4
9dc8f2b1 9dc8ee30 win32k!xxxProcessEventMessage+0x101
9dc8f2b5 9dc8efc9 win32k!xxxProcessEventMessage+0x29a
9dc8f2b9 9dc8ee43 win32k!xxxProcessEventMessage+0x114
9dc8f2bd 9dc8f019 win32k!xxxProcessEventMessage+0x2ea
9dc8f2c1 9dc8ed71 win32k!xxxProcessEventMessage+0x42
9dc8f2c5 9dc8f03f win32k!xxxProcessEventMessage+0x310
9dc8f2c9 9dc8f04c win32k!xxxProcessEventMessage+0x31d
9dc8f2cd 9dc8f07f win32k!xxxProcessEventMessage+0x350
9dc8f2d1 9dc8ee16 win32k!xxxProcessEventMessage+0xe7
9dc8f2d5 9dc8f138 win32k!xxxProcessEventMessage+0x409
9dc8f2d9 9dc8f16d win32k!xxxProcessEventMessage+0x43e
9dc8f2dd 9dc8f1aa win32k!xxxProcessEventMessage+0x47b
9dc8f2e1 9dc8f22b win32k!xxxProcessEventMessage+0x4fc
9dc8f2e5 9dc8f261 win32k!xxxProcessEventMessage+0x532
9dc8f2e9 90909090
9dc8f2ed 55ff8b90
9dc8f2f1 ec83ec8b
9dc8f2f5 1d8b5350
9dc8f2f9 9de43db8 win32k!gptiCurrent
9dc8f2fd 08758b56
9dc8f301 00bc868b
9dc8f305 33570000
9dc8f309 fc5d89ff
9dc8f30d 39f87d89
9dc8f311 840f2878
9dc8f315 000002a0
9dc8f319 0a74f33b
9dc8f31d 50b0458d
9dc8f321 cf34e856
Unfortunately the table has more than one entry (17 of them, which matches the
cmp eax,10h bounds check), so to figure out the code sequence we either need to:
1. Go through these one by one with the u/uf command until we find one that
leads to our sequence
2. Reconstruct EAX at the time of the crash
I tried #1 first and quickly gave up. So, back to the
disassembly of the function with just the interesting pieces shown:
win32k!xxxProcessEventMessage:
9dc8ed2f mov edi,edi
9dc8ed31 push ebp
9dc8ed32 mov ebp,esp
...
9dc8ed38 mov esi,dword ptr [ebp+0Ch]
...
9dc8ed4b mov eax,dword ptr [esi+30h]
...
9dc8ed57 and eax,3FFFFFFFh
9dc8ed5c dec eax
9dc8ed5d mov dword ptr [ebp+0Ch],ecx
9dc8ed60 cmp eax,10h
9dc8ed63 ja win32k!xxxProcessEventMessage+0x567 (9dc8f296)
win32k!xxxProcessEventMessage+0x3a:
9dc8ed69 push ebx
9dc8ed6a jmp dword ptr win32k!xxxProcessEventMessage+0x578 (9dc8f2a5)[eax*4]
Thus, the path of EAX is:
1. [EBP+C] goes into ESI
2. [ESI+30] goes into EAX
3. Mask off the high two bits of EAX (EAX & 0x3FFFFFFF)
4. Decrement EAX by one
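The steps above, together with the bounds check that guards the indirect jump, can be modelled in a few lines of Python. This is only an illustrative sketch of the arithmetic in the disassembly; the function name `jump_table_index` is made up for the example and is not anything from win32k.

```python
from typing import Optional


def jump_table_index(raw_field: int) -> Optional[int]:
    """Model the dispatch prologue: mask, decrement, bounds check.

    raw_field is the DWORD read from [ESI+30h]. Returns the jump table
    index, or None when the `ja` would branch past the table.
    """
    eax = raw_field & 0x3FFFFFFF      # and eax,3FFFFFFFh (drop the high two bits)
    eax = (eax - 1) & 0xFFFFFFFF      # dec eax, wrapping like a 32-bit register
    if eax > 0x10:                    # cmp eax,10h / ja is an unsigned compare
        return None
    return eax
```

Note that a field value of zero wraps around to 0xFFFFFFFF after the decrement, so the unsigned compare rejects it along with anything above 0x11.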
Thus, in order to reconstruct EAX we need either [EBP+C] or ESI.
You'll note however that [EBP+C] is overwritten as part of this sequence
(the mov dword ptr [ebp+0Ch],ecx at 9dc8ed5d), so we can't get the correct value
of [EBP+C] from this frame. However, we're in an EBP frame and therefore we know
that EBP+C is the second stack passed parameter. Therefore, we should be able to
reconstruct the value based on information in the caller's frame:
9dcb51b8 and dword ptr [eax+14h],0
9dcb51bc lea eax,[ebp-0A0h]
9dcb51c2 push eax
9dcb51c3 push ebx
9dcb51c4 call win32k!xxxProcessEventMessage (9dc8ed2f)
9dcb51c9 jmp win32k!xxxScanSysQueue+0xf2 (9dcb502a)
If this is correct, we should be able to figure out the second stack passed
argument. Arguments are pushed right to left, so the first value pushed (EAX)
is the second argument: it's the caller's EBP-A0 (note that there's an LEA
instruction here, so no pointer deref!).
2: kd> r
Last set context:
eax=fe863cd0 ebx=88c68048 ecx=00004110 edx=00000000 esi=93dc3aac edi=fe4ba860
eip=9dc8f09d esp=93dc39d8 ebp=93dc39fc iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206
win32k!xxxProcessEventMessage+0x36e:
9dc8f09d mov edx,dword ptr [edx] ds:0023:00000000=????????
2: kd> * Use our EBP to get the caller's EBP
2: kd> dd @ebp l1
93dc39fc 93dc3b4c
2: kd> * Subtract A0 to get the stack passed argument
2: kd> ? @$p-A0
Evaluate expression: -1814283604 = 93dc3aac
2: kd> * This +30 should give us our initial EAX value:
2: kd> dd 93dc3aac+30 l1
93dc3adc 0000000b
2: kd> * AND it with 3FFFFFFF and subtract one
2: kd> ? (@$p & 0x3FFFFFFF) - 1
Evaluate expression: 10 = 0000000a
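For anyone who wants to double check the debugger arithmetic, here is the same reconstruction as a small Python sketch. The constants are copied from the output above; the variable names and the `to_signed32` helper are just illustrative. The helper also explains the negative decimal value that the `?` command printed, which is simply the 32-bit result shown as a signed number.

```python
def to_signed32(value: int) -> int:
    """Interpret a 32-bit value as signed, the way `?` displays it in decimal."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


caller_ebp = 0x93DC3B4C             # dd @ebp l1: saved EBP of the caller
second_arg = caller_ebp - 0xA0      # lea eax,[ebp-0A0h] in the caller
field_addr = second_arg + 0x30      # mov eax,dword ptr [esi+30h]
raw_field = 0x0000000B              # dd 93dc3aac+30 l1
index = (raw_field & 0x3FFFFFFF) - 1

print(hex(second_arg), to_signed32(second_arg))
print(hex(field_addr), hex(index))
```

The reconstructed argument also happens to match the ESI value in the trap frame (93dc3aac), which is a nice sanity check.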
We should then expect to see our faulting sequence located
at index A of the jump table:
win32k!xxxProcessEventMessage+0x3a:
9dc8ed69 push ebx
9dc8ed6a jmp dword ptr win32k!xxxProcessEventMessage+0x578 (9dc8f2a5)[eax*4]
2: kd> dps 9dc8f2a5 + (a*4) l1
9dc8f2cd 9dc8f07f win32k!xxxProcessEventMessage+0x350
2: kd> u win32k!xxxProcessEventMessage+0x350
win32k!xxxProcessEventMessage+0x350:
9dc8f07f mov ecx,dword ptr [esi+8]
9dc8f082 mov ebx,dword ptr [edi+0C8h]
9dc8f088 mov dl,1
9dc8f08a call win32k!HMValidateHandleNoSecure (9dce8623)
9dc8f08f mov ecx,dword ptr [ebx+14h]
9dc8f092 test ecx,5C00h
9dc8f098 je win32k!xxxProcessEventMessage+0x381 (9dc8f0b0)
9dc8f09a mov edx,dword ptr [ebx+64h]
9dc8f09d mov edx,dword ptr [edx]
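The table lookup is plain base-plus-scaled-index arithmetic, and it can be reproduced offline from saved dps output. The sketch below is a rough illustration: it parses a few of the dps lines shown earlier and resolves index A. The helper names `parse_dps` and `jump_target` are hypothetical, not part of any debugger API.

```python
# Three of the dps lines from the jump table dump, around index A.
DPS_OUTPUT = """\
9dc8f2c9 9dc8f04c win32k!xxxProcessEventMessage+0x31d
9dc8f2cd 9dc8f07f win32k!xxxProcessEventMessage+0x350
9dc8f2d1 9dc8ee16 win32k!xxxProcessEventMessage+0xe7
"""

TABLE_BASE = 0x9DC8F2A5   # operand of the indirect jmp
FUNC_START = 0x9DC8ED2F   # win32k!xxxProcessEventMessage
FAULT_IP = 0x9DC8F09D     # Arg2 of the bugcheck


def parse_dps(text: str) -> dict:
    """Map each slot address to the pointer value stored in that slot."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            table[int(parts[0], 16)] = int(parts[1], 16)
    return table


def jump_target(table: dict, base: int, index: int, entry_size: int = 4) -> int:
    """Resolve `jmp dword ptr base[index*4]` against the parsed table."""
    return table[base + index * entry_size]


target = jump_target(parse_dps(DPS_OUTPUT), TABLE_BASE, 0xA)
print(hex(target), hex(target - FUNC_START), hex(FAULT_IP - target))
```

The last value printed is the distance from the jump target to the faulting instruction, which confirms that the fault sits a handful of instructions into this case of the switch.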
That's great and all, but what does it actually tell us?
Well, we can now see that EDX was loaded from [EBX+64]:
2: kd> dd @ebx+64 l1
88c680ac 00000000
And this means we can try to see if EBX is anything
interesting:
2: kd> !pool @ebx 1
Pool page 88c68048 region is Nonpaged pool
*88c68000 size: d0 previous size: 0 (Allocated) *Desk (Protected)
Owning component : Unknown (update pooltag.txt)
88c68008 00000000 00000084 00000070 82f6fa00
88c68018 e38412c8 00000001 00000000 0010000e
88c68028 8fa7c558 00000000 00000e74 00000027
88c68038 00000000 000e0015 82f6fa00 bc1f3855
88c68048 00000001 fe800578 ff9d1728 88c5bf78
88c68058 88c6bd28 00004110 80480004 fe802a68
88c68068 fe80ee70 00000000 00000000 00000000
88c68078 fe87da48 fe800748 fe800880 8fa87fd8
88c68088 fe800000 00c00000 000604e8 000001a0
88c68098 000003fc 000001a8 00000408 fd053ac0
88c680a8 ff553960 00000000 00050045 00000744
88c680b8 0000003b 00000748 0000003f 00000190
88c680c8 00000000 000e082c
The NULL pointer is the second DWORD on the 88c680a8 row of the output (that
is, the value at 88c680ac, or EBX+64). Just based on the dump of the memory
region, I can't immediately characterize this as a corruption in any kind of
meaningful way (i.e. the entire structure isn't zero or anything).
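Picking a single DWORD out of a raw dump by eye is error prone, so here is a small Python sketch of the address arithmetic. It takes one row of the dump (an address followed by DWORD values) and returns the value at a given address. The helper name `dword_at` is made up for the example; the row and the EBX value are copied from the output above.

```python
def dword_at(dump_row: str, address: int) -> int:
    """Return the DWORD stored at `address` within one row of a memory dump."""
    fields = dump_row.split()
    row_base = int(fields[0], 16)          # first column is the row's address
    offset = address - row_base
    if offset < 0 or offset % 4 or offset // 4 >= len(fields) - 1:
        raise ValueError("address is not a DWORD on this row")
    return int(fields[1 + offset // 4], 16)


EBX = 0x88C68048
ROW = "88c680a8 ff553960 00000000 00050045 00000744"

print(hex(EBX + 0x64), hex(dword_at(ROW, EBX + 0x64)))
```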
I suspect that this allocation represents a desktop object
just based on the tag, "Desk". This is an object used by the graphics subsystem
to indicate, well, a desktop. I can confirm my guess by using !object:
2: kd> !object @ebx
Object: 88c68048 Type: (856c89c8) Desktop
ObjectHeader: 88c68030 (new version)
HandleCount: 39 PointerCount: 3700
Directory Object: 00000000 Name: Default
Given all of that, everything looks pretty much in order. We have a
NULL pointer deref of a common graphics subsystem object, and I don't see any
obvious third party interference. Since an unchecked allocation failure is one
way to end up with a NULL pointer, I decided to look at the memory state of the
machine:
2: kd> !vm 21
*** Virtual Memory Usage ***
Physical Memory:      894268 (   3577072 Kb)
Page File: \??\C:\pagefile.sys
  Current:   3577072 Kb  Free Space:   1380964 Kb
  Minimum:   3577072 Kb  Maximum:     10731216 Kb
Available Pages:      225970 (    903880 Kb)
ResAvail Pages:       610095 (   2440380 Kb)
Locked IO Pages:           0 (         0 Kb)
Free System PTEs:      36039 (    144156 Kb)
Modified Pages:         5104 (     20416 Kb)
Modified PF Pages:      4673 (     18692 Kb)
NonPagedPool Usage:    24142 (     96568 Kb)
NonPagedPool Max:     522760 (   2091040 Kb)
PagedPool 0 Usage:     55224 (    220896 Kb)
PagedPool 1 Usage:     10576 (     42304 Kb)
PagedPool 2 Usage:      7263 (     29052 Kb)
PagedPool 3 Usage:      7277 (     29108 Kb)
PagedPool 4 Usage:      7390 (     29560 Kb)
PagedPool Usage:       87730 (    350920 Kb)
PagedPool Maximum:    523264 (   2093056 Kb)
********** 11 pool allocations have failed **********
Session Commit:        14541 (     58164 Kb)
Shared Commit:        220028 (    880112 Kb)
Special Pool:              0 (         0 Kb)
Shared Process:         4482 (     17928 Kb)
PagedPool Commit:      87788 (    351152 Kb)
Driver Commit:         19638 (     78552 Kb)
Committed pages:     1280380 (   5121520 Kb)
Commit limit:        1788098 (   7152392 Kb)
VA Type         CurrentUse     Peak    Limit  Failures
Unused              134 Mb     0 Mb     OPEN         0
SessionSpace         68 Mb   106 Mb     OPEN         4
ProcessSpace         16 Mb     0 Mb     OPEN         0
BootLoaded           52 Mb    66 Mb     OPEN         0
PfnDatabase          22 Mb   128 Mb     OPEN         0
NonPagedPool        110 Mb   114 Mb     OPEN         0
PagedPool           610 Mb   618 Mb     OPEN        18
SpecialPool           0 Mb     0 Mb     OPEN         0
SystemCache         648 Mb  1026 Mb     OPEN        54
SystemPtes          322 Mb   346 Mb     OPEN         1
Hal                   4 Mb     0 Mb     OPEN         0
SessionGlobal        12 Mb    12 Mb     OPEN         0
Driver Images        50 Mb     0 Mb     OPEN         0
NPSpecialPool         0 Mb     0 Mb     OPEN         0
ProtoPTE Pool         0 Mb     0 Mb     OPEN         0
Maximum contiguous unused VA: 10 Mb
Pretty suspicious that we have some memory allocation
failures. However, I also see that the system has plenty of free memory at the
moment so these could just be red herrings.
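To put rough numbers on "plenty of free memory", the page counts from the !vm output can be turned into utilization figures. The sketch below assumes the standard 4 KB x86 page size, which is consistent with the page and Kb columns in the output; the helper names are just for illustration.

```python
PAGE_SIZE_KB = 4  # assumption: 4 KB pages, as on standard x86 systems


def pages_to_kb(pages: int) -> int:
    """Convert a page count to kilobytes."""
    return pages * PAGE_SIZE_KB


def utilization(used_pages: int, max_pages: int) -> float:
    """Fraction of a pool's maximum that is currently in use."""
    return used_pages / max_pages


# Values copied from the !vm output above.
nonpaged = utilization(24142, 522760)     # NonPagedPool Usage / Max
paged = utilization(87730, 523264)        # PagedPool Usage / Maximum
committed = utilization(1280380, 1788098)  # Committed pages / Commit limit

print(f"nonpaged {nonpaged:.1%}, paged {paged:.1%}, commit {committed:.1%}")
```

Both pools are well below their maximums at the time of the crash, which fits with the failure counts being left over from an earlier point in the system's uptime rather than describing its current state.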
Lastly, I used !sysinfo to figure out what type of computer
this is:
2: kd> !sysinfo smbios
[SMBIOS Data Tables v2.6]
[DMI Version - 38]
[2.0 Calling Convention - No]
[Table Size - 2448 bytes]
...
Family ThinkPad T420
...
And did some quick Googling to see if this is a known issue.
Unfortunately, I came up with nada, so another dead end.
If I were really on the spot to figure this out, I'd
probably set a breakpoint in this code path to walk through the working case.
From there, I could figure out what the NULL value is supposed to be and start
trying to think of how it might have become NULL.