Can you count the number of WinDBG commands you know on one hand? Been meaning to learn some commands other than !analyze -v, but been too busy to crack the docs open? Well then, this article is for you! I'm going to break down ten WinDBG commands that I couldn't live without.
System Information Commands
Sometimes as part of your analysis you'd like more detailed information about the target system that generated the crash dump. The commands in this section let you dig up critical details about the target that just might be the clues you need to perform your analysis.
Don't be fooled by the name: the !vm command gives you a great quick view into the virtual and physical memory usage on a system. When I run !vm, I like to use a flags value of 0x21, which omits some process-specific memory usage information and adds extra info about the kernel address space on platforms that support it (See Figure 1).
kd> !vm 0x21
*** Virtual Memory Usage ***
Physical Memory: 261886 ( 1047544 Kb)
Page File: \??\C:\pagefile.sys
Current: 1572864 Kb Free Space: 1571132 Kb
Minimum: 1572864 Kb Maximum: 3145728 Kb
Available Pages: 211575 ( 846300 Kb)
Free System PTEs: 231247 ( 924988 Kb)
NonPagedPool Usage: 0 ( 0 Kb)
NonPagedPoolNx Usage: 2969 ( 11876 Kb)
NonPagedPool Max: 52691 ( 210764 Kb)
PagedPool Usage: 4904 ( 19616 Kb)
PagedPool Maximum: 51200 ( 204800 Kb)
NOTE: The !vm output currently has a bug where the non-paged pool usage is always listed as zero. The actual non-paged pool usage is listed as "NonPagedPoolNx Usage" in the output.
Note here that we see the amount of physical memory in the system as well as how much memory is currently free, followed by the current usage of the system PTEs and the pools. If we suspect some sort of resource exhaustion in the system, we can use this command to quickly pinpoint which resource is being consumed.
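One thing worth internalizing about this output is that the left column is always a raw page count and the parenthesized value is the same number expressed in kilobytes, assuming the standard 4KB x86/x64 page size. A quick sketch of that conversion, checked against the !vm output above:

```python
# Sketch: sanity-checking !vm page counts against their Kb values.
# Assumes the standard 4096-byte page size used on x86/x64.

PAGE_SIZE_KB = 4

def pages_to_kb(pages: int) -> int:
    """Convert a page count (as shown by !vm) to kilobytes."""
    return pages * PAGE_SIZE_KB

# Values taken from the !vm output above:
assert pages_to_kb(261886) == 1047544   # Physical Memory
assert pages_to_kb(211575) == 846300    # Available Pages
assert pages_to_kb(231247) == 924988    # Free System PTEs
```

Keeping this mapping in mind makes it easy to eyeball how close a pool or PTE consumer is to its maximum when hunting for resource exhaustion.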
Do you have a customer who can repeatedly reproduce a problem that you can't reproduce with the exact same procedure? Maybe you're not using a fast enough processor or the right BIOS version. In any event, how can you tell what system configuration the customer is using from just a dump file? Enter !sysinfo, a command that can tell you just about anything you'd want to know about the system using information cached on the target. For example, let's see what kind of processor is in this system (See Figure 2).
kd> !sysinfo cpuinfo
~MHz = REG_DWORD 1779
Component Information = REG_BINARY 0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0
Configuration Data = REG_FULL_RESOURCE_DESCRIPTOR ff,ff,ff,ff,ff,ff,ff,ff,0,0,0,0,0,0,0,0
Identifier = REG_SZ x86 Family 15 Model 1 Stepping 2
ProcessorNameString = REG_SZ Intel(R) Pentium(R) 4 CPU 1.80GHz
Update Signature = REG_BINARY 0,0,0,0,2d,0,0,0
Update Status = REG_DWORD 0
VendorIdentifier = REG_SZ GenuineIntel
MSR8B = REG_QWORD 2d00000000
CPUID1 = REG_BINARY 12,f,0,0,8,8,1,0,0,0,0,0,ff,fb,eb,3f
How about the BIOS version and other platform info?
kd> !sysinfo machineid
Machine ID Information [From Smbios 2.3, DMIVersion 35, Size=2982]
BiosVendor = Dell Computer Corporation
BiosVersion = A05
BiosReleaseDate = 10/05/2001
SystemManufacturer = Dell Computer Corporation
SystemProductName = OptiPlex GX400
BaseBoardManufacturer = Dell Computer Corporation
BaseBoardProduct = OptiPlex GX400
There's more here as well if you explore the documentation for the command. For example, you can even query information about which RAM slots are populated using the smbios switch (e.g. !sysinfo smbios -memory).
Suspected Race Condition Commands
Race conditions are the worst. They're difficult to track down, difficult to reproduce, and by the time you get a crash it may be too late: the race has already happened, and the crash you're analyzing is only the secondary failure. So there's nothing that can be done, right? Wrong! WinDBG has a couple of commands that can make you feel like you've won the lottery by pinpointing the racing thread with ease.
If you're lucky, the thread that is racing with your crashing thread is still running on another processor. This is where !running comes in; it shows information about each thread that is currently running on a processor in the system. Whenever I run this command I like to specify the -ti switch to include thread stacks in the output, as well as idle threads:
1: kd> !running -ti
0 f7857120 85ed2da8 ................
ba9be270 804f961f nt!KeBugCheckEx+0x19
ba9be62c 805310dd nt!KiDispatchException+0x307
ba9be694 8053108e nt!CommonDispatchException+0x4d
ba9be6a4 f6cd768d nt!Kei386EoiHelper+0x18e
ba9be6b4 f6c0675a ks!KsReleaseIrpOnCancelableQueue+0x5b
ba9be758 f6c15264 portcls!CIrpStream::ReleaseUnmappingIrp+0xd0
ba9be780 f6c21760 portcls!UpdateActivePinCount+0xb
f6cd7553 10c2c95e portcls!CPortPinWavePci::DistributeDeviceState+0x4d
1 f7867120 86fb5b30 ...............
f7a1eba0 f6c0d445 portcls!CIrpStream::GetMapping+0x17
f7a1ebc8 f6c31ce1 portcls!CPortPinWavePci::GetMapping+0x2a
If the thread isn't actively running, you might think you'd have to go the long way and hunt for a racing thread with !process 0 7. However, WinDBG also provides a way to look at threads that are ready to run: the !ready command. Maybe the current thread preempted another thread and that's the reason for the race, in which case the other thread will be in the ready state. Whenever I use !ready, I like to use the 0xF flags value so that I can see the call stacks of the threads, though I won't do that here just to keep the output short (see Figure 3).
Processor 0: Ready Threads at priority 8
THREAD 8543cd48 Cid 0004.0b58 Teb: 00000000 Win32Thread: 00000000 READY
Processor 0: Ready Threads at priority 1
THREAD 85367020 Cid 0004.0008 Teb: 00000000 Win32Thread: 00000000 READY
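Note the Cid values in this output: they encode the owning process ID and the thread ID as a pair of hex numbers, so a Cid beginning with 0004 marks a thread belonging to the System process. A tiny illustrative parser (the function name is my own invention, not a debugger API):

```python
def parse_cid(cid: str) -> tuple[int, int]:
    """Split a debugger Cid string ('pppp.tttt', both hex) into
    (process_id, thread_id) integers."""
    pid_hex, tid_hex = cid.split(".")
    return int(pid_hex, 16), int(tid_hex, 16)

# The Cids from the !ready output above: both ready threads
# belong to process 4, the System process.
assert parse_cid("0004.0b58") == (4, 0xB58)
assert parse_cid("0004.0008") == (4, 8)
```

Recognizing System-process Cids at a glance helps you quickly separate kernel worker threads from threads belonging to a user process you may be debugging.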
Memory Analysis Commands
Have an address and want to know what it is? Is it a pool allocation? Is it paged out? Here are a couple of commands that will get you the information you need.
!pool is a standard command for any toolbox, so I suspect most of you know it and love it already. For those who might not be aware, !pool takes an arbitrary virtual address and tells you whether or not it is a pool allocation. If it is indeed a pool allocation, you'll be told some details about it, such as whether it's allocated or freed, the length of the allocation, the tag, and so on. When I use !pool, I like to specify a flags value of 2 to suppress information about other allocations surrounding the address (See Figure 4).
kd> !pool 8539da40 2
Pool page 8539da40 region is Nonpaged pool
*8539da40 size: 8 previous size: 148 (Free) Io
Pooltag Io : general IO allocations, Binary : nt!io
Before moving on, I'd like to note something in this output that often confuses people. The previous size value shown here is not the "previous size of this allocation." Instead, it is the size of the allocation preceding this entry in the pool page. This linkage is used as part of a consistency check by the Memory Manager to validate that the page of memory has not been corrupted by buffer overruns or underruns.
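To make that consistency check concrete, here is a simplified model of it. This assumes x86-style pool headers, where sizes are stored in 8-byte block units (so the 0x148-byte previous allocation above is 41 blocks, and the 8-byte free entry is 1 block); the function and tuple layout are my own invention for illustration:

```python
BLOCK_UNIT = 8  # x86 pool header sizes are stored in 8-byte units

def check_pool_page(entries):
    """entries: (previous_size, block_size, tag) tuples, in block
    units, in the order they appear within one pool page.
    Returns True if each entry's previous_size matches the
    block_size of the entry before it -- the same linkage the
    Memory Manager checks to catch overruns and underruns."""
    for i in range(1, len(entries)):
        prev_size = entries[i][0]
        prior_block_size = entries[i - 1][1]
        if prev_size != prior_block_size:
            return False
    return True

# An intact page, mirroring the !pool output above:
# 41 blocks * 8 = 0x148 bytes, then an 8-byte (1 block) free chunk.
page = [(0, 41, b"Io  "), (41, 1, b"Free")]
assert check_pool_page(page)

# A corrupted page: an overrun stomped the next entry's header.
assert not check_pool_page([(0, 41, b"Io  "), (7, 1, b"Free")])
```

When this linkage is broken, the system typically bugchecks with a pool-corruption code, and !pool on the surrounding page is often the fastest way to spot which neighbor did the stomping.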
Sometimes you’d like to view the virtual memory structures for a given virtual address, such as the PDE and PTE. In that case, you can use the !pte command, which will provide decoded information about a virtual address. Here’s some example output for a valid virtual address:
kd> !pte 9371a000
PDE at C0300934 PTE at C024DC68
contains 9B441863 contains 8B660121
pfn 9b441 ---DA--KWEV pfn 8b660 -G--A--KREV
We can also see what happens if we specify a virtual address that isn’t valid to the hardware, such as one with its backing page currently in transition:
kd> !pte 93726000
PDE at C0300934 PTE at C024DC98
contains 9B441863 contains 8B5A0860
pfn 9b441 ---DA--KWEV not valid
Protect: 3 - ExecuteRead
Now we have some further details as to why the address is invalid, which may be invaluable to our investigation.
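A useful bit of trivia about this output: on non-PAE x86 (the platform shown above), the page tables are self-mapped at fixed virtual addresses, so the PDE and PTE addresses that !pte reports are pure arithmetic on the input address, and bit 0 of the PTE contents is the hardware Valid bit. A sketch of the math, checked against both examples above:

```python
# Non-PAE x86 self-map: PTEs live at 0xC0000000, the page
# directory at 0xC0300000, so !pte's addresses are computable.
PTE_BASE = 0xC0000000
PDE_BASE = 0xC0300000

def pde_address(va: int) -> int:
    return PDE_BASE + (va >> 22) * 4   # one 4-byte PDE per 4MB region

def pte_address(va: int) -> int:
    return PTE_BASE + (va >> 12) * 4   # one 4-byte PTE per 4KB page

def is_valid(pte_contents: int) -> bool:
    return bool(pte_contents & 1)      # bit 0 is the Valid bit

# The valid address from the first !pte example:
assert pde_address(0x9371A000) == 0xC0300934
assert pte_address(0x9371A000) == 0xC024DC68
assert is_valid(0x8B660121)            # hardware-valid PTE

# The transition address from the second example:
assert pte_address(0x93726000) == 0xC024DC98
assert not is_valid(0x8B5A0860)        # "not valid" -- bit 0 clear
```

PAE and x64 use different self-map bases and entry sizes, so treat this as an illustration of the mechanism rather than a universal formula.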
Viewing O/S Trace Information
The O/S has some built-in trace facilities that you can turn on to collect data that might be useful during analysis. Unfortunately, these facilities need to be turned on before the problem happens, but knowing that this information is available can be useful in some situations.
We're all using Driver Verifier, right? Well, what you might not realize is that starting in Windows Vista, Verifier keeps a log of interesting events that happen in your driver. Assuming you've enabled Driver Verifier on your driver, you can extract valuable information with the following !verifier commands:
!verifier 0x80 Address – This command dumps the allocate and free log, which records each pool allocation and free made by your driver. Included in the output is the call stack of the operation, which can be invaluable when you're trying to track down use-after-free or double-free bugs. Optionally, the command takes an address value that limits the output to only allocation ranges that include that address.
!verifier 0x100 Address – This command dumps the IRP log, which records each call to IoAllocateIrp, IoCancelIrp, and IoCompleteRequest made by your driver.
!verifier 0x200 – This command dumps the critical region log, which records each call to KeEnterCriticalRegion and KeLeaveCriticalRegion made by your driver.
!htrace and !obtrace
Handle leaks and object reference leaks can be very tricky to track down, especially when working with a large code base. Luckily, the O/S has built-in facilities for logging handle and reference count activity. All you need to do is enable them and know the commands for extracting the logs, which in this case are !htrace and !obtrace.
Handle tracing needs to be enabled on a per-process basis, which can be done using Application Verifier. As driver writers, however, we're typically only interested in kernel handles. By implementation, kernel handles are actually just handles from the handle table of the System process. And, as luck would have it, if you enable Driver Verifier, handle tracing is automatically turned on for the System process. Thus, as long as Driver Verifier is enabled on the target, you can dump the handle tracing log for all kernel handles with !htrace 0 PEPROCESS:
1: kd> !htrace 0 85e0a170
Handle 0x281C - CLOSE
Thread ID = 0x00000ab4, Process ID = 0x00000408
Handle 0x281C - OPEN
Thread ID = 0x00000ab4, Process ID = 0x00000408
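When hunting a leak in a log like this, what you're really doing is pairing up OPENs with CLOSEs and looking for the unmatched opens. A sketch of that triage, assuming the events have been put into chronological order first (note the raw !htrace output lists the most recent event first); the function name and tuple format are invented for illustration:

```python
def find_leaked_handles(events):
    """events: chronologically ordered (handle, op) pairs from an
    !htrace-style log, with op being 'OPEN' or 'CLOSE'.
    Returns the handle values opened but never closed -- the
    leak candidates worth examining stack traces for."""
    open_handles = set()
    for handle, op in events:
        if op == "OPEN":
            open_handles.add(handle)
        elif op == "CLOSE":
            open_handles.discard(handle)
    return open_handles

# Handle 0x281C above was opened and then closed -- no leak.
# A second, hypothetical handle never sees its CLOSE.
log = [(0x281C, "OPEN"), (0x1234, "OPEN"), (0x281C, "CLOSE")]
assert find_leaked_handles(log) == {0x1234}
```

In a real investigation you'd then go back to the !htrace output for each unmatched handle, since the log also records the stack trace of the opener.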
Object reference tracing, on the other hand, needs to be enabled on a system-wide basis with GFlags. Due to the volume of tracing generated, when you enable tracing you must specify the pool tag of the object type you want to trace (e.g. 'File'), and you can also limit the tracing to a single process's objects. Once you have enabled tracing via GFlags, you can view the trace for a given object with !obtrace (shown in Figure 5).
Plug and Play and Power Issues
Nothing is more annoying than when the system hangs during a plug and play or power operation. Luckily, the debugger provides a quick way to identify the threads participating in the operation so that you can get right to resolving the issue.
!pnptriage is a nifty command that combines the output of several PnP-related debugging commands. It identifies any of your devnodes with problems and dumps out any PnP worker threads that are currently executing, letting you quickly identify the threads in the system that might be of interest to you:
0: kd> !pnptriage
Dumping devnodes with problems...
Dumping IopRootDeviceNode (= 0x86c05c08)
DevNode 0x8a131e78 for PDO 0x8a1af6a8
InstancePath is "USB\VID_0403&PID_6001\7&2363c875&0&1"
State = DeviceNodeInitialized (0x302)
Previous State = DeviceNodeUninitialized (0x301)
Problem = CM_PROB_FAILED_INSTALL
Dumping currently active PnP thread (if any)...
Dumping device action thread...
THREAD 847f8798 Cid 0004.0044 Teb: 00000000 Win32Thread: 00000000 WAIT: (Executive) KernelMode Non-Alertable
!poaction is the essential command for debugging any power-related issue. Most importantly, !poaction shows any outstanding query or set power operations and the driver to which they were sent, which can be used to quickly identify which devices are preventing the power operation from completing. It's great for getting insight into what's going on when the system mysteriously refuses to enter or resume from a lower power state:
1: kd> !poaction
State..........: 3 - Set System State
Lightest State.: Hibernate
Flags..........: 80000004 OverrideApps|Critical
Irp minor......: SetPower
System State...: Hibernate
Hiber Context..: 89dd5978
Allocated power irps (PopIrpList - 82978480)
IRP: 8e1d8f00 (set/D0,), PDO: 89c0a248, CURRENT: 89fde028
IRP: 9d722e48 (set/D0,), PDO: 89c08818, CURRENT: 89f92620
IRP: 9fe7ee70 (set/D0,), PDO: 89c08940, CURRENT: 89f917a0
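The Flags field in this output is a bitmask over the POWER_ACTION_POLICY flag bits from wdm.h; the debugger decodes it for you, but it's handy to know the bits when reading raw structures. The constant values below are my reading of the WDK headers, so treat them as an assumption to verify against your own SDK:

```python
# POWER_ACTION_POLICY flag bits (values believed to match wdm.h's
# PO_ACTION_* constants -- verify against your WDK headers).
POWER_ACTION_FLAGS = {
    0x00000001: "QueryAllowed",
    0x00000002: "UIAllowed",
    0x00000004: "OverrideApps",
    0x80000000: "Critical",
}

def decode_power_flags(flags: int) -> list[str]:
    """Expand a !poaction Flags value into its named bits."""
    return [name for bit, name in sorted(POWER_ACTION_FLAGS.items())
            if flags & bit]

# Matches the "80000004 OverrideApps|Critical" line above.
assert decode_power_flags(0x80000004) == ["OverrideApps", "Critical"]
```

Seeing Critical set, for example, tells you the system intends to carry out the transition (here, hibernation) even if applications object, which frames how urgently a stuck power IRP needs to be found.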
Did I Miss Any?
Got your own favorite command that wasn't represented here? Send me an email at email@example.com and let me know!
Analyst’s Perspective is a column by OSR Consulting Associate, Scott Noone. When he’s not root-causing complex kernel issues, he’s leading the development and instruction of OSR’s Kernel Debugging seminar. Comments or suggestions for this or future Analyst’s Perspective columns can be addressed to firstname.lastname@example.org.