BAD_POOL_HEADER

OSR_Community_User · May 28, 2012, 12:49pm

Hi,

This is probably a lame question but, what causes a BAD_POOL_HEADER?

Regards,

Nuno

OSR_Community_User · May 28, 2012, 1:07pm

Contextualizing a little bit more…

I’m running tests on a driver deployment and this BAD_POOL_HEADER happens when a surprise removal of the USB device occurs.

The WinDbg analysis is provided below.

I’m currently testing those drivers in a completely clean Windows 7 installation, without any kind of update.

Since the crash happens in a module that is not mine (though it is of course related, because my device implements a virtual HID driver), I’m starting to think that I should test this driver on updated machines. What is your opinion on this?

I have been improving the driver and the associated control panel, but I haven’t made any particular change that would explain these crashes, so instead of going crazy and reverting to older versions, I’m trying to understand what could be causing this.

WinDbg analysis output:

BAD_POOL_HEADER (19)
The pool is already corrupt at the time of the current request.
This may or may not be due to the caller.
The internal pool links must be walked to figure out a possible cause of
the problem, and then special pool applied to the suspect tags or the driver
verifier to a suspect driver.
Arguments:
Arg1: 00000020, a pool block header size is corrupt.
Arg2: 86c5c918, The pool entry we were looking for within the page.
Arg3: 86c5c960, The next pool entry.
Arg4: 08090005, (reserved)

Debugging Details:

BUGCHECK_STR: 0x19_20

POOL_ADDRESS: 86c5c918 Nonpaged pool

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

PROCESS_NAME: System

CURRENT_IRQL: 2

LAST_CONTROL_TRANSFER: from 82b1a083 to 82ab6110

STACK_TEXT:
807e55fc 82b1a083 00000003 b4da8039 00000065 nt!RtlpBreakWithStatusInstruction
807e564c 82b1ab81 00000003 86c5c918 000001ff nt!KiBugCheckDebugBreak+0x1c
807e5a10 82b5cc6b 00000019 00000020 86c5c918 nt!KeBugCheck2+0x68b
807e5a8c 8fd9822b 86c5c920 00000000 86c5fa24 nt!ExFreePoolWithTag+0x1b1
807e5aa8 8fda08bd 00000000 86daa7d0 8fd9e21c HIDCLASS!DestroyPingPongs+0x45
807e5ac0 8fda09a3 86c5fa10 86daa7d0 86c5fa10 HIDCLASS!HidpFdoPnp+0x143
807e5adc 8fd97b54 86c5fa10 00000017 86daa93c HIDCLASS!HidpIrpMajorPnp+0x5b
807e5af8 82a72593 00c5f958 0000001b 807e5b94 HIDCLASS!HidpMajorHandler+0xc8
807e5b10 82c14f95 86bf3030 86c54858 86bf3030 nt!IofCallDriver+0x63
807e5b40 82d01a3f 86bf3030 00000000 86c54858 nt!IopSynchronousCall+0xc2
807e5b98 82cf98f2 86bf3030 00000017 86c54858 nt!IopRemoveDevice+0xd4
807e5bc0 82cf977b 96337de8 00000000 807e5c04 nt!PnpSurpriseRemoveLockedDeviceNode+0x101
807e5bd0 82cf9a3b 00000003 00000000 00000000 nt!PnpDeleteLockedDeviceNode+0x21
807e5c04 82cfd055 86bf3030 96337de8 00000003 nt!PnpDeleteLockedDeviceNodes+0x4c
807e5cc4 82bed2ca 807e5cf4 00000000 9ad797a0 nt!PnpProcessQueryRemoveAndEject+0x586
807e5cdc 82bfb3ca 00000000 870a4288 84fdc798 nt!PnpProcessTargetDeviceEvent+0x38
807e5d00 82ab8aab 870a4288 00000000 84fdc798 nt!PnpDeviceEventWorker+0x216
807e5d50 82c44f5e 00000001 b4da8be5 00000000 nt!ExpWorkerThread+0x10d
807e5d90 82aec219 82ab899e 00000001 00000000 nt!PspSystemThreadStartup+0x9e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x19

STACK_COMMAND: kb

FOLLOWUP_IP:
HIDCLASS!DestroyPingPongs+45
8fd9822b ff4508 inc dword ptr [ebp+8]

SYMBOL_STACK_INDEX: 4

SYMBOL_NAME: HIDCLASS!DestroyPingPongs+45

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: HIDCLASS

IMAGE_NAME: HIDCLASS.SYS

DEBUG_FLR_IMAGE_TIMESTAMP: 4ce79c09

FAILURE_BUCKET_ID: 0x19_20_HIDCLASS!DestroyPingPongs+45

BUCKET_ID: 0x19_20_HIDCLASS!DestroyPingPongs+45

Alex_Grig · May 28, 2012, 1:34pm

A buffer overrun (a write past the end of an allocated buffer). It can also be caused by a write into a previously freed buffer.

OSR_Community_User · May 28, 2012, 1:44pm

The important thing to recognize about the bad pool header crash is that
it is reported at the time of DETECTION, not at the time you trashed the
pool. This could be billions of instructions later, and the stack dump is
essentially that of an innocent bystander. Sort of like you removed a
manhole cover and three hours later a pedestrian who is busy texting falls
into the hole. So your driver can do something catastrophic and it will
not be noticed by the computer for weeks or months (of computer time, more
like a couple of seconds of human time). I’ve seen situations where the
damage caused by a defective driver did not show up until a couple of
minutes of real time (millennia in computer time) had gone by. Sort of
like a modern human falling into a mastodon trap.
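
To make the "detection, not cause" point concrete, here is a small user-mode sketch. It is a toy model only: the header layout, the marker value and every `toy_` name are invented for illustration and do not match the real pool format. An unchecked write into block 0 runs past its payload and damages the header of block 1, and nothing notices until someone checks block 1.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: real pool headers are laid out differently. */
#define TOY_MAGIC   0x4C4F4F50u   /* hypothetical "header intact" marker */
#define TOY_BLOCKS  4
#define TOY_PAYLOAD 16

typedef struct {
    uint32_t magic;   /* equals TOY_MAGIC while the header is intact */
    uint32_t size;    /* payload size in bytes */
} ToyHeader;

typedef struct {
    ToyHeader header;
    unsigned char payload[TOY_PAYLOAD];
} ToyBlock;

typedef struct {
    ToyBlock blocks[TOY_BLOCKS];   /* adjacent blocks, like entries in a page */
} ToyPool;

void toy_pool_init(ToyPool *pool) {
    for (size_t i = 0; i < TOY_BLOCKS; i++) {
        pool->blocks[i].header.magic = TOY_MAGIC;
        pool->blocks[i].header.size = TOY_PAYLOAD;
        memset(pool->blocks[i].payload, 0, TOY_PAYLOAD);
    }
}

/* The check a free routine might make: 0 if the header looks sane,
 * -1 if it is corrupt (the point where a real system would bugcheck). */
int toy_check_on_free(const ToyPool *pool, size_t index) {
    const ToyHeader *h = &pool->blocks[index].header;
    return (h->magic == TOY_MAGIC && h->size == TOY_PAYLOAD) ? 0 : -1;
}

/* The buggy writer: fills len bytes starting at the payload of block
 * `index` without checking len against TOY_PAYLOAD. */
void toy_buggy_write(ToyPool *pool, size_t index, size_t len) {
    unsigned char *bytes = (unsigned char *)pool;
    size_t start = offsetof(ToyPool, blocks)
                 + index * sizeof(ToyBlock)
                 + offsetof(ToyBlock, payload);
    memset(bytes + start, 0xAA, len);
}
```

Calling `toy_buggy_write(&pool, 0, TOY_PAYLOAD + 4)` leaves block 0 looking healthy, while a later `toy_check_on_free(&pool, 1)` fails. The code that frees block 1 is the bystander; the writer of block 0 is long gone from the stack.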

There are several potential causes:
Uninitialized pointers on the stack that are used; you inherit whatever
garbage is left there
Uninitialized pointers in heap structures; again, you get whatever random
garbage was there
Continuing to store into a structure you have already freed (dangling
pointer)
Taking a long walk off a short pier–that is, allocating too little memory

So there are several solutions. For example, declare all local pointers
with

WHATEVER * value = NULL;

Don’t worry about “efficiency” here, that is just silly. If you use
optimizations in the release build, and the compiler discovers that your
code makes an assignment before use, it will eliminate the instructions
that assign NULL.
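
One place where NULL initialization pays off is a shared cleanup path. The following is a user-mode sketch, with `malloc`/`free` standing in for the pool routines; `use_two_buffers` and its `fail_early` switch are hypothetical names made up for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: obtains two scratch buffers, then releases both.
 * fail_early simulates an error before the second allocation happens. */
int use_two_buffers(size_t first_len, size_t second_len, int fail_early) {
    char *first = NULL;    /* NULL until an allocation succeeds */
    char *second = NULL;
    int status = -1;

    first = malloc(first_len);
    if (first == NULL || fail_early) {
        goto cleanup;      /* second was never allocated, but it is NULL */
    }

    second = malloc(second_len);
    if (second == NULL) {
        goto cleanup;
    }

    memset(first, 0, first_len);
    memset(second, 0, second_len);
    status = 0;

cleanup:
    /* Safe on every path: free(NULL) does nothing in standard C. */
    free(second);
    free(first);
    return status;
}
```

Without the `= NULL` initializers, the early exit would hand stack garbage to `free`. In kernel code the same shape applies, but it is safer to test each pointer for NULL before calling the free routine rather than assume the routine tolerates NULL.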

Zero out any heap allocations; if you are concerned with efficiency, put
the zeroing under a #ifdef DBG conditional (but be careful–your code may
accidentally work correctly if things are zeroed out but fail in the
presence of nonzero garbage in the release build).

Whenever you free storage, the next statement should set that pointer to
NULL, especially for variables in your device extension.
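
A minimal user-mode sketch of both habits (zero on allocate, NULL on free) might look like this. `DEVICE_EXTENSION_SKETCH` and its fields are hypothetical, and `calloc`/`free` are stand-ins for whichever pool routines the driver really uses.

```c
#include <stdlib.h>

/* Hypothetical stand-in for part of a device extension. */
typedef struct {
    unsigned char *report_buffer;
    size_t report_length;
} DEVICE_EXTENSION_SKETCH;

/* Allocates a zero-filled buffer; returns 0 on success, -1 on failure. */
int allocate_report_buffer(DEVICE_EXTENSION_SKETCH *ext, size_t length) {
    ext->report_buffer = calloc(length, 1);
    if (ext->report_buffer == NULL) {
        ext->report_length = 0;
        return -1;
    }
    ext->report_length = length;
    return 0;
}

/* Frees the buffer and clears the pointer in the very next statement,
 * so a second call (or a late user of the field) sees NULL, not a
 * dangling address. */
void release_report_buffer(DEVICE_EXTENSION_SKETCH *ext) {
    if (ext->report_buffer != NULL) {
        free(ext->report_buffer);
        ext->report_buffer = NULL;
    }
    ext->report_length = 0;
}
```

With this shape, calling `release_report_buffer` twice (say, once on surprise removal and once on remove) is harmless, and any code that forgets to check the pointer faults on NULL immediately instead of quietly scribbling on freed pool.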

Make sure you don’t store via an IRP pointer that has become invalid.
This means you cannot use the pointer after you have put the IRP into a
queue, and after completion.

Dangling pointers are the second-most-frequent cause of bad pool headers.

But far and away, the most frequent cause is writing out of bounds
(remember to always color within the lines!).

This can be as simple as the classic off-by-one error, allocating ten
items and writing using the subscript value 10, or the more subtle one:
PWHATEVER value = (PWHATEVER)ExAllocateOfYourChoice(PoolOfYourChoice,
sizeof(PWHATEVER));
which is particularly nasty, because it allocates only the size of a
pointer rather than the size of the structure (sizeof(WHATEVER)).
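
The size mismatch is easy to see in a small user-mode sketch. The fields of `WHATEVER` here are invented for the illustration, and `calloc` stands in for the pool allocator of your choice.

```c
#include <stdlib.h>

/* Hypothetical structure; the fields are made up for the example. */
typedef struct WHATEVER {
    int  id;
    char name[60];
} WHATEVER, *PWHATEVER;

/* The buggy pattern: the size of a pointer, typically 4 or 8 bytes. */
size_t buggy_allocation_size(void) {
    return sizeof(PWHATEVER);
}

/* The intended size: the whole structure. */
size_t intended_allocation_size(void) {
    return sizeof(WHATEVER);
}

/* Writing sizeof(*value) ties the size to the pointed-to type, so it
 * stays right even if the declaration of value changes later. */
PWHATEVER allocate_whatever(void) {
    PWHATEVER value = calloc(1, sizeof(*value));
    return value;   /* NULL on failure; caller releases with free() */
}
```

With the buggy size, the first store to `value->name` already lands outside the allocation. The `sizeof(*value)` spelling is one common way to avoid the mistake; it is a style suggestion, not something the original post prescribes.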

The Driver Verifier with Special Pool checking enabled usually catches
these errors at the point they occur, sort of like you being arrested for
doing something dangerous when you removed the manhole cover.

Since this only occurs during surprise removal, it is important to check
all the paths of execution for that condition. But always keep in mind
that surprise removal may not be the proximate cause of the trashing, it
may just cause a secondary failure (I once had to track down an error
where the base error trashed a pointer, which caused a legitimate piece of
code that trusted that pointer was correct to store a pointer value which
just happened to land in a pointer field of some other structure, and
therefore caused something else to be trashed…anyway, the damage was
seven levels away from where it was actually detected).
joe

NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

OSR_Community_User · May 28, 2012, 1:48pm

I have found that when I was teaching, and this error occurred, I would
ask “does anyone here know how a storage allocator works?” and would be
greeted by 10-20 blank stares. If you know how a storage allocator works,
the interpretation of this message, and its cause, are obvious. You might
want to go to my MVP Tips site www.flounder.com/mvp_tips.htm and find the
two articles I wrote on storage allocators.

I already gave a detailed answer in another post.
joe

Alex_Grig · May 28, 2012, 5:36pm

Enable Driver Verifier with the “special pool” option for your driver. You’ll diagnose the problem pretty quickly.

OSR_Community_User · May 29, 2012, 1:13pm

Thanks everyone for the quick reply.

Verifier was crucial to determine the error. It was pretty easy to solve!

Thanks,

With my best regards,

Nuno