
Bugchecks Explained: PAGE_FAULT_IN_NONPAGED_AREA

What Happened?

To understand this bugcheck code, it’s first necessary to understand what a "page fault" is. If you’re not completely sure you understand this concept, read the article "So, Exactly What Is A Page Fault?" here at OSR Online.

The Windows Memory Manager reserves pre-defined ranges of kernel virtual address space for specific uses. Because the Windows operating system utilizes virtual memory, the Memory Manager does not necessarily assign physical memory to every possible kernel virtual address within its pre-defined ranges. The Memory Manager knows that some of its kernel virtual address ranges are used for pageable memory, and other ranges are used for non-pageable memory. For example, the kernel virtual address space that is reserved for use by the non-paged pool is (obviously) part of one of the Memory Manager’s non-pageable address spaces.

Whenever the Memory Manager detects a page fault (that is, a failure to translate a kernel virtual address to a physical address) in one of its pre-assigned address ranges in which the memory is supposed to be non-pageable, it halts system execution with a PAGE_FAULT_IN_NONPAGED_AREA bugcheck.

The only thing that can cause one of these page faults is an inadvertent reference by a kernel mode component to an invalid memory address that just happens to correspond to one of the Memory Manager’s pre-assigned non-pageable address ranges. The most common reason for this bugcheck is a driver de-referencing a bad pointer.

Almost any bug can end in an invalid memory access, so tracking down these bugchecks can be particularly difficult. The most common causes are buffer overruns and underruns, or access to a completely bogus address.

Who Did It?

When analyzing these crash dumps, it is either immediately obvious who caused the problem or it can take some serious detective work. The bugcheck parameters for this particular code are the invalid address that was accessed, whether the access was a read or a write, and the address of the instruction that caused the invalid access. Here are some things to think about when analyzing these dumps:

1) Why is the address bad?

a) Was it previously freed? The !pool WinDBG command can be helpful in determining this.

b) Is this potentially a buffer underrun? A buffer overrun? To determine this, you will need to look at how the address is being used. If, for example, the address is being used in a copy operation, a buffer overrun is a reasonable working assumption (just don’t forget that it might not be the right one!).

c) Is the address just completely bogus? The !pool command is also useful here, as is the !pte command.

2) Where did the address being accessed come from?

3) At which point did the address become bad? Was it previously used successfully by another component?

Using the information gathered from the above steps you can usually begin to get a better idea as to where things went wrong.
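The bugcheck parameters and the underrun/overrun question from step 1b can be sketched in plain user-mode C. This is only an illustration, assuming 32-bit addresses as in the example later in this article; bugcheck50_t, access_name and classify_access are hypothetical names made up for this sketch, not part of the DDK or WinDBG.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three meaningful bugcheck parameters. */
typedef struct {
    uint32_t address;      /* parameter 1: the invalid address referenced */
    uint32_t is_write;     /* parameter 2: 0 = read, 1 = write */
    uint32_t instruction;  /* parameter 3: faulting instruction, or 0 */
} bugcheck50_t;

typedef enum { ACCESS_INSIDE, ACCESS_UNDERRUN, ACCESS_OVERRUN } access_kind_t;

/* Name the kind of access recorded in parameter 2. */
const char *access_name(const bugcheck50_t *bc)
{
    return bc->is_write ? "write" : "read";
}

/* Classify a faulting address against one allocation [start, start + size).
   An address below the allocation suggests an underrun; an address at or
   past its end suggests an overrun. */
access_kind_t classify_access(uint32_t fault, uint32_t start, uint32_t size)
{
    assert(size != 0);
    assert(start <= UINT32_MAX - size);   /* the allocation must not wrap */
    if (fault < start)
        return ACCESS_UNDERRUN;
    if (fault - start >= size)
        return ACCESS_OVERRUN;
    return ACCESS_INSIDE;
}
```

In practice the allocation bounds would come from !pool output for a nearby address; a fault that lands exactly on the first byte past an allocation is a strong hint of an overrun.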

How Should I Fix It?

Using Driver Verifier and the checked build of Windows should allow you to better pinpoint the offending driver in the system. If the driver is not a driver that you have any control over, the only available option is disabling the driver until a fixed version is available.

Related WinDBG Commands

· !pte

· !pool

Related Windows O/S Variables

· nt!MmPagedPoolStart

· nt!MmPagedPoolEnd

· nt!MmNonPagedPoolStart

· nt!MmNonPagedPoolEnd

An Example

Here’s an example that puts the above guidelines to use and tracks down a misbehaving driver. For clarity, the WinDBG output in this example has been stripped down to the parts important to our discussion.

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced. This cannot be protected by try-except,
it must be protected by a Probe. Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: ff8b6000, memory referenced.
Arg2: 00000000, value 0 = read operation, 1 = write operation.
Arg3: 804238fd, If non-zero, the instruction address which referenced the bad memory address.
Arg4: 00000000, (reserved)
...
READ_ADDRESS: ff8b6000 Nonpaged pool
FAULTING_IP:
nt!IopCompleteRequest+ab
804238fd f3a5 rep movsd
...
STACK_TEXT:
bed4ec7c 804b06e7 811bba08 bed4ecc4 bed4ecb8 nt!IopCompleteRequest+0xab
bed4eca4 804ac360 8143e4d0 80000005 81158f88 nt!IopSynchronousServiceTail+0x8f
bed4ed48 80466389 0000084c 0155f8c8 0155f8b0 nt!NtQueryVolumeInformationFile+0x320
bed4ed48 77f8e593 0000084c 0155f8c8 0155f8b0 nt!KiSystemService+0xc9
0155f870 767ebb9f 0000084c 0155f8c8 0155f8b0 ntdll!ZwQueryVolumeInformationFile+0xb

OK, so our system bugchecked because we tried to read address 0xFF8B6000, presumably while trying to complete an IRP_MJ_QUERY_VOLUME_INFORMATION IRP. The address looks reasonable and the bugcheck info is telling me that the address is in the nonpaged address space, so let’s see what the debugger says about the address:

0: kd> !pool ff8b6000
ff8b6000: Unable to get contents of pool block

That wasn’t much help, but because the !pool command didn’t tell me that the address had been freed, I’m going to assume that we’re not dealing with a memory access to freed pool. This may be a completely invalid assumption, but it allows me to move on for the moment.

0: kd> !pte ff8b6000
FF8B6000 - PDE at C0300FF8        PTE at C03FE2D8
           contains 01036963      contains 7F8BD000
           pfn 1036 G-DA--KWV     not valid
                                  PageFile 0
                                  Offset 7f8bd
                                  Protect: 0

The page table entry for the nonpaged address that we accessed is invalid, which is why the system bugchecked. The faulting IP from the bugcheck info is a rep movsd instruction, which is a copy instruction on the x86. So, I’m going to assume for the time being that this bugcheck occurred because of a buffer overrun. With that info in hand, I can move on to step two and figure out where the address came from.
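The numbers in the !pte output can be reproduced by hand. The sketch below assumes a non-PAE 32-bit x86 system with 4 KB pages, page tables mapped at 0xC0000000 and the page directory at 0xC0300000, and an invalid (software) PTE whose bit 0 is the valid bit, bits 1-4 the page file number, bits 5-9 the protection, and bits 12-31 the page file offset. That layout is an assumption chosen because it matches the output above; the function and type names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed self-map bases for non-PAE x86 (see the lead-in). */
#define PTE_BASE 0xC0000000u
#define PDE_BASE 0xC0300000u

/* Each PTE covers one 4 KB page and each PDE covers 4 MB; entries are 4 bytes. */
uint32_t pte_address(uint32_t va) { return PTE_BASE + ((va >> 12) << 2); }
uint32_t pde_address(uint32_t va) { return PDE_BASE + ((va >> 22) << 2); }

/* Fields of a PTE, decoded under the assumed software PTE layout. */
typedef struct {
    int      valid;     /* bit 0: 1 = hardware-valid entry */
    uint32_t pagefile;  /* bits 1-4, meaningful only when valid == 0 */
    uint32_t protect;   /* bits 5-9, meaningful only when valid == 0 */
    uint32_t offset;    /* bits 12-31: page file offset (or the PFN if valid) */
} pte_fields_t;

pte_fields_t decode_pte(uint32_t pte)
{
    pte_fields_t f;
    f.valid    = (int)(pte & 1u);
    f.pagefile = (pte >> 1) & 0xFu;
    f.protect  = (pte >> 5) & 0x1Fu;
    f.offset   = pte >> 12;
    return f;
}
```

With these assumptions, pte_address(0xFF8B6000) and pde_address(0xFF8B6000) give the PTE and PDE addresses that !pte printed, and decoding 0x7F8BD000 yields a cleared valid bit with page file 0 and offset 0x7f8bd.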

Looking in the DDK documentation, I see that IRP_MJ_QUERY_VOLUME_INFORMATION IRPs all use METHOD_BUFFERED. Therefore, when I find the IRP that is being completed here, its data buffer is going to be at Irp->AssociatedIrp.SystemBuffer. Now, unfortunately, the two most recent calls on the call stack that I’ve been given (nt!IopCompleteRequest and nt!IopSynchronousServiceTail) aren’t documented. This means that I have no idea what their parameters are and so I have no idea where to find the IRP. Because of that, I have to find the IRP the hard way. Dumping all of the memory contents starting at the most recent frame’s EBP (0xBED4EC7C) and executing the !irp command on anything that looks like an IRP eventually leads to success:

0: kd> !irp 811bb9c8
Irp is active with 1 stacks 3 is current (= 0x811bba80)
No Mdl System buffer = ff8b5fe8 Thread 813ad980: Irp is completed.
cmd flg cl Device File Completion-Context
[ a, 0] 0 0 8143e4d0 00000000 00000000-00000000
\Driver\
Args: 00000000 00000000 00000000 00000000

That system buffer address looks awfully suspicious given the address that generated the blue screen (it sits just 0x18 bytes below it), so let’s see what we can find out about it:

0: kd> !pool ff8b5fe8
...
*ff8b5fe0 size: 20 previous size: 20 (Allocated) Process: 81457020
0: kd> ? ff8b5fe0+0x20
Evaluate expression: -7643136 = ff8b6000

I can see here that the allocation that the system buffer address lies in is valid from 0xFF8B5FE0 up to but not including 0xFF8B6000, the address that killed us. Taking a look at the completion status of the IRP:

0: kd> dt nt!_IRP 811bb9c8 -r
...
+0x018 IoStatus :
   +0x000 Status : 0x80000005
   +0x000 Pointer : 0x80000005
   +0x004 Information : 0x1c
...

Aha! The offending driver in this case has returned STATUS_BUFFER_OVERFLOW (0x80000005) with the Information field set to the number of bytes needed to complete this request. Unfortunately, what this driver writer didn’t realize was that STATUS_BUFFER_OVERFLOW is a warning status, not an error, so the I/O Manager will go ahead and copy the number of bytes specified by Information out of the system buffer and into the user’s buffer (remember, these IRPs are all METHOD_BUFFERED). Note that these are not the same semantics that you get when returning the error code STATUS_BUFFER_TOO_SMALL, which is what the developer meant to return. If a driver completes an IRP with that code, the user is returned the number of bytes needed to complete the request but no data is copied.
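The difference between the two status codes comes down to the severity encoded in the top two bits of an NTSTATUS value. The sketch below is a user-mode model of that idea, not kernel code: status_t, the ST_ constants, and both functions are hypothetical stand-ins, and the copy rule is a simplification of the I/O Manager behavior described above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t status_t;              /* stand-in for NTSTATUS, unsigned for bit tests */

#define ST_SUCCESS          0x00000000u
#define ST_BUFFER_OVERFLOW  0x80000005u /* value of STATUS_BUFFER_OVERFLOW */
#define ST_BUFFER_TOO_SMALL 0xC0000023u /* value of STATUS_BUFFER_TOO_SMALL */

/* Severity lives in bits 30-31: 0 success, 1 informational, 2 warning, 3 error. */
unsigned severity(status_t s) { return (unsigned)(s >> 30); }
int is_warning(status_t s)    { return severity(s) == 2u; }
int is_error(status_t s)      { return severity(s) == 3u; }

/* Simplified model of buffered completion: data is copied back to the
   caller for any status that is not an error. */
uint32_t bytes_copied_on_completion(status_t status, uint32_t information)
{
    return is_error(status) ? 0u : information;
}

typedef struct {
    status_t status;
    uint32_t information;
} completion_t;

/* Hypothetical handler logic: never claim more bytes than the buffer holds
   unless the status is an error that suppresses the copy. */
completion_t complete_query(uint32_t buffer_len, uint32_t needed)
{
    completion_t c;
    assert(needed != 0u);
    c.information = needed;
    c.status = (needed <= buffer_len) ? ST_SUCCESS : ST_BUFFER_TOO_SMALL;
    return c;
}
```

Under this model, completing with the warning status and Information set to 0x1c asks for a 0x1c-byte copy, while completing with the error status reports the required size and copies nothing. A driver that does want to return STATUS_BUFFER_OVERFLOW would instead need to set Information to the number of bytes it actually placed in the buffer.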

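So, because of this mishap, when the IRP was completed with IoCompleteRequest the I/O Manager attempted to copy 0x1C bytes of data starting at address 0xFF8B5FE8 into the user’s buffer. That copy runs through 0xFF8B6003, four bytes past the end of the allocation at 0xFF8B6000, so the read overran the system buffer, touched an invalid page, and produced the PAGE_FAULT_IN_NONPAGED_AREA bugcheck.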
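The arithmetic behind that conclusion is easy to check. The following sketch uses the values from the !pool and !irp output in this example; overrun_bytes and first_bad_address are hypothetical helper names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes by which a copy of `length` bytes starting at `buffer`
   runs past the end of the allocation [block, block + block_size).
   Returns 0 when the copy fits. 64-bit sums avoid 32-bit wraparound. */
uint32_t overrun_bytes(uint32_t block, uint32_t block_size,
                       uint32_t buffer, uint32_t length)
{
    uint64_t block_end = (uint64_t)block + block_size;
    uint64_t copy_end  = (uint64_t)buffer + length;
    assert(buffer >= block && buffer <= block_end);  /* buffer lies in the block */
    return copy_end > block_end ? (uint32_t)(copy_end - block_end) : 0u;
}

/* First address past the allocation: the first byte an overrunning,
   forward-moving copy touches outside the block. */
uint32_t first_bad_address(uint32_t block, uint32_t block_size)
{
    return block + block_size;
}
```

For the block at 0xFF8B5FE0 of size 0x20, a 0x1C-byte copy from 0xFF8B5FE8 overruns by 4 bytes, and the first address past the block is 0xFF8B6000, the address reported in the bugcheck.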

User Comments

"Thank you"
It's very good! Thank you, I have learned much.

22-Aug-11, ming li


"this article is really helpful"
This article is really helpful for driver developers. We hope you proceed to post similar kind of bugcheck explanations.

-Srilatha

25-Sep-06, Srilatha Bala


"Very helpful."
Excellent bug trace.

31-Mar-05, Peter Trinh


"RE: How do I lock the driver code to Non paged pool ?"
Hi,

By default, all driver code is non paged. The only way driver code becomes pageable is if it is explicitly marked as pageable via a #pragma or it is paged out using the MmPageEntireDriver DDI. This is all discussed under "Making Drivers Pageable" in the DDK (note that that is the name of the section in the 3790 DDK, it may be different in earlier DDKs).

-scott

13-Sep-04, Scott Noone


"How do I lock the driver code to Non paged pool ?"
How do I ensure that code executed at IRQL above DISPATCH_LEVEL does not page fault? How can I lock that part of the code to non-paged pool?

12-Sep-04, Mohamed Husain


"100%"
Excellent article, no loose ends. Keep it up!

29-Aug-04, Erwin Zoer


"PAGE_FAULT_IN_NONPAGED_AREA"
(1) Again an excellent article. I really like the use of the examples of using Windbg to analyze the BugCheck. You get to learn debugging techniques and they are very very good. Thanks much!!!

25-Aug-04, William Jones


"Bugchecks Explained: PAGE_FAULT_IN_NONPAGED_AREA"
Great article ! I'm not a driver addict, I just like to program using the DDK, and I'm trying to write a little storage class "driver".

Nevertheless, articles like these really help in understanding the way the operating system works.

Thanks a lot!

25-Aug-04, David Landelle

