The NT Insider

Master of the Obvious -- MDLs are Lists that Describe Memory
(By: The NT Insider, Vol 12, Issue 4, September-October 2005 | Published: 16-Nov-05| Modified: 16-Nov-05)

My Latin teacher in high school was a real pain when grading translations on exams. No credit would be given if you used the obvious English word for which the Latin word was the root. So, for example, if you translated the word, "mandatum" to "mandate", zero points. Had to be "order", "instruction" or something else with an equivalent meaning. You basically had to be a walking Latin to English thesaurus in order to make it out of the class alive. Much like Latin itself, this was of course amazingly annoying. However, some good did come from this suffering. For one, it made me realize that I was glad that Latin was a dead language, and for two I'm now overly cautious in my use of circular definitions.

So, why share this anecdote in an article about MDLs? Well, because I'm fairly certain that whoever wrote the DDK documentation for the MDL APIs never took my high school Latin class. Exhibit A will be the entry for MmMapLockedPagesSpecifyCache:

The MmMapLockedPagesSpecifyCache routine maps the physical pages that are described by an MDL, and allows the caller to specify the cache behavior of the mapped memory.

So, according to this, MmMapLockedPagesSpecifyCache maps the pages in the MDL and allows you to specify the cache type...Zero points.

Let's Try to Improve Those Definitions

Based on the number of questions I get about MDL issues, I'm guessing that I'm not the only one who finds the documentation in the DDK on this subject to be woefully inadequate. In this article I hope to demystify several of the common MDL operations such as probing, locking, and mapping. I will also provide insight into several of the common (and some not so common) APIs that transcends just turning the API name into a sentence.

First things first: In order to get anything out of this article you need to understand the basics of virtual memory. There is so much material available on this topic that I won't even attempt to describe the concepts of virtual memory here. If you don't know your ass from a kernel virtual address, read the background articles from past issues of The NT Insider (Windows NT Virtual Memory Part I, Mar-Apr 1998; Windows NT Virtual Memory Part II, Jan-Feb 1999).

Since this is going to be an article about MDLs, let's start with my attempt at a reasonable definition of an MDL:

An MDL is a structure that describes the fixed physical memory locations that comprise a contiguous data buffer in virtual memory.

Here are a few significant facts you should take special notice of in this definition:

Each MDL can only describe a single virtually contiguous data buffer.

The data buffer that the MDL describes can be in either a kernel virtual address space, user virtual address space, or both.

The MDL describes the data buffer at a fixed position in physical memory. In other words, the data buffer an MDL describes will always be paged in (resident), and its pages will be locked-down ("pinned"). This means the data buffer can neither be paged out nor moved. These pages will remain locked for the lifetime of the MDL.

The data buffer that the MDL describes does not need to be page aligned, nor does it need to be an integral number of pages in length.

I know how to properly use the word "comprise" and have not confused it with the word "composed."

The Structure Itself

An article on MDLs wouldn't be complete without a cursory glance at the structure itself. Note that the MDL is documented to be fully opaque, which means you should never dereference its fields directly nor should you make any assumptions about its internal structure. However, as is true with much in the kernel, just because you should not rely on undocumented details in your drivers does not mean that you should not explore in order to help your debugging skills or overall understanding.

Here is the structure definition from the Server 2003 SP1 DDK. Above the structure definition is a great comment that describes a lot of the fields for us beautifully, so I'm just going to paste it in its entirety in Figure 1.

// An MDL describes pages in a virtual buffer in terms
// of physical pages. The pages associated with the
// buffer are described in an array that is allocated
// just after the MDL header structure itself.
//
// One simply calculates the base of the array by
// adding one to the base MDL pointer:
//
//     Pages = (PPFN_NUMBER) (Mdl + 1);
//
// Notice that while in the context of the subject
// thread, the base virtual address of a buffer mapped
// by an MDL may be referenced using the following:
//
//     Mdl->StartVa | Mdl->ByteOffset
//

typedef struct _MDL {
    struct _MDL *Next;
    CSHORT Size;
    CSHORT MdlFlags;
    struct _EPROCESS *Process;
    PVOID MappedSystemVa;
    PVOID StartVa;
    ULONG ByteCount;
    ULONG ByteOffset;
} MDL, *PMDL;

Figure 1 - MDL Structure Definition (and comment)

From this comment you know that the MDL is actually a variable length structure with an array that contains the underlying physical pages of the address range at the tail. Also you know that StartVa is the page aligned virtual address of the start of the range and ByteOffset is the starting offset into the first page.

If you search the DDK headers for more MDL goodness, you'll discover macros that are available for retrieving most of these fields accompanied by comments describing the fields. The last field that I'll mention for now is ByteCount, which tells you the entire length of the address range mapped by the MDL.
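To make the relationship between StartVa, ByteOffset and ByteCount concrete, here is a small user-mode sketch (not kernel code, and not how the Memory Manager does it). The mdl_model type, the helper names and the 4 KB page size are all assumptions made for illustration. In a real driver you would use the DDK macros rather than touch the fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KB pages, as on x86. */
#define MODEL_PAGE_SHIFT 12u
#define MODEL_PAGE_SIZE  ((uintptr_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for the three buffer-related MDL fields. */
typedef struct {
    uintptr_t start_va;    /* page-aligned start of the range (StartVa) */
    uint32_t  byte_offset; /* offset into the first page (ByteOffset)   */
    uint32_t  byte_count;  /* total length in bytes (ByteCount)         */
} mdl_model;

/* Split an arbitrary virtual address and length into the three fields. */
static mdl_model mdl_model_describe(uintptr_t va, uint32_t length)
{
    mdl_model m;
    m.start_va    = va & ~(MODEL_PAGE_SIZE - 1);
    m.byte_offset = (uint32_t)(va & (MODEL_PAGE_SIZE - 1));
    m.byte_count  = length;
    return m;
}

/* Recover the buffer's base address: StartVa | ByteOffset. */
static uintptr_t mdl_model_base_va(const mdl_model *m)
{
    return m->start_va | m->byte_offset;
}

/* Number of pages (and so PFN array entries) the buffer touches. */
static size_t mdl_model_span_pages(const mdl_model *m)
{
    return ((size_t)m->byte_offset + m->byte_count + MODEL_PAGE_SIZE - 1)
           >> MODEL_PAGE_SHIFT;
}
```

Feed this a 32-byte buffer that starts at 0x10FF0 and you get a StartVa of 0x10000, a ByteOffset of 0xFF0 and a span of two pages. The buffer is tiny, but it straddles a page boundary, so it needs two entries in the PFN array.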

Building MDLs

So, let's get to the main point of this article - building and using MDLs.

Step 1: Allocating the Structure

Earlier you learned that the MDL structure is actually of variable length. Therefore, when you allocate the MDL, you do not just allocate a block of pool that is sizeof(MDL) bytes long. Instead, use the IoAllocateMdl DDI:

PMDL
IoAllocateMdl(
    IN PVOID VirtualAddress,
    IN ULONG Length,
    IN BOOLEAN SecondaryBuffer,
    IN BOOLEAN ChargeQuota,
    IN OUT PIRP Irp OPTIONAL
    );

TRIVIA POINT: If your MDL is under a certain size, the I/O Manager will proceed to allocate the MDL from a lookaside list instead of pool. In this case, the MDL_ALLOCATED_FIXED_SIZE bit will be set in the MdlFlags field of the MDL.

Even though the SecondaryBuffer and Irp parameters are hardly ever used, they are worth noting. If you specify an IRP in the last parameter, the I/O Manager will set Irp->MdlAddress to the resulting MDL before returning. If you supply an IRP and specify TRUE to the SecondaryBuffer parameter, the I/O manager assumes there is already an MDL in the IRP and chains the resulting MDL into the existing MDL using the MDL's Next field. Chained MDLs are largely unsupported outside of the networking stack, so they are not worth discussing in great detail in this article.

The important thing to keep in mind about IoAllocateMdl is that its primary function is to allocate the storage for the MDL and fill in the parameters related to the buffer's virtual address in the current mode. The part of the MDL that describes its physical pages can't be built until those pages are locked into physical memory.
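If the "header plus trailing array" layout is hard to picture, the following user-mode sketch models it with calloc. Everything here is hypothetical (the mdl_hdr_model type, the helper names, the choice to zero the PFN slots); it only illustrates why sizeof(MDL) alone is never enough storage, and it makes no claim about what IoAllocateMdl does internally.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t pfn_model;   /* stand-in for PFN_NUMBER */

/* Hypothetical fixed-size header; the PFN array follows it in memory. */
typedef struct mdl_hdr_model {
    struct mdl_hdr_model *next;
    uint16_t size;        /* total bytes: header plus PFN array */
    uint16_t flags;
    size_t   page_count;
} mdl_hdr_model;

/* Bytes needed for a header plus one PFN slot per page. */
static size_t mdl_model_size(size_t page_count)
{
    return sizeof(mdl_hdr_model) + page_count * sizeof(pfn_model);
}

/* Same trick as the DDK comment: the array starts right after the header. */
static pfn_model *mdl_model_pfn_array(mdl_hdr_model *mdl)
{
    return (pfn_model *)(mdl + 1);
}

/* Allocate header and array as one block. The slots are zeroed here only
   because this model has not "locked" anything yet. Returns NULL on failure. */
static mdl_hdr_model *mdl_model_alloc(size_t page_count)
{
    mdl_hdr_model *mdl = calloc(1, mdl_model_size(page_count));
    if (mdl == NULL)
        return NULL;
    mdl->size = (uint16_t)mdl_model_size(page_count);
    mdl->page_count = page_count;
    return mdl;
}
```

The point to take away is that the array has room reserved for it at allocation time, but nothing meaningful goes into it until the pages are pinned.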

Step 2: Probing and Locking

Once you have the MDL appropriately allocated, it is time to probe the buffer it describes to ensure the user has proper access and lock the underlying pages of the buffer into physical memory. This operation is done by the aptly named MmProbeAndLockPages DDI:

VOID
MmProbeAndLockPages (
    __inout PMDL MemoryDescriptorList,
    __in KPROCESSOR_MODE AccessMode,
    __in LOCK_OPERATION Operation
    );

When calling this DDI, specify whether or not this operation is on the behalf of a user (the AccessMode parameter) and the type of access being asked for: IoReadAccess, IoWriteAccess, or IoModifyAccess.

TRIVIA POINT: I'm sure you're asking, "What on Earth is the difference between write access and modify access?" The answer to this is, of course, absolutely nothing. Both flags result in the MDL_WRITE_OPERATION flag being set in the MDL and are treated identically.

Realize that process context is important when calling this DDI on an MDL that describes a buffer within a user virtual address space. One of the purposes of this function is to bring in the pages described by the virtual address range passed to IoAllocateMdl and lock them into memory. This is done by translating the virtual addresses that make up the range into the individual PFN entries and referencing them. If this occurs in a process context other than the original, the page tables will be different and the pages brought in may not be what you expected.

MmProbeAndLockPages is one of a select few DDIs that throws an exception on error. As such, any calls to this function must be wrapped in a structured exception handling block.
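Steps 1 and 2 together might look something like the kernel-mode sketch below. LockUserBuffer is a hypothetical helper name, and the choice of UserMode and IoWriteAccess assumes a buffer that came from a user-mode requestor and that the driver intends to write into. It needs the DDK to build and must run in the context of the process that owns the buffer.

```c
#include <ntddk.h>

/* Hypothetical helper: describe and pin a user buffer.
   Returns NULL on failure; *Status receives the reason. */
PMDL
LockUserBuffer(
    IN PVOID UserBuffer,
    IN ULONG Length,
    OUT NTSTATUS *Status
    )
{
    PMDL mdl;

    /* Step 1: allocate the MDL. No IRP, not a secondary buffer. */
    mdl = IoAllocateMdl(UserBuffer, Length, FALSE, FALSE, NULL);
    if (mdl == NULL) {
        *Status = STATUS_INSUFFICIENT_RESOURCES;
        return NULL;
    }

    /* Step 2: probe and lock. Failure is reported by raising an exception. */
    __try {
        MmProbeAndLockPages(mdl, UserMode, IoWriteAccess);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        *Status = GetExceptionCode();
        IoFreeMdl(mdl);   /* nothing was locked, so only the allocation is undone */
        return NULL;
    }

    *Status = STATUS_SUCCESS;
    return mdl;
}
```

Note that the exception handler frees the MDL but does not unlock anything. If the probe or lock fails, there are no page references to drop.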

Step 3: Mapping

This stage is potentially an optional stage. However, for now let's assume your driver wants a new virtual address for accessing the underlying pages.

At this point you have an MDL that is set up for the proper access and that has the physical pages that back it pinned into memory. The data buffer described by the MDL is only mapped into the address space (or spaces) in which it originally resided. If you want to map the buffer that the MDL describes into another virtual address range, use the MmMapLockedPagesSpecifyCache DDI:

PVOID
MmMapLockedPagesSpecifyCache (
    __in PMDL MemoryDescriptorList,
    __in KPROCESSOR_MODE AccessMode,
    __in MEMORY_CACHING_TYPE CacheType,
    __in_opt PVOID RequestedAddress,
    __in ULONG BugCheckOnFailure,
    __in MM_PAGE_PRIORITY Priority
    );

The most common use for this DDI is to map a data buffer into the non-process dependent part of the kernel virtual address space. This allows the data in the data buffer to be accessed by your driver in an arbitrary process context such as within the driver's DPC. To map the buffer described by the MDL into kernel virtual address space, specify KernelMode as the AccessMode parameter when you call this function. No need to worry about page faults when you access the buffer, right? Its physical pages are locked!

Alternatively, a particularly interesting way to use MmMapLockedPagesSpecifyCache is to map the buffer specified by an MDL into the context of a user process. Note that because the MDL describes a set of fixed physical pages, the MDL can be referred to in any process context. To map the data buffer described by the MDL into the user virtual address space of the current process, call MmMapLockedPagesSpecifyCache and specify UserMode as the AccessMode parameter.

Assuming you want a kernel virtual address for this MDL (the normal case), you can fight off carpal tunnel for a few more days by using the MmGetSystemAddressForMdlSafe macro shown in Figure 2.

#define MmGetSystemAddressForMdlSafe(MDL, PRIORITY)             \
    (((MDL)->MdlFlags & (MDL_MAPPED_TO_SYSTEM_VA |              \
                         MDL_SOURCE_IS_NONPAGED_POOL)) ?        \
         ((MDL)->MappedSystemVa) :                              \
         (MmMapLockedPagesSpecifyCache((MDL),                   \
                                       KernelMode,              \
                                       MmCached,                \
                                       NULL,                    \
                                       FALSE,                   \
                                       (PRIORITY))))

Figure 2 - MmGetSystemAddressForMdlSafe Macro

TRIVIA POINT: After a successful call to MmMapLockedPagesSpecifyCache with AccessMode set to KernelMode, MDL_MAPPED_TO_SYSTEM_VA is set in the MdlFlags field and the MappedSystemVa field is set to the base of the resulting mapping. Note that the end result of this is that only the first call to MmGetSystemAddressForMdlSafe actually builds a kernel virtual address, while subsequent calls simply return the same value.
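The "map once, reuse afterwards" behavior is easy to see in a toy user-mode model of the macro's decision. The types, function names and flag definitions below are local to this sketch and purely illustrative. The new_va parameter stands in for whatever address the Memory Manager would pick.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits; the names mirror the DDK flags but these are
   definitions local to this model. */
#define MODEL_MAPPED_TO_SYSTEM_VA      0x0001u
#define MODEL_SOURCE_IS_NONPAGED_POOL  0x0004u

typedef struct {
    uint16_t flags;
    void    *mapped_system_va;
} mdl_map_model;

/* Counts how many times the model actually builds a mapping. */
static int model_map_calls = 0;

/* Stand-in for the KernelMode case of MmMapLockedPagesSpecifyCache:
   record the new address and mark the MDL as mapped. */
static void *model_map_locked_pages(mdl_map_model *mdl, void *new_va)
{
    model_map_calls++;
    mdl->mapped_system_va = new_va;
    mdl->flags |= MODEL_MAPPED_TO_SYSTEM_VA;
    return new_va;
}

/* Same decision the macro makes: reuse the address if one already exists. */
static void *model_get_system_address(mdl_map_model *mdl, void *new_va)
{
    if (mdl->flags & (MODEL_MAPPED_TO_SYSTEM_VA | MODEL_SOURCE_IS_NONPAGED_POOL))
        return mdl->mapped_system_va;
    return model_map_locked_pages(mdl, new_va);
}
```

Call model_get_system_address twice on the same model MDL and the second call hands back the first address without mapping anything. One thing the model leaves out: the real macro passes FALSE for BugCheckOnFailure, so a failed mapping comes back as NULL and your driver has to check for it.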

Practical Use

Imagine that your driver wants to read some data from the disk and you decided the best way to do this is by sending an IRP that you built. Since the disk driver is a METHOD_DIRECT driver, you need an MDL for the buffer. In this situation you know at least three things:

You're about to write new code so it will be awful and probably corrupt your disk. With much trepidation, you must trust that the mere fact that it compiled is a miracle and therefore it will work through divine intervention.

It's your driver, so you know where the memory came from and whether or not it is pageable.

You know the address is context independent.

In this case, you're about to put on your I/O manager hat and send another driver an MDL that describes the buffer. It is worth mentioning what state the I/O manager guarantees an MDL to be in before sending it to a METHOD_DIRECT driver.

When a driver specifies METHOD_DIRECT, the I/O manager guarantees that the driver will receive an MDL that describes the full virtual address range specified by the original requestor and that the MDL has been probed and locked. Note that the MDL does not need to be mapped, just probed and locked. So, if you wish to play I/O manager, the MDL that you supply to the disk driver must be of adequate size to handle the request and must be probed and locked.

In the case where you're sending a data buffer to the disk driver that has come from pageable pool, then you must use MmProbeAndLockPages in order to properly pull the physical pages in and pin them into memory. Just as before, passing KernelMode allows you to skip much of the probing, but the locking is unavoidable.

What about the case when the data buffer comes from non-paged pool? Clearly you have access to the buffer and do not need to probe it. Also, the physical pages that back the allocation are already pinned in memory. So, while MmProbeAndLockPages wouldn't be wrong, it is unnecessary work that would need to be undone later. Therefore, you have MmBuildMdlForNonPagedPool to the rescue.

MmBuildMdlForNonPagedPool simply builds the PFN array at the tail of the MDL structure and returns. There is no need to do any extra access checking or referencing of pages since the pages are assumed to be accessible and locked in memory. Note that this operation eliminates the need to call MmProbeAndLockPages.

TRIVIA POINT: MmBuildMdlForNonPagedPool sets the MappedSystemVa of the MDL to (Mdl->StartVa | Mdl->ByteOffset) and then sets the MDL_SOURCE_IS_NONPAGED_POOL bit in the MdlFlags. Looking back at MmGetSystemAddressForMdlSafe, this means that after a call to MmBuildMdlForNonPagedPool, an attempt to get a non-pageable system address simply returns the original non-paged address passed to IoAllocateMdl.
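For the non-paged pool case, a kernel-mode sketch could be as short as the one below. AllocateDescribedBuffer and EXAMPLE_TAG are hypothetical names invented for this example, and it needs the DDK to build.

```c
#include <ntddk.h>

#define EXAMPLE_TAG 'xMdl'   /* hypothetical pool tag */

/* Hypothetical helper: allocate a non-paged buffer and an MDL describing it. */
NTSTATUS
AllocateDescribedBuffer(
    IN ULONG Length,
    OUT PVOID *Buffer,
    OUT PMDL *Mdl
    )
{
    PVOID buffer;
    PMDL mdl;

    buffer = ExAllocatePoolWithTag(NonPagedPool, Length, EXAMPLE_TAG);
    if (buffer == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    mdl = IoAllocateMdl(buffer, Length, FALSE, FALSE, NULL);
    if (mdl == NULL) {
        ExFreePoolWithTag(buffer, EXAMPLE_TAG);
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    /* Fill in the PFN array. No probe, no lock, no exception handling. */
    MmBuildMdlForNonPagedPool(mdl);

    *Buffer = buffer;
    *Mdl = mdl;
    return STATUS_SUCCESS;
}
```

Compared with the probe-and-lock path there is no structured exception handling block and nothing that has to be unlocked later.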

Operation MDL Cleanup

So far this article has spent a lot of time talking about building MDLs for different situations, but has not addressed the issues of cleaning up after MDLs. This discussion has been left to the end of the article for a good reason. It seems to me that MDLs have this mystical air about them that is completely unwarranted. Once you understand what the DDIs used to build them actually do, a discussion about tearing them down should be fairly simple.

Let's look at the DDIs that have been used so far and list the "undo" DDIs for each operation.

IoAllocateMdl

Whenever you allocate an MDL with IoAllocateMdl, at some point you must call IoFreeMdl.

MmProbeAndLockPages

For every call to MmProbeAndLockPages, there must be a call to MmUnlockPages. When you think of MmProbeAndLockPages as taking out a reference to the underlying pages, then the idea that there must always be a matching MmUnlockPages to drop the reference makes sense.

MmBuildMdlForNonPagedPool

Because this DDI doesn't need to take out any references to the underlying pages in order to make them resident, no undo operation is required.

MmMapLockedPagesSpecifyCache

The purpose of this DDI is to build new virtual addresses that translate to the underlying physical pages described by the MDL. These translations, known better as Page Table Entries (PTEs), are a resource within the O/S that must be returned when the driver is done using them. The undo operation for this DDI has the (somewhat unfortunate) name of MmPrepareMdlForReuse. Note that if you are not familiar with calling this routine after (for example) a call to MmGetSystemAddressForMdlSafe, it is because IoFreeMdl makes this call for you before returning the MDL to pool. When you need to release a mapping explicitly, as you do for a mapping created with UserMode, the routine to call is MmUnmapLockedPages.
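Put together, the pairings in this section come down to two teardown paths, sketched below in kernel-mode C. The two helper names are hypothetical, and each assumes the MDL was built the way its comment says.

```c
#include <ntddk.h>

/* Hypothetical helper for an MDL built with
   IoAllocateMdl + MmProbeAndLockPages. */
VOID
ReleaseLockedMdl(
    IN PMDL Mdl
    )
{
    MmUnlockPages(Mdl);   /* drop the page references taken by the lock */
    IoFreeMdl(Mdl);       /* return the MDL storage */
}

/* Hypothetical helper for an MDL built with
   IoAllocateMdl + MmBuildMdlForNonPagedPool. */
VOID
ReleaseNonPagedMdl(
    IN PMDL Mdl
    )
{
    IoFreeMdl(Mdl);       /* no lock was taken, so there is nothing to unlock */
}
```

Each teardown mirrors the build in reverse order: whatever was done last gets undone first.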

That's Enough...For Now

One of the great things about working in the kernel is that no matter what bit of information you choose to explore, there's always something more to learn. MDLs are no exception. Hopefully this article has led you to a better understanding of common operations that will result in far less painful development and debugging experiences in the future.

MDLs Sidebar

You call either MmProbeAndLockPages or MmBuildMdlForNonPagedPool - but not both. When used appropriately, either operation is sufficient to fully build an MDL that is ready to be mapped into a virtual address space or given to the HAL for logical translation for DMA. So, please, let's not see any more code that does this:

MmBuildMdlForNonPagedPool(mdl);

__try {
    MmProbeAndLockPages(mdl,
                        KernelMode,
                        IoWriteAccess);
} __except (EXCEPTION_EXECUTE_HANDLER) {
}

Before writing code that blindly uses every MDL DDI under the sun, think about what they mean, what they're doing, and when they're appropriate.

This article was printed from OSR Online http://www.osronline.com