At one time or another, most driver writers will need to share memory between a driver and a user-mode program. As with most such things, there are many ways to accomplish this goal. Some of these approaches are right and some are decidedly wrong. Two of the easiest techniques are:
The application sends an IOCTL to the driver, providing a pointer to the memory that the driver and the application thereafter share.
The driver allocates pages of memory, maps those pages into the address space of a specific user-mode process, and returns the address to the application.
For the sake of brevity, we'll restrict our discussion to these two straightforward techniques. Other perfectly acceptable techniques include sharing a named section that's backed by either the paging file or a memory mapped file. Perhaps we'll discuss those in a future article. Also, note that this article won't specifically address sharing memory that's resident on a device. While many of the concepts are the same, sharing device memory with a user-mode program brings with it its own set of special challenges.
Sharing Buffers Using IOCTLs
Sharing memory between a driver and a user-mode app using a buffer described with an IOCTL is the simplest form of "memory sharing". After all, it's identical to the way drivers support other, more typical, I/O requests. The base address and length of the buffer to be shared are specified by the application in the lpOutBuffer and nOutBufferSize parameters of a call to the Win32 function DeviceIoControl(). The only interesting decision for the driver writer who uses this method of buffer sharing is which buffer method to specify for the IOCTL. Either METHOD_xxx_DIRECT or METHOD_NEITHER will work.
If METHOD_xxx_DIRECT is used, the I/O Manager checks the user buffer for the correct access and, if the check succeeds, locks the buffer into memory and describes it with an MDL. The driver will also need to call MmGetSystemAddressForMdlSafe to map the described data buffer into kernel virtual address space. An advantage of this method is that the driver can access the shared memory buffer from an arbitrary process context, and at any IRQL. Use METHOD_IN_DIRECT if data is only being passed in to the driver. Use METHOD_OUT_DIRECT if data is being returned from the driver to the application or if data is being exchanged in both directions.
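As a concrete illustration of that choice, the control codes for the two transfer types might be defined as in the minimal sketch below. The IOCTL names and function numbers are hypothetical. The fallback definitions mirror the values in the Windows headers and are there only so that the fragment also compiles without those headers.

```c
/* On Windows these macros come from <winioctl.h> (user mode) or the WDK
   headers. The fallback definitions below use the same values so that the
   sketch builds on its own. */
#ifndef CTL_CODE
#define FILE_DEVICE_UNKNOWN 0x00000022
#define METHOD_BUFFERED     0
#define METHOD_IN_DIRECT    1
#define METHOD_OUT_DIRECT   2
#define METHOD_NEITHER      3
#define FILE_ANY_ACCESS     0
#define CTL_CODE(DeviceType, Function, Method, Access) \
    (((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method))
#endif

/* Hypothetical control codes for a buffer-sharing driver. Function numbers
   0x800 and above are the range available for vendor-defined IOCTLs. */
#define IOCTL_SHAREMEM_DIRECT \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)
#define IOCTL_SHAREMEM_NEITHER \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x801, METHOD_NEITHER, FILE_ANY_ACCESS)

/* The transfer type occupies the low two bits of a control code, which is
   how the I/O Manager knows how to describe the buffers for the driver. */
static unsigned ioctl_method(unsigned long code)
{
    return (unsigned)(code & 3u);
}
```

Both the driver and the application would include the same definitions, typically from a shared header, so that the two sides always agree on the transfer type.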
There are a number of restrictions and caveats inherent in using METHOD_NEITHER to describe a shared memory buffer. Basically, these are the same ones that apply any time a driver uses this method. Chief among these is the rule that the driver must only access the buffer in the context of the requesting process. This is because access to the shared buffer is via the buffer's user virtual address. This will almost always mean that the driver must be at the top of the device stack, called directly by the user application via the I/O Manager - there can be no intermediate or file system drivers layered above the driver. Practically speaking, a WDM driver will typically be restricted to accessing the user buffer from within its dispatch routine, and a KMDF driver will need to use its EvtIoInCallerContext event processing callback.
Another important restriction inherent in using METHOD_NEITHER is that access by the driver to the user buffer must always be at IRQL PASSIVE_LEVEL. This is because the I/O Manager hasn't locked the user buffer in memory, and it could be paged out when accessed by the driver. If the driver can't meet this requirement, it will need to build an MDL and then lock the buffer in memory.
Another, perhaps less immediately obvious restriction to this method, regardless of the transfer type chosen, is that the memory to be shared must be allocated by the user mode application. The amount of memory that can be allocated can be restricted, for example, due to quota limitations. Additionally, user applications cannot allocate physically contiguous or non-cached memory. Still, if all a driver and a user mode application need to do is pass data back and forth using a reasonably-sized data buffer, this technique can be both easy and useful.
As easy as it is, using IOCTLs to share memory between a driver and a user-mode application is also one of the most frequently misused schemes. One common mistake new Windows driver devs make when using this scheme is that they complete the IOCTL sent by the application after having retrieved the buffer address from it. This is a very bad thing. Why? What happens if the user application suddenly exits, for example, due to an exception? With no I/O operation in progress to track the reference on the user buffer, the driver could unintentionally overwrite a random chunk of memory. Another problem is that when using METHOD_xxx_DIRECT, if the IRP with the MDL is completed, the buffer will no longer be mapped into system address space. An attempt to access the previously valid kernel virtual address (obtained using MmGetSystemAddressForMdlSafe) will crash the system. This is generally to be avoided.
One solution to this problem is for the application to open the device with FILE_FLAG_OVERLAPPED and issue the IOCTL using an OVERLAPPED structure. A driver can then set a cancel routine on the IRP (using IoSetCancelRoutine), mark the IRP pending (using IoMarkIrpPending), and queue the IRP internally before returning STATUS_PENDING to the caller. A KMDF driver is, of course, relieved from having to do these sorts of machinations and just needs to keep the request in progress and cancelable, such as on a WDFQUEUE.
Using this approach has two advantages:
The application will be notified that the buffer is mapped when it receives ERROR_IO_PENDING back from the IOCTL call, and will know that the buffer has been unmapped when the IOCTL finally completes.
The driver, via its cancel routine (WDM) or an EvtIoCanceledOnQueue event processing callback (KMDF), can be notified when the application exits or cancels the I/O so that it can perform the operations necessary to complete the IOCTL and thus have the MDL allocation for the operation unmapped.
Allocating and Mapping Pages
That leaves us with the second scheme mentioned above: Allocating pages of memory and mapping the pages into the user virtual address space of a specified process. This scheme is surprisingly easy, uses APIs familiar to most Windows driver writers, and yet allows the driver to retain maximum control of the type of memory being allocated.
The driver uses whatever standard method it desires to allocate the memory to be shared. For example, if the driver needs a device (logical) address appropriate for DMA, as well as a kernel virtual address for the memory block, it could allocate the memory using AllocateCommonBuffer. If no special memory characteristics are required and the amount of memory to be shared is modest, the driver can allocate the buffer from zero-filled, non-paged physical memory pages.
Allocating zero-filled, non-paged pages from main memory is done using MmAllocatePagesForMdl or MmAllocatePagesForMdlEx (Windows Server 2003 SP1 and later). Either function returns an MDL that describes the memory allocation. Because the returned MDL can describe fewer bytes than were requested, the driver should check the size of the allocation with MmGetMdlByteCount. The driver maps the allocated pages described by the MDL into kernel virtual address space using the function MmGetSystemAddressForMdlSafe. Allocating pages from main memory in this way is inherently more secure than mapping paged or non-paged pool into user space, which is never a good idea.
With an MDL built that describes the memory to be shared, the driver is now ready to map those pages into the address space of the user process. This is accomplished using the function MmMapLockedPagesSpecifyCache. The only "tricks" you need to know about calling this function are:
You must call the function from within the context of the process into which you want to map the buffer.
You specify UserMode for the AccessMode parameter. The value returned from the MmMapLockedPagesSpecifyCache call is the user virtual address into which the memory described by the MDL has been mapped. The driver can return that to the user application in a buffer in response to an IOCTL.
You need a way to perform cleanup of the allocated memory when it is no longer needed. In other words, you are going to need to call MmFreePagesFromMdl to release the memory pages and call IoFreeMdl to delete the MDL that was allocated by the call to MmAllocatePagesForMdl(Ex). You'll almost certainly need to do this in your driver's IRP_MJ_CLEANUP handler (WDM) or EvtFileCleanup event processing callback (KMDF).
That's all there is to it. Put together, the code to accomplish this process is shown in Figure 1.
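NTSTATUS CreateAndMapMemory(PMDL* PMemMdl, PVOID* UserVa)
{
    PMDL mdl;
    PVOID userVAToReturn = NULL;
    PHYSICAL_ADDRESS lowAddress, highAddress;
    SIZE_T totalBytes = PAGE_SIZE;
    // Initialize the physical addresses needed for MmAllocatePagesForMdl
    lowAddress.QuadPart = 0;
    highAddress.QuadPart = -1;   // all bits set: no upper bound (assumed value)
    // Allocate a 4K buffer to share with the application
    mdl = MmAllocatePagesForMdl(lowAddress, highAddress, lowAddress, totalBytes);
    if (mdl == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }
    // The preferred way to map the buffer into user space. With UserMode,
    // a failed mapping raises an exception, so the call must be guarded.
    __try {
        userVAToReturn = MmMapLockedPagesSpecifyCache(mdl, // MDL
                             UserMode,             // Mode
                             MmCached,             // Caching
                             NULL,                 // Address
                             FALSE,                // Bugcheck?
                             NormalPagePriority);  // Priority
    } __except (EXCEPTION_EXECUTE_HANDLER) { userVAToReturn = NULL; }
    // If we get NULL back, the request didn't work.
    // I'm thinkin' that's better than a bug check anyday.
    if (userVAToReturn == NULL) {
        MmFreePagesFromMdl(mdl);
        IoFreeMdl(mdl);
        return STATUS_INSUFFICIENT_RESOURCES;
    }
    // Return the allocated pointers
    *UserVa = userVAToReturn;
    *PMemMdl = mdl;
    DbgPrint("UserVA = 0x%p\n", userVAToReturn);
    return STATUS_SUCCESS;
}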
Figure 1 -- Allocating Pages and Mapping Into User Mode
Of course, this method does have the disadvantage that the call to MmMapLockedPagesSpecifyCache must be done in the context of the process into which you want the pages to be mapped. This might at first make this method appear no more flexible than the method that uses an IOCTL with METHOD_NEITHER. However, unlike that method, this one only requires one function (MmMapLockedPagesSpecifyCache) to be called in the target process context. Because many drivers for OEM devices are in device stacks of one above the bus (that is, there is no device above them, and no driver but the bus driver below them), this condition will be easily met. For the rare device driver that will want to share a buffer directly with a user-mode application that's located deep within a device stack, an enterprising driver writer can probably find a safe way to call MmMapLockedPagesSpecifyCache in the context of the requesting process.
After the pages have been mapped, like the method that uses the IOCTL with METHOD_xxx_DIRECT, the shared memory can be accessed from an arbitrary process context, and even at elevated IRQL (because the shared memory is not pageable).
If you use this method, there is one final thing that you'll have to keep in mind: You will have to ensure that your driver provides a method to unmap those pages that you mapped into the user process any time the user process exits. Failure to do this will cause the system to crash as soon as the app exits, which is definitely to be avoided. One easy way that we've found of doing this is to unmap the pages whenever the application closes the device. Because closing the handle, expected or otherwise, always results in an IRP_MJ_CLEANUP being received by your driver for the File Object that represented the application's open instance of your device, you can be sure this will work. You want to perform this operation at CLEANUP time, not CLOSE, because you can be (relatively) assured that you will get the cleanup IRP in the context of the requesting thread. Releasing the resources allocated in Figure 1 can be seen in Figure 2.
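void UnMapAndFreeMemory(PMDL PMdl, PVOID UserVa) {
    if (PMdl == NULL) return;            // Make sure we have an MDL to free
    MmUnmapLockedPages(UserVa, PMdl);    // Return the allocated resources:
    MmFreePagesFromMdl(PMdl);            // unmap the user mapping, free the pages,
    IoFreeMdl(PMdl);                     // then free the MDL itself
}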
Figure 2 -- Releasing the Allocated Resources from Figure 1
The DuplicateHandle Problem
One of the more common objections that people raise to using MmMapLockedPagesSpecifyCache to share memory between a driver and a process is: "What happens if somebody calls DuplicateHandle on the handle that's used by the process to open the device? Then the driver will not receive an IRP_MJ_CLEANUP when the process with the mapped section exits (if this process exits before the process that duplicated the handle)." While this is correct, it sort of ignores the fact that in order to call DuplicateHandle, a process requires access rights to the process with the handle to be duplicated. A caller with those rights could find lots of other ways to perturb the target process. But there is a way to handle even this scenario. The driver can use PsSetCreateProcessNotifyRoutine to register a callback to be informed of process exit, and do the necessary cleanup of the mapping context at that time. We're not saying this is an attractive option, but it does close the hole that some folks tend to be concerned about.
Regardless of the mechanism used, the driver and application will need a common method of synchronizing access to the shared buffer. This can be done in a variety of ways. Probably the simplest mechanism is sharing one or more named events. One of the easiest ways for an application and a driver to share an event is for the application to create the event and pass the event's handle into the driver. The driver then references the handle (using ObReferenceObjectByHandle) from within the context of the application. If you use this method, don't forget to dereference the event object from within your driver's cleanup processing code!
We've looked at two methods for allowing a driver and a user-mode application to share memory: Using a buffer created by a user application and passed to a driver via an IOCTL, and allocating pages in the driver using MmAllocatePagesForMdl and then mapping them into the application's address space using the MmMapLockedPagesSpecifyCache function.
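The bookkeeping behind that process-exit cleanup might look like the following minimal sketch. It is written in plain C so that it can be compiled and exercised outside a driver. The names (MappingEntry, mapping_add, mapping_take) and the fixed-size table are hypothetical. A real driver would store a PMDL and the process ID from PsGetCurrentProcessId, protect the table with a lock, and perform the actual unmap and free on the entry that mapping_take hands back.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_MAPPINGS 16   /* arbitrary limit chosen for this sketch */

/* One record per process that has the shared pages mapped.
   A process_id of 0 marks a free slot. */
typedef struct MappingEntry {
    uintptr_t process_id;
    void     *mdl;        /* stands in for the PMDL describing the pages */
    void     *user_va;    /* address returned when the pages were mapped */
} MappingEntry;

static MappingEntry g_mappings[MAX_MAPPINGS];

/* Record a new mapping. Returns 0 on success, or -1 if the ID is 0, the
   process already has a mapping, or the table is full. */
static int mapping_add(uintptr_t process_id, void *mdl, void *user_va)
{
    MappingEntry *free_slot = NULL;
    size_t i;

    if (process_id == 0) return -1;
    for (i = 0; i < MAX_MAPPINGS; i++) {
        if (g_mappings[i].process_id == process_id) return -1;
        if (g_mappings[i].process_id == 0 && free_slot == NULL)
            free_slot = &g_mappings[i];
    }
    if (free_slot == NULL) return -1;

    free_slot->process_id = process_id;
    free_slot->mdl = mdl;
    free_slot->user_va = user_va;
    return 0;
}

/* Remove the mapping for a process and copy it to *out. Returns 1 if a
   mapping existed and 0 otherwise, so a second caller finds nothing left
   to release. */
static int mapping_take(uintptr_t process_id, MappingEntry *out)
{
    size_t i;

    if (process_id == 0) return 0;
    for (i = 0; i < MAX_MAPPINGS; i++) {
        if (g_mappings[i].process_id == process_id) {
            *out = g_mappings[i];
            g_mappings[i].process_id = 0;
            g_mappings[i].mdl = NULL;
            g_mappings[i].user_va = NULL;
            return 1;
        }
    }
    return 0;
}
```

Calling mapping_take from both the cleanup path and the process-exit callback means that whichever runs first releases the mapping, and the other finds no entry and does nothing.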
These two techniques are demonstrated in the OSRMEM sample driver and test program which accompany this article. Both methods are relatively simple, as long as you follow a few rules. Have fun!