Shared memory between 64bit Driver / 32bit App

Hello,

I am at the beginning of porting a 32-bit driver to 64-bit, and so far it runs except for one problem.

I use a nonpaged memory block allocated by the driver, mapped into the application and used as a high-performance queue.
This works fine (using MmMapLockedPages) when the driver and app are both 32-bit or both 64-bit, but I wasn't able to find a way to remap the memory block for a 32-bit application when the driver is 64-bit. Just converting the pointer causes either an application crash or a BSOD, so I assume it only works when the pointer is created in a 32-bit application.

On Linux I found a way, but not on Windows (Vista 64-bit, to be precise).

Is there some skilled programmer who has dealt with this rather rare issue?

Greetings and many thanks,
Tanja

Hmmmm… the way this SHOULD work is that if your 64-bit driver maps the shared memory block into the application’s address space (using MmMapLockedPagesSpecifyCache with AccessMode set to UserMode), and does this in the context of the 32-bit process, then it should be mapped into the 32-bit address space of the current process.

Are you saying this is what you do and it does not work??

Peter
OSR

Yes, it is called like this:

devstr->queue_user_mem = MmMapLockedPages (devstr->queue_mdl, UserMode);

I was thinking about replacing it with MmMapLockedPagesSpecifyCache later. The pointer seems valid, but when I try to access it from the application, the application crashes. If I reuse the pointer in the driver (the sharing sequence is rather complicated, but in principle it works), the driver BSODs.

MmMapLockedPages just calls MmMapLockedPagesSpecifyCache… so that’s not the problem.

Wow… that’s a good one. Let me try a few more questions (because I have nothing better to suggest):

a) You’re certain the memory you’re mapping is non-paged and the MDL is valid?
b) You’re SURE you’re in the context of the requesting 32-bit process when you do the mapping operation?
c) Is the address you’re getting back < 2GB (i.e. what would be a valid 32-bit address for the user-mode application?)
d) Is the address that you get back valid in the context of your driver?

Sorry, those are about the only questions I can figure out to ask at this point,

Peter
OSR

No need to be sorry, I am happy for any help!

a) I think so. The driver works fully with the same code in the 32/32 and 64/64 scenarios. I admit I have to add more error checking to be certain. The values seem to be returned correctly, but I only checked them quickly in DebugView.

b) Yes, it should be in the context of the 32-bit process when I do the mapping. IoIs32bitProcess returns the proper value, confirming they understand each other, and all other IOCTL operations work.

c) Yes.

d) Not sure yet. I have to check tomorrow, but I think not, because when I send it to the application and then back (unfortunately I haven't yet checked whether another conversion happens in an intermediate step), it BSODs when trying to use it. I haven't tried to use it directly yet.

That is:

queue->version = 0x7357; // BSODs in driver context

if (queue->version)/**/; // Crashes the app

xxxxx@xpwn.net wrote:

> d) Not sure yet. I have to check tomorrow, but I think not, because when I send it to the application and then back (unfortunately I haven't yet checked whether another conversion happens in an intermediate step), it BSODs when trying to use it. I haven't tried to use it directly yet.

Let me read between the lines here. Are you (a) allocating in the
driver, (b) mapping to 32-bit, (c) passing the 32-bit pointer out to
user-mode, (d) passing that SAME 32-bit pointer back into kernel mode,
and (e) trying to access the memory in kernel mode by using the 32-bit
user mode pointer?

Why aren’t you using the kernel address from within the kernel driver?
It’s extremely problematic to access a user-mode address from a kernel
driver.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

That’s scary, Tanja. I don’t know what else to suggest.

Tim’s right, as a general principle… but it should work and I’d prefer to focus on the main problem: Why the address that’s returned doesn’t seem to be valid in the context of the user program.

Sorry… it’s a mystery to me. I suspect there must be a very ordinary bug that’s making it LOOK like this is the problem,

Peter
OSR

No, you understood me wrong. I reuse the address between different drivers, but that is not the point of the issue. This was just to verify whether the 32-bit pointer is valid for the driver (and it will have to work at some point in the future).

Right now, I allocate memory in the 64-bit driver, then need to pass it (somehow) to the 32-bit application so it can access it.

Nothing more.

Later, once that works, the application will be able to pass this pointer to other drivers so they can piggyback on it and share it as well. That is the second step and not needed now. I only wanted to describe all the tests I did:

a) access the 64bit pointer within 64bit driver -> that works

b) access the mapped pointer from 32bit process -> that crashes app

c) access the 64bit mapped pointer from 64bit process -> that works

d) remap the 32bit pointer back to 64bit and try to access it from 64bit driver -> BSOD

e) same code as in c, compiled for 32bit driver -> that works as well

Rather than giving it to us one line at a time over the course of
several posts, it would be a lot easier if you would post all the code
that: app) calls the driver to get the address, and uses the address,
and kernel) handles the IOCTL, maps the address, and later uses the address.

I’m guessing that there’s probably some simple unrelated bug.

Ray
(If you want to reply to me off list, please remove “spamblock.” from my
email address)

OK, it seems to be a bit more complicated than a simple answer, so here is the code. Thank you again for any help.

queue = ExAllocatePoolWithTag(NonPagedPool,size,tag);

/*snip*/

if (IoIs32bitProcess(irp)) {
    devstr->queue_mdl = MmCreateMdl (NULL, devstr->queue, queue_sizeof(devstr->queue));
    MmBuildMdlForNonPagedPool (devstr->queue_mdl);
    devstr->queue_user_mem = MmMapLockedPages (devstr->queue_mdl, UserMode);
}

io32->ptr = (VOID*POINTER_32)PtrToUlong(od->rx_queue_user_mem);
}
/**/
io32->ptr is then passed to the 32-bit application, which assigns it to a local queue pointer.
Any attempt to access it crashes:

appqueue = (PWNQUEUE)io32->ptr;

if (appqueue->version) KABOOM;

I hope this part of the code helps. A bug in the rest of the code is unlikely, since it works in all other scenarios.
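
For readers following the hand-off above, here is a minimal sketch in portable C of a reply layout that looks the same to a 64-bit driver and a 32-bit application. The names QUEUE_REPLY32, QUEUE_REPLY64, reply_size and the size field are hypothetical and do not come from the driver in this thread; only the idea of carrying the pointer as a 32-bit value is taken from the io32->ptr line. It says nothing about why the mapped address faults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reply for a 32-bit caller. The pointer travels as a
   fixed-width 32-bit integer, so the layout does not depend on whether
   the code reading or writing it was compiled as 32-bit or 64-bit. */
typedef struct {
    uint32_t ptr;   /* user-mode address of the mapped queue */
    uint32_t size;  /* size of the mapped queue in bytes */
} QUEUE_REPLY32;

/* Hypothetical reply for a 64-bit caller: same fields, 64 bits wide. */
typedef struct {
    uint64_t ptr;
    uint64_t size;
} QUEUE_REPLY64;

/* Compile-time checks that the layouts are what both sides expect. */
static_assert(sizeof(QUEUE_REPLY32) == 8, "32-bit reply must be 8 bytes");
static_assert(sizeof(QUEUE_REPLY64) == 16, "64-bit reply must be 16 bytes");

/* Size of the reply to write for a given caller. The flag stands in for
   the result of IoIs32bitProcess(irp) in a real driver. */
size_t reply_size(int is_32bit_caller)
{
    return is_32bit_caller ? sizeof(QUEUE_REPLY32) : sizeof(QUEUE_REPLY64);
}
```

With fixed-width fields, the driver can pick the reply layout per request, and neither side depends on the compiler's native pointer size.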

od->rx_queue_user_mem

should read

devstr->queue_user_mem

copied from another module

Are you sure you are in the context of the process into which you
map the user buffer? Can you explain where you are calling this code from?

Some other comments:

It would help if you posted the declarations of your variables and structures.
You should check the return values of your function calls and handle failures.
When mapping with UserMode, you must call MmMapLockedPages from within a try/except block, as it can raise an exception.
MmMapLockedPages is also obsolete; use MmMapLockedPagesSpecifyCache instead.
MmCreateMdl is obsolete; use IoAllocateMdl instead.

Follow the suggestions in the doc and read the following article on using
MDLs:
http://www.osronline.com/article.cfm?id=423

//Daniel

Thank you for the comments.

It is called within the IOCTL handling routine when the 32-bit application calls in. I assume that is enough to ensure the correct context? The definitions of all pointers etc. should be correct, as they work in the other scenarios.

Please also note that all other IOCTL functionality (and there is plenty) works without problems.

Yes, those functions are obsolete but should work. I would prefer to change them only if they caused the issue (and they don't).

I’ll spare you the lectures about security and how bad it is to map non-paged pool back into user space, and try to focus on solving your current problem. Make a note to yourself, however, that you’ve created a serious security loophole by doing what you’re doing.

ANYhow… Daniel asked if you’re 100% sure you’re in the context of the calling process – where are you making these function calls – I’d like to know that as well. Also, can you verify that:

io32->ptr = (VOID*POINTER_32)PtrToUlong(od->queue_user_mem);

isn’t causing any weird truncation or anything?? In other words, is it true that:

io32->ptr == od->queue_user_mem

Peter
OSR

Peter, you don't need to spare me anything. I welcome all comments. However, performance-wise there is AFAIK no better solution. There are other ways to secure an installation in an industrial environment, and performance should never be compromised.

a) I answered it. It is called in the routine handling the IOCTL. The driver is aware it is a 32-bit caller because IoIs32bitProcess(irp) returns true. So the application sends DeviceIoControl to the driver and gets the pointer back.

b) From what I tried, the lower 32 bits of devstr->queue_user_mem are equal to the value of io32->ptr.

What would be a better way to cast it?

Also, since I don’t think you have yet, please post the !analyze -v output
of the crash. That way we can at least see what the system doesn’t like
about the address.

-scott


Scott Noone
Software Engineer
OSR Open Systems Resources, Inc.
http://www.osronline.com

The “better” solution is described here:

http://www.osronline.com/article.cfm?article=39

You can still use shared memory, just not shared memory from non-paged pool.

Sorry if I’m reading this too literally: But you said “the lower 32 bits”… the UPPER 32 bits had better all be zero. They are, right?
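
A minimal sketch of that check in portable C, assuming the mapped address is handled as a 64-bit integer. The helper names fits_in_32_bits and narrow_user_address are hypothetical, not WDK functions. In a driver the same test would sit in front of the PtrToUlong cast, so that a non-zero upper half fails the request instead of being silently dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when a 64-bit address can be represented in 32 bits without loss,
   i.e. its upper 32 bits are all zero. */
bool fits_in_32_bits(uint64_t addr64)
{
    return (addr64 >> 32) == 0;
}

/* Narrows addr64 into *out32 only when no bits would be lost.
   Returns false and leaves *out32 untouched otherwise. */
bool narrow_user_address(uint64_t addr64, uint32_t *out32)
{
    assert(out32 != NULL);
    if (!fits_in_32_bits(addr64)) {
        return false;
    }
    *out32 = (uint32_t)addr64;
    return true;
}
```

Comparing only the lower halves, as described earlier in the thread, cannot detect a lost upper half. That is why the sketch tests the upper bits explicitly before narrowing.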

SNoone’s request for the analyze -v output is also an excellent one.

Peter
OSR

> I’m guessing that there’s probably some simple unrelated bug.

This could be a Windows bug in MmMapLockedPagesSpecifyCache(UserMode); perhaps the 64-bit OS does not check for a 32-bit process.

Does the issue occur on several x64 Windows versions, or only on one of them?


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

> if (IoIs32bitProcess(irp)) {
> devstr->queue_mdl = MmCreateMdl (NULL, devstr->queue, queue_sizeof(devstr->queue));

Use IoAllocateMdl instead. To use MmCreateMdl, you need a pre-existing struct _MDL with its tail, and IoAllocateMdl will allocate the MDL with a proper tail size.

What is the pointer value? Does it fit in 32 bits or not?


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

> Peter, you don't need to spare me anything. I welcome all comments. However, performance-wise there is AFAIK no better solution.

Allocate in the app then pass this to the driver.

I.e. VirtualAlloc in the app, then DeviceIoControl on a handle opened for overlapped I/O.

In the driver, just pend this IRP for as long as the driver needs the buffer. I.e. IoMarkIrpPending, save the IRP pointer somewhere, then return STATUS_PENDING.

The MDL you need is in Irp->MdlAddress (assuming the IOCTL uses METHOD_IN_DIRECT or METHOD_OUT_DIRECT); call MmGetSystemAddressForMdlSafe on it.

When the driver wants to free this buffer, it should just complete the IRP in question.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com