OSR Online Lists > ntdev
  Message 1 of 5  
19 Nov 17 09:48
Kern Mode
xxxxxx@gmail.com
Join Date: 19 Nov 2017
Posts To This List: 2
Optimized Memory Dumps

Hello OSR community,

I am trying to implement an optimized memory dump mechanism. The basic idea is to dump READ/WRITE memory from a user-space process (e.g. firefox.exe) on specific events. The events that trigger the dump are specific packets seen in the WSPSend/WSPRecv calls in mswsock.dll.

I have implemented a DLL that intercepts the send and recv calls inside the target process and performs the dump when the trigger condition is met. I use VirtualQueryEx to filter the memory regions. As mentioned, I am interested in READ/WRITE memory. I also skip the thread stacks, because I don't need them even though they are READ/WRITE. The problem with this approach is that every dump is about 90 MB (firefox.exe).

To optimize the dump, the idea is to track the memory that was written since the last dump. This should lead to smaller dumps on subsequent triggers. The plan is to implement a kernel driver and use the additional information available in kernel mode. The DLL would send a dump request to the driver (on the same trigger events) and the driver would perform the dump.

What I have done so far is traverse the process VAD tree and retrieve the PTE for each page, looping from the region start to the region end in steps of 0x1000.

I encountered a few problems. First, I have to suspend all threads in the user process (firefox.exe) except the running one; otherwise the VAD traversal results in a BSOD. How can I implement proper synchronization for the VAD traversal, so that the structure does not change while I walk it? I tried to use eprocess->AddressCreationLock with the LOCK_ADDRESS_SPACE macro, which passes the lock to KeAcquireGuardedMutex. That function requires a PKGUARDED_MUTEX, but on Windows 7 AddressCreationLock is an EX_PUSH_LOCK (dumped with WinDbg), so the types do not match and the result is a BSOD. The operating system I am using for testing is Windows 7 64-bit SP1.

I would like to get the pages modified since the last dump and dump only those, so that the dump size is minimal. Does anybody know whether this can be done in another way, and whether my approach rests on any wrong assumptions?

Thanks in advance!
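The region filter described in the post above (committed, READ/WRITE, not a thread stack, then walked in 0x1000 steps) can be sketched roughly as follows. This is only an illustration, not the poster's code: `region_info` is a hypothetical stand-in for the fields of MEMORY_BASIC_INFORMATION that VirtualQueryEx fills in, the `RG_*` constants mirror the values of the Windows SDK constants named in the comments, and the `is_thread_stack` flag is assumed to come from whatever stack detection the caller already has.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of Windows SDK values, so the sketch builds anywhere. */
#define RG_MEM_COMMIT      0x1000u  /* MEM_COMMIT     */
#define RG_PAGE_READWRITE  0x04u    /* PAGE_READWRITE */
#define RG_PAGE_GUARD      0x100u   /* PAGE_GUARD     */
#define RG_PAGE_SIZE       0x1000u  /* 4 KB pages, as assumed in the post */

/* Hypothetical stand-in for the MEMORY_BASIC_INFORMATION fields used here. */
typedef struct {
    uintptr_t base;     /* BaseAddress */
    size_t    size;     /* RegionSize  */
    uint32_t  state;    /* State       */
    uint32_t  protect;  /* Protect     */
} region_info;

/* True if the region is committed, plain READ/WRITE, and not a stack. */
static bool region_is_dump_candidate(const region_info *r, bool is_thread_stack)
{
    if (r->state != RG_MEM_COMMIT)
        return false;                 /* reserved or free: nothing to read */
    if (r->protect & RG_PAGE_GUARD)
        return false;                 /* touching a guard page raises an exception */
    if ((r->protect & 0xFFu) != RG_PAGE_READWRITE)
        return false;                 /* low byte holds the base protection */
    return !is_thread_stack;
}

/* Number of 0x1000-byte steps needed to cover the region. */
static size_t region_page_count(const region_info *r)
{
    return (r->size + RG_PAGE_SIZE - 1) / RG_PAGE_SIZE;
}
```

In a real tool the structure would be filled from successive VirtualQueryEx calls, advancing the query address by RegionSize each time.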
  Message 2 of 5  
19 Nov 17 13:48
Tim Roberts
xxxxxx@probo.com
Join Date: 28 Jan 2005
Posts To This List: 11673
Optimized Memory Dumps

On Nov 19, 2017, at 6:47 AM, xxxxx@gmail.com <xxxxx@lists.osr.com> wrote:
>
> I use VirtualQueryEx to filter the memory regions. As mentioned before I am interested in
> READ/WRITE memory. I also skip the thread stacks because I don't need them even if they are READ/WRITE. The problem with this approach is that every dump is about 90 mb (firefox.exe).

90 MB is not very big. What do you expect to DO with these memory dumps? If you are going through them by hand, then the task is hopeless. You couldn't go through 90 MB even once. If you are going to do some automated processing to search for stuff, then 90 MB is a trivial amount.

> To optimize the dump, the idea would be to track the memory which was written since the last dump. This should lead to smaller dumps on subsequent triggers.
> The idea now would be to implement a kernel driver and utilize the additional information from kernel mode.

What information do you think you can get from kernel mode that isn't available in user mode?

Have you investigated the debugger APIs? You can exercise relatively complete control over another process using them.

--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
  Message 3 of 5  
19 Nov 17 14:01
anton bassov
xxxxxx@hotmail.com
Join Date: 16 Jul 2006
Posts To This List: 4401
Optimized Memory Dumps

> How can i implement a proper synchonization solution for the VAD traversal,
> such that the structure is not changed during the traversal.

You cannot - as simple as that. You simply cannot synchronise access to resources that you don't own. Full stop.

The only thing left to do is to corral the CPUs if you run in KM, or, from userland, to suspend all the threads of the target process (or change their contexts and make them block on a synch resource that you DO own). These tricks hardly qualify as a "proper" solution - for example, consider what happens if one of the target threads already owns the resource that you are trying to grab behind the scenes.

Some approaches may look like a proper solution (and in fact behave like one in the vast majority of cases, so that the subtle "Heisenbug" they introduce may go undetected for quite a while). For example, in your particular scenario suspending all threads of the target process from userland may work just fine, at least at first glance. The actual suspension takes place only when the target thread is about to return to userland, and at that point it can no longer own any kernel resources, for understandable reasons. Seems to be a perfect solution, doesn't it?

Unfortunately, it is not - it does not resolve the potential conflict between your KM code and any other thread in the system that makes some MM-related call that touches your target structures. Once all the threads in the target process are suspended they cannot make MM-related calls, so you seem to be safe. However, consider what happens if some other process calls, say, VirtualQuery() or ReadProcessMemory() on your target process. As you can see, the possibility of a conflict is still there.

Therefore, no matter what you do, you just cannot develop a solution that is absolutely safe. In any case, I don't see any reason why doing things from a driver would be beneficial in your case.

Anton Bassov
  Message 4 of 5  
21 Nov 17 06:35
Slava Imameev
xxxxxx@hotmail.com
Join Date: 13 Sep 2013
Posts To This List: 207
Optimized Memory Dumps

> What i have done so far is to traverse the process VAD tree, and to retrieve the PTE
> ....
> I would like to get the pages modified since the last dump and only dump the changed ones, such that the dump size is minimal.

Just out of curiosity - how are you going to prevent the Memory Manager (MM) and the Working Set Manager (a part of the MM) from toggling the modified flag in a PTE between two snapshots?

It looks like your design is based on the wrong assumption that you can use the PTE modified flag as an indicator.
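One way to find changed pages without relying on the PTE modified flag at all is to compare page contents between snapshots. Nobody in the thread proposes this; it is a swapped-in alternative, sketched here under the assumption that the candidate memory has already been copied into a local buffer. The hash is 64-bit FNV-1a, and the names `page_hash`, `diff_pages` and `SNAP_PAGE_SIZE` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SNAP_PAGE_SIZE 0x1000u  /* assumed 4 KB pages */

/* 64-bit FNV-1a over one page of the copied snapshot buffer. */
static uint64_t page_hash(const unsigned char *page)
{
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV offset basis */
    for (size_t i = 0; i < SNAP_PAGE_SIZE; i++) {
        h ^= page[i];
        h *= 0x100000001b3ULL;               /* FNV prime */
    }
    return h;
}

/*
 * Compare each page in buf against the hash stored by the previous call.
 * Indices of pages whose hash differs are written to changed[] (which must
 * have room for npages entries), prev_hashes[] is updated in place, and the
 * number of changed pages is returned. With prev_hashes[] zeroed, the first
 * call will in practice report every page.
 */
static size_t diff_pages(const unsigned char *buf, size_t npages,
                         uint64_t *prev_hashes, size_t *changed)
{
    size_t n = 0;
    for (size_t p = 0; p < npages; p++) {
        uint64_t h = page_hash(buf + p * SNAP_PAGE_SIZE);
        if (h != prev_hashes[p]) {
            changed[n++] = p;
            prev_hashes[p] = h;
        }
    }
    return n;
}
```

The trade-off is that every candidate page still has to be read and hashed on each trigger, so only the amount written out shrinks, and a hash collision could in principle hide a change.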
  Message 5 of 5  
30 Nov 17 14:07
Kern Mode
xxxxxx@gmail.com
Join Date: 19 Nov 2017
Posts To This List: 2
Optimized Memory Dumps

Thank you for the replies.

The 90 MB dumps are generated per network connection (session), and Firefox creates many sessions, which would result in multiple dumps of around 90 MB each. The dumped data is later processed automatically to extract specific data. Minimizing the dump would mean fewer false positives in the extracted data.

The reason for the kernel-mode driver is that I wanted to be able to determine which memory pages were written (dirty bit set) after the trigger fired. That way I would dump only the dirty pages, which would result in a smaller dump.

I'm not sure whether this could also be implemented from user mode. I would guess that it should be possible by using vectored exception handling and changing the page protection of the R/W pages to "no access".
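The bookkeeping side of that user-mode idea could look something like the sketch below. It is deliberately limited to the platform-independent part: a fixed-size bitmap that records which pages of one tracked region have faulted. The Windows wiring is assumed rather than shown: a handler registered with AddVectoredExceptionHandler would, on an access violation, pass the faulting address (ExceptionInformation[1]) to `wt_mark`, restore write access with VirtualProtect, and resume the thread. All `wt_*` names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WT_PAGE_SHIFT 12      /* assumed 4 KB pages */
#define WT_MAX_PAGES  65536   /* enough for a 256 MB region */

/* Fixed-size bitmap, so nothing is allocated inside an exception handler. */
typedef struct {
    uintptr_t     base;                    /* start of the tracked region */
    size_t        npages;                  /* pages in the tracked region */
    unsigned char bits[WT_MAX_PAGES / 8];  /* one dirty bit per page      */
} write_tracker;

static bool wt_init(write_tracker *t, uintptr_t base, size_t size)
{
    size_t npages = (size + ((size_t)1 << WT_PAGE_SHIFT) - 1) >> WT_PAGE_SHIFT;
    if (npages > WT_MAX_PAGES)
        return false;
    t->base = base;
    t->npages = npages;
    memset(t->bits, 0, sizeof t->bits);
    return true;
}

/* Record a write fault; returns false if addr is outside the region. */
static bool wt_mark(write_tracker *t, uintptr_t addr)
{
    if (addr < t->base)
        return false;
    size_t page = (size_t)((addr - t->base) >> WT_PAGE_SHIFT);
    if (page >= t->npages)
        return false;
    t->bits[page / 8] |= (unsigned char)(1u << (page % 8));
    return true;
}

/* Copy dirty page indices to out[] (room for npages entries) and clear them. */
static size_t wt_collect_and_reset(write_tracker *t, size_t *out)
{
    size_t n = 0;
    for (size_t p = 0; p < t->npages; p++) {
        if (t->bits[p / 8] & (1u << (p % 8)))
            out[n++] = p;
    }
    memset(t->bits, 0, sizeof t->bits);
    return n;
}
```

As written this is not thread-safe, and after each dump the pages would have to be protected again. Two further caveats apply to the approach itself: read-only protection may be enough if only writes matter, and writes performed by the kernel on the process's behalf (for example a receive into a protected buffer) would likely fail outright instead of reaching the handler.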
All times are GMT -5.

