PsSetLoadImageNotifyRoutine called at PASSIVE_LEVEL/APC_LEVEL

Hi,

I stumbled upon the following article: http://www.piotrbania.com/all/articles/allocate_deadlock.txt , which claims that the callback registered with PsSetLoadImageNotifyRoutine is called at PASSIVE_LEVEL when the process image and ntdll.dll are loaded, but at APC_LEVEL for every DLL loaded afterwards. Here is the relevant paragraph, in case you do not want to open the URL:

It appears that for every test i did when the original process PE file
is loaded (some .exe file) and ntdll.dll the callback routine operates at
the PASSIVE_LEVEL. However for every further loaded image the IRQL is
raised to APC_LEVEL. And now when you will try to execute
ZwAllocateVirtualMemory or something “similiar” you will experience a
deadlock.

I’ve used the following code inside the load-image notify routine to read the current value of ‘EPROCESS->AddressCreationLock’ (which is located at offset 0x368 on Windows 10). However, the value being printed is always 0.

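PEPROCESS proc = NULL;
NTSTATUS status = PsLookupProcessByProcessId(ProcessId, &proc);
if (NT_SUCCESS(status)) {
    /* 0x368 is the version-specific offset mentioned above; read the full pointer-sized value */
    ULONG_PTR lock = *(PULONG_PTR)((PUCHAR)proc + 0x368);
    ObDereferenceObject(proc); /* drop the reference taken by the lookup */
}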

Does anybody know why AddressCreationLock is never acquired when the notify routine is called? Is the article still relevant on currently supported versions of Windows, from Windows 7 through Windows 10? Is there a guarantee that the notify routine will always be called at a specific IRQL, ideally PASSIVE_LEVEL, regardless of which DLL is being loaded?

Actually, your question should have been about synchronization rather than IRQL.

There is no guarantee that some synchronization objects are not acquired when a callback is called. An elevated IRQL does not indicate with certainty that any synchronization primitive has been acquired.

Generally speaking, you can hit a deadlock at PASSIVE_LEVEL (not all synchronization mechanisms raise the IRQL, so being at PASSIVE_LEVEL doesn’t mean you are safe) and walk away unscathed at APC_LEVEL (as the IRQL can be raised without any lock being held).
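To make the point above concrete, here is a minimal user-mode model in portable C. It is not driver code: the types and functions (model_lock, model_context, model_allocation_can_proceed and so on) are hypothetical names invented for this illustration, and the lock is just a flag standing in for any non-recursive kernel lock. It only shows that whether an allocation can proceed depends on lock ownership and not on the IRQL value.

```c
#include <assert.h>
#include <stdbool.h>

/* Model IRQL values, prefixed so they are not mistaken for the WDK constants. */
enum model_irql { MODEL_PASSIVE_LEVEL = 0, MODEL_APC_LEVEL = 1 };

/* Hypothetical non-recursive lock: it only records whether it is held. */
struct model_lock { bool held; };

/* Hypothetical snapshot of the context a callback runs in. */
struct model_context {
    enum model_irql irql;
    struct model_lock *lock;
};

/* Non-blocking acquire: fails instead of waiting when the lock is held. */
bool model_try_acquire(struct model_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

void model_release(struct model_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Returns true if an allocation that needs the lock could proceed.
   A false result means a blocking acquire here would never return.
   Note that ctx->irql is deliberately not consulted: it carries no
   information about who holds the lock. */
bool model_allocation_can_proceed(const struct model_context *ctx)
{
    if (!model_try_acquire(ctx->lock))
        return false;
    model_release(ctx->lock);
    return true;
}
```

In this model a context at MODEL_PASSIVE_LEVEL with the lock already held cannot proceed, while a context at MODEL_APC_LEVEL with the lock free can.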

Hi,

So what exactly should I do in order to properly synchronize the work done in the load-image notify routine? I basically need to allocate a certain amount of memory in the process and write predefined data to that memory.

  • Check whether the routine is called at an IRQL higher than PASSIVE_LEVEL, in which case I shouldn’t proceed with the memory allocation. The safest option would then be to return immediately without doing anything; this skips the process entirely, but is safe with regard to higher IRQLs.
  • If the routine is called at PASSIVE_LEVEL, a synchronization object can still be held. Therefore, I need to check whether any synchronization object relevant to allocating memory in the process is held, in which case I also need to return immediately. Is there a list of such synchronization objects anywhere?

Can you suggest the best way to properly synchronize the load-image notify routine in order to avoid problems later on?

I guess you are developing a hooking driver to inject hooks/code into user-mode processes. I did this 8 years ago.

No, there is no such list, as Windows is a closed-source system.

This can be done using some tricks that require an intimate understanding of the I/O Manager and the Memory Manager. They make it possible to synchronize even in the APC_LEVEL case. But I am afraid this constitutes a trade secret of the company that paid for it. Sorry.

Look at the documentation. You have a “Best Practices” page here:

https://msdn.microsoft.com/en-us/library/windows/hardware/ff559917(v=vs.85).aspx#best

Now, according to the documentation, PLOAD_IMAGE_NOTIFY_ROUTINE callback functions are called at PASSIVE_LEVEL. I would not trust the document you pointed to earlier in this thread. AddressCreationLock is a push lock and not a mutex. A mutex can be acquired recursively but a push lock cannot. Look at the disassembly of the LOCK_ADDRESS_SPACE/UNLOCK_ADDRESS_SPACE routines.

Just forget about a list of locks: Microsoft can add a new one at any time without prior notice, and your code would break if it did.

Use a work item or, even better, your own system worker thread to allocate the memory. If your worker thread is the only place where the memory is allocated, then you can’t have a deadlock unless your implementation is wrong.
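The shape of that advice can be sketched as a small user-mode model in portable C. This is not driver code and not anyone's actual implementation: request_queue, model_on_image_load and model_worker_drain are hypothetical names made up for the illustration. In a real driver the queue would need proper locking, the worker would be a thread created with PsCreateSystemThread or a work item queued with IoQueueWorkItem, and the worker would have to reach the target process safely; all of that is left out here. The only thing the model shows is the split: the callback records a request and returns, and the worker is the single place where allocation happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_CAPACITY 8

/* Hypothetical request record: which process needs memory. */
struct alloc_request { unsigned long process_id; };

/* Hypothetical fixed-size FIFO shared by the callback and the worker. */
struct request_queue {
    struct alloc_request items[MODEL_QUEUE_CAPACITY];
    size_t head;              /* index of the oldest pending request */
    size_t count;             /* number of pending requests */
    size_t allocations_done;  /* incremented only by the worker */
};

/* Stands in for the load-image callback: record the request and return.
   It never allocates. Returns false if the queue is full and the
   request was dropped. */
bool model_on_image_load(struct request_queue *q, unsigned long process_id)
{
    assert(q != NULL);
    if (q->count == MODEL_QUEUE_CAPACITY)
        return false;
    size_t tail = (q->head + q->count) % MODEL_QUEUE_CAPACITY;
    q->items[tail].process_id = process_id;
    q->count++;
    return true;
}

/* Stands in for the worker: the only place where "allocation" happens.
   Returns the number of requests processed. */
size_t model_worker_drain(struct request_queue *q)
{
    assert(q != NULL);
    size_t processed = 0;
    while (q->count > 0) {
        struct alloc_request req = q->items[q->head];
        q->head = (q->head + 1) % MODEL_QUEUE_CAPACITY;
        q->count--;
        (void)req;  /* a real driver would allocate in the target process here */
        q->allocations_done++;
        processed++;
    }
    return processed;
}
```

For example, two calls to model_on_image_load leave allocations_done at 0; the counter only moves once model_worker_drain runs, outside the callback's context.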

Remember that to have a process running, at least two images must be loaded: the process image and NTDLL.DLL.

Good luck.