Every NT driver writer we know wants to be a good memory citizen and avoid fragmenting precious non-paged pool. Like them, we followed conventional wisdom and used lookaside lists to manage frequent allocations and deallocations of similarly sized chunks of memory.
The theory behind lookaside lists is that you get to specify, among other things, the maximum depth: how many identically sized chunks of memory the list can contain. As memory is freed back to the lookaside list from which it was obtained, NT checks the actual depth of the list. If the maximum depth has not been reached, NT defers the actual ExFreePool(…) deallocation and instead adds the memory to the lookaside list. Subsequent allocation requests made to the lookaside list are satisfied (if possible) by reusing recently freed memory and decrementing the actual depth.
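The behavior described above can be modeled in user mode. The following is a minimal sketch, not NT code: SimLookaside and the sim_* functions are hypothetical names invented for illustration, and malloc/free stand in for ExAllocatePool(…) and ExFreePool(…).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-mode model of a lookaside list; not the NT structure. */
typedef struct SimEntry { struct SimEntry *next; } SimEntry;

typedef struct SimLookaside {
    SimEntry *head;        /* cached free blocks, most recently freed first */
    unsigned  depth;       /* actual depth: blocks currently cached */
    unsigned  max_depth;   /* maximum depth requested at initialization */
    size_t    block_size;  /* every block has this size */
    unsigned  pool_allocs; /* allocations that fell through to malloc */
    unsigned  pool_frees;  /* frees that fell through to free */
} SimLookaside;

static void sim_init(SimLookaside *l, size_t block_size, unsigned max_depth)
{
    /* A cached block must be able to hold the list link. */
    assert(block_size >= sizeof(SimEntry));
    l->head = NULL;
    l->depth = 0;
    l->max_depth = max_depth;
    l->block_size = block_size;
    l->pool_allocs = 0;
    l->pool_frees = 0;
}

/* Reuse a cached block if one exists; otherwise go to the underlying pool. */
static void *sim_alloc(SimLookaside *l)
{
    if (l->head != NULL) {
        SimEntry *e = l->head;
        l->head = e->next;
        l->depth--;
        return e;
    }
    l->pool_allocs++;
    return malloc(l->block_size);
}

/* Cache the block if the list is below its maximum depth; otherwise free it. */
static void sim_free(SimLookaside *l, void *p)
{
    if (l->depth < l->max_depth) {
        SimEntry *e = p;
        e->next = l->head;
        l->head = e;
        l->depth++;
        return;
    }
    l->pool_frees++;
    free(p);
}

/* Release every cached block, as a delete routine would have to. */
static void sim_drain(SimLookaside *l)
{
    while (l->head != NULL) {
        SimEntry *e = l->head;
        l->head = e->next;
        l->depth--;
        free(e);
    }
}
```

With a maximum depth of 2, freeing three blocks caches the first two and passes the third to the real deallocator, which is the kind of early deallocator call that first alerted us to the problem.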
For some of our lists, we specified a depth of 32. We wrote a custom memory allocator and deallocator, which we provided to ExInitializeNPagedLookasideList(…). We first saw a problem when the deallocator started being called from ExFreeToNPagedLookasideList(…) long before the specified depth had been reached. In fact, it appeared that NT was behaving as if the depth of the list were 4.
We started our investigation by looking at the value of the Depth setting (in the GENERAL_LOOKASIDE structure, found in ntddk.h) for our list. It was, in fact, 4, even though we had specified the depth to be 32 at initialization time.
Next, we stepped through the ExInitializeNPagedLookasideList(…) routine to see what NT was doing with the specified depth. We found that at the beginning of the routine, NT initializes most of the GENERAL_LOOKASIDE members, and then sets the depth of the list to 4. Later in the routine, NT sets member values such as size according to the parameters passed in, except for the depth parameter, which is ignored!
We tried to outsmart NT by directly changing the depth in the GENERAL_LOOKASIDE structure to 32, but the depth was mysteriously set back to 4. By setting a memory breakpoint on the Depth variable, we discovered that the routine ExpScanGeneralLookasideList(…) (in NTOSKRNL) is called periodically, about every two seconds, and in turn calls ExpComputeLookasideDepth(…) for every lookaside list known to the system. ExpComputeLookasideDepth(…) recalculates the depth for a lookaside list and returns a flag indicating whether the depth was decreased. Presumably, if the depth is decreased, memory is freed from the lookaside list until the actual depth is less than or equal to the maximum depth.
Of particular interest is the algorithm that ExpComputeLookasideDepth(…) uses to recalculate the depth. It seems that the routine first calculates the ratio of allocation misses to allocation requests since the last time it was called. If this ratio is less than 10%, the depth is decreased by 10 (but not below 4). Otherwise, the depth is increased by (ratio / 2) * max_depth, where max_depth is kept in the "pad" field of the GENERAL_LOOKASIDE structure and is initialized to 0x100.
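To make the arithmetic concrete, here is a rough user-mode sketch of the recalculation as we understand it from the debugger. It is not the NT source: sim_compute_depth and the SIM_* constants are hypothetical names, and two details are our assumptions, namely that a period with no requests counts as a ratio of zero and that the result is clamped to max_depth.

```c
#include <assert.h>

#define SIM_MIN_DEPTH  4u      /* depth never drops below this */
#define SIM_MAX_DEPTH  0x100u  /* the value found in the "pad" field */
#define SIM_DECREMENT  10u     /* step used when the miss ratio is low */

/*
 * Recalculate *depth from the misses and requests counted since the
 * previous scan. Returns 1 if the depth was decreased, 0 otherwise.
 */
static int sim_compute_depth(unsigned *depth, unsigned misses, unsigned requests)
{
    unsigned old = *depth;
    /* Assumption: no requests in the period is treated as a ratio of 0. */
    double ratio = requests ? (double)misses / (double)requests : 0.0;

    if (ratio < 0.10) {
        /* Few misses: shrink by 10, but not below the minimum. */
        *depth = (old > SIM_MIN_DEPTH + SIM_DECREMENT) ? old - SIM_DECREMENT
                                                       : SIM_MIN_DEPTH;
    } else {
        /* Many misses: grow by (ratio / 2) * max_depth. */
        unsigned grown = old + (unsigned)((ratio / 2.0) * SIM_MAX_DEPTH);
        /* Assumption: the new depth is clamped to max_depth. */
        *depth = (grown > SIM_MAX_DEPTH) ? SIM_MAX_DEPTH : grown;
    }
    return *depth < old;
}
```

Starting from a depth of 32 with a 1% miss ratio, three successive scans in this model give 22, 12, and then 4, which is consistent with a list that is serving its requests well being shrunk to the minimum within a few seconds.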
It seems to us that this algorithm rewards lists that fail to allocate efficiently! If more than 10% of your requests to allocate memory from a lookaside list in a 2-second period result in a call to ExAllocatePool(…), your depth will be increased. In our opinion, a better algorithm would have looked at the ratio of free misses to allocation misses. This would measure the number of times an ExFreePool/ExAllocatePool(…) pair of calls could have been avoided during the last 2-second period. If this happened frequently, the depth of the lookaside list should be increased.
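One way to picture the alternative is sketched below. This is purely illustrative and exists nowhere in NT: the alt_* names are hypothetical, and the growth rule (add one slot per avoidable pair, capped at 0x100) is an assumption chosen only to show the idea. A free miss followed by an allocation miss is a pair that a deeper list would have absorbed, so the smaller of the two counts bounds the number of avoidable pairs in a period.

```c
#include <assert.h>

#define ALT_MAX_DEPTH 0x100u  /* assumed cap, mirroring the 0x100 seen in NT */

/*
 * Upper bound on the free/allocate pairs that a deeper list could have
 * absorbed during one scan period.
 */
static unsigned alt_avoidable_pairs(unsigned free_misses, unsigned alloc_misses)
{
    return (free_misses < alloc_misses) ? free_misses : alloc_misses;
}

/* Assumed rule: grow the depth by one slot per avoidable pair, up to the cap. */
static unsigned alt_next_depth(unsigned depth, unsigned free_misses,
                               unsigned alloc_misses)
{
    unsigned grown = depth + alt_avoidable_pairs(free_misses, alloc_misses);
    return (grown > ALT_MAX_DEPTH) ? ALT_MAX_DEPTH : grown;
}
```

Under this rule a list that misses on allocation but never overflows on free keeps its depth, because extra depth would not have saved any pool calls.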
One way to circumvent the situation is to avoid registering lookaside lists in the system’s master list of lookaside lists. This could be done by hand-initializing all the members of the GENERAL_LOOKASIDE structure (except for ListEntry). In this way, ExpScanGeneralLookasideList(…) would never know about your lookaside list, and therefore could not recalculate its depth. All other lookaside list routines would work, except ExDeleteNPagedLookasideList(…) and ExDeletePagedLookasideList(…); you would have to manually delete entries still in the lookaside list before discarding the structure.
So, after all the discussions in leading publications about NT driver memory management and how lookaside lists are better than sliced bread, it appears that driver writers are no better off than if they just called ExAllocatePool(…) and ExFreePool(…) directly!