Hi,
We are building an NDIS lightweight filter driver to connect user-mode applications to miniports as directly and efficiently as we can, for both sending and receiving.
We have already used other available drivers for the same purpose, but we ran into performance problems or limitations that are difficult to deal with without a deep understanding of the inner workings of those drivers.
The aim of our driver is to share one buffer per direction (reception and transmission) between the kernel and the application. Each buffer is mapped only once into the kernel and application virtual address spaces, and is then used as efficiently as possible in terms of context switches and throughput. The goal is to capture and send Ethernet frames at line speed on 1G and 10G network interfaces (as a first attempt, over Windows Teaming).
We have two problems at this stage:
– In reception, we don’t know how to avoid the copy from the miniport reception buffer to our buffer (the one shared between the lightweight filter and the application). Specifically, we would like to avoid calling NdisGetDataBuffer to copy bytes from the miniport buffer (inaccessible to us) to ours.
– In transmission, on Windows Server 2012 or 2012 R2 (it doesn’t happen on Windows 10), send operations randomly fail when we send a frame that is located past the first 100 KB of the shared buffer. We have a single MDL that describes the whole buffer, and we use NET_BUFFER and NET_BUFFER_LIST structures to build a view of a portion of the buffer. That view is what we pass to NDIS for sending. We know of the problem because the NET_BUFFER_LIST passed to our FILTER_SEND_NET_BUFFER_LISTS_COMPLETE callback routine carries a Status other than NDIS_STATUS_SUCCESS (usually NDIS_STATUS_FAILURE). The error code is unspecific.
What follows is a more detailed description of the sending process:
- The application sends an IOCTL to the driver requesting a Tx channel.
- The driver reserves some non-paged memory pages with MmAllocatePagesForMdlEx, maps them to virtual addresses in both user and kernel space, and returns the information to the application in response to the IOCTL. We've also tried (with similar results) reserving memory from the non-paged pool using NdisAllocateMemoryWithTagPriority and then using IoAllocateMdl and MmBuildMdlForNonPagedPool to create an MDL describing the memory.
- The buffer is large enough to hold many messages (each with its preamble), so we can send batches of messages instead of individual ones. A number of NetBufferLists are pre-allocated for this purpose, all of them pointing to the same MDL.
- When the application needs to send a message, the message is copied to the shared buffer together with a self-defined preamble containing the message length and a few control flags.
- An IOCTL is sent to the driver to indicate the presence of data in the shared buffer.
- The driver starts reading messages from the buffer and mapping them to the pre-allocated NetBufferLists (taking care that the Offset, Length and Next properties of the NetBufferLists and NetBuffers contain correct values).
- The complete NetBufferList chain is passed to NDIS for sending.
- When NDIS invokes the SendNetBufferListsCompleteHandler, the NetBufferLists are returned to the pool and the preambles of the messages in the buffer are flagged as success/error depending on the status reported for the operation.
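To make the buffer walk in the steps above concrete, here is a minimal user-mode sketch of it, not our actual driver code. The preamble layout (`frame_preamble`), the flag value, the 8-byte alignment and the helper names `put_frame` and `collect_views` are all assumptions made up for illustration; the only thing taken from the description is that each message is preceded by a preamble holding its length and some control flags. Each resulting `frame_view` is the (offset, length) pair that would be written into a pre-allocated NET_BUFFER (DataOffset/CurrentMdlOffset and DataLength) relative to the single MDL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical preamble: the real one only needs a length and some flags. */
typedef struct {
    uint32_t length; /* frame length in bytes, preamble excluded */
    uint32_t flags;  /* control/status flags */
} frame_preamble;

#define FLAG_PENDING 0x1u /* hypothetical: set by the application */

/* Offset and length of one frame relative to the start of the buffer. */
typedef struct {
    size_t offset;
    size_t length;
} frame_view;

/* Round up so that every preamble starts on an 8-byte boundary. */
static size_t align8(size_t n) { return (n + 7u) & ~(size_t)7u; }

/* Application side: append one frame at pos.
   Returns the next write position, or 0 if the frame does not fit. */
size_t put_frame(uint8_t *buf, size_t buf_size, size_t pos,
                 const void *frame, uint32_t len)
{
    size_t need = align8(sizeof(frame_preamble) + len);
    frame_preamble p = { len, FLAG_PENDING };

    if (pos > buf_size || need > buf_size - pos)
        return 0;
    memcpy(buf + pos, &p, sizeof p);
    memcpy(buf + pos + sizeof p, frame, len);
    return pos + need;
}

/* Driver side: walk the preambles in [0, end) and record one view per frame.
   Lengths come from shared memory, so they are validated before use. */
size_t collect_views(const uint8_t *buf, size_t end,
                     frame_view *views, size_t max_views)
{
    size_t pos = 0, n = 0;

    while (n < max_views && pos < end &&
           end - pos >= sizeof(frame_preamble)) {
        frame_preamble p;
        size_t data;

        memcpy(&p, buf + pos, sizeof p);
        data = pos + sizeof p;
        if (p.length == 0 || p.length > end - data)
            break; /* corrupt or truncated entry: stop the batch */
        views[n].offset = data;
        views[n].length = p.length;
        n++;
        pos = align8(data + p.length);
    }
    return n;
}
```

In the driver, `views[i].offset` and `views[i].length` would be applied to the i-th pre-allocated NET_BUFFER, and the NET_BUFFER_LISTs would be chained through their Next fields before being passed to NDIS.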
This system seems to work fine on Windows 10. On Windows Server 2012, everything works fine for buffer sizes up to 100 KB, but if we try to use larger buffers, there appears to be a problem sending messages stored beyond this point. Upon completion, the status of these NetBufferLists just indicates a generic failure (0xC0000001).
Basically, for a buffer of 131072 bytes it works, for a buffer of 262144 it doesn't.
We understand how strange this sounds and we've tried experimenting by removing the interaction with the application layer (just in case) and the results were the same. We've also tried reserving a huge memory buffer and then mapping only a 100K section of that buffer for send operations and in this experiment everything works fine.
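As a diagnostic aid for comparing working and failing sends, the following sketch computes where a frame view falls in terms of pages of the buffer. It is plain arithmetic, not a claim about the cause of the failures. It assumes 4 KB pages and a page-aligned buffer; `SKETCH_PAGE_SIZE`, `page_span`, `pages_in_buffer` and `locate_frame` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KB pages */

typedef struct {
    size_t first_page;    /* index of the page holding the first byte */
    size_t page_offset;   /* byte offset of the frame inside that page */
    size_t pages_spanned; /* number of pages the frame touches */
} page_span;

/* Number of pages needed to describe a page-aligned buffer. */
size_t pages_in_buffer(size_t buf_size)
{
    return (buf_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Locate a frame given its byte offset and length within the buffer. */
page_span locate_frame(size_t offset, size_t length)
{
    page_span s;

    assert(length > 0);
    s.first_page = offset / SKETCH_PAGE_SIZE;
    s.page_offset = offset % SKETCH_PAGE_SIZE;
    s.pages_spanned =
        (s.page_offset + length + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
    return s;
}
```

Under these assumptions the 131072-byte buffer covers 32 pages and the 262144-byte buffer covers 64. Logging `first_page` and `pages_spanned` for each completed NetBufferList, next to its status, would show whether the failures correlate with a particular page index or with frames that straddle a page boundary.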
We also understand that the buffer used to send a message via NDIS must only be large enough to hold the message being sent, which for an Ethernet frame would be around 1500 bytes. Our buffer is larger because we want the application to be able to go on producing frames on free portions of the shared buffer while previous frames are being sent.
For the first problem, our question is: can we avoid extra copies on the receive path of an NDIS lightweight filter driver? For example, could the filter driver interact directly with the DMA process (setting the DMA buffer itself, or something like that) and then share that buffer with the application? We are trying to build a zero-copy solution.
For the second problem, our questions are: Is it possible that NDIS has problems handling messages inside NetBufferLists that reference large MDLs? Has anyone found anything like this before or has any clues on where the problem might be?
Thank you very much.
J.M.