NDIS passive network tap

Problem: I need to *passively* monitor a network link. Basically I’m looking to create a software-based network tap to use in place of a dedicated hardware tap (obviously there is still hardware involved, just less of it).

In my mind I would just need a system with two NICs and then create an NDIS LWF to attach to those two NICs. Then I would just need to copy the data from one NIC to the other (all while making a copy for my other purposes). However, I’m sure I am oversimplifying things.

So before I start banging out some code, I’m hoping to get a little design validation and identify any “gotchas” that something like this might entail. On heavily utilized links, packet loss or noticeable latency would be a concern, so are there any optimization strategies to make it as efficient as possible?

It’s unfortunate that Thomas is no longer with us because this would be a question right up his alley.

A LWF can meet those requirements, but an NDIS protocol driver can do it a little easier. (It’s a little clunky to mess with the packet filter from a LWF.) Usually we’d recommend a LWF for a packet-capturing solution because you also need to observe the packets that your OS is transmitting. In this case, it sounds like the OS isn’t originating any packets on these network interfaces, so there’s no need to drag in the observational powers of a LWF. (Unless you want your own monitoring LWF for other projects.)

The only reason you would *need* a LWF here is if you want to strictly(*) enforce the “passive” requirement. If you want a guarantee that nobody else on the box can send any packets on the link, then a LWF is the way to go. Note that reflecting traffic & denying writes are orthogonal purposes. It might be a good idea to create two drivers: a protocol to reflect traffic and a LWF to deny writes.

(*) Note that Windows does not *require* NIC drivers to not emit packets on their own initiative. That is, even if the OS never asks the NIC to send a packet, the NIC might still originate one or two packets. 802.3 PAUSE frames (for flow control) are commonly originated by hardware, without the involvement of the OS. But if you are worried about this, then you probably wouldn’t be investigating a “software” tap.
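
If you do build the deny-writes LWF mentioned above, its core is small. Here is a minimal sketch, assuming the standard NDIS 6.x filter layout; MY_FILTER_CONTEXT and its NdisFilterHandle field are placeholders for your own filter context. It simply fails every send instead of passing it down the stack:

```c
#include <ndis.h>

// Placeholder filter context; a real LWF stores the handle it got back from
// FilterAttach here.
typedef struct _MY_FILTER_CONTEXT {
    NDIS_HANDLE NdisFilterHandle;
} MY_FILTER_CONTEXT, *PMY_FILTER_CONTEXT;

// Send handler that refuses every send, enforcing "nobody transmits on this
// link". Registered via the SendNetBufferListsHandler of the filter's
// NDIS_FILTER_DRIVER_CHARACTERISTICS.
VOID
FilterSendNetBufferLists(
    _In_ NDIS_HANDLE        FilterModuleContext,
    _In_ PNET_BUFFER_LIST   NetBufferLists,
    _In_ NDIS_PORT_NUMBER   PortNumber,
    _In_ ULONG              SendFlags
    )
{
    PMY_FILTER_CONTEXT  filter = (PMY_FILTER_CONTEXT)FilterModuleContext;
    PNET_BUFFER_LIST    nbl;
    ULONG               completeFlags = 0;

    UNREFERENCED_PARAMETER(PortNumber);

    // Mark every NBL as failed instead of calling NdisFSendNetBufferLists.
    for (nbl = NetBufferLists; nbl != NULL; nbl = NET_BUFFER_LIST_NEXT_NBL(nbl))
    {
        NET_BUFFER_LIST_STATUS(nbl) = NDIS_STATUS_FAILURE;
    }

    if (NDIS_TEST_SEND_FLAG(SendFlags, NDIS_SEND_FLAGS_DISPATCH_LEVEL))
    {
        completeFlags |= NDIS_SEND_COMPLETE_FLAGS_DISPATCH_LEVEL;
    }

    NdisFSendNetBufferListsComplete(filter->NdisFilterHandle,
                                    NetBufferLists,
                                    completeFlags);
}
```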

Anyway, an NDIS protocol driver (or LWF) can definitely do this, and is probably the right layer to solve this sort of problem.

You’ll definitely need to set the RX NIC into promiscuous mode to get all the traffic. Note that once you go into p-mode, the NIC is allowed to indicate up malformed Ethernet frames (like a runt packet, or one with a bad FCS). Not all NICs do this. You can decide whether you want to attempt to retransmit the malformed frame on the other NIC, but the other NIC is most likely going to drop the packet, or try to “fix” it (e.g., by adding padding to an otherwise undersized packet). Depending on the NIC vendor, it might not be possible to preserve malformed packets.

You might also need to set the TX NIC into promiscuous mode, so it doesn’t mess with the source MAC address.
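
For reference, putting a binding into promiscuous mode from a protocol driver is a single OID set. A minimal sketch, assuming an NDIS 6.x binding handle from NdisOpenAdapterEx and a ProtocolOidRequestComplete handler for the pending case:

```c
#include <ndis.h>

// Hedged sketch: switch a protocol binding to promiscuous mode with one OID
// set of OID_GEN_CURRENT_PACKET_FILTER.
NDIS_STATUS
SetPromiscuousMode(
    _In_ NDIS_HANDLE BindingHandle
    )
{
    // Both the request and the filter value must stay valid until the request
    // completes; a real driver keeps them in its per-binding context. They are
    // static here only to keep the sketch short.
    static ULONG            PacketFilter = NDIS_PACKET_TYPE_PROMISCUOUS;
    static NDIS_OID_REQUEST OidRequest;

    NdisZeroMemory(&OidRequest, sizeof(OidRequest));
    OidRequest.Header.Type     = NDIS_OBJECT_TYPE_OID_REQUEST;
    OidRequest.Header.Revision = NDIS_OID_REQUEST_REVISION_1;
    OidRequest.Header.Size     = NDIS_SIZEOF_OID_REQUEST_REVISION_1;
    OidRequest.RequestType     = NdisRequestSetInformation;
    OidRequest.DATA.SET_INFORMATION.Oid = OID_GEN_CURRENT_PACKET_FILTER;
    OidRequest.DATA.SET_INFORMATION.InformationBuffer = &PacketFilter;
    OidRequest.DATA.SET_INFORMATION.InformationBufferLength = sizeof(PacketFilter);

    // NDIS_STATUS_PENDING means the result arrives in ProtocolOidRequestComplete.
    return NdisOidRequest(BindingHandle, &OidRequest);
}
```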

Put aside NDIS minutiae for a moment. You’ll definitely want to sit back and think about the general streams of data, the queues of data, and the threads/CPUs that will service them. A $10 Ethernet card can move 10 times as much data per second as a $100 hard drive can. So there’s a high chance that your solution will, at some point, have more network traffic flowing through it than you can possibly save to the hard drive. The same $10 Ethernet card can also move up to 2 million packets per second, which is quite likely more IOPS than your CPU can handle.

There are only a few strategies for dealing with this bottleneck:

  • Throttle the network traffic down to a bandwidth that your storage can handle and a packet rate that your CPU can handle
  • Make peace with the fact that you’ll have to drop traffic in some cases
  • Buy more expensive CPUs and storage
  • Document that your solution “doesn’t support” traffic that exceeds a certain threshold; say “I told you so” if the threshold is exceeded

I’m not aware of other solutions. It’s good to pick your strategy up-front, since that dictates some of your software architecture.

One thing you DON’T want to do is to casually conclude that “multithreading will surely solve this”. The problem is that, while parallel processing will likely reduce your CPU usage a bit, you start running a high risk of re-ordering the packet stream. If your goal is to be a (mostly) transparent bump-in-the-wire, reordering packets is a bad behavior. Even if you aren’t strict about the passive requirement, reordering packets can increase CPU usage or even cause application errors on other nodes in the network. (It’s generally an app bug if the app assumes that UDP packets are delivered perfectly in order, but you don’t want to be the one who exposes the bug.)

That’s not to say that you can’t use threads. They might be a great solution. But you have to think about the flow of data from the RX NIC to your storage system, to the TX NIC. Where are the queues? Where will the bottlenecks be? What ensures the packets are re-transmitted in-order? Where do packets get dropped? What is your drop strategy (random drops, bursty drops, etc)? If you intend to apply back-pressure, how do you do that (PAUSE frames, DCBX, something else)? Once you have the flows and queues drawn out on paper, you can start thinking about what sort of threading you’ll use to service these.
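
To make the queueing question concrete, here is one illustrative shape for the RX-to-TX/storage hand-off: a bounded single-producer/single-consumer ring with an explicit drop counter. This is a sketch only; a real implementation needs memory barriers or interlocked operations, and the descriptors would reference cloned NET_BUFFER_LISTs or copied payloads:

```c
/*
 * Illustrative sketch only: a bounded single-producer/single-consumer ring of
 * packet descriptors with a "drop when full" policy. PACKET_REF and RING are
 * hypothetical names; real code also needs memory barriers around Head/Tail.
 */
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 4096               /* power of two so masking works */

typedef struct { void *Nbl; uint32_t Length; } PACKET_REF;

typedef struct {
    PACKET_REF Slot[RING_SLOTS];
    volatile uint32_t Head;           /* written only by the producer (RX)      */
    volatile uint32_t Tail;           /* written only by the consumer (TX/disk) */
    uint64_t Dropped;                 /* count of packets shed under load       */
} RING;

/* Producer side (RX indication): enqueue or drop, never block. */
static bool RingPush(RING *r, PACKET_REF p)
{
    uint32_t next = (r->Head + 1) & (RING_SLOTS - 1);
    if (next == r->Tail) {            /* queue full: shed load here, visibly */
        r->Dropped++;
        return false;
    }
    r->Slot[r->Head] = p;
    r->Head = next;                   /* publish after the slot is written */
    return true;
}

/* Consumer side (TX / storage thread): drain in FIFO order so packets stay ordered. */
static bool RingPop(RING *r, PACKET_REF *out)
{
    if (r->Tail == r->Head)
        return false;                 /* empty */
    *out = r->Slot[r->Tail];
    r->Tail = (r->Tail + 1) & (RING_SLOTS - 1);
    return true;
}
```

One ring per direction (left-to-right and right-to-left), each with exactly one producer and one consumer, keeps the ordering question trivial; the drop counter also makes your shed-load policy observable rather than silent.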

Thanks for the awesome answer Jeff.

I mentioned a LWF just because I already developed a pretty good multi-NIC p-mode capture filter, so I assumed that extending it wouldn’t be much work. I also like LWFs better than protocol drivers because they are lower level, which gives more control for some of the reasons you mentioned. The box will be dedicated to this capture purpose, so I don’t have to worry about a user trying to originate packets, although the OS itself still has various networking components doing things. I’ll just have to disable all of those services to reduce the strain on the filter.

Strict passivity is not an absolute requirement, since this will be used in a permissive environment, but I just don’t want the system to be directly addressable. So a few flow-control frames aren’t a deal breaker, although if I can find hardware that doesn’t automatically do things on its own, that would be nice. Figuring out which vendors try to “fix” bad packets would also be a benefit. Ideally I just want to be a transparent bump; bad frames and all. In your experience, do vendors post that sort of technical detail anywhere, or will it be trial and error? Or I guess being a MSFT guy means you can just ask them.

You mention that I “might also need to set the TX NIC into promiscuous mode.” I was thinking that both NICs will be acting as Rx and Tx depending on where the packet comes from (either from the wire or from the buffer passed from the other NIC), so both will have to be in promiscuous mode. Am I not thinking about this correctly? I don’t quite understand why both wouldn’t have to be in p-mode.

For hardware I was planning on using a RAID-0 SSD config, which should provide sufficient I/O bandwidth for a gigabit link even at max utilization. For my requirements the data isn’t sticking around long, so I should be good on disk space, and my post-processing thread can hopefully burn through the data before things start getting overwritten or dropped.
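
As a rough sanity check on that: a saturated gigabit link is about 125 MB/s of payload, or roughly 450 GB per hour plus whatever per-packet capture metadata I record, so the RAID-0 set just has to sustain that write rate while the post-processing thread reads back.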

The threading issue was something I was thinking about, but I don’t believe that multithreading is the answer. Not only would it be complicated to get right, as you mention, but context switches are expensive and a thread sitting idle does no good. My thinking was more along the lines of one thread per core per functional area. So on my quad-core test system I set one thread/core for rx operations, one for tx, one for post-processing, and then let the OS have the other core for its own things. Although the issue there is I can’t really control the OS’s thread affinity. At more than 4 cores I could add more processing threads, as that doesn’t need to be done in order, but that obviously doesn’t help any rx/tx issues that could arise.
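
Roughly what I have in mind for pinning the per-area threads looks like this (a sketch only; RxWorker and WORKER_CTX are placeholders, and a real worker would block on an event signaled by the RX path rather than polling):

```c
#include <ntddk.h>

/* Hypothetical per-worker context: which CPU to pin to and a stop flag. */
typedef struct { ULONG CpuIndex; volatile LONG Stop; } WORKER_CTX;

VOID RxWorker(_In_ PVOID StartContext)
{
    WORKER_CTX   *ctx = (WORKER_CTX *)StartContext;
    LARGE_INTEGER interval;

    interval.QuadPart = -10 * 1000;   /* 1 ms, relative, in 100 ns units */

    /* Pin this system thread to one processor so RX servicing stays ordered
       and cache-warm on that core. */
    KeSetSystemAffinityThreadEx((KAFFINITY)1 << ctx->CpuIndex);

    while (!ctx->Stop) {
        /* Drain the RX queue into the TX/storage queues here. Real code would
           wait on an event signaled by the RX path instead of polling. */
        KeDelayExecutionThread(KernelMode, FALSE, &interval);
    }

    PsTerminateSystemThread(STATUS_SUCCESS);
}

NTSTATUS StartRxWorker(_In_ WORKER_CTX *ctx)
{
    HANDLE   threadHandle;
    NTSTATUS status = PsCreateSystemThread(&threadHandle,
                                           THREAD_ALL_ACCESS,
                                           NULL, NULL, NULL,
                                           RxWorker, ctx);
    if (NT_SUCCESS(status)) {
        /* A real driver would keep a referenced thread object around so it
           can wait for the worker to exit at unload time. */
        ZwClose(threadHandle);
    }
    return status;
}
```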

Regardless, it’s going to be baby steps first to get it working, and then test it under load to figure out what the stress points are. My multi-NIC capture driver mentioned above hasn’t had a dropped-packet issue yet, so in the environment I’ll be deploying this idea in, I don’t anticipate too many problems. Although I would like to have it as robust as possible, hence the reason I asked for feedback in the first place.

So again, thank you for your response.

So you want to build a transparent bridge with a capture port?

Windows includes an Ethernet bridge component that will bind exclusively to the two NICs and shuffle packets between them quite efficiently. The bridge will also expose a (virtual) NIC interface to the host, on which you can bind a protocol listening in promiscuous mode and see all of the traffic flowing through the bridge.

You could in fact just use WinPCAP to capture all traffic from this arrangement.

Perhaps I am missing some detail here, but it seems like being a bump-in-the-wire packet capture system is pretty straightforward, and very little (if any) KM software is required beyond in-box capabilities, vendor drivers, and some packet capture stack.

I have this very setup in my lab to monitor a link inline because I am too cheap to buy a HW monitor. :-)

BTW, I think the reason Jeffery said that both the TX and RX NICs need to be in promiscuous mode is because this thing is symmetric, right? They are not TX and RX but more like ‘left’ and ‘right’. That, and sending a frame with a SRCMAC that is not the current MAC address of a NIC may get filtered out as ‘bogus’ if the binding is not in promiscuous mode.

Good Luck,
Dave Cattley

One more thing that came to mind …

If you are going to use your LWF capture driver (no reason not to if it works) and bind it to both ports, then you will want to be sure to ignore ‘loopback’ packets (NetBufferLists with NDIS_NBL_FLAGS_IS_LOOPBACK_PACKET) that result from a TX, as you will have already seen that frame on the RX of the other NIC. You probably already knew that, but I just thought I would put that out there too.

If you are going to build the ‘bridge’ function then this is really important so that you don’t generate an infinite packet duplicator.
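
Something like this in the receive handler is all it takes (a rough sketch; MY_FILTER_CONTEXT and CapturePacket are placeholders, the flag test is the only real point):

```c
#include <ndis.h>

/* Placeholder filter context holding the handle returned by FilterAttach. */
typedef struct _MY_FILTER_CONTEXT {
    NDIS_HANDLE NdisFilterHandle;
} MY_FILTER_CONTEXT, *PMY_FILTER_CONTEXT;

VOID CapturePacket(_In_ PMY_FILTER_CONTEXT Filter, _In_ PNET_BUFFER_LIST Nbl);

VOID
FilterReceiveNetBufferLists(
    _In_ NDIS_HANDLE        FilterModuleContext,
    _In_ PNET_BUFFER_LIST   NetBufferLists,
    _In_ NDIS_PORT_NUMBER   PortNumber,
    _In_ ULONG              NumberOfNetBufferLists,
    _In_ ULONG              ReceiveFlags
    )
{
    PMY_FILTER_CONTEXT  filter = (PMY_FILTER_CONTEXT)FilterModuleContext;
    PNET_BUFFER_LIST    nbl;

    for (nbl = NetBufferLists; nbl != NULL; nbl = NET_BUFFER_LIST_NEXT_NBL(nbl))
    {
        /* Loopback copies of our own sends were already seen on the other
           port's RX; don't capture (or re-forward) them a second time. */
        if (NdisTestNblFlag(nbl, NDIS_NBL_FLAGS_IS_LOOPBACK_PACKET))
        {
            continue;
        }

        CapturePacket(filter, nbl);     /* placeholder for the capture path */
    }

    /* Pass everything up unchanged; this filter only observes. */
    NdisFIndicateReceiveNetBufferLists(filter->NdisFilterHandle,
                                       NetBufferLists,
                                       PortNumber,
                                       NumberOfNetBufferLists,
                                       ReceiveFlags);
}
```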

Good Luck,
Dave Cattley

Thanks Dave. My only issue with bridging is that it’s a bit noisier than I’m looking for, unless there is a way to disable the STP messages. I suppose I could just block those with a filter, but I’m unsure how that would impact the actual bridging operation.

But at least we are both on the same page with being too cheap to buy a hardware device :-)

> Problem: I need to *passively* monitor a network link.

For passive monitoring (without the ability to alter the traffic) you can just create a NETMON-style protocol driver, which will turn promiscuous mode on and listen for all packets.

No need for filters.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com