An LWF can meet those requirements, but an NDIS protocol driver can do it a little more easily. (It’s a little clunky to mess with the packet filter from an LWF.) Usually we’d recommend an LWF for a packet-capturing solution, because you also need to observe the packets that your OS is transmitting. In this case, it sounds like the OS isn’t originating any packets on these network interfaces, so there’s no need to drag in the observational powers of an LWF. (Unless you want your own monitoring LWF for other projects.)
The only reason you would *need* an LWF here is to strictly(*) enforce the “passive” requirement. If you want a guarantee that nobody else on the box can send any packets on the link, then an LWF is the way to go. Note that reflecting traffic and denying writes are orthogonal purposes, so it might be a good idea to create two drivers: a protocol driver to reflect traffic and an LWF to deny writes.
(*) Note that Windows does not forbid NICs from emitting packets on their own initiative. That is, even if the OS never asks the NIC to send a packet, the NIC might still originate one or two packets on its own. 802.3 PAUSE frames (for flow control) are commonly originated by the hardware, without any involvement from the OS. But if you are worried about this, then you probably wouldn’t be investigating a “software” tap in the first place.
Anyway, an NDIS protocol driver (or LWF) can definitely do this, and is probably the right layer to solve this sort of problem.
You’ll definitely need to put the RX NIC into promiscuous mode to get all the traffic. Note that once you go into p-mode, the NIC is allowed to indicate up malformed Ethernet frames (like a runt packet, or one with a bad FCS), although not all NICs actually do. You can decide whether you want to attempt to retransmit a malformed frame on the other NIC, but the other NIC is most likely going to drop the packet, or try to “fix” it (e.g., by adding padding to an otherwise undersized packet). Depending on the NIC vendor, it might not be possible to preserve malformed packets.
You might also need to set the TX NIC into promiscuous mode, so it doesn’t mess with the source MAC address.
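To make the filter idea concrete, here’s a sketch of the packet-filter flag values involved. The constants mirror the ones in ndis.h (shown here in Python purely for illustration; a real protocol driver is C code that sets the filter with an OID_GEN_CURRENT_PACKET_FILTER request — verify the values against your own WDK headers):

```python
# Packet-filter flag values, mirrored from ndis.h for illustration.
NDIS_PACKET_TYPE_DIRECTED    = 0x00000001
NDIS_PACKET_TYPE_MULTICAST   = 0x00000002
NDIS_PACKET_TYPE_BROADCAST   = 0x00000008
NDIS_PACKET_TYPE_PROMISCUOUS = 0x00000020

# The RX NIC needs everything on the wire, including frames not
# addressed to the local MAC:
rx_filter = NDIS_PACKET_TYPE_PROMISCUOUS

# For comparison, a typical non-promiscuous endpoint asks for:
normal_filter = (NDIS_PACKET_TYPE_DIRECTED |
                 NDIS_PACKET_TYPE_MULTICAST |
                 NDIS_PACKET_TYPE_BROADCAST)

print(hex(rx_filter), hex(normal_filter))  # 0x20 0xb
```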
Put aside NDIS minutiae for a moment. You’ll definitely want to sit back and think about the general streams of data, the queues of data, and the threads/CPUs that will service them. A $10 Ethernet card can move 10 times as much data per second as a $100 hard drive can. So there’s a high chance that your solution will, at some point, have more network traffic flowing through it than you can possibly save to the hard drive. The same $10 Ethernet card can also move up to 2 million packets per second, which is quite likely more IOPS than your CPU can handle.
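The packets-per-second figure falls out of simple arithmetic. A rough check, assuming gigabit Ethernet and minimum-size frames (the exact ceiling depends on your link speed and traffic mix):

```python
# Back-of-envelope line-rate math for gigabit Ethernet.
LINK_BITS_PER_SEC = 10**9

# A minimum-size Ethernet frame is 64 bytes, and each frame also
# costs 8 bytes of preamble/SFD plus a 12-byte inter-frame gap
# on the wire.
MIN_FRAME_WIRE_BYTES = 64 + 8 + 12  # = 84

max_pps = LINK_BITS_PER_SEC / (MIN_FRAME_WIRE_BYTES * 8)
print(f"{max_pps:,.0f} packets/sec")  # 1,488,095 packets/sec
```

So even an ordinary gigabit link can throw roughly 1.5 million packets per second at you in the worst case, before you consider faster links.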
There are only a few strategies for solving this bottleneck:
- Throttle the network traffic down to a bandwidth that your storage can handle and an IOPS rate that your CPU can handle
- Make peace with the fact that you’ll have to drop traffic in some cases
- Buy more expensive CPUs and storage
- Document that your solution “doesn’t support” traffic that exceeds a certain threshold; say “I told you so” if the threshold is exceeded
I’m not aware of other solutions. It’s good to pick your strategy up-front, since that dictates some of your software architecture.
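If you choose the “drop traffic” strategy, the simplest form is a bounded queue with tail drop: when storage can’t keep up, newly arriving packets are counted and discarded instead of blocking the receive path. A minimal userland sketch of that idea (class name and sizes are hypothetical; a driver would do this with its own lookaside lists and NBL queues):

```python
from collections import deque

class BoundedCaptureQueue:
    """Bounded queue between the receive path and storage.
    When full, new packets are tail-dropped and counted."""

    def __init__(self, capacity):
        self.q = deque()
        self.capacity = capacity
        self.dropped = 0

    def enqueue(self, pkt):
        if len(self.q) >= self.capacity:
            self.dropped += 1   # tail drop: the newest packet loses
            return False
        self.q.append(pkt)
        return True

    def dequeue(self):
        return self.q.popleft() if self.q else None

q = BoundedCaptureQueue(capacity=4)
for seq in range(6):       # a burst of 6 packets into a 4-slot queue
    q.enqueue(seq)
print(q.dropped)           # 2
```

Tail drop is easy to implement but produces bursty loss under sustained overload; a random or head-drop policy spreads the pain differently, which is part of the “drop strategy” decision discussed below.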
One thing you DON’T want to do is to casually conclude that “multithreading will surely solve this”. The problem is that, while parallel processing will likely reduce your CPU usage a bit, you start running a high risk of re-ordering the packet stream. If your goal is to be a (mostly) transparent bump-in-the-wire, reordering packets is bad behavior. Even if you aren’t strict about the passive requirement, reordering packets can increase CPU usage or even cause application errors on other nodes in the network. (It’s generally an app bug if the app assumes that UDP packets are delivered perfectly in order, but you don’t want to be the one who exposes the bug.)
That’s not to say that you can’t use threads. They might be a great solution. But you have to think about the flow of data from the RX NIC to your storage system and to the TX NIC. Where are the queues? Where will the bottlenecks be? What ensures the packets are re-transmitted in order? Where do packets get dropped? What is your drop strategy (random drops, bursty drops, etc.)? If you intend to apply back-pressure, how do you do that (PAUSE frames, DCBX, something else)? Once you have the flows and queues drawn out on paper, you can start thinking about what sort of threading you’ll use to service these.
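One threading shape that sidesteps the reordering problem is a single receive thread feeding a single transmit thread through one bounded FIFO: order is preserved end-to-end, and the bounded queue is your back-pressure (or drop) point. A toy userland sketch of the idea, with integers standing in for packets (fanning the same stream out to a pool of workers would lose this ordering guarantee unless you add a resequencing stage):

```python
import queue
import threading

fifo = queue.Queue(maxsize=1024)   # bounded: this is the back-pressure point
transmitted = []

def rx_thread():
    for seq in range(1000):        # stand-in for NIC receive indications
        fifo.put(seq)              # blocks when full -> back-pressure
    fifo.put(None)                 # end-of-stream marker

def tx_thread():
    while True:
        pkt = fifo.get()
        if pkt is None:
            break
        transmitted.append(pkt)    # stand-in for sending on the TX NIC

t_rx = threading.Thread(target=rx_thread)
t_tx = threading.Thread(target=tx_thread)
t_rx.start(); t_tx.start()
t_rx.join(); t_tx.join()

print(transmitted == sorted(transmitted))  # True: order preserved
```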