@brad_H said:
@MBond2 said:
How have you concluded that your scanning is not a significant cause of performance reduction? What specific measurements have you made, or what analysis have you done?
Perhaps your logic is that that part can't be made to go any faster, so it must not be a problem. If the analysis is for TCP data, and a significant window into the stream is required, you are going to introduce significant latency. That latency is going to have a real effect on the way the layers above you respond.
@Tim_Roberts said:
And as I said, the user-mode inspection is optimized and fast,
Doesn't matter how fast it is if it's only getting called "a few times a second". Every interval you delay adds latency to the packets. Also, you are ignoring the time it takes to transition between kernel mode, where the packets arrive, and user mode, where they get reviewed, and back to kernel mode, where they'll be approved/declined.
I measured it with ProcMon; the user-mode checking takes around 10% of the time, and most of the work is happening in the kernel.
My main question right now is: what is the most efficient way of storing these packets and transmitting them to user mode, and then of sending the ones marked OK by user mode back out through the NDIS driver?
ProcMon isn't the way to measure kernel-space timings … it's limited to user-mode applications. There are ways to get timings in kernel mode; I'd do some searching on this list for "precision timings".
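For example, a minimal in-kernel timing sketch built on KeQueryPerformanceCounter (ScanPacket here is just a placeholder for whatever inspection routine is being measured):

```c
#include <ntddk.h>

VOID ScanPacket(const UCHAR *Buffer, ULONG Length); // placeholder: the routine under test

// Time one pass over a packet buffer, entirely in kernel mode.
// KeQueryPerformanceCounter returns ticks; the out-parameter gives ticks/second.
VOID MeasureScanTime(_In_reads_bytes_(Length) const UCHAR *Buffer, _In_ ULONG Length)
{
    LARGE_INTEGER freq, start, end;

    start = KeQueryPerformanceCounter(&freq);

    ScanPacket(Buffer, Length);

    end = KeQueryPerformanceCounter(NULL);

    // Convert elapsed ticks to microseconds before logging.
    LONGLONG elapsedUs = (end.QuadPart - start.QuadPart) * 1000000 / freq.QuadPart;
    DbgPrint("Scan took %lld us\n", elapsedUs);
}
```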
The answer to your question isn't as clear-cut as it looks, because it's not clear who's doing what …
You've got data coming in from the NIC, which is going to be in the context of the OS as it works its way from ISR to DPC. Now the data is sitting in a preSniffed packet buffer queue in the driver, waiting for some thread to scoop it out …
You've got a user-mode service running, with some threads waiting for something to do …
So what it sounds like you're going to do (and this is all a wild guess) is have a thread in the service post an inverted call to the driver, waiting for the packet buffer to have something to look at. The driver moves a (single) packet from the packet buffer into the inverted call's buffer and completes that call, moving the packet into the service, where it is inspected. If all is good, the thread then makes another inverted call into the driver (this time passing the packet, or a packet ID, back) and waits for the next packet to be looked at (there's a sketch of this round trip below).
The driver takes the sniffed packet (or packet ID) from the inverted call and puts it into a postSniffed packet queue.
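If that guess is right, the two queues are nothing more than spinlock-protected linked lists, something like this (all names invented for illustration):

```c
#include <ntddk.h>

// Hypothetical per-packet bookkeeping entry.
typedef struct _SNIFF_PACKET {
    LIST_ENTRY Link;       // queue linkage
    ULONG      PacketId;
    ULONG      Length;
    UCHAR      Data[1];    // variable-length packet bytes follow
} SNIFF_PACKET;

typedef struct _SNIFF_QUEUE {
    LIST_ENTRY Head;
    KSPIN_LOCK Lock;       // touched at DISPATCH_LEVEL (the receive DPC) and below
} SNIFF_QUEUE;

static SNIFF_QUEUE PreSniffed;    // filled from the receive path
static SNIFF_QUEUE PostSniffed;   // filled when the verdict comes back

VOID SniffQueueInit(SNIFF_QUEUE *q)
{
    InitializeListHead(&q->Head);
    KeInitializeSpinLock(&q->Lock);
}

VOID SniffQueuePush(SNIFF_QUEUE *q, SNIFF_PACKET *p)
{
    ExInterlockedInsertTailList(&q->Head, &p->Link, &q->Lock);
}

SNIFF_PACKET *SniffQueuePop(SNIFF_QUEUE *q)
{
    PLIST_ENTRY e = ExInterlockedRemoveHeadList(&q->Head, &q->Lock);
    return e ? CONTAINING_RECORD(e, SNIFF_PACKET, Link) : NULL;
}
```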
At some point another driver in the network stack (and it needs to be at the kernel level, because Winsock Kernel exists, and if you don't handle that then bypassing your packet sniffer is not only trivial, it's expected behaviour) pulls packets from the postSniffed packet queue.
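Here's a sketch of the user-mode side of that round trip, assuming a hypothetical IOCTL pair (IOCTL_GET_PACKET / IOCTL_RETURN_VERDICT, names invented for illustration, as is InspectPacket):

```c
#include <windows.h>

// Hypothetical control codes; a real driver defines its own.
#define IOCTL_GET_PACKET     CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_RETURN_VERDICT CTL_CODE(FILE_DEVICE_UNKNOWN, 0x801, METHOD_BUFFERED, FILE_ANY_ACCESS)

typedef struct _VERDICT {
    ULONG PacketId;   // ID handed out by the driver with the packet
    ULONG Allow;      // nonzero = forward, zero = drop
} VERDICT;

ULONG InspectPacket(const UCHAR *data, DWORD len); // placeholder: the user-mode scan

// One service thread: pend a "give me a packet" call, inspect, send the verdict back.
DWORD WINAPI SnifferThread(LPVOID ctx)
{
    HANDLE dev = (HANDLE)ctx;
    UCHAR packet[2048];     // one packet per round trip, as described above
    DWORD bytes;

    for (;;) {
        // Inverted call: blocks in the driver until a packet is queued.
        if (!DeviceIoControl(dev, IOCTL_GET_PACKET, NULL, 0,
                             packet, sizeof(packet), &bytes, NULL))
            break;

        VERDICT v;
        v.PacketId = *(ULONG *)packet;   // assume the driver prefixes an ID
        v.Allow    = InspectPacket(packet + sizeof(ULONG), bytes - sizeof(ULONG));

        // Second transition: hand the verdict back so the driver can release the packet.
        DeviceIoControl(dev, IOCTL_RETURN_VERDICT, &v, sizeof(v), NULL, 0, &bytes, NULL);
    }
    return 0;
}
```

Note that every single packet costs two full kernel/user round trips in this model, which is exactly the problem.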
Do you see how long and tortuous a journey every single packet is going to have to make, and how long all of those kernel-to-user-mode transitions are going to take?
Most (actually all, not most) packet introspection happens entirely in the kernel, and that's where you're going to have to put your sniffing. Most (actually all) use system thread pools, and most (actually all) don't do any packet copying; they work on each packet in place, as it is DMA'ed from the NIC or the offload engine.
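To give a sense of what in-kernel, zero-copy inspection looks like, here's a rough sketch of an NDIS lightweight filter receive handler that examines each packet where it sits and drops or forwards it without ever leaving the kernel (error handling, the rest of the filter, and the NDIS_RECEIVE_FLAGS_RESOURCES / chain-splitting rules are omitted; ScanInPlace is a stand-in for real inspection logic):

```c
#include <ndis.h>

BOOLEAN ScanInPlace(const UCHAR *data, ULONG len); // placeholder inspection routine

// Simplified receive handler: inspect in place, no copies, no user-mode trip.
// For brevity this sketch assumes one NET_BUFFER_LIST per indication.
VOID FilterReceiveNetBufferLists(
    NDIS_HANDLE FilterModuleContext,
    PNET_BUFFER_LIST NetBufferLists,
    NDIS_PORT_NUMBER PortNumber,
    ULONG NumberOfNetBufferLists,
    ULONG ReceiveFlags)
{
    PNET_BUFFER nb = NET_BUFFER_LIST_FIRST_NB(NetBufferLists);
    ULONG len = NET_BUFFER_DATA_LENGTH(nb);
    UCHAR *data;

    // Map the packet data where the NIC DMA'ed it; no copy is made when the
    // buffer is contiguous (the common case).
    data = NdisGetDataBuffer(nb, len, NULL, 1, 0);

    if (data != NULL && !ScanInPlace(data, len)) {
        // Verdict: drop. Hand the NBL straight back down; it is never indicated up.
        NdisFReturnNetBufferLists(FilterModuleContext, NetBufferLists, 0);
        return;
    }

    // Verdict: pass. Indicate up the stack unchanged.
    NdisFIndicateReceiveNetBufferLists(FilterModuleContext, NetBufferLists,
                                       PortNumber, NumberOfNetBufferLists,
                                       ReceiveFlags);
}
```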
tl;dr: Over in LinuxLand (where, by its nature, you can read the source code) there are things that do packet sniffing; I would really strongly recommend you look at what they are doing and try to emulate that in your product … you can't cut and paste the code and declare victory [love that, Doron, I'm starting to use that!] and you can't lock a processor at ISR level and keep it there like Linux can, but it will give you an idea of what works for packet scanning. Your approach, unfortunately, is at best going to turn a 10Gb network connection into a 1980s 300-baud modem connection …
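As one concrete example of the Linux approach worth studying, here's a minimal XDP program sketch (eBPF, built with clang/libbpf) that makes a pass/drop decision on every packet inside the kernel, with no copies and no user-mode transitions; the TCP-port check is just a stand-in for real inspection logic:

```c
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

// The verdict is made in the kernel, on the packet where the NIC placed it,
// before the network stack ever sees it.
SEC("xdp")
int sniff_filter(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;                   // too short to judge; let it through

    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_TCP)
        return XDP_PASS;

    struct tcphdr *tcp = (void *)ip + ip->ihl * 4;
    if ((void *)(tcp + 1) > data_end)
        return XDP_PASS;

    // Stand-in "inspection": drop traffic to TCP port 9999.
    if (tcp->dest == bpf_htons(9999))
        return XDP_DROP;

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```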