Using Winsock Kernel for high-performance datagram sockets

Hello,

I’ve written a kernel driver that uses Winsock Kernel (WSK) for sending and receiving multicast datagrams at a very high rate (up to 50,000 datagrams per second). I need this for sending/receiving media streams to/from some embedded devices that do their networking in an FPGA.

My driver seems to work correctly so far, but there are two effects I’d like to solve:

  • From time to time I’m losing some datagrams. They seem to be lost on the sender side, as on two receiving PCs I’m missing exactly the same packets. However, my driver doesn’t report any errors during the sending procedure, and the IRPs of WskSendTo() are completed with a success status (roughly as in the sketch after this list). “From time to time” means that sometimes several minutes or even hours pass without a single packet lost, and sometimes up to ten random packets per second go missing.

  • On one receiving PC, which most of the time doesn’t show any differences from the others, I’m losing many datagrams when I try to receive while listening to audio on the onboard audio card. On my other PCs, listening to audio doesn’t affect receiving the datagrams at all.
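
For context, this is roughly how my send path looks (a heavily simplified sketch; the socket, MDL and remote address handling are placeholders rather than the actual code from my driver):

    #include <ntddk.h>
    #include <wsk.h>

    // Completion routine for the send IRP. Note: a success status here only
    // means the WSK subsystem accepted the buffer - the datagram can still be
    // dropped further down (NIC driver, switch, receiver).
    static NTSTATUS MySendToComplete(
        PDEVICE_OBJECT DeviceObject, PIRP Irp, PVOID Context)
    {
        UNREFERENCED_PARAMETER(DeviceObject);
        UNREFERENCED_PARAMETER(Context);

        if (!NT_SUCCESS(Irp->IoStatus.Status)) {
            DbgPrint("WskSendTo failed: 0x%08X\n",
                     (unsigned int)Irp->IoStatus.Status);
        }
        IoFreeIrp(Irp);
        return STATUS_MORE_PROCESSING_REQUIRED;
    }

    // Send one datagram on a WSK datagram socket.
    NTSTATUS SendOneDatagram(
        PWSK_SOCKET Socket, PMDL Mdl, SIZE_T Length, PSOCKADDR RemoteAddr)
    {
        PWSK_PROVIDER_DATAGRAM_DISPATCH dispatch =
            (PWSK_PROVIDER_DATAGRAM_DISPATCH)Socket->Dispatch;
        WSK_BUF wskBuf = { Mdl, 0, Length };
        PIRP irp = IoAllocateIrp(1, FALSE);

        if (irp == NULL) {
            return STATUS_INSUFFICIENT_RESOURCES;
        }
        IoSetCompletionRoutine(irp, MySendToComplete, NULL, TRUE, TRUE, TRUE);

        // Flags and control info are unused for plain multicast UDP sends.
        return dispatch->WskSendTo(Socket, &wskBuf, 0, RemoteAddr, 0, NULL, irp);
    }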

I tried to improve both effects by changing the size of the sending and receiving socket buffers, but that didn’t work, and I’m experiencing the same problem as described in

http://www.osronline.com/showthread.cfm?link=157209

Unfortunately that thread was never answered.

Should WskGetOption/WskSetOption with SO_RCVBUF/SO_SNDBUF work for datagram sockets? Could this improve the behavior I’m seeing?

Thanks and regards,
Johannes

> - from time to time I’m loosing some datagrams.

So what? This is 100% normal for datagram sockets and UDP. Be prepared.

If you need a lossless protocol, use TCP instead and forget datagrams.

Also, datagrams can arrive out of order.

SO_SNDBUF/SO_RCVBUF really do something for UDP IIRC, but something 100% different from what they do for TCP.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

Do you have flow control enabled on your network?

And by the way, it’s “lose”, not “loose”. “loose” means “not tight”.

Yes, I know UDP is not lossless, but we also need very low latencies (< 10 msec), and in the end I have no choice, as I’m not in a position to change this format. It’s also not easy to argue when other devices handle such traffic without problems. So, is it really 100% normal for datagram sockets? At least I think it’s not normal for this network, as I’ve never seen this packet loss when streaming from embedded device to embedded device. My network is a gigabit network, I’m using a managed Cisco switch, there is no other network traffic, and I’m also checking whether the datagrams arrive out of order.

So I tend to see the problem inside the Windows socket. It was a big step forward for me to use Winsock Kernel instead of user-mode sockets, but I’d like to be as good as the embedded devices, so I’m trying to improve my driver, and one idea was to increase the socket buffer size. Do you think this could help? Where can I find some more information about SO_SNDBUF/SO_RCVBUF for UDP? What could be the reason for STATUS_INVALID_PARAMETER whenever I try to get/set the buffer size (exactly as described in http://www.osronline.com/showthread.cfm?link=157209)? Is there a chance to get some more information about the packet loss from the OS after I’ve invoked WskSendTo() and checked the IRP completion status?
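
For what it’s worth, this is roughly the pattern I use for the option call (a simplified sketch, not my exact code; the event/IRP wait is the usual synchronous WSK idiom, the call assumes PASSIVE_LEVEL, and the buffer size is just an example value):

    #include <ntddk.h>
    #include <wsk.h>

    // Completion routine that just signals the caller's event.
    static NTSTATUS SyncIrpComplete(
        PDEVICE_OBJECT DeviceObject, PIRP Irp, PVOID Context)
    {
        UNREFERENCED_PARAMETER(DeviceObject);
        UNREFERENCED_PARAMETER(Irp);
        KeSetEvent((PKEVENT)Context, IO_NO_INCREMENT, FALSE);
        return STATUS_MORE_PROCESSING_REQUIRED;
    }

    // Set SO_RCVBUF on a WSK socket (SO_SNDBUF and WskGetOption are analogous).
    // The basic dispatch is the first member of the datagram dispatch, so the
    // cast below is valid for a datagram socket.
    NTSTATUS SetReceiveBufferSize(PWSK_SOCKET Socket, ULONG BufferSize)
    {
        PWSK_PROVIDER_BASIC_DISPATCH dispatch =
            (PWSK_PROVIDER_BASIC_DISPATCH)Socket->Dispatch;
        KEVENT event;
        PIRP irp;
        NTSTATUS status;

        KeInitializeEvent(&event, NotificationEvent, FALSE);
        irp = IoAllocateIrp(1, FALSE);
        if (irp == NULL) {
            return STATUS_INSUFFICIENT_RESOURCES;
        }
        IoSetCompletionRoutine(irp, SyncIrpComplete, &event, TRUE, TRUE, TRUE);

        dispatch->WskControlSocket(
            Socket,
            WskSetOption,          // request type
            SO_RCVBUF,             // option to set
            SOL_SOCKET,            // option level
            sizeof(BufferSize),    // input: the new size
            &BufferSize,
            0, NULL, NULL,         // no output for a set
            irp);

        // WSK always completes the IRP, so just wait for it.
        KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL);
        status = irp->IoStatus.Status;  // this is where I see STATUS_INVALID_PARAMETER
        IoFreeIrp(irp);
        return status;
    }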

Thanks and regards,
Johannes

Also keep in mind that when you enable kernel debugging in Windows 2012+, NDIS will disable flow control on all NICs unless you set a special registry value. This may cause you to drop sent packets.

I’m doing the performance tests with a signed release version of my driver on normal W7/64 and W8.1/64 machines, so there shouldn’t be any effects related to debugging flags etc.

If I were able to get some additional information about the reason for the packet loss, perhaps I could fine-tune the NIC according to this article:

http://msdn.microsoft.com/en-us/library/cc558565(v=bts.10).aspx

>we also need very low latencies (< 10 msec)

Is that *milli*seconds or *micro*seconds? What physical layer are you using?

10 msec here is meant to be 10 milliseconds. So it’s not low latency in terms of embedded hardware.

I’m using Winsock Kernel sockets and configure them as raw sockets. So that should be Layer 4 (the transport layer)?

Physical layer means: Is it 100BaseT, 1GBaseT, 10G fiber, 10GBaseT?

I’m using 1GBaseT. So when I’m transferring 48,000 datagrams per second (780 bytes per datagram), that’s 48,000 × 780 bytes × 8 bits ≈ 300 Mbit/s, or about 30% load, and that’s exactly the value shown by Task Manager etc.

Today I did some tests with slightly modified timing inside my driver and with some modified NIC parameters, like offload options and Max Transmit/Send Descriptors and Send Buffers, but so far I could not observe any positive or negative impact, and I’m still losing up to 10 datagrams per second.

Are you testing back to back, or do you have an intermediate switch?

If it’s a managed switch, what do the port statistics show for bad CRCs, dropped frames, issued and received PAUSE frames, etc.?

How many frames are queued for transmission at once?

> I’m using 1GBaseT. So when I’m transferring 48000 datagrams per second
> (size per datagram = 780 Bytes) it’s about 300 MBit or 30% percent load and
> that’s exactly the value shown by the TaskManager etc…

What is the interrupt rate? Are you getting one interrupt per packet?

Do you have a way to change the transfer rate, even just as a test? E.g., what is the packet loss at 30000, 20000, or 10000 packets per second?

James

On 09-Nov-2013 22:40, xxxxx@Freyberger.de wrote:

> I’m using 1GBaseT. So when I’m transferring 48000 datagrams per second (size per datagram = 780 Bytes) it’s about 300 MBit or 30% percent load and that’s exactly the value shown by the TaskManager etc…
>
> Today I did some tests with a slightly modified timing inside my driver and with some modified parameters of the NIC like offload options and Max Transmit/Send Descriptors and Send Buffers but until now I could not watch any positive or negative impact and I’m still losing up to 10 datagrams per second.

In the network connection settings dialog, is the “QoS Packet scheduler”
checkbox checked?

– pa

> Yes, I know UDP is not lossless but we also need very low latencies (< 10 msec)

Don’t you know that a perpetuum mobile is not possible? Remember the law of energy conservation?

Or maybe you know Shannon’s law on information exchange, which follows the same pattern as the 2nd Law of Thermodynamics, just using another notion of entropy?

In short: the combination of hard reliability requirements and hard worst-case latency requirements is an IMPOSSIBLE TASK. Totally. It’s a perpetuum mobile.

You should choose what to sacrifice.

The usual Internet protocols (like HTTP) sacrifice latency. They actually begin by sacrificing latency.

Apps where latency cannot be totally sacrificed, like Skype, sacrifice voice quality instead.

But in any case, something is sacrificed.

> and finally I have no choice as I’m not in the position to change this format

Too sad to have one’s bosses as morons :)

The person who designed something UDP-based and is NOT prepared for the fact that datagrams can be lost (and also reordered) is a moron.

No exceptions.

> and it’s not so easy to argue when other devices handle such traffic without problem … so - is it really 100% normal for datagram sockets?

Surely yes.

In one particular network config, this kind of thing will be very rare. But if this gets productized and used in customers’ networks with some El Cheapo Chinese routers (since their CFO wants to cut costs) or such, then you will see this A LOT.

> from embedded device to embedded device. My network is a GBit network, I’m using a managed Cisco switch,

…and your customers will use some unmanaged El Cheapo Chinese stuff from the local store…

> So I tend to see the problem inside the Windows socket.

No.

  1. This is NOT a problem. When UDP loses a packet, that is NOT a problem. It is by design.
  2. The “problem” (which is not actually a problem) can be anywhere: in MS’s IP stack, in the NIC driver on the Windows machine, or in the Cisco box.

> It was a big step forward for me to use Winsock Kernel instead of User mode sockets

No step forward at all. Just a bit of CPU load reduction from eliminating syscall overhead. Nothing else. The underlying engine is still the same.

> was to increase the socket buffer size. Do you think this could help?

Surely no.

You have the following ways:

  1. Switch to an existing reliable protocol. It’s called TCP, it is described (version 0.5 or so) here: http://www.ietf.org/rfc/rfc793.txt, and it is free from such issues.
  2. Implement your own reliable protocol (yes, with retransmissions) on top of UDP (a minimal sketch follows this list).
  3. Relax the latency requirements. Make the product heterogeneous, so that latency is only important between the Data Source and the First Level Server, and not important between the First and Second Level Servers.
  4. Require (yes, require your customers) the use of some advanced gear, like a particular high-end Cisco box, or Converged Ethernet, or 10 Gbps or such.
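
For option 2, a hypothetical illustration of the minimum you would need (the names and fields below are made up): an application-level header with a per-stream sequence number, so the receiver can at least detect loss and reordering. A real reliable protocol would add ACKs and retransmission on top of this.

    #include <stdint.h>

    #pragma pack(push, 1)
    typedef struct _MEDIA_DGRAM_HEADER {
        uint32_t StreamId;     /* which media stream the datagram belongs to  */
        uint32_t SequenceNo;   /* monotonically increasing per stream         */
        uint64_t TimestampUs;  /* sender timestamp, for latency measurement   */
    } MEDIA_DGRAM_HEADER;
    #pragma pack(pop)

    /* Receiver side: a gap in SequenceNo means loss (retransmit or conceal);
       a SequenceNo lower than the last one seen means reordering. */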

Switching from user-mode sockets to Winsock Kernel is literally nothing compared to the above options.

> Where can I find some more information about SO_SNDBUF/SO_RCVBUF for UDP?

By Google, as usual.

IIRC, SO_RCVBUF for UDP just governs how many datagram bytes the engine can hold in memory if the arrival rate from the network exceeds the consumption rate of the recv() calls.

If an incoming datagram would overflow this limit, it is silently discarded.
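
For illustration, the user-mode equivalent with classic Winsock looks like this (the 4 MB value is an arbitrary example; the read-back via getsockopt() shows what the stack actually granted):

    /* SO_RCVBUF for a UDP socket: the size of the per-socket receive queue.
       Datagrams arriving while the queue is full are silently dropped. */
    #include <winsock2.h>
    #include <stdio.h>
    #pragma comment(lib, "ws2_32.lib")

    int main(void)
    {
        WSADATA wsa;
        SOCKET s;
        int rcvbuf = 4 * 1024 * 1024;        /* arbitrary example: ask for 4 MB */
        int actual = 0;
        int len = (int)sizeof(actual);

        WSAStartup(MAKEWORD(2, 2), &wsa);
        s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

        /* Request a larger receive queue... */
        setsockopt(s, SOL_SOCKET, SO_RCVBUF,
                   (const char *)&rcvbuf, (int)sizeof(rcvbuf));

        /* ...and read back what was actually granted. */
        getsockopt(s, SOL_SOCKET, SO_RCVBUF, (char *)&actual, &len);
        printf("SO_RCVBUF is now %d bytes\n", actual);

        closesocket(s);
        WSACleanup();
        return 0;
    }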


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

> I’m using 1GBaseT. So when I’m transferring 48000 datagrams per second (size per datagram = 780 Bytes)

What else would you expect from UDP datagrams in excess of 576 bytes, i.e. the minimum datagram size that IP is required to reassemble, which also happens to coincide with the smallest MTU allowed by IP? By going over the 576-byte limit you dramatically decrease the chance of a datagram ever reaching its destination. This is why all UDP-based protocols are designed around this limit…

Anton Bassov

On 10-Nov-2013 21:24, Maxim S. Shatskih wrote:
[snip]

Awesome post… so deep and rich in various details.
But recall that the OP wrote earlier: the packet loss is on sending side.

It is hard to tell why Windows fails to put packets on the wire.
Nevertheless you may be right: Windows is not specially designed for this
kind of traffic, and the same goes for the NIC driver. Either the NIC
or components above it can silently drop excess TX data.

So the advice to the OP is to view the Windows machine as a black box.
Even if there’s a bug somewhere, the chance of getting it fixed is too low.
Basically, this amounts to the same thing as “Windows is not a real-time OS”.

Regards,
– pa

>But recall that the OP wrote earlier: the packet loss is on sending side.

This is why I immediately suspected that a packet size in excess of the 576-byte limit may be the culprit here…

If the IP implementation on the sending machine knows that the UDP packet may be silently discarded by the IP implementation on the receiving end, it may silently discard it straight away without even trying to send it. After all, the general rule of networking is that you drop “unreliable” packets as early as possible…

> In the network connection settings dialog, is the “QoS Packet scheduler” checkbox checked?

QoS components are unlikely to drop UDP packets…

The main idea behind dropping packets is to let a sender know that the network is about to get clogged so that it lowers its transmission rate. This is normally done by routers. In order for such a drop to be beneficial, the sender must be able to understand that it has to cut its transmission rate, which is why it is normally done with TCP packets. Alternatively, an Explicit Congestion Notification may be sent, but to be effective it has to be supported by both endpoints as well as by the underlying network…

Since UDP is an unreliable protocol that does not use acknowledgments, dropping UDP packets is not going to indicate anything to the sender and is hence simply pointless…

Anton Bassov

And remember: at 1 Gbit/s, 10 ms is only about 10 Mbit (roughly 1.25 MB) worth of data. If your senders queue more than that, you’re out of luck.

On 11-Nov-2013 01:11, xxxxx@hotmail.com wrote:

> If the IP implementation on the sending machine knows that the UDP packet may be silently discarded by the IP implementation on the receiving end, it may silently discard it

It does not know for sure, it may only suspect…

> The main idea behind dropping packets is to let a sender know that the network is about to get clogged so that it should lower the transmission rate.

But the “network” in this case is trivial. Both sender and receiver are
connected to the same Ethernet switch, and the data rate is much less than the
switch throughput. And the other (non-Windows) machines behave well
with this setup, so the switch is probably not the culprit.
The OP’s driver doesn’t need much of the sockets layer for UDP: knowing the MAC and
IP of the destination, it could make up Ethernet packets using a “raw
socket”, or even bind directly to NDIS as a protocol.
Then the netcard driver should provide back-pressure ‘naturally’.
Of course that won’t be as clean and classy as using sockets…

– pa

> But recall that the OP wrote earlier: the packet loss is on sending side.

Given that UDP is by definition unreliable, the NIC driver authors could trivially do this for their convenience.

IIRC NDIS 5 had a queue, and packets were silently discarded if the queue was about to overflow.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com