> Yes, I know UDP is not lossless but we also need very low latencies (< 10 msec)
Don’t you know that a perpetuum mobile is not possible? Remember the law of energy conservation?
Or maybe you know Shannon’s laws of information exchange, which follow the same pattern as the 2nd Law of Thermodynamics, just using another notion of entropy?
In short: the combination of hard reliability requirements and hard worst-case latency requirements is an IMPOSSIBLE task. Totally. It’s a perpetuum mobile.
You should choose what to sacrifice.
The usual Internet (like HTTP) sacrifices latency. It actually begins by sacrificing latency.
Apps where latency cannot be totally sacrificed, like Skype, sacrifice voice quality instead.
But either way, something is sacrificed.
> and finally I have no choice, as I’m not in the position to change this format
Too sad to have one’s bosses as morons 
The person who designed something UDP-based and is NOT prepared for the fact that datagrams can be lost (and also reordered) is a moron.
No exceptions.
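The bookkeeping a UDP consumer needs in order to be prepared for loss and reordering is small. Here is a minimal Python sketch; the 32-bit big-endian sequence-number header is an illustrative assumption, not part of any standard:

```python
import struct

def classify(datagrams):
    """Walk datagrams in arrival order; each carries a 32-bit
    big-endian sequence number prepended by the sender (an assumed
    format). Returns (lost, reordered) sequence numbers."""
    expected = 0
    lost, reordered = set(), []
    for dgram in datagrams:
        (seq,) = struct.unpack_from("!I", dgram)
        if seq == expected:
            expected += 1
        elif seq > expected:
            # Gap: everything between expected and seq is missing so far.
            lost.update(range(expected, seq))
            expected = seq + 1
        else:
            # seq < expected: a late arrival, i.e. reordering.
            reordered.append(seq)
            lost.discard(seq)
    return sorted(lost), reordered

# Simulated arrival order: datagram 2 was delayed, 4 never arrived.
packets = [struct.pack("!I", s) + b"payload" for s in (0, 1, 3, 2, 5)]
print(classify(packets))  # → ([4], [2])
```

A real receiver would do this on every `recvfrom()` and decide per application what a gap means: skip, interpolate, or request a retransmit.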
> and it’s not so easy to argue when other devices handle such traffic without problems… so, is it
> really 100% normal for datagram sockets?
Surely yes.
In one particular network config, this kind of thing will be very rare. But if this gets productized and used in customers’ networks with some El Cheapo Chinese routers (since their CFO wants to cut costs) or such, then you will see this A LOT.
> from embedded device to embedded device. My network is a GBit network, I’m using a managed
> Cisco switch,
…and your customers will use some unmanaged El Cheapo Chinese stuff off the local store…
> So I tend to see the problem inside the Windows socket.
No.
- This is NOT a problem. When UDP loses a packet, this is NOT a problem. This is by design.
- The “problem” (which is not a problem, actually) can be anywhere: in MS’s IP stack, in the NIC driver on the Windows machine, or in the Cisco box.
> It was a big step forward for me to use Winsock Kernel instead of user-mode sockets
No step forward at all. Just a bit of CPU load reduction from eliminating syscall overhead. Nothing else. The underlying engine is still the same.
> was to increase the socket buffer size. Do you think this could help?
Surely no.
You have the following ways:
- Switch to an existing reliable protocol. It’s called TCP, it is described (version 0.5 or such) here: http://www.ietf.org/rfc/rfc793.txt, and it is free from such issues.
- Implement your own reliable protocol (yes, with retransmissions) on top of UDP.
- Relax the latency requirements. Make the product heterogeneous, so that latency is only important between the Data Source and the First Level Server, and not important between the First and Second Level Servers.
- Require (yes, require your customers) the use of some advanced stuff, like a particular high-end Cisco box, or Converged Ethernet, or 10GbE, or such.
Switching from user-mode sockets to WinSock Kernel is literally nothing compared to the above ways.
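For the second option above, the simplest reliable scheme is stop-and-wait: number each datagram, retransmit until the matching ACK comes back. A Python sketch; the 4-byte sequence header, the timeout value, and the drop-the-first-copy test server are all illustrative assumptions:

```python
import socket, struct, threading

def send_reliable(sock, dest, seq, payload, timeout=0.2, retries=5):
    """Stop-and-wait sender: prepend a sequence number, retransmit
    until the matching ACK arrives or retries run out."""
    pkt = struct.pack("!I", seq) + payload
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(pkt, dest)
        try:
            ack, _ = sock.recvfrom(64)
            if struct.unpack("!I", ack)[0] == seq:
                return True          # delivery confirmed
        except socket.timeout:
            continue                 # lost datagram or lost ACK: retransmit
    return False

def echo_ack_server(sock):
    """Toy receiver: deliberately drops the first copy of each
    sequence number to force a retransmission, then ACKs by echoing
    the sequence number back."""
    seen = set()
    while True:
        data, addr = sock.recvfrom(2048)
        seq = struct.unpack_from("!I", data)[0]
        if seq not in seen:
            seen.add(seq)            # simulate a lost datagram
            continue
        sock.sendto(struct.pack("!I", seq), addr)

srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv.bind(("127.0.0.1", 0))
threading.Thread(target=echo_ack_server, args=(srv,), daemon=True).start()

cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
ok = send_reliable(cli, srv.getsockname(), 7, b"hello")
print(ok)  # → True
```

Note the latency cost: one simulated loss already adds a full retransmission timeout to delivery, which is exactly why reliability and hard worst-case latency fight each other.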
> Where can I find some more information about SO_SND/RCVBUF for UDP?
By Google, as usual.
IIRC, SO_RCVBUF for UDP just governs how many datagram bytes the engine can hold in memory if the arrival rate from the network exceeds the consumption rate of the recv() calls.
If an incoming datagram would overflow this limit, it is silently discarded.
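Querying and raising the buffer is a one-liner either way. A Python sketch (the stack may round, double, or cap the granted size, and a bigger buffer only absorbs a slow receiver — it does nothing about loss on the network side):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
default = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

# Request a 1 MB receive buffer; the kernel decides what is actually
# granted (Linux, for instance, doubles the request and caps it at
# net.core.rmem_max).
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 20)
granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(default, granted)
```

The same two calls exist in Winsock with identical option names; only the error handling differs.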
–
Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com