Perhaps one of you networking gurus can enlighten me.
I’m working with a telemetry system that ships 100 megabytes/second
continuously over a fiber gigabit Ethernet link. The stream is sent as UDP
packets. The thing generates 8k byte samples at roughly a 12 kHz rate.
To keep the interrupt rate down, we actually send these as IP fragments,
with 7 fragments making up one 56k byte UDP packet.
The IP header includes a 16-bit identification number for each big
packet, so that all of the fragments can be identified and collected.
If you do the math, you’ll see that this identification number rolls
over every 37.5 seconds.
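For anyone who wants to check that figure, here is a quick sketch in Python. It assumes that 100 megabytes/second means 10^8 bytes per second and that a "56k byte" packet is 7 x 8192 bytes. The constant names are just for illustration.

```python
# Assumptions (not measured values): decimal megabytes, 8 KiB fragments.
DATA_RATE = 100_000_000          # bytes per second on the wire
FRAGMENT_SIZE = 8 * 1024         # one 8k byte sample per IP fragment
FRAGS_PER_PACKET = 7             # fragments per UDP packet
ID_SPACE = 2 ** 16               # 16-bit IP identification field

packet_size = FRAGMENT_SIZE * FRAGS_PER_PACKET      # bytes per UDP packet
packets_per_second = DATA_RATE / packet_size        # one IP ID consumed per packet
rollover_seconds = ID_SPACE / packets_per_second    # time until an ID repeats

print(f"packets per second: {packets_per_second:.0f}")
print(f"ID rollover period: {rollover_seconds:.2f} s")
```

With those assumptions the period comes out just over 37.5 seconds, which is comfortably shorter than a 60 second reassembly timeout.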
Here’s the problem. Say that, for whatever reason, the first three
fragments after power-up never reach the destination. The receiving
system (running Windows Server 2008 R2) will hold on to the next four
fragments. When the identification number rolls over, 37.5 seconds
later, it takes the first three fragments of this new packet, combines
them with the four old fragments, and sends that out as the UDP
packet. It then holds on to the last four fragments of this new packet
until another 37.5 seconds pass. It stays out of sync like that forever.
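To make the failure mode concrete, here is a toy model of it in Python. It is only a sketch of the behavior described above, not how the Windows stack is actually implemented: it follows a single identification value, assumes the ID repeats exactly every 37.5 seconds, and assumes the reassembly timer starts when the first fragment of a buffer arrives. The function and variable names are made up for the example.

```python
FRAGS_PER_PACKET = 7
ROLLOVER_S = 37.5    # ID reuse period, from the description above
TIMEOUT_S = 60.0     # reassembly timeout, from the description above


def simulate(cycles, timeout_s=TIMEOUT_S):
    """Follow one IP ID value through several rollovers.

    Returns one list per delivered datagram; each entry says which
    rollover cycle supplied the fragment in that slot.
    """
    held = {}        # fragment index -> cycle that supplied it
    started = None   # time the current reassembly buffer was opened
    delivered = []
    for cycle in range(cycles):
        now = cycle * ROLLOVER_S
        for index in range(FRAGS_PER_PACKET):
            if cycle == 0 and index < 3:
                continue                 # first three fragments are lost
            if held and now - started >= timeout_s:
                held = {}                # timer expired: discard stale fragments
            if not held:
                started = now            # first fragment opens a new buffer
            held[index] = cycle
            if len(held) == FRAGS_PER_PACKET:
                delivered.append([held[i] for i in range(FRAGS_PER_PACKET)])
                held = {}
    return delivered


for datagram in simulate(3):
    print(datagram)
```

In this model, every datagram delivered with the 60 second timeout mixes fragments from two different cycles, because each buffer is only 37.5 seconds old when the ID comes around again. Calling simulate(3, timeout_s=30.0) instead lets the stale fragments expire first, and the delivered datagrams are then built from a single cycle each.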
My friend Google tells me that there used to be a registry parameter to
control this (IpReassemblyTimeout) but that the parameter is not used by
any 21st-century Windows system, where the timeout is hard-coded to 60
seconds.
Clearly, this 60-second number was invented before there were practical
real-life scenarios where these numbers could overlap in less time than
that. Is there really no solution to this problem?
--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.