I’ve done a lot of WinSock programming, but this one has me stumped.
A client opens multiple TCP sockets to a remote server without incident, using
synchronous (i.e. blocking) send and recv calls. Data is transferred back and
forth, successfully, for some time on all sockets. Then, for some reason,
one of the client’s send calls “stalls”: no data is transferred, but the
blocking send call neither completes nor returns an error. Eventually,
the server’s code (which is specifically designed to do this) recognizes
that the socket has been quiet for too long and forcefully closes it with
shutdown/closesocket. The client receives that notification, and its send
call returns with an error at that time. Meanwhile, the other sockets
continue to function normally, and a replacement socket can be obtained and
used without problems.
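
To make the symptom concrete, here is a minimal loopback sketch of one condition that looks the same from the client side: a peer that accepts the connection but stops reading, so the client's send eventually has nowhere to put the data. This is only an illustration, not my actual code and not a claim about the real cause. It is written in Python rather than WinSock C, the function name and parameters are made up for the example, and a socket timeout stands in for the stall so the script finishes instead of hanging the way my blocking send does.

```python
import socket


def send_until_stall(timeout_s=0.5, chunk=64 * 1024, max_bytes=512 * 1024 * 1024):
    """Send to a loopback peer that never reads; return (bytes_sent, stalled).

    Hypothetical helper for illustration only. The timeout is an assumption
    added so the stall is observable; a truly blocking send would just wait.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free port on loopback
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()  # accepted, but recv is never called on it

    client.settimeout(timeout_s)
    payload = b"x" * chunk
    sent = 0
    stalled = False
    try:
        # Keep sending until the local send buffer and the peer's receive
        # buffer are both full; after that, send can make no progress.
        while sent < max_bytes:
            sent += client.send(payload)
    except socket.timeout:
        stalled = True  # send waited timeout_s without accepting more data
    finally:
        client.close()
        peer.close()
        listener.close()
    return sent, stalled


if __name__ == "__main__":
    total, stalled = send_until_stall()
    print(f"sent {total} bytes before stalling: {stalled}")
```

If something like this were the cause, only the one connection whose peer stopped reading would stall, and the others would carry on normally. The number of bytes accepted before the stall depends on the buffer sizes of the machine it runs on.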
I can’t figure out what would cause a TCP connection to simply “stall” like
that. The connection between the two machines is known to be good because
the other sockets continue to operate perfectly. The TCP/IP stacks on both
machines don’t appear to be “damaged” because a replacement TCP connection
can be obtained and used. My first guess would be that one end or the other
had lost track of the connection, but if so I wouldn’t expect the client’s
send call to return when the server closes the socket; the fact that it does
suggests both ends are still aware of the connection.
Any ideas gratefully accepted!