Hi,
I’m having a problem with my driver. For reasons that don’t matter here, I have to write a TCP server that accepts connections from a regular TCP client application. It has to work on Win2k/XP/2k3, so TDI was the obvious choice.
I suspect there’s a problem in my design, specifically in the way I associate transport addresses and connection contexts. What I basically need to do is convert the semantics of TDI into something very similar to user-mode socket semantics. Just to set the record straight: from this point on, when I say “socket” I don’t mean a user-mode socket. I mean my kernel-mode implementation of a network socket, which has semantics very similar to standard Berkeley sockets. I know it’s not quite right to call them “sockets”, but I don’t know what else to call them.
Back to the theory: my driver is supposed to accept TCP connections, which means I must have a listening “socket”. This pseudo-“socket” is built from a transport address and a connection context that are associated with each other, and the connection context is set up to accept incoming connections. To be able to serve those incoming connections I also keep a backlog of other “sockets”, each made from its own connection context / transport address pair. Each backlog “socket” also has a preallocated data buffer, used to hold the data I will later receive from the remote application, and a preallocated accept IRP that is used later. When I get the ClientEventConnect notification from TDI, I pick one “socket” from the backlog and tell TDI to use it as the local endpoint for the incoming connection. I do this by passing back the accept IRP and the custom connection context of the backlog “socket”. After a connection has been received and accepted, I create a new “socket”, bound to the same ip/port as the one that has just been used, to replenish the backlog.
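To make the bookkeeping described above concrete, here is a minimal user-mode C sketch of it. This is not the driver code: every name (`ksocket`, `listener`, `listener_on_connect`, `BACKLOG_SIZE`, and so on) is hypothetical, the TDI objects are replaced by plain fields, and the places where TDI requests would be issued are only marked with comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BACKLOG_SIZE   2     /* hypothetical reserve depth */
#define RX_BUFFER_SIZE 4096  /* hypothetical receive buffer size */

typedef enum { KS_LISTENING, KS_BACKLOG, KS_CONNECTED } ks_state;

/* One pseudo-"socket": a transport address paired with a connection context. */
typedef struct ksocket {
    ks_state state;
    unsigned short port;      /* local port the address object is bound to */
    unsigned char *rx_buffer; /* preallocated; NULL for the listener */
    int has_accept_irp;       /* stands in for the preallocated accept IRP */
} ksocket;

typedef struct listener {
    ksocket self;
    ksocket *backlog[BACKLOG_SIZE];
} listener;

/* Build one backlog "socket" with its buffer and (modelled) accept IRP. */
ksocket *backlog_socket_create(unsigned short port)
{
    ksocket *s = calloc(1, sizeof *s);
    if (s == NULL)
        return NULL;
    s->rx_buffer = malloc(RX_BUFFER_SIZE);
    if (s->rx_buffer == NULL) {
        free(s);
        return NULL;
    }
    /* Driver: open address object + connection context, associate them. */
    s->state = KS_BACKLOG;
    s->port = port;
    s->has_accept_irp = 1;
    return s;
}

void ksocket_destroy(ksocket *s)
{
    if (s != NULL) {
        /* Driver: disassociate, close connection context and address. */
        free(s->rx_buffer);
        free(s);
    }
}

/* Close only the backlog; "sockets" already handed out are not touched. */
void listener_close(listener *l)
{
    size_t i;
    for (i = 0; i < BACKLOG_SIZE; i++) {
        ksocket_destroy(l->backlog[i]);
        l->backlog[i] = NULL;
    }
}

int listener_init(listener *l, unsigned short port)
{
    size_t i;
    l->self.state = KS_LISTENING;
    l->self.port = port;
    l->self.rx_buffer = NULL;
    l->self.has_accept_irp = 0;
    for (i = 0; i < BACKLOG_SIZE; i++)
        l->backlog[i] = NULL;
    for (i = 0; i < BACKLOG_SIZE; i++) {
        l->backlog[i] = backlog_socket_create(port);
        if (l->backlog[i] == NULL) {
            listener_close(l);
            return -1;
        }
    }
    return 0;
}

/* Models the connect event: take a backlog "socket", then refill the slot. */
ksocket *listener_on_connect(listener *l)
{
    size_t i;
    for (i = 0; i < BACKLOG_SIZE; i++) {
        ksocket *s = l->backlog[i];
        if (s != NULL) {
            s->state = KS_CONNECTED;
            s->has_accept_irp = 0; /* accept IRP handed to the transport */
            /* Refill may fail; the slot then stays empty until later. */
            l->backlog[i] = backlog_socket_create(l->self.port);
            return s;
        }
    }
    return NULL; /* backlog exhausted: the connection would be refused */
}
```

In this model, calling `listener_close` after `listener_on_connect` leaves the returned “socket” untouched, which is the behaviour I expect from the real driver as well.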
This design works. I tested many edge cases and all were fine except one. Let’s say I have a listening “socket”, two “sockets” in my backlog reserve, and one active “socket” that is actually connected to a remote application. This active/connected “socket” was a backlog “socket” before the connection was established, and it was spawned from the same listening “socket” that is still waiting for incoming connections. At some point I want to stop listening for new incoming connections but keep my connected “socket” to the remote party alive. So I close the listening and backlog “sockets” by closing their connection contexts and transport addresses. At this point the connection to the remote party is lost, and the remote application receives the disconnect reason “dropped by remote party”. In theory this should not happen: the connection contexts of the closed “sockets” are independent of the active one’s, which should have remained open, and the transport address objects are also different (even though all of them reference the same ip/port).
My theory: I suspect there might be a problem with the transport addresses. I know that one transport address can actually be associated with multiple connection contexts, but in my case there is one transport address for each connection context, and some of the transport addresses overlap (same ip/port). This means that the listening “socket”, all of its backlog “sockets”, and the connected “sockets” spawned from it have different transport address objects that reference the same ip/port. Is it possible that when I close one transport address object, the TCP/IPv4 provider also closes all other transport address objects that use the same ip/port? The object itself is still valid (for the remaining “socket” that was connected to the remote party), but the connection is severed: it no longer appears in netstat, so it has definitely been dropped. A possible fix for this possible design issue would be to have only one transport address object per ip/port and associate it with every connection context that uses it. All the “sockets” bound to the same ip/port would then share the same transport address object, and all of their connection contexts would be associated with that one object. This also means I would have to add a new low-level layer to my design that manages unique transport address objects, but that should not be a problem.
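The low-level layer I have in mind could look roughly like the following user-mode C sketch: a registry that hands out one reference-counted address entry per ip/port and only really closes it when the last “socket” releases it. All names (`shared_address`, `address_registry`, `registry_acquire`, `registry_release`) are hypothetical, the actual open and close of the TDI address object are only marked with comments, and a real driver would also need to protect the list with a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One entry per unique ip/port; stands in for one transport address object. */
typedef struct shared_address {
    uint32_t ip;
    uint16_t port;
    unsigned refcount;            /* number of "sockets" using this entry */
    struct shared_address *next;
} shared_address;

typedef struct address_registry {
    shared_address *head;         /* driver: guard this list with a lock */
} address_registry;

/* Return the existing entry for ip/port, or create it on first use. */
shared_address *registry_acquire(address_registry *r, uint32_t ip, uint16_t port)
{
    shared_address *a;
    for (a = r->head; a != NULL; a = a->next) {
        if (a->ip == ip && a->port == port) {
            a->refcount++;
            return a;
        }
    }
    a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    /* Driver: this is where the transport address object would be opened. */
    a->ip = ip;
    a->port = port;
    a->refcount = 1;
    a->next = r->head;
    r->head = a;
    return a;
}

/* Drop one reference. Returns 1 if the entry was really closed, else 0. */
int registry_release(address_registry *r, shared_address *a)
{
    shared_address **link;
    assert(a != NULL && a->refcount > 0);
    if (--a->refcount > 0)
        return 0;                 /* other "sockets" still use this address */
    for (link = &r->head; *link != NULL; link = &(*link)->next) {
        if (*link == a) {
            *link = a->next;      /* unlink the entry from the list */
            break;
        }
    }
    /* Driver: this is where the transport address object would be closed. */
    free(a);
    return 1;
}
```

With this layer, closing the listening and backlog “sockets” would only drop references, and the shared address object would stay open for as long as a connected “socket” still holds one. Whether that actually keeps the connection alive is exactly the part of my theory I cannot confirm.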
I’m not sure my theory is right, and I don’t really want to start making extensive modifications to the code only to find out that I was wrong. That’s why I’m asking for help. Is my evaluation correct? Does the fix seem valid for my problem? If I got it completely wrong, please correct me.
Thanks in advance!