TDI Filter Driver - Will this work?

Nicholas_Nack · July 11, 2007, 6:38pm

Hi,

We have a TDI filter driver that currently filters TDI_SEND_DATAGRAM requests, and modifies the target host/port to point to a usermode app running on the same machine. We also add a header to the datagram to mention the original target host/port - so that the usermode app can do the needful (encapsulation and relay).
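
As a rough illustration of the encapsulation step described above, the sketch below prepends the original destination to a datagram payload. The 6-byte header layout (IPv4 address followed by port, both in network byte order) and the names `encapsulate` and `RELAY_HDR_LEN` are assumptions made for this example; the post does not say what the real header looks like.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relay header: 4-byte IPv4 address + 2-byte port, network order. */
#define RELAY_HDR_LEN 6

/*
 * Writes the relay header followed by the payload into out.
 * orig_addr and orig_port are given in host byte order.
 * Returns the total number of bytes written, or 0 if out is too small.
 */
size_t encapsulate(uint32_t orig_addr, uint16_t orig_port,
                   const uint8_t *payload, size_t len,
                   uint8_t *out, size_t out_cap)
{
    if (out_cap < RELAY_HDR_LEN || len > out_cap - RELAY_HDR_LEN)
        return 0;

    /* Serialize byte by byte so the result does not depend on host endianness. */
    out[0] = (uint8_t)(orig_addr >> 24);
    out[1] = (uint8_t)(orig_addr >> 16);
    out[2] = (uint8_t)(orig_addr >> 8);
    out[3] = (uint8_t)(orig_addr);
    out[4] = (uint8_t)(orig_port >> 8);
    out[5] = (uint8_t)(orig_port);

    if (len > 0)
        memcpy(out + RELAY_HDR_LEN, payload, len);
    return RELAY_HDR_LEN + len;
}
```

The relay would strip the same six bytes before forwarding. In a real filter the header would also have to fit within whatever size limit the local proxy socket accepts.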

We want to be able to do the reverse - when UDP datagrams are received by the user mode app, to be able to forward them to the TDI client they are actually meant for, spoofing the UDP source address so it appears as if they came from the original source host. The user mode app is some kind of proxy/relay that forwards UDP datagrams.

This is the architecture we came up with - would appreciate your thoughts on it, possible pitfalls, and whether this will work or not

1. Filter TDI_RECEIVE_DATAGRAM requests, and wait on a semaphore (one semaphore per handle). When the semaphore is signalled, retrieve a datagram buffer allocated from a non-paged lookaside list, and copy it into the buffer passed in the IRP. Set the source address to the address stored along with the datagram in the non-paged buffer. Free the non-paged memory, complete the IRP and return STATUS_SUCCESS.
2. Filter TDI_SET_EVENT_HANDLER requests, and check for ClientEventReceiveDatagram - if this event is being set, store the address of the event handler in a per-handle context structure. Complete the IRP - don't pass it down, return STATUS_SUCCESS.
3. Ignore TDI_SET_EVENT_HANDLER for ClientEventChainedReceiveDatagram for now (see note below).
4. When the usermode app has a datagram to send to the TDI client, it sends an IOCTL to the filter driver. Using buffered I/O, the IOCTL passes the original source address of the datagram and the datagram itself (payload).
5. The TDI filter driver checks if the TDI client is using TDI_RECEIVE_DATAGRAM - if so, it queues the datagram and source address in a buffer from the non-paged lookaside list, and signals the semaphore.
6. If the TDI client isn’t using TDI_RECEIVE_DATAGRAM, the filter driver checks for an event handler - if present, it copies over the source address, passes the datagram buffer as the TSDU, and calls the event handler through the address stored in the per-handle context struct. When the event handler returns, the non-paged buffer is cleaned up.
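
The delivery decision in the last two steps can be modelled in plain user-mode C. This is only a sketch of the control flow, not kernel code: `pending_receive_t`, `handle_ctx_t`, `recv_handler_t` and `deliver_datagram` are hypothetical names, a single outstanding receive stands in for a pended TDI_RECEIVE_DATAGRAM IRP, and dropping the datagram when nobody is listening is an assumption the post does not state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a ClientEventReceiveDatagram handler. */
typedef void (*recv_handler_t)(uint32_t src_addr, uint16_t src_port,
                               const uint8_t *tsdu, size_t len);

/* Stand-in for a pended TDI_RECEIVE_DATAGRAM request. */
typedef struct {
    uint8_t *buf;        /* caller's receive buffer */
    size_t   cap;        /* size of that buffer */
    size_t   received;   /* bytes copied on completion */
    uint32_t src_addr;   /* spoofed source reported to the client */
    uint16_t src_port;
    int      completed;
} pending_receive_t;

/* Per-handle context kept by the filter (simplified). */
typedef struct {
    pending_receive_t *pending;  /* NULL if no receive is outstanding */
    recv_handler_t     handler;  /* NULL if no event handler is registered */
} handle_ctx_t;

typedef enum { DELIVERED_TO_RECEIVE, DELIVERED_TO_HANDLER, DROPPED } delivery_t;

/* Called when the usermode relay injects a datagram for this handle. */
delivery_t deliver_datagram(handle_ctx_t *ctx, uint32_t src_addr,
                            uint16_t src_port, const uint8_t *data, size_t len)
{
    if (ctx->pending != NULL) {
        pending_receive_t *rx = ctx->pending;
        /* Simplification: silently truncate to the caller's buffer. */
        size_t n = len < rx->cap ? len : rx->cap;
        if (n > 0)
            memcpy(rx->buf, data, n);
        rx->received  = n;
        rx->src_addr  = src_addr;
        rx->src_port  = src_port;
        rx->completed = 1;
        ctx->pending  = NULL;    /* the request is consumed */
        return DELIVERED_TO_RECEIVE;
    }
    if (ctx->handler != NULL) {
        ctx->handler(src_addr, src_port, data, len);
        return DELIVERED_TO_HANDLER;
    }
    return DROPPED;              /* assumption: UDP semantics, nobody listening */
}
```

A real filter would also need locking around the context and would have to respect the IRQL at which the client expects its handler to run, neither of which this model attempts.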

Notes/Questions -

1. Our assumption here is that for ClientEventReceiveDatagram, the TSDU passed is just the UDP payload (no UDP header and no IP header). Is this right?
2. When dealing with ClientEventChainedReceiveDatagram, if the event handler returns STATUS_PENDING, then we can’t free the TSDU buffer, and the TDI client will eventually call TdiReturnChainedReceives with a TSDU descriptor that can only be invalid / NULL (since it is passed by us, not by an NDIS protocol driver). Any idea how we can handle this? One way is to pass a NULL TSDU descriptor and simply do a time-delayed free of the TSDU buffer. This does seem risky, however.

Your feedback will be very welcome - this is a rather lengthy post to read, so thanks for just reading this far.

Nick

anton_bassov · July 12, 2007, 1:02pm

If I were in your place I would do everything in an NDIS IM filter. I would create a standalone device in order to communicate with an app (the app would submit a few IRPs to this device, which the driver would pend). When a packet of interest arrives, instead of indicating it up the stack, my filter would extract the packet headers and data and complete an IRP, so that my app would get it. Then my app would do all necessary processing, and submit one more request to my standalone device with the modified data.
My driver would allocate NDIS_PACKET, set it up properly, and indicate it up the stack, so that TCPIP will be able to handle things.

Anton Bassov

Nicholas_Nack · July 12, 2007, 1:58pm

Hi Anton,

Thanks for the response/suggestion. We had considered this, but not really examined it fully because we don’t want to start writing an NDIS IM driver from scratch (more work!). Thanks for outlining how we would do it with NDIS though - that will really help if we do go the NDIS way.

If anybody has any suggestions / feedback on the TDI filter driver we’ll be glad to check it out - we really want to avoid the NDIS driver unless there is no alternative.

Thank You,
Nick

David_R_Cattley · July 12, 2007, 9:09pm

Anton’s suggestion is the canonical approach to diverting and encapsulating
IP traffic (aka, a VPN) on Windows.

I hesitate to add anything other than to say that a TDI filter which
attempts to modify / divert the TSDU stream as you have suggested is likely
to be very difficult to get right and not just because of the little nasty
of Chained operations really being partially opaque access to a NDIS_PACKET
which cannot *ever* come back to the TDI filter - TdiReturnChainedReceives()
is a rather trivial wrapper around NdisReturnPackets(). This is an
unfortunate ‘breaking’ of the layering of TDI Client <-> TDI Transport <->
NDIS but it is what it is. TDI filtering has always been the proverbial
“between a rock and a hard place” of network driver programming.

Nonetheless, you may consider that “filtering” is not the only approach to
insert at the TDI layer. You might find that you can leverage the filter
position to actually divert the TSDU traffic in and out of an entirely
different AddressFileObject - one that is conveniently bound to send and
receive datagrams from your local proxy process. It is a bit of a shell
game but the idea is to induce the (de-encapsulated) packet to emerge from
the other AFO. It is up to your filter to make the *two* AFOs appear as one
to the original application.

But such an idea seems vastly more complex to me than the MAC level
diversion Anton suggests. Having built (a few) such MAC level solutions
(and a few TDI filters, clients, and even a transport too) I can only say
that in my experience they are simpler and easier to write, understand, and
get correct than *anything* to do with TDI. The most difficult item that is
unique to the MAC level processing would likely be re-assembly and/or
fragmentation handling of IP datagrams - not terribly mysterious stuff.
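
To give a feel for what fragmentation handling involves at this level, here is a minimal sketch that reads the fragmentation fields of an IPv4 header, the first thing a MAC level filter would check before deciding whether reassembly is needed. The names `frag_info_t`, `ipv4_frag_info` and `ipv4_is_fragment` are made up for this example, and the buffer is assumed to start at the IP header (Ethernet header already skipped).

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      more_fragments;  /* MF flag: further fragments follow */
    int      dont_fragment;   /* DF flag */
    uint16_t offset_bytes;    /* fragment offset, converted to bytes */
} frag_info_t;

/* Returns 0 on success, -1 if the buffer is too short or not IPv4. */
int ipv4_frag_info(const uint8_t *ip, size_t len, frag_info_t *out)
{
    if (len < 20 || (ip[0] >> 4) != 4)
        return -1;

    /* Bytes 6-7: 3 flag bits followed by a 13-bit offset in 8-byte units. */
    uint16_t field = (uint16_t)((ip[6] << 8) | ip[7]);
    out->dont_fragment  = (field & 0x4000) != 0;
    out->more_fragments = (field & 0x2000) != 0;
    out->offset_bytes   = (uint16_t)((field & 0x1FFF) * 8);
    return 0;
}

/* A packet is part of a fragmented datagram if MF is set or the offset is non-zero. */
int ipv4_is_fragment(const frag_info_t *fi)
{
    return fi->more_fragments || fi->offset_bytes != 0;
}
```

Only the first fragment (offset 0) carries the UDP header, so a filter that matches on ports has to either reassemble or track fragments by the IP identification field.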

Plus, you can unload an IM driver (or a virtual MAC adapter).

Good Luck,
Dave Cattley
Consulting Engineer
System Software Development

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

anton_bassov · July 12, 2007, 9:25pm

> Having built (a few) such MAC level solutions (and a few TDI filters, clients, and even a transport too) I can only say that in my experience they are simpler and easier to write, understand, and get correct than *anything* to do with TDI.

This is the only reason why I said that I would do everything in NDIS IM…

> The most difficult item that is unique to the MAC level processing would likely be re-assembly and/or fragmentation handling of IP datagrams

Actually, as long as the OP speaks about UDP (i.e. datagram-based service), he does not even have to think about fragmentation - unlike TCP transmissions, datagrams don’t get fragmented, because datagram size cannot exceed that of IP packet…

Anton Bassov

OSR_Community_User · July 13, 2007, 7:53am

xxxxx@hotmail.com wrote:

> Actually, as long as the OP speaks about UDP (i.e. datagram-based service), he does not even have to think about fragmentation - unlike TCP transmissions, datagrams don’t get fragmented, because datagram size cannot exceed that of IP packet…

UDP datagrams may be up to 64k in size. Such large datagrams will be sent as fragments from the start.

Nicholas_Nack · July 13, 2007, 8:54am

Hi Dave,

Thanks for the words of advice…both on the TDI filtering and on the NDIS IM driver. It's very tempting to just extend the existing TDI filter driver by adding support for datagram receive, but it does seem to me a very messy affair (especially with reference to the chained event issues). You mentioned other issues - just curious, could you enlighten me more on that?

Anton - thanks for the clarification again.

Let me look into the NDIS IM driver architecture and see what needs to be done…thanks for your advice, and I will be sure to come back for more if required when working on the NDIS IM driver.

Thanks!

Nicholas_Nack · July 13, 2007, 8:59am

Andrei - missed your response (didn't show up till I posted) - so essentially large datagrams will probably be fragmented across a bunch of Ethernet frames?

anton_bassov · July 13, 2007, 9:13am

> UDP datagrams may have up to 64k in size. They will be sent as fragments from the start.

Theoretically, this is true. However, let’s get real. In actuality, fragmentation and reassembly are features of IP, rather than of higher-level protocols. All IP implementations must be able to reassemble datagrams of some certain size (IIRC, the minimum required size is 576 bytes), but some may be able to deal with larger transmissions. Unlike TCP, UDP has no mechanism for discovering the maximum datagram that can be reassembled by a remote host. Therefore, UDP-based protocols are normally designed to work within the 576-byte limit, so that they normally send datagrams that don’t exceed 576 bytes - assuming the underlying media is Ethernet, any datagram that these protocols send fits within the Ethernet MTU and, hence, does not get fragmented…

I do agree that THEORETICALLY a UDP datagram may have up to 64k, but, since UDP does not guarantee packet delivery, any UDP-based protocol that sends large datagrams will be “not-so-efficient”, so that large UDP datagrams are practically never used…

Anton Bassov

David_R_Cattley · July 13, 2007, 10:04am

Surely not an exhaustive list and perhaps you have already addressed (or are
at least familiar) with these if you have TDI filter in place but here are a
few of my (least?) favorites:

- IRQL level contract of event callbacks tends to limit (severely) what a
  filter can do to help it make a decision on how to handle the callback.
- Other TDI filters are intolerant of requests that TCPIP might otherwise
  be happy with.
- Other TDI filters don't properly (or, at least in my opinion) filter some
  types of requests and dealing with them is a constant PITA. (See
  DirectSend{Datagram})
- Inserting or removing data can be a real challenge when considering the
  number of ways that the transport can indicate data or be asked to send
  data. You already found that inserting data into Chained Receives is, well
  (AFAIK) impossible.
- Handling DirectSend / DirectSendDatagram filtering is challenging, in
  light of the possibility that your filter might not have seen the
  IRP_MJ_CREATE on the file object and *especially* since some other broken
  TDI filter might have completely flummoxed the IO_STACK_LOCATION or IRP.
- “Delaying” a callback is tricky. The risk of confusing the Transport, the
  Client, or another Filter is pretty high.
- Handling error conditions in the Filter. They really should not affect
  the Client, Transport, or other Filters if you are being good.
- Bad Clients and other Filters that seem to have the issue of not calling
  IoGetRelatedDeviceObject() and ‘bypassing’ lower attached device objects.
  Of course you have the “let’s hook the DEVICE_OBJECT” group of TDI ‘hooking’
  filters which ‘solve’ that.
- Load order wars. “You first, no, you first. I insist! Not before you.
  ARRRRRRRRG!!!”

In any event, it’s all a big nightmare to be avoided if possible. Many
cannot avoid it and tread carefully. Some should have not tread at all and
I am sure the list archives (and members) have many scars to bear.

You are welcome to contact me off list as this issue of TDI filters etc.
tends to interest very few and little that can be said has already been said
in the archives at one time or another.

Mostly lots of AAAAAARRRRGGGGGH

-dave

David_R_Cattley · July 13, 2007, 10:09am

When TCPIP.SYS sends an IP datagram that exceeds the MTU of the destination
interface, it will fragment it unless explicitly told not to do so. When
receiving packets from an interface, it is possible that a router along the
way (or the source station) fragmented an IP datagram to meet some
intervening MTU requirement.

Despite Anton’s practical position, it is not wise to ignore these
possibilities in a MAC level solution for *any* IP protocol (TCP, UDP, or
otherwise).

Not all UDP protocols limit themselves to 576 octet IP datagrams and
occasionally one of these gets fragmented. Better to plan to deal with it.
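
The arithmetic behind this is easy to check. The sketch below estimates how many IPv4 fragments a UDP datagram of a given payload size needs for a given MTU. It assumes a 20-byte IP header with no options, and the function name `ipv4_fragment_count` is made up for this example.

```c
#define IPV4_HDR_LEN    20u     /* assumes no IP options */
#define UDP_HDR_LEN      8u
#define UDP_MAX_PAYLOAD 65507u  /* 65535 - 20 - 8 */

/*
 * Number of IPv4 packets needed to carry one UDP datagram.
 * Returns 0 for inputs that cannot form a valid datagram.
 */
unsigned ipv4_fragment_count(unsigned udp_payload_len, unsigned mtu)
{
    if (mtu < 68u || udp_payload_len > UDP_MAX_PAYLOAD)
        return 0;

    unsigned ip_payload = udp_payload_len + UDP_HDR_LEN;
    unsigned room = mtu - IPV4_HDR_LEN;

    if (ip_payload <= room)
        return 1;               /* fits in one packet, no fragmentation */

    /* Every fragment except the last must carry a multiple of 8 bytes. */
    unsigned per_fragment = room & ~7u;
    return (ip_payload + per_fragment - 1) / per_fragment;
}
```

With an Ethernet MTU of 1500, a 1472-byte payload still fits in one packet, a 2000-byte payload (the size range mentioned later in the thread for games) needs two, and a maximum-size 65507-byte payload needs 45.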

-dave

anton_bassov · July 13, 2007, 10:24am

> Not all UDP protocols limit themselves to 576 octet IP datagrams and occasionally one of these gets fragmented.

If I got it right, the OP speaks about his own UDP-based protocol, so that datagram size is under his control. Therefore, unless he does something “unconventional”, he can safely assume that his UDP packets are not going to get fragmented…

Anton Bassov

Nicholas_Nack · July 13, 2007, 12:58pm

Hmmm…

Actually, I’m not using my own UDP based protocol - so I will have to handle arbitrary datagram sizes.

However, I believe I can accomplish my solution by simply using IOCTLs to signal the NDIS IM driver to create raw IP packets containing the UDP datagrams with a spoofed source address and a destination address at localhost (the client application is listening on this address), and sending them via the NdisSend / NdisSendPackets function of the NDIS miniport driver. That should automatically route these datagrams to the client application, with the spoofed source address. (I believe this technique was used by nmap when XP SP2 disabled UDP source address spoofing with raw packets.)
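
Whatever path the constructed packets take, a driver that builds IP/UDP headers by hand also has to fill in the checksums itself, and the UDP checksum covers a pseudo-header containing the (spoofed) source address. Below is a minimal user-mode sketch of both calculations. The function names are made up for this example, and addresses are passed in host byte order purely for convenience.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of big-endian 16-bit words, added to a running sum. */
static uint32_t ones_sum(const uint8_t *p, size_t len, uint32_t sum)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;   /* pad an odd trailing byte with zero */
    return sum;
}

/* Fold the carries back in and return the one's complement. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* hdr must have its checksum field (bytes 10-11) set to zero. */
uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t hdr_len)
{
    return fold(ones_sum(hdr, hdr_len, 0));
}

/*
 * udp points at the UDP header plus payload, checksum field zeroed.
 * src and dst are IPv4 addresses in host byte order.
 */
uint16_t udp_checksum(uint32_t src, uint32_t dst,
                      const uint8_t *udp, uint16_t udp_len)
{
    uint32_t sum = 0;

    /* Pseudo-header: source, destination, zero + protocol (17), UDP length. */
    sum += (src >> 16) + (src & 0xFFFF);
    sum += (dst >> 16) + (dst & 0xFFFF);
    sum += 17;
    sum += udp_len;

    uint16_t c = fold(ones_sum(udp, udp_len, sum));
    return c == 0 ? 0xFFFF : c;   /* a transmitted 0 means "no checksum" */
}
```

Because the source address is part of the pseudo-header, the UDP checksum has to be computed after the spoofed address has been chosen, and when a datagram is fragmented it is computed once over the whole datagram rather than per fragment.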

Dave/Anton - I guess this should take care of the fragmentation issue?

Many thanks for the advice.

Nicholas_Nack · July 13, 2007, 1:04pm

PS: As a response to Anton’s post on large UDP datagrams - just as a bit of information, quite a few modern games (MMORPG, MMOFPS, regular FPS) tend to use large UDP datagrams (I’ve seen a lot of 1-2K datagrams).

Dave - thanks for the insight on TDI filters…I guess I’ll stay away from them apart from the work required to maintain our existing filter driver (it’s pretty simple / straightforward right now).

David_R_Cattley · July 13, 2007, 1:09pm

You're welcome. I hope all those WoW users who are playing 10hrs a day while
hiding from their employers behind corporate firewall policies with DirectX
multiplayer game blocking appreciate your efforts (and pay you the $8 /
month).

My nephew will be so happy right up until I show his mom how to block the
HTTP proxy/vpn.

Good Luck,
-dave

anton_bassov · July 13, 2007, 4:53pm

> Actually, I’m not using my own UDP based protocol - so I will have to handle arbitrary datagram sizes.

Then just ignore my statements about fragmentation - they don’t apply to you. Instead, you will have to deal with this possibility if you choose NDIS IM filter. In any case, as David has already pointed out, this part is not that complex - after all, all standards and specifications are freely available. I believe it is so much easier (and more predictable) than doing rather dodgy tricks in TDI filter…

Anton Bassov

Nicholas_Nack · July 14, 2007, 3:36pm

Fair enough - having gone through the DDK docs for NDIS IM drivers, I can see the difference - the framework allows you to just plug in your filtering code rather than having to do a lot of IRP management or TDI/NDIS specific tweaks…

anton_bassov · July 14, 2007, 4:13pm

> Fair enough - having gone through the DDK docs for NDIS IM drivers, I can see the difference - the framework allows you to just plug in your filtering code rather than having to do a lot of IRP management or TDI/NDIS specific tweaks…

Don’t underestimate the complexity of your task - writing a modifying NDIS IM filter *properly* is not the easiest task one can imagine (there are quite a few not-so-obvious details that you have to take into account). However, for the purpose of your task, NDIS IM still seems to be a more appropriate option than a TDI filter…

Anton Bassov

Nicholas_Nack · July 15, 2007, 8:46am

Okay.

This being the case - may I draw your attention to a question I posed earlier…

“However, I believe I can accomplish my solution by simply using IOCTLs to signal the NDIS IM Driver to create raw IP packets containing the UDP datagrams with spoofed source address, and destination address at localhost (the client application is listening on this address), and sending them via the NdisSend / NdisSendPackets function of the NDIS miniport driver. That should automatically route these datagrams to the client application, with the spoofed source address. (this technique was I believe used by nmap when XP SP2 disabled UDP source address spoofing with raw packets).”

As far as I see this, this is just a question of intercepting the IOCTL, constructing the packets in multiple frames, sending them, and doing the cleanup once the sends complete.

Given this minimum complexity algorithmically speaking, would you still see some pitfalls / unobvious details?

Thanks for your continued scrutiny/review of what we’re trying.

anton_bassov · July 15, 2007, 9:32am

> However, I believe I can accomplish my solution by simply using IOCTLs to signal the NDIS IM Driver to create raw IP packets containing the UDP datagrams with spoofed source address, and destination address at localhost (the client application is listening on this address), and sending them via the NdisSend / NdisSendPackets function of the NDIS miniport driver. That should automatically route these datagrams to the client application, with the spoofed source address.

NdisSend() and NdisSendPackets() send packets down the stack to the miniport, i.e. to the network. However, you want to forward them to TCPIP that, in turn, will forward them to the client app. Therefore, instead of sending packets with NdisSend(), you have to indicate them up the stack with NdisMIndicateReceivePacket() from your IM…

> As far as I see this, this is just a question of intercepting the IOCTL, constructing the packets in multiple frames, sending them, and doing the cleanup once the sends complete.

I am afraid you are much too optimistic. The complexity of a modifying NDIS IM filter lies not with actually sending and indicating its own packets, but with making sure its very presence does not screw up communications between the miniport and bound protocols. Taking into account the sad fact that some adapters (particularly virtual ones) are poorly written, as well as the possible presence on the target machine of third-party AV/PF products that don’t always “play by the rules” (for example, Kaspersky AV provides its own partial reimplementation of the network stack that bypasses the system-provided packet scheduler PSCHED altogether), writing a good modifying NDIS IM filter that works flawlessly on any machine is not the easiest task one can imagine…

Anton Bassov