strategy for virtual ethernet device

Aaron_Young · November 2, 2012, 6:10pm

I’m investigating a design for a virtual network device. The device needs to be able to connect a user-mode application to the bottom of the IP stack for both send and receive. This user-mode application exchanges IP packets over a proprietary interface to an external source.

I’m very new to driver development, but I’ve spent the last week or so reading NDIS and WDM documentation. Based on my understanding so far, here’s how I think I ought to do it:
Create an NDIS miniport driver that creates a secondary device using NdisRegisterDeviceEx to talk to user-mode.

In MiniportSendNetBufferLists, complete any IRP_READ’s or put the NBL’s into a queue.

Process IRP_WRITE (for the secondary device) by transferring the buffer to a NBL and calling NdisMIndicateReceiveNetBufferLists.

Process IRP_READ by completing it if there are any queued NBL’s, or queue the IRP otherwise.

I know there are lots of other details to consider, but am I on the right track here? A few questions I have at the moment are:

1. thread contexts - Is it OK to call NdisMIndicate… from the IRP_WRITE dispatch routine? Or do I need to put that in a DPC? More generally, do I even need to use DPC’s at all (since I don’t have any interrupts)? I noticed the netvmini sample uses DPC’s even though it doesn’t talk to hardware, but I’m guessing that’s because it’s showing you how you would.
2. memory management - Do I actually need to copy data from the IRP_WRITE’s to the NBL’s? Or can I set it up so that ownership of the backing buffers is just transferred from the IRP to the NBL? If not, should I allocate a new NBL for each IRP_WRITE, or pre-allocate and maintain a list of free NBL’s?
3. completing IRP_WRITE - Does it make any difference whether I complete the IRP in the dispatch routine (right after calling NdisMIndicate…) or after the associated NBL is returned to me? Is the only consequence that if the packet fails on its way up the stack the user-mode application won’t know it?
4. queueing IRP’s - Since the secondary device is part of an NDIS driver, I can’t use StartIo, right? (Because NDIS actually implements this for the driver?) Is using a cancel-safe queue here the way to go?

Any help, either related to the questions or not, will be greatly appreciated!
Thanks,
Aaron
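
The read/send pairing described in the post above can be modeled in plain user-mode C. This is only a sketch of the queueing logic, not driver code: `Bridge`, `on_send`, and `on_read` are hypothetical names, integer ids stand in for NBLs and IRPs, and a real driver would hold a lock around both queues and keep the pending IRPs in a cancel-safe queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QCAP 8 /* arbitrary capacity chosen for this model */

enum { BRIDGE_PENDED = -1, BRIDGE_FULL = -2 };

/* Fixed-size FIFO of opaque ids (packet ids or read-request ids). */
typedef struct {
    int items[QCAP];
    size_t head, count;
} Fifo;

static bool fifo_push(Fifo *q, int id) {
    if (q->count == QCAP) return false;
    q->items[(q->head + q->count) % QCAP] = id;
    q->count++;
    return true;
}

static bool fifo_pop(Fifo *q, int *id) {
    if (q->count == 0) return false;
    *id = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    return true;
}

/* At any moment at most one of the two queues is non-empty. */
typedef struct {
    Fifo pending_reads;  /* reads waiting for a packet */
    Fifo queued_packets; /* packets waiting for a read */
} Bridge;

/* Send path (the role of MiniportSendNetBufferLists in the post):
 * hand the packet to a waiting read if there is one, else queue it.
 * Returns the id of the read that was completed, or BRIDGE_PENDED
 * if the packet was queued, or BRIDGE_FULL if it had to be dropped. */
static int on_send(Bridge *b, int packet_id) {
    int read_id;
    if (fifo_pop(&b->pending_reads, &read_id)) return read_id;
    return fifo_push(&b->queued_packets, packet_id) ? BRIDGE_PENDED : BRIDGE_FULL;
}

/* Read path (the role of the IRP_READ dispatch routine in the post):
 * complete with a queued packet if there is one, else pend the read.
 * Returns the id of the packet delivered, or BRIDGE_PENDED if the
 * read was queued, or BRIDGE_FULL if it had to be rejected. */
static int on_read(Bridge *b, int read_id) {
    int packet_id;
    if (fifo_pop(&b->queued_packets, &packet_id)) return packet_id;
    return fifo_push(&b->pending_reads, read_id) ? BRIDGE_PENDED : BRIDGE_FULL;
}
```

Starting from a zero-initialized `Bridge`, a read that arrives first is pended and the next send completes it; sends that arrive with no read waiting are queued and handed out in order to later reads.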

OSR_Community_User · November 2, 2012, 6:45pm

If they are IP packets as opposed to raw Ethernet packets, and you’re on a server OS, I believe you can open a raw IP socket from user mode, no new driver required. You might also consider why you need a special kind of IP packet instead of just a UDP datagram. The major reason UDP packets may not be viable is you have no control over this proprietary interface, and it’s already designed by somebody else. Another big plus of using standard UDP packets is things like VPN tunneling (like the SDN using NVGRE in Server 2012) will often just work.

Jan

Aaron_Young · November 2, 2012, 7:12pm

Hi Jan, thanks for your response. Actually a previous semi-related solution did in fact attempt to use raw sockets. However, IIRC one issue was that permissions were too restrictive on a certain service pack of WinXP. This solution will have to work on Windows 7. A quick glance at the documentation tells me it’s a similar story there. Bummer.

The proprietary interface is sort of fixed in the sense that I’m stuck with taking IP packets from user space and somehow getting them to UDP/TCP/ICMP applications via their regular socket interfaces on the same machine (and the reverse direction).

anton_bassov · November 5, 2012, 2:42am

I guess what you need here is not a virtual miniport but an LWF, at least as long as you are speaking about Win7 (on XP you need an NDIS IM filter)…

Anton Bassov

Aaron_Young · November 5, 2012, 10:54am

Hi Anton, if I understand your suggestion correctly, this would imply hijacking an existing ethernet device and redirecting its traffic to user-space. I think this would be ok, except that there is another constraint I didn’t mention: the existing network interfaces on the PC are used for connections to other services. In our system, the data path we are targeting with this virtual device is generally configured to exist in a separate subnet from these other services. That’s why we assumed we would be adding a virtual network adapter, since an LWF approach would require implementing some logic to decide which packets are passed to the real network and which ones are redirected to user-space. But even if we do LWF (we don’t need to support WinXP, btw), I think a lot of the concepts I’m asking about in the OP still apply, right? Do you have any ideas/comments about them?

David_R_Cattley · November 5, 2012, 11:04am

> 1. thread contexts - Is it OK to call NdisMIndicate… from the IRP_WRITE dispatch routine?

Yes.

> 2. memory management - Do I actually need to copy data from the IRP_WRITE’s to the NBL’s?

No. But you do need to ensure that the MDLs describe kernel address space and that the memory has been locked. So depending on what your DIOCTL interface looks like, it might just be easier to copy (bounce) the write and be done with it.

> 3. completing IRP_WRITE - …

You must *NOT* complete the write (or IOCTL, or whatever) until you no longer need the buffer described by it. If you copy the write into an independent buffer then you can complete the write immediately. If you use the buffer specified in the write to ‘back’ the NBL buffer chain directly, you must not complete the write until the NBL is returned to you.

> 4. queueing IRP’s - …

Yes, a CSQ is the ‘way to go’.

Good Luck,
Dave Cattley
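
The completion rule above (copy and complete immediately, or back the packet with the write buffer and complete only when the packet comes back) can be illustrated with a small user-mode C model. It is a sketch under stated assumptions: `WriteReq`, `Packet`, `indicate_copy`, `indicate_zero_copy`, and `on_packet_returned` are hypothetical names, `malloc` stands in for whatever pool allocation a driver would use, and no NDIS or I/O manager calls are shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a write request whose buffer belongs to the caller. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    bool completed; /* true once the request has been completed */
} WriteReq;

/* Stand-in for a packet indicated up the stack. */
typedef struct {
    const unsigned char *data;
    size_t len;
    unsigned char *owned; /* non-NULL when the packet has its own bounce copy */
    WriteReq *backing;    /* non-NULL when the packet borrows the write buffer */
} Packet;

/* Bounce strategy: the packet gets an independent copy, so the write
 * no longer needs its buffer and can be completed right away. */
static bool indicate_copy(WriteReq *w, Packet *p) {
    unsigned char *copy = malloc(w->len ? w->len : 1);
    if (copy == NULL) return false;
    if (w->len) memcpy(copy, w->buf, w->len);
    p->data = copy;
    p->len = w->len;
    p->owned = copy;
    p->backing = NULL;
    w->completed = true;
    return true;
}

/* Zero-copy strategy: the packet points at the write buffer, so the
 * write must stay pending until the packet is returned. */
static void indicate_zero_copy(WriteReq *w, Packet *p) {
    p->data = w->buf;
    p->len = w->len;
    p->owned = NULL;
    p->backing = w;
}

/* Called when the stack hands the packet back: free a bounce copy,
 * or complete the write that was backing the packet. */
static void on_packet_returned(Packet *p) {
    if (p->backing != NULL) {
        p->backing->completed = true;
        p->backing = NULL;
    }
    free(p->owned);
    p->owned = NULL;
    p->data = NULL;
    p->len = 0;
}
```

The trade-off is the usual one: the bounce copy costs a memcpy per packet but keeps the write path simple, while the zero-copy path avoids the copy at the price of tracking which write backs which packet.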

anton_bassov · November 5, 2012, 11:56am

You seem to be missing the whole point here…

Look at the network stack from the incoming packet’s perspective - it arrives at the physical NIC; the NIC miniport driver indicates it to NDIS which, in turn, invokes the bound protocols’ receive handlers. How does your virtual miniport fit into this model??? Where is it going to get its data from on its lower edge??? Don’t forget that NDIS indicates data only to the bound protocols (after putting it through LWFs on NDIS 6+) - miniports never get it from NDIS, and protocol driver X cannot prevent protocol driver Y from seeing the packet. This is how NDIS works.

Therefore, you need a filter in the network stack in order to make your model work. Your filter will look for the packets of interest, and redirect them to your virtual miniport via some proprietary interface of your choice. Your miniport, on its upper edge, will then indicate them to NDIS. Another option (probably the more correct one) would be a MUX, but, in my opinion, it is a more complex approach compared to the filter+miniport one.

In any case, you cannot get your job done by a virtual miniport alone. This is what I am trying to explain to you…

Concerning the rest, I guess Dave has already answered your questions…

Anton Bassov

Aaron_Young · November 5, 2012, 12:25pm

Maybe I didn’t explain the context of this design very well… The incoming IP packets are not arriving at a physical NIC (at least, not for all intents and purposes). They are being received in user-space by some means that is outside the scope of this design. You can think of this user-space application as the lower-edge of the virtual miniport.

I think that’s all that’s needed to understand what I’m trying to do, but just in case you’re curious I’ll give you a few more details without violating any disclosure policies. This user-space application I’m talking about is for testing a particular radio protocol that supports IP. The interface to the hardware that does the testing is TCP-based. Most of the tests we develop don’t care what goes in the IP packets running on top of the radio stack, but there are some tests where we have to support the application these IP packets are associated with in order to get the test into some desired state. But there is currently no TCP/UDP/IP stack to process these packets. So we are trying to get the local Windows stack to handle them for us.

Pavel_A1 · November 5, 2012, 4:06pm

Then, this would be interesting to you, if you do not need to support
WinXP: Raw IP, a.k.a. NdisMediumIP

http://msdn.microsoft.com/en-us/library/windows/hardware/ff559158(v=vs.85).aspx

– pa

Aaron_Young · November 5, 2012, 5:06pm

Thanks for the tip, Pavel. It looks like this will lessen the complexity a bit by keeping me from having to deal with Ethernet MAC headers.
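
To make the point about MAC headers concrete, here is a small user-mode C sketch of the difference between the two framings. With raw IP there is no EtherType field, so the IP version has to be read from the first nibble of the packet itself; with Ethernet framing the 14-byte header must be checked and skipped first. The function names are hypothetical, and the Ethernet variant assumes an untagged frame (no VLAN tag).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ETH_HEADER_LEN = 14 }; /* dst MAC (6) + src MAC (6) + EtherType (2) */

/* Raw IP packet: the version is the high nibble of the first byte.
 * Returns 4 or 6, or 0 if the buffer is empty or the version is unknown. */
static int ip_version(const uint8_t *pkt, size_t len) {
    if (pkt == NULL || len == 0) return 0;
    int version = pkt[0] >> 4;
    return (version == 4 || version == 6) ? version : 0;
}

/* Ethernet frame: check the EtherType (0x0800 IPv4, 0x86DD IPv6),
 * then skip the MAC header to reach the IP packet.
 * Returns 4 or 6, or 0 for short frames and non-IP traffic. */
static int ip_version_in_ethernet_frame(const uint8_t *frame, size_t len) {
    if (frame == NULL || len <= ETH_HEADER_LEN) return 0;
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype != 0x0800 && ethertype != 0x86DD) return 0;
    return ip_version(frame + ETH_HEADER_LEN, len - ETH_HEADER_LEN);
}
```

In the raw IP case the user-mode application can hand over exactly the bytes it received from the radio stack, with no synthetic MAC addresses to invent on the way in or strip on the way out.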