Still struggling with NDIS Packets

Hi ppl

I have a user-mode application which sends UDP packets. I need to transmit
the packets as fast as I can, and I have to write a kernel-mode NDIS driver
for it. I have thought about an intermediate driver which would handle
these packets. The user-mode application would bypass the protocol driver
and send each packet directly to the intermediate driver, which would queue
it at the head of the send buffer so that it gets sent first. That way
high-speed streaming can be achieved. For this, I feel I need to make a
socket connection between the user app and my intermediate driver. Is that
possible? Or can I do it in some other fashion?

Another problem is how I should handle received packets for the user app
so that they reach the application as fast as possible. Note that the user
app I would make is only for testing purposes, so I can take some data,
make it into a UDP packet and send it down.

I am in the deep sea :( and totally lost. Please help.

regards
Arijit

On Tue, 2004-08-24 at 05:58, Arijit Bhattacharyya wrote:

> Hi ppl
>
> I have a User mode application which sends UDP packets. I need to transmit
> the packet as fast as I can. I have to write a kernel mode NDIS driver for
> it.

The architecture you described would work, but I admit I’m a bit
surprised that the difference is that significant. Lots of code manages
to be quite fast without resorting to a private IOCTL interface to an IM
driver. How did you decide you needed to do this?

That said, you should indeed be able to ioctl your packets from user
mode to your IM driver. You should probably use a direct-I/O method
(METHOD_IN_DIRECT or METHOD_OUT_DIRECT) for these IOCTLs, but it would be
an interesting exercise to time it against METHOD_BUFFERED as well.
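
For readers comparing the two approaches, here is a rough sketch of how the buffering method ends up inside an I/O control code. It mirrors the CTL_CODE macro from the Windows headers in Python purely to show the bit layout. The two IOCTL names and the function numbers 0x800 and 0x801 are hypothetical, chosen only for illustration; they are not taken from passthru or any real driver.

```python
# Constant values as defined in the Windows SDK/DDK headers.
METHOD_BUFFERED = 0
METHOD_IN_DIRECT = 1
METHOD_OUT_DIRECT = 2
METHOD_NEITHER = 3

FILE_ANY_ACCESS = 0
FILE_READ_ACCESS = 1
FILE_WRITE_ACCESS = 2

FILE_DEVICE_NETWORK = 0x12


def ctl_code(device_type, function, method, access):
    """Mirror of the CTL_CODE macro: pack four fields into one 32-bit code."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


def transfer_method(code):
    """The low two bits of a control code select the buffering method."""
    return code & 0x3


# Hypothetical "send one frame" requests, one per method, so that the two
# paths can be timed against each other. Function numbers from 0x800 up are
# the range left for vendor-defined codes.
IOCTL_SEND_FRAME_DIRECT = ctl_code(
    FILE_DEVICE_NETWORK, 0x800, METHOD_IN_DIRECT, FILE_WRITE_ACCESS)
IOCTL_SEND_FRAME_BUFFERED = ctl_code(
    FILE_DEVICE_NETWORK, 0x801, METHOD_BUFFERED, FILE_WRITE_ACCESS)
```

The only difference the I/O manager sees between the two requests is those low two bits, so defining both codes and dispatching them to the same send routine is one way to run the timing comparison suggested above.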

The passthru sample in the DDK shows how to use NdisMRegisterDevice to
manage a private IOCTL interface. That should get you started.

-sd

Well, I had to develop something in kernel mode that would support
high-speed streaming, so I guessed an intermediate driver was one
choice. I am still in the design phase, so if you think there is some
other design that would be more efficient and easier to build in kernel
mode, I would be glad to hear it. Another thing you did not mention was
whether an IOCTL interface would suffice. Can I send UDP packets down to
the intermediate driver by putting them into a user buffer? And
similarly, when I receive a packet destined for that user app, how should
I handle that packet? One idea is as follows:

1) Set up a socket in the user app which listens on a port.
2) When a packet destined for that port arrives at the NIC, have the
intermediate driver push that packet to the top of the receive buffer so
that it is delivered to that port faster.

What do you think of this solution? Can you suggest a better one?

Thanks in advance. You might be surprised by some of my questions, but I am
new to driver programming. I have only worked on simple device drivers
before :(

Please help.

Regards
Arijit

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: xxxxx@fht-esslingen.de
To unsubscribe send a blank email to xxxxx@lists.osr.com

Please give your requirements in detail, so that I can suggest a
solution if I know one.

On Tue, 2004-08-24 at 06:29, Arijit Bhattacharyya wrote:

> Well, I had to develop something in the kernel mode that would resort to
> high speed streaming. So I guess making an intermediate driver was one
> choice. Well, I am still in the design phase, so if you think that there is
> some other design that would be more efficient and easier to make in the
> kernel mode,

I’m not sure it would be more efficient, but I’m guessing another design
could be “efficient enough” for your requirements. I vaguely remember
Arlie Davis posting something in this forum a long time ago about how to
optimize networking from user mode. He seemed to indicate that you
shouldn’t have any trouble if you use winsock properly. What are your
performance requirements? What made you decide you had to write
kernel-mode code?
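
As a baseline to measure before going to kernel mode, a plain sockets sender could look roughly like the sketch below. It is not a reconstruction of the post mentioned above; the function names, the 1 MB buffer request and the idea of enlarging the send buffer at all are assumptions made for illustration. On Windows the Python socket module sits on top of winsock.

```python
import socket


def make_udp_sender(send_buffer_bytes=1 << 20):
    """Create a UDP socket and ask for a larger kernel send buffer.

    The operating system may clamp or adjust the requested size.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_buffer_bytes)
    return sock


def send_burst(sock, dest, payloads):
    """Send each payload as one datagram; return the total bytes handed over."""
    sent = 0
    for payload in payloads:
        sent += sock.sendto(payload, dest)
    return sent
```

Timing send_burst against the target rate gives a concrete number to compare with whatever the IOCTL path achieves later.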

> I would be glad to hear it. Another thing which you did not
> mention was whether an IOCTL interface would suffice. Can I send UDP
> packets down to the intermediate driver by transferring it into a user
> buffer? and similarly, when I receive a packet destined for that user app,
> then how should I handle that packet?

Yeah, the basic idea would be that you build your UDP datagram in a user
buffer, IOCTL that to the driver using direct I/O (METHOD_IN_DIRECT or
METHOD_OUT_DIRECT), and then the driver forwards it directly to the
card. You’ll have to add a MAC header in the IM driver, unless you ioctl
the source MAC. Also, you’ll have to resolve the destination MAC address
with ARP from user mode (I think the function for this is in iphlpapi).
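
To make the byte layout concrete, here is a minimal sketch of a complete Ethernet/IPv4/UDP frame being assembled in a buffer. It builds all three headers in one place for illustration, whereas the scheme above adds the MAC header inside the IM driver. The function names, the TTL of 64 and the zero IP identification field are assumptions; IP options, fragmentation and VLAN tags are ignored.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src_ip, dst_ip, payload_len, checksum=0):
    """20-byte IPv4 header, no options, protocol 17 (UDP), TTL 64."""
    return struct.pack("!BBHHHBBH4s4s",
                       0x45, 0, 20 + payload_len, 0, 0, 64, 17, checksum,
                       src_ip, dst_ip)


def build_udp_frame(dst_mac, src_mac, src_ip, dst_ip,
                    src_port, dst_port, payload):
    """Return Ethernet + IPv4 + UDP headers followed by the payload."""
    udp_len = 8 + len(payload)
    # The UDP checksum also covers a pseudo-header taken from the IP layer.
    pseudo = struct.pack("!4s4sBBH", src_ip, dst_ip, 0, 17, udp_len)
    udp_zero = struct.pack("!HHHH", src_port, dst_port, udp_len, 0)
    # A computed value of 0 is transmitted as 0xFFFF (0 means "no checksum").
    udp_sum = internet_checksum(pseudo + udp_zero + payload) or 0xFFFF
    udp = struct.pack("!HHHH", src_port, dst_port, udp_len, udp_sum) + payload
    ip_sum = internet_checksum(ipv4_header(src_ip, dst_ip, udp_len))
    ip = ipv4_header(src_ip, dst_ip, udp_len, ip_sum)
    eth = struct.pack("!6s6sH", dst_mac, src_mac, 0x0800)
    return eth + ip + udp
```

Addresses are passed as raw bytes (6 for a MAC, 4 for an IPv4 address), which is the form the user-mode side would have after resolving the destination with ARP.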

As for inbound, you’d have to parse every packet that comes in and look
for your special packets. This is similar to any other packet filtering
code. Ioctl the right packets back to user mode and you’re done.
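
The inbound check can be sketched as a small parser. This is only an illustration of the header walk, assuming untagged Ethernet II frames carrying IPv4; the function names are hypothetical, and a real driver would do the same walk over its receive buffers in C.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
ETH_HEADER_LEN = 14


def udp_dest_port(frame: bytes):
    """Return the UDP destination port of an Ethernet/IPv4 frame, else None."""
    if len(frame) < ETH_HEADER_LEN + 20:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return None
    version_ihl = frame[ETH_HEADER_LEN]
    if version_ihl >> 4 != 4:
        return None
    ip_header_len = (version_ihl & 0x0F) * 4  # IHL counts 32-bit words
    if ip_header_len < 20 or frame[ETH_HEADER_LEN + 9] != IPPROTO_UDP:
        return None
    # Fragments other than the first do not carry a UDP header.
    (flags_frag,) = struct.unpack_from("!H", frame, ETH_HEADER_LEN + 6)
    if flags_frag & 0x1FFF:
        return None
    udp_offset = ETH_HEADER_LEN + ip_header_len
    if len(frame) < udp_offset + 8:
        return None
    (dst_port,) = struct.unpack_from("!H", frame, udp_offset + 2)
    return dst_port


def is_special(frame: bytes, port: int) -> bool:
    """True when the frame is a UDP datagram addressed to the given port."""
    return udp_dest_port(frame) == port
```

Everything that fails the check is passed up the stack untouched, which is what keeps the rest of the traffic working.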

The details are a little tricky; study passthru and www.ndis.com.

> One idea is as follows:
>
> 1)set up a socket in the user app which is listening at a port
> 2)when a packet comes at the NIC destined for the specified port, then in
> the intermediate driver, just push that packet to the top of the receive
> buffer so that it is delivered faster to that particular port.

I’m not quite sure how this works; you’re not bypassing winsock at all
here, and you seem to be thinking you can reorder tcpip.sys’s internal
queues. I may be misunderstanding you, but if I’m not, this idea is
doomed. Either do normal winsock calls (recommended) or do the ioctl
thing I described above.

-sd

Thanks for the reply. Actually, it was my mistake to overemphasize the
user-mode application. I am building the user-mode application for testing
purposes only, just to see whether I am able to send and receive UDP
packets faster. In reality, my driver would be sending and receiving all
the audio/video packets at a higher speed. Hence the need to build
something in kernel mode which looks at every packet, not just the packets
for a particular user-mode app.

Now another question. Given the above scenario, how do you think I can
speed up reception of audio/video data? One way I thought of was
reordering the receive buffer so that audio/video data gets passed up
faster, leaving the other packets to be sent up later. But you seem to
say that this idea won’t work. Why is that? Your idea of sending the
special packets directly up to the app does not fit my case, since it would
not be just one single app but any app which is expecting audio/video
data. The user app that I would make is only for testing. I think you get
my point. Can you suggest some architecture for it?

OK, about sending the packets, I think you agree with my IOCTL approach, so
no problem with that. Still, if you feel something better can be done,
please guide me.

As I said, I am still designing the architecture and would welcome any
ideas regarding my problem. Please be kind enough to help.

Regards
Arijit.

On Tue, 2004-08-24 at 07:52, Arijit Bhattacharyya wrote:

> Now another question. Given the above scenario, how do you think that I
> can speed up reception of audio video data? One way which I thought was
> reordering the receive buffer so that audio/video data gets transmitted up
> faster and leave the other packets to be sent up later. But you seem to
> say that this idea won’t work. Why is this?

Ahh, sorry, I misunderstood the last time. Yeah, you can probably
re-order the miniport receive queue, but be warned that you’ll have to
handle things carefully or you could really hurt performance of
everything else. You should try not to re-order parts of a stream -
i.e. keep each socket’s packets in order, even if you shuffle the
overall order of the packets.
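
One way to honor that constraint is a stable partition: move the priority packets to the front but never change the relative order inside either group. The sketch below assumes the classification is per flow (for example, by destination port), so every packet of a given stream lands in the same group. The function name and the (port, sequence) tuples are illustrative only.

```python
def prioritize(queue, is_priority):
    """Stable partition of a batch of received packets.

    Priority packets come first; within the priority group and within the
    remainder, packets keep the order in which they arrived.
    """
    urgent = [pkt for pkt in queue if is_priority(pkt)]
    normal = [pkt for pkt in queue if not is_priority(pkt)]
    return urgent + normal


# Packets modeled as (destination port, sequence number) for the example.
batch = [(80, 1), (5004, 1), (80, 2), (5004, 2), (53, 1)]
reordered = prioritize(batch, lambda pkt: pkt[0] == 5004)
```

Because nothing is reordered inside a group, a stream that is classified consistently still arrives in sequence; only its position relative to other traffic changes.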

> Your idea of sending up the
> special packet directly to the app is not my concern since it would not be
> just one single app but any app which is looking forward to audio video
> data. The user app that I would make is only for testing. I think you get
> my point. Can you suggest some architecture for it?

What you’ve proposed seems as good as any to me. If you control the
apps, the IOCTL will work; if you don’t, you are stuck with your
re-ordering idea. I don’t think you’re going to see much of a perf
improvement, though. Perhaps someone else has some advice.

-sd

Thanks very much. I really can’t figure out any other architecture, so
I will probably stick to this one and pray for a performance improvement. I
went through Arlie Davis’s mail and it had some nice insights about what to
do to improve network performance. Though they were for user mode, I
think I can use some of that wisdom for my kernel-level code as well.

Please be kind enough to mail me if anyone has any other ideas. I think it
will take a day or two to finalize the design. Till then, I hope
someone else comes up with a better idea :)

with regards
Arijit

> Now another question. Given the above scenario, how do you think that I
> can speed up reception of audio video data? One way which I thought was
> reordering the receive buffer so that audio/video data gets transmitted up
> faster and leave the other packets to be sent up later. But you seem to

This is only possible with QoS Packet Scheduler.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com