Can I complete the request in EvtIoInCallerContext?

Can I complete the request in EvtIoInCallerContext (that is, without calling WdfDeviceEnqueueRequest to enqueue the request)?

I mean, could I complete all I/O requests in EvtIoInCallerContext?
We don’t need any layered driver or queued I/O.
Are there any obscure problems if I do so?

Yes, you can do this, but why do you want to? This callback is meant to capture process-specific values and not much else.

d

dent from a phpne with no keynoard

-----Original Message-----
From: xxxxx@yahoo.com.cn
Sent: August 12, 2010 4:01 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Can I complete the request in EvtIoInCallerContext?

I mean, could I complete all I/O requests in EvtIoInCallerContext?
We don’t need any layered driver or queued I/O.
Are there any obscure problems if I do so?


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

I have the same issue.
I need to develop a high-performance (800K to 1.2M pps) driver for Windows.
We use the DeviceIoControl interface to give user-land applications a synchronous way to access our driver.
That is where the problem starts.
If we just follow the guidelines, we can’t reach that level of performance. (For each command, the user must create an event and then enter and leave a wait on it, and the kernel has to enqueue and dispatch the IRP, among other costs; these overheads alone are enough to miss the target.)
So we handle the overlapped I/O like this:
we complete it in EvtIoInCallerContext, where we are in the user’s thread context and block the thread, so the IRP is never queued and we avoid the further costs.

But of course this is not a proper way to do it.

So, how can I achieve multithreaded synchronous I/O? Or where is the proper place to do this?

Hi, Doron. Does WDF/KMDF really have such a serious performance issue? Is it
true that queueing IRPs results in low performance? I’m very interested in
wuhanck’s thread and would like to hear Doron’s opinion.

2010/8/15

> I need to develop a high-performance ( 800K to 1.2M pps ) driver for windows.

What is “pps”?

1.2MB/s is surely a tiny and funny figure.

> If we just obey the guideline, we can’t afford such high performance. (For each command, user must
> create an event and then enter it and leave it

No need, use IO completion ports.

> we complete it in the EvtIoInCallerContext, where we are in user’s thread context and block the
> thread, where IRQ are in no queues and we avoid further costs.

These costs are negligible.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

There was a discussion I started last year called “What are the maximum
number of I/O requests people have seen on a 2P system?” that will give you
a good idea of the overhead of KMDF, versus WDM, versus FAST I/O. For a
small payload one of the responders got roughly 180K operations a second for
KMDF, versus 300K operations a second for WDM (also achievable by using
EvtDeviceWdmIrpPreprocess). Going to Fast I/O got 1300K operations a
second.


Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
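
To put the rates quoted above in perspective, the average time available per request is simply the reciprocal of the rate. The following is a minimal sketch in plain user-mode C, not driver code; `ns_per_op` and `print_budgets` are made-up helper names, and the only inputs are the three figures from the message above.

```c
#include <assert.h>
#include <stdio.h>

/* Average time available per request, in nanoseconds, at a given
 * sustained rate. Hypothetical helper, not part of any WDK API. */
static double ns_per_op(double ops_per_second)
{
    assert(ops_per_second > 0.0);
    return 1e9 / ops_per_second;
}

/* Print the per-request budget for the three rates quoted in the thread. */
static void print_budgets(void)
{
    printf("KMDF     (180K ops/s):  %.0f ns per request\n", ns_per_op(180e3));
    printf("WDM      (300K ops/s):  %.0f ns per request\n", ns_per_op(300e3));
    printf("Fast I/O (1300K ops/s): %.0f ns per request\n", ns_per_op(1300e3));
}
```

Calling `print_budgets()` gives roughly 5556 ns, 3333 ns and 769 ns per request respectively, so a 1.2M requests-per-second target leaves well under a microsecond for the whole round trip.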

pps means packets per second.
It is used to measure network performance.
As an approximation, one pps means one IOCTL command per second.
1.2M pps means 1.2 million IOCTL commands per second.

> 1.2M pps means 1.2 million IOCTL commands per-second.

Can you switch to, say, 120K IRPs/s, but with 10 times larger buffers?


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com
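
The batching idea in this reply can be sketched in plain user-mode C. Everything here is an assumption made for illustration: the 64-byte packet size, the `packet_batch` layout, and the helper names are hypothetical and do not come from any existing driver interface. The sketch only shows how ten packets per request turn 1.2M packets per second into 120K requests per second.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACKET_SIZE 64u   /* assumed small-packet size */
#define BATCH_MAX   10u   /* "10 times larger buffers" */

/* Hypothetical batch layout: a count followed by fixed-size packets. */
typedef struct {
    uint32_t count;
    uint8_t  data[BATCH_MAX][PACKET_SIZE];
} packet_batch;

/* Append one packet. Returns 0 when the batch is already full and
 * must be submitted before more packets can be added. */
static int batch_add(packet_batch *b, const uint8_t pkt[PACKET_SIZE])
{
    assert(b != NULL && pkt != NULL);
    if (b->count >= BATCH_MAX)
        return 0;
    memcpy(b->data[b->count], pkt, PACKET_SIZE);
    b->count++;
    return 1;
}

/* Requests per second needed to carry a packet rate at a given
 * batch size, rounded up. */
static uint64_t requests_per_second(uint64_t packets_per_second,
                                    uint32_t packets_per_request)
{
    assert(packets_per_request > 0);
    return (packets_per_second + packets_per_request - 1) / packets_per_request;
}
```

With these definitions, `requests_per_second(1200000, 10)` evaluates to 120000. Whether batching is acceptable depends on how much latency the application can tolerate while a batch fills.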

Thanks, everyone.

Don Burn:
> 300K operations a second for WDM (also achievable by using EvtDeviceWdmIrpPreprocess)

Do you mean we have no choice except to do this in EvtDeviceWdmIrpPreprocess or EvtIoInCallerContext, which are not designed for it…?
Is there no proper way? Or is this a harmless method?
I am not quite sure about it.

I’ll check your post.
Thanks again.

Maxim S. Shatskih:
You can say: 1.2 Mpps equals 1,200K IRPs/s. :(

How big is a packet, and what are you going to do with it once you get it? A
colleague and I did a driver last year that, on a dual-core system, could
achieve over 600K messages a second, and it was hardware bound. But to get
there the hardware has to be smart, and the driver used Fast I/O if the
hardware was free and otherwise sent the request through KMDF.


Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

Don Burn:
A packet from user-land ranges from about 64 bytes to several million bytes.
We use an IOCTL to have it encrypted, compressed, or otherwise specially transformed.
The requirement is about 800K pps for 64-byte packets in user-land synchronous mode; that is the desired performance.

The hardware can do about 1.2M pps or higher for 64-byte packets.
We have achieved that on Linux, which is why some 800K is being asked for…

I am a newbie and I can’t find your earlier discussion about KMDF, WDM and Fast I/O…
Could you give me the link or some hints?

Thanks, everyone.

To Don Burn:
Fast I/O is designed for file systems, it has many limitations, and there is little documentation describing it. You mentioned “What are the maximum number of I/O requests people have seen on a 2P system?” above, but I can’t find it, sorry. Can you give me the link?

The thread is at
http://www.osronline.com/showThread.cfm?link=159595

Fast I/O can be used for IOCTLs by any driver; the article
http://www.osronline.com/article.cfm?id=166 gives a decent overview of Fast
I/O. The trick here is to use the Fast I/O handler only for requests you can
satisfy immediately. The second the hardware backs up, return FALSE from the
handler and let the KMDF functions handle the call. That way KMDF can do all
the good things it does, such as PnP and I/O queue management, but for
operations that can be handled immediately you get the speed.


Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
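
The "take the fast path only when the request can be satisfied immediately, otherwise decline" pattern described in this reply can be modelled in a few lines of plain user-mode C. This is a toy model, not driver code: `device_model`, `try_fast_path` and `submit` are hypothetical names, and the hardware is reduced to a simple capacity counter. In a real driver the fallback is not done by the caller; when a Fast I/O device-control handler returns FALSE, the I/O manager builds an IRP and sends it down the normal dispatch path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device: a bounded number of operations
 * the hardware can accept without making the caller wait. */
typedef struct {
    unsigned in_flight;   /* operations currently on the hardware */
    unsigned capacity;    /* how many it can hold at once */
    unsigned fast_done;   /* requests satisfied on the fast path */
    unsigned queued;      /* requests handed to the normal queued path */
} device_model;

/* Fast path: succeed only if the work can be accepted immediately.
 * Returning false mirrors returning FALSE from a Fast I/O handler. */
static bool try_fast_path(device_model *dev)
{
    assert(dev != NULL);
    if (dev->in_flight >= dev->capacity)
        return false;     /* hardware backed up: decline */
    dev->in_flight++;
    dev->fast_done++;
    return true;
}

/* Models the fallback: a declined request goes to the queued path,
 * where the framework's queueing and PnP handling apply. */
static void submit(device_model *dev)
{
    if (!try_fast_path(dev))
        dev->queued++;
}
```

Submitting three requests to a model with a capacity of two leaves two requests counted in `fast_done` and one in `queued`, which is the split the reply describes: speed for what can be done at once, the normal path for everything else.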

I can confirm that here at OSR in our testing, on a pretty conventional LAPTOP, we were able to achieve 1.2M PPS using Fast I/O.

Of course, it ALL depends on HOW MUCH YOU DO in your Fast I/O routine, right??

And let me hasten to add: I emphatically DO NOT RECOMMEND Fast I/O for device driver writers unless resorting to this unusual method is absolutely necessary. Unless there’s really zero alternative.

I would STRONGLY recommend people who want to achieve sustained rates of hundreds of thousands of I/Os per second through their drivers to SERIOUSLY RECONSIDER what they’re trying to do. In general, this is not a good design. It is fraught with all SORTS of inherent risks and dangers.

Just because you CAN do something, doesn’t mean it’s a good idea. I mean, you COULD cut off your arm and tie it off the stump with your t-shirt. It’d work. That doesn’t mean you SHOULD, right?

Peter
OSR