Performance estimation of Usermode drivers?

Could anybody share data on the performance of a UMDF driver vs. a similar kernel driver?

For a future project, I’m considering a usermode driver, but the customer has doubts about the performance impact. The device itself is quite trivial: it is a (PCI) memory-mapped array of various digital sensors, A/D converters etc. These will be polled at an unknown rate by an unknown number of client apps, with lots of small requests. We can ‘do the math’ on the device accesses, but have difficulty estimating the effect of thread switches, caches and other such things.
The expected target machines have at least a 4-core i5-class CPU, with enough RAM.

Regards,
– pa

I don’t think a UMDF driver is allowed to access BARs.

xxxxx@fastmail.fm wrote:

> Could anybody share data on the performance of a UMDF driver vs. a similar kernel driver?
>
> For a future project, I’m considering a usermode driver, but the customer has doubts about the performance impact. The device itself is quite trivial: it is a (PCI) memory-mapped array of various digital sensors, A/D converters etc. These will be polled at an unknown rate by an unknown number of client apps, with lots of small requests. We can ‘do the math’ on the device accesses, but have difficulty estimating the effect of thread switches, caches and other such things.

Microsoft reps on this list have quoted a 10% penalty over an all-kernel
driver. For most applications, that would be undetectable. That’s
especially true for sensors, because the bandwidth is usually quite
low. Most sensors are queried dozens of times per second, not hundreds
of thousands. That’s why the sensor driver model has been user-mode
from the start.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

xxxxx@broadcom.com wrote:

> I don’t think a UMDF driver is allowed to access BARs.

Actually, they can. WdfDeviceMapIoSpace creates a user-mode mapping to
BAR space:

https://msdn.microsoft.com/en-us/windows/hardware/drivers/wdf/reading-and-writing-to-device-registers


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

OR via a system call. The mapping is just one of two options.

Peter
OSR
@OSRDrivers

As of UMDF 2, both KMDF and UMDF are sufficiently similar that you can TRY it in user-mode and then TRY it in kernel-mode to see which you like best. For most drivers, the conversion process is just a few hours, and once you’ve got the code written that lets your driver work in either mode, you can switch back and forth at will.

Now… let me hasten to add: I’ve never done this for a real “commercial” project, but I *have* done it multiple times for play time projects… and it’s proved pretty easy.

Peter
OSR
@OSRDrivers

Tim Roberts wrote:

> Microsoft reps on this list have quoted a 10% penalty over an
> all-kernel driver.

I thought what was quoted was a 10% penalty of KMDF versus WDM, not UMDF versus KMDF.

I don’t know what an unnamed “Microsoft rep” might have said, but I can’t agree with a 10% penalty between WDM and KMDF. The perf difference really can’t be accurately measured for the general case.

A lot of what’s done in KMDF by the Framework is stuff that you’d need to do in your driver in WDM. Like, oh, IRP queuing for example. This makes it really difficult to compare the two, absent a specific design or implementation.

The UMDF perf hit relative to KMDF will depend heavily on the number of requests processed. Many small requests will suffer a bigger penalty than fewer large requests. Once you’re in the driver, I can’t imagine the perf is too much different… I guess depending on how you access your data. But that’s hardly news…
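
A back-of-the-envelope way to see why request count matters: treat each request as a fixed per-request cost plus the time spent actually touching the device, and add a fixed extra cost for the user-mode round trip. The sketch below does just that in Python. The function name and every microsecond figure are made-up placeholders for illustration, not measurements of any real driver; plug in your own numbers once you have them.

```python
def umdf_relative_penalty(device_time_us, kmdf_overhead_us, umdf_extra_us):
    """Fractional slowdown of one UMDF request versus the same request in KMDF.

    All three inputs are per-request times in microseconds and are assumptions
    supplied by the caller; nothing here is measured.
    """
    kmdf_total = kmdf_overhead_us + device_time_us
    umdf_total = kmdf_total + umdf_extra_us
    return umdf_total / kmdf_total - 1.0


# Placeholder figures only: a tiny register read vs. a large transfer.
small = umdf_relative_penalty(device_time_us=1.0, kmdf_overhead_us=10.0, umdf_extra_us=2.0)
large = umdf_relative_penalty(device_time_us=500.0, kmdf_overhead_us=10.0, umdf_extra_us=2.0)
print(f"small request: {small:.1%} slower, large request: {large:.1%} slower")
```

The same fixed extra cost is a noticeable fraction of a short request and close to a rounding error on a long one, which is all this model claims.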

Peter
OSR
@OSRDrivers

xxxxx@osr.com wrote:

> I don’t know what an unnamed “Microsoft rep” might have said, but I can’t agree with a 10% penalty between WDM and KMDF. The perf difference really can’t be accurately measured for the general case.

I remember a plainly stated quote from a list luminary, but I can’t find
it now. The search terms are just too common.

However, there was a discussion of this topic in 2008 which asserted
that a UMDF driver using small transfers sees a 10% to 20% penalty over
KMDF, but for a driver using larger transfers (64k), the difference is
negligible.

I agree that the KMDF to WDM difference is insignificant.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Tim Roberts wrote:

> I remember a plainly stated quote from a list luminary, but I can’t
> find it now. The search terms are just too common.
>
> However, there was a discussion of this topic in 2008 which asserted
> that a UMDF driver using small transfers sees a 10% to 20% penalty
> over KMDF, but for a driver using larger transfers (64k), the difference
> is negligible.

Here’s the post I was referring to. Look at message 4 of 7, where a “list luminary” gives a certain “figure”… list members will have to decide for themselves whether it is “insignificant”:

http://www.osronline.com/showThread.cfm?link=99794

Btw we fixed the fwd progress guarantee problem for storage since that post :)

Bent from my phone


From: xxxxx@lists.osr.com on behalf of xxxxx@gmail.com
Sent: Friday, January 6, 2017 7:40:14 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Performance estimation of Usermode drivers?

NTDEV is sponsored by OSR

Thank you Mr. Aseltine, for using your Google-fu to find the original quote you were thinking of. Much appreciated.

It’s a BIT more nuanced than “there’s a 10% perf penalty for using KMDF” – The quote from Mr. Holan (almost TEN years ago) is the following:

Wait for it…

[quote]
and if
you are already doing sync of pnp/power state while processing i/o under a lock,
the increase is smaller since KMDF is doing the same thing you were doing
beforehand.
[/quote]

But, it depends on your specific case… it’s hard to generalize, and Mr. Holan goes on to say:

Indeed you do.

MY net takeaway from all this is “The perf difference really can’t be accurately stated for the general case… but in most cases the perf delta between KMDF and WDM is insignificant. If you want to know the differences for a specific case, you’d have to measure.”

Others may take something else from Mr. Holan’s statements.

Peter
OSR
@OSRDrivers

Peter Viscarola (OSR) wrote:

> Thank you Mr. Aseltine, for using your Google-fu to find the original
> quote you were thinking of. Much appreciated.

FWIW, I used the OSR search tool, not Google, to find it.

> The perf difference really can’t be accurately stated for the general
> case… but in most cases the perf delta between KMDF and WDM is
> insignificant. If you want to know the differences for a specific case,
> you’d have to measure.

But what does that mean in practice? If you want to know how much perf you’re losing in KMDF, you have to implement your driver a second time in WDM, and “measure”, to really know? If that’s the case, then what’s the point?

Webinator rocks.

Yes.

PRECISELY!

Just don’t worry about it, and use KMDF.

Peter
OSR
@OSRDrivers

I think it means implement your driver in KMDF, and then measure it and see if the performance meets your needs.

If not, you can look at falling back to WDM for the paths you need to optimize.
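
For the "measure it" step, something along these lines may be enough to start with: time a large number of small requests from a test client and look at the median and the tail, not just the average. This is a rough Python sketch; `issue_request` is a hypothetical stand-in for whatever call your client makes to the driver (it is not a real API), and the nearest-rank 99th percentile is just one reasonable choice of tail metric.

```python
import statistics
import time


def time_requests(issue_request, count=10_000):
    """Call issue_request() count times and return each call's duration in ns."""
    samples = []
    for _ in range(count):
        start = time.perf_counter_ns()
        issue_request()  # hypothetical: one round trip to the driver under test
        samples.append(time.perf_counter_ns() - start)
    return samples


def summarize(samples_ns):
    """Median, nearest-rank 99th percentile and maximum of a non-empty sample list."""
    ordered = sorted(samples_ns)
    # Nearest-rank percentile in integer arithmetic: ceil(0.99 * n) - 1.
    p99_index = (99 * len(ordered) + 99) // 100 - 1
    return {
        "median_ns": statistics.median(ordered),
        "p99_ns": ordered[p99_index],
        "max_ns": ordered[-1],
    }
```

Running the same client against the KMDF build and the UMDF build and comparing the two summaries gives you numbers for your own request mix instead of a general-case figure.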

-p

Q: Which is faster, a Porsche 911 or a Toyota Sienna?
A: Does it matter if you have to drive 6 people someplace?

I was going to differ, but after I wrote my reply I decided we were actually in violent agreement ;)

If you don’t get the perf you need, it’s not like you’re going to start again and re-conceptualize your driver in WDM. That’s neither going to be wise, nor productive, nor even necessary.

Rather, if you don’t get the perf you need, you look at streamlining your KMDF design… which MAY include various types of optimizations and restructuring that include falling back to WDM (within the KMDF overall structure) for some specific handling.

Classic case:

You write your driver to have a parallel Queue that receives all/most of your Requests, and then – based on the type of Request or something – forwards those Requests to other specific Queues. This is a nice design. I’ve used it myself.

You aren’t getting the throughput you need, or your Request to Request latency is too large. OK, maybe you change things around a bit and do the dispatching yourself using an EvtDeviceWdmIrpDispatch Event Processing Callback, thereby delivering IRPs/Requests directly to the proper Queue in the first place.

Peter
OSR
@OSRDrivers

Thanks to everyone who answered. So, the bottom line is the perf impact is not expected to be large - but to verify it, the customer is going to get two drivers - a KMDF one and a UMDF one :)

But the main reason for me to consider UMDF is robustness against crashes. The customer may later want to do complicated processing in the driver (filtering, triggers …) using floating point math, maybe even a small “scripting language” like JS. This can decrease chattiness between the apps and the driver and compensate for the performance loss. Has anyone tried this in a UMDF driver?

Thanks,

- Pavel

Peter Viscarola wrote:

> If you don’t get the perf you need, it’s not like you’re going to start again
> and re-conceptualize your driver in WDM. That’s neither going to be wise, nor
> productive, nor even necessary.
>
> Rather, if you don’t get the perf you need, you look at streamlining your KMDF
> design… which MAY include various types of optimizations and restructuring

In this case, unfortunately, I don’t know enough up front about the app behavior to plan such optimizations. All I have now is the requirement to serve lots of small requests…

Thanks again,
– pa

If the scripting language is invoked through a DLL import and the closure of all of its dependencies are imports, it should work. Once you start getting into COM activation for interfaces, it is unclear whether it will.

Bent from my phone


Doron wrote:

> If invoking the scripting language is through a dll import and the closure of
> all of its dependencies are imports, it should work.

Great. All I need is to load DLLs, no COM involved.
Thank you.

– pa