Is KMDF a suitable framework for an Audio KS driver?

I have an opportunity to write both an audio driver and several APO’s, and would like to get some sage, seasoned advice. The vast majority of my drivers are written in KMDF … for a variety of reasons, it’s been my go-to springboard for drivers and I’d like for this audio driver to be no exception.

The driver is supposed to be getting audio data off a hardware tap from a digital microphone, and there is hardware to support this which then exposes a USB interface. I’ve reviewed the SYSVAD sample (written in WDM) in the WDK and have searched here and on StackOverflow for others who might be writing audio drivers … everyone writing audio drivers seems to be using the SYSVAD sample as a starting point and moving from there, staying in WDM. The APO’s appear to be pretty straightforward and there are good examples I’ve found of those.

The architecture is reasonably straightforward … the bottom edge is communicating with a USB chipset for getting the audio data from a microphone, the top edge is exposing some pins and attributes … is there an architectural reason why KMDF would be an unsuitable framework for an audio driver?

Thx!

The “sysvad” sample is a KMDF driver. You have to run it in “miniport” mode, and you can’t let it do dispatching. Because of that, I’m unclear as to what you gain, but it is possible.

It would be better to make your hardware USB Audio Class compliant and let the standard usbaudio.sys driver handle things. It’s not that hard.

Thanks for the feedback Tim! I’ll research KMDF in “miniport” mode and hopefully I can shoehorn the important stuff from SYSVAD into my KMDF framework … but it’s good that you don’t see any fundamental roadblocks with using a KMDF based driver as a starting point.

As to the usbaudio.sys suggestion, well, … if only the people who designed hardware were up on the MS spec’s when they designed it, then all I would need to do is fill out a new .inf and I’m done … but where’s the fun in that! Typically the people who design hardware come from a Linux universe and they have their own ideas on how hardware should work … again which is fine, keeps me busy!

In this particular case, the USB chipset the hardware folks have chosen to communicate with the downstream chip which has the hardware tap exposes five endpoints: EP 0, two INT EP’s (IN/OUT) and two Bulk EP’s (IN/OUT), with the USB chipset using the two INT EP’s as mailboxes and the Bulk EP’s for loading firmware into the downstream chip and pulling audio data off an internal buffer on the downstream chip …

Which is a nice general solution for the USB chipset, but which doesn’t quite fit into the USB Audio 2.0 spec … :slight_smile:
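To make the endpoint layout described above concrete, here is a minimal sketch in plain C (not driver code) that sorts a list of endpoint descriptors into the four roles mentioned: two interrupt "mailboxes" and two bulk pipes. It relies only on the standard endpoint descriptor encoding (direction in bit 7 of bEndpointAddress, transfer type in bits 1..0 of bmAttributes). The struct names, the role names, and map_pipes itself are hypothetical, made up for illustration. EP 0 has no endpoint descriptor, so it does not appear in the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Transfer type values as encoded in bits 1..0 of bmAttributes. */
enum xfer_type {
    XFER_CONTROL = 0,
    XFER_ISOCHRONOUS = 1,
    XFER_BULK = 2,
    XFER_INTERRUPT = 3
};

/* Only the two standard descriptor fields this sketch needs. */
struct endpoint_desc {
    uint8_t bEndpointAddress; /* bit 7 set = IN (device to host) */
    uint8_t bmAttributes;     /* bits 1..0 = transfer type */
};

/* Hypothetical roles for the pipes described in the thread.
   Each field is an index into the descriptor array, or -1. */
struct pipe_map {
    int mailbox_in;
    int mailbox_out;
    int bulk_in;
    int bulk_out;
};

static int is_in(const struct endpoint_desc *ep)
{
    return (ep->bEndpointAddress & 0x80) != 0;
}

static enum xfer_type type_of(const struct endpoint_desc *ep)
{
    return (enum xfer_type)(ep->bmAttributes & 0x03);
}

/* Returns 0 when exactly one endpoint was found for each role,
   -1 for any other layout (duplicates, missing pipes, or an
   isochronous/control endpoint descriptor). */
int map_pipes(const struct endpoint_desc *eps, size_t count,
              struct pipe_map *out)
{
    assert(eps != NULL && out != NULL);
    out->mailbox_in = out->mailbox_out = -1;
    out->bulk_in = out->bulk_out = -1;

    for (size_t i = 0; i < count; i++) {
        int *slot;
        switch (type_of(&eps[i])) {
        case XFER_INTERRUPT:
            slot = is_in(&eps[i]) ? &out->mailbox_in : &out->mailbox_out;
            break;
        case XFER_BULK:
            slot = is_in(&eps[i]) ? &out->bulk_in : &out->bulk_out;
            break;
        default:
            return -1; /* not the layout described above */
        }
        if (*slot != -1)
            return -1; /* two endpoints claim the same role */
        *slot = (int)i;
    }
    return (out->mailbox_in >= 0 && out->mailbox_out >= 0 &&
            out->bulk_in >= 0 && out->bulk_out >= 0) ? 0 : -1;
}
```

Note that nothing in this layout is isochronous, which is the transfer type the USB Audio Class uses for streaming audio data, so a descriptor set like this would not be picked up by a class driver.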

As to the usbaudio.sys suggestion, well, … if only the people who designed hardware were up on the MS spec’s when they designed it, then all I would need to do is fill out a new .inf and I’m done

The USB Audio Class specification has been an international standard since 1998. There’s nothing Microsoft-specific about it, and really no excuse for producing an audio device that ignores it.

It’s easy enough to write a WinUSB-based application to talk to your device. Do you seriously need this to be treated as a system audio device? That’s not going to be easy.

Hi Tim, I have a question in regards to this thread. It is in regards to a USB device needing to provide Windows 10 functionality explicitly through kernel mode driver properties that I doubt the usbaudio.sys driver is actually implementing. It is not after all part of the USB Audio Class specification device interface that is public. Given that that is public, I assume that a Windows driver can also implement this interface to talk to the device and this driver would not be usbaudio.sys.

An example of this would be a hardware keyword spotter device. This requires, according to MSDN, a combination of kernel mode driver properties as well as a keyword detector COM DLL in user mode. The kernel driver properties for a hardware keyword spotter provide a COM CLSID that tells Windows which keyword detector COM DLL to load, through a new Windows 10 kernel mode audio driver property (this DLL is also registered through INF entries but it appears to require the kernel mode driver property as well at this time). This requirement seems to be tightly coupled. The COM DLL is called in user mode (I think AarSvc_xxxxx is the name of the service process) and Windows would call into other kernel mode properties after loading the DLL. In addition, when the hardware keyword spotter detects an event, the kernel mode driver property would provide data back to Windows, which the COM interface would use to call into the user mode COM DLL. As you can see, this does not match the standard for a USB Audio Class specification device and usbaudio.sys is probably not written to this specification as it is probably very specific.

So my question, based on your answer above, is this: you suggest that the audio device should provide a USB Audio Class specification device interface to Windows. This would work for the basic functionality that a USB microphone would provide. But given the coupling between the COM interface and the kernel mode audio driver properties, it sounds like there would need to be a custom audio driver that communicates with the device without using usbaudio.sys. It understands the USB Audio Class specification but implements it itself. Am I correct in my understanding here?

I’m not sure that’s a good example. Microsoft only expects tablets or phones to have a hardware keyword spotter. Those units have built-in audio hardware with a custom driver. Everyone else is expected to use a software keyword spotter.

Microsoft discusses hardware keyword spotting here [https://docs.microsoft.com/en-us/windows-hardware/drivers/audio/voice-activation] and that doesn’t say “for phones only” (which are deprecated, you’ll never see another MS phone again) or “for tablets only” in the documentation. More importantly, to properly support WakeOnVoice you pretty much need a hardware KWS for it to function like a HID WakeOnX event, similar to a mouse event triggering a WakeOn event. That’s why they are discussing HW KWS in that context, I would gather …

Expecting folks to implement a software KWS when you also want to implement “WakeOnAudio” kinda breaks the whole model of “keep my computer in as low power mode as possible until a wake event occurs” … it would be like implementing WakeOnLan (which is hardware) in software by having a packet sniffer always running …

The statement appears valid; the USBAudio class driver currently simply does not support what MS requires for a hardware KWS … a custom driver would appear to be the proper course of action to support this, which is likely why the doc’s say “… Create a custom keyword detector based on the SYSVAD sample described later in this topic …” and not “using the MS USB Audio class driver, …”

Or am I missing something in the doc’s somewhere?

Or am i missing something in the doc’s somewhere?

Yes, you are missing the many references to SoC vendors. That’s “system on a chip” – the processors where everything including the kitchen sink is packed onto one chip. Those are only used in small devices.

The statement appears valid; the USBAudio class driver currently simply does not support what MS requires for a hardware KWS

Agreed. The driver cannot possibly do so, because the USB Audio Class spec does not provide any way for this to be done. It has to be a WaveRT device.

Consider how much of the system would have to be alive in order for a USB audio device to do this. You’d need the whole host controller, the hubs, and the device. You’d need the audio driver to be fully alive, in order to exchange packets. That’s not a “low power condition”. I don’t think you’ll ever do “hardware keyword spotting” with a USB device. It’s only practical with the SoC model.

The most interesting thing on that Voice Activation page is this:

Can you even WAKE a system with “Hey Cortana” anymore? Isn’t Cortana going on a long trip soon, you know, being sent to “live on a farm”??

I thought Cortana was being sent to live in the cloud…

(Smile) Yes, Cortana has indeed been led off to look at that last sunset … too bad too, I really liked it in Halo! Maybe if they had “Clippy” pop up for the Cortana avatar that would have endeared it to the masses … :slight_smile: … or not …

Regarding the HW KWS and “WakeOnAudio” though, I’m still going to gnaw on that bone a bit more … as I recall I can set Windows 10 to “wake” on various USB HID devices (keyboard and mouse), using the process documented here [https://www.intowindows.com/wake-pc-from-sleep-using-keyboard-and-mouse-in-windows-10/#:~:text=Wake%20Windows%2010%20computer%20from%20sleep%20with%20a%20keyboard&text=Step%202%3A%20In%20the%20Device,to%20wake%20the%20computer%20option.]

This seems to work on my laptop running Win10 20H1 … I set the USB keyboard to “wake” the computer in the properties, I put the OS to “sleep” with the laptop “sleep” button, I hit the keyboard, it “wakes” up. Same with the USB mouse … set the properties to “wake” the computer, put the OS to “sleep”, move the mouse, it “wakes” up …

The USB keyboard, and USB mouse, go through a USB host controller, an internal USB hub, and have fully alive USB drivers (and my laptop is definitely not an SoC design) … so all of that sleeping and waking seems to work under the “WakeOnX” paradigm (here WakeOnMouse or WakeOnKeyboard), so apparently “sleep” on a laptop running Win10 20H1 is “… but keep awake the USB host controller and internal hubs so I can respond to a USB mouse or USB keyboard activity” …

Suppose then I have a hardware device that is somehow capable of monitoring an audio stream with, say, a hardware tap to an existing always-on digital microphone. Suppose further that I have that hardware expose some USB endpoints, and then also suppose that I write a [custom] USB driver to connect to those endpoints … that USB connection will be using the same USB host controller and USB internal hub as my USB mouse and keyboard, which I’ve just shown are going to be “alive” although my laptop has been put to “sleep”.

It would stand to reason then that just as we have a “WakeOnKeyboard” functionality with a USB keyboard and a “WakeOnMouse” with a USB mouse we would also be able to have a “WakeOnSomeNoise” with a pure USB audio monitor … and if that USB audio monitor sitting on the audio stream also had a hardware keyword spotter (so it didn’t just wake when someone closed a door, but woke when someone said “WakeUp!”) then we would apparently now have a “WakeOnAudio” functionality …

… which is what I believe the MS document I cited discusses. I think what the references to “SoC” are getting at is that you would need to have the hardware KWS chip, well, soldered onto the hardware and not just attached to a dongle … I happen to have sitting on my desk here a commercially available laptop with just such a hardware KWS chip mounted onto the laptop motherboard which is monitoring an “always on” digital microphone (it buffers up 2 sec of audio) and which exposes some USB endpoints …

Hmm …
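As an aside on the “buffers up 2 sec of audio” detail: a rough sketch of what such a pre-roll buffer could look like, in plain C. The audio format (16 kHz, mono, 16-bit PCM) is purely an assumption for illustration; the thread doesn’t say what the chip actually uses. With those numbers, 2 seconds works out to 32000 samples, or 64000 bytes. The names preroll_init, preroll_push and preroll_drain are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed format, not from the thread: 16 kHz, mono, 16-bit PCM. */
#define SAMPLE_RATE_HZ  16000u
#define PREROLL_SECONDS 2u
#define PREROLL_SAMPLES (SAMPLE_RATE_HZ * PREROLL_SECONDS)

struct preroll {
    int16_t samples[PREROLL_SAMPLES];
    size_t next;   /* index the next sample is written to */
    size_t filled; /* valid samples, capped at PREROLL_SAMPLES */
};

void preroll_init(struct preroll *p)
{
    p->next = 0;
    p->filled = 0;
}

/* Append n samples; once the buffer is full the oldest are overwritten. */
void preroll_push(struct preroll *p, const int16_t *src, size_t n)
{
    assert(p != NULL && (src != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        p->samples[p->next] = src[i];
        p->next = (p->next + 1) % PREROLL_SAMPLES;
        if (p->filled < PREROLL_SAMPLES)
            p->filled++;
    }
}

/* Copy up to cap buffered samples into dst, oldest first, and
   remove them from the buffer. Returns the number copied. */
size_t preroll_drain(struct preroll *p, int16_t *dst, size_t cap)
{
    size_t n = p->filled < cap ? p->filled : cap;
    size_t start = (p->next + PREROLL_SAMPLES - p->filled) % PREROLL_SAMPLES;

    for (size_t i = 0; i < n; i++)
        dst[i] = p->samples[(start + i) % PREROLL_SAMPLES];
    p->filled -= n;
    return n;
}
```

The idea would be that the device keeps pushing samples while the host sleeps, and after a keyword hit the host drains the buffer (in the hardware described above, presumably over the bulk IN pipe) so the audio leading up to the wake event isn’t lost.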

(Caveat: I haven’t looked at the whole topic of hardware KWS for at least a couple of years… when we were approached to work on one.)

USB devices can certainly wake the system from S4, and I believe waking from S5 is supported as well (assuming you have the right BIOS support). I know that we did wake from S5 using CIR (consumer infrared) “back in the day” when there was Windows Media Center. Remember that? No? You’re better off…

So, if you had a sufficiently clever hardware Key Word Spotter… I can absolutely imagine it being capable of waking the system.

Peter