Audio/KS filtering confusion

I have a KMDF filter driver that is able to attach to media class devices and it can selectively block I/O. I need to be able to limit the scope to input devices like microphones while leaving output devices like speakers unaffected. When they are separate devices the simple strategy works fine, but when a single device contains speakers and a mic, for example, then it’s an all or nothing proposition which isn’t sufficient.

From what I can tell by the time the IOCTL_KS_* requests arrive the destination is already decided, I just can’t quite tell how. How can I correlate which file objects relate to which pin categories to know whether or not I want to block a particular request? I’ve tried looking at the DirectKS sample and ksuser.dll and when they create a handle, which will get an eventual IOCTL_KS_* sent to it, they seem to do the NtCreateFile passing the handle to the filter (audio filter, not filter driver) as the root directory of the objAttr and then passing some undocumented GUID as the filename. I’m not really following how this is mapped…

Thanks for reading!

-JT

xxxxx@gmail.com wrote:

I have a KMDF filter driver that is able to attach to media class devices and it can selectively block I/O.

For gosh sakes, why? Surely there are 1,000 better ways to do this.

I need to be able to limit the scope to input devices like microphones while leaving output devices like speakers unaffected. When they are separate devices the simple strategy works fine, but when a single device contains speakers and a mic, for example, then it’s an all or nothing proposition which isn’t sufficient.

What do you do about audio drivers that don’t do their streaming through
ioctls, like WaveRT drivers?

From what I can tell by the time the IOCTL_KS_* requests arrive the destination is already decided, I just can’t quite tell how. How can I correlate which file objects relate to which pin categories to know whether or not I want to block a particular request? I’ve tried looking at the DirectKS sample and ksuser.dll and when they create a handle, which will get an eventual IOCTL_KS_* sent to it, they seem to do the NtCreateFile passing the handle to the filter (audio filter, not filter driver) as the root directory of the objAttr and then passing some undocumented GUID as the filename. I’m not really following how this is mapped…

Clients open a file handle to the device, and separate file handles to
each pin. Streaming requests get sent to the pin.

The GUID is not random. It’s almost certainly
{146F1A80-4791-11D0-A5D6-28DB04C10000}, which a few seconds of grepping
would have told you is KSNAME_Pin. If you see that GUID, then you are
seeing an open for a pin. However, there’s more than that. The file
name actually contains binary data (which is allowed because it’s a
kernel-to-kernel transaction). After the GUID, you’ll find a
KSPIN_CONNECT structure that contains, among other things, the pin ID
that is the target of the connection. That’s how you make the connection.
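
To make the layout described above concrete, here is a rough user-mode sketch in plain C, not driver code. It takes the raw UTF-16 file name from a create request and, if the name starts with the KSNAME_Pin string, pulls the pin ID out of the KSPIN_CONNECT that follows. The function name parse_pin_create_name and the constant PIN_ID_OFFSET are made up for illustration. The offset of 48 assumes KSPIN_CONNECT begins with two 24-byte KSIDENTIFIER members (Interface and Medium); a real driver would use FIELD_OFFSET(KSPIN_CONNECT, PinId) from ks.h instead. Whether the name arrives with a leading backslash is also an assumption, so the sketch accepts both forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* KSNAME_Pin in string form, as quoted above (38 characters). */
static const char kPinGuid[] = "{146F1A80-4791-11D0-A5D6-28DB04C10000}";

/* Assumption: PinId follows two 24-byte KSIDENTIFIERs in KSPIN_CONNECT.
 * In a driver, prefer FIELD_OFFSET(KSPIN_CONNECT, PinId). */
#define PIN_ID_OFFSET 48u

static unsigned ascii_upper(unsigned c)
{
    return (c >= 'a' && c <= 'z') ? c - 32u : c;
}

/* name:  UTF-16 code units, e.g. FileObject->FileName.Buffer
 * bytes: length of the name in bytes, e.g. FileObject->FileName.Length
 * Returns 1 and stores the pin ID if the name looks like a pin create,
 * 0 otherwise (for example, a plain reference string). */
static int parse_pin_create_name(const uint16_t *name, size_t bytes,
                                 uint32_t *pin_id)
{
    assert(name != NULL && pin_id != NULL);

    size_t units = bytes / sizeof(uint16_t);
    size_t guid_len = sizeof(kPinGuid) - 1;
    size_t pos = 0;

    /* Skip an optional leading backslash. */
    if (units > 0 && name[0] == '\\')
        pos = 1;

    /* Need the GUID string plus the separator that follows it. */
    if (units - pos < guid_len + 1)
        return 0;

    /* Case-insensitive match against the KSNAME_Pin string. */
    for (size_t i = 0; i < guid_len; ++i) {
        if (ascii_upper(name[pos + i]) != (unsigned char)kPinGuid[i])
            return 0;
    }
    pos += guid_len;

    /* A backslash separates the GUID from the binary payload. */
    if (name[pos] != '\\')
        return 0;
    pos++;

    /* The rest is binary: KSPIN_CONNECT followed by the data format. */
    size_t remaining = (units - pos) * sizeof(uint16_t);
    if (remaining < PIN_ID_OFFSET + sizeof(uint32_t))
        return 0;

    memcpy(pin_id, (const unsigned char *)(name + pos) + PIN_ID_OFFSET,
           sizeof(uint32_t));
    return 1;
}
```

In a create handler, a return of 1 would mean the open targets a pin, and the pin ID could then be remembered against the new file object so later IOCTL_KS_* requests on that file object can be matched to it. The pin ID by itself does not say whether the pin is capture or render; that would presumably still have to be looked up on the filter, for example through KSPROPERTY_PIN_DATAFLOW or KSPROPERTY_PIN_CATEGORY.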


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi Tim,

I had hoped this would catch your eye as you seem to be the only one in the last 5 years that answers any KS-related posts :) I spent a week reading the archives before posting but wasn’t making much progress. You are correct about the GUID being the pin GUID, but I never see any GUID names make it to my driver’s create. I get names like “\espdifoutwave” and “\eduplicatedhpspeakertopo”. That said, I don’t require 1000 ways to do this, 1 would suffice. How would you suggest doing it? My feature wishlist would be to know what application is trying to read any type of input audio device (mostly mics) and in turn be able to block that attempt. Ideally the open could still be allowed to go through but the subsequent reads could be allowed or blocked on the fly. If blocking the open is the only robust way I could probably make that work as well.

Thanks again for taking the time to respond!
-JT

xxxxx@gmail.com wrote:

You are correct about the GUID being the pin GUID, but I never see any GUID names make it to my driver’s create. I get names like “\espdifoutwave” and “\eduplicatedhpspeakertopo”.

Those are the reference strings associated with the different pins.
Those names are stored in the registry (you can see them in an audio
device’s INF) and are strictly up to the manufacturer.

That said, I don’t require 1000 ways to do this, 1 would suffice. How would you suggest doing it?

I would suggest not doing it.

My feature wishlist would be to know what application is trying to read any type of input audio device (mostly mics) and in turn be able to block that attempt. Ideally the open could still be allowed to go through but the subsequent reads could be allowed or blocked on the fly. If blocking the open is the only robust way I could probably make that work as well.

Ah, I see. That’s utterly hopeless. The only time you’ll see an open
from the actual client process context is when the application uses
something like DirectShow to open the microphone. Apps that use waveIn
or DirectSound or WASAPI, which means MOST audio applications, end up
routing through the special Audio Engine process, and it’s the Audio
Engine that opens the driver handle. The originating process is long
gone by that time.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

I had noticed that everything seems to be coming from audiodg.exe. If I relax the requirements and say I just want to block all mic input during a period of time, independent of what user/process is requesting it, does this still fall in the “just don’t do it” bucket?

And just to further my education here, what is the “official” reason as to why this is not a supported design? Microsoft has provided frameworks so that I can do this exact same thing with files, registry objects, network connections, etc. - all of which are at least as “important” or “sensitive” as an audio stream, I would argue. Does this somehow tie back to DRM paranoia?

Thanks again for the dialog, I do appreciate it.
-JT

xxxxx@gmail.com wrote:

I had noticed that everything seems to be coming from audiodg.exe. If I relax the requirements and say I just want to block all mic input during a period of time, independent of what user/process is requesting it, does this still fall in the “just don’t do it” bucket?

Well, part of that is just me. I have a personal crusade against the
huge number of projects whose intent is to INHIBIT the successful
operation of my computer, rather than to ENHANCE it.

If the microphone is opened in shared mode, you can go set the
microphone volume to 0 using the standard system APIs and accomplish the
same thing. If the microphone is opened in exclusive mode, then you
can’t get in, and that’s consistent with their philosophy – in
exclusive mode, the audio hardware is owned by the application, and the
application ought to be in complete control. With the old driver model,
there were so many helpers and filters and “enhancers” that the
professional audio applications couldn’t do their work with acceptable
latencies. That drove the design of the new audio engine.

There is a mailing list specifically for audio driver writers called
[wdmaudiodev]. The professional audio guys hang out there, as do a
number of members of the Microsoft audio team. If you can ask your
question in a way that interests one of them, there’s no better way to
get the straight dope.

And just to further my education here, what is the “official” reason as to why this is not a supported design? Microsoft has provided frameworks so that I can do this exact same thing with files, registry objects, network connections, etc. - all of which are at least as “important” or “sensitive” as an audio stream, I would argue. Does this somehow tie back to DRM paranoia?

The “exclusive mode” stuff was certainly driven by DRM concerns.
However, audio is really in a different class from those other things.
You can’t do any damage to the computer through a microphone. It isn’t
an attack vector. I’m having a hard time visualizing the need for your
project.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

I would have to respectfully disagree on that last point. There are certainly nefarious apps which could be listening via my mic to a sensitive phone call that I’m making while seated at my desk. This could be of far more interest to an attacker than my collection of dog pictures. I imagine the NSA would love to have control over the mics of all Windows PCs, don’t you?

xxxxx@gmail.com wrote:

I would have to respectfully disagree on that last point. There are certainly nefarious apps which could be listening via my mic to a sensitive phone call that I’m making while seated at my desk. This could be of far more interest to an attacker than my collection of dog pictures. I imagine the NSA would love to have control over the mics of all Windows PCs, don’t you?

Maybe, but that’s not the attack vector your design addresses or
detects. The Audio Engine already has mechanisms in place (thanks to
Hollywood) to make sure that the path to exclusive applications is as
impediment-free as possible.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi Tim,

I’m not sure I understand either of your points. First, if I can block all apps reading from the mic then how am I not protecting against an unwelcome listener? The scenario is a bit of a hammer vs scalpel, but it allows the user to enable a secure mode and know he is not being listened to. Secondly, any exclusive access built into the system only helps prevent an attacker from listening in on something like a skype call, where the mic is currently in use for legitimate purposes. But the other 99.9% of the day when nothing legitimate is using the mic, how does the Audio Engine protect me against malware doing the listening?

-JT

xxxxx@gmail.com wrote:

I’m not sure I understand either of your points. First, if I can block all apps reading from the mic then how am I not protecting against an unwelcome listener?

If you are having a conversation in your office that should not be
overheard, then you had better mute the microphone. If you are having
an online conversation that is going to involve sensitive information,
then you need to be using an application that uses “exclusive mode”.
Then, the Audio Engine protects you. If you are having an online
conversation in an application that uses “shared mode”, then it’s pretty
freakin’ hopeless. How can you possibly hope to identify an “unwelcome
listener”?

But the other 99.9% of the day when nothing legitimate is using the mic, how does the Audio Engine protect me against malware doing the listening?

It can’t, and neither can you. If you’re in that kind of an
environment, maybe you’d better get in the habit of using microphones
with physical mute buttons.

Look, I know I’m arguing a strawman here. I’m not going to change your
mind, and that’s fine. I’m well known for my tilting-at-windmills
crusade against projects whose primary purpose is de-featuring my
computer, rather than enhancing it. I’ll go away now.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

As I said before, I definitely appreciate your responses; no argument is or was intended from me. I was just seeking clarification. And while I agree that if a mic were the only thing to worry about, a physical mute button might be the best route (although almost all of us have built-in mics on our laptops with no such feature), that certainly doesn’t help with someone turning on my webcam, or stealing my keystrokes, etc. There are many places where a user can be monitored, and thus a single “protect me while I do something sensitive” button may be of interest to some people. Enhanced security usually comes at the cost of reduced functionality, but as long as the user is clear on the trade-offs I don’t see it as de-featuring. I’ll leave it there. Again, I really appreciate the time you spent replying!

-JT