Manipulating an audio stream externally from a Sysvad-based kernel-mode driver

Hello, my name is André Fellows and I am part of a team that is just starting out in driver development.

The project goal is to make a noise canceling driver.

Since nobody on the team was familiar with the subject, we spent a lot of time reading Microsoft's documentation and forum posts.
We managed to get the Sysvad sample working; it is quite complete and does much of what we need.
However, we still don't understand how to send the audio stream from within the driver to an application/service, receive the processed stream back, and send it on to the real audio device.

What would be the most appropriate way for a Sysvad-based kernel-mode driver to send a buffer (stream) to an external application for processing, receive the modified stream back, and send it to the real audio device?

The capture flow we have in mind would be more or less this:
1 - in Skype, the user selects the Sysvad-based driver's microphone
2 - the driver captures the microphone sound
3 - the driver sends the microphone stream to an application that removes noise from the stream
4 - the driver receives the modified stream back
5 - the driver sends the stream to Skype

The render flow we have in mind would be more or less this:
1 - in Skype, the user selects the Sysvad-based driver's speaker
2 - Skype sends the audio stream to the driver
3 - the driver sends the audio stream to an application that removes noise from the stream
4 - the driver receives the modified stream back
5 - the driver sends the stream to the actual system speaker

In both cases, since the driver is based on Sysvad, it knows nothing about the real hardware (microphone and speaker). How can this Sysvad-based driver send streams to, or receive streams from, the real hardware?

Thanks in advance,
André Fellows

(please don’t post the same content twice… It just makes work for us and confusion for people who see the double post)

@“Peter_Viscarola_(OSR)” said:
(please don’t post the same content twice… It just makes work for us and confusion for people who see the double post)

Sorry Peter, I didn't mean to do this. I thought this post was gone because I couldn't find it in my topics.

Did you look in the archives? We had a discussion on this exact topic this month.

The big flaw in your plan is that the sysvad driver has no means to communicate with the real hardware. That job has to be done by the application. Your application will pull microphone data from the real microphone, do its processing, and send it to sysvad through a backdoor. Sysvad will deliver that to Skype. Skype will send speaker data back to sysvad, and your application will pull that data through a back door.

Please note that there are already a number of companies doing this exact thing. Unless you have a research team that has come up with a killer new noise cancellation algorithm, you're not going to make the world a better place.
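The processing loop Tim describes can be sketched in portable C. Everything named here is illustrative: `real_mic_read()` and `sysvad_write()` are hypothetical stubs standing in for WASAPI capture and a `DeviceIoControl`-based backdoor into sysvad, and the noise gate is just a placeholder for whatever DSP the app actually runs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAMES_PER_PERIOD 480   /* e.g. 10 ms at 48 kHz, mono */

/* Placeholder DSP: a trivial noise gate standing in for the real
 * noise-cancellation algorithm. Samples below the threshold are
 * treated as noise and squelched. */
static void denoise(int16_t *samples, size_t n, int16_t threshold)
{
    for (size_t i = 0; i < n; i++) {
        if (samples[i] > -threshold && samples[i] < threshold)
            samples[i] = 0;
    }
}

/* HYPOTHETICAL stub: real code would pull from the capture endpoint
 * via WASAPI (IAudioCaptureClient::GetBuffer). */
static size_t real_mic_read(int16_t *buf, size_t frames)
{
    memset(buf, 0, frames * sizeof(*buf));
    return frames;
}

/* HYPOTHETICAL stub: real code would hand the period to sysvad
 * through a private IOCTL via DeviceIoControl. */
static void sysvad_write(const int16_t *buf, size_t frames)
{
    (void)buf;
    (void)frames;
}

/* One period of the app's capture path: real mic -> DSP -> sysvad. */
static void process_one_period(void)
{
    int16_t period[FRAMES_PER_PERIOD];
    size_t got = real_mic_read(period, FRAMES_PER_PERIOD);
    denoise(period, got, 200);
    sysvad_write(period, got);
}
```

The render path is the mirror image: the app pulls Skype's output from sysvad through the same kind of backdoor, processes it, and writes it to the real speaker endpoint.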

@Tim_Roberts said:
Did you look in the archives? We had a discussion on this exact topic this month.

The big flaw in your plan is that the sysvad driver has no means to communicate with the real hardware. That job has to be done by the application. Your application will pull microphone data from the real microphone, do its processing, and send it to sysvad through a backdoor. Sysvad will deliver that to Skype. Skype will send speaker data back to sysvad, and your application will pull that data through a back door.

Please note that there are already a number of companies doing this exact thing. Unless you have a research team that has come up with a killer new noise cancellation algorithm, you're not going to make the world a better place.

Thank you for your answer, Mr. Tim!

We know there are many companies doing this, but maybe we do have a good noise cancellation algorithm (perhaps still under development).

We still don’t know how to do exactly what you said: “send/receive to/from sysvad through a backdoor”.

We're still missing how sysvad can send a buffer to the application, have it manipulated, and get it back (in the same function call?), or even redirect it to the speaker.

So, as you mentioned, should the flow be something like this:

1 - in Skype, the user selects the Sysvad-based driver's microphone
2 - the external noise-cancelling app receives the stream from the real mic
3 - the external noise-cancelling app removes the noise
4 - the noise-cancelling app sends the processed stream to the sysvad driver's internal buffer
5 - the driver sends the stream to Skype

The render flow we have in mind would be more or less this:
1 - in Skype, the user selects the Sysvad-based driver's speaker
2 - Skype sends the audio stream to the driver
3 - the driver sends the audio stream to the external noise-cancelling app, which removes the noise from the stream
4 - the noise-cancelling app sends the processed stream to the real system speaker

Is this a better approach?

Thanks again!

Best regards,
André Fellows

That’s how it has to work, yes. Please take the time to check the list archives for this discussion. I don’t want to have to go through this design yet again.

Remember that a driver can’t initiate sending anything to an application. The application will have to send requests to sysvad that get queued in the driver. Sysvad can fill those requests when it has data.
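The queued-request pattern Tim describes (the classic "inverted call") can be sketched independently of the IRP machinery. In the real driver these would be pended IRPs held in a cancel-safe queue, with the app posting overlapped `DeviceIoControl` calls; this portable C sketch only shows the bookkeeping, and the queue depth and period size are illustrative assumptions:

```c
#include <stdbool.h>
#include <string.h>

#define MAX_PENDING   8     /* illustrative queue depth */
#define PERIOD_BYTES  960   /* e.g. 10 ms of 16-bit 48 kHz mono */

/* One outstanding app request waiting inside the driver for data. */
struct request {
    bool          pending;
    unsigned char data[PERIOD_BYTES];
};

static struct request queue[MAX_PENDING];
static int head, tail, count;   /* ring of pending requests */

/* App side: post a request; it sits queued in the driver until a
 * period of data arrives. Returns the slot index, or -1 if full. */
static int post_request(void)
{
    if (count == MAX_PENDING)
        return -1;
    int slot = tail;
    queue[slot].pending = true;
    tail = (tail + 1) % MAX_PENDING;
    count++;
    return slot;
}

/* Driver side: Skype delivered a render period; complete the oldest
 * pending request with it. Returns the completed slot, or -1 if no
 * request was waiting (the driver would then buffer or drop data). */
static int complete_with_data(const unsigned char *period)
{
    if (count == 0)
        return -1;
    int slot = head;
    memcpy(queue[slot].data, period, PERIOD_BYTES);
    queue[slot].pending = false;   /* completion wakes the app's wait */
    head = (head + 1) % MAX_PENDING;
    count--;
    return slot;
}
```

Keeping several requests posted at once is what lets the driver complete one immediately whenever data arrives, instead of the app polling on a timer.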

@Tim_Roberts said:
That’s how it has to work, yes. Please take the time to check the list archives for this discussion. I don’t want to have to go through this design yet again.

Remember that a driver can’t initiate sending anything to an application. The application will have to send requests to sysvad that get queued in the driver. Sysvad can fill those requests when it has data.

Thank you very much Mr Tim!!!

So, as you said, the application should query the driver frequently (every few milliseconds?) asking for the speaker buffer data to process, and the application will send another request when it receives a stream from the mic.

But won't querying the driver for data this frequently flood it with requests?

Thanks again!!
André Fellows

You need to take some time to think about what you’re doing and design your solution. I assume you aren’t really going to modify the microphone data at all, but are instead going to use the microphone data to alter the speaker stream. That means you have some very serious latency issues to worry about. You can’t process any more speaker data until you get more live microphone data, so that’s going to determine how often you’ll be processing data. Once you get going, your app, the live hardware, and the sysvad buffers will all pretty much be in lock step. You will have to experiment to find out what timing works best.
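The timing budget this describes comes down to simple arithmetic. As an illustration (the sample rate, period size, and periods-in-flight below are assumptions for the sketch, not sysvad's actual defaults):

```c
/* Frames delivered per processing period at a given sample rate,
 * e.g. 48000 Hz with 10 ms periods -> 480 frames per period. */
static unsigned frames_per_period(unsigned sample_rate_hz, unsigned period_ms)
{
    return sample_rate_hz / 1000u * period_ms;
}

/* Rough end-to-end latency: each period in flight between the live
 * hardware, the app, and the sysvad buffers adds one period of delay,
 * since no speaker period can be processed before the matching live
 * microphone period has arrived. */
static unsigned end_to_end_latency_ms(unsigned period_ms,
                                      unsigned periods_in_flight)
{
    return period_ms * periods_in_flight;
}
```

So with 10 ms periods and, say, three periods in flight (one being captured, one being processed, one being rendered), the pipeline is roughly 30 ms behind live; shrinking the period reduces latency but raises how often the app must wake up and keep lock step.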