How to perform adaptive resampling

Tim Roberts wrote:

Yes, with video. Twenty years ago, in the ISDN days, you had
to do it with audio, too. I doubt they do that much any more.
Voice-quality datarates aren’t much of a bandwidth burden
these days.

I use FaceTime Audio on the train and I *regularly* hear it switching between data rates and quality levels. One moment it will be perfectly clear, the next moment it sounds like ICQ did when I was in high school.

On 2016-04-26 22:20:22 +0000, Maxim S. Shatskih said:

Hi !

>> The sample rate of the hardware source is not fixed, it depends on the
>> current file being played.
>> Allowed value could be one of 44100, 48000, 88200, 96000, 176400 or 192000 Hz.

> File being played? then it is software source, not hardware. It has no
> clock of its own, and trivially adapts to the sink clock.
From my point of view, a computer is a hardware device, with its own clock.

> Usually, RTP is used for audio over network, it is based on UDP.
>
> You can do deep buffering like DirectShow does, this will ensure the
> Hi-Fi quality, but will cause time lag from start to first audible
> sound. Unacceptable (as is DirectShow, at least was 10 years ago) for
> communication apps.

My guess is he means that if the file to be played is, say, 96k, then the
remote device is first switched to 96k using its own crystal, and will then
run at its own crystal rate, expecting its play buffer to be kept from
emptying by data being sent to it. It sounds like it does not have any
feedback to the sending computer that can throttle the tx data, so the
computer has to send at the average speed of 96k (determined by its own
reference, though it is not clear that a PC has one that is easily used for
audio unless it uses its own sound card for this) in the hope that the target
has buffered enough, and has enough spare capacity, to handle variations in
the actual data rate.

Oddly, he has said the data must be sent by UDP which can lose packets.
Failing any feedback path there is no way these can be re-sent. If there is
a feedback path I don’t understand why it can’t be used to throttle the
sending computer’s data, which could then produce a reliable working system.
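
To make the throttling idea concrete, here is a minimal sketch of how a sender could use a buffer-fill report to pace itself. Nothing here comes from the posters: the function name, the proportional gain, and the idea of sending in fixed ticks are assumptions chosen for illustration. The sender aims at the nominal rate and nudges it up or down according to how far the remote buffer is from a target fill level.

```python
def frames_to_send(nominal_rate_hz, tick_s, fill_frames, target_fill_frames, gain=0.1):
    """Return how many audio frames to send during the next tick.

    nominal_rate_hz    -- nominal sample rate of the stream (e.g. 96000)
    tick_s             -- length of one sending interval, in seconds
    fill_frames        -- buffer fill reported by the remote device (hypothetical feedback)
    target_fill_frames -- fill level the sender tries to hold
    gain               -- assumed proportional gain; would need tuning in practice
    """
    # Frames the remote device consumes per tick if both clocks agreed exactly.
    nominal = nominal_rate_hz * tick_s
    # Send a little more when the buffer is low, a little less when it is high.
    correction = gain * (target_fill_frames - fill_frames)
    # Never return a negative count: a very full buffer simply means "send nothing".
    return max(0, round(nominal + correction))
```

With a 10 ms tick at 96 kHz the nominal amount is 960 frames; a buffer that is 1000 frames below target raises that to 1060, so the sender can run faster than nominal as well as slower, provided it has data available ahead of time.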

NTDEV is sponsored by OSR

Visit the list online at:
http:

MONTHLY seminars on crash dump analysis, WDF, Windows internals and software
drivers!
Details at http:

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

On 2016-04-27 08:39:22 +0000, Mike Kemp said:

Hi Mike !

> My guess is he means that if the file to be played is, say 96k, then
> the remote device is first switched to 96k using its own crystal, and
> will then run at its own crystal rate, expecting its play buffer to be
> kept from emptying by data being sent to it.
Yes, that’s true.

> It sounds like it does not have any feedback to the sending computer
> that can throttle the tx data, so the computer has to send at the
> average speed of 96k (determined by its own reference, though not clear
> that a PC has one that is easily used for audio unless it uses its own
> sound card for this) in the hope that the target has buffered enough,
> and has enough spare capacity, to handle variations on the actual data
> rate.
Actually, the streaming application that receives audio data from the
driver and sends audio data to the remote hardware device has no way to
throttle the audio data being sent.

> Oddly, he has said the data must be sent by UDP which can lose packets.
> Failing any feedback path there is no way these can be re-sent. If
> there is a feedback path I don’t understand why it can’t be used to
> throttle the sending computer’s data, which could then produce a
> reliable working system.
There is a feedback path from the remote hardware device to the
sending computer. This feedback tells me which sample number is
currently being played and how full the buffer is.
Packets are sent using UDP and the streaming application is able to
resend lost audio packets, but if for any reason I don’t receive
feedback packets, well, I can’t do anything.

Assuming I could modify the streaming application source code and
throttle tx, I would be able to slow down tx. What if I need to
accelerate? How could I go faster than without throttling?

> From my point of view, a computer is a hardware device, with its own clock.

If the separate computer with its own clock is just sending the file over the network, its clock is not used at all in the process.

Receiving the file on the other side is not clocked either, it is just like reading the local file.

So, this source is async, and just has to be fast enough for the purpose.

The issues I was considering occur when you have a microphone+sound combo, with the microphone
HW and the speaker HW each having its own sampling clock. Jitter and skew occur between the two.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

I don’t understand why you need to be writing a driver at all. If you have
control over the streaming app which sends UDP packets (with your own retry
mechanism) and knows which sample the remote device is playing, all you need
to do is open the audio file as a data file and send audio data directly
from the file as fast as is needed to keep the target buffer full.

My advice on driver writing is: don’t, unless all other ways of doing it are
exhausted.

Mike

On 2016-04-27 09:46:30 +0000, Mike Kemp said:

> I don’t understand why you need to be writing a driver at all. If you
> have control over the streaming app which sends UDP packets (with your
> own retry mechanism) and knows which sample the remote device is
> playing, all you need to do is open the audio file as a data file and
> send audio data directly from the file as fast as is needed to keep the
> target buffer full.
My application should act as a sound card from the user’s point of view.
The user should be able to use whatever audio player he wants on his
computer, select the virtual sound card I’m trying to write a driver
for as the audio output device, and everything should be streamed from
the computer to the remote playback device.

> My advice on driver writing is don’t. Unless all other ways of doing
> are exhausted.
Thanks for your advice, but I have no other choice at the moment :wink:

Well, now that we know you are getting throttling info back from the target,
you can simulate a sound card by running a sort of software local clock,
frequency-locked to the remote device. That will allow you to read audio
from the OS just like a normal driver. No “resampling” is required, adaptive
or otherwise.

As for how to write an audio card driver, I hand you back to the driver
experts. I wrote Windows drivers some time ago and I follow this
forum mainly so I can knowledgeably advise people not to write drivers!

Good luck, Mike
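
For illustration only, a software clock locked to the remote device could look roughly like the sketch below. It assumes the feedback reports carry the sample number currently being played (as described earlier in the thread) and that each report is stamped with a local time on arrival; the class name, the exponential smoothing, and its default factor are my own assumptions, not anything specified by the posters. The estimated rate is what a virtual sound card would use to decide how many frames to pull from the OS per period.

```python
class RemoteClockEstimator:
    """Track the remote device's real sample rate from playback-position reports."""

    def __init__(self, nominal_rate_hz, smoothing=0.1):
        self.rate_hz = float(nominal_rate_hz)  # start from the nominal rate
        self.smoothing = smoothing             # assumed smoothing factor, 0 < s <= 1
        self._last = None                      # (local_time_s, remote_sample_index)

    def update(self, local_time_s, remote_sample_index):
        """Feed one feedback report; return the current rate estimate in Hz."""
        if self._last is not None:
            dt = local_time_s - self._last[0]
            dn = remote_sample_index - self._last[1]
            # Ignore reports that arrive out of order or carry no elapsed time.
            if dt > 0 and dn >= 0:
                measured = dn / dt
                # Exponential smoothing filters out network jitter in the reports.
                self.rate_hz += self.smoothing * (measured - self.rate_hz)
        self._last = (local_time_s, remote_sample_index)
        return self.rate_hz

    def frames_due(self, elapsed_s):
        """Frames the remote device is expected to play in elapsed_s seconds."""
        return self.rate_hz * elapsed_s
```

If feedback packets stop arriving, the estimator simply keeps its last estimate, which seems a reasonable fallback since crystal rates drift slowly.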

On 2016-04-27 10:19:07 +0000, Mike Kemp said:

> Well, now we know you are getting throttling info back from the target
> you can simulate a sound card by running a sort of software local clock
> frequency locked to the remote device. That will allow you to read
> audio from the OS just like a normal driver. No “resampling” is
> required, adaptive or otherwise.
So it’s not a resampling problem…

> To know how to write an audio card driver, I hand you back to the
> driver experts. I have written Windows drivers some time ago and I
> follow this forum mainly so I can knowledgeably advise people not to
> write drivers!
Tim told me to post my problem on the WDMAUDIODEV list, that’s what I
did, hope I’ll be able to get a good solution.

> Good luck, Mike
Thanks for your advice :wink:

Clearly you have been given a problem with specific parameters that you don’t quite understand from your manager. Eventually I am able to discern that you (or your manager) wish to create a virtual audio device that will be compatible with arbitrary applications (ie Media Player etc.) that will transmit the audio samples via a UDP/IP connection to some remote device for playback – if I have got any part of this wrong, please correct me before reading further.

In theory this should work out okay since the playback application knows the data rate of the audio to be played and should send enough data down to prevent underruns and not so much to cause overruns. This is the theory. On modern machines, this generally works, but there are no specific guarantees as the OS has no hard realtime limits – and as this is audio playback for music or speech and not the control rod actuator for a nuclear power plant this is generally deemed acceptable.

Adding a network component introduces two high-level problems that you need to consider:

  1. The latency, jitter and packet loss inherent in any IP network (okay there are exceptions but they require specialized hardware); and

  2. The clock skew between whichever clock source that your arbitrary UM app is using and the one used by your remote device

In your case the use of UDP is very appropriate since the loss of a single packet will not initiate meltdown and you will avoid the arbitrarily long delays associated with TCP retransmission for data whose “time has passed”. The network quality will have a direct effect on the quality of audio output, but in exactly the same way as the quality of a speaker wire does.
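
As a rough sketch of what “time has passed” means for an application-level resend over UDP: given the sample number the device reports as currently playing, a lost packet is only worth resending if it can still arrive before its samples are due. The function and parameter names below are hypothetical, and the lead margin is an assumption standing in for network round-trip time expressed in frames.

```python
def worth_resending(packet_first_sample, packet_frames, playing_sample, min_lead_frames=0):
    """Decide whether a lost audio packet is still useful to the remote device.

    packet_first_sample -- index of the first sample carried by the lost packet
    packet_frames       -- number of frames in that packet
    playing_sample      -- sample index the device reports as currently playing
    min_lead_frames     -- assumed safety margin covering the resend's transit time
    """
    last_sample = packet_first_sample + packet_frames - 1
    # If even the last sample of the packet is already behind the playback
    # position (plus margin), a resend would arrive too late to be played.
    return last_sample >= playing_sample + min_lead_frames
```

TCP cannot make this choice: it retransmits until the data gets through, however stale it has become.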

Much has been made of the difference in clock sources, but I would expect that unless the PC and device in question have exceptionally bad clock sources, this will not be a problem for you. You may be able to generate some statistics on how well your driver / device behave, but you have already realized that there is no way to force an arbitrary application to send you data more slowly / quickly than it will do on its own and that the overhead of resampling for a tiny variance in frequency is probably not worth the effort or viable. I don’t know your situation, but unless you have extensive evidence that there is a real problem, I would opt for the simple solution that simply ignores any telemetry from the device about its buffer state.
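
To put a number on how slowly skew accumulates, here is a small back-of-the-envelope helper. The figures used with it are illustrative assumptions, not measurements from anyone in this thread: a skew expressed in parts per million and a buffer headroom expressed in frames.

```python
def seconds_until_buffer_limit(rate_hz, skew_ppm, headroom_frames):
    """Time until an uncorrected clock skew uses up the given buffer headroom.

    rate_hz         -- nominal sample rate (e.g. 96000)
    skew_ppm        -- assumed frequency difference between the two clocks, in ppm
    headroom_frames -- frames of slack before the buffer underruns or overruns
    """
    # Frames gained or lost per second because the two clocks disagree.
    drift_frames_per_s = rate_hz * abs(skew_ppm) * 1e-6
    if drift_frames_per_s == 0:
        return float("inf")  # perfectly matched clocks never hit the limit
    return headroom_frames / drift_frames_per_s
```

With an assumed 50 ppm skew at 96 kHz, the drift is about 4.8 frames per second, so half a second of headroom (48000 frames) lasts roughly 10000 seconds, a little under three hours of continuous playback. Whether that is acceptable depends on how long a single stream runs without the buffer being reset, for example at a track change.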

Sent from Mailhttps: for Windows 10

From: Matthieu Collettemailto:xxxxx
Sent: April 27, 2016 8:34 AM
To: Windows System Software Devs Interest Listmailto:xxxxx
Subject: Re:[ntdev] How to perform adaptative resampling

On 2016-04-27 10:19:07 +0000, Mike Kemp said:

> Well, now we know you are getting throttling info back from the target
> you can simulate a sound card by running a sort of software local clock
> frequency locked to the remote device. That will allow you to read
> audio from the OS just like a normal driver. No “resampling” is
> required, adaptive or otherwise.
So it’s not a resampling problem…

>
> To know how to write an audio card driver, I hand you back to the
> driver experts. I have written Windows drivers some time ago and I
> follow this forum mainly so I can knowledgeably advise people not to
> write drivers!
Tim told me to post my problem on the WDMAUDIODEV list, that’s what I
did, hope I’ll be able to get a good solution.

>
> Good luck, Mike
Thanks for your advice :wink:

> ----- Original Message ----- From: Matthieu Collette
> Newsgroups: ntdev
> To: Windows System Software Devs Interest List
> Sent: Wednesday, April 27, 2016 10:57 AM
> Subject: Re:[ntdev] How to perform adaptative resampling
>
>
> On 2016-04-27 09:46:30 +0000, Mike Kemp said:
>
>> I don’t understand why you need to be writing a driver at all. If you
>> have control over the streaming app which sends UDP packets (with your
>> own retry mechanism) and knows which sample the remote device is
>> playing, all you need to do is open the audio file as a data file and
>> send audio data directly from the file as fast as is needed to keep the
>> target buffer full.
> My application should act a sound card from the user point of view.
> The user should be able to use whatever audio player he wants on his
> computer, select the virtual sound card I’m trying to write a driver
> for as the audio output device, and everything should be streamed from
> the computer to the remote playback device.
>
>>
>> My advice on driver writing is don’t. Unless all other ways of doing
>> are exhausted.
> Thanks for your advice, but I have no other choice at the moment :wink:
>
>>
>> Mike
>>
>> ----- Original Message ----- From: Matthieu Collette
>> Newsgroups: ntdev
>> To: Windows System Software Devs Interest List
>> Sent: Wednesday, April 27, 2016 10:13 AM
>> Subject: Re:[ntdev] How to perform adaptative resampling
>>
>>
>> On 2016-04-27 08:39:22 +0000, Mike Kemp said:
>>
>> Hi Mike !
>>
>>> My guess is he means that if the file to be played is, say 96k, then
>>> the remote device is first switched to 96k using its own crystal, and
>>> will then run at its own crystal rate, expecting its play buffer to be
>>> kept from emptying by data being sent to it.
>> Yes, that’s true.
>>
>>> It sound like it does not have any feedback to the sending computer
>>> that can throttle the tx data, so the computer has to send at the
>>> average speed of 96k (determined by its own reference, though not clear
>>> that a PC has one that is easily used for audio unless it uses its own
>>> sound card for this) in the hope that the target has buffered enough,
>>> and has enough spare capacity, to handle variations on the actual data
>>> rate.
>> Actually, the streaming application that received audio data from the
>> driver and sends audio data to the remode hardware device has no way to
>> throttle the audio data being sent.
>>
>>>
>>> Oddly, he has said the data must be sent by UDP which can lose packets.
>>> Failing any feedback path there is no way these can be re-sent. If
>>> there is a feedback path I don’t understand why it can’t be used to
>>> throttle the sending computer’s data, which could then produce a
>>> reliable working system.
>> There is a feedback path from the remote hardware device and the
>> sending computer. This feedback tells me which sample number is
>> currently being played and how full the buffer is.
>> Packets are sent using UPD and the streaming application is able to
>> resent lost audio packets, but if for any reason I don’t receive
>> feedback packets, well, I can’t do anything.
>>
>> Assuming I could modify the streaming application source code and
>> throttle tx, I would be able to slow down tx. What if I need to
>> accelerate ? How could I go faster than without throtling ?
>>
>>>
>>> ----- Original Message ----- From: Maxim S. Shatskih
>>> Newsgroups: ntdev
>>> To: Windows System Software Devs Interest List
>>> Sent: Tuesday, April 26, 2016 11:20 PM
>>> Subject: Re:[ntdev] How to perform adaptative resampling
>>>
>>>
>>>> The sample rate of the hardware source is not fixed, it depends on the
>>>> current file being played.
>>>> Allowed value could be one of 44100, 48000, 88200, 96000, 176400 or 192000 Hz.
>>>
>>> File being played? then it is software source, not hardware. It has no
>>> clock of its own, and trivially adapts to the sink clock.
>>>
>>> Usually, RTP is used for audio over network, it is based on UDP.
>>>
>>> You can do deep buffering like DirectShow does, this will ensure the
>>> Hi-Fi quality, but will cause time lag from start to first audible
>>> sound. Unacceptable (as is DirectShow, at least was 10 years ago) for
>>> communication apps.
>>
>>
>>
>> —
>> NTDEV is sponsored by OSR
>>
>> Visit the list online at: http:
>>
>> MONTHLY seminars on crash dump analysis, WDF, Windows internals and
>> software drivers!
>> Details at http:
>>
>> To unsubscribe, visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer

On 2016-04-27 23:47:16 +0000, Marion Bond said:

Hi Marion !

Thanks for your time and your ideas.

> Clearly you have been given a problem with specific parameters that you
> don’t quite understand from your manager.  Eventually I am able to
> discern that you (or your manager) wish to create a virtual audio
> device that will be compatible with arbitrary applications (i.e. Media
> Player etc.) that will transmit the audio samples via a UDP/IP
> connection to some remote device for playback. If I have got any part
> of this wrong, please correct me before reading further.

Everything is correct.
I’m able to create the virtual device, select it, and send audio data from
the kernel to user space over a TCP socket to a streaming application,
which forwards the audio data over a UDP socket.

 
> In theory this should work out okay since the playback application
> knows the data rate of the audio to be played and should send enough
> data down to prevent underruns and not so much as to cause overruns.  This
> is the theory.  On modern machines, this generally works, but there are
> no specific guarantees as the OS has no hard realtime limits, and as
> this is audio playback for music or speech and not the control rod
> actuator for a nuclear power plant this is generally deemed acceptable.

It works almost as expected, except that it seems I’m not
feeding the remote device fast enough.

 
> Adding a network component introduces two high-level problems that you
> need to consider:
> 1) The latency, jitter and packet loss inherent in any IP network
>    (okay, there are exceptions, but they require specialized hardware); and
> 2) The clock skew between whichever clock source your
>    arbitrary UM app is using and the one used by your remote device

Latency is not a problem: it is acceptable to wait a few
seconds before starting audio playback in order to buffer some audio
data.

> In your case the use of UDP is very appropriate since the loss of a
> single packet will not initiate meltdown and you will avoid the
> arbitrarily long delays associated with TCP retransmission for data
> whose ‘time has passed’.  The network quality will have a direct effect
> on the quality of audio output, but in exactly the same way as the
> quality of a speaker wire does.

The quality of the audio output does indeed depend on the quality of the network.
I first want to make things work on a good-quality link (no Wi-Fi
between the computer and the remote device).

 
> Much has been made of the difference in clock sources, but I would
> expect that unless the PC and device in question have exceptionally bad
> clock sources, this will not be a problem for you.  You may be
> able to generate some statistics on how well your driver / device
> behave, but you have already realized that there is no way to force an
> arbitrary application to send you data more slowly / quickly than it
> will do on its own and that the overhead of resampling for a tiny
> variance in frequency is probably not worth the effort or viable.  I
> don’t know your situation, but unless you have extensive evidence that
> there is a real problem, I would opt for the simple solution that
> simply ignores any telemetry from the device about its buffer state.

Even when I ignore the telemetry and let the driver push audio data
at its own pace, I still experience the same problem: sooner or
later, a buffer underrun occurs.

 
Sent from Mail for Windows 10
 
From: Matthieu Collette
Sent: April 27, 2016 8:34 AM
To: Windows System Software Devs Interest List
Subject: Re:[ntdev] How to perform adaptative resampling
 
On 2016-04-27 10:19:07 +0000, Mike Kemp said:

> Well, now we know you are getting throttling info back from the target
> you can simulate a sound card by running a sort of software local clock
> frequency locked to the remote device. That will allow you to read
> audio from the OS just like a normal driver. No “resampling” is
> required, adaptive or otherwise.
So it’s not a resampling problem…

>
> To know how to write an audio card driver, I hand you back to the
> driver experts. I have written Windows drivers some time ago and I
> follow this forum mainly so I can knowledgeably advise people not to
> write drivers!
Tim told me to post my problem on the WDMAUDIODEV list, so that’s what I
did. I hope I’ll be able to get a good solution.

>
> Good luck, Mike
Thanks for your advice :wink:

> ----- Original Message ----- From: Matthieu Collette
> Newsgroups: ntdev
> To: Windows System Software Devs Interest List
> Sent: Wednesday, April 27, 2016 10:57 AM
> Subject: Re:[ntdev] How to perform adaptative resampling
>
>
> On 2016-04-27 09:46:30 +0000, Mike Kemp said:
>
>> I don’t understand why you need to be writing a driver at all. If you
>> have control over the streaming app which sends UDP packets (with your
>> own retry mechanism) and knows which sample the remote device is
>> playing, all you need to do is open the audio file as a data file and
>> send audio data directly from the file as fast as is needed to keep the
>> target buffer full.
> My application should act as a sound card from the user’s point of view.
> The user should be able to use whatever audio player he wants on his
> computer, select the virtual sound card I’m trying to write a driver
> for as the audio output device, and everything should be streamed from
> the computer to the remote playback device.
>
>>
>> My advice on driver writing is don’t. Unless all other ways of doing
>> are exhausted.
> Thanks for your advice, but I have no other choice at the moment :wink:
>
>>
>> Mike
>>

If the stream position is calculated by the Microsoft audio sample driver rather than read from hardware, the solution could possibly be very simple…

…noticed the comment below in the archived Windows 8.1 msvad driver sample?
Function CMiniportWaveCyclicStreamMSVAD::GetPosition
// Carry forward the remainder of this division so we don’t fall behind with our position.

…also noticed the updated comment in the Windows 10 universal driver sample sysvad on GitHub?
Function CMiniportWaveRTStream::UpdatePosition
// Carry forward the remainder of this division so we don’t fall behind with our position too much.

Marcel Ruedinger
datronicsoft
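
For readers without the samples at hand, the "carry forward the remainder" idea in those comments can be sketched as follows. This is not the WDK sample code: the struct and its members are hypothetical, and elapsed time is assumed to arrive in 100 ns units.

```cpp
#include <cstdint>

// Hypothetical incremental position tracker (not the WDK sample code).
// Elapsed time is passed in 100 ns units.
struct IncrementalPosition {
    uint64_t sampleRate;  // frames per second
    uint64_t carry;       // remainder left over from the previous division
    uint64_t frames;      // frames played so far

    explicit IncrementalPosition(uint64_t rate)
        : sampleRate(rate), carry(0), frames(0) {}

    // Advance by elapsed100ns and return the new frame position.
    uint64_t Advance(uint64_t elapsed100ns) {
        const uint64_t kUnitsPerSecond = 10000000;  // 100 ns units in one second
        uint64_t scaled = elapsed100ns * sampleRate + carry;
        frames += scaled / kUnitsPerSecond;
        carry = scaled % kUnitsPerSecond;           // keep the fraction for next time
        return frames;
    }
};
```

Feeding this 64 intervals of 15.625 ms at 44100 Hz returns exactly 44100 frames, because the fraction dropped by each division is added back on the next call.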

On 2016-04-28 12:02:42 +0000, xxxxx@datronic.de said:

Hi Marcel !

> If the stream position is calculated by Microsoft audio sample driver
> rather than reading it from hardware, the solution could possibly be
> very simple…
>
> …noticed the comment below in Windows 8.1 msvad archived driver sample?
> Function CMiniportWaveCyclicStreamMSVAD::GetPosition
> // Carry forward the remainder of this division so we don’t fall behind with our position.

That’s what I did (implement this interface).
I’m computing the position from the sample rate and the
adjusted clock value, which allows me to slow down or speed up the pace
at which the driver pushes audio data.
I ran some tests this morning and discovered that the clock adjustments
were too small relative to the Windows kernel timer precision, which explains
why it did not seem to work.
Using bigger values, I can see that the remote playback buffer is being
filled, avoiding buffer underruns.
I now need to adjust the algorithm that computes the clock adjustment and
it should be OK.
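
One way a small clock adjustment can vanish is sketched below. The assumptions are mine, not from the thread: a timer period of 15.625 ms, and a per-call frame count that is truncated to an integer with no remainder carried forward. Both function names are hypothetical.

```cpp
#include <cstdint>

// Extra frames that a rate correction of ppm parts per million adds
// during one timer period, before any rounding.
double ExtraFramesPerTick(double sampleRate, double ppm, double tickSeconds) {
    return sampleRate * ppm * 1e-6 * tickSeconds;
}

// What survives if the per-tick frame count is truncated to an integer
// and no remainder is carried forward.
int64_t TruncatedExtraFrames(double sampleRate, double ppm, double tickSeconds) {
    double nominal = sampleRate * tickSeconds;
    double adjusted = sampleRate * (1.0 + ppm * 1e-6) * tickSeconds;
    return static_cast<int64_t>(adjusted) - static_cast<int64_t>(nominal);
}
```

At 44100 Hz, a 100 ppm correction amounts to about 0.07 frames per 15.625 ms period, so the truncated per-tick count is unchanged (689 frames either way). A 2000 ppm correction moves it from 689 to 690, which would be consistent with the observation that only bigger values had a visible effect.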

> …also noticed the updated comment in Windows 10 universal driver
> sample sysvad on GitHub? Function CMiniportWaveRTStream::UpdatePosition
> // Carry forward the remainder of this division so we don’t fall behind with our position too much.

This is only available with Windows 10, so I won’t use
CMiniportWaveRTStream.

> Marcel Ruedinger
> datronicsoft

> 2) The clock skew between whichever clock source that your arbitrary UM app is using and the one used by your remote device

There is no physical microphone or such.

So, the source app can just send the data async (fast enough) to the destination app. It is just like SFTP or SMB file copy from this side. There is no source clock in this case.

The destination app will feed the data to the physical speaker device, which is the only HW clock there, and thus is a DirectShow master clock.

The issues which occur with 2 HW clocks are not applicable there.

Just the transmission must be fast enough, and buffers large enough, to avoid buffer underruns.

> meltdown and you will avoid the arbitrarily long delays associated with TCP retransmission

Well, if the network quality is really pathetic (say cellphone networking deep in the countryside or on a highway stopped in traffic, where most drivers will use their mapping/navigation SW or just talk over cell) - then sooner or later TCP will drop the connection due to retransmit failure.

You just plain cannot avoid audible clicks in such a case.

This is like Shannon’s theorem: for each network with data losses, given its data loss rate, there is a mechanism for lossless data transfer. But if the network gets worse than its estimated loss rate, losses are inevitable with that same mechanism.

For this particular application, the “mechanism” is buffering. You have DataRate, which is defined by the HW clock of the signal and by the file header field (the HW clock will obey it). You also have RecoveryTime, and BufferSize, which is DataRate*RecoveryTime.

Such a mechanism will fix up to 1 error per RecoveryTime. But if errors are more frequent, the mechanism will fail.

And, surely, there is a cost in terms of initial time lag, which will be equal to RecoveryTime.

Note that, in the degenerate case, you can just download the whole sound file and then play it locally.
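
As a rough worked example of BufferSize = DataRate*RecoveryTime, here is a small helper. The function name and the stereo 24-bit format used in the example figures are assumptions for illustration only.

```cpp
#include <cstdint>

// BufferSize = DataRate * RecoveryTime, with DataRate in bytes per second.
uint64_t BufferSizeBytes(uint32_t sampleRate, uint32_t channels,
                         uint32_t bytesPerSample, double recoveryTimeSeconds) {
    uint64_t dataRate =
        static_cast<uint64_t>(sampleRate) * channels * bytesPerSample;
    return static_cast<uint64_t>(dataRate * recoveryTimeSeconds);
}
```

For a 192000 Hz stereo stream with 3 bytes per sample, DataRate is 1,152,000 bytes per second, so a RecoveryTime of 2 seconds needs a buffer of 2,304,000 bytes and adds 2 seconds of start-up lag.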

> Much has been made of the difference in clock sources,

There is no such difference in this case. The source is just pumping the data on. The pumping rate need not obey any realtime clock; it just must be fast enough.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

When implementing GetPosition(),
what did you base your calculation on

  • Hardware position register?
  • Previous GetPosition() call?
  • Start time of the stream?

Marcel Ruedinger

datronicsoft

On 2016-04-28 13:55:46 +0000, xxxxx@datronic.de said:

> When implementing GetPosition(),
> what did you base your calculation on
>
>   • Hardware position register?
>   • Previous GetPosition() call?
>   • Start time of the stream?

Using the tick count elapsed between successive GetPosition calls and the adjusted clock
rate (based on the remote playback device’s feedback), I am able to compute
how much data should have been read.

Marcel Ruedinger

datronicsoft

We have seen quite a few developers experiencing similar problems with this WDK sample (and older versions of it, for more than a decade). Inaccuracies and jitter add up when the stream position is calculated relative to the previous GetPosition() call. Just base the calculation on the stream start time instead of the previous GetPosition(…) call. Then any inaccuracy will be compensated automatically over time, and the additional “carry forward” variables, which complicate the calculation and add inaccuracy, are no longer needed either.

PS: It would be very interesting to get your feedback on whether changing the calculation as mentioned above improves your situation and/or possibly even eliminates the need for the “adjustment” to the clock rate (which actually sounds a bit suspicious to me anyway ;-).

Marcel Ruedinger
datronicsoft
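
A minimal sketch of the difference between the two approaches, assuming timestamps in 100 ns units taken from one monotonic clock. Both functions are hypothetical illustrations, not code from the WDK sample.

```cpp
#include <cstdint>

// Position anchored at stream start: recomputed from scratch on every call,
// so a rounding error in one call never leaks into the next.
uint64_t FramesSinceStart(uint64_t startTime100ns, uint64_t now100ns,
                          uint64_t sampleRate) {
    const uint64_t kUnitsPerSecond = 10000000;  // 100 ns units in one second
    uint64_t elapsed = now100ns - startTime100ns;
    // Split into whole seconds and a sub-second part so the multiplication
    // cannot overflow 64 bits on long-running streams.
    uint64_t seconds = elapsed / kUnitsPerSecond;
    uint64_t fraction = elapsed % kUnitsPerSecond;
    return seconds * sampleRate + (fraction * sampleRate) / kUnitsPerSecond;
}

// Relative variant for comparison: adds a truncated increment on every call
// and carries no remainder, so the truncation errors accumulate.
uint64_t AccumulateRelative(uint64_t frames, uint64_t delta100ns,
                            uint64_t sampleRate) {
    return frames + (delta100ns * sampleRate) / 10000000;
}
```

After 64 calls spaced 15.625 ms apart at 44100 Hz, the start-anchored form reports 44100 frames while the truncating relative form reports 44096. The start-anchored result is never off by more than one frame, whatever the number of calls.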

On 2016-04-29 21:31:54 +0000, xxxxx@datronic.de said:

Hi Marcel!

> We have seen quite a few developers experiencing similar problems with
> this WDK sample (and older versions of it, for more than a decade).
> Inaccuracies and jitter add up when the stream position is calculated
> relative to the previous GetPosition() call. Just base the
> calculation on the stream start time instead of the previous
> GetPosition(…) call. Then any inaccuracy will be compensated
> automatically over time, and the additional “carry forward” variables,
> which complicate the calculation and add inaccuracy, are no longer
> needed either.

Thanks for this tip, I will try it as soon as possible.
I still don’t understand why computing the stream position from the
previous GetPosition call would be less accurate than computing it from
the stream start.

> PS: It would be very interesting to get your feedback on whether changing the
> calculation as mentioned above improves your situation and/or possibly
> even eliminates the need for the “adjustment” to the clock rate (which
> actually sounds a bit suspicious to me anyway ;-).

I will let you know if I notice any change.

> Marcel Ruedinger
> datronicsoft

Thanks for your help!