What is the correct way to use the reference clock and fill audio data in AVStream?

My pin-centric driver obtains a clock at the acquire state, as the samples do, and then fills audio data based on CorrelatedTime.

The problem is that CorrelatedTime sometimes appears to slow down: the difference between CorrelatedTime and the SystemTime (obtained from the out parameter of GetCorrelatedTime) keeps growing. At times the audio also seems to slow down (but I'm not 100% sure; I don't have a precise way to verify the audio). This happens in GraphEdit.
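To make the symptom concrete, here is a user-mode sketch of what I mean by "the difference increases" (the tick values are invented, and `clock_gap` is just an illustrative helper, not driver code):

```c
#include <assert.h>

/* Sketch: quantifying the gap between CorrelatedTime and SystemTime.
 * All times are in 100 ns units (the KS/DirectShow tick). */
static long long clock_gap(long long correlated, long long system)
{
    /* A growing positive result means CorrelatedTime is running slow
     * relative to SystemTime. */
    return system - correlated;
}

/* Example readings, roughly one second apart in system time:
 *   clock_gap( 9990000, 10000000) ->  10000 ticks (1 ms behind)
 *   clock_gap(19950000, 20000000) ->  50000 ticks (5 ms behind)
 * i.e. the gap grew by 40000 ticks (4 ms) over that second. */
```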

If I instead ignore CorrelatedTime and fill data every time the process callback runs, other players such as VLC drop a large amount of data, and only fragments of the audio remain. PS: VLC's debug output is full of: "dshow debug: CapturePin::Receive trashing late input sample"

So why does CorrelatedTime slow down, and how should I fill the data at an appropriate rate?

I think it will be hard for anyone to answer without more information. At least, what kind of hardware is this? And maybe specific examples of the time values that you are seeing.

Is this a pin-centric driver with both a video pin and an audio pin? The audio data should be clocked by the audio hardware itself, not by the system time.

Thanks for the reminder.

The audio device is only a simulation, the same as in the avssamp sample. It reads a .wav file whose format is 48 kHz, 24-bit, 2 channels.
In avssamp, both video and audio are filled into one stream pointer in the filter's Process routine, which simply fills the amount of data corresponding to one interval.

My current design is that the driver has a video filter and an audio filter. Each filter has a single pin, and both filters are pin-centric.
With this design, the audio pin's Process routine handles a queue of stream pointers, and there is no interval or duration info for a frame.

Currently I reuse the video interval (33.3 ms for 30 fps), set the presentation times to 0, 333333, 666666, ..., and fill the corresponding amount of audio data.
This works well in GraphEdit, but it still has problems in VLC (the audio playback sounds random and consists only of short fragments).
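For illustration, here is roughly the arithmetic I mean, as a user-mode sketch (the constants match my format; `target_samples` is an illustrative helper, not a driver routine). Note that 1/30 s is actually 333333.3 ticks, so always filling a fixed amount against timestamps 0, 333333, 666666, ... lets a tiny rounding error accumulate; deriving each fill from a cumulative target avoids that:

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_RATE    48000LL
#define TICKS_PER_SEC  10000000LL       /* 100 ns units */
#define BYTES_PER_SAMPLE_FRAME (3 * 2)  /* 24-bit, 2 channels */

/* Total number of samples that should have been produced by time
 * `ticks`, counted from stream start. */
static int64_t target_samples(int64_t ticks)
{
    return ticks * SAMPLE_RATE / TICKS_PER_SEC;
}

/* Per-interval fill: samples still owed at time `ticks`, given how
 * many samples were already sent. Rounding cannot drift because the
 * target is always computed from the absolute time. */
static int64_t samples_to_fill(int64_t ticks, int64_t samples_sent)
{
    return target_samples(ticks) - samples_sent;
}
```

With these constants the first three intervals (t = 333333, 666666, 1000000 ticks) come out to 1599, 1600, and 1601 samples, totalling exactly 4800 samples (0.1 s of audio) at t = 1,000,000 ticks.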

No, each pin has its own set of stream pointers. The filter-centric Process routine just calls them each in order.

Did you notice that, in avssamp, the audio packets are not timestamped unless the graph has a master clock, and then the timestamp comes from that clock? Unlike video data, which can be compressed, audio data streams are self-clocking. 48000 samples equals 1 second.
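As a sketch of what "self-clocking" means in practice (plain C, not AVStream code): the presentation time of each packet follows directly from the running sample count, with no timer involved.

```c
#include <assert.h>

#define SAMPLE_RATE   48000LL
#define TICKS_PER_SEC 10000000LL  /* 100 ns units */

/* Presentation time (in 100 ns ticks) of the next packet, derived
 * purely from how many samples have been produced so far. */
static long long pts_from_samples(long long samples_sent)
{
    return samples_sent * TICKS_PER_SEC / SAMPLE_RATE;
}

/* 48000 samples -> exactly one second (10,000,000 ticks). */
```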

You're right. Both video and audio fill their own stream pointers (obtained from PKSPROCESSPIN).

Regarding the clock, I have confirmed that both GraphEdit and VLC provide a clock (that is, the driver can get a clock) at the acquire state.
However, even when I set clock->GetTime (the same as avssamp) or clock->GetCorrelatedTime (per the documentation) to the PresentationTime, VLC still plays fragmented sound.
Regarding the data length, I have tried filling data with SynthesizeFixed(my_interval, ...) or SynthesizeTo(clock_time, ...), but neither approach improves the behavior in VLC.

However, even when I set clock->GetTime (the same as avssamp) or clock->GetCorrelatedTime (per the documentation) to the PresentationTime, VLC still plays fragmented sound.

I assume you mean the other way around: you set PresentationTime to clock->GetTime().

Audio can be frustrating. Have you done any instrumentation, like printing debug log messages saying “at x time, I sent y bytes”, so you can verify yourself that you’re sending 48,000 samples a second?

And, minor point, but you mentioned 30 FPS. Is that actually what your source material is? It’s not 29.97?

That’s correct. Sorry for the confusion.

Yes. I enabled debug messages to confirm that the data length and timestamps are set as expected, and then disabled them to check the behavior. I'm currently trying to use WPP to minimize side effects during debugging.

Sorry, do you mean the video frame rate? In my current tests in GraphEdit and VLC, I only use the audio filter.


Quite a while ago I was working on a similar project, and for audio debugging I found that using a sine-wave audio source was really helpful for spotting glitches in the rendered output.

What happens if you don’t set the time in your stream headers at all?

@Mark_Roddy

OK, I’ll try the sine wave example in sysvad. Thanks for the suggestion.

@Tim_Roberts

If I don't set PresentationTime, VLC drops all data (it outputs many debug messages: "Receive trashing late input sample") and plays nothing.
GraphEdit runs in live-fullness mode (checked from the renderer properties) and plays sound correctly. It seems VLC cannot handle the no-timestamp case.
Anyway, GraphEdit is robust.