Why are DPCs processed a clock tick behind on Win8?

Folks on the audio forums are very worried about Win8 RP because of the high
latencies they are seeing. The latency-measuring software being used runs a
kernel timer with an associated DPC and measures the time difference between
successive executions of the DPC routine. Very often, the timer DPC execution
is delayed by an entire clock tick. Is there something like a maximum queue
depth for DPCs on Win8, or a limit on the number of DPCs that can be executed
per clock tick? Actually, I think not, because changing the DPC importance to
HighImportance so that it is inserted at the head of the queue does not solve
the problem: DPCs are still very often executed a clock tick behind. Is there
any change to the scheduler or another part of the OS that could explain this
behavior?

//Daniel

I noticed a similar discussion here:
http://social.technet.microsoft.com/Forums/en-US/w8itproperf/thread/aa16c25a-9c72-4579-ba02-cea920296271

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@resplendence.com
Sent: Friday, June 08, 2012 2:05 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Why DPCs are processed a clock tick behind on Win8 ?

NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Could this have to do with more aggressive timer coalescing in Win8?

Timer Coalescing:
http:

Peter
OSR

wrote in message news:xxxxx@ntdev…
> Could this have to do with more aggressive timer coalescing in Win8?
>

I would suppose not, because this feature requires specific functions such as
KeSetCoalescableTimer.

One thing that catches my attention is that after an attempt to set the clock
resolution to 1 ms, clockres actually reports a resolution of 1.001 ms. It is
set to this value because it comes from a hardcoded HAL table that lists the
accepted values. If the actual clock resolution were just below 1 ms
(0.999 ms), I could perfectly understand why a periodic timer of 1 ms is
delayed an extra clock tick. But with a clock resolution that is slightly
higher, I cannot see this as the reason.

//Daniel

Thinking a little longer, I think I understand why an entire clock interrupt
is sometimes skipped: it can be fully attributed to the weird clock
resolution of 1.001 ms.

If an HPET periodic timer, which is likely used for the clock interrupt,
“precalculates” the interrupt times as absolute values, the due time shifts
slightly with every invocation of the timer interrupt.

Actual clock interrupt:
XXXX!XXXX!XXXX!XXXX!XXXX!..
Requested DPC due time:
XXX!XXX!XXX!XXX!XXX!XXX!..

Very soon, the actual clock interrupt is less than 1 ms away, and then a
clock interrupt is skipped because the DPC is not yet due.

The solution should be to always use aperiodic timers (with relative values)
for measuring and never rely on periodic timers that are not exactly in sync
with the clock resolution. The question remains why they filled that HAL
table (with the only clock resolutions that are honored) with that weird
value of 1.001 ms.

//Daniel


Technically, being one timer increment “late” is within the contract of the timer APIs, because we do round timer due times up to the next timer resolution boundary. Basically, this behavior comes out of dynamic tick when we are putting the hardware clock in aperiodic mode. We look at the nearest timer duetime and push the hardware clock out to that time plus one timer resolution increment to maximize the potential for expiring multiple timers for a single processor wakeup.

-Neeraj Singh
Windows Kernel Core Team

This message offers no warranties and confers no rights.


This sounds like an attempt to use Windows as a hard-real-time OS, which
it most definitely is not. No guarantees were ever made (and, in fact,
are explicitly not promised) on interrupt or DPC latency, or
thread-scheduling-delay for application threads. If it ever worked at
all, this should be considered a fortuitous accident. Microsoft does not
guarantee backward-compatibility of fortuitous accidents.
joe


Thanks for your answer. But can you explain why timer resolutions of 500 us
and 2000 us are perfectly honored (as always), but a timer resolution of
1000 us becomes 1001 us on Windows 8?

timeBeginPeriod(1);
// now run clockres

//Daniel


Some other questions:

Do I understand correctly that this aperiodic hardware timer support is a new
Win8 feature?

Even though this may not break any “contract”, have you considered all the
scenarios where this may disturb the ecosystem, completely break existing
applications, and upset end users? What about non-interrupt-driven solutions
that talk to the hardware directly and can tolerate a delay of up to one
clock tick? Won’t they stop working?

If the timer interrupt becomes “dynamic”, should this not delay execution of
all DPCs by one or more clock ticks (and not only those DPCs associated with
a timer)? Should this not dramatically increase latencies, up to the point
where it hurts overall system performance?

What will be the purpose of existing interfaces such as timeBeginPeriod and
ExSetTimerResolution? Will they be nothing more than an artifact, or will
they somehow still influence the dynamic clock interrupt?

//Daniel




wrote in message news:xxxxx@ntdev…

> If the timer interrupt becomes “dynamic” should this not delay execution
> of all DPCs with one or more clockticks (and not only those DPCs
> associated with a timer) ? Should this not dramatically increase latencies
> up to the point where this will hurt overall system performance ?

Daniel, have you actually seen this extra delay or jitter for non-timer
DPCs, such as ISR DPCs?
– pa

“Pavel A.” wrote in message news:xxxxx@ntdev…
> Daniel, have you actually seen this extra delay or jitter for non-timer
> DPCs, such as ISR DPCs?
> – pa
>

Honestly, not yet. So far I have seen a lot of weirdness that I am trying to
understand and explain, but MSDN says that after KeInsertQueueDpc, processing
of the DPC queue (generally) begins immediately. Whether that is a contract
of some sort, I don’t know.

I have to look and test further. Who knows, these aperiodic dynamic timers
may actually allow us to finally take control of hardware timer interrupts.
Historically, the DueTime parameter in KeSetTimerEx has been fine-grained but
always dependent on the very coarse periodic clock interrupt.

There might be good news as well, as there is for a lot of new features in
Win8, but for the moment I don’t know what to tell my customers.

//Daniel

When I taught my Windows System Programming course, I always characterized
it as “Windows, like nearly every general-purpose operating system, has
only the vaguest concept of ‘time’. Its only real understanding is ‘time
moves forward’. You have to bring in Quantum Theory, which explains that
time moves forward in discrete increments which are quantized, and that
time can be some value T or some value T-plus-delta-T, but it cannot be
any value between those two. If you care about time, use an RTOS.”

If you are concerned about the difference between 1.000 milliseconds and
1.001 milliseconds, you are using the wrong operating system. Time moves
forward, and expectations of something more sophisticated are largely
unwarranted.
joe


What you should realize is that all the “solutions” you have mentioned firmly belong in the “broken by design” class. If a “solution” like that stops working after some system update, it means the “solution” is broken by design, and the update has simply exposed the fallacies of its design. How on Earth can you say that “a delay of up to one clock tick is tolerable” if the clock tick happens to be a configurable parameter that may vary?

Anton Bassov

> for the moment I don’t know what to tell to my customers.

The only thing you should tell them is “Windows NT is not an RTOS and will never be one”. Unfortunately, you seem to be refusing to accept this simple and obvious fact; you seem determined to keep on fooling yourself…

Anton Bassov

Thanks Anton, but we knew that already. Taken very literally, though, that
theory means we shouldn’t be doing a lot of the things we are doing today,
and that would eliminate an entire industry as well. There is also something
called soft real-time, where limiting the number of missed deadlines is
highly desirable, but you already knew that, didn’t you? Besides, several of
my customers are not pursuing real-time solutions at all and have entirely
different concerns, but they would still like to know what is up with the
clock resolution and kernel timers in Win8.

//Daniel


I would like to add as an audio man that there is indeed a whole industry that is based on the fact that Windows (so far) can be made to work as a “good enough” real time system to be commercially successful, given careful design of hardware and software. I expect that many of your favourite TV shows and music have had audio recorded or edited on PC.

Clearly no one can hold MS to account if changes are made that break many of the solutions, except it creates a demand that people stick with older versions of the OS (or migrate to other vendors). In line with another thread here, maybe this means XP will have an even longer life, although Win7 seems an acceptable alternative.

I believe MS once set its target for audio support by defining professional audio as “audio with almost no glitches”. Some of us have made a living out of exploiting the failure of computer people to understand video and audio media - the more they think they do, the worse it seems to get.

BTW I find myself migrating more towards Apple these days, as it seems to be leading the way rather than being years behind, and I have started getting nervous waiting for Win8. Especially as I read that MS development tools are pricing themselves out of the market (to reference yet another thread).

Mike

Mike Kemp, Technical Director, Sintefex Audio, Lda (http://www.sintefex.com)
Vale Formosilho, S. Marcos da Serra, P-8375, Portugal, Tel +351 282 361748 Fax +351 282 361749
The contents of this email are CONFIDENTIAL and do not form a basis for contract.

----- Original Message -----
From: xxxxx@resplendence.com
Newsgroups: ntdev
To: Windows System Software Devs Interest List
Sent: Sunday, June 10, 2012 7:04 AM
Subject: Re:[ntdev] Why DPCs are processed a clock tick behind on Win8 ?

Thanks Anton, but we knew that already. Taken very literally, though, that
theory means we shouldn’t be doing a lot of the things we are doing today,
and that would eliminate an entire industry as well. There is also something
called soft real-time, where limiting the number of missed deadlines is
highly desirable, but you already knew that, didn’t you? Besides, several of
my customers are not pursuing real-time solutions at all and have entirely
different concerns, but they would still like to know what is up with the
clock resolution and kernel timers in Win8.

//Daniel


If I had to guess, I would assume that the designers wanted to ensure that
the hardware timer resolution could never exactly correspond to any of the
software-specified timer intervals. This would create a ripple in the
periodicity of each timer, ensuring that no one timer is favoured over any
others in the system. This is pure speculation, however; maybe someone from
Microsoft can comment further on their rationale.
