Guys,
one of my drivers uses KeQueryPerformanceCounter() to timestamp packets. One
of our users has reported a problem on his machine (XP SP2, AMD64) where
the values returned by KeQueryPerformanceCounter() do not increase
monotonically when the function is called on different CPUs. Specifically,
the machine uses an AMD Athlon™ 64 X2 Dual Core Processor 3800+.
I found KB896256 (http://support.microsoft.com/kb/896256), which seems to
address a similar issue (the description of the fix is actually not
completely clear), but it’s for XP x86 only. The x64 fix is said to be private
(“Note This problem also applies to x64-based versions of Microsoft Windows
Server 2003. However, this article and its associated private hotfix …”).
Has anyone encountered such a problem, or used this “private hotfix”?
Have a nice day
GV
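A minimal sketch (not part of the original post) of how the reported symptom could be confirmed from a capture log: record the CPU number alongside each raw counter value, then count how often a value is lower than its predecessor and how many of those steps fall between samples taken on different CPUs. The sample_t layout and the count_backward_steps name are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One logged timestamp: the raw counter value and the CPU it was read on.
   (Hypothetical record layout, for illustration only.) */
typedef struct {
    uint64_t counter;
    unsigned cpu;
} sample_t;

/* Counts samples whose counter is lower than the previous sample's.
   If cross_cpu is non-NULL, it receives how many of those backward
   steps happened between samples taken on different CPUs. */
static size_t count_backward_steps(const sample_t *s, size_t n,
                                   size_t *cross_cpu)
{
    size_t backward = 0, cross = 0;
    assert(s != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (s[i].counter < s[i - 1].counter) {
            backward++;
            if (s[i].cpu != s[i - 1].cpu)
                cross++;
        }
    }
    if (cross_cpu != NULL)
        *cross_cpu = cross;
    return backward;
}
```

If every backward step in a log is a cross-CPU step, that points at per-CPU counter skew rather than at a bug in the driver's own bookkeeping.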
Gianluca Varenni wrote:
> one of my drivers uses KeQueryPerformanceCounter() to timestamp packets. One
> of our users has reported a problem on his machine (XPSP2 -AMD64) by which
> the values returned by KeQueryPerformanceCounter() do not increase
> monotonically when the function is called on different CPUs. Specifically
> the machine uses an AMD Athlon™ 64 X2 Dual Core Processor 3800+.
Absolutely true. Windows NT 4 and earlier synchronized the cycle
counters on multiple CPUs at boot time (they are, after all, writable
registers) so that they varied by a few dozen cycles. On XP, that isn’t
true, and the delta between the CPUs can be many millions of cycles. I
was unpleasantly surprised when I learned this.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
----- Original Message -----
From: “Tim Roberts”
To: “Windows System Software Devs Interest List”
Sent: Tuesday, February 13, 2007 4:32 PM
Subject: Re: [ntdev] KeQueryPerformanceCounter on AMD64 DualCore machines
issue
> Absolutely true. Windows NT 4 and earlier synchronized the cycle
> counters on multiple CPUs at boot time (they are, after all, writable
> registers) so that they varied by a few dozen cycles. On XP, that isn’t
> true, and the delta between the CPUs can be many millions of cycles. I
> was unpleasantly surprised when I learned this.
What surprises me is the fact that this does not seem to be documented at
all in the DDK docs. The DDK actually says
“KeQueryPerformanceCounter is intended for time-stamping packets or for
computing performance and capacity measurements.”
How can one use such a function if the return values are meaningful only on a
per-CPU basis (and obtaining the CPU clock skew is not easy, I would say
almost impossible)?
And what’s more interesting is that the user-level QueryPerformanceCounter
(which performs a syscall into NtQueryPerformanceCounter, probably ending up in
KeQuery…) has this interesting note:
-----
On a multiprocessor computer, it should not matter which processor is
called. However, you can get different results on different processors due
to bugs in the basic input/output system (BIOS) or the hardware abstraction
layer (HAL). To specify processor affinity for a thread, use the
SetThreadAffinityMask function.
-----
Is there anyone (from MS) who can shed some light on this?
GV
> —
> Questions? First check the Kernel Driver FAQ at
> http://www.osronline.com/article.cfm?id=256
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
It is a hard and complex HW/system problem and there is no easy solution
which Microsoft can just implement in a simple hotfix.
TSC, ACPI timer, and KeQueryPerformanceCounter weirdness is a major source of
problems for us (VMware) on all HW and OS platforms.
TSCs are initially synchronized by the BIOS and/or the kernel, but they drift because
AMD CPU power management doesn’t change the TSC frequency synchronously on all
CPUs. Intel platforms have their own problems.
Here are the relevant pointers:
The article from Rich Brunner, ex-AMD Fellow:
“TSC and Power Management Events on AMD Processors”
http://lkml.org/lkml/2005/11/4/173
AMD TSC Drift Solutions in Red Hat Enterprise Linux:
http://developer.amd.com/articles.jsp?id=92&num=1
Dmitriy Budko
VMware
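To put rough numbers on the drift described above, here is a small worked calculation in C. It assumes, purely for illustration, that both counters start in sync, that the TSC counts at the core clock, and that one core keeps running at 2.0 GHz while the other is throttled to 1.0 GHz for 10 ms. The figures and the tsc_ticks and tsc_gap helpers are hypothetical, not measurements from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks accumulated by a counter running at hz for duration_us microseconds.
   Assumes hz is a whole number of MHz, which keeps the arithmetic exact. */
static uint64_t tsc_ticks(uint64_t hz, uint64_t duration_us)
{
    assert(hz % 1000000ULL == 0);
    return (hz / 1000000ULL) * duration_us;
}

/* Gap that opens between two counters that start in sync and then run
   at different rates for duration_us microseconds. */
static uint64_t tsc_gap(uint64_t hz_a, uint64_t hz_b, uint64_t duration_us)
{
    uint64_t a = tsc_ticks(hz_a, duration_us);
    uint64_t b = tsc_ticks(hz_b, duration_us);
    return a > b ? a - b : b - a;
}
```

Under these assumptions a single 10 ms throttling episode opens a gap of ten million cycles, which is the order of magnitude of the deltas mentioned earlier in the thread.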
Multiple cores and power throttling are only going to become more common. If it is true that there is no guaranteed relationship between successive values returned from KeQueryPerformanceCounter, then for all practical purposes this call is a pseudo-random number generator. What possible use is it to anyone?
But I am confused. The knowledge base article implies that the ACPI timer is used to ensure KeQueryPerformanceCounter works properly, and that only rogue drivers that bypass the Windows calls and issue their own RDTSC are at risk. So the way I read it, Microsoft has already addressed the problem.
eof
> But I am confused. The knowledge base article implies the ACPI timer is
> used to assure KeQueryPerformanceCounter works properly and that only
> rogue drivers that bypass the Windows calls and send their own RDTSC are
> at risk. So the way I read it Microsoft already addressed the problem.
QueryPerfCtr is based on various timers in various HALs. In some HALs it
will use the ACPI timer. In others it will use the timestamp counter or the
8254 (I think it was). There is a new “high resolution” timer in some PCs
that I think may also be used under some circumstances.
Note that the ACPI timer is itself a pseudo-random number generator in most
implementations.
The hardware types have certainly mucked up the concept of high resolution
timekeeping in the last few chip generations. I give them about 3-5 more
years before enough people complain and convince them that random timer
speeds are only of interest to people doing chip simulation, and are useless
for any normal software purpose.
Loren
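Because the counter source differs from HAL to HAL, so does its frequency: the ACPI PM timer runs at 3.579545 MHz and the 8254 at about 1.193182 MHz, while a TSC-based counter runs at or near the CPU clock. A driver therefore has to take the frequency from the optional output parameter of KeQueryPerformanceCounter() rather than hard-code it. Below is a minimal sketch of the conversion in portable C, assuming the tick count and frequency have already been read; the ticks_to_100ns name is hypothetical. Splitting the value into whole seconds and a remainder avoids the 64-bit overflow that a plain multiply-then-divide would hit on a fast counter.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a raw performance-counter reading to 100-ns units.
   ticks: raw counter value; freq: counter frequency in Hz (must be > 0).
   Safe from overflow as long as freq is below roughly 1.8e12 Hz. */
static uint64_t ticks_to_100ns(uint64_t ticks, uint64_t freq)
{
    const uint64_t units_per_sec = 10000000ULL; /* 100-ns units per second */
    assert(freq > 0);
    uint64_t whole = ticks / freq;   /* complete seconds */
    uint64_t rem   = ticks % freq;   /* leftover ticks, always < freq */
    return whole * units_per_sec + (rem * units_per_sec) / freq;
}
```

Note that this only makes the units consistent. It does nothing about the per-CPU skew discussed in this thread when the underlying source is the TSC.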
> ----------
From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of xxxxx@email.com[SMTP:xxxxx@email.com]
Reply To: Windows System Software Devs Interest List
Sent: Wednesday, February 14, 2007 11:59 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] KeQueryPerformanceCounter on AMD64 DualCore machines issue
> But I am confused. The knowledge base article implies the ACPI timer is used to assure KeQueryPerformanceCounter works properly and that only rogue drivers that bypass the Windows calls and send their own RDTSC are at risk. So the way I read it Microsoft already addressed the problem.
That is the hotfix mentioned above, which doesn’t seem to be publicly available for x64. The XP SP2 KeQueryPerformanceCounter() implementation can use RDTSC; it works this way on my dual Opteron PC.
Best regards,
Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]
All the explanations about the problem make perfect sense. What doesn’t
make sense to me (as another person pointed out) is that this is *not*
documented in the DDK. What should I do with an API that basically returns
“random” numbers on specific architectures? And worse, I have no way to
detect whether KQPC returns meaningful numbers (as it always did on x86 with
2k/XP/2k3/Vista in my experience) or not (as on x64, where one of my users
reported the problem).
Does anyone know how to obtain these private hotfixes from MS (considering
that I need to deliver one to one of my customers)?
have a nice day
GV
----- Original Message -----
From: “Loren Wilton”
To: “Windows System Software Devs Interest List”
Sent: Wednesday, February 14, 2007 4:05 AM
Subject: Re: RE:[ntdev] KeQueryPerformanceCounter on AMD64 DualCore machines
issue
Gianluca Varenni wrote:
> All the explanations about the problem make perfect sense. What
> doesn’t make sense to me (as another person pointed out) is that this
> is *not* documented in the DDK. What should I do with an API that
> basically returns “random” numbers on specific architectures. And
> worse, I have no way to detect if KQPC returns meaningful numbers, as
> it always did on x86 and 2k/XP/2k3/Vista in my experience, and x64
> (where one of my users reported the problem).
It is basically only a happy accident that this has *ever* worked. The
APIs haven’t changed. The hardware did.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
> That is the hotfix mentioned above, which doesn’t seem to be publicly available for x64. The XP SP2 KeQueryPerformanceCounter() implementation can
> use RDTSC; it works this way on my dual Opteron PC.
For those interested: there *exists* a "#define ReadTimeStampCounter()" that generates an RDTSC instruction in your code
when called. Just do a text search for ReadTimeStampCounter within the \inc folder of your DDK.
Christiaan
> there *exists* a "#define ReadTimeStampCounter()"
> that generates an RDTSC instruction in your code
Surely no one would contemplate shipping a driver with this, because (1) it is undocumented, (2) on today’s processors the results are unpredictable, and (3) like inline assembly, it’s one of those things likely to be eliminated in the future. Better to write your code once the right way than once the wrong way and then again the right way after customers experience problems.
eof
The problems are with the instruction and how it is implemented, especially
in SMP systems. The code for that define is written by Microsoft, and it
will be supported, in whatever form it must take, in future OS versions. It is
contained in miniport.h, ntddk.h, and wdm.h; in the current Vista WDK, only
ntifs.h does not define it.
As a matter of fact, whether you use that macro (ReadTimeStampCounter) or
KeQueryPerformanceCounter, both are almost useless for timestamping
a piece of information that is not CPU-bound. So the only reliable function
is KeQuerySystemTime (and similar), which gives timestamps with
millisecond-range precision.
BTW, I just discovered yesterday from a log sent by my user that DebugView
probably uses the same function, as it shows the same identical problem with
timestamps.
GV
----- Original Message -----
From: “David Craig”
Newsgroups: ntdev
To: “Windows System Software Devs Interest List”
Sent: Wednesday, February 14, 2007 10:05 PM
Subject: Re:[ntdev] KeQueryPerformanceCounter on AMD64 DualCore machines
issue
xxxxx@email.com wrote:
>> there *exists* a "#define ReadTimeStampCounter()"
>> that generates an RDTSC instruction in your code
>
> Surely no one would contemplate shipping a driver with this because (1) it is undocumented, (2) on today’s processors the results are unpredictable, (3) like inline assembly it’s one of those things likely to be eliminated in the future so better to write your code once the right way instead of once the wrong way and then later the right way after customers experience problems.
I disagree. This is, in fact, exactly the right way to do something
that had previously required inline assembly. This define generates the
appropriate code for whatever architecture you happen to be running.
Your point #2 is kind of silly, since KeQueryPerformanceCounter() will
return exactly the same number as ReadTimeStampCounter() with most HALs.
How do you define “the right way”?
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
> is KeQuerySystemTime (and similar) which gives timestamps in the millisecond
> range precision.
The granularity is NOT 1 millisecond, but rather 10 milliseconds … The same applies
to KeQueryTickCount, and even the granularity of that one may vary.
Christiaan
----- Original Message -----
From: “Gianluca Varenni”
To: “Windows System Software Devs Interest List”
Sent: Thursday, February 15, 2007 5:21 PM
Subject: Re: Re:[ntdev] KeQueryPerformanceCounter on AMD64 DualCore machines issue
----- Original Message -----
From: “Christiaan Ghijselinck”
To: “Windows System Software Devs Interest List”
Sent: Thursday, February 15, 2007 9:29 AM
Subject: Re: Re:[ntdev] KeQueryPerformanceCounter on AMD64 DualCore machines
issue
>> is KeQuerySystemTime (and similar) which gives timestamps in the
>> millisecond range precision.
>
> The granularity is NOT 1 millisecond, rather 10 milliseconds … The same
> applies to KeQueryTickCount, even the granularity of this last one may vary.
That’s what I meant by “millisecond range”. Also, depending on what you do
with ExSetTimerResolution and on the platform, you can go to about 1 ms. This
is done by the multimedia user-level APIs (mmtimer stuff), for example.
GV
At 11:51 PM 2/14/2007, you wrote:
> … so better to write your code once the right way instead of once the
> wrong way and then later the right way after customers experience problems.
Ok, what is the right way?
…assuming that “the right way” works.
> ----------
From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Gianluca Varenni[SMTP:xxxxx@gmail.com]
Reply To: Windows System Software Devs Interest List
Sent: Thursday, February 15, 2007 5:21 PM
To: Windows System Software Devs Interest List
Subject: Re: Re:[ntdev] KeQueryPerformanceCounter on AMD64 DualCore machines issue
> As a matter of facts, either you use that macro (ReadTimeStampCounter) or
> KeQueryPerformanceCounter, they are almost useless if used for timestamping
> a piece of information that is not cpu-bound. So, the only reliable function
> is KeQuerySystemTime (and similar) which gives timestamps in the millisecond
> range precision.
On most platforms I have seen, it is about 15.620 ms, which is ages for a modern CPU.
> BTW, I just discovered yesterday from a log sent by my user that DebugView
> probably uses the same function, as it shows the same identical problem with
> timestamps.
Yes, it does. I have the same experience with it.
Coincidentally, I just needed to improve time measurement in my driver. Originally I used KeQueryTickCount(), but I need better resolution now. So I tried KeQueryInterruptTime(), which claims to return a finer-grained measurement (according to the WDK docs). I wondered how that is possible if KeQueryTimeIncrement() is documented to return the timer increment for both functions, but I tried nevertheless. Nothing changed: 15.620 ms resolution. For completeness, I tried KeQuerySystemTime(), with the same result.
The only remaining function is KeQueryPerformanceCounter(), which uses RDTSC on my dual Opteron machine. That means it is susceptible to the problems mentioned in this thread, and I can see them.
So what is the right solution? I can’t believe such a basic thing just doesn’t work…
Best regards,
Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]
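The behaviour described above follows from how tick-based clocks work: the value only advances once per timer interrupt, so any two reads within the same tick return the same time. Here is a minimal model in C, assuming a 15.625 ms increment expressed in 100-ns units (156250), the unit that KeQueryTimeIncrement() reports. The tick_clock_read function is a hypothetical stand-in used for illustration, not a kernel routine.

```c
#include <assert.h>
#include <stdint.h>

/* Models a tick-driven clock: returns the true time rounded down to the
   most recent timer interrupt. All values are in 100-ns units. */
static uint64_t tick_clock_read(uint64_t true_time, uint64_t increment)
{
    assert(increment > 0);
    return true_time - (true_time % increment);
}
```

With an increment of 156250, two events at true times 200000 and 210000 (1 ms apart) both read back as 156250, so a tick-based clock cannot order them, let alone measure the interval between them.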
>>> is KeQuerySystemTime (and similar) which gives timestamps in the
>>> millisecond range precision.
>>
>> The granularity is NOT 1 millisecond , rather 10 milliseconds … The
>> same applies
> That’s what I meant by “millisecond range”. Also, depending on what you do
> with ExSetTimerResolution and on the platform, you can go to about 1ms.
> This is done by the Multimedia user level APIs (mmtimer stuff), for
> example.
Well… no. I just ran a test (at user level) on GetSystemTimeAsFileTime to
see if it is a suitable replacement for QPC in this case. It returns the
standard 15.625 ms tick interval as the base increment.
This is regardless of the timer resolution setting.
So it appears that the only thing that can be done if you need timestamps
below 1ms on a multiproc system is to implement a routine that has a thread
per proc at realtime priority, and will periodically all sync together
burning processor until they can get a near-simultaneous RDTSC value from
each processor. You can then be careful of which processor you read the
timestamp counter from and use a correction factor based on the last known
offset and speed for that processor.
How often you have to get the base sync values probably depends on the
environment. I deal with datacenter servers, so I can probably get away
with getting a sync value every 100 seconds or so. On a laptop with active
power management you might have to get sync values several times per second
to have actual accuracies in the millisecond range.
It appears that, short of a special hardware timer plugin card, it is no
longer possible to get reliable timestamps that are good to less than 1ms,
and even those might be questionable in a number of circumstances.
Loren
Michal Vodicka wrote:
> The only remaining function is KeQueryPerformanceCounter() which uses RDTSC at my dual Opteron machine. Which means it is susceptible to problems mentioned in this thread and I can see them.
> So what is the right solution? I can’t believe such a basic thing just doesn’t work…
One answer is to use affinity to make sure your code always runs on the
same CPU.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.