Windows System Software -- Consulting, Training, Development -- Unique Expertise, Guaranteed Results

Before Posting...
Please check out the Community Guidelines in the Announcements and Administration Category.

Interesting issue with (Ke)QueryPerformanceCounter

Gianluca_VarenniGianluca_Varenni Member Posts: 334
Hi all.

A customer of mine reported a strange issue, and I tracked the issue down to
wrong values returned by KeQueryPerformanceCounter (or the user counterpart
QueryPerformanceCounter/Frequency).

The MSDN docs clearly state that

KeQueryPerformanceCounter always returns a 64-bit integer that represents
the current value of a monotonically nondecreasing counter. The resolution
of the timer used to accumulate the current count can be obtained by
specifying a pointer value for PerformanceFrequency. The value that is
returned is the number of ticks per second. The count starts accumulating
when the computer is started up.


What I get on some specific customer machine is something that is not "ticks
per second". Either the frequency is bogus, or the counter itself.

I wrote a small user app to test it:

int _tmain(int argc, _TCHAR* argv[])
{
int i;
LARGE_INTEGER freq, counter;
SYSTEMTIME systemTime;

for (i = 0; i < 10; i++)
{
GetSystemTime(&systemTime);

if (QueryPerformanceCounter(&counter) == FALSE ||
QueryPerformanceFrequency(&freq) == FALSE)
{
printf("Error getting performance counters\n");
}

printf("Counter = %I64d Freq = %I64d", counter.QuadPart, freq.QuadPart);
printf(" system: %02d:%02d:%02d\n",
systemTime.wHour,
systemTime.wMinute,
systemTime.wSecond);

Sleep(1000);
}

return 0;
}

And this is the output on the failing machine:

Counter = 56497408135970 Freq = 3158750000 system: 01:18:57
Counter = 56497818110218 Freq = 3158750000 system: 01:18:58
Counter = 56498206404250 Freq = 3158750000 system: 01:18:59
Counter = 56498715528269 Freq = 3158750000 system: 01:19:00
Counter = 56499113984213 Freq = 3158750000 system: 01:19:01
Counter = 56499489120527 Freq = 3158750000 system: 01:19:02
Counter = 56499867623469 Freq = 3158750000 system: 01:19:03
Counter = 56500252073478 Freq = 3158750000 system: 01:19:04
Counter = 56500713252945 Freq = 3158750000 system: 01:19:05
Counter = 56501096560332 Freq = 3158750000 system: 01:19:06

If you make some computations, the counter seems to say that every iteration
is about 0.12 seconds apart from the previous one, while it should be more
or less 1 second.
The failing machine is running XP SP3 (hal.dll 5.1.2600.5687, ntoskrnl.exe
5.1.2600.5938) on an Intel Core 2 Duo E8500.

Has anyone ever seen this?


Have a nice day
GV
«1

Comments

  • Thomas_Divine-2Thomas_Divine-2 Member Posts: 635
    Haven't seen this.

    But you might try starting/stopping the multimedia timer. See if that causes
    difference in behavior.

    Thomas


    --------------------------------------------------
    From: "Gianluca Varenni" <xxxxx@gmail.com>
    Sent: Wednesday, September 01, 2010 3:34 PM
    To: "Windows System Software Devs Interest List" <xxxxx@lists.osr.com>
    Subject: [ntdev] Interesting issue with (Ke)QueryPerformanceCounter

    > Hi all.
    >
    > A customer of mine reported a strange issue, and I tracked the issue down
    > to wrong values returned by KeQueryPerformanceCounter (or the user
    > counterpart QueryPerformanceCounter/Frequency).
    >
    > The MSDN docs clearly state that
    >
    > KeQueryPerformanceCounter always returns a 64-bit integer that represents
    > the current value of a monotonically nondecreasing counter. The resolution
    > of the timer used to accumulate the current count can be obtained by
    > specifying a pointer value for PerformanceFrequency. The value that is
    > returned is the number of ticks per second. The count starts accumulating
    > when the computer is started up.
    >
    >
    > What I get on some specific customer machine is something that is not
    > "ticks per second". Either the frequency is bogus, or the counter itself.
    >
    > I wrote a small user app to test it:
    >
    > int _tmain(int argc, _TCHAR* argv[])
    > {
    > int i;
    > LARGE_INTEGER freq, counter;
    > SYSTEMTIME systemTime;
    >
    > for (i = 0; i < 10; i++)
    > {
    > GetSystemTime(&systemTime);
    >
    > if (QueryPerformanceCounter(&counter) == FALSE ||
    > QueryPerformanceFrequency(&freq) == FALSE)
    > {
    > printf("Error getting performance counters\n");
    > }
    >
    > printf("Counter = %I64d Freq = %I64d", counter.QuadPart, freq.QuadPart);
    > printf(" system: %02d:%02d:%02d\n",
    > systemTime.wHour,
    > systemTime.wMinute,
    > systemTime.wSecond);
    >
    > Sleep(1000);
    > }
    >
    > return 0;
    > }
    >
    > And this is the output on the failing machine:
    >
    > Counter = 56497408135970 Freq = 3158750000 system: 01:18:57
    > Counter = 56497818110218 Freq = 3158750000 system: 01:18:58
    > Counter = 56498206404250 Freq = 3158750000 system: 01:18:59
    > Counter = 56498715528269 Freq = 3158750000 system: 01:19:00
    > Counter = 56499113984213 Freq = 3158750000 system: 01:19:01
    > Counter = 56499489120527 Freq = 3158750000 system: 01:19:02
    > Counter = 56499867623469 Freq = 3158750000 system: 01:19:03
    > Counter = 56500252073478 Freq = 3158750000 system: 01:19:04
    > Counter = 56500713252945 Freq = 3158750000 system: 01:19:05
    > Counter = 56501096560332 Freq = 3158750000 system: 01:19:06
    >
    > If you make some computations, the counter seems to say that every
    > iteration is about 0.12 seconds apart from the previous one, while it
    > should be more or less 1 second.
    > The failing machine is running XP SP3 (hal.dll 5.1.2600.5687, ntoskrnl.exe
    > 5.1.2600.5938) on an Intel Core 2 Duo E8500.
    >
    > Has anyone ever seen this?
    >
    >
    > Have a nice day
    > GV
    >
    >
    >
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
  • anton_bassovanton_bassov Member Posts: 4,901
    > Has anyone ever seen this?

    We had a discussion with Michal around a week ago about something similar. Basically, you can sleep less than the amount of time you have specified under some circumstances. You can check "Some doubts about KeStallExecutionProcessor and KeDelayExecutionThread' thread (http://www.osronline.com/showThread.cfm?link=187626) for more info...


    Anton Bassov
  • Gianluca_VarenniGianluca_Varenni Member Posts: 334
    One thing is sleeping less than exactly one second, one other thing is
    reporting numbers that are almost 10 times off.

    The test runs for about 10 seconds. The difference between the first and the
    last sample is around 9 seconds (remember that I print the result of
    GetSystemTime). QueryPerformanceCounter reports that the time difference is
    1.17 seconds.

    GV


    --------------------------------------------------
    From: <xxxxx@hotmail.com>
    Sent: Wednesday, September 01, 2010 1:17 PM
    To: "Windows System Software Devs Interest List" <xxxxx@lists.osr.com>
    Subject: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter

    >> Has anyone ever seen this?
    >
    > We had a discussion with Michal around a week ago about something similar.
    > Basically, you can sleep less than the amount of time you have specified
    > under some circumstances. You can check "Some doubts about
    > KeStallExecutionProcessor and KeDelayExecutionThread' thread
    > (http://www.osronline.com/showThread.cfm?link=187626) for more info...
    >
    >
    > Anton Bassov
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,058
    Yeah, yeah... it's not even remotely related to what we talked about a couple of weeks back...

    You posted user-mode code, though, which makes my head hurt.

    Have you tried this in kernel mode?? Thankfully, in kernel-mode KeQueryPerformanceCounter both returns the value *and* the frequency... which makes life a lot easier.

    Peter
    OSR

    Peter Viscarola
    OSR
    @OSRDrivers

  • Pavel_APavel_A Member Posts: 2,648
    Is this some kind of virtual machine? Or do they have anything that tampers with whatever timer QueryPerformanceCounter uses? (run this test again in safe mode, or boot a WinPE CD).
    If none of above, then... that machine is only 99% compatible with Windows. It happens.

    Regards,
    --pa
  • mmmm Member - All Emails Posts: 1,409
    Is this running in a VM?


    mm

    -----Original Message-----
    From: xxxxx@lists.osr.com
    [mailto:xxxxx@lists.osr.com] On Behalf Of Gianluca Varenni
    Sent: Wednesday, September 01, 2010 4:27 PM
    To: Windows System Software Devs Interest List
    Subject: Re: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter

    One thing is sleeping less than exactly one second, one other thing is
    reporting numbers that are almost 10 times off.

    The test runs for about 10 seconds. The difference between the first and the
    last sample is around 9 seconds (remember that I print the result of
    GetSystemTime). QueryPerformanceCounter reports that the time difference is
    1.17 seconds.

    GV


    --------------------------------------------------
    From: <xxxxx@hotmail.com>
    Sent: Wednesday, September 01, 2010 1:17 PM
    To: "Windows System Software Devs Interest List" <xxxxx@lists.osr.com>
    Subject: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter

    >> Has anyone ever seen this?
    >
    > We had a discussion with Michal around a week ago about something similar.

    > Basically, you can sleep less than the amount of time you have specified
    > under some circumstances. You can check "Some doubts about
    > KeStallExecutionProcessor and KeDelayExecutionThread' thread
    > (http://www.osronline.com/showThread.cfm?link=187626) for more info...
    >
    >
    > Anton Bassov
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer


    ---
    NTDEV is sponsored by OSR

    For our schedule of WDF, WDM, debugging and other seminars visit:
    http://www.osr.com/seminars

    To unsubscribe, visit the List Server section of OSR Online at
    http://www.osronline.com/page.cfm?name=ListServer
  • Gianluca_VarenniGianluca_Varenni Member Posts: 334
    --------------------------------------------------
    From: <xxxxx@osr.com>
    Sent: Wednesday, September 01, 2010 1:39 PM
    To: "Windows System Software Devs Interest List" <xxxxx@lists.osr.com>
    Subject: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter

    > Yeah, yeah... it's not even remotely related to what we talked about a
    > couple of weeks back...
    >
    > You posted user-mode code, though, which makes my head hurt.
    >
    > Have you tried this in kernel mode?? Thankfully, in kernel-mode
    > KeQueryPerformanceCounter both returns the value *and* the frequency...
    > which makes life a lot easier.

    The problem was originally detected with KeQueryPerformanceCounter (where it
    was used to timestamp packets). Even in that case, the frequency was always
    the same, but the counter was off in the same exact way.

    Unfortunately writing a similar test application in kernel mode is a bit
    more complicated. The customer having the issue is not a direct customer. We
    sell the library and driver to our customer, and he builds the application
    for the end customer (having the issue). So I have no whatsoever access to
    the failing machine. I can just ask my customer to check his customer's
    machine (or run some test apps on it). Having said that, I have a similar
    trace, where I was printing the values returned by KeQuerySystemTime and
    KeQueryPerformance counter (where the iterations are not 1second apart, I
    was printing them for every packet received by my protocol driver) and it
    was showing bogus performance counters as well. The frequency was always the
    same, it was not changing.

    I'm checking with the customer if they are actually running from within a VM
    (and they forgot to tell me).

    Have a nice day
    GV

    >
    > Peter
    > OSR
    >
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
  • Gianluca_VarenniGianluca_Varenni Member Posts: 334
    Just checked with the customer. Nope. It's an HP 8000 Elite.

    Have a nice day
    GV

    --------------------------------------------------
    From: "M. M. O'Brien" <xxxxx@gmail.com>
    Sent: Wednesday, September 01, 2010 1:49 PM
    To: "Windows System Software Devs Interest List" <xxxxx@lists.osr.com>
    Subject: RE: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter

    > Is this running in a VM?
    >
    >
    > mm
    >
    > -----Original Message-----
    > From: xxxxx@lists.osr.com
    > [mailto:xxxxx@lists.osr.com] On Behalf Of Gianluca Varenni
    > Sent: Wednesday, September 01, 2010 4:27 PM
    > To: Windows System Software Devs Interest List
    > Subject: Re: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter
    >
    > One thing is sleeping less than exactly one second, one other thing is
    > reporting numbers that are almost 10 times off.
    >
    > The test runs for about 10 seconds. The difference between the first and
    > the
    > last sample is around 9 seconds (remember that I print the result of
    > GetSystemTime). QueryPerformanceCounter reports that the time difference
    > is
    > 1.17 seconds.
    >
    > GV
    >
    >
    > --------------------------------------------------
    > From: <xxxxx@hotmail.com>
    > Sent: Wednesday, September 01, 2010 1:17 PM
    > To: "Windows System Software Devs Interest List" <xxxxx@lists.osr.com>
    > Subject: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter
    >
    >>> Has anyone ever seen this?
    >>
    >> We had a discussion with Michal around a week ago about something
    >> similar.
    >
    >> Basically, you can sleep less than the amount of time you have specified
    >> under some circumstances. You can check "Some doubts about
    >> KeStallExecutionProcessor and KeDelayExecutionThread' thread
    >> (http://www.osronline.com/showThread.cfm?link=187626) for more info...
    >>
    >>
    >> Anton Bassov
    >>
    >> ---
    >> NTDEV is sponsored by OSR
    >>
    >> For our schedule of WDF, WDM, debugging and other seminars visit:
    >> http://www.osr.com/seminars
    >>
    >> To unsubscribe, visit the List Server section of OSR Online at
    >> http://www.osronline.com/page.cfm?name=ListServer
    >
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
    >
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
  • mmmm Member - All Emails Posts: 1,409
    Bummer

    On Sep 1, 2010 5:16 PM, "Gianluca Varenni"
    wrote:
    > Just checked with the customer. Nope. It's an HP 8000 Elite.
    >
    > Have a nice day
    > GV
    >
    > --------------------------------------------------
    > From: "M. M. O'Brien"
    > Sent: Wednesday, September 01, 2010 1:49 PM
    > To: "Windows System Software Devs Interest List"
    > Subject: RE: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter
    >
    >> Is this running in a VM?
    >>
    >>
    >> mm
    >>
    >> -----Original Message-----
    >> From: xxxxx@lists.osr.com
    >> [mailto:xxxxx@lists.osr.com] On Behalf Of Gianluca Varenni
    >> Sent: Wednesday, September 01, 2010 4:27 PM
    >> To: Windows System Software Devs Interest List
    >> Subject: Re: RE:[ntdev] Interesting issue with
    (Ke)QueryPerformanceCounter
    >>
    >> One thing is sleeping less than exactly one second, one other thing is
    >> reporting numbers that are almost 10 times off.
    >>
    >> The test runs for about 10 seconds. The difference between the first and
    >> the
    >> last sample is around 9 seconds (remember that I print the result of
    >> GetSystemTime). QueryPerformanceCounter reports that the time difference
    >> is
    >> 1.17 seconds.
    >>
    >> GV
    >>
    >>
    >> --------------------------------------------------
    >> From:
    >> Sent: Wednesday, September 01, 2010 1:17 PM
    >> To: "Windows System Software Devs Interest List"
    >> Subject: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter
    >>
    >>>> Has anyone ever seen this?
    >>>
    >>> We had a discussion with Michal around a week ago about something
    >>> similar.
    >>
    >>> Basically, you can sleep less than the amount of time you have specified
    >>> under some circumstances. You can check "Some doubts about
    >>> KeStallExecutionProcessor and KeDelayExecutionThread' thread
    >>> (http://www.osronline.com/showThread.cfm?link=187626) for more info...
    >>>
    >>>
    >>> Anton Bassov
    >>>
    >>> ---
    >>> NTDEV is sponsored by OSR
    >>>
    >>> For our schedule of WDF, WDM, debugging and other seminars visit:
    >>> http://www.osr.com/seminars
    >>>
    >>> To unsubscribe, visit the List Server section of OSR Online at
    >>> http://www.osronline.com/page.cfm?name=ListServer
    >>
    >>
    >> ---
    >> NTDEV is sponsored by OSR
    >>
    >> For our schedule of WDF, WDM, debugging and other seminars visit:
    >> http://www.osr.com/seminars
    >>
    >> To unsubscribe, visit the List Server section of OSR Online at
    >> http://www.osronline.com/page.cfm?name=ListServer
    >>
    >>
    >> ---
    >> NTDEV is sponsored by OSR
    >>
    >> For our schedule of WDF, WDM, debugging and other seminars visit:
    >> http://www.osr.com/seminars
    >>
    >> To unsubscribe, visit the List Server section of OSR Online at
    >> http://www.osronline.com/page.cfm?name=ListServer
    >
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    http://www.osronline.com/page.cfm?name=ListServer
  • Tim_RobertsTim_Roberts Member - All Emails Posts: 12,854
    Gianluca Varenni wrote:
    > A customer of mine reported a strange issue, and I tracked the issue down to
    > wrong values returned by KeQueryPerformanceCounter (or the user counterpart
    > QueryPerformanceCounter/Frequency).
    > ...
    > What I get on some specific customer machine is something that is not "ticks
    > per second". Either the frequency is bogus, or the counter itself.
    > ...
    > If you make some computations, the counter seems to say that every iteration
    > is about 0.12 seconds apart from the previous one, while it should be more
    > or less 1 second.
    > The failing machine is running XP SP3 (hal.dll 5.1.2600.5687, ntoskrnl.exe
    > 5.1.2600.5938) on an Intel Core 2 Duo E8500.
    >
    > Has anyone ever seen this?

    QPC is, in this case, directly reading the processor's cycle counter.
    The processor is running at 3.1 GHz. If the machine is not doing very
    much, it's possible for the cycle frequency to change because of power
    management. I have not seen differences quite this dramatic, but it
    would be interesting to change your program to do __readtsc() during
    each loop and see if the results are identical to QPC.

    --
    Tim Roberts, xxxxx@probo.com
    Providenza & Boekelheide, Inc.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • Cay_BremerCay_Bremer Member Posts: 121
    Hello,

    I can't offer an explanation either, but have you already tried running
    'bcdedit /set useplatformclock yes' (or setting /USEPMTIMER in boot.ini),
    the standard workaround for QPC-related problems?


    On Wed, 01 Sep 2010 22:34:23 +0300, Gianluca Varenni
    wrote:
    > Hi all.
    >
    > A customer of mine reported a strange issue, and I tracked the issue
    > down to wrong values returned by KeQueryPerformanceCounter (or the user
    > counterpart QueryPerformanceCounter/Frequency).
    >
    > The MSDN docs clearly state that
    >
    > KeQueryPerformanceCounter always returns a 64-bit integer that
    > represents the current value of a monotonically nondecreasing counter.
    > The resolution of the timer used to accumulate the current count can be
    > obtained by specifying a pointer value for PerformanceFrequency. The
    > value that is returned is the number of ticks per second. The count
    > starts accumulating when the computer is started up.
    >
    >
    > What I get on some specific customer machine is something that is not
    > "ticks per second". Either the frequency is bogus, or the counter itself.
    >
    > I wrote a small user app to test it:
    >
    > int _tmain(int argc, _TCHAR* argv[])
    > {
    > int i;
    > LARGE_INTEGER freq, counter;
    > SYSTEMTIME systemTime;
    >
    > for (i = 0; i < 10; i++)
    > {
    > GetSystemTime(&systemTime);
    >
    > if (QueryPerformanceCounter(&counter) == FALSE ||
    > QueryPerformanceFrequency(&freq) == FALSE)
    > {
    > printf("Error getting performance counters\n");
    > }
    >
    > printf("Counter = %I64d Freq = %I64d", counter.QuadPart,
    > freq.QuadPart);
    > printf(" system: %02d:%02d:%02d\n",
    > systemTime.wHour,
    > systemTime.wMinute,
    > systemTime.wSecond);
    >
    > Sleep(1000);
    > }
    >
    > return 0;
    > }
    >
    > And this is the output on the failing machine:
    >
    > Counter = 56497408135970 Freq = 3158750000 system: 01:18:57
    > Counter = 56497818110218 Freq = 3158750000 system: 01:18:58
    > Counter = 56498206404250 Freq = 3158750000 system: 01:18:59
    > Counter = 56498715528269 Freq = 3158750000 system: 01:19:00
    > Counter = 56499113984213 Freq = 3158750000 system: 01:19:01
    > Counter = 56499489120527 Freq = 3158750000 system: 01:19:02
    > Counter = 56499867623469 Freq = 3158750000 system: 01:19:03
    > Counter = 56500252073478 Freq = 3158750000 system: 01:19:04
    > Counter = 56500713252945 Freq = 3158750000 system: 01:19:05
    > Counter = 56501096560332 Freq = 3158750000 system: 01:19:06
    >
    > If you make some computations, the counter seems to say that every
    > iteration is about 0.12 seconds apart from the previous one, while it
    > should be more or less 1 second.
    > The failing machine is running XP SP3 (hal.dll 5.1.2600.5687,
    > ntoskrnl.exe 5.1.2600.5938) on an Intel Core 2 Duo E8500.
    >
    > Has anyone ever seen this?
    >
    >
    > Have a nice day
    > GV
  • Alex_GrigAlex_Grig Member Posts: 3,238
    I suggest you instead of sleep() run a tight loop waiting for the time to jump to the next second and see if the performance counter will behave better.

    In other words, the PerfCount may run with the claimed speed only when you're actually doing something. If the processor is idle, then power management may kick in
  • Cay_BremerCay_Bremer Member Posts: 121
    Newer AMD and Intel processors, including the Core 2 family, update the
    TSC at a constant rate, so CPU throttling cannot be the culprit here.

    (Sorry for mentioning BCDEdit in the first place. I had originally missed
    that the machine is running Windows XP.)


    On Thu, 02 Sep 2010 05:41:09 +0300, wrote:
    > I suggest you instead of sleep() run a tight loop waiting for the time
    > to jump to the next second and see if the performance counter will
    > behave better.
    >
    > In other words, the PerfCount may run with the claimed speed only when
    > you're actually doing something. If the processor is idle, then power
    > management may kick in
  • Daniel_TerhellDaniel_Terhell Member Posts: 1,349
    I think the problem is that QPC is broken on Windows XP and Vista. The CPU
    guides clearly state that RDTSC is not a serializing instruction and is
    subject to out of order execution among other things and the various
    implementations on Windows don't do that. (On Win7 RDTSC is no longer used
    possibly depending on available hardware features).

    On XP SP3:
    hal!KeQueryPerformanceCounter:
    806ebb94 8bff mov edi,edi
    806ebb96 55 push ebp
    806ebb97 8bec mov ebp,esp
    806ebb99 5d pop ebp
    806ebb9a ff2548156f80 jmp dword ptr [hal!HalpHeapStart+0xc
    (806f1548)]
    0: kd> t
    hal!HalpAcpiTimerQueryPerfCount:
    806e4c78 a0dcd06e80 mov al,byte ptr [hal!HalpUse8254 (806ed0dc)]
    806e4c78 a0dcd06e80 mov al,byte ptr [hal!HalpUse8254 (806ed0dc)]
    806e4c7d 0ac0 or al,al
    806e4c7f 752d jne hal!HalpAcpiTimerQueryPerfCount+0x36
    (806e4cae)
    806e4c81 8b4c2404 mov ecx,dword ptr [esp+4]
    806e4c85 0bc9 or ecx,ecx
    806e4c87 7412 je hal!HalpAcpiTimerQueryPerfCount+0x23
    (806e4c9b)
    806e4c89 64a1a4000000 mov eax,dword ptr fs:[000000A4h]
    806e4c8f 648b15a8000000 mov edx,dword ptr fs:[0A8h]
    806e4c96 8901 mov dword ptr [ecx],eax
    806e4c98 895104 mov dword ptr [ecx+4],edx
    806e4c9b 0f31 rdtsc
    806e4c9d 640305ac000000 add eax,dword ptr fs:[0ACh]
    806e4ca4 641315b0000000 adc edx,dword ptr fs:[0B0h]
    806e4cab c20400 ret 4

    There is no serializing instruction here making RDTSC subject to out of
    order execution among other things.

    What I would try is issue a CPUID serializing instruction before calling QPC
    (__CPUID is included in intrin.h) and see if that solves the problem.

    You can read more about the difficulties getting RDTSC right in this
    article:
    http://www.ccsl.carleton.ca/~jamuir/rdtscpm1.pdf

    but you can read about these as well in the latest intel and AMD manuals.

    //Daniel




    "Gianluca Varenni" <xxxxx@gmail.com> wrote in message
    news:xxxxx@ntdev...
    > Hi all.
    >
    > A customer of mine reported a strange issue, and I tracked the issue down
    > to wrong values returned by KeQueryPerformanceCounter (or the user
    > counterpart QueryPerformanceCounter/Frequency).
    >
    > The MSDN docs clearly state that
    >
    > KeQueryPerformanceCounter always returns a 64-bit integer that represents
    > the current value of a monotonically nondecreasing counter. The resolution
    > of the timer used to accumulate the current count can be obtained by
    > specifying a pointer value for PerformanceFrequency. The value that is
    > returned is the number of ticks per second. The count starts accumulating
    > when the computer is started up.
    >
    >
    > What I get on some specific customer machine is something that is not
    > "ticks per second". Either the frequency is bogus, or the counter itself.
    >
    > I wrote a small user app to test it:
    >
    > int _tmain(int argc, _TCHAR* argv[])
    > {
    > int i;
    > LARGE_INTEGER freq, counter;
    > SYSTEMTIME systemTime;
    >
    > for (i = 0; i < 10; i++)
    > {
    > GetSystemTime(&systemTime);
    >
    > if (QueryPerformanceCounter(&counter) == FALSE ||
    > QueryPerformanceFrequency(&freq) == FALSE)
    > {
    > printf("Error getting performance counters\n");
    > }
    >
    > printf("Counter = %I64d Freq = %I64d", counter.QuadPart, freq.QuadPart);
    > printf(" system: %02d:%02d:%02d\n",
    > systemTime.wHour,
    > systemTime.wMinute,
    > systemTime.wSecond);
    >
    > Sleep(1000);
    > }
    >
    > return 0;
    > }
    >
    > And this is the output on the failing machine:
    >
    > Counter = 56497408135970 Freq = 3158750000 system: 01:18:57
    > Counter = 56497818110218 Freq = 3158750000 system: 01:18:58
    > Counter = 56498206404250 Freq = 3158750000 system: 01:18:59
    > Counter = 56498715528269 Freq = 3158750000 system: 01:19:00
    > Counter = 56499113984213 Freq = 3158750000 system: 01:19:01
    > Counter = 56499489120527 Freq = 3158750000 system: 01:19:02
    > Counter = 56499867623469 Freq = 3158750000 system: 01:19:03
    > Counter = 56500252073478 Freq = 3158750000 system: 01:19:04
    > Counter = 56500713252945 Freq = 3158750000 system: 01:19:05
    > Counter = 56501096560332 Freq = 3158750000 system: 01:19:06
    >
    > If you make some computations, the counter seems to say that every
    > iteration is about 0.12 seconds apart from the previous one, while it
    > should be more or less 1 second.
    > The failing machine is running XP SP3 (hal.dll 5.1.2600.5687, ntoskrnl.exe
    > 5.1.2600.5938) on an Intel Core 2 Duo E8500.
    >
    > Has anyone ever seen this?
    >
    >
    > Have a nice day
    > GV
    >
    >
    >
    >
  • Taed_WynnellTaed_Wynnell Member Posts: 105
    The mention of /USEPMTIMER got me looking and lead me to this KB article (http://support.microsoft.com/kb/895980) which describes "Programs that use the QueryPerformanceCounter function may perform poorly in Windows Server 2000, in Windows Server 2003, and in Windows XP". It sounds exactly like what you're describing.

    However, that particular issue is only for AMD CPUs, and according to this (http://h10010.www1.hp.com/wwpc/sg/en/sm/WF06a/12454-12454-64287-3328898-3328898-4065889.html) the HP Elite 8000 uses Intel CPUs.

    But I think that the solution is still worth trying: update to the latest hardware/cpu drivers from Intel.
  • Daniel_TerhellDaniel_Terhell Member Posts: 1,349
    Athough it doesn't explain the behavior of regular intervals of about 100ms
    for the sake of sanity I would also make sure you are running always on the
    same CPU by using SetThreadAffinityMask because each core has its own time
    stamp counter and they can be out of sync. Also disable variable speed
    features such as Intel Speedstep in the BIOS and read the rest of the
    recommendations in this article:
    http://msdn.microsoft.com/en-us/library/ee417693(VS.85).aspx

    //Daniel


    <xxxxx@resplendence.com> wrote in message news:xxxxx@ntdev...
    > I think the problem is that QPC is broken on Windows XP and Vista. The CPU
    > guides clearly state that RDTSC is not a serializing instruction and is
    > subject to out of order execution among other things and the various
    > implementations on Windows don't do that. (On Win7 RDTSC is no longer used
    > possibly depending on available hardware features).
    >
    > On XP SP3:
    > hal!KeQueryPerformanceCounter:
    > 806ebb94 8bff mov edi,edi
    > 806ebb96 55 push ebp
    > 806ebb97 8bec mov ebp,esp
    > 806ebb99 5d pop ebp
    > 806ebb9a ff2548156f80 jmp dword ptr [hal!HalpHeapStart+0xc
    > (806f1548)]
    > 0: kd> t
    > hal!HalpAcpiTimerQueryPerfCount:
    > 806e4c78 a0dcd06e80 mov al,byte ptr [hal!HalpUse8254 (806ed0dc)]
    > 806e4c78 a0dcd06e80 mov al,byte ptr [hal!HalpUse8254 (806ed0dc)]
    > 806e4c7d 0ac0 or al,al
    > 806e4c7f 752d jne hal!HalpAcpiTimerQueryPerfCount+0x36
    > (806e4cae)
    > 806e4c81 8b4c2404 mov ecx,dword ptr [esp+4]
    > 806e4c85 0bc9 or ecx,ecx
    > 806e4c87 7412 je hal!HalpAcpiTimerQueryPerfCount+0x23
    > (806e4c9b)
    > 806e4c89 64a1a4000000 mov eax,dword ptr fs:[000000A4h]
    > 806e4c8f 648b15a8000000 mov edx,dword ptr fs:[0A8h]
    > 806e4c96 8901 mov dword ptr [ecx],eax
    > 806e4c98 895104 mov dword ptr [ecx+4],edx
    > 806e4c9b 0f31 rdtsc
    > 806e4c9d 640305ac000000 add eax,dword ptr fs:[0ACh]
    > 806e4ca4 641315b0000000 adc edx,dword ptr fs:[0B0h]
    > 806e4cab c20400 ret 4
    >
    > There is no serializing instruction here making RDTSC subject to out of
    > order execution among other things.
    >
    > What I would try is issue a CPUID serializing instruction before calling
    > QPC (__CPUID is included in intrin.h) and see if that solves the problem.
    >
    > You can read more about the difficulties getting RDTSC right in this
    > article:
    > http://www.ccsl.carleton.ca/~jamuir/rdtscpm1.pdf
    >
    > but you can read about these as well in the latest intel and AMD manuals.
    >
    > //Daniel
    >
    >
    >
    >
    > "Gianluca Varenni" <xxxxx@gmail.com> wrote in message
    > news:xxxxx@ntdev...
    >> Hi all.
    >>
    >> A customer of mine reported a strange issue, and I tracked the issue down
    >> to wrong values returned by KeQueryPerformanceCounter (or the user
    >> counterpart QueryPerformanceCounter/Frequency).
    >>
    >> The MSDN docs clearly state that
    >>
    >> KeQueryPerformanceCounter always returns a 64-bit integer that represents
    >> the current value of a monotonically nondecreasing counter. The
    >> resolution of the timer used to accumulate the current count can be
    >> obtained by specifying a pointer value for PerformanceFrequency. The
    >> value that is returned is the number of ticks per second. The count
    >> starts accumulating when the computer is started up.
    >>
    >>
    >> What I get on some specific customer machine is something that is not
    >> "ticks per second". Either the frequency is bogus, or the counter itself.
    >>
    >> I wrote a small user app to test it:
    >>
    >> int _tmain(int argc, _TCHAR* argv[])
    >> {
    >> int i;
    >> LARGE_INTEGER freq, counter;
    >> SYSTEMTIME systemTime;
    >>
    >> for (i = 0; i < 10; i++)
    >> {
    >> GetSystemTime(&systemTime);
    >>
    >> if (QueryPerformanceCounter(&counter) == FALSE ||
    >> QueryPerformanceFrequency(&freq) == FALSE)
    >> {
    >> printf("Error getting performance counters\n");
    >> }
    >>
    >> printf("Counter = %I64d Freq = %I64d", counter.QuadPart, freq.QuadPart);
    >> printf(" system: %02d:%02d:%02d\n",
    >> systemTime.wHour,
    >> systemTime.wMinute,
    >> systemTime.wSecond);
    >>
    >> Sleep(1000);
    >> }
    >>
    >> return 0;
    >> }
    >>
    >> And this is the output on the failing machine:
    >>
    >> Counter = 56497408135970 Freq = 3158750000 system: 01:18:57
    >> Counter = 56497818110218 Freq = 3158750000 system: 01:18:58
    >> Counter = 56498206404250 Freq = 3158750000 system: 01:18:59
    >> Counter = 56498715528269 Freq = 3158750000 system: 01:19:00
    >> Counter = 56499113984213 Freq = 3158750000 system: 01:19:01
    >> Counter = 56499489120527 Freq = 3158750000 system: 01:19:02
    >> Counter = 56499867623469 Freq = 3158750000 system: 01:19:03
    >> Counter = 56500252073478 Freq = 3158750000 system: 01:19:04
    >> Counter = 56500713252945 Freq = 3158750000 system: 01:19:05
    >> Counter = 56501096560332 Freq = 3158750000 system: 01:19:06
    >>
    >> If you make some computations, the counter seems to say that every
    >> iteration is about 0.12 seconds apart from the previous one, while it
    >> should be more or less 1 second.
    >> The failing machine is running XP SP3 (hal.dll 5.1.2600.5687,
    >> ntoskrnl.exe 5.1.2600.5938) on an Intel Core 2 Duo E8500.
    >>
    >> Has anyone ever seen this?
    >>
    >>
    >> Have a nice day
    >> GV
    >>
    >>
    >>
    >>
    >
  • Daniel_TerhellDaniel_Terhell Member Posts: 1,349
    Sorry for following up to myself again but what I forgot about is that QPC
    has its own implementation in usermode which boils down to
    ntdll.RtlQueryPerformanceCounter. Nonethelesss it suffers from the same
    problems as the kernel implementations, no instruction which flushes the
    pipeline, so unreliable.

    //Daniel




    <xxxxx@resplendence.com> wrote in message news:xxxxx@ntdev...
    > Athough it doesn't explain the behavior of regular intervals of about
    > 100ms for the sake of sanity I would also make sure you are running always
    > on the same CPU by using SetThreadAffinityMask because each core has its
    > own time stamp counter and they can be out of sync. Also disable variable
    > speed features such as Intel Speedstep in the BIOS and read the rest of
    > the recommendations in this article:
    > http://msdn.microsoft.com/en-us/library/ee417693(VS.85).aspx
    >
    > //Daniel
    >
    >
    > <xxxxx@resplendence.com> wrote in message news:xxxxx@ntdev...
    >> I think the problem is that QPC is broken on Windows XP and Vista. The
    >> CPU guides clearly state that RDTSC is not a serializing instruction and
    >> is subject to out of order execution among other things and the various
    >> implementations on Windows don't do that. (On Win7 RDTSC is no longer
    >> used possibly depending on available hardware features).
    >>
    >> On XP SP3:
    >> hal!KeQueryPerformanceCounter:
    >> 806ebb94 8bff mov edi,edi
    >> 806ebb96 55 push ebp
    >> 806ebb97 8bec mov ebp,esp
    >> 806ebb99 5d pop ebp
    >> 806ebb9a ff2548156f80 jmp dword ptr [hal!HalpHeapStart+0xc
    >> (806f1548)]
    >> 0: kd> t
    >> hal!HalpAcpiTimerQueryPerfCount:
    >> 806e4c78 a0dcd06e80 mov al,byte ptr [hal!HalpUse8254 (806ed0dc)]
    >> 806e4c78 a0dcd06e80 mov al,byte ptr [hal!HalpUse8254 (806ed0dc)]
    >> 806e4c7d 0ac0 or al,al
    >> 806e4c7f 752d jne hal!HalpAcpiTimerQueryPerfCount+0x36
    >> (806e4cae)
    >> 806e4c81 8b4c2404 mov ecx,dword ptr [esp+4]
    >> 806e4c85 0bc9 or ecx,ecx
    >> 806e4c87 7412 je hal!HalpAcpiTimerQueryPerfCount+0x23
    >> (806e4c9b)
    >> 806e4c89 64a1a4000000 mov eax,dword ptr fs:[000000A4h]
    >> 806e4c8f 648b15a8000000 mov edx,dword ptr fs:[0A8h]
    >> 806e4c96 8901 mov dword ptr [ecx],eax
    >> 806e4c98 895104 mov dword ptr [ecx+4],edx
    >> 806e4c9b 0f31 rdtsc
    >> 806e4c9d 640305ac000000 add eax,dword ptr fs:[0ACh]
    >> 806e4ca4 641315b0000000 adc edx,dword ptr fs:[0B0h]
    >> 806e4cab c20400 ret 4
    >>
    >> There is no serializing instruction here making RDTSC subject to out of
    >> order execution among other things.
    >>
    >> What I would try is issue a CPUID serializing instruction before calling
    >> QPC (__CPUID is included in intrin.h) and see if that solves the problem.
    >>
    >> You can read more about the difficulties getting RDTSC right in this
    >> article:
    >> http://www.ccsl.carleton.ca/~jamuir/rdtscpm1.pdf
    >>
    >> but you can read about these as well in the latest intel and AMD manuals.
    >>
    >> //Daniel
    >>
    >>
    >>
    >>
    >> "Gianluca Varenni" <xxxxx@gmail.com> wrote in message
    >> news:xxxxx@ntdev...
    >>> Hi all.
    >>>
    >>> A customer of mine reported a strange issue, and I tracked the issue
    >>> down to wrong values returned by KeQueryPerformanceCounter (or the user
    >>> counterpart QueryPerformanceCounter/Frequency).
    >>>
    >>> The MSDN docs clearly state that
    >>>
    >>> KeQueryPerformanceCounter always returns a 64-bit integer that
    >>> represents the current value of a monotonically nondecreasing counter.
    >>> The resolution of the timer used to accumulate the current count can be
    >>> obtained by specifying a pointer value for PerformanceFrequency. The
    >>> value that is returned is the number of ticks per second. The count
    >>> starts accumulating when the computer is started up.
    >>>
    >>>
    >>> What I get on some specific customer machine is something that is not
    >>> "ticks per second". Either the frequency is bogus, or the counter
    >>> itself.
    >>>
    >>> I wrote a small user app to test it:
    >>>
    >>> int _tmain(int argc, _TCHAR* argv[])
    >>> {
    >>> int i;
    >>> LARGE_INTEGER freq, counter;
    >>> SYSTEMTIME systemTime;
    >>>
    >>> for (i = 0; i < 10; i++)
    >>> {
    >>> GetSystemTime(&systemTime);
    >>>
    >>> if (QueryPerformanceCounter(&counter) == FALSE ||
    >>> QueryPerformanceFrequency(&freq) == FALSE)
    >>> {
    >>> printf("Error getting performance counters\n");
    >>> }
    >>>
    >>> printf("Counter = %I64d Freq = %I64d", counter.QuadPart, freq.QuadPart);
    >>> printf(" system: %02d:%02d:%02d\n",
    >>> systemTime.wHour,
    >>> systemTime.wMinute,
    >>> systemTime.wSecond);
    >>>
    >>> Sleep(1000);
    >>> }
    >>>
    >>> return 0;
    >>> }
    >>>
    >>> And this is the output on the failing machine:
    >>>
    >>> Counter = 56497408135970 Freq = 3158750000 system: 01:18:57
    >>> Counter = 56497818110218 Freq = 3158750000 system: 01:18:58
    >>> Counter = 56498206404250 Freq = 3158750000 system: 01:18:59
    >>> Counter = 56498715528269 Freq = 3158750000 system: 01:19:00
    >>> Counter = 56499113984213 Freq = 3158750000 system: 01:19:01
    >>> Counter = 56499489120527 Freq = 3158750000 system: 01:19:02
    >>> Counter = 56499867623469 Freq = 3158750000 system: 01:19:03
    >>> Counter = 56500252073478 Freq = 3158750000 system: 01:19:04
    >>> Counter = 56500713252945 Freq = 3158750000 system: 01:19:05
    >>> Counter = 56501096560332 Freq = 3158750000 system: 01:19:06
    >>>
    >>> If you make some computations, the counter seems to say that every
    >>> iteration is about 0.12 seconds apart from the previous one, while it
    >>> should be more or less 1 second.
    >>> The failing machine is running XP SP3 (hal.dll 5.1.2600.5687,
    >>> ntoskrnl.exe 5.1.2600.5938) on an Intel Core 2 Duo E8500.
    >>>
    >>> Has anyone ever seen this?
    >>>
    >>>
    >>> Have a nice day
    >>> GV
    >>>
    >>>
    >>>
    >>>
    >>
    >
  • Gianluca_VarenniGianluca_Varenni Member Posts: 334
    --------------------------------------------------
    From: <xxxxx@broadcom.com>
    Sent: Wednesday, September 01, 2010 7:41 PM
    To: "Windows System Software Devs Interest List" <xxxxx@lists.osr.com>
    Subject: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter

    > I suggest you instead of sleep() run a tight loop waiting for the time to
    > jump to the next second and see if the performance counter will behave
    > better.

    That would prove that it's a power management issue. It still doesn't solve
    the problem of QueryPerformanceCounter returning bogus numbers :-(

    Have a nice day
    GV

    >
    > In other words, the PerfCount may run with the claimed speed only when
    > you're actually doing something. If the processor is idle, then power
    > management may kick in
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,058
    <QUOTE>
    There is no serializing instruction here making RDTSC subject to out of
    order execution among other things.
    </QUOTE>

    You're right... that IS a known bug. But Gianluca is reporting here isn't that the time being returned is ragged, highly variable, or off "a little"... he's reporting that it's off by almost A FACTOR OF MAGNITUDE.

    I'm having trouble believing out of order execution, or reading the TSC on two different processors which are not kept in sync, could be to blame for THAT level of error. Or, do you have a scenario in mind that would explain this.

    This finding is really some pretty scary stuff, at least in my mind.

    Peter
    OSR

    Peter Viscarola
    OSR
    @OSRDrivers

  • Daniel_TerhellDaniel_Terhell Member Posts: 1,349
    That makes sense but this could be in combination with a power management issue going on like Alex mentioned. If a CPU is idle, instead of doing a zero page thread it might call directly into intelppm.sys doing some magic to save consumption of the processor it is running. If RDTSC is issued after that instead of before that should make a huge difference.

    Also with processors, their caches and pipelines growing larger and more sophisticated, OOE might mean something that makes sense only if you understand the design of the CPU.

    The issues with QPC and RDTSC is rather long so to say and I would say it's better not to draw any conclusions from empirical observation only.

    //Daniel
  • mmmm Member - All Emails Posts: 1,409
    Have we tried disabling speedstep?

    On Sep 2, 2010 11:23 AM, wrote:
    > That makes sense but this could be in combination with a power management
    issue going on like Alex mentioned. If a CPU is idle, instead of doing a
    zero page thread it might call directly into intelppm.sys doing some magic
    to save consumption of the processor it is running. If RDTSC is issued after
    that instead of before that should make a huge difference.
    >
    > Also with processors, their caches and pipelines growing larger and more
    sophisticated, OOE might mean something that makes sense only if you
    understand the design of the CPU.
    >
    > The issues with QPC and RDTSC is rather long so to say and I would say
    it's better not to draw any conclusions from empirical observation only.
    >
    > //Daniel
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    http://www.osronline.com/page.cfm?name=ListServer
  • Mike_KempMike_Kemp Member - All Emails Posts: 291
    QueryPerformanceFrequency() is documented as not changing while the system
    is running. If that's true it can only really indicate the design speed of
    the CPU, which appears correct for the CPU specified in this test.

    Since it seems BIOSes these days throttle back the CPU, it does not sound
    unreasonable that the cycle counter is running slow.

    It seems the obvious test is to determine what the actual CPU clock speed is
    during this test. If it is being clocked at ~10% of the rated speed on
    average during this progam execution, then the counter is being read
    correctly, and there is no problem.

    In fact it would be surprising to observe anything else unless the actual
    clock speed is fixed at the design speed (which may be a setup option which
    could work around the issue for the time being).

    The various documents cited so far suggest that as a cycle counter QPC can
    be useful (if used with care) but it is not much use for any real time
    timing.

    I think the logic is that if you want to time items for external purposes,
    then you should use timing linked to the interface. e.g. if your external
    interface is video, then the actual frame rate is the finest resolution you
    need because what you do in between frames is irrelevant. If your external
    interface is audio, you should use the audio sample rate (or some
    subdivision based on blocks of samples) for the same reason. Or whatever the
    interface is. These timings must be known to the program since it is
    interfacing to them.

    But it is a major nuisance, and will mean rewriting some of my library code
    for new projects!

    Mike

    ----- Original Message -----
    From: xxxxx@osr.com
    To: Windows System Software Devs Interest List
    Sent: Thursday, September 02, 2010 4:13 PM
    Subject: RE:[ntdev] Interesting issue with (Ke)QueryPerformanceCounter


    <QUOTE>
    There is no serializing instruction here making RDTSC subject to out of
    order execution among other things.
    </QUOTE>

    You're right... that IS a known bug. But Gianluca is reporting here isn't
    that the time being returned is ragged, highly variable, or off "a
    little"... he's reporting that it's off by almost A FACTOR OF MAGNITUDE.

    I'm having trouble believing out of order execution, or reading the TSC on
    two different processors which are not kept in sync, could be to blame for
    THAT level of error. Or, do you have a scenario in mind that would explain
    this.

    This finding is really some pretty scary stuff, at least in my mind.

    Peter
    OSR


    ---
    NTDEV is sponsored by OSR

    For our schedule of WDF, WDM, debugging and other seminars visit:
    http://www.osr.com/seminars

    To unsubscribe, visit the List Server section of OSR Online at
    http://www.osronline.com/page.cfm?name=ListServer
  • Tim_RobertsTim_Roberts Member - All Emails Posts: 12,854
    Cay Bremer wrote:
    > Newer AMD and Intel processors, including the Core 2 family, update the
    > TSC at a constant rate, so CPU throttling cannot be the culprit here.

    His machine has a Core i5, which is newer than the Core 2. I, at least,
    don't know whether this family has the guaranteed cycle counter or not

    --
    Tim Roberts, xxxxx@probo.com
    Providenza & Boekelheide, Inc.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • Pavel_APavel_A Member Posts: 2,648
    By the way - as your disassembly shows - KeQPC immediately jumps
    thru some pointer. The default implementation does rdtsc,
    but "cpu drivers" may replace it to something better.

    --pa


    <xxxxx@resplendence.com> wrote in message news:xxxxx@ntdev...
    > Sorry for following up to myself again but what I forgot about is that QPC
    > has its own implementation in usermode which boils down to
    > ntdll.RtlQueryPerformanceCounter. Nonethelesss it suffers from the same
    > problems as the kernel implementations, no instruction which flushes the
    > pipeline, so unreliable.
    >
    > //Daniel
    >
    >
    >
    >
    > <xxxxx@resplendence.com> wrote in message news:xxxxx@ntdev...
    >> Athough it doesn't explain the behavior of regular intervals of about
    >> 100ms for the sake of sanity I would also make sure you are running
    >> always on the same CPU by using SetThreadAffinityMask because each core
    >> has its own time stamp counter and they can be out of sync. Also disable
    >> variable speed features such as Intel Speedstep in the BIOS and read the
    >> rest of the recommendations in this article:
    >> http://msdn.microsoft.com/en-us/library/ee417693(VS.85).aspx
    >>
    >> //Daniel
    >>
    >>
    >> <xxxxx@resplendence.com> wrote in message news:xxxxx@ntdev...
    >>> I think the problem is that QPC is broken on Windows XP and Vista. The
    >>> CPU guides clearly state that RDTSC is not a serializing instruction and
    >>> is subject to out of order execution among other things and the various
    >>> implementations on Windows don't do that. (On Win7 RDTSC is no longer
    >>> used possibly depending on available hardware features).
    >>>
    >>> On XP SP3:
    >>> hal!KeQueryPerformanceCounter:
    >>> 806ebb94 8bff mov edi,edi
    >>> 806ebb96 55 push ebp
    >>> 806ebb97 8bec mov ebp,esp
    >>> 806ebb99 5d pop ebp
    >>> 806ebb9a ff2548156f80 jmp dword ptr [hal!HalpHeapStart+0xc
    >>> (806f1548)]
    >>> 0: kd> t
    >>> hal!HalpAcpiTimerQueryPerfCount:
    >>> 806e4c78 a0dcd06e80 mov al,byte ptr [hal!HalpUse8254
    >>> (806ed0dc)]
    >>> 806e4c78 a0dcd06e80 mov al,byte ptr [hal!HalpUse8254
    >>> (806ed0dc)]
    >>> 806e4c7d 0ac0 or al,al
    >>> 806e4c7f 752d jne hal!HalpAcpiTimerQueryPerfCount+0x36
    >>> (806e4cae)
    >>> 806e4c81 8b4c2404 mov ecx,dword ptr [esp+4]
    >>> 806e4c85 0bc9 or ecx,ecx
    >>> 806e4c87 7412 je hal!HalpAcpiTimerQueryPerfCount+0x23
    >>> (806e4c9b)
    >>> 806e4c89 64a1a4000000 mov eax,dword ptr fs:[000000A4h]
    >>> 806e4c8f 648b15a8000000 mov edx,dword ptr fs:[0A8h]
    >>> 806e4c96 8901 mov dword ptr [ecx],eax
    >>> 806e4c98 895104 mov dword ptr [ecx+4],edx
    >>> 806e4c9b 0f31 rdtsc
    >>> 806e4c9d 640305ac000000 add eax,dword ptr fs:[0ACh]
    >>> 806e4ca4 641315b0000000 adc edx,dword ptr fs:[0B0h]
    >>> 806e4cab c20400 ret 4
    >>>
    >>> There is no serializing instruction here making RDTSC subject to out of
    >>> order execution among other things.
    >>>
    >>> What I would try is issue a CPUID serializing instruction before calling
    >>> QPC (__CPUID is included in intrin.h) and see if that solves the
    >>> problem.
    >>>
    >>> You can read more about the difficulties getting RDTSC right in this
    >>> article:
    >>> http://www.ccsl.carleton.ca/~jamuir/rdtscpm1.pdf
    >>>
    >>> but you can read about these as well in the latest intel and AMD
    >>> manuals.
    >>>
    >>> //Daniel
    >>>
    >>>
    >>>
    >>>
    >>> "Gianluca Varenni" <xxxxx@gmail.com> wrote in message
    >>> news:xxxxx@ntdev...
    >>>> Hi all.
    >>>>
    >>>> A customer of mine reported a strange issue, and I tracked the issue
    >>>> down to wrong values returned by KeQueryPerformanceCounter (or the user
    >>>> counterpart QueryPerformanceCounter/Frequency).
    >>>>
    >>>> The MSDN docs clearly state that
    >>>>
    >>>> KeQueryPerformanceCounter always returns a 64-bit integer that
    >>>> represents the current value of a monotonically nondecreasing counter.
    >>>> The resolution of the timer used to accumulate the current count can be
    >>>> obtained by specifying a pointer value for PerformanceFrequency. The
    >>>> value that is returned is the number of ticks per second. The count
    >>>> starts accumulating when the computer is started up.
    >>>>
    >>>>
    >>>> What I get on some specific customer machine is something that is not
    >>>> "ticks per second". Either the frequency is bogus, or the counter
    >>>> itself.
    >>>>
    >>>> I wrote a small user app to test it:
    >>>>
    >>>> int _tmain(int argc, _TCHAR* argv[])
    >>>> {
    >>>> int i;
    >>>> LARGE_INTEGER freq, counter;
    >>>> SYSTEMTIME systemTime;
    >>>>
    >>>> for (i = 0; i < 10; i++)
    >>>> {
    >>>> GetSystemTime(&systemTime);
    >>>>
    >>>> if (QueryPerformanceCounter(&counter) == FALSE ||
    >>>> QueryPerformanceFrequency(&freq) == FALSE)
    >>>> {
    >>>> printf("Error getting performance counters\n");
    >>>> }
    >>>>
    >>>> printf("Counter = %I64d Freq = %I64d", counter.QuadPart,
    >>>> freq.QuadPart);
    >>>> printf(" system: %02d:%02d:%02d\n",
    >>>> systemTime.wHour,
    >>>> systemTime.wMinute,
    >>>> systemTime.wSecond);
    >>>>
    >>>> Sleep(1000);
    >>>> }
    >>>>
    >>>> return 0;
    >>>> }
    >>>>
    >>>> And this is the output on the failing machine:
    >>>>
    >>>> Counter = 56497408135970 Freq = 3158750000 system: 01:18:57
    >>>> Counter = 56497818110218 Freq = 3158750000 system: 01:18:58
    >>>> Counter = 56498206404250 Freq = 3158750000 system: 01:18:59
    >>>> Counter = 56498715528269 Freq = 3158750000 system: 01:19:00
    >>>> Counter = 56499113984213 Freq = 3158750000 system: 01:19:01
    >>>> Counter = 56499489120527 Freq = 3158750000 system: 01:19:02
    >>>> Counter = 56499867623469 Freq = 3158750000 system: 01:19:03
    >>>> Counter = 56500252073478 Freq = 3158750000 system: 01:19:04
    >>>> Counter = 56500713252945 Freq = 3158750000 system: 01:19:05
    >>>> Counter = 56501096560332 Freq = 3158750000 system: 01:19:06
    >>>>
    >>>> If you make some computations, the counter seems to say that every
    >>>> iteration is about 0.12 seconds apart from the previous one, while it
    >>>> should be more or less 1 second.
    >>>> The failing machine is running XP SP3 (hal.dll 5.1.2600.5687,
    >>>> ntoskrnl.exe 5.1.2600.5938) on an Intel Core 2 Duo E8500.
    >>>>
    >>>> Has anyone ever seen this?
    >>>>
    >>>>
    >>>> Have a nice day
    >>>> GV
    >>>>
    >>>>
    >>>>
    >>>>
    >>>
    >>
    >
  • Pavel_APavel_A Member Posts: 2,648
    "Cay Bremer" <xxxxx@lavabit.com> wrote in message news:xxxxx@ntdev...
    > Newer AMD and Intel processors, including the Core 2 family, update the
    > TSC at a constant rate, so CPU throttling cannot be the culprit here.

    ??? But this would defeat the main purpose of rdtsc instruction?

    -- pa

    > (Sorry for mentioning BCDEdit in the first place. I had originally missed
    > that the machine is running Windows XP.)
    >
    >
    > On Thu, 02 Sep 2010 05:41:09 +0300, <xxxxx@broadcom.com> wrote:
    >> I suggest you instead of sleep() run a tight loop waiting for the time
    >> to jump to the next second and see if the performance counter will
    >> behave better.
    >>
    >> In other words, the PerfCount may run with the claimed speed only when
    >> you're actually doing something. If the processor is idle, then power
    >> management may kick in
    >
  • Tim_RobertsTim_Roberts Member - All Emails Posts: 12,854
    Pavel A. wrote:
    > "Cay Bremer" <xxxxx@lavabit.com> wrote in message news:xxxxx@ntdev...
    >> Newer AMD and Intel processors, including the Core 2 family, update the
    >> TSC at a constant rate, so CPU throttling cannot be the culprit here.
    > ??? But this would defeat the main purpose of rdtsc instruction?

    That depends entirely on what you believe its "main purpose" is. Many
    people believe the main purpose of rdtsc is to provide finely-grained
    elapsed time. I would say that the people who care how many CPU cycles
    have actually been executed are a very small minority.

    --
    Tim Roberts, xxxxx@probo.com
    Providenza & Boekelheide, Inc.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • Cay_BremerCay_Bremer Member Posts: 121
    On Fri, 03 Sep 2010 00:23:17 +0300, Pavel A. wrote:
    > ??? But this would defeat the main purpose of rdtsc instruction?

    You could have checked this claim yourself instead:
    http://www.intel.com/Assets/PDF/manual/253668.pdf (16.11 TIME-STAMP
    COUNTER)


    On Thu, 02 Sep 2010 19:16:44 +0300, Tim Roberts wrote:
    > His machine has a Core i5, which is newer than the Core 2. I, at least,
    > don't know whether this family has the guaranteed cycle counter or not

    This stands in contradiction with his original statement:
    "The failing machine is running XP [...] on an Intel Core 2 Duo E8500."

    In any case, "[t]his [constant TSC] is the architectural behavior moving
    forward", according to Intel.
  • Mike_KempMike_Kemp Member - All Emails Posts: 291
    That's a useful Intel reference. If there are any hardware designers here
    can they confirm that the CPU is always fed a constant clock so a constant
    rate counter is possible, or whether some motherboards alter the master
    clock to the CPU to control its speed (which would make the constant rate
    update impossible). This may be processor dependent but it would be useful
    to know whether it is possible on any or all of processors from core duo
    onward. Also the intel doc says that asserting DPSLP pin may stop the TSC
    increment. Does anyone know whether this is used even briefly in normal
    operation for power saving? Either of these would render the TSC unusable
    for realtime and may explain the OP observation. thx. M

    >You could have checked this claim yourself instead:
    >http://www.intel.com/Assets/PDF/manual/253668.pdf (16.11 TIME-STAMP
    >COUNTER)<
  • Pavel_APavel_A Member Posts: 2,648
    "Cay Bremer" <xxxxx@lavabit.com> wrote in message news:xxxxx@ntdev...
    > On Fri, 03 Sep 2010 00:23:17 +0300, Pavel A. <xxxxx@fastmail.fm> wrote:
    >> ??? But this would defeat the main purpose of rdtsc instruction?
    >
    > You could have checked this claim yourself instead:
    > http://www.intel.com/Assets/PDF/manual/253668.pdf (16.11 TIME-STAMP
    > COUNTER)

    Thanks Cay, it is a great idea. The fine System programming manual is so
    amusing reading.
    Every year it is not like a year ago...
    So they now call that "architectural behavior moving forward".

    There still are two delicate moments:

    1. the "constant TSC" rate depends on combination of other obscure
    parameters:
    "The specific processor configuration determines the behavior." (page 16-48,
    more esoteric details in 30.10.5) - and this is exactly what
    defines the PerformanceFrequency value. Unless the OS or a "CPU driver"
    measures
    the frequency in runtime, it may be possible to get wrong number - which may
    lead to results of the OP.

    2. What is "Invariant TSC" then? (16.11.1)

    About "constant TSC", they wrote:
    "Constant TSC behavior ensures that the duration of each clock tick is
    uniform and supports the
    use of the TSC as a wall clock timer even if the processor core changes
    frequency."

    But then goes:

    "The invariant TSC will run at a constant rate in all ACPI P-, C-. and
    T-states. This is
    the architectural behavior moving forward. On processors with invariant TSC
    support, the OS may use the TSC for wall clock timer services (instead of
    ACPI or
    HPET timers). ...
    Processor's support for invariant TSC is indicated by
    CPUID.80000007H:EDX[8]."

    Now I'm confused. My Core 2 cpu does not have this cpuid bit set - but,
    according to 16.11,
    Core 2 has constant TSC. So can I rely on TSC being uniform if C states
    change?
    -- pa

    >
    > On Thu, 02 Sep 2010 19:16:44 +0300, Tim Roberts <xxxxx@probo.com> wrote:
    >> His machine has a Core i5, which is newer than the Core 2. I, at least,
    >> don't know whether this family has the guaranteed cycle counter or not
    >
    > This stands in contradiction with his original statement:
    > "The failing machine is running XP [...] on an Intel Core 2 Duo E8500."
    >
    > In any case, "[t]his [constant TSC] is the architectural behavior moving
    > forward", according to Intel.
    >
  • Cay_BremerCay_Bremer Member Posts: 121
    On Sat, 04 Sep 2010 00:31:19 +0300, Pavel A. wrote:
    > 1. the "constant TSC" rate depends on combination of other obscure
    > parameters:
    > [...]
    > Unless the OS or a "CPU driver" measures the frequency in runtime,it may
    > be possible to get wrong number - which may lead to results
    > of the OP.

    I don't understand what you are trying to debate with me here. First, you
    question their decision of changing the TSC behaviour, even though that is
    obviously beyond my control, and now, you speculate about the intricacies
    of frequency retrieval/measurement, which I haven't even commented on. In
    fact, I am the one who suggested to try out a different QPC timer.
    In my reply that you have made the source of your controversy, I only
    addressed dynamic frequency scaling (but "CPU throttling" may have been
    ambiguous), which was, given the original behaviour of the TSC, actually a
    reasonable guess.

    (English comma rules are killing me.)


    > 2. What is "Invariant TSC" then? (16.11.1)
    >
    > [...]
    >
    > Now I'm confused. My Core 2 cpu does not have this cpuid bit set - but,
    > according to 16.11,
    > Core 2 has constant TSC. So can I rely on TSC being uniform if C states
    > change?

    If the documentation is to be trusted, a mere constant TSC "may" only be
    stopped in C3 and deeper states:

    "Following a RESET, the counter increments even when the processor is
    halted by the HLT instruction [C1] or the external STPCLK# pin [C2]. Note
    that the assertion of the external DPSLP# pin may cause the time-stamp
    counter to stop [C3]."
Sign In or Register to comment.

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Upcoming OSR Seminars
Writing WDF Drivers 25 Feb 2019 OSR Seminar Space
Developing Minifilters 8 April 2019 OSR Seminar Space