Deviations using KeQueryPerformanceCounter

Hello,

I have a WDM driver function that reads the timestamp of my PCI hardware and calls KeQueryPerformanceCounter. Both values are returned to the user so he/she can correlate them.

Tests on Windows 7 show that usually both timestamps have an accuracy of approx. 1 microsecond. But in 1 of 100 calls, the timestamps show a deviation of 50 us.

Here is my code:

// ****************************************************
// Firmware might delay the driver’s read request, so we read the hardware timestamp first
// and then the performance counter. Because the timestamp is latched (cached) by the
// firmware, it is safe to read the high part first and then the low part of the
// high-resolution timestamp.

const unsigned long long highPart = driverRead(pDevExt, BAR0::HIRES_TIMESTAMP_HIGH::Register);
const unsigned long long lowPart = driverRead(pDevExt, BAR0::HIRES_TIMESTAMP_LOW::Register);
pVal->HighResTimeStamp = ((highPart << 32LL) | lowPart);

LARGE_INTEGER performanceCounter = KeQueryPerformanceCounter(&performanceFrequency);
pVal->PerformanceCounter = performanceCounter.QuadPart;

return STATUS_SUCCESS;
// ****************************************************

performanceFrequency is 2148505, so one performance counter tick is roughly 0.465 us. The HwTimestamp has a resolution of 1 microsecond.
The GetTimestamp function of my driver is called frequently (about 12 times per millisecond).
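
For reference, here is the conversion behind the "delta PerformanceCounter (us)" column below (a sketch, not the actual driver code; the helper name and the rounding are made up):

// Convert a QPC tick delta to microseconds, rounded to the nearest microsecond,
// using the reported frequency of 2148505 ticks/s.
// Example: 95 ticks -> 95 * 1000000 / 2148505 = 44 us, matching the first data row below.
ULONGLONG QpcDeltaToMicroseconds(LONGLONG deltaTicks, LONGLONG frequency)
{
    return ((ULONGLONG)deltaTicks * 1000000ULL + (ULONGLONG)frequency / 2) / (ULONGLONG)frequency;
}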

Here are the measured values:
HwTimestamp (us)  delta HwTimestamp (us)  PerformanceCounter  delta PerformanceCounter (ticks)  delta PerformanceCounter (us)
1622922832        44                      3453117539          95                                44
1622922876        44                      3453117633          94                                44
1622922919        43                      3453117725          92                                43
1622922962        43                      3453118130          405                               189  <- this line has deviations
1622923207        245                     3453118343          213                               99   <- this line has deviations
1622923330        123                     3453118608          265                               123
1622923380        50                      3453118716          108                               50
1622923429        49                      3453118821          105                               49

I noticed that when I add the two deltas of the deviating lines for each counter (HwTimestamp: 43 + 245, PerformanceCounter: 189 + 99), both sums give the same value: 288 us.

The code runs at DISPATCH_LEVEL, so I assume it can only be interrupted by code running at device IRQL (ISRs). Is it possible that an ISR from the system or another driver interrupts my driver’s GetTimestamp function for 50 us?

xxxxx@gmx.de wrote:

I have a WDM driver function that reads the timestamp of my PCI hardware and calls KeQueryPerformanceCounter. Both values are returned to the user so he/she can correlate them.

Tests on Windows 7 show that usually both timestamps have an accuracy of approx. 1 microsecond. But in 1 of 100 calls, the timestamps show a deviation of 50 us.

The code runs at DISPATCH_LEVEL, so I assume it can only be interrupted by code running at device IRQL (ISRs). Is it possible that an ISR from the system or another driver interrupts my driver’s GetTimestamp function for 50 us?

Absolutely. Among other things, the timer interrupt is going to fire
every now and then.

I certainly hope this result does not surprise you. It shouldn’t. This
is not a real-time environment. There are no guarantees.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

An ISR should not do much: read the interrupt register from the hardware to check whether the interrupt came from my device, acknowledge the interrupt on the hardware, and schedule the DPC. I had hoped that this would be faster than the 50 us I saw.
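
For illustration, a minimal sketch of that ISR pattern (the register names, the ReadDeviceRegister/WriteDeviceRegister helpers, and the device extension layout are hypothetical placeholders, not the actual driver code; it also assumes IoInitializeDpcRequest was called during device setup):

BOOLEAN MyIsr(PKINTERRUPT Interrupt, PVOID ServiceContext)
{
    PDEVICE_EXTENSION pDevExt = (PDEVICE_EXTENSION)ServiceContext;
    UNREFERENCED_PARAMETER(Interrupt);

    // 1. Read the interrupt status register to see whether this device raised the interrupt.
    ULONG status = ReadDeviceRegister(pDevExt, INT_STATUS_REG);
    if ((status & MY_INTERRUPT_PENDING) == 0) {
        return FALSE;               // not ours; let other handlers on a shared line run
    }

    // 2. Acknowledge (clear) the interrupt on the hardware.
    WriteDeviceRegister(pDevExt, INT_ACK_REG, status);

    // 3. Defer the real work to a DPC (IoRequestDpc queues the DPC registered with
    //    IoInitializeDpcRequest during device initialization).
    IoRequestDpc(pDevExt->DeviceObject, NULL, pDevExt);
    return TRUE;
}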

There are also IPIs (inter-processor interrupts) happening across all the
processors. For example, if you change the kernel memory map, the TLBs on
every processor may need to be flushed. There are lots of ways IPIs get
optimized, but there can still be ugly cases that have no choice but to
burn CPU cycles; for example, if something needs to synchronize to a
single core, it might have to trigger an IPI, and every other core then
needs to essentially spin on a lock while that one core does its thing.
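
To make that concrete, KeIpiGenericCall is one documented way a driver can broadcast work to every processor via IPI; while the broadcast worker runs, every processor is held at IPI level (above all device IRQLs), so anything executing at DISPATCH_LEVEL, like the GetTimestamp routine above, simply stalls for the duration. A minimal sketch:

#include <wdm.h>

// Runs on every processor at IPI level. Keep it short: the whole system is
// effectively paused while this executes.
static ULONG_PTR BroadcastWorker(ULONG_PTR Context)
{
    InterlockedIncrement((volatile LONG *)Context);
    return 0;
}

VOID DemonstrateIpiStall(VOID)
{
    LONG hitCount = 0;

    // Returns only after every processor has run BroadcastWorker.
    KeIpiGenericCall(BroadcastWorker, (ULONG_PTR)&hitCount);
}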

Hardware devices other than the timer are also taking interrupts.

Some systems might be doing things in SMM mode; for example, laptops
might be adjusting the fan speed. I’ve also seen servers that did ECC
memory scrubbing in SMM mode. Those servers would take a periodic SMM
interrupt and then hog the CPU while 4K of memory was scrubbed for
correctable errors.

I believe VTune might have a way to profile specific ranges of code, which
I assume would include a profile of which interrupts happened.

A DIY method of looking at what interrupt is happening might be to make
your code spin reading the hw and perf counters, and when it notices there
was a glitch, record an ETW event. If you then ran an ETW trace session
that included system interrupts and your ETW events, I would think looking
at the event times would give good clues about who interrupted just before
your glitch event.
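
A minimal sketch of that DIY approach, under some assumptions: DriverReadHwTimestamp() stands in for the BAR0 reads shown earlier, LogGlitchEtwEvent() stands in for whatever ETW write you hook up (e.g. your own provider registered with EtwRegister/EtwWrite), and the threshold and iteration counts are made up:

#include <wdm.h>

#define GLITCH_THRESHOLD_US 20       // anything above this is treated as "somebody interrupted us"
#define SPIN_ITERATIONS     100000   // keep the spin bounded; this runs at DISPATCH_LEVEL

VOID SpinForGlitches(PDEVICE_EXTENSION pDevExt)
{
    LARGE_INTEGER freq;
    LARGE_INTEGER prev = KeQueryPerformanceCounter(&freq);

    for (ULONG i = 0; i < SPIN_ITERATIONS; i++) {
        ULONGLONG     hwStamp = DriverReadHwTimestamp(pDevExt);   // hypothetical helper
        LARGE_INTEGER now     = KeQueryPerformanceCounter(NULL);

        // Convert the QPC delta between two consecutive reads to microseconds.
        ULONGLONG deltaUs = ((ULONGLONG)(now.QuadPart - prev.QuadPart) * 1000000ULL)
                            / (ULONGLONG)freq.QuadPart;

        if (deltaUs > GLITCH_THRESHOLD_US) {
            // Hypothetical: write an ETW event carrying both timestamps so the glitch
            // can be lined up against the system interrupt events in the same trace.
            LogGlitchEtwEvent(hwStamp, now.QuadPart, deltaUs);
        }
        prev = now;
    }
}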

Jan

On 2/24/15, 7:58 AM, “xxxxx@gmx.de” wrote:

>An ISR should not do much: Read the interrupt register from the hardware
>to check whether the interrupt came from me. Acknowledge the interrupt on
>the hardware. Schedule the DPC. I hoped, that this is faster than the 50
>us I saw.

Couldn’t this deviation also be caused by the time source behind KeQueryPerformanceCounter? Looking at the given frequency, the source doesn’t seem to be the HPET, so I think the returned value could shift slightly when the thread switches cores on a multicore CPU, or across the different sleep states.
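
As a rough user-mode sanity check of that reasoning (a sketch; the listed source frequencies are typical values and the classification is only a heuristic, not an official API):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);

    if (freq.QuadPart == 3579545)
        printf("QPC source looks like the ACPI PM timer (3.579545 MHz)\n");
    else if (freq.QuadPart == 14318180)
        printf("QPC source looks like a typical HPET (14.31818 MHz)\n");
    else
        printf("QPC frequency is %I64d Hz - neither the PM timer nor a typical HPET rate, "
               "consistent with a TSC-derived source\n", freq.QuadPart);
    return 0;
}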

Johannes

I checked the timestamp source. Coreinfo.exe returns:
RDTSCP * Supports RDTSCP instruction
TSC * Supports RDTSC instruction
TSC-DEADLINE * Local APIC supports one-shot deadline timer
TSC-INVARIANT * TSC runs at constant rate

So I assume QPC uses RDTSCP.
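
A rough way to cross-check that assumption (a sketch, not production code): read the TSC via __rdtscp and QPC back to back; if QPC really is TSC-backed, the ratio of the two deltas should stay essentially constant from run to run.

#include <wdm.h>
#include <intrin.h>

VOID CompareTscToQpc(VOID)
{
    unsigned int aux;

    unsigned __int64 tsc0 = __rdtscp(&aux);
    LARGE_INTEGER    qpc0 = KeQueryPerformanceCounter(NULL);

    KeStallExecutionProcessor(50);           // let roughly 50 us pass between the samples

    unsigned __int64 tsc1 = __rdtscp(&aux);
    LARGE_INTEGER    qpc1 = KeQueryPerformanceCounter(NULL);

    ULONGLONG tscDelta = tsc1 - tsc0;
    ULONGLONG qpcDelta = (ULONGLONG)(qpc1.QuadPart - qpc0.QuadPart);

    // If QPC is TSC-derived, this ratio is the (constant) TSC-to-QPC scaling factor.
    DbgPrint("TSC delta %I64u, QPC delta %I64u, ratio ~%I64u\n",
             tscDelta, qpcDelta, qpcDelta ? tscDelta / qpcDelta : 0);
}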