How can I get highest resolution timestamps?

My goal is to estimate how quickly I can service my hardware by taking precision timestamps and inserting them into a buffer each time I read a byte from the device. Then, once the buffer is full of timestamps, I run some statistics, such as the average time between reads, the min and max time between consecutive reads, etc.

I have been trying to use KeQuerySystemTime to generate my timestamps, but I suspect its resolution is killing my results. The WDK documentation shows KeQuerySystemTime to have a resolution of roughly 10ms. Is there a better way to accomplish this, one that gets me closer to 1ms resolution?
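
Roughly what I have in mind, as a sketch (the routine names and buffer size are just illustrative, and KeQueryPerformanceCounter here stands in for whatever timestamp source turns out to be best):

#include <ntddk.h>

#define TS_COUNT 4096

static LARGE_INTEGER g_Stamps[TS_COUNT]; /* one timestamp per byte read */
static LONG g_Index = 0;

/* Called each time a byte is read from the hardware (illustrative name). */
VOID TsLogByteRead(VOID)
{
    LONG i = InterlockedIncrement(&g_Index) - 1;
    if (i < TS_COUNT) {
        g_Stamps[i] = KeQueryPerformanceCounter(NULL);
    }
}

/* Once the buffer is full: min/max/average gap between consecutive reads. */
VOID TsReport(VOID)
{
    LARGE_INTEGER freq;
    LONGLONG minD = MAXLONGLONG, maxD = 0, sum = 0;
    LONG i;

    KeQueryPerformanceCounter(&freq);        /* ticks per second */

    for (i = 1; i < TS_COUNT; i++) {
        LONGLONG d = g_Stamps[i].QuadPart - g_Stamps[i - 1].QuadPart;
        if (d < minD) minD = d;
        if (d > maxD) maxD = d;
        sum += d;
    }

    DbgPrint("gap us: min %I64d max %I64d avg %I64d\n",
             minD * 1000000 / freq.QuadPart,
             maxD * 1000000 / freq.QuadPart,
             (sum / (TS_COUNT - 1)) * 1000000 / freq.QuadPart);
}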

KeQueryPerformanceCounter


Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com


How does this compare to KeQueryInterruptTime?

I noticed that KeQueryPerformanceCounter may temporarily disable interrupts. This may cause me issues.

Can someone who has used either of these methods inform me of any caveats I should be aware of?

I dunno, dude. The WDK says:

“A call to KeQueryInterruptTime has considerably less overhead than a call to KeQueryPerformanceCounter, as well.”

If you read this (and the rest of the doc page on this DDI) I think you’ll see it’s pretty clear.

Peter
OSR

It depends on the HAL…

For example, on the MP APIC HAL, KeQueryPerformanceCounter() relies upon the RDTSC instruction, which returns the number of CPU clock cycles since boot, i.e., it offers the highest resolution possible, and it does not disable interrupts. However, on other HALs the situation may be different: KeQueryPerformanceCounter() may rely upon the system clock, and it may disable interrupts as well…
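
To illustrate, on such a HAL a raw cycle-count measurement boils down to something like this (a sketch; __rdtsc() is the compiler intrinsic for the RDTSC instruction, and the helper name is mine):

#include <intrin.h>   /* __rdtsc() compiler intrinsic */

/* Raw cycle-count delta around a code region. Only meaningful if both
   reads happen on the same CPU (raised IRQL or hard affinity). */
static unsigned __int64 CyclesAround(void (*Region)(void))
{
    unsigned __int64 before = __rdtsc();
    Region();
    return __rdtsc() - before;
}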

Anton Bassov


KeQueryInterruptTime is a bit better, although its resolution is
non-deterministic.

KeQueryPerformanceCounter is better yet, but on a multiprocessor
machine, it uses the cycle counter, so you need to use processor
affinity to get monotonic results.
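
Something along these lines, as a sketch (PASSIVE_LEVEL only; the routine name is just illustrative):

#include <ntddk.h>

/* Pin the current thread to processor 0 so both QPC reads come from the
   same (unsynchronized) cycle counter, then restore the old affinity. */
VOID MeasureOnOneCpu(VOID)
{
    LARGE_INTEGER t1, t2;

    KeSetSystemAffinityThread((KAFFINITY)1); /* bind to CPU 0 */

    t1 = KeQueryPerformanceCounter(NULL);
    /* ... operation being timed ... */
    t2 = KeQueryPerformanceCounter(NULL);

    KeRevertToUserAffinityThread();          /* restore original affinity */

    DbgPrint("delta: %I64d ticks\n", t2.QuadPart - t1.QuadPart);
}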


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

xxxxx@hotmail.com wrote:

> How does this compare to KeQueryInterruptTime?

KeQueryInterruptTime and timeGetTime are identical: they read a
static variable that the kernel updates during clock interrupt
processing. In the worst case, it has the same granularity as
KeQueryTickCount. If someone has used timeBeginPeriod to reduce the
timer interval (which increases overhead), then KeQueryInterruptTime
will do much better.
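
You can see the effective step size for yourself with a sketch like this (the routine name is illustrative):

#include <ntddk.h>

/* KeQueryInterruptTime returns 100-nanosecond units since boot but only
   advances at clock interrupts; spin until it changes to see the step. */
VOID ShowInterruptTimeStep(VOID)
{
    ULONGLONG t0 = KeQueryInterruptTime();
    ULONGLONG t1;

    do {
        t1 = KeQueryInterruptTime();
    } while (t1 == t0);

    DbgPrint("interrupt time step: %I64u (100ns units)\n", t1 - t0);
}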

> I noticed that KeQueryPerformanceCounter may temporarily disable interrupts. This may cause me issues.

Well, you’re talking about 2 or 3 microseconds or less, and that only on
a uniprocessor HAL, where it reads the motherboard countdown timer.
That has to be done with interrupts disabled. With a multiprocessor
HAL, it uses the cycle counter, which takes a few nanoseconds and
doesn’t need to disable interrupts, but you do have to be aware that the
counters on the various processors are not synchronized.
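
Whichever HAL is underneath, you convert tick deltas to wall time with the frequency the call reports through its out-parameter. A sketch:

#include <ntddk.h>

/* Convert a QPC tick delta to microseconds using the frequency that
   KeQueryPerformanceCounter returns through its out-parameter. */
static LONGLONG QpcTicksToMicroseconds(LONGLONG deltaTicks)
{
    LARGE_INTEGER freq;

    KeQueryPerformanceCounter(&freq);   /* ticks per second */
    return deltaTicks * 1000000 / freq.QuadPart;
}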

> Can someone who has used either of these methods inform me of any caveats I should be aware of?

Lots of them. But you’re doing this for one-time performance
measurement, not for production use, right?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Tim,

> > Can someone who has used either of these methods inform me of any caveats
> > I should be aware of?
>
> Lots of them. But you’re doing this for one-time performance measurement,
> not for production use, right?

The only question is what he is going to achieve with his performance measurements - after all,
KeQueryPerformanceCounter() is not really meant to be a function for measuring elapsed time.

Therefore, as we already discussed around a month ago, it just does not make sense to speak about tests that offer high-precision results without mentioning the context in which they have to be made. As I mentioned in that discussion, these tests are just pointless below DISPATCH_LEVEL and offer 100% reliable results only at the highest possible IRQL (or with a cleared IF flag, which, in practical terms, is exactly the same thing). The good news here is that in this context your pre-test and post-test calls to KeQueryPerformanceCounter() are guaranteed to be made on the same CPU, which eliminates the headache of worrying about discrepancies between cycle counters on different CPUs.
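
In other words, the measurement itself has to look something like this (a sketch; raising to HIGH_LEVEL keeps both interrupts and CPU migration out of the measured region):

#include <ntddk.h>

/* Take both counter readings at HIGH_LEVEL so no interrupt (and hence no
   migration to another CPU) can land between them. Keep the measured
   region very short - this CPU services nothing else meanwhile. */
VOID MeasureAtHighIrql(VOID)
{
    KIRQL oldIrql;
    LARGE_INTEGER t1, t2;

    KeRaiseIrql(HIGH_LEVEL, &oldIrql);

    t1 = KeQueryPerformanceCounter(NULL);
    /* ... short code region under test ... */
    t2 = KeQueryPerformanceCounter(NULL);

    KeLowerIrql(oldIrql);

    DbgPrint("delta: %I64d ticks\n", t2.QuadPart - t1.QuadPart);
}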

However, another problem still remains. Let’s say you have discovered that a piece of code X
runs N cycles faster than piece Y. What does this difference mean in *practical* terms, i.e., how does it translate to *actual* performance? Even if you convert the difference into milli- (micro-, nano-, etc.) seconds, it still has to be viewed in the context of a given operation - it just does not make sense to be bothered about a few microseconds in the context of an operation that is going to take at least, say, 100ms. (For example, 3,000 cycles on a 3GHz CPU is only one microsecond, i.e., 0.001% of a 100ms operation.)

Therefore, the OP has to think twice before going for tests of such high precision in the first place - there is a good chance that they are simply pointless in the context in which he plans to run them…

Anton Bassov

> Re: How can I get highest resolution timestamps?

Use a hardware trace module or logic analyzer to monitor the external
data communications with your device. That way you get system-independent,
accurate timestamping at any desired resolution.

If you invoke any software routine inside your Windows™ driver, the
act of observation itself will change the very thing you want to monitor
(how your driver behaves).

-H