KeQueryTickCount and Windbg

Hi
I am observing a seemingly odd behaviour in my driver with respect to the periodicity of the timers when using debug prints and Windbg and I would like to understand this better. Apologize in advance if this query belongs to Windbg forum.
My driver manages a FCoE storage adapter and depends on a timer mechanism to periodically (every 8 seconds, using KeSetTimer) send protocol specific information on the wire (FIP Keep Alive). The other end-point will wait for this frame to be delivered within 20 seconds before terminating the link. The Windbg is connected via serial port.
For every debug print statement, the driver uses KeQueryTickCount to retrieve the system tick count and converts it ?seconds.milliseconds? format (using KeQueryTimeIncrement) before using it as part of the print string.
When I hook up Windbg and have large number of debug prints, I see the other end-point terminating the link as it does not see the protocol specific information sent out for 20 seconds. However, the time information captured as part of the debug prints indicates a time lapse of only 2 seconds before the link is terminated. See prints below:

Wed Jan 6 13:56:00.617 2016 (UTC - 6:00): 211.000 TX>>>> DST:01:10:18:01:00:02 SRC:34:64:A9:93:8E:49 P:3 V:0000 FIP Op=0004/01 (VLAN request)
Wed Jan 6 13:56:00.749 2016 (UTC - 6:00): 211.000 RX<<<< DST:34:64:A9:93:8E:49 SRC:D0:BF:9C:DA:53:DB P:0 V:NONE FIP Op=0004/02 (VLAN response)
Wed Jan 6 13:56:01.148 2016 (UTC - 6:00): 211.000 TX>>>> DST:01:10:18:01:00:02 SRC:34:64:A9:93:8E:49 P:3 V:0100 FIP Op=0001/01 (Discovery Solicitation)
Wed Jan 6 13:56:01.359 2016 (UTC - 6:00): 211.000 RX<<<< DST:34:64:A9:93:8E:49 SRC:D0:BF:9C:DA:53:DB P:3 V:0100 FIP Op=0001/02 (Discovery Advertisement)
Wed Jan 6 13:56:03.061 2016 (UTC - 6:00): 212.375 RX<<<< DST:01:10:18:01:00:01 SRC:D0:BF:9C:DA:53:DB P:3 V:0100 FIP Op=0001/02 (Discovery Advertisement)
Wed Jan 6 13:56:04.092 2016 (UTC - 6:00): 213.015 TX>>>> DST:34:64:A9:93:8E:49 SRC:D0:BF:9C:DA:53:DB FIP Keep Alive // Protocol specific information sent by the driver
*** Mulitple other debug prints ***
Wed Jan 6 13:56:26.712 2016 (UTC - 6:00): 215.328 RX<<<< DST:34:64:A9:93:8E:49 SRC:D0:BF:9C:DA:53:DB P:3 V:0100 FIP Op=0003/02 (Clear Virtual Links) // End-point terminating the link

As you can see above, the driver sends out the protocol specific information at time 213.015 (seconds.ms format) and is waiting for the 8 seconds to elapse. However, on the wire, at the other end-point and on Windbg, the time elapsed is beyond 20 seconds when the link is terminated. Even at this point, the driver queried tick count shows an elapsed time of 2.313 seconds.

A couple of other observations:

  1. The timer invocation for the frame on the wire is correct if the Windbg is not active (verified by the wire trace). Using DbgView (instead of Windbg) does not show this skew for debug prints. The behaviour with Windbg is the same with both serial and network transports.
  2. If I reduce the number of prints, the timer information is accurate (both on prints and the frame on the wire).

Can someone help me understand how Windbg is affecting the value/behavior of KeQueryTickCount and KeSetTimer on the target server?

Regards,
Girish.

xxxxx@hotmail.com wrote:

When I hook up Windbg and have large number of debug prints, I see the other end-point terminating the link as it does not see the protocol specific information sent out for 20 seconds. However, the time information captured as part of the debug prints indicates a time lapse of only 2 seconds before the link is terminated.

Sending debug prints over a wired WinDbg connection is not compatible
with real-time processes. It’s easy to forget how much overhead is
involved in WinDbg transmissions, but it takes a HUGE amount of time,
and the kernel is frozen in an inactive state while WinDbg takes is
sending characters along.

You just need to chalk this up to “that’s the way it is”. You cannot do
meaningful real time measurement with WinDbg spew going on. This is
where DebugView shines. It doesn’t have to interface with physical
media, so the overhead is much smaller.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

See if you can find QueueDelayedWorkitem function in your code.

Use it, young padawan, instead of KeSetTimer. And you will become a true Jedi, then. I’m pretty sure QueueDelayedWorkitem is already used to send the FKA.

And yes, a lot of debug spew *will* cause timeouts. That’s how it is.

> meaningful real time measurement with WinDbg spew going on. This is

where DebugView shines.

…as also WPP/ETW


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

Thanks Tim, Alex and Maxim for clarifying the behavior with large number of debug prints.