What logging solution do you use when you are not in debug mode ?

Hi !

As mentioned, I would be interested to know what kind of solution do
you use to log some traces from within your driver when you are not in
debug mode.

Do you use some kind of file or is there some kind of system log file
where a driver can write to in release mode ?

Thanks in advance.

Best regards.

Well, there’s always WPP Tracing:
https:</https:>
https:</https:>

There are other ETW-based methods as well… but logging unstructured data is what WPP was designed for.

Is that the sort of thing you’re asking about?

Peter
OSR
@OSRDrivers

Matthieu Collette wrote:

As mentioned, I would be interested to know what kind of solution do
you use to log some traces from within your driver when you are not in
debug mode.

Do you use some kind of file or is there some kind of system log file
where a driver can write to in release mode ?

This is a topic with which I still struggle. My drivers generally have
a compile-time switch to switch between WPP and DbgPrint. DbgPrint
sends to the kernel debugger even in release mode, so I can still use
DbgView to check the traces. When we get closer to a production
release, I’ll either #ifdef out the DbgPrint calls, or switch over to WPP.

I really do need to get more comfortable capturing and analyzing WPP
traces. I never remember the spelling of the commands, so I have to
Google it every time. DbgPrint is a much lazier solution.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

It’s not only much lazier (which in and of itself is a good thing, all other things being equal), for debugging it’s really a much more APPROPRIATE solution.

I don’t WANT to see my trace output in a separate program on the target system. I’m sitting at my dev box, writing code. The debugger console window is on the left… VS is right in front of me.

You’re working in the debugger… step, step, step… you WANT to see your debug output. You can scroll back through the buffer. You can search for the output text.

Yes… I realize you CAN get WPP output to appear in WinDbg using !wmitrace. I haven’t done it in a long while. I seem to recall it being a mess…

Peter
OSR
@OSRDrivers

I’ve become a fan of ETW tracing, either using manifests or TraceLogging flavors. Manifests are a bit higher performance and produce smaller trace files, but are more work. These have very little overhead when disabled, and can give detailed structured data. Manifest based logs also nicely integrate into the system log facility, just like the hundreds of built in log event providers. Having unified logs across many OS components is very powerful.

One reason I favor the ETW based tracing over WPP is events have an activity id, which is a unique key to group a set of related events. You also can have a related activity id, which allows grouping events into a hierarchy, although the current viewing tools are poor, so you end up having to manually follow the connections. There are lots of ways of capturing events, but it’s easy to generate overwhelming amounts of data, so meaningful filtering and event tagging is really important.

A plus of ETW tracing is it generates structured data, which means key/value pairs, which can then have more sophisticated analysis done. For example, you can easily write some code to analyze the distribution of values or distribution of time between certain events. With an unstructured text log, it’s harder to parse the text and turn it back into structured fields suitable for non-human analysis.

On the downside, ETW tracing is not platform portable and for initial bringup, printf like strings to the kernel debugger are just easier. A partial solution is I wrote some code that turns TraceLogging events into DbgPrintEx output, which may end up on github if I ever get the time. WPP can be convinced with some macros to log to WPP (ETW logging at its core) or go to DbrPrintEx or to other printf like output. WPP normally uses a preprocessor to strip the event metadata into your .PDB, so actual events only need to write the binary values. Formatting is then done with tools afterwards.

I usually use the Windows Performance Recorder to capture ETW logs, but have never understood how Windows Performance Analyzer is supposed to analyze them. I use either perfview (fast, simple pattern filtering) or “Microsoft Message Analyzer” (slower, but better filter expressions) to look at the files, ETW can log at rates like hundreds of thousands of events/sec, with a minimum of a few tenths of a microsecond between events. ETW events are a few hundred bytes each, so data rates get pretty high, so you need to use appropriate keyword/level values on each event for filtering at capture time.

Intel also sells a tracing framework which I believe they claim can log tens of millions of events/sec.

Jan

On 11/4/16, 8:49 AM, “xxxxx@lists.osr.com on behalf of Matthieu Collette” wrote:

Hi !

As mentioned, I would be interested to know what kind of solution do
you use to log some traces from within your driver when you are not in
debug mode.

Do you use some kind of file or is there some kind of system log file
where a driver can write to in release mode ?

Thanks in advance.

Best regards.


NTDEV is sponsored by OSR

Visit the list online at: http:

MONTHLY seminars on crash dump analysis, WDF, Windows internals and software drivers!
Details at http:

To unsubscribe, visit the List Server section of OSR Online at http:</http:></http:></http:>

I totally agree that DbgPrints to windbg are just way easier and more useful, for earlier development stages. You want to see the debug messages synchronously. Last time I tried using the WPP to windbg trace, it tended to defer output until IRQL dropped, so you would go step-step-step, hmm, NOW it prints the trace event from 20 steps ago, and now the context from that moment is gone.

For later debugging stages, debug spew to the debugger are less useful. Just yesterday I was working on a problem and captured a few million events (10 seconds of running), and could find the events after a failure, and by looking at the values in those events could search backwards twenty thousand events to find values that indicated the original first failure. The event rates were just way too high for any kind of windbg output.

Jan

On 11/4/16, 11:00 AM, “xxxxx@lists.osr.com on behalf of xxxxx@osr.com” wrote:



It’s not only much lazier (which in and of itself is a good thing, all other things being equal), for debugging it’s really a much more APPROPRIATE solution.

I don’t WANT to see my trace output in a separate program on the target system. I’m sitting at my dev box, writing code. The debugger console window is on the left… VS is right in front of me.

You’re working in the debugger… step, step, step… you WANT to see your debug output. You can scroll back through the buffer. You can search for the output text.

Yes… I realize you CAN get WPP output to appear in WinDbg using !wmitrace. I haven’t done it in a long while. I seem to recall it being a mess…

Peter
OSR
@OSRDrivers


NTDEV is sponsored by OSR

Visit the list online at: http:

MONTHLY seminars on crash dump analysis, WDF, Windows internals and software drivers!
Details at http:

To unsubscribe, visit the List Server section of OSR Online at http:</http:></http:></http:>

I like WPP.
WPP is fast, reliable and relative simple solution.
It also supported on Window XP and available for user mode applications.

DbgPrint is not a logging API but a kernel mode kind of “printf” or “print to console” API. ETW or WPP can be used for logging. The EventDrv sample is a good starting point for ETW.

https://github.com/Microsoft/Windows-driver-samples/tree/master/general/tracing/evntdrv/Eventdrv

Today I’ve just seen a POS (Point Of Sale) system that is running with XP and it is a nation wide conpany so they have the money to upgrade.

There are also some cash dispensers (banks) that are still running XP.

That’s why you may still need WPP.

I can’t agree that WPP is helpful back to XP. In fact, the only problems with WPP are its weird lack of cross-version compatibility in older versions of Windows. WPP works nicely from about Win7 and later. If you need to support Windows version before Win 7 (first of all, m very sorry and) you might want to think twice before committing to WPP.

Mr. Bottorf has advocated for ETW, especially TarveLogging before here. At least partially because of his advocacy, we committed to using TraceLogging in a major project we’ve got underway here at OSR. It’s quirky (in some really weird and unexpected ways … like its use of stack space) but it’s also very handy. Definitely a good tool to have in the chest.

Peter
OSR
@OSRDrivers

Quirky would match my experience of TraceLogging. At times I curse it and at times I really appreciate it. The stack space use is a little strange, and I’ve seen some strange errors reported by code analysis (cpp only). As TraceLogging is implemented as all macros, syntax errors can give pretty meaningless messages.

A thing that’s not apparent in TraceLogging is the use of the activity id fields, by using the TraceLoggingWriteActivity instead of TraceLoggingWrite. The event is always written with one activity id anyway, all zeros by default, so always making this a meaningful value has no significant impact on performance.

What’s very fuzzy is how to generate activity ids, which are defined as GUIDs, any 128-bit value seems to be fine. Currently I have a little function that does InterlockedIncrement64 on a counter in the device context, giving the low 64-bits, and I append the device context address as the high 64-bits, making the set of events for each device instance unique. A higher performance implementation could have a per core counter, with the initial values spread, so each core gets a huge range of unique counter values. The second activity id field can be used to create an association between groups of activity’s, and normally the second activity field is not written. I have also added fields to some of my data structures to store the activity id for a groups of events.

There is supposed to be an activity id field associated with a thread context, but I don’t see how to tell TraceLogging in a driver to use it instead of zeros. The concept is you are supposed to be able to set a new activity id in a user mode application, and that id flows down into the kernel, and kernel components can associate work with a parent activity id. In theory, if everybody properly set activity ids and related activity ids, you could follow a user mode event down though all the driver layers and associate the I/O completion with the original user mode work. This would be very useful for debugging and performance analysis, as it could connect things like the hardware accesses to an application HTTP request.

To give an example of how activity id’s fit together, on the NIC driver I have:

  1. Tx packets get a context structure allocated to hold tx state information, this is not extra for tracing but there is a field to store an activity id. If tracing of the tx path is enabled (there is a function to test keywords) I generate a unique activity id for the specific packet, and may write a begin transmit event, using the TraceLoggingOpcode macro to tag this event as the beginning of an operation. You can tag events as begin/end and use that to calculate the elapsed time. I believe Windows Performance Analyzer understands these Opcode tags, although I’ve never figured out exactly what It wants. Microsoft should write a document called “Writing events formatted to get the best use of Windows Performance Analyzer”.
  2. Events are written for tx activity, like triggering the mailbox register for the hw, all tagged with the activity of the matching packet.
  3. The tx ISR can generate a new activity id for the interrupt, and stores it in an interrupt context. My hardware has a mask bit for each interrupt, and the first time the tx interrupt fires, I set the mask bit (and software flag) and generate a new activity id. My hardware takes a bit of time for disabling the interrupt to happen (posted write), and I can see spurious interrupts (MSIX writes already inflight). My ISR notices if an interrupt happens when the software flag says they are disabled, and generates a trace event, using the activity id from the first interrupt. This allows me to easily see when analyzing the trace, the initial interrupt and any spurious interrupts after that initial one.
  4. My DPC uses the activity id generated in the ISR to tag the ring descriptors processed, making it easy to see which descriptors were associated with a specific interrupt. As descriptors get processed, the associated tx packet context can also be used to write a completion event for the original packet. This is one of the places I use the second activity id. This tags events such that I can find the interrupt activity id that caused the completion of a specific tx packet, and can find all the tx packets that were completed as a result of a single interrupt, or the time from when you wrote the tx hw mailbox to the time you got the completion interrupt for that packet.
  5. My driver also needs to avoid triggering the DPC watchdog, so it may pass work off to a thread to continue processing tx completion descriptors. I generate a new activity id when this thread starts up, and specify the original interrupt activity id and the new thread invocation activity id. My thread uses the thread activity id when processing descriptors, so I can separately group the descriptors processed in the DPC and the thread, and by following the activity id chains, can determine which packets were completed by the DPC, and which by the thread, and associate them all back to the original interrupt. All these events have timestamps, and it’s very useful see the time interval between ISR, and DPC activation, and handing work off to a passive thread.

I frequently generate traces with a few (or tens) million events and it becomes really difficult to find the correct events for debugging some issue. When viewing, subsetting events based on activity ids that are stored in some context, generally eliminates the need to subset events per processor. The full trace in timestamp order, across many processors, can be pretty hard to make sense of. Having a number of keywords, and subsetting for one keyword and one core makes the trace a lot more readable without any activity ids.

A limitation I find of this tracing systems is it’s not zero overhead. The system seems to use a clock of processor timestamp counter / 1024, so I get about 0.3 microsecond timestamp resolution on a 3 Ghz processor, and Message Analyzer seems to displays timestamps to 100 nanosecond resolution. Back to back events sometimes have the same timestamp, so I believe the event processing time is in the 0.1 to 0.3 microsecond range. I believe it would be possible to have a tracing system with an order of magnitude (or more) better resolution, and lower overhead. I can trace a few hundred thousand interrupts/sec but can’t trace 20 million descriptors/sec, unless I write summary events with the current system. There are TraceLogging macros for arrays of values, so you can sometimes write summary events with much of the same data, although the viewing tools don’t know that one event is really data from multiple events, and you only get one timestamp, unless you capture your own.

Something that would be super useful would be a tool to view the events that understood how activity ids can form chains. Using Message Analyzer, you can build up complex filter expressions by looking at the activity id field (right click add to filter), which is one of the ETW event header fields. It would be nicer to have a display mode that allows traversing forward and backward across the network of activity id associations. It’s like you’re trying to get a display of all packets at step 7, that had a common activity at step 4. This is really useful because often when debugging, the root cause problem fans out, and you only detect the problem after its spread. You really want to interactively navigate this node network, expanding/contracting future/past branches on demand.

Windows Performance Analyzer seems to understand TraceLogging events, but I’ve never had much success getting it to do what I want. Perhaps I’m just holding it wrong. I do almost always use Windows Performance Recorder to capture the traces, although the control files you have to make are a pain. Message Analyzer 1.4 can do event captures, but seems too slow and drops events compared to Performance Recorder.

I also use perfview, see https://github.com/Microsoft/perfview Perfview is fast, and can really easily filter on the event name and time range, and can do text pattern filters, but can’t easily build compound filter expressions like you get in Message Analyzer. It also has a bunch of .Net specific stuff you need to ignore.

Jan

On 11/6/16, 5:38 AM, “xxxxx@lists.osr.com on behalf of xxxxx@osr.com” wrote:

It’s quirky (in some really weird and unexpected ways … like its use of stack space) but it’s also very handy. Definitely a good tool to have in the chest.

On 2016-11-04 16:53:55 +0000, xxxxx@osr.com said:

Well, there’s always WPP Tracing:
https:</https:>
https:</https:>

There are other ETW-based methods as well… but logging unstructured
data is what WPP was designed for.

Is that the sort of thing you’re asking about?

Peter
OSR
@OSRDrivers

I don’t need to log structured data at the moment, thanks a lot for the
two links, I’ll have a look at both !

Hi Tilm !

On 2016-11-04 17:03:06 +0000, Tim Roberts said:

Matthieu Collette wrote:
> As mentioned, I would be interested to know what kind of solution do
> you use to log some traces from within your driver when you are not in
> debug mode.
>
> Do you use some kind of file or is there some kind of system log file
> where a driver can write to in release mode ?

This is a topic with which I still struggle. My drivers generally have
a compile-time switch to switch between WPP and DbgPrint. DbgPrint
sends to the kernel debugger even in release mode, so I can still use
DbgView to check the traces. When we get closer to a production
release, I’ll either #ifdef out the DbgPrint calls, or switch over to WPP.
Do you need to do set something specific to see your traces in DbgView
using DbgPrint in release mode ?

I really do need to get more comfortable capturing and analyzing WPP
traces. I never remember the spelling of the commands, so I have to
Google it every time. DbgPrint is a much lazier solution.

Matthieu Collette wrote:

On 2016-11-04 17:03:06 +0000, Tim Roberts said:
> This is a topic with which I still struggle. My drivers generally have
> a compile-time switch to switch between WPP and DbgPrint. DbgPrint
> sends to the kernel debugger even in release mode, so I can still use
> DbgView to check the traces. When we get closer to a production
> release, I’ll either #ifdef out the DbgPrint calls, or switch over to WPP.
Do you need to do set something specific to see your traces in DbgView
using DbgPrint in release mode ?

No. I call DbgPrint, which isn’t dependent on debug/release mode. I
set a debug level in each message, so I control the output that way.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

I realize this is an older thread, but I was wondering, @Jan_Bottorff, if, in addition to logging structured events in your ETW solution, you are still able to log any printf-like messages as well? We’re currently using WPP but would also to log structured events for xperf, etc.

Thanks!

I realize this is an older thread

Then why did you post to it.

Did you not read the guidelines for this forum that say posting to old threads is not allowed??

Start a new thread if you have a question for Mr. Bottorff, or anybody else here on the forum.

Peter

1 Like