"Incorrect RegHandle value specified as parameter for EtwUnregister" error from HLK tests

Hello,

I use TraceLogging framework in my driver (based on KMDF and NetAdapterCx). When running HLK tests, many of them fail (for example DF - PNP Surprise Remove Device Test (Reliability)) with the same error:

DRIVER_VERIFIER_DETECTED_VIOLATION (c4)
Arg1: 00000000000000dc, Incorrect RegHandle value specified as parameter for EtwUnregister.

I call TraceLoggingRegister in DriverEntry and TraceLoggingUnregister in EvtDriverUnload:

_Use_decl_annotations_
VOID
EvtDriverUnload(_In_ WDFDRIVER driver)
{
    UNREFERENCED_PARAMETER(driver);

    TraceLoggingUnregister(OvpnTraceProvider);
}

NTSTATUS
DriverEntry(_In_ PDRIVER_OBJECT driverObject, _In_ PUNICODE_STRING registryPath)
{
    NTSTATUS status;
    BOOLEAN traceLoggingRegistered = FALSE;
    GOTO_IF_NOT_NT_SUCCESS(done, status, TraceLoggingRegister(OvpnTraceProvider));
    traceLoggingRegistered = TRUE;
    
    WDF_DRIVER_CONFIG driverConfig;
    WDF_DRIVER_CONFIG_INIT(&driverConfig, EvtDeviceAdd);
    driverConfig.EvtDriverUnload = EvtDriverUnload;
    GOTO_IF_NOT_NT_SUCCESS(done, status, WdfDriverCreate(driverObject, registryPath, &driverAttrs, &driverConfig, WDF_NO_HANDLE));

Stack trace doesn’t reveal much - it looks like framework calls driver’s Unload callback which calls TraceLoggingUnregister which triggers verifier error:

2: kd> kn
 # Child-SP          RetAddr               Call Site
00 fffffd8e`db110568 fffff807`68e66eb4     nt!KeBugCheckEx
01 fffffd8e`db110570 fffff807`68e6cc81     nt!VerifierBugCheckIfAppropriate+0xe0
02 fffffd8e`db1105b0 fffff807`68e7c713     nt!VfTargetEtwUnregister+0xb1
03 fffffd8e`db110600 fffff807`69ab6719     nt!VerifierEtwUnregister+0x13
04 (Inline Function) --------`--------     Wdf01000!FxDriverUnload::Invoke+0xc [minkernel\wdf\framework\shared\inc\private\common\FxDriverCallbacks.hpp @ 97] 
05 fffffd8e`db110630 fffff80c`30268792     Wdf01000!FxDriver::Unload+0x285a9 [minkernel\wdf\framework\shared\core\fxdriver.cpp @ 189] 
06 fffffd8e`db110680 ffffd389`ce291f30     ovpn_dco+0x8792
07 fffffd8e`db110688 ffffd389`ce291f30     0xffffd389`ce291f30
08 fffffd8e`db110690 ffffd389`ce291f10     0xffffd389`ce291f30
09 fffffd8e`db110698 00000000`00000000     0xffffd389`ce291f10

FxDriverUnload looks like this:

    //
    // Invoke the driver if they specified an unload routine.
    //
    if (pDriver->m_DriverUnload.Method) {
        pDriver->m_DriverUnload.Invoke(pDriver->GetHandle());
        DoTraceLevelMessage(pFxDriverGlobals, TRACE_LEVEL_VERBOSE, TRACINGDRIVER,
                            "Driver unload routine Exit WDFDRIVER %p, PDRIVER_OBJECT_UM %p",
                            pDriver->GetHandle(), DriverObject);
    }

m_DriverUnload is set to EvtDriverUnload in the same fxdriver.cpp file:

m_DriverUnload.Method     = Config->EvtDriverUnload;

And now the interesting part - this error doesn’t happen when testing debug version of the driver, it happens only with release one. Also it doesn’t happen when TraceLoggingRegister and TraceLoggingUnregister calls are commented out, but this is expected.

I’ve spent some time trying to reproduce this outside of HLK tests but haven’t got any luck. Has anybody experienced this with TraceLogging framework? Any ideas why this happens with release only? Are there some easy ways to simulate “PNP Surprise Remove Device Test” manually, outside of HLK?

Where your OvpnTraceProvider is placed? The docum states it should be in nonpaged pool.

(I changed the name from OvpnTraceProvider to g_hOvpnEtwProvider).

This is a global variable. Header file:

#include <TraceLoggingProvider.h>

TRACELOGGING_DECLARE_PROVIDER(g_hOvpnEtwProvider);

CPP file:

TRACELOGGING_DEFINE_PROVIDER(g_hOvpnEtwProvider,
“OVPN”,
(0x4970f9cf, 0x2c0c, 0x4f11, 0xb1, 0xcc, 0xe3, 0xa1, 0xe9, 0x95, 0x88, 0x33));

Does it mean that it is in nonpaged pool?

I found an interesting thing.

If I add DbgPrint call inside TraceLoggingUnregister, which is called from OvpnEvtDriverUnload, everything works fine. I compared dumpbin output with and without DbgPrint and turns out that OvpnEvtDriverUnload is only mentioned in “Guard CF Function Table”

    Guard CF Function Table

          Address
          --------       
     0000000140003A30  ?OvpnEvtDriverUnload@@YAXPEAUWDFDRIVER__@@@Z (void __cdecl OvpnEvtDriverUnload(struct WDFDRIVER__ *))
    

but is missing in function table. However, with DbgPrint it is presented both in “Guard CF Function Table” and “Function Table”:

    Guard CF Function Table

          Address
          --------           
     0000000140003A30  ?OvpnEvtDriverUnload@@YAXPEAUWDFDRIVER__@@@Z (void __cdecl OvpnEvtDriverUnload(struct WDFDRIVER__ *))
Function Table (93)

           Begin    End      Info      Function Name
  00000180 00003A30 00003A60 0000F3E0  ?OvpnEvtDriverUnload@@YAXPEAUWDFDRIVER__@@@Z (void __cdecl OvpnEvtDriverUnload(struct WDFDRIVER__ *))
    Unwind version: 2
    Unwind flags: None
    Size of prologue: 0x04
    Count of codes: 3
    Unwind codes:
      05: EPILOG, flags = 0x1, size = 0x5
      00: EPILOG, offset from end = 0x0
      04: ALLOC_SMALL, size=0x28

Does it mean that compiler by some reasons optimizes it out? Why is that?

Since my OvpnEvtDriverUnload looks like this:

VOID
OvpnEvtDriverUnload(_In_ WDFDRIVER driver)
{
    UNREFERENCED_PARAMETER(driver);

    TraceLoggingUnregister(g_hOvpnEtwProvider);
}

I wonder if compiler got confused by TraceLoggingUnregister, which is macro in 19041 WDK, and optimizes it out. I fixed it by wrapping function definition in #pragma optimize("", off) #pragma optimize("", on).

Your last comment rings a bell. The optimizer is aggressive here and tail call optimization eliminates this function incorrectly. Your solution to turn the optimizer off is correct. An alternative is to put a _nop() call immediately after the unregister call as it prevents the tail call optimization while still allowing other optimizer algorithms.

1 Like

@Doron_Holan said:
Your last comment rings a bell. The optimizer is aggressive here and tail call optimization eliminates this function incorrectly.

Thanks Doron,

I added __nop() after the unregister call. Do you happen to know if MSVC folks are aware of this and the incorrect optimization has been fixed in recent compiler versions?` If not, I could create a sample driver project which exposes this issue and file a bug.

I wonder if compiler got confused by TraceLoggingUnregister, which is macro in 19041 WDK,

It’s not a macro, it’s an __inline function. So, you have a function which consists of nothing but a call to an inline function. It’s possible that the loader optimizer moved the inline function into the FxDriverUnload call, since that’s the only reference, so that your function no longer exists as an independent entity, but that doesn’t explain how it would lose the value of g_hOvpnEtwProvider.

That proposal is also a bit weak, because the driver unload routine is only referenced through a pointer in driverConfig. it MUST exist somewhere.

Scary.

Compiler: “You don’t need this function call. I’ll just remove it.”

Seriously?

Peter

It’s not a macro, it’s an __inline function

I am confused here.

c:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared\TraceLoggingProvider.h, like 445

  /*
  Determine the full (ODR-safe) name for tracelogging functions that are affected
  by macro overrides.
  */
  #define TraceLoggingUnregister     _tlg_PASTE2(TraceLoggingUnregister_,  TLG_EVENT_UNREGISTER)

and 200 lines below:

_IRQL_requires_max_(PASSIVE_LEVEL)
TLG_INLINE
void _tlg_CALL
TraceLoggingUnregister(
    TraceLoggingHProvider _Inout_ hProvider);

It’s possible that the loader optimizer moved the inline function into the FxDriverUnload call, since that’s the only reference, so that your function no longer exists as an independent entity

This looks like what is happening, according to stack trace - FxDriverUnload calls EtwUnregister.

00 fffffd8e`db110568 fffff807`68e66eb4     nt!KeBugCheckEx
01 fffffd8e`db110570 fffff807`68e6cc81     nt!VerifierBugCheckIfAppropriate+0xe0
02 fffffd8e`db1105b0 fffff807`68e7c713     nt!VfTargetEtwUnregister+0xb1
03 fffffd8e`db110600 fffff807`69ab6719     nt!VerifierEtwUnregister+0x13
04 (Inline Function) --------`--------     Wdf01000!FxDriverUnload::Invoke+0xc [minkernel\wdf\framework\shared\inc\private\common\FxDriverCallbacks.hpp @ 97] 

but that doesn’t explain how it would lose the value of g_hOvpnEtwProvider. That proposal is also a bit weak, because the driver unload routine is only referenced through a pointer in driverConfig. it MUST exist somewhere.

I think I use TraceLogging API according to documentation. I admit I don’t understand why optimizer breaks it and I would be interested in knowing why. I don’t know how to investigate it further but I am open to suggestions. So far, disable TCO by adding __nop is good enough workaround for me.