Deadlock waiting for KEVENT on the stack in EVT_WDF_DEVICE_SELF_MANAGED_IO_SUSPEND

I have a USB driver with a driver created thread that takes care of some of my shutdown procedure. This includes making sure that no IN nor OUT transfers are active after I return from EVT_WDF_DEVICE_SELF_MANAGED_IO_SUSPEND. In EVT_WDF_DEVICE_SELF_MANAGED_IO_SUSPEND I create a KEVENT on the stack and I pass this to the driver created thread, and when no transfer are active then I signal the event. The problem I am seeing is that the thread that is calling EVT_WDF_DEVICE_SELF_MANAGED_IO_SUSPEND is stuck waiting on the KEVENT, even though I can see that the condition for the event to be signaled is fullfilled (so it should have been signaled).

In locks.doc I can see the following written:

Similarly, a driver might wait on a synchronization event in its DispatchPnp routine to ensure that I/O has completed before stopping or removing the device.
… so I think that what I am doing is not a completely bad idea.

The EVT_WDF_DEVICE_SELF_MANAGED_IO_SUSPEND callback is not marked as pageable (unsure if that would make a difference).
EVT_WDF_DEVICE_SELF_MANAGED_IO_SUSPEND is always called at PASSIVE_LEVEL, so I guess waiting on a KEVENT is fine.

I am looking for ideas on how to debug this, or if someone can spot any mistake I am doing.
Just let me know if anyone needs more info.

Adding some more info about what I have tried…

This is the stack trace where things are stuck waiting:

ffffdc80`d76b89a0 fffff801`1a21c9c0     : 00000000`0000000f 00000000`ffffffff 00000000`00000000 ffff8d88`5bf57158 : nt!KiSwapContext+0x76
ffffdc80`d76b8ae0 fffff801`1a21beef     : 1d7fa027`000004f6 00000000`00000000 ffffdc80`d76b8ca0 fffff801`1a217a42 : nt!KiSwapThread+0x500
ffffdc80`d76b8b90 fffff801`1a21b793     : 00000000`00000000 00000000`00000000 00000000`00000000 ffff8d88`57698180 : nt!KiCommitThreadWait+0x14f
ffffdc80`d76b8c30 fffff801`6cc4ae9b     : ffffdc80`d76b8d70 fffff801`00000000 ffff8d88`5a0cdf00 00000000`00000000 : nt!KeWaitForSingleObject+0x233
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : USBDriver!WaitableCallback::wait+0x1b (Inline Function @ fffff801`6cc4ae9b) [C:\Users\WDKRemoteUser.DESKTOP-1ODDJ59\code\suite\lib\usb\windows\USBDriver\USBDriver\WaitableCallback.h @ 25] 
ffffdc80`d76b8d20 fffff801`1c6e9d8f     : ffff8d88`5c8c37d8 ffff8d88`5c8c3830 ffff8d88`5a0cdc80 00000000`0000000c : USBDriver!USBDriverEvtDeviceSelfManagedIoSuspend+0xcb [C:\Users\WDKRemoteUser.DESKTOP-1ODDJ59\code\suite\lib\usb\windows\USBDriver\USBDriver\Device.cpp @ 486] 
ffffdc80`d76b8da0 ffff8d88`5c8c37d8     : ffff8d88`5c8c3830 ffff8d88`5a0cdc80 00000000`0000000c 00000000`0000000c : 0xfffff801`1c6e9d8f
ffffdc80`d76b8da8 ffff8d88`5c8c3830     : ffff8d88`5a0cdc80 00000000`0000000c 00000000`0000000c fffff801`1c6e27df : 0xffff8d88`5c8c37d8
ffffdc80`d76b8db0 ffff8d88`5a0cdc80     : 00000000`0000000c 00000000`0000000c fffff801`1c6e27df 00000000`00000000 : 0xffff8d88`5c8c3830
ffffdc80`d76b8db8 00000000`0000000c     : 00000000`0000000c fffff801`1c6e27df 00000000`00000000 fffff801`1c7650e0 : 0xffff8d88`5a0cdc80
ffffdc80`d76b8dc0 00000000`0000000c     : fffff801`1c6e27df 00000000`00000000 fffff801`1c7650e0 ffff8d88`5a0cdc80 : 0xc
ffffdc80`d76b8dc8 fffff801`1c6e27df     : 00000000`00000000 fffff801`1c7650e0 ffff8d88`5a0cdc80 fffff801`1c6e0c20 : 0xc
ffffdc80`d76b8dd0 00000000`00000000     : fffff801`1c7650e0 ffff8d88`5a0cdc80 fffff801`1c6e0c20 ffff8d88`5c8c37d8 : 0xfffff801`1c6e27df

I would like to inspect the KEVENT object more closely.

I was reading here about something similar here: https://community.osr.com/discussion/33572/finding-owner-of-kevent

The “Args to child” columns for the call to KeWaitForSingleObject in my stack trace is:

ffffdc80d76b8d70 fffff80100000000 ffff8d885a0cdf00 0000000000000000

… and if I do as in the above link then I get:

3: kd> dt -b _KEVENT 0xfffff801
intelppm!_KEVENT
+0x000 Header : _DISPATCHER_HEADER
+0x000 Lock : ??
+0x000 LockNV : ??
+0x000 Type : ??
+0x001 Signalling : ??
+0x002 Size : ??
+0x003 Reserved1 : ??
+0x000 TimerType : ??
+0x001 TimerControlFlags : ??
+0x001 Absolute : ??
+0x001 Wake : ??
+0x001 EncodedTolerableDelay : ??
+0x002 Hand : ??
+0x003 TimerMiscFlags : ??
+0x003 Index : ??
+0x003 Inserted : ??
+0x003 Expired : ??
+0x000 Timer2Type : ??
+0x001 Timer2Flags : ??
+0x001 Timer2Inserted : ??
+0x001 Timer2Expiring : ??
+0x001 Timer2CancelPending : ??
+0x001 Timer2SetPending : ??
+0x001 Timer2Running : ??
+0x001 Timer2Disabled : ??
+0x001 Timer2ReservedFlags : ??
+0x002 Timer2ComponentId : ??
+0x003 Timer2RelativeId : ??
+0x000 QueueType : ??
+0x001 QueueControlFlags : ??
+0x001 Abandoned : ??
+0x001 DisableIncrement : ??
+0x001 QueueReservedControlFlags : ??
+0x002 QueueSize : ??
+0x003 QueueReserved : ??
+0x000 ThreadType : ??
+0x001 ThreadReserved : ??
+0x002 ThreadControlFlags : ??
+0x002 CycleProfiling : ??
+0x002 CounterProfiling : ??
+0x002 GroupScheduling : ??
+0x002 AffinitySet : ??
+0x002 Tagged : ??
+0x002 EnergyProfiling : ??
+0x002 SchedulerAssist : ??
+0x002 ThreadReservedControlFlags : ??
+0x003 DebugActive : ??
+0x003 ActiveDR7 : ??
+0x003 Instrumented : ??
+0x003 Minimal : ??
+0x003 Reserved4 : ??
+0x003 AltSyscall : ??
+0x003 UmsScheduled : ??
+0x003 UmsPrimary : ??
+0x000 MutantType : ??
+0x001 MutantSize : ??
+0x002 DpcActive : ??
+0x003 MutantReserved : ??
+0x004 SignalState : ??
+0x008 WaitListHead : _LIST_ENTRY
+0x000 Flink : ???
+0x008 Blink : ???
Memory read error 00000000fffff811

… which does’nt look correct.

First, you can see that 0xfffff801 is very different from 0xfffff80100000000, can’t you?

However, that’s the wrong parameter anyway. Look at the documentation for KeWaitForSingleObject. The object is first, ffffdc80`d76b8d70. The second parameter is the wait reason, which is 0 in this case (it’s a 32-bit value passed on a 64-bit stack). Essentially every system structure will be aligned on an 8 byte boundary, so an address that doesn’t end in 0 or 8 is probably wrong.

@Tim_Roberts said:
First, you can see that 0xfffff801 is very different from 0xfffff80100000000, can’t you?

Yes, sorry, my bad.

Is the fffff801 in fffff801 00000000 just some garbage or is it typical for 32 bit value passed on a 64 bit stack?

Thanks! The output below looks better. I´m guessing that it is

+0x004 SignalState : 0n0

… that tells me that the event is not signaled.

1: kd> dt -b _KEVENT 0xffffdc80d76b8d70
nt!_KEVENT
   +0x000 Header           : _DISPATCHER_HEADER
      +0x000 Lock             : 0n393216
      +0x000 LockNV           : 0n393216
      +0x000 Type             : 0 ''
      +0x001 Signalling       : 0 ''
      +0x002 Size             : 0x6 ''
      +0x003 Reserved1        : 0 ''
      +0x000 TimerType        : 0 ''
      +0x001 TimerControlFlags : 0 ''
      +0x001 Absolute         : 0y0
      +0x001 Wake             : 0y0
      +0x001 EncodedTolerableDelay : 0y000000 (0)
      +0x002 Hand             : 0x6 ''
      +0x003 TimerMiscFlags   : 0 ''
      +0x003 Index            : 0y000000 (0)
      +0x003 Inserted         : 0y0
      +0x003 Expired          : 0y0
      +0x000 Timer2Type       : 0 ''
      +0x001 Timer2Flags      : 0 ''
      +0x001 Timer2Inserted   : 0y0
      +0x001 Timer2Expiring   : 0y0
      +0x001 Timer2CancelPending : 0y0
      +0x001 Timer2SetPending : 0y0
      +0x001 Timer2Running    : 0y0
      +0x001 Timer2Disabled   : 0y0
      +0x001 Timer2ReservedFlags : 0y00
      +0x002 Timer2ComponentId : 0x6 ''
      +0x003 Timer2RelativeId : 0 ''
      +0x000 QueueType        : 0 ''
      +0x001 QueueControlFlags : 0 ''
      +0x001 Abandoned        : 0y0
      +0x001 DisableIncrement : 0y0
      +0x001 QueueReservedControlFlags : 0y000000 (0)
      +0x002 QueueSize        : 0x6 ''
      +0x003 QueueReserved    : 0 ''
      +0x000 ThreadType       : 0 ''
      +0x001 ThreadReserved   : 0 ''
      +0x002 ThreadControlFlags : 0x6 ''
      +0x002 CycleProfiling   : 0y0
      +0x002 CounterProfiling : 0y1
      +0x002 GroupScheduling  : 0y1
      +0x002 AffinitySet      : 0y0
      +0x002 Tagged           : 0y0
      +0x002 EnergyProfiling  : 0y0
      +0x002 SchedulerAssist  : 0y0
      +0x002 ThreadReservedControlFlags : 0y0
      +0x003 DebugActive      : 0 ''
      +0x003 ActiveDR7        : 0y0
      +0x003 Instrumented     : 0y0
      +0x003 Minimal          : 0y0
      +0x003 Reserved4        : 0y00
      +0x003 AltSyscall       : 0y0
      +0x003 UmsScheduled     : 0y0
      +0x003 UmsPrimary       : 0y0
      +0x000 MutantType       : 0 ''
      +0x001 MutantSize       : 0 ''
      +0x002 DpcActive        : 0x6 ''
      +0x003 MutantReserved   : 0 ''
      +0x004 SignalState      : 0n0
      +0x008 WaitListHead     : _LIST_ENTRY [ 0xffff8d88`57698180 - 0xffff8d88`57698180 ]
         +0x000 Flink            : 0xffff8d88`57698180 
         +0x008 Blink            : 0xffff8d88`57698180 

Is the fffff801 in fffff801 00000000 just some garbage or is it typical for 32 bit value passed on a 64 bit stack?

It’s just leftover randomness. When you’re doing 32-bit computations, the instructions only use the 32-bit registers (like eax), and the top half just retains whatever garbage it had before.

ok, thanks!

BTW, I figured out why the event wasn´t signaled. It was a coding mistake from my side :slight_smile: