While doing some live migration testing on a Win2k3 VM, I ran into this bugcheck after 20 or so migrations under load. Two things confuse me:
- There is no obvious way that this code can be running at IRQL 0x1b, and the function that calls this one starts with a PAGED_CODE macro. Can the recorded IRQL ever be bogus?
- We store and initialize some KEVENT objects in space mapped from a CmResourceTypeMemory resource. Something tells me that this could be a bad idea, but I don't know the technical reason why.
The faulting address 0x00040051 is the value of KEVENT->Header.Lock (262145 = 0x40001) plus 0x50.
0: kd> dt -b KEVENT f8a9482c
dump_XenVbd!KEVENT
+0x000 Header : _DISPATCHER_HEADER
+0x000 Type : 0x1 ''
+0x001 Abandoned : 0 ''
+0x001 Absolute : 0 ''
+0x001 NpxIrql : 0 ''
+0x001 Signalling : 0 ''
+0x002 Size : 0x4 ''
+0x002 Hand : 0x4 ''
+0x003 Inserted : 0 ''
+0x003 DebugActive : 0 ''
+0x003 DpcActive : 0 ''
+0x000 Lock : 262145
+0x004 SignalState : 1
+0x008 WaitListHead : _LIST_ENTRY [0xf8a95834 - 0xf8a94834]
+0x000 Flink : 0xf8a95834
+0x004 Blink : 0xf8a94834
0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 00040051, memory referenced
Arg2: d000001b, IRQL
Arg3: 00000001, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: 808312b0, address which referenced memory
Debugging Details:
WRITE_ADDRESS: 00040051
CURRENT_IRQL: 1b
FAULTING_IP:
nt!KiUnwaitThread+8
808312b0 095650 or dword ptr [esi+50h],edx
DEFAULT_BUCKET_ID: DRIVER_FAULT
BUGCHECK_STR: 0xA
PROCESS_NAME: System
TRAP_FRAME: f88a2c8c -- (.trap 0xfffffffff88a2c8c)
ErrCode = 00000002
eax=f8a95834 ebx=00000001 ecx=00040001 edx=00000100 esi=00040001 edi=f8a94834
eip=808312b0 esp=f88a2d00 ebp=f88a2d04 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010282
nt!KiUnwaitThread+0x8:
808312b0 095650 or dword ptr [esi+50h],edx ds:0023:00040051=????????
Resetting default scope
LAST_CONTROL_TRANSFER: from 808312b0 to 8088c963
STACK_TEXT:
f88a2c8c 808312b0 badb0d00 00000100 00000000 nt!KiTrap0E+0x2a7
f88a2d04 8082818e 00000001 821a1020 8214c018 nt!KiUnwaitThread+0x8
f88a2d20 f8266173 00a9482c 00000001 00000000 nt!KeSetEvent+0x84
f88a2d38 f8266937 8215b608 15120f0c 00000001 xenpci!XenPciSuspendPDO+0x93 [c:\projects\ovmwinpvdriver.hg\xenpci\pnp_fdo.c @ 1306]
f88a2d6c 808ec1eb 8214c018 814bcd08 808ae5fc xenpci!XenPciSuspendWorker+0x137 [c:\projects\ovmwinpvdriver.hg\xenpci\pnp_fdo.c @ 1535]
f88a2d80 80880441 81d96408 00000000 821a1020 nt!IopProcessWorkItem+0x13
f88a2dac 80949b7c 81d96408 00000000 00000000 nt!ExpWorkerThread+0xeb
f88a2ddc 8088e062 80880356 00000001 00000000 nt!PspSystemThreadStartup+0x2e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16