IRP passed to IoCompleteRequest still has cancel routine set

I’m hitting an issue while testing my minifilter on Win2k3 sp1 checked build. I have not tried to reproduce this on any other platform as of now. It’s most likely due to my filter corrupting some memory. Before I start to go down that path, please share if you have seen this problem in any other situation. Note that being a minifilter, I don’t touch the irp.

The problem happens when multiple instances of the test appln are communicating with the filter using communication port. I have not seen this problem before, but I’m still in the early stages of testing.

DRIVER_VERIFIER_IOMANAGER_VIOLATION (c9)
The IO manager has caught a misbehaving driver.
Arguments:
Arg1: 00000007, IRP passed to IoCompleteRequest still has cancel routine set
Arg2: ffffffff, the cancel routine pointer
Arg3: 828c1890, the IRP
Arg4: 00000000, 0

BUGCHECK_STR: 0xc9_7

DRIVER_VERIFIER_IO_VIOLATION_TYPE: 7

IRP_CANCEL_ROUTINE:
+ffffffffffffffff
ffffffff ?? ???

FAULTING_IP:
+ffffffffffffffff
ffffffff ?? ???

FOLLOWUP_IP:
+ffffffffffffffff
ffffffff ?? ???

IRP_ADDRESS: 828c1890

DEVICE_OBJECT: 8606aab8

DRIVER_OBJECT: 86072e10

IMAGE_NAME: disk.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 42435b4a

MODULE_NAME: disk

FAULTING_MODULE: f74d7000 disk

DEFAULT_BUCKET_ID: DRIVER_FAULT

PROCESS_NAME: FilterTest.exe

CURRENT_IRQL: 2

LAST_CONTROL_TRANSFER: from 80874b79 to 8081d98e

STACK_TEXT:
f78a6978 80874b79 00000003 00000007 00000000 nt!RtlpBreakWithStatusInstruction
f78a69c4 80875996 00000003 00000000 85bbf988 nt!KiBugCheckDebugBreak+0x19
f78a6d5c 80875dae 000000c9 00000007 ffffffff nt!KeBugCheck2+0x5b2
f78a6d7c 809cc8b1 000000c9 00000007 ffffffff nt!KeBugCheckEx+0x1b
f78a6df8 f73417ab f78a6e28 f7341ba9 8606aab8 nt!IovCompleteRequest+0x13e
f78a6e00 f7341ba9 8606aab8 828c1890 00000001 CLASSPNP!ClassCompleteRequest+0x11
f78a6e28 809cc283 00000000 86800f48 85bbf988 CLASSPNP!TransferPktComplete+0x1fd
f78a6e4c 80828ddf 00000000 86800f48 f78a6eb0 nt!IovpLocalCompletionRoutine+0xb4
f78a6e7c 809cc80d 828c14f0 828c14f0 85bbfa34 nt!IopfCompleteRequest+0xcd
f78a6ee8 f735551f 86061570 86800f48 f78a6f2c nt!IovCompleteRequest+0x9a
f78a6ef8 f7354a7c 828c14f0 00000001 00000000 SCSIPORT!SpCompleteRequest+0x5e
f78a6f2c f73541d8 86061570 828c14f0 f78a6fa3 SCSIPORT!SpProcessCompletedRequest+0x6a7
f78a6fa4 80828878 8606152c 860614b8 00000000 SCSIPORT!ScsiPortCompletionDpc+0x2bd
f78a6ff4 80820813 b9d2794c 00000000 00000000 nt!KiRetireDpcList+0xca
f78a6ff8 b9d2794c 00000000 00000000 00000000 nt!KiDispatchInterrupt+0x38
WARNING: Frame IP not in any known module. Following frames may be wrong.
80820813 00000000 0000000a bb835b75 00000128 0xb9d2794c

While looking at this particular IRP, except the cancel routine the other part of it seems fine. This makes me wonder whether this is really a corruption issue.

kd> !irp 828c1890 1
Irp is active with 9 stacks 4 is current (= 0x828c196c)
Mdl=bacbbb48: No System Buffer: Thread 82e45020: Irp stack trace.
Flags = 40000043
ThreadListEntry.Flink = 828c18a0
ThreadListEntry.Blink = 828c18a0
IoStatus.Status = 00000000
IoStatus.Information = 00001000
RequestorMode = 00000000
Cancel = 00
CancelIrql = 0
ApcEnvironment = 00
UserIosb = bacbbd54
UserEvent = bacbbbdc
Overlay.AsynchronousParameters.UserApcRoutine = 00000000
Overlay.AsynchronousParameters.UserApcContext = 00000000
Overlay.AllocationSize = 00000000 - 00000000
CancelRoutine = ffffffff
UserBuffer = 00000000
&Tail.Overlay.DeviceQueueEntry = 828c18d0
Tail.Overlay.Thread = 82e45020
Tail.Overlay.AuxiliaryBuffer = 00000000
Tail.Overlay.ListEntry.Flink = 00000000
Tail.Overlay.ListEntry.Blink = 00000000
Tail.Overlay.CurrentStackLocation = 828c196c
Tail.Overlay.OriginalFileObject = 826b88a0
Tail.Apc = 00000000
Tail.CompletionKey = 00000000
cmd flg cl Device File Completion-Context
[0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000
[0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000
[0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000

[4,34] 0 e1 8606aab8 00000000 f74c8582-8606a880 Success Error Cancel pending
\Driver\Disk PartMgr!PmIoCompletion
Args: 00001000 00000000 c2168e00 00000001
[4, 0] 0 e1 8606a880 00000000 f73f6558-86021c40 Success Error Cancel pending
\Driver\PartMgr ftdisk!FtpRefCountCompletionRoutine
Args: e9b5520a 0000005a c2168e00 00000001
[4, 0] 0 e1 86021b88 00000000 f73a0638-8601fbd0 Success Error Cancel pending
\Driver\Ftdisk volsnap!VspRefCountCompletionRoutine
Args: e9b551f9 0000005a c2161000 00000001
[4, 0] 0 e1 8601fb18 00000000 f7204779-bacbb6d0 Success Error Cancel pending
\Driver\VolSnap Ntfs!NtfsSingleSyncCompletionRoutine
Args: 00001000 00000000 c2161000 00000001
[4, 0] 0 e0 85fe8020 826b88a0 f72d4d04-828c16d8 Success Error Cancel
\FileSystem\Ntfs fltmgr!FltpPassThroughCompletion
Args: 00001000 00000000 00000000 00000000
[4, 0] 0 1 85bd8620 826b88a0 00000000-00000000 pending
\FileSystem\FltMgr
Args: 00001000 00000000 00000000 00000000
kd> dd 828c1890
828c1890 01b40006 bacbbb48 40000043 00000000
828c18a0 828c18a0 828c18a0 00000000 00001000
828c18b0 04090000 80000000 bacbbd54 bacbbbdc
828c18c0 00000000 00000000 ffffffff 00000000
828c18d0 00000000 00000000 00000000 00000000
828c18e0 82e45020 00000000 00000000 00000000
828c18f0 828c196c 826b88a0 00000000 00000000
828c1900 00000000 00000000 00000000 00000000

FIRST of all, KUDOS to you for testing your driver on the checked build. Too few people do that, and it’s still pretty important. Don’t forget to test with Driver Verifier enabled, too :slight_smile:

Given that the cancel routine address is 0xFFFFFFFF it’s obviously SOME kind of corruption problem, yes?

And whenever I hear things like “this only happens with multiple instances of my testapp running” I automatically think “race condition” and “bug in locking strategy”.

Does your minifilter register a cancel routine in the IRP at any point?

One place to consider for corruption (not that I’m sure it would manifest this way) is if you allocate any IRPs in your driver… make sure that your driver calls IoGetNextIrpStackLocation (and not IoGetCurrentIrpStackLocation) to get the appropriate pointer to setup the IRP stack before passing the IRP along. I’ve seen this bug over and over (and the now discontinued tool Call Usage Verifier was good at catching this bug, too bad it isn’t in th WDK anymore, huh?).

Peter
OSR

Ofcourse verifier is enabled and call usage verifier is also linked against. I’m not doing anything with an IRP at any point in my filter. No cancel routines too.

It’s possible that someone is decrementing the refcount at the wrong address. The problem is manifesting in different places and most recent being below:

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: f9f9f9e9, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: 80833848, address which referenced memory

Debugging Details:

READ_ADDRESS: f9f9f9e9 Nonpaged pool expansion

CURRENT_IRQL: 2

FAULTING_IP:
nt!CcGetDirtyPages+97
80833848 66813efd02 cmp word ptr [esi],2FDh

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xA

PROCESS_NAME: System

TRAP_FRAME: b9bf38a4 – (.trap 0xffffffffb9bf38a4)
ErrCode = 00000000
eax=85c5e3d8 ebx=00000000 ecx=f9f9f9f9 edx=b9bf3924 esi=f9f9f9e9 edi=85c5e3c8
eip=80833848 esp=b9bf3918 ebp=b9bf397c iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206
nt!CcGetDirtyPages+0x97:
80833848 66813efd02 cmp word ptr [esi],2FDh ds:0023:f9f9f9e9=???
Resetting default scope

LAST_CONTROL_TRANSFER: from 80874b79 to 8081d98e

STACK_TEXT:
b9bf34a0 80874b79 00000003 f9f9f9e9 00000000 nt!RtlpBreakWithStatusInstruction
b9bf34ec 80875996 00000003 f9f9f9e9 80833848 nt!KiBugCheckDebugBreak+0x19
b9bf3884 80826493 0000000a f9f9f9e9 00000002 nt!KeBugCheck2+0x5b2
b9bf3884 80833848 0000000a f9f9f9e9 00000002 nt!KiTrap0E+0x2a1
b9bf397c f723696c e1307688 f7206013 b9bf3c28 nt!CcGetDirtyPages+0x97
b9bf3bf4 f72378f1 b9bf3c28 85fe8100 00000000 Ntfs!NtfsCheckpointVolume+0x6e1
b9bf3d80 808203cb 00000000 00000000 848d6a60 Ntfs!NtfsCheckpointAllVolumes+0xd2
b9bf3dac 80905d1c 00000000 00000000 00000000 nt!ExpWorkerThread+0xeb
b9bf3ddc 80828499 8082030e 80000000 00000000 nt!PspSystemThreadStartup+0x2e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

STACK_COMMAND: kb

FOLLOWUP_IP:
nt!CcGetDirtyPages+97
80833848 66813efd02 cmp word ptr [esi],2FDh

SYMBOL_STACK_INDEX: 4

SYMBOL_NAME: nt!CcGetDirtyPages+97

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: nt

IMAGE_NAME: ntoskrnl.exe

DEBUG_FLR_IMAGE_TIMESTAMP: 45ebe4e1

FAILURE_BUCKET_ID: 0xA_VRF_nt!CcGetDirtyPages+97

BUCKET_ID: 0xA_VRF_nt!CcGetDirtyPages+97

I’m now working backwards by disabling some recent modifications to see when I introduced this issue. With some luck I might find out.

Sajeev

May be I’m inching closer to the problem. Looks like fltmgr expects a signature of F104, whereas the current value is F103. This reinforces the fact that I’m wrongly derefencing some address instead of the object I’m supposed to. Let me have a closer look at my ref/deref logic.

*** Assertion failed: ((&(IrpCtrl)->Type) != NULL) && ((&(IrpCtrl)->Type)->Signature == (0xF104))
*** Source File: d:\srvrtm\base\fs\filtermgr\filter\callbacksup.c, line 2531

kd> kb
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr Args to Child
ba2664f8 f72c3dbd 83e248c8 80887252 83e248c8 fltmgr!FltpCompleteCompletionNode+0x2f
ba26655c f72d36f2 00e248c8 83e248c8 ba266584 fltmgr!FltpPerformPostCallbacks+0x4ff
ba26656c f72d4ebb 83e248c8 839aa608 ba26660c fltmgr!FltpProcessIoCompletion+0x52
ba266584 809cc283 a53ceee8 839aa608 83e248c8 fltmgr!FltpPassThroughCompletion+0x1b7
ba2665a8 80828ddf a53ceee8 839aa608 ba26660c nt!IovpLocalCompletionRoutine+0xb4
ba2665d8 809cc80d 839aa608 e56879e0 00000000 nt!IopfCompleteRequest+0xcd
ba266644 f7206b6d ba266858 e56879e0 ba266848 nt!IovCompleteRequest+0x9a
ba266654 f7204cf3 ba266858 839aa608 00000000 Ntfs!NtfsCompleteRequest+0xc8
ba266848 f7204df7 ba266858 839aa608 0108070a Ntfs!NtfsCommonWrite+0x2069
ba2669c4 809cc57d 89434718 839aa608 8609d000 Ntfs!NtfsFsdWrite+0x16a
ba2669f4 80853664 f72d5e9d ba266a30 f72d5e9d nt!IovCallDriver+0x112
ba266a00 f72d5e9d 80a6ea90 887dcee8 00000000 nt!IofCallDriver+0x13
ba266a30 809cc57d 887dcee8 839aa774 80887252 fltmgr!FltpDispatch+0x29d
ba266a60 80853664 f72d58e2 ba266a94 f72d58e2 nt!IovCallDriver+0x112
ba266a6c f72d58e2 00000000 839aa608 a53ceee8 nt!IofCallDriver+0x13
ba266a94 f72d5d7a ba266ab4 a53ceee8 00000000 fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x3cc
ba266ad0 809cc57d a53ceee8 839aa608 839b4f90 fltmgr!FltpDispatch+0x17a
ba266b00 80853664 80841f53 ba266b20 80841f53 nt!IovCallDriver+0x112
ba266b0c 80841f53 00000000 839b4aa0 ba266d54 nt!IofCallDriver+0x13
ba266b20 80841a68 839b4f09 ba266b48 ba266c04 nt!IoSynchronousPageWrite+0xaf

kd> dt 83e248c8 _IRP_CTRL
fltmgr!_IRP_CTRL
+0x000 Type : _FLT_TYPE
+0x004 Flags : 0x20000004 (No matching name)
+0x008 MajorFunction : 0x4 ‘’
+0x009 Reserved0 : 0 ‘’
+0x00a CompletionStackLength : 0x1 ‘’
+0x00b NextCompletion : 0x1 ‘’
+0x00c CompletionStack : 0x83e2497c _COMPLETION_NODE
+0x010 SyncEvent : _KEVENT
+0x020 Irp : 0x839aa608 _IRP
+0x020 FsFilterData : 0x839aa608 _FS_FILTER_CALLBACK_DATA
+0x024 AsyncCompletionRoutine : (null)
+0x028 AsyncCompletionContext : (null)
+0x02c InitiatingInstance : (null)
+0x030 PendingCallbackNode : 0xffffffff _CALLBACK_NODE
+0x030 StartingCallbackNode : 0xffffffff _CALLBACK_NODE
+0x034 preOp : __unnamed
+0x034 postOp : __unnamed
+0x038 PostCompletionRoutine : (null)
+0x03c DeviceObject : 0xa53ceee8 _DEVICE_OBJECT
+0x040 FileObject : 0x839b4f90 _FILE_OBJECT
+0x044 WorkQueueItem : _WORK_QUEUE_ITEM
+0x044 PendingCallbackContext : (null)
+0x048 CachedCompletionNode : (null)
+0x04c PendingStatus : 0
+0x054 CreateIrp : __unnamed
+0x05c Data : _FLT_CALLBACK_DATA
+0x088 WorkingParameters : _FLT_IO_PARAMETER_BLOCK

kd> dt 83e248c8 _FLT_TYPE
fltmgr!_FLT_TYPE
+0x000 Signature : 0xf103
+0x002 Size : 0xfc