Hi,
I know this bug is very hard to trace, but it is very important
for us, because in one very particular configuration (OS, AV, apps and
our driver) it is very easy to reproduce.
The bugcheck is 0x4E with Arg1 = 0x9A (an attempt to free pool that is
still locked for I/O). The WinDBG output follows at the end of this e-mail.
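To relate the bugcheck arguments to the buffer being freed, here is a minimal sketch. It assumes Arg2 (e6b8 in this dump) is the page frame number, which is how the public bug check 0x4E reference describes it for Arg1 = 0x9A; that is an assumption, not something verified on this build. The helper names and the 4 KB page size are also assumptions for illustration.

```c
#include <stdint.h>

/* Assumption: 4 KB pages, as on 32-bit x86 without large pages. */
#define PAGE_SHIFT_4K 12u

/* Convert a page frame number to the physical address of the page start. */
static uint64_t pfn_to_physical(uint64_t pfn)
{
    return pfn << PAGE_SHIFT_4K;
}

/* Round a 32-bit virtual address down to the start of its page. */
static uint32_t va_page_base(uint32_t va)
{
    return va & ~((1u << PAGE_SHIFT_4K) - 1u);
}
```

With the values from this dump, pfn_to_physical(0xe6b8) gives 0xe6b8000, and ff345000 is already page-aligned, so the freed buffer starts on a page boundary. If the assumption about Arg2 holds, the page could then be inspected with !pfn e6b8.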
Most of the time (around 40% of crash dumps) it occurs in our driver on
the same line, very rarely on another line, and the rest (40%) in one of
the AV or OS (ntfs, afd, rdp) drivers. The quirky part is that when it
occurs in our driver, the line where it occurs is guaranteed not to free
any locked pool (in fact, it is our temporary pool, which is not shared
with other drivers, and in the few rare cases it is not even used, i.e.
simply allocated and deallocated with no use). I am sure the buffer
address is not corrupted between the pre- and post-operation callbacks
for QUERY_DIRECTORY, because for testing, the buffer had two fields for
comparison (they match at the time of the crash).
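For reference, the comparison check works roughly like the sketch below. This is a simplified user-mode illustration, not the actual driver code: it assumes the address is stored twice in the pre-to-post context, and the structure layout, the field name workBufCheck and the helper names are made up. Only workBuf is a name from the real source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative stand-in for the pre-to-post context (layout is assumed). */
typedef struct {
    void *workBuf;       /* temporary buffer allocated in the pre-operation */
    void *workBufCheck;  /* second copy of the same address, for comparison */
} P2PContext;

/* Pre-operation: allocate the buffer and record its address twice. */
static bool ctx_init(P2PContext *ctx, size_t size)
{
    assert(ctx != NULL);
    ctx->workBuf = malloc(size);
    ctx->workBufCheck = ctx->workBuf;
    return ctx->workBuf != NULL;
}

/* Post-operation: the address is trusted only if both copies still agree. */
static bool ctx_pointer_intact(const P2PContext *ctx)
{
    assert(ctx != NULL);
    return ctx->workBuf == ctx->workBufCheck;
}

/* Free the buffer only when the check passes; a mismatch is left leaked
   on purpose so the corrupted address is never handed to the allocator. */
static void ctx_release(P2PContext *ctx)
{
    assert(ctx != NULL);
    if (ctx_pointer_intact(ctx))
        free(ctx->workBuf);
    ctx->workBuf = NULL;
    ctx->workBufCheck = NULL;
}
```

A check like this only shows that the stored address was not overwritten. It says nothing about the state of the page behind that address, which is what the bugcheck complains about.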
I’ve done a !search on the pool address to see which MDLs have it locked
(apparently many!), but a !search for each MDL address does not lead to
any active IRP.
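To make the MDL check concrete: an MDL locks whole pages, so any MDL whose page span includes the page at ff345000 counts as a candidate, even if its buffer starts on an earlier page. Below is a minimal user-mode sketch of that overlap test. The struct is a hypothetical, simplified stand-in that mirrors only the StartVa, ByteOffset and ByteCount fields of a real MDL, and the function name is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000u  /* assumption: 4 KB pages (32-bit x86) */

/* Hypothetical, simplified view of the MDL fields that matter here. */
typedef struct {
    uint32_t start_va;     /* page-aligned base, like MDL.StartVa */
    uint32_t byte_offset;  /* offset into the first page, like MDL.ByteOffset */
    uint32_t byte_count;   /* buffer length in bytes, like MDL.ByteCount */
} SimpleMdl;

/* Return true if the pages described by the MDL include the page holding va. */
static bool mdl_covers_page(const SimpleMdl *mdl, uint32_t va)
{
    assert(mdl != NULL);
    if (mdl->byte_count == 0)
        return false;

    /* One past the last byte of the buffer, in 64 bits to avoid wraparound. */
    uint64_t end = (uint64_t)mdl->start_va + mdl->byte_offset + mdl->byte_count;
    uint64_t last_page = (end - 1) & ~(uint64_t)(PAGE_SIZE_4K - 1);
    uint64_t page = va & ~(PAGE_SIZE_4K - 1);

    return page >= mdl->start_va && page <= last_page;
}
```

For example, a buffer of 0x1000 bytes starting at ff344800 spans the pages ff344000 and ff345000, so it covers the page being freed even though its StartVa is ff344000.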
Turning on Driver Verifier (DV) with selective or all options made no
real difference (harder to repro, but we still got a BSOD with the same
bugcheck/Arg1).
So as not to make this post too long, does anyone have any idea
what else I can try? WinDBG output attached.
(If the AV authors would like to investigate this, I can provide
several dumps where the bugcheck apparently occurs in their driver. It
is obvious some driver is corrupting memory, but I can’t track it down
any further.)
BugCheck 4E, {9a, e6b8, 6, 9}
*** ERROR: Symbol file could not be found. Defaulted to export symbols for mfehidk.sys -
Probably caused by : VShield.sys ( VShield!DirCtrlPostOp+ce )
Followup: MachineOwner
kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
PFN_LIST_CORRUPT (4e)
Typically caused by drivers passing bad memory descriptor lists (ie: calling
MmUnlockPages twice with the same list, etc). If a kernel debugger is
available get the stack trace.
Arguments:
Arg1: 0000009a,
Arg2: 0000e6b8
Arg3: 00000006
Arg4: 00000009
Debugging Details:
DEFAULT_BUCKET_ID: DRIVER_FAULT
BUGCHECK_STR: 0x4E
PROCESS_NAME: mssearch.exe
CURRENT_IRQL: 0
LAST_CONTROL_TRANSFER: from 80861a71 to 80826659
STACK_TEXT:
f74eea44 80861a71 0000004e 0000009a 0000e6b8 nt!KeBugCheckEx+0x1b
f74eea60 8088b751 81193c20 808a7bc0 00dede38 nt!MiBadRefCount+0x33
f74eea98 8088c5ad ff345000 813d1348 813d13fc nt!MiFreePoolPages+0x5cf
f74eeaec 8088cb25 6b725761 00000000 f74eeb24 nt!ExFreePoolWithTag+0x277
f74eeafc f84ada6e ff345000 00000000 00000000 nt!ExFreePool+0xf
f74eeb24 f820db83 813d13a4 f74eeb48 ff9f24e0 VShield!DirCtrlPostOp+0xce [f:\projects\alfaff\driver\afm_afp\dirctrl.c @ 198]
f74eeb8c f820ffe0 003d1348 00000000 813d1348 fltMgr!FltpPerformPostCallbacks+0x1c5
f74eeba0 f821050f 813d1348 814017f8 f74eebe0 fltMgr!FltpProcessIoCompletion+0x10
f74eebb0 f8210ba1 821b2840 814017f8 813d1348 fltMgr!FltpPassThroughCompletion+0x89
f74eebe0 f8210d03 f74eec00 80000006 00000000 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x269
f74eec18 8081d39d 821b2840 814017f8 81c3a8f8 fltMgr!FltpDispatch+0x11f
f74eec2c f72fdaec 814017f8 f74eeccc 808e6cb6 nt!IofCallDriver+0x45
WARNING: Stack unwind information not available. Following frames may be wrong.
f74eecac f730738e 81e9ed00 814017f8 f74eece4 mfehidk+0x6aec
f74eecbc f73073de f74eeccc 81ede678 81821cf0 mfehidk+0x1038e
f74eece4 8081d39d 81821cf0 814017f8 00bbf9ac mfehidk!DEVICEDISPATCH::DispatchPassThrough+0x48
f74eecf8 808ec789 f74eed64 00bbf9ac 808e6cb6 nt!IofCallDriver+0x45
f74eed0c 808e6d13 81821cf0 814017f8 81e9ed00 nt!IopSynchronousServiceTail+0x10b
f74eed30 80882fa8 00000284 00000000 00000000 nt!NtQueryDirectoryFile+0x5d
f74eed30 7c82ed54 00000284 00000000 00000000 nt!KiFastCallEntry+0xf8
00bbf9f4 00000000 00000000 00000000 00000000 0x7c82ed54
STACK_COMMAND: kb
FOLLOWUP_IP:
VShield!DirCtrlPostOp+ce [f:\projects\alfaff\driver\afm_afp\dirctrl.c @ 198]
f84ada6e 8b4dfc mov ecx,dword ptr [ebp-4]
FAULTING_SOURCE_CODE:
194: if(p2pCtx)
195: {
196: AFF_ReleaseContext(lpContext, lpFsCtx);
197: ExFreePool(p2pCtx->workBuf);
198: ExFreeToNPagedLookasideList(&Pre2Post_List, p2pCtx);
199: }
200: return retValue;
201: }
(The error occurs at the ExFreePool line, 197. The value of
p2pCtx->workBuf is ff345000, the same as the first parameter of
ExFreePool on the stack. This is DirCtrlPostOp, and this code is
executed directly, i.e. it is the only code in the PostOp.)
SYMBOL_STACK_INDEX: 5
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: VShield
IMAGE_NAME: VShield.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 4623e42e
SYMBOL_NAME: VShield!DirCtrlPostOp+ce
FAILURE_BUCKET_ID: 0x4E_VShield!DirCtrlPostOp+ce
BUCKET_ID: 0x4E_VShield!DirCtrlPostOp+ce
Followup: MachineOwner
--
Kind regards, Dejan
http://www.alfasp.com
File system audit, security and encryption kits.