minifilter BSOD on FilterReplyMessage

Hi,

I have a BSOD in a minifilter whose cause I can't identify. It happens when FilterReplyMessage is called from user mode. The point is that the request does NOT reach my driver; the BSOD happens before that point.

As a side question: if the parameters of FilterReplyMessage were wrong, shouldn't it fail with an error instead of a BSOD?

The BSOD is NOT reproducible at will, but I have seen a couple of them in the last few days on several machines that run automated stress tests in loops.

If anyone has any hints, I would greatly appreciate them.

BugCheck 8E, {c0000005, bad2faba, b7736ac0, 0}

PEB is paged out (Peb.Ldr = 7ffdd00c). Type ".hh dbgerr001" for details
PEB is paged out (Peb.Ldr = 7ffdd00c). Type ".hh dbgerr001" for details
Probably caused by : fltmgr.sys ( fltmgr!FltpFilterReply+78 )

Followup: MachineOwner

0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: bad2faba, The address that the exception occurred at
Arg3: b7736ac0, Trap Frame
Arg4: 00000000

Debugging Details:

PEB is paged out (Peb.Ldr = 7ffdd00c). Type ".hh dbgerr001" for details
PEB is paged out (Peb.Ldr = 7ffdd00c). Type ".hh dbgerr001" for details

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

FAULTING_IP:
fltmgr!FltpFilterReply+78
bad2faba 8b5008 mov edx,dword ptr [eax+8]

TRAP_FRAME: b7736ac0 -- (.trap 0xffffffffb7736ac0)
ErrCode = 00000000
eax=08c25d5e ebx=08c25d5e ecx=86a8c150 edx=9090f0eb esi=86a8c130 edi=0180fed4
eip=bad2faba esp=b7736b34 ebp=b7736b80 iopl=0 ov up ei ng nz na po cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010a83
fltmgr!FltpFilterReply+0x78:
bad2faba 8b5008 mov edx,dword ptr [eax+8] ds:0023:08c25d66=????????
Resetting default scope

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0x8E

PROCESS_NAME: ??????.exe

LAST_CONTROL_TRANSFER: from 80827817 to 80822f33

STACK_TEXT:
b7736688 80827817 0000008e c0000005 bad2faba nt!KeBugCheckEx+0x1b
b7736a50 8086b085 b7736a6c 00000000 b7736ac0 nt!KiDispatchException+0x3b1
b7736ab8 8086b036 b7736b80 bad2faba badb0d00 nt!CommonDispatchException+0x4d
b7736b80 bad39ecf 85d921b0 0180fed4 00000028 nt!KiExceptionExit+0x18a
b7736bac bad2da37 85d92100 880b8f00 0180fed4 fltmgr!FltpMsgDeviceControl+0x7b
b7736bf0 bad2df7f 86c2e540 880b8f68 86c2e540 fltmgr!FltpMsgDispatch+0x87
b7736c1c 8081818f 86c2e540 880b8f68 80a0f428 fltmgr!FltpDispatch+0x33
b7736c2c 80981128 85d921b0 80a0f410 880b8f68 nt!IopfCallDriver+0x31
b7736c50 808a8982 880b8fd8 85d921b0 880b8f68 nt!IovCallDriver+0xa0
b7736c64 808a97f7 86c2e540 880b8f68 85d921b0 nt!IopSynchronousServiceTail+0x70
b7736d00 808a2274 00000730 00000000 00000000 nt!IopXxxControlFile+0x5c5
b7736d34 8086a61c 00000730 00000000 00000000 nt!NtDeviceIoControlFile+0x2a
b7736d34 7c90e4f4 00000730 00000000 00000000 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
0180fdd4 00000000 00000000 00000000 00000000 0x7c90e4f4

have a nice day,

Sandor

Looks like a data corruptor.

My suggestion: set the debugger context to that of the trap frame and
then grab the stack backtrace.

The question from the information you did post is “from where did EAX
originate?” It looks like it is expected to be a data structure of some
sort but what you have isn’t a data structure. So, look back in the code
stream to see if you can figure out where EAX was loaded - if you figure
out what that containing structure is, you’ll know what EAX was supposed
to be (assuming you know the structure declaration for that precursor.)

This might just be a frustrating memory corruptor. Those are always
tough to track down. The key is to try and figure out if there is a
pattern - is it in your structure or an OS structure; in either case
that will hopefully point you in the right direction for finding the
guilty culprit.

Tony
OSR

Thank you Tony,

EAX is derived from FltpFilterReply’s first parameter, like this:

fltmgr!FltpFilterReply:
bad2fa42 6a30 push 30h

bad2fa9b 8b4508 mov eax,dword ptr [ebp+8]
bad2fa9e 8b7010 mov esi,dword ptr [eax+10h]
bad2faa1 83c608 add esi,8

bad2faac 8d4e20 lea ecx,[esi+20h]
bad2faaf 8b01 mov eax,dword ptr [ecx]

bad2faba 8b5008 mov edx,dword ptr [eax+8] <== this is where it crashes

Obviously these are data structures; however, without the sources, I can’t figure out the details.
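
A quick cross-check of the listing against the trap frame, as a minimal sketch in C. It uses only the register values from the dump; the function and constant names are made up for illustration, and the 80000000 kernel boundary assumes the default 2 GB user/kernel split on 32-bit XP.

```c
#include <assert.h>
#include <stdint.h>

/* Register values copied from the trap frame in the bugcheck output. */
static const uint32_t TRAP_ESI = 0x86a8c130u;
static const uint32_t TRAP_ECX = 0x86a8c150u;
static const uint32_t TRAP_EAX = 0x08c25d5eu;

/* Models: lea ecx,[esi+20h] */
uint32_t lea_ecx(uint32_t esi) { return esi + 0x20u; }

/* Address read by: mov edx,dword ptr [eax+8] */
uint32_t fault_address(uint32_t eax) { return eax + 8u; }

/* Assumption: default 2 GB split, so kernel addresses start at 0x80000000. */
int is_kernel_address(uint32_t va) { return va >= 0x80000000u; }

/* Asserts that the dumped registers are consistent with the listing. */
void check_listing(void)
{
    assert(lea_ecx(TRAP_ESI) == TRAP_ECX);
    assert(fault_address(TRAP_EAX) == 0x08c25d66u);
}
```

ECX matches ESI+20h, and EAX+8 is 08c25d66, the address WinDbg shows as unreadable. EAX itself is below 80000000 and not 4-byte aligned, so it does not look like a kernel pointer, which would be consistent with the data-corruption reading above.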

One important hint: I have NOT seen this problem on any OS other than 32-bit XP SP3 (although we are actively looking for other cases).

have a nice day,
Sandor

Just to close this thread: this is a bug in Windows XP where a race between FilterReplyMessage and CloseHandle causes FltMgr to dereference an invalid pointer. This has been fixed in SRV03 (Windows Server 2003) and newer OSes. The workaround is to make sure that the user-mode component doesn’t call FilterReplyMessage on an already closed port.

Regards,
Alex.
This posting is provided “AS IS” with no warranties, and confers no rights.
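
One possible way to apply the workaround above, as a minimal sketch rather than anything taken from this thread: keep the port handle behind a lock, so a reply can never overlap with the close or run after it. Everything here is hypothetical. `guarded_port`, `reply_fn` and `close_fn` are illustrative names; in a real user-mode component the two callbacks would wrap FilterReplyMessage and CloseHandle, and the pthread mutex (used only so the sketch builds anywhere) would be a CRITICAL_SECTION or SRWLOCK.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for FilterReplyMessage and CloseHandle. */
typedef int  (*reply_fn)(void *port, const void *reply, size_t len);
typedef void (*close_fn)(void *port);

typedef struct {
    pthread_mutex_t lock;  /* held across both reply and close */
    void    *port;         /* NULL once the port has been closed */
    reply_fn reply;
    close_fn close;
} guarded_port;

int guarded_port_init(guarded_port *g, void *port, reply_fn r, close_fn c)
{
    assert(g != NULL && r != NULL && c != NULL);
    g->port  = port;
    g->reply = r;
    g->close = c;
    return pthread_mutex_init(&g->lock, NULL);
}

/* Returns false, without calling reply(), if the port is already closed. */
bool guarded_reply(guarded_port *g, const void *reply, size_t len, int *status)
{
    bool sent = false;
    pthread_mutex_lock(&g->lock);
    if (g->port != NULL) {
        *status = g->reply(g->port, reply, len);
        sent = true;
    }
    pthread_mutex_unlock(&g->lock);
    return sent;
}

/* Waits for an in-flight reply to finish, then closes at most once. */
void guarded_close(guarded_port *g)
{
    pthread_mutex_lock(&g->lock);
    if (g->port != NULL) {
        g->close(g->port);
        g->port = NULL;
    }
    pthread_mutex_unlock(&g->lock);
}

/* Demo stand-ins that only count calls. */
static int demo_replies, demo_closes;
static int demo_reply(void *p, const void *r, size_t n)
{
    (void)p; (void)r; (void)n;
    demo_replies++;
    return 0;
}
static void demo_close(void *p) { (void)p; demo_closes++; }
```

A plain mutex serializes replies from different threads. If that matters, a reader-writer lock (replies take it shared, close takes it exclusive) keeps the same guarantee while letting replies run concurrently.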