Trying to understand BSOD stack in FLTMGR

Hello there,

I’m using a custom Dokan-based file system implementation. Recently, I started running into a BSOD in Filter Manager. I’m not sure what began triggering it; it could be either the transition to Windows 10 (from Windows 7) or new circumstances (possibly more filters attaching since we started to use more AV/security software). The BSOD itself is not frequent, but seems to happen randomly when provoked by intense activity. So far I have seen it locally and have a dump from another W10 user with the same issue. No reports from W7 users so far.

I’m trying to understand what leads to this BSOD. Per the dump, Filter Manager is being asked to provide a name for a file object belonging to my file system; the file object itself looks OK, and so do the “high-level” Filter Manager objects involved (FLT_VOLUME, FLT_INSTANCE). I’m not so sure about the low-level objects (cache node, name generation context), since some of the types don’t seem to be public, so I’m only guessing at what I’m looking at.

So far I have tried reproducing with Verifier enabled on my FS driver and got the usual BSOD without any Verifier info. I tried enabling Verifier on FLTMGR as well, but was unable to repro and gave up quickly due to the impact on the machine as a whole. I haven’t tried this on a debug machine yet, since setting the scenario up is somewhat tricky; I thought I would learn as much as possible from the dump first.

I assume the issue goes like “FLTMGR has valid inputs, asks my driver for information on the file object, gets something ugly in return, and meets a bad end”. I’m wondering what exactly “asks for information” may mean FS-wise. That is, when FltGetFileNameInformation is called, what requests does/can FLTMGR make to the FS backing the given file object?

BSOD info:
PAGE_FAULT_IN_NONPAGED_AREA (50)
Arg1: ffffaf07705ab000, memory referenced.
Arg2: 0000000000000000, value 0 = read operation
Arg3: fffff8030fadc9a5
Arg4: 0000000000000000, (reserved)

00 ffff9c014e7d08a8 fffff8011660dbfa nt!KeBugCheckEx
01 ffff9c014e7d08b0 fffff801164e90fd nt!MiSystemFault+0xfffea
02 ffff9c014e7d09a0 fffff801165ecffc nt!MmAccessFault+0x27d
03 ffff9c014e7d0ba0 fffff8030fadc9a5 nt!KiPageFault+0x13c
04 ffff9c014e7d0d38 fffff8030fb05112 FLTMGR!memcpy+0xa5
05 ffff9c014e7d0d40 fffff8030fb04e59 FLTMGR!FltpGetFileName+0x242
06 ffff9c014e7d0de0 fffff8030fb05887 FLTMGR!FltpGetOpenedFileName+0x19
07 ffff9c014e7d0e10 fffff8030fb056f7 FLTMGR!FltpCallOpenedFileNameHandler+0x2b
08 ffff9c014e7d0e60 fffff8030fb0565e FLTMGR!FltpGetNormalizedFileNameWorker+0x2f
09 ffff9c014e7d0ea0 fffff8030fb04c9c FLTMGR!FltpGetNormalizedFileName+0x1a
0a ffff9c014e7d0ef0 fffff8030fad6bbf FLTMGR!FltpCreateFileNameInformation+0x32c
0b ffff9c014e7d0f40 fffff8030fad7b20 FLTMGR!FltpGetFileNameInformation+0x38f
0c ffff9c014e7d0fe0 fffff8031041bd99 FLTMGR!FltGetFileNameInformation+0x1b0
0d ffff9c014e7d1060 fffff8031041c286 fileinfo!FIStreamQueryInfo+0xc9
0e ffff9c014e7d10f0 fffff8030fad3d15 fileinfo!FIPostCreateCallback+0x216

Thanks for advice,
L.

> Arg1: ffffaf07705ab000, memory referenced.

That looks suspiciously like a buffer overflow in a verified system (or even
an unverified one). What does “!pool ffffaf07705aaFFF” say about that
area?

I’d also be suspicious about your filesystem’s support for Stream Contexts,
but that has no basis in anything other than the “smell of the problem”.

You might also want to insert procmon or MSpy between fileInfo and your
filesystem…

> So far I tried reproducing with Verifier enabled on my FS driver, got
> usual BSOD without any Verifier info.
> I tried enabling Verifier on FLTMGR as well, but was unable to repro and
> gave up quickly due to impact on the machine as a whole.

If you turn on verifier for fileInfo then fltmgr will flip into a verifier
mode for operations from that filter…

R

Hello Rod,

thanks for the quick response. The memory just below the faulting address is:

ffffaf07705aaed0 size: 130 previous size: 110 (Allocated) *FMfn
Pooltag FMfn : NAME_CACHE_NODE structure, Binary : fltmgr.sys

I tried inspecting that one previously; the content seems “incorrect”/overwritten (though the “I’m not sure what I’m looking at” caveat applies here too).

11: kd> dt FLTMGR!_NAME_CACHE_NODE ffffaf07705aaee0
+0x000 Type : _FLT_TYPE
+0x008 ProvidingInstance : 0xffff890c41c18c70 _FLT_INSTANCE
+0x010 CreationTime : _LARGE_INTEGER 0xffff890c41c18c70
+0x018 TreeLink : _TREE_NODE
+0x050 NameInfo : _FLT_FILE_NAME_INFORMATION
+0x0c8 UseCount : 0n3211314

Curiously, the content of the memory does seem to make sense in other ways:

11: kd> db ffffaf07705aaee0 ffffaf07705ab000
ffffaf07705aaee0  00 00 88 03 07 af ff ff-70 8c c1 41 0c 89 ff ff  ........p..A....
ffffaf07705aaef0  70 8c c1 41 0c 89 ff ff-88 03 00 00 00 00 00 00  p..A............
ffffaf07705aaf00  1e 01 00 00 5c 00 55 00-73 00 65 00 72 00 73 00  ....\.U.s.e.r.s.
ffffaf07705aaf10  5c 00 6c 00 6b 00 72 00-65 00 73 00 74 00 61 00  \.l.k.r.e.s.t.a.
ffffaf07705aaf20  5c 00 41 00 70 00 70 00-44 00 61 00 74 00 61 00  \.A.p.p.D.a.t.a.
ffffaf07705aaf30  5c 00 4c 00 6f 00 63 00-61 00 6c 00 5c 00 50 00  \.L.o.c.a.l.\.P.
ffffaf07705aaf40  61 00 63 00 6b 00 61 00-67 00 65 00 73 00 5c 00  a.c.k.a.g.e.s.\.
ffffaf07705aaf50  4d 00 69 00 63 00 72 00-6f 00 73 00 6f 00 66 00  M.i.c.r.o.s.o.f.
ffffaf07705aaf60  74 00 2e 00 4d 00 69 00-63 00 72 00 6f 00 73 00  t...M.i.c.r.o.s.
ffffaf07705aaf70  6f 00 66 00 74 00 45 00-64 00 67 00 65 00 5f 00  o.f.t.E.d.g.e._.
ffffaf07705aaf80  38 00 77 00 65 00 6b 00-79 00 62 00 33 00 64 00  8.w.e.k.y.b.3.d.
ffffaf07705aaf90  38 00 62 00 62 00 77 00-65 00 5c 00 41 00 43 00  8.b.b.w.e.\.A.C.
ffffaf07705aafa0  5c 00 23 00 21 00 31 00-32 00 31 00 5c 00 4d 00  \.#.!.1.2.1.\.M.
ffffaf07705aafb0  69 00 63 00 72 00 6f 00-73 00 6f 00 66 00 74 00  i.c.r.o.s.o.f.t.
ffffaf07705aafc0  45 00 64 00 67 00 65 00-5c 00 43 00 61 00 63 00  E.d.g.e.\.C.a.c.
ffffaf07705aafd0  68 00 65 00 5c 00 49 00-30 00 56 00 4a 00 4d 00  h.e.\.I.0.V.J.M.
ffffaf07705aafe0  4c 00 59 00 59 00 5c 00-49 00 6e 00 73 00 74 00  L.Y.Y.\.I.n.s.t.
ffffaf07705aaff0  61 00 6e 00 63 00 65 00-53 00 74 00 00 00 00 00  a.n.c.e.S.t.....
ffffaf07`705ab000 ?? ?
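
One possible reading of the dump above, offered as a sketch rather than a diagnosis: if the bytes starting at ffffaf07705aaf00 are assumed to follow the FILE_NAME_INFORMATION layout (a ULONG FileNameLength followed directly by the WCHAR name), the declared length can be compared with the bytes that actually exist before the unmapped address ffffaf07705ab000. That layout assumption is not confirmed anywhere in this thread, and the helper names below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from raw dump bytes. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * ASSUMPTION: the bytes at info_addr follow the FILE_NAME_INFORMATION
 * layout, i.e. a ULONG FileNameLength immediately followed by the name.
 * Returns how many bytes a copy of FileNameLength bytes would read at or
 * beyond region_end (0 if the copy stays inside the region).
 */
static uint64_t bytes_read_past_end(uint64_t info_addr,
                                    const uint8_t length_field[4],
                                    uint64_t region_end)
{
    assert(length_field != NULL);
    assert(info_addr + 4 <= region_end);

    uint64_t name_start = info_addr + 4;            /* skip the length field  */
    uint64_t available  = region_end - name_start;  /* bytes before the end   */
    uint64_t declared   = read_le32(length_field);  /* assumed FileNameLength */

    return declared > available ? declared - available : 0;
}

/* Values copied from the dump: the four bytes at ...aaf00 and the
   address of the first unreadable byte (...ab000). */
static uint64_t overrun_in_this_dump(void)
{
    const uint8_t length_field[4] = { 0x1e, 0x01, 0x00, 0x00 };
    return bytes_read_past_end(0xffffaf07705aaf00ULL, length_field,
                               0xffffaf07705ab000ULL);
}
```

Under that assumption the declared length is 0x11e (286) bytes, while only 0xfc (252) bytes lie between ffffaf07705aaf04 and ffffaf07705ab000, so a copy of the declared length would read 34 bytes past the end. That would be consistent with the read fault at Arg1 inside memcpy, but it does not say which component produced the length.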

I’ll enable Verifier for FileInfo and retry, and check the stream context support in the driver; I don’t recall ever touching that part of the code.

I have had ProcMon running here and there; it didn’t seem to make a difference (in BSOD rate or anything else).
L.

Following up with some news: after more time with WinDbg and co., I figured out that FLTMGR queries the name information from the driver (IRP_MJ_QUERY_INFORMATION) as part of FltGetFileNameInformation, before my crash case. I also noticed the “Prior to Windows 8, Filter Manager obtained the normalized name…” sentence in the FltGetFileNameInformation documentation. 1+1 = my driver was not implementing support for the FileNormalizedNameInformation file information class properly; it returned an empty result with STATUS_SUCCESS. I changed it to return “not implemented” as a quick test, and haven’t seen the BSOD since.
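
A minimal user-mode sketch of the change described above, not the actual Dokan code: an information class the driver does not really support fails explicitly instead of reporting success with empty output. All `sk_`/`SK_` names are hypothetical stand-ins for the WDK definitions (the numeric values are meant to mirror ntstatus.h and FILE_INFORMATION_CLASS), and using STATUS_NOT_IMPLEMENTED as the failure code simply follows the quick test; whether it is the best status to return is an open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* User-mode stand-ins for WDK definitions; names are hypothetical. */
#define SK_STATUS_SUCCESS          0x00000000u
#define SK_STATUS_NOT_IMPLEMENTED  0xC0000002u
#define SK_STATUS_BUFFER_TOO_SMALL 0xC0000023u

enum sk_info_class {
    SK_FileNameInformation           = 9,
    SK_FileNormalizedNameInformation = 48
};

typedef struct {
    uint32_t status;       /* NTSTATUS-like result                         */
    size_t   information;  /* valid output bytes, like IoStatus.Information */
} sk_result;

/* Hypothetical query handler: 'name' holds name_bytes bytes of UTF-16. */
static sk_result sk_query_information(enum sk_info_class cls,
                                      const void *name, uint32_t name_bytes,
                                      void *out, size_t out_size)
{
    sk_result r = { SK_STATUS_NOT_IMPLEMENTED, 0 };
    assert(name != NULL && out != NULL);

    switch (cls) {
    case SK_FileNameInformation:
        /* Simplified: a real file system treats short buffers in more detail. */
        if (out_size < sizeof(name_bytes) + name_bytes) {
            r.status = SK_STATUS_BUFFER_TOO_SMALL;
            break;
        }
        memcpy(out, &name_bytes, sizeof(name_bytes));             /* length */
        memcpy((uint8_t *)out + sizeof(name_bytes), name, name_bytes);
        r.status = SK_STATUS_SUCCESS;
        r.information = sizeof(name_bytes) + name_bytes;
        break;

    case SK_FileNormalizedNameInformation:
    default:
        /* Not supported here: keep the failure status set above rather
           than returning success with nothing written. */
        break;
    }
    return r;
}
```

The point of the sketch is only the last branch: the caller gets an unambiguous failure for the unsupported class and is never told that an untouched buffer contains a valid name.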

I don’t know whether this could indeed be the cause (since this would have happened many times before the crash already), and something is definitely still wrong (I’m occasionally seeing 1-minute freezes of certain apps, e.g. ProcessExplorer and MSVS, somewhat correlated with intense IO/AV activity), but it looks like progress.