MmProbeAndLockPages bugcheck in Ntfs.sys (but likely related to my own FSD)

I have a File System Driver (not Filter Driver) that acts as a “proxy” between a file system implemented in user mode and the kernel. This FSD works well in a variety of scenarios and with multiple user mode file systems.

I am experiencing a problem with one of the file systems, named “passthrough”, which simply passes all file system operations through to NTFS. Here is how “passthrough” works, for example, for ReadFile:

[OP is the originating process, OS is NTOS, KF is my FSD, UF is the user mode file system. The UF interacts with the KF by issuing DeviceIoControl calls.]

OP: call NtReadFile
OS: IoCallDriver IRP_MJ_READ
KF: post READ IRP into a queue (ignoring caching complications here)
---- context switch ----
KF: wake up, dequeue READ IRP and return from DeviceIoControl
UF: process READ IRP and send response using DeviceIoControl
KF: complete the READ IRP (and wait for additional IRPs)

The passthrough file system’s READ IRP processing is simply ReadFile on an underlying NTFS file. This works fine if the underlying NTFS file is opened without FILE_FLAG_NO_BUFFERING, but fails catastrophically if the file is opened with FILE_FLAG_NO_BUFFERING.

I attach the full “!analyze -v” report at the end, but the crux is that Ntfs.sys bugchecks in MmProbeAndLockPages with MEMORY_MANAGEMENT and 61946. The following message by Pavel Lebedinsky explains: http://www.osronline.com/ShowThread.cfm?link=240647

> The 0x1A/0x61946 bugcheck happens when the memory manager issues a paging read and some driver then tries to create a secondary, write-access MDL describing the same physical pages. This is bad because when that secondary MDL is unlocked the pages will get marked dirty when they’re not supposed to be, causing various problems downstream.

My FSD implements a “zero-copy” technique where the OP (originating process) buffer gets mapped into the address space of the user mode file system. I do this by using IoAllocateMdl, MmProbeAndLockPages(IoWriteAccess) and MmMapLockedPagesSpecifyCache(UserMode) (in the correct process context). Thus the buffer that arrives at the passthrough file system and is later passed to ReadFile already has an MDL describing its pages. This likely causes the issue.

According to Pavel Lebedinsky again:

> To fix this you need to reuse the original MDL instead of creating a new one.

But obviously I cannot do so in my case (even if I somehow made the MDL available to the user mode file system there is no way to use it from user mode).

Any guidance on this is very much appreciated. Is there any way to create the MDL in my FSD so that it does not cause this problem? Any other workarounds? [Other than the obvious ones of ripping out the zero-copy mechanism in the FSD or making a copy of the read buffer in the user mode file system.]

Bill

kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

MEMORY_MANAGEMENT (1a)

Any other values for parameter 1 must be individually examined.

Arguments:
Arg1: 0000000000061946, The subtype of the bugcheck.
Arg2: fffffa80039918c0
Arg3: 0000000000034a31
Arg4: 0000000000000000

Debugging Details:

DUMP_CLASS: 1

DUMP_QUALIFIER: 0

BUILD_VERSION_STRING: 9200.16384.amd64fre.win8_rtm.120725-1247

DUMP_TYPE: 0

BUGCHECK_P1: 61946

BUGCHECK_P2: fffffa80039918c0

BUGCHECK_P3: 34a31

BUGCHECK_P4: 0

BUGCHECK_STR: 0x1a_61946

CPU_COUNT: 1

CPU_MHZ: c1c

CPU_VENDOR: GenuineIntel

CPU_FAMILY: 6

CPU_MODEL: 3d

CPU_STEPPING: 4

CPU_MICROCODE: 6,3d,4,0 (F,M,S,R) SIG: 0’00000000 (cache) 0’00000000 (init)

DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT

PROCESS_NAME: passthrough-x6

CURRENT_IRQL: 2

ANALYSIS_SESSION_HOST: WINDOWS

ANALYSIS_SESSION_TIME: 02-13-2017 12:47:34.0142

ANALYSIS_VERSION: 10.0.10586.567 amd64fre

LAST_CONTROL_TRANSFER: from fffff801089800ea to fffff8010887f930

STACK_TEXT:
fffff880047328f8 fffff801089800ea : 0000000000000000 000000000000001a fffff88004732a60 fffff801089044b8 : nt!DbgBreakPointWithStatus
fffff88004732900 fffff8010897f742 : 0000000000000003 fffff88004732a60 fffff80108904e90 000000000000001a : nt!KiBugCheckDebugBreak+0x12
fffff88004732960 fffff80108885144 : 000000000000010c 0000000000000004 0000000000000000 0000000000000000 : nt!KeBugCheck2+0x79f
fffff88004733080 fffff801089e60d9 : 000000000000001a 0000000000061946 fffffa80039918c0 0000000000034a31 : nt!KeBugCheckEx+0x104
fffff880047330c0 fffff801088d7201 : 0000000000000000 fffff88004733179 fffffa800383c5c0 fffffa80039918f0 : nt! ?? ::FNODOBFM::`string'+0x1e423
fffff88004733120 fffff8800164299b : 0000000000000001 fffffa8002aa9701 fffff9800910ec60 0000000000000001 : nt!MmProbeAndLockPages+0x161
fffff880047331e0 fffff88001637140 : fffffa8002aa9780 fffff9800910ec60 fffff880047332e0 fffff9800910ec60 : Ntfs!NtfsLockUserBuffer+0x6f
fffff88004733230 fffff880016363d8 : fffff880047332e0 fffff88004733330 fffffa8002aa9780 0000000000000001 : Ntfs!NtfsPrepareBuffers+0x68
fffff880047332a0 fffff88001645b56 : 0000000000000000 fffff880047336a0 fffff8a0024f8198 fffff8a0024f8140 : Ntfs!NtfsNonCachedIo+0x1e8
fffff880047334b0 fffff8800164745b : fffffa8002aa9780 fffff9800910ec60 fffff88004733701 0000000000000000 : Ntfs!NtfsCommonRead+0x896
fffff88004733670 fffff80108e47d26 : fffff9800910ec60 fffff9800910ec60 0000000000000002 fffffa8002620340 : Ntfs!NtfsFsdRead+0x1db
fffff88004733720 fffff880014034ee : fffffa8003572750 fffff880047337c0 fffff9800910ec60 fffffa8002620340 : nt!IovCallDriver+0x3e6
fffff88004733770 fffff880014010b6 : fffffa80035dade0 0000000000000002 fffff9800910ec60 fffffa800262b0b8 : fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x25e
fffff88004733810 fffff80108e47d26 : fffff9800910ec60 0000000000000002 0000000000000000 fffff8010881772a : fltmgr!FltpDispatch+0xb6
fffff88004733870 fffff80108c509e8 : 0000000000000000 fffff88004733941 fffffa8003855090 fffffa800262b010 : nt!IovCallDriver+0x3e6
fffff880047338c0 fffff80108bf5c13 : fffffa8003855090 fffff88004733b80 00000094140c0000 fffffa8003855090 : nt!IopSynchronousServiceTail+0x158
fffff88004733990 fffff80108884053 : fffffa800383c5c0 0000000000000000 0000000000000000 0000009415c6f948 : nt!NtReadFile+0x661
fffff88004733a90 000007fc4c192c0a : 000007fc4925ebb6 0000009415c6f8c8 cccccccccccccccc cccccccccccccccc : nt!KiSystemServiceCopyEnd+0x13
0000009415c6f868 000007fc4925ebb6 : 0000009415c6f8c8 cccccccccccccccc cccccccccccccccc 0000000000000104 : ntdll!NtReadFile+0xa
0000009415c6f870 000007f7a3a941ff : 0000000000000000 0000000000000000 cccccccccccccccc 0000009415c6fa94 : KERNELBASE!ReadFile+0x11b
0000009415c6f8f0 000007fc434bff3e : 0000009414146d50 0000009414146d20 00000094140c0000 0000000000000000 : passthrough_x64!Read+0xaf [c:\users\billziss\projects\winfsp\tst\passthrough\passthrough.c @ 343]
0000009415c6fa60 000007fc434c2291 : 0000009414146d50 0000009414147070 000000941414b040 0000009414147070 : winfsp_x64_7fc43490000!FspFileSystemOpRead+0xae [c:\users\billziss\projects\winfsp\src\dll\fsop.c @ 914]
0000009415c6faf0 000007fc49aa167e : 0000009414146d50 0000000000000000 0000000000000000 0000000000000000 : winfsp_x64_7fc43490000!FspFileSystemDispatcherThread+0x241 [c:\users\billziss\projects\winfsp\src\dll\fs.c @ 519]
0000009415c6fbc0 000007fc4c1ac3f1 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : KERNEL32!BaseThreadInitThunk+0x1a
0000009415c6fbf0 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : ntdll!RtlUserThreadStart+0x1d

STACK_COMMAND: kb

THREAD_SHA1_HASH_MOD_FUNC: c36002737b6fc4159123ea2023cea3f05a2e0e75

THREAD_SHA1_HASH_MOD_FUNC_OFFSET: 4b75bb64569a0fc1ced02ae2eab19a870861b616

THREAD_SHA1_HASH_MOD: 45402f6c887c51b53c8a6ab8db6bf327719f04f2

FOLLOWUP_IP:
nt! ?? ::FNODOBFM::`string'+1e423
fffff801089e60d9 cc int 3

FAULT_INSTR_CODE: 8b4865cc

SYMBOL_STACK_INDEX: 4

SYMBOL_NAME: nt! ?? ::FNODOBFM::`string’+1e423

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: nt

IMAGE_NAME: ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP: 5010ac4b

IMAGE_VERSION: 6.2.9200.16384

BUCKET_ID_FUNC_OFFSET: 1e423

FAILURE_BUCKET_ID: 0x1a_61946_VRF_nt!??::FNODOBFM::string

BUCKET_ID: 0x1a_61946_VRF_nt!??::FNODOBFM::string

PRIMARY_PROBLEM_CLASS: 0x1a_61946_VRF_nt!??::FNODOBFM::string

TARGET_TIME: 2015-11-17T23:10:19.000Z

OSBUILD: 9200

OSSERVICEPACK: 0

SERVICEPACK_NUMBER: 0

OS_REVISION: 0

SUITE_MASK: 272

PRODUCT_TYPE: 1

OSPLATFORM_TYPE: x64

OSNAME: Windows 8

OSEDITION: Windows 8 WinNt TerminalServer SingleUserTS

OS_LOCALE:

USER_LCID: 0

OSBUILD_TIMESTAMP: 2012-07-26 03:32:43

BUILDDATESTAMP_STR: 120725-1247

BUILDLAB_STR: win8_rtm

BUILDOSVER_STR: 6.2.9200.16384.amd64fre.win8_rtm.120725-1247

ANALYSIS_SESSION_ELAPSED_TIME: 329

ANALYSIS_SOURCE: KM

FAILURE_ID_HASH_STRING: km:0x1a_61946_vrf_nt!??::fnodobfm::string

FAILURE_ID_HASH: {ce34d447-5962-57c2-f153-8958bf4cc1f1}

Followup: MachineOwner

Having been bitten by this before…

The MDLs associated with inpage operations are treated special. One of the
reasons for this is that you don’t want an inpage operation to result in the
dirty bit being set in the PFN. If it did, the Mm’s background writer
threads would say, “gee, this data is dirty” and write it back out to the
file. Prior to this check it was kind of fun because you could get in a
situation where simply reading a file caused the file to be written.

The problem in your case is that MmProbeAndLockPages with IoWriteAccess
causes the dirty bit to be set in the PFN. You didn’t build the original MDL
(Mm did) and you didn’t build this new MDL (NTFS did), so I’m not aware of
any games you can play to “fix” this.

In our experience double buffering works and can be limited in scope to only
the cases where it matters. Always open to a more clever solution though :)

-scott
OSR
@OSRDrivers
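
For readers following along, the double buffering suggested above could look roughly like this on the user mode file system side. This is only a sketch under assumptions: `read_fn`, `mem_source`, `mem_read` and `read_double_buffered` are hypothetical names that appear nowhere in the thread, the callback stands in for ReadFile on a handle opened with FILE_FLAG_NO_BUFFERING, and `aligned_alloc` is the C11 allocator (on Windows one would more likely use `_aligned_malloc` or VirtualAlloc). The point is that the lower file system only ever locks the private bounce buffer, never the buffer the FSD mapped into the process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback standing in for the real read, e.g. ReadFile on the
 * underlying handle. Returns the number of bytes read, or -1 on failure. */
typedef long (*read_fn)(void *ctx, void *buf, size_t len);

/* Hypothetical in-memory "file", used only to exercise the sketch. */
struct mem_source { const char *data; size_t size; };

static long mem_read(void *ctx, void *buf, size_t len)
{
    const struct mem_source *src = ctx;
    size_t n = len < src->size ? len : src->size;
    memcpy(buf, src->data, n);
    return (long)n;
}

/* Round len up to a multiple of align (align must be a power of two). */
static size_t round_up(size_t len, size_t align)
{
    return (len + align - 1) & ~(align - 1);
}

/*
 * Read into a private, aligned bounce buffer, then copy into dst.
 * dst is the buffer that was mapped into this process; it is only touched
 * by memcpy, never handed to the lower-level read.
 */
static long read_double_buffered(read_fn do_read, void *ctx,
                                 void *dst, size_t len, size_t align)
{
    size_t alloc_len;
    void *bounce;
    long n;

    assert(align != 0 && (align & (align - 1)) == 0);
    if (len == 0)
        return 0;

    /* Non-buffered reads need sector-aligned buffers and lengths. */
    alloc_len = round_up(len, align);
    bounce = aligned_alloc(align, alloc_len);
    if (bounce == NULL)
        return -1;

    n = do_read(ctx, bounce, alloc_len);
    if (n > 0) {
        if ((size_t)n > len)
            n = (long)len;      /* never copy more than was asked for */
        memcpy(dst, bounce, (size_t)n);
    }

    free(bounce);
    return n;
}
```

Limiting the scope, as suggested, would mean taking this path only for reads into a buffer that the FSD mapped, and issuing every other request directly.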

Scott Noone wrote:

> Having been bitten by this before…

Thanks for your comments and the extended explanation.

On a practical level my FSD can indirectly cause Windows to bugcheck and it therefore has a “bug” that needs to be fixed.

However:

  • I do not understand Microsoft’s reasoning in making this case a hard bugcheck. The routine is called Mm**Probe**AndLockPages after all, so if it does not like the UserBuffer it should simply raise STATUS_ACCESS_VIOLATION. Perhaps it could bugcheck under the Driver Verifier, but not otherwise.

  • I would argue that if Microsoft gives us the ability to map an MDL into user mode (MmMapLockedPagesSpecifyCache) that mapping should not come with any attached strings and should behave exactly like any other user mode address. At the very least the fact that there are attached strings should be documented.

> In our experience double buffering works and can be limited in scope to only
> the cases where it matters. Always open to a more clever solution though :)

I am not holding much hope for a more clever solution, but I do intend to experiment and see if I come up with something.

Bill

On 02/15/2017 12:30 PM, xxxxx@billz.fastmail.fm wrote:

> - I do not understand Microsoft’s reasoning in making this case a hard bugcheck. The routine is called Mm**Probe**AndLockPages after all, so if it does not like the UserBuffer it should simply raise STATUS_ACCESS_VIOLATION. Perhaps it could bugcheck under the Driver Verifier, but not otherwise.

Because as Scott was pointing out, prior to it being a hard bugcheck the
result was pages that would be dirty after a read, so that any data read
through the path would be written sometime later. This was sufficiently
subtle that we saw drivers generate this pattern in practice.

> - I would argue that if Microsoft gives us the ability to map an MDL into user mode (MmMapLockedPagesSpecifyCache) that mapping should not come with any attached strings and should behave exactly like any other user mode address. At the very least the fact that there are attached strings should be documented.

I may be missing something with this sequence, but it seems to me like
even if it was possible to map into user mode without rendering the
pages dirty the next step is about to dirty them anyway. Did I
understand correctly that the usermode process satisfies data reads by
calling NtReadFile to some other destination? If so, that read is about
to generate a whole new MDL describing those usermode pages, so any
driver satisfying that read is going to end up dirtying those pages.

And when you think about it, that’s exactly what should happen for an
unbuffered read. If a usermode process reads data into its buffer,
those pages can’t just be discarded and refetched - they need to be
written to the pagefile. So it’s expected that user reads will leave
the target pages dirty, but it’s also expected that paging reads will
leave the target pages clean.
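
To make the clean/dirty distinction concrete, here is a toy model of the behaviour described in this thread. It is emphatically not the memory manager's real data structures or logic: `toy_page`, `toy_mdl` and the `toy_*` functions are invented for illustration, and the rules encode only what was stated above (unlocking a write-access MDL marks pages dirty, a paging read is expected to leave them clean, and a second write-access MDL over pages with a paging read in flight is the condition that bugchecks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for per-page bookkeeping (not the real PFN layout). */
struct toy_page {
    bool modified;          /* "dirty": must be written out before reuse */
    bool paging_read_busy;  /* a paging read into this page is in flight */
};

enum toy_access { TOY_READ_ACCESS, TOY_WRITE_ACCESS };

/* Toy stand-in for an MDL: the pages it describes and how it was locked. */
struct toy_mdl {
    struct toy_page **pages;
    size_t count;
    enum toy_access access;
    bool is_paging_read;
};

/* The memory manager builds the MDL for a paging read itself. */
static void toy_begin_paging_read(struct toy_mdl *mdl)
{
    assert(mdl != NULL);
    mdl->is_paging_read = true;
    mdl->access = TOY_WRITE_ACCESS;
    for (size_t i = 0; i < mdl->count; i++)
        mdl->pages[i]->paging_read_busy = true;
}

/* A driver locks pages for its own MDL. Returns false in the situation
 * where, per the thread, the real system raises 0x1A/0x61946. */
static bool toy_probe_and_lock(struct toy_mdl *mdl, enum toy_access access)
{
    mdl->is_paging_read = false;
    mdl->access = access;
    if (access == TOY_WRITE_ACCESS) {
        for (size_t i = 0; i < mdl->count; i++)
            if (mdl->pages[i]->paging_read_busy)
                return false;
    }
    return true;
}

/* I/O completion: user reads leave pages dirty, paging reads leave them clean. */
static void toy_unlock(struct toy_mdl *mdl)
{
    for (size_t i = 0; i < mdl->count; i++) {
        struct toy_page *page = mdl->pages[i];
        if (mdl->is_paging_read)
            page->paging_read_busy = false;
        else if (mdl->access == TOY_WRITE_ACCESS)
            page->modified = true;
    }
}
```

In this model an ordinary user read ends with `modified` set, a paging read ends with it clear, and the passthrough scenario corresponds to `toy_probe_and_lock` being called with `TOY_WRITE_ACCESS` between `toy_begin_paging_read` and `toy_unlock`.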

> I am not holding much hope for a more clever solution, but I do intend to experiment and see if I come up with something.

I’d strongly encourage serious consideration of double buffering.

That said, if my understanding above is correct, it may be possible to
have the usermode process satisfy the request by issuing it to your
driver so that you can find the original MDL. Heck, you could be able
to do this implicitly as a filter by noticing your user process
generating a corresponding read (you know where you mapped that buffer
in the user process) and implicitly switching the read to use the
original MDL.
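
The "you know where you mapped that buffer" part of this idea amounts to a range lookup. A minimal sketch, assuming the driver keeps a table of its user mode mappings: `struct mapping` and `find_original_mdl` are hypothetical names, the MDL is kept as an opaque pointer, and nothing here shows how a read would actually be redirected to that MDL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One record per buffer the driver mapped into the user mode file system. */
struct mapping {
    uintptr_t user_va;      /* where the buffer was mapped in the process */
    size_t length;          /* size of the mapping in bytes */
    void *original_mdl;     /* opaque handle to the MDL of the original IRP */
};

/*
 * Return the original MDL if [va, va + len) lies entirely inside one of the
 * recorded mappings, or NULL if the read targets ordinary process memory.
 */
static void *find_original_mdl(const struct mapping *table, size_t count,
                               uintptr_t va, size_t len)
{
    assert(table != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        const struct mapping *m = &table[i];

        /* Written to avoid overflow in va + len. */
        if (len <= m->length &&
            va >= m->user_va &&
            va - m->user_va <= m->length - len)
            return m->original_mdl;
    }
    return NULL;
}
```

A read whose buffer only partially overlaps a mapping returns NULL here; a real driver would have to decide whether to fail or double buffer that case.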

It sounds like the goal though is to allow the usermode process to
operate with maximum flexibility, so double buffering would be a better
solution.

- M


http://www.malsmith.net

Malcolm Smith wrote:

> > - I do not understand Microsoft’s reasoning in making this case a hard
> > bugcheck. The routine is called Mm**Probe**AndLockPages after all, so if it does
> > not like the UserBuffer it should simply raise STATUS_ACCESS_VIOLATION. Perhaps
> > it could bugcheck under the Driver Verifier, but not otherwise.
>
> Because as Scott was pointing out, prior to it being a hard bugcheck the
> result was pages that would be dirty after a read, so that any data read
> through the path would be written sometime later. This was sufficiently
> subtle that we saw drivers generate this pattern in practice.

Malcolm, thanks for the insider perspective.

While I do not understand the exact bugcheck condition in MmProbeAndLockPages, I am now wondering whether it is possible to trigger it in a different manner: consider a (malicious or buggy) process that issues two non-cached ReadFile requests over the *same* buffer. Could it be that these two independent requests result in two MDLs describing the same VA range and when the second MmProbeAndLockPages happens, it bugchecks? To maximize the chances of this happening, this could be tried with overlapped reads with huge buffers on different files (and perhaps even different volumes/drivers, for example one ReadFile on an NTFS C: and one ReadFile on a FAT D:) on a system with multiple processors.

> > - I would argue that if Microsoft gives us the ability to map an MDL into user
> > mode (MmMapLockedPagesSpecifyCache) that mapping should not come with any
> > attached strings and should behave exactly like any other user mode address. At
> > the very least the fact that there are attached strings should be documented.
>
> I may be missing something with this sequence, but it seems to me like
> even if it was possible to map into user mode without rendering the
> pages dirty the next step is about to dirty them anyway. Did I
> understand correctly that the usermode process satisfies data reads by
> calling NtReadFile to some other destination? If so, that read is about
> to generate a whole new MDL describing those usermode pages, so any
> driver satisfying that read is going to end up dirtying those pages.
>
> And when you think about it, that’s exactly what should happen for an
> unbuffered read. If a usermode process reads data into its buffer,
> those pages can’t just be discarded and refetched - they need to be
> written to the pagefile. So it’s expected that user reads will leave
> the target pages dirty, but it’s also expected that paging reads will
> leave the target pages clean.

Let me see if I understand what you are saying here, because I must admit that I do not fully grasp the problem yet. What I say below may just prove that I do not know what I am talking about.

We are considering two cases: user reads (issuing a ReadFile from a user process) and paging reads (e.g. reading from a mapped view).

User reads: in this case the FSD will fill the buffer with the file contents and the buffer should have its pages marked *dirty*, so that they can be eventually written to the page file.

Paging reads: in this case the FSD will fill the buffer with the file contents and the buffer should have its pages marked *clean*, because the buffer is backed by the file just read.

Is this a good description of the problem? If yes, doesn’t this problem already exist with FSDs shipping with Windows if they decide to create an MDL for READ? (For example, FastFat does so in FatPostStackOverflowRead, although I note that it seems to avoid doing so when handling the page file.) When the MmUnlockPages finally comes (during IRP completion), how does it know what to do (mark the pages clean or modified)?

> > I am not holding much hope for a more clever solution, but I do intend to
> > experiment and see if I come up with something.
>
> I’d strongly encourage serious consideration of double buffering.
>
> That said, if my understanding above is correct, it may be possible to
> have the usermode process satisfy the request by issuing it to your
> driver so that you can find the original MDL. Heck, you could be able
> to do this implicitly as a filter by noticing your user process
> generating a corresponding read (you know where you mapped that buffer
> in the user process) and implicitly switching the read to use the
> original MDL.

An interesting idea!

> It sounds like the goal though is to allow the usermode process to
> operate with maximum flexibility, so double buffering would be a better
> solution.

I agree. I will probably end up double-buffering both Reads and Writes, but my assumption is that this is only required for Reads (IoWriteAccess). Is this correct?

Bill

Actually Microsoft doesn’t care about these scenarios that might result in a BSOD. Microsoft support will blame the application or anything else but refuse to accept this as a bug.

How do I know? This is exactly what happened with a bug that lets an unprivileged user crash the RDBSS-based RDPDR file system because of a lack of synchronization in RDBSS or RDPDR. MS has known about this bug since August last year but did nothing and even released Windows Server 2016 with it.

The full story is here https://social.technet.microsoft.com/Forums/windows/en-US/26691ffa-7b9c-4691-9639-03156a0c6215/windows-10-crashed-when-accessing-tsclient-path-from-remote-desktop-via-far-manager?forum=win10itprosecurity

There were at least two other reports about this BSOD. No reaction from Microsoft. I had to spend my time adding RDPDR-related synchronization inside my isolation filter to reduce the possibility of a BSOD.

On 02/17/2017 01:19 PM, xxxxx@billz.fastmail.fm wrote:

> Let me see if I understand what you are saying here, because I must
> admit that I do not fully grasp the problem yet. What I say below may
> just prove that I do not know what I am talking about.
>
> We are considering two cases: user reads (issuing a ReadFile from a
> user process) and paging reads (e.g. reading from a mapped view).
>
> User reads: in this case the FSD will fill the buffer with the file
> contents and the buffer should have its pages marked *dirty*, so that
> they can be eventually written to the page file.
>
> Paging reads: in this case the FSD will fill the buffer with the file
> contents and the buffer should have its pages marked *clean*, because
> the buffer is backed by the file just read.
>
> Is this a good description of the problem?

Yes, that’s my understanding of it. I’m speculating a bit because I
haven’t seen the bugcheck in the debugger to verify. The two are
distinguished by a bit in the MDL which tells the memory manager what to
do on IO completion. If I’m right, this implies your filter is seeing a
paging read then reflecting that up to usermode which issues another
read (with a new MDL) and that read is a non-paging read because
usermode can’t do anything else.

> If yes, doesn’t this problem already exist with FSD’s shipping with
> Windows if they decide to create an MDL for READ (for example,
> FastFat does so in FatPostStackOverflowRead, although I note that it
> seems to avoid doing so when handling the page file). When the
> MmUnlockPages finally comes (during IRP completion) how does it know
> what to do (mark them clean or modified)?

It would be a problem if they create a new MDL in order to satisfy a
paging read. So long as the original MDL is used there is no problem.
Normal IO processing means that an MDL is associated with an Irp and
remains associated with that Irp for life - the problem typically arises
when a driver maps pages into VA then attempts to perform IO with a new
Irp on the VA. FatLockUserBuffer will do nothing if an MDL is already
present, and for paging IO, it must already be present. So I don’t
think there’s anything in FatPostStackOverflowRead that would be
problematic.

> I am now wondering whether it is possible to trigger it in a
> different manner: consider a (malicious or buggy) process that issues
> two non-cached ReadFile requests over the *same* buffer. Could it be
> that these two independent requests result in two MDLs describing the
> same VA range and when the second MmProbeAndLockPages happens, it
> bugchecks? To maximize the chances of this happening, this could be
> tried with overlapped reads with huge buffers on different files (and
> perhaps even different volumes/drivers, for example one ReadFile on
> an NTFS C: and one ReadFile on a FAT D:) on a system with multiple
> processors.

If usermode was issuing two reads to the same buffer, both IO
completions would attempt to mark the pages dirty. It doesn’t matter if
both succeed. The bugcheck here is because pages were marked as dirty
when completing a paging read.

If you were trying to test this, the case would be to memory map a file,
then attempt to read data into those pages, which should mark them
dirty. The system should first try to lock the memory mapped region
(issuing any paging reads), which should complete and be clean, then the
later user read should mark them dirty.

> I agree. I will probably end up double-buffering both Reads and
> Writes, but my assumption is that this is only required for Reads
> (IoWriteAccess). Is this correct?

I believe so.

- M


http://www.malsmith.net

Thank you for your help. I am now in the process of modifying my FSD to work around this problem.