Setting the MDL in an IRP_MJ_READ/IRP_MJ_WRITE IRP

I have a device (PDO) that needs to read or write data from a file when it gets an I/O request from the upper (FDO) device. Nothing complex here: I set up the next IRP stack location with valid IRP_MJ_READ or IRP_MJ_WRITE parameters and call the FS device. But I want to make one optimization: I already have an MDL that covers the buffer for the read/write. (Rarely the MDL covers a larger buffer, so MappedSystemVa != the r/w buffer; in that case I call IoBuildPartialMdl to get an MDL with exactly MappedSystemVa == the r/w buffer.) From the fastfat source I see that the FS itself creates an MDL for the user buffer if there is no MDL in the IRP yet (see deviosup.c -> FatLockUserBuffer). I also looked at how IoPageRead works: it too sets its own MDL in the IRP. So I decided to set the MDL in the IRP before calling the FS device with the r/w IRP, and to restore the original Irp->MdlAddress in my own completion routine. I also set Irp->Flags = IRP_NOCACHE. This solution worked well for several years before Win10.
But on Win10 I found a problem: this works only if the file I read/write is not compressed. If the file is compressed, NTFS does not use non-cached I/O despite the IRP_NOCACHE flag; it always uses cached I/O (via nt!CcAsyncCopyRead), and in completion (NTFS!NtfsAsyncCachedReadCompletionCallBack) it frees (!) the MDL if IoStatus.Status < 0. This causes a bugcheck when the real owner of the MDL tries to free it (double free). Normally, when NTFS does not free the MDL, I remove it from the IRP (more exactly, restore the original Irp->MdlAddress value) in my own completion routine. As a workaround I found that if I set Irp->Flags = IRP_NOCACHE|IRP_PAGING_IO, NTFS apparently no longer frees the MDL itself and everything works again, even on compressed files. IRP_PAGING_IO is OK for me because I never extend the file with write requests; its size is unchanged.
But now I am not sure whether this solution is correct. So the question:
can I set Irp->MdlAddress myself before passing the r/w request to the FS with Irp->Flags = IRP_NOCACHE|IRP_PAGING_IO? Is this correct, or can we not be sure of it?

Thanks

IRP_PAGING_IO means the IRP comes from the Memory Manager (MM) and the MDL is freed by the caller (i.e. the Memory Manager, when it decides to do so); you should consider this a design feature. Using IRP_PAGING_IO is not documented and is intended for system use, but third-party drivers have done this for 20+ years. IRP completion for paging I/O skips some steps (like the APC), so you had better have a completion routine to synchronize with completion.

The downsides of the solution -

  • Paging I/O might skip some synchronization inside the file system driver (FSD), because the Memory Manager uses FastIo* callbacks (e.g. AcquireForModWrite) to give the FSD a chance to acquire resources before issuing a paging IRP; this is how a correct synchronization order between MM and the FSD is provided. The FSD is a client of MM and MM is a client of the FSD, so these FastIo* callbacks are the way to acquire resources in the correct order and avoid deadlocks. The Cache Manager (Cc) also uses callbacks (e.g. AcquireForLazyWrite).

  • If the file you are reading from is cached and the cache contains dirty (i.e. unflushed) data, you are going to read old data from the disk. That means that if you decide to go with IRP_PAGING_IO, you should be sure that all cached writes are done through file objects with the FO_WRITE_THROUGH flag set; this gives you some sort of data synchronization. If you are writing with IRP_PAGING_IO, your data might be overwritten by stale data from the cache if the cache is not write-through, and the file system will not synchronize this case, as that is the responsibility of the Cache Manager (Cc) and the Memory Manager.

  • If the file is memory mapped and is being changed by an application writing to virtual addresses backed by the file, there is practically no way to synchronize the data, as stale data might overwrite the data written by paging I/O. This is a feature of the Windows kernel: a process PTE's dirty bit is transferred to the prototype PTE with a delay.

If possible, I would avoid IRP_PAGING_IO. Since the PDO receives an MDL, the pages that back the virtual memory range are already locked, and there will be no big overhead when the file system driver calls MmProbeAndLockPages for this buffer.

Also, keep in mind that the FSD is under no obligation to honor the IRP_NOCACHE flag; the cache can be used for non-cached I/O.

Regarding Irp->MdlAddress: it should be set for read/write IRPs with the IRP_PAGING_IO flag. It doesn’t make sense to issue a paging read/write IRP without MdlAddress.

My main goal is to not let the NTFS driver free the MDL in the IRP (Irp->MdlAddress).
“Irp completion for paging IO skips some steps” - I restore Irp->Flags to the original value in my own CompletionRoutine, so this does not change the behaviour of IofCompleteRequest for this IRP.
“Using IRP_PAGING_IO is not documented and intended for a system use” - because of this I initially used only IRP_NOCACHE in the flags, but I found that on Win10 NTFS can free Irp->MdlAddress in its own completion. So I began searching for how to disable this, and found that with IRP_NOCACHE|IRP_PAGING_IO and Irp->MdlAddress != 0 everything works well (at least after short tests).
I also forgot to say at the beginning: the file is opened by my device in exclusive mode, so nobody else reads or writes it, and the file is not mapped in memory. Preferably (for me) there is only non-cached I/O on the file.

Yes, the pages that back the virtual memory range for r/w are already locked. Of course it is no problem to not set Irp->MdlAddress in the r/w IRP; this is the simplest way and it works.
However, what interests me is whether it is correct to set the MDL in the IRP for r/w with the (IRP_NOCACHE|IRP_PAGING_IO) flags, or in some other way. In my own CompletionRoutine I restore the original Irp->Flags and Irp->MdlAddress.

Irp->MdlAddress should be set for IRP_PAGING_IO in case of read/write IRP

“Irp->MdlAddress should be set for IRP_PAGING_IO in case of read/write IRP” - of course, yes. I always set Irp->MdlAddress. But I am asking about something else; maybe my English is bad :(
My question:
is it correct to set:
Irp->MdlAddress = Mdl; // with Mdl->MappedSystemVa == Irp->UserBuffer
Irp->Flags = IRP_NOCACHE|IRP_PAGING_IO;
IrpSp->MajorFunction = IRP_MJ_READ (or IRP_MJ_WRITE)
IrpSp->MinorFunction = 0;

and pass this IRP to the FS?
In my own completion routine I restore Irp->MdlAddress (to 0) and Irp->Flags to their original values.
This looks like it works well, but I would like to hear the opinions of experts on this.

For several years I used a solution like this but with Irp->Flags = IRP_NOCACHE, until I found that on Win10 NTFS frees Irp->MdlAddress on a read error if the file is compressed. On 8.1 and earlier versions NTFS does not free the MDL in this case, and as a result the code worked.

Here is an excerpt from a driver that used paging I/O:

Irp->MdlAddress = Mdl;
Irp->Flags = IRP_PAGING_IO | IRP_NOCACHE | IRP_SYNCHRONOUS_PAGING_IO;

Irp->RequestorMode = KernelMode;
Irp->UserIosb = IoStatusBlock;
Irp->UserEvent = &Event;
Irp->UserBuffer = (PVOID) ((PCHAR) Mdl->StartVa + Mdl->ByteOffset);
Irp->Tail.Overlay.OriginalFileObject = FileObject;
Irp->Tail.Overlay.Thread = PsGetCurrentThread();

The trickiest part is Irp->UserBuffer: FSDs make assumptions about the UserBuffer value when they build partial MDLs.

Slava Imameev - thank you for the example. Of course I have seen code like this (say, in IoPageRead), and I have no problems with Irp->UserBuffer or IRP initialization (actually I do not allocate the IRP myself; I use the IRP which I receive from the upper driver and, after some changes, send it on to the FSD). My doubt is only whether NTFS will *NOT FREE* Irp->MdlAddress under some conditions. Until recently I thought that an FSD must not free the MDL from the IRP: it can and must use it, but not free it (except in the IRP_MN_COMPLETE case, but that is not used here). But I found that NTFS on Win10 calls IoFreeMdl(Irp->MdlAddress) under some conditions. I want to prevent this, to not let NTFS free it. I decided to use IRP_PAGING_IO in Irp->Flags, assuming it must prevent NTFS (or any FSD) from freeing the MDL in the IRP. But I am not 100% sure of this, so I ask.
Thanks

Actually IRP_PAGING_IO suppresses the I/O Manager from freeing the MDL. File systems shouldn’t free it. This is because for paging I/O operations the MDL belongs to the memory manager (it’s a parameter to IoPageRead, for example).

Tony
OSR

Tony Mason - thanks.
“IRP_PAGING_IO suppresses the I/O Manager from freeing the MDL” - yes, I know, and because of this I decided to add the IRP_PAGING_IO flag to the IRP. Actually I have no problem with the I/O Manager, because IofCompleteRequest first calls the drivers' completion routines, and in my own completion routine I zero the MDL (Irp->MdlAddress = 0;), so there is no problem with it. But NTFS can itself free ‘my’ MDL from the IRP before calling IofCompleteRequest (if IRP_PAGING_IO is not set), so I hope that IRP_PAGING_IO gives a 100% guarantee that Irp->MdlAddress will not be freed, on both read and write.
Is this code correct in general? (I have some doubt about using the IRP_PAGING_IO flag: are there side effects here?)

Irp->MdlAddress = Mdl; // ASSERT(Mdl->MappedSystemVa == Irp->UserBuffer)
Irp->Flags = IRP_NOCACHE|IRP_PAGING_IO;

IrpSp->MajorFunction = IRP_MJ_READ (or IRP_MJ_WRITE)
IrpSp->MinorFunction = 0;
IrpSp->Flags = 0;
IrpSp->Control = SL_INVOKE_ON_SUCCESS|SL_INVOKE_ON_ERROR|SL_INVOKE_ON_CANCEL;
IrpSp->FileObject = _MyFileObject;
IrpSp->Parameters.Read.Length = ByteCount;
IrpSp->Parameters.Read.ByteOffset.QuadPart = Va;
IrpSp->Parameters.Read.Key = 0;
IrpSp->Context = ;
IrpSp->CompletionRoutine = OnComplete;

In OnComplete I restore Irp->MdlAddress and Irp->Flags to the original saved values.
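The save-and-restore pattern described in this post can be sketched as a small user-mode C model. This is only an illustration under stated assumptions: MODEL_IRP and SAVED_STATE are hypothetical stand-ins (not the real IRP layout), prepare_for_fsd and on_complete are made-up names, and the flag values are as I recall them from wdm.h and should be checked against your WDK. It says nothing about whether NTFS frees the MDL in between; it only shows which fields are swapped and put back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as recalled from wdm.h (assumption: verify against your WDK). */
#define IRP_NOCACHE    0x00000001u
#define IRP_PAGING_IO  0x00000002u

/* Stand-in for the two IRP fields discussed in the thread (not the real IRP). */
typedef struct MODEL_IRP {
    uint32_t Flags;
    void    *MdlAddress;
} MODEL_IRP;

/* Hypothetical per-request context holding the values to put back. */
typedef struct SAVED_STATE {
    uint32_t Flags;
    void    *MdlAddress;
} SAVED_STATE;

/* Before calling the FSD: remember the original fields, then install ours. */
void prepare_for_fsd(MODEL_IRP *irp, void *mdl, SAVED_STATE *saved)
{
    saved->Flags      = irp->Flags;
    saved->MdlAddress = irp->MdlAddress;
    irp->MdlAddress   = mdl;
    irp->Flags        = IRP_NOCACHE | IRP_PAGING_IO;
}

/* In the completion routine: restore exactly what was saved, so later
 * completion processing sees the IRP as the upper driver sent it. */
void on_complete(MODEL_IRP *irp, const SAVED_STATE *saved)
{
    irp->MdlAddress = saved->MdlAddress;
    irp->Flags      = saved->Flags;
}
```

In a real driver the saved state would have to live somewhere that survives until completion (for example the completion context), which this model leaves out.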

Would it be acceptable for you to create a separate paging IRP to read/write data from/to the file? Changing IRP flags in a completion routine is too heavy-handed an approach.

I am sure there is no sense in creating a separate paging IRP for the read/write:

  1. From the FSD's point of view there is no difference between a newly allocated IRP that already has IRP_NOCACHE|IRP_PAGING_IO and an existing IRP whose Irp->Flags I changed; the FSD cannot tell them apart.
  2. The I/O Manager looks at the IRP fields only in IofCompleteRequest, after calling all the CompletionRoutines in the stack (provided nobody returns STATUS_MORE_PROCESSING_REQUIRED), but I restore the original Irp->Flags in my own CompletionRoutine. So I am sure everything is correct here.
    My problems are not with the I/O Manager but with NTFS.

I discovered that setting
Irp->Flags = IRP_NOCACHE|IRP_PAGING_IO;
has a side effect (compared with Irp->Flags = IRP_NOCACHE):
everything works great, but on IRP_MN_REMOVE_DEVICE, when I do cleanup, my code hangs in the call to NtClose(_hFile); this is the file on which I do r/w. The thread stack shows
NtfsCommonCleanup -> NtfsAcquirePagingResourceExclusive -> ExAcquireResourceExclusiveLite
but somebody holds this resource in shared mode (it looks like Scb->Header.PagingIoResource, but I am not sure).
As a test I decided to close _hFile at the beginning, already on start device (I hold the file handle open so as not to get STATUS_FILE_CLOSED on r/w, but with paging I/O, r/w on the file is possible without an open handle). In this case I call only ObfDereferenceObject(_FileObject); in REMOVE_DEVICE, and everything is OK: _FileObject is freed (I checked this some time later; the memory of _FileObject was already used by another file).
But when I then open this file from a user-mode app, it hangs in the call to ZwOpenFile, and I see that the thread is waiting with the WrResource wait reason, so again ExAcquireResource*Lite.
This hang on the resource happens only on Win7 and Win8.1; on Win10 everything works without a hang.
I also claim that there is no ongoing I/O request on the file:
before sending the r/w IRP to the file I call ExAcquireRundownProtection, and I call ExReleaseRundownProtection in the CompletionRoutine. Finally I call ExWaitForRundownProtectionRelease in IRP_MN_REMOVE_DEVICE before calling NtClose(_File); ObfDereferenceObject(_FileObject);
What can be the problem with IRP_PAGING_IO and Header.PagingIoResource here?
I am searching not for the simplest working solution (for that I can simply not use the MDL and paging I/O in the IRP) but for a deep understanding of the problem.

Use the “!locks” command to investigate a possible deadlock on resource acquisition. Blocking in the subsequent ZwOpenFile makes me suspect that a resource is acquired and never released.

Slava Imameev - you were right at the beginning:

“Paging IO might skip some synchronization inside file system driver(FSD) as Memory Manager uses FastIo*( AcquireForModWrite e.g.) callbacks to give FSD a chance to acquire resources before issuing a paging IRP, this is a way to provide a correct synchronization order between MM and FSD.”

My error was that I did not call AcquireForModWrite/ReleaseForModWrite, directly or indirectly, for IRP_MJ_WRITE.

Now, as a test, I set
Irp->Flags = fwrite ? IRP_NOCACHE : IRP_NOCACHE|IRP_PAGING_IO;
so IRP_PAGING_IO is used only on reads. In this case everything works, including remove device, with no hang on the resource.
But with Irp->Flags = IRP_NOCACHE|IRP_PAGING_IO; always, I get a hang on open/close of the file after the first paging write request.
So I think I need to call AcquireForModWrite/ReleaseForModWrite? But I am not yet sure how to call them correctly: directly from the FastIoDispatch table (this skips possible FS filter callbacks) or through something like FsRtlAcquireFileForModWrite? But that is not exported.

Any suggestion on how to call this correctly?

It is not clear who acquired the lock and held it indefinitely; it points to a failure to call some sort of completion that releases the lock. Acquiring the locks through the FastIo callbacks might not solve the problem, though calling them is not the hard part: call IoGetRelatedDeviceObject and use the FastIoDispatch table of the driver that owns the returned device object.

After looking more at the fastfat and NTFS sources, I hope I have found a good solution: I open my own file as a paging file:
IoCreateFile(*, SL_OPEN_PAGING_FILE);
This logic suits me very well anyway, because the file must be in exclusive use.
With a paging file, the code
Irp->Flags = IRP_NOCACHE|IRP_PAGING_IO;
works well with no hang on cleanup; tested on Win7, Win8.1 and Win10.

About the hang without FCB_STATE_PAGING_FILE on 7 and 8.1 (interestingly, there is no such hang on Win10): I understand that it is a mistake to send an IRP_PAGING_IO IRP directly without some prior synchronization. A comment from write.c:

//
// For all paging I/O, the correct resource has already been
// acquired shared - PagingIoResource if it exists, or else
// main Resource. In some rare cases this is not currently
// true (shutdown & segment dereference thread), so we acquire
// shared here, but we starve exclusive in these rare cases
// to be a little more resilient to deadlocks! Most of the
// time all we do is the test.
//

Indeed, as far as I can tell, the code hangs (in the Close and Open calls) when it calls

PFSRTL_COMMON_FCB_HEADER Header;
ExAcquireResourceExclusiveLite( Header->PagingIoResource, TRUE );

but this resource is already acquired shared by several threads. These are ordinary system worker threads, and all of them are waiting with the WrQueue wait reason.

http://i.imgur.com/mbqveHQ.png (from debugger)

“it points on failing to call some sort of completion that releases the lock” - I am sure it is not that. More likely NTFS assumes that the issuer of an IRP_PAGING_IO IRP_MJ_WRITE holds the lock and must release it itself. So I need to call AcquireForModWrite/ReleaseForModWrite (?), but how this must be done (calling them directly from the driver's FastIoDispatch does not seem a good way to me) I am not yet sure; I have not found an example.

I had to go back to read your original email to see how we got here…

Your experience with setting the IRP_PAGING_IO flag sounds about right, this
is usually what happens any time someone starts trying to set this flag. The
locking requirements are very unforgiving yet poorly described and once you
set the paging I/O bit you’re expected to intimately understand the
behaviors between the Mm and the FS. This can be very difficult and is a
moving target as the implementation changes. Expect lots of deadlocks while
trying to make it work.

Back to your original note, we appear to have gotten here because of NTFS
freeing the MDL out from underneath you in
NtfsAsyncCachedReadCompletionCallBack. Poking at the disassembly, this
appears to be done only if the MDL is neither locked nor describing a
non-paged buffer:

.text:00000001C002D2AE mov rcx, [rbx+8] ; Mdl
.text:00000001C002D2B2 test rcx, rcx
.text:00000001C002D2B5 jz loc_1C000E957
.text:00000001C002D2BB test byte ptr [rcx+_MDL.MdlFlags], 6
.text:00000001C002D2BF jnz loc_1C000E957
.text:00000001C002D2C5 call cs:__imp_IoFreeMdl

Are you sending down an unlocked MDL?

-scott
OSR
@OSRDrivers


“Are you sending down an unlocked MDL?” - no, the MDL is 100% locked and mapped.
First, I do not create the MDL myself; I get it from ClassPnp, and I don't think it sends me an unlocked MDL.
And I always call MmGetSystemAddressForMdlSafe(MdlAddress, LowPagePriority) before calling the FSD.

About NtfsAsyncCachedReadCompletionCallBack: yes, I also looked at its binary code and noted that it calls IoFreeMdl only if (status < 0) and the MDL has neither of the (MDL_PAGES_LOCKED|MDL_SOURCE_IS_NONPAGED_POOL) flags. I do not completely understand how this happens, but maybe this is a partial MDL with flags MDL_MAPPED_TO_SYSTEM_VA|MDL_PARTIAL|MDL_PARTIAL_HAS_BEEN_MAPPED|MDL_PARENT_MAPPED_SYSTEM_VA.
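To make the flag test concrete, here is a small user-mode C model of the check as it is described in this thread (the `test ..., 6` / `jnz` pair from the disassembly, plus the status < 0 condition reported above). It is a sketch, not NTFS code: would_reach_io_free_mdl is a hypothetical name, and the MDL_* values are copied from wdm.h as I recall them, so verify them against your WDK headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MDL flag values as recalled from wdm.h (assumption: verify against your WDK). */
#define MDL_MAPPED_TO_SYSTEM_VA      0x0001
#define MDL_PAGES_LOCKED             0x0002
#define MDL_SOURCE_IS_NONPAGED_POOL  0x0004
#define MDL_PARTIAL                  0x0010
#define MDL_PARTIAL_HAS_BEEN_MAPPED  0x0020
#define MDL_PARENT_MAPPED_SYSTEM_VA  0x0100

/* Hypothetical helper mirroring the described check: the free path is
 * reached only on failure, and only when neither bit of mask 6
 * (MDL_PAGES_LOCKED | MDL_SOURCE_IS_NONPAGED_POOL) is set. */
bool would_reach_io_free_mdl(int32_t status, uint16_t mdl_flags)
{
    if (status >= 0) {
        return false;  /* success: the MDL is left alone */
    }
    return (mdl_flags & (MDL_PAGES_LOCKED | MDL_SOURCE_IS_NONPAGED_POOL)) == 0;
}
```

Under this model, an MDL carrying only the four partial-MDL flags listed above would reach the IoFreeMdl call on a failed read, while an MDL with MDL_PAGES_LOCKED set would not.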

The solution with a paging file (opening the file with SL_OPEN_PAGING_FILE) works very well on all systems; however, I cannot find any documentation on the correct use of this feature.

As for how to correctly lock a non-paging file before sending a paging write request, I need to study more how the system itself does this.

Sorry that this message is unrelated, but I’m new, and I would like to know
how to ask a question and what I should know about osr.com. If anyone could
point me in the right direction in terms of how to get started, that would
be really appreciated. Thanks!


Reading from files (compressed or not) with user I/O is a pretty well worn
path in Windows. I would be more inclined to track down why the MDL is
unlocked at this point (access breakpoint on the MdlFlags might help) or
special case things for compressed files (double buffer the transfer, map
the MDL and send down just the buffer, etc.) than going off and trying
random things that affect your 99% case.

-scott
OSR
@OSRDrivers
