Handling IRP_MJ_READ

Hi all
I modified the minispy minifilter driver to display a DbgPrint log whenever an IRP_MJ_READ is sent to the filesystem. I have 2 test programs: the first program opens a file, mmaps it and closes the file; the second opens a file, reads some data using ReadFile, and closes the file.

(Minifilter FLT_OPERATION_REGISTRATION flag for READ IRP is 0)

Observations after running the first program with minispy attached to the volume in which the file exists:

  1. The very first time the test program is run, I see an IRP_MJ_READ for that file in the minispy logs (confirmed by using procmon as well). Subsequent runs of the test do not show any IRP_MJ_READ calls.
  2. I rebooted the system and ran the test. Again, the first run of the test generated an IRP_MJ_READ.

The whole file is read irrespective of what parameter I pass to MapViewOfFile.

Observations after running the second program with minispy attached to the volume in which the file exists:

  1. Every ReadFile request triggers IRP_MJ_READ.

The number of bytes in FLT_PARAMETERS for IRP_MJ_READ is equal to the number of bytes requested in ReadFile.

Based on these observations, I have some questions:

  1. Is there a different cache for mmaped files?
    Or in other words, is a mmap request handled by bypassing the file system filter drivers?
    (I understand that the filesystem below might be caching data)

  2. If the above is true, why is it that the first mmap request triggers a READ IRP?

  3. If I have a file mmaped by process A, and if process B tries to mmap the same file, will that trigger a READ IRP?

I have gone through related threads on osr, but I'm still not clear on how to handle the READ IRP. As you can see, I have a lot of questions on my mind right now.

I am new to minifilter coding, so feel free to correct any mistakes.

Thanks

Your filter is not “bypassed”; rather, it is involved only when an actual read is generated (as you’ve observed). For section views, a read needs to be generated only if the desired file contents are not already in physical memory when the mapped view is touched.

Accesses to section views touch the file system only when necessary, not on every memory access.

  • S

-----Original Message-----
From: xxxxx@yahoo.co.in
Sent: Friday, July 10, 2009 11:43
To: Windows File Systems Devs Interest List
Subject: [ntfsd] Handling IRP_MJ_READ

[snip]


There are 2 paths in the FSD - “upper” (between the app and the cache) and “lower” (between the cache and the underlying storage).

For a noncached file open, each read is an IRP which is both “upper” and “lower”: it just goes from the app through the FSD down to its lower edge.

For cached file open, the app’s read is either a FastIo call or the “upper” IRP - from the cache to the app. The “upper” IRP is handled by the FSD in a special path, which is actually a wrapper around CcCopyRead. The FastIo call usually uses the default implementation of FsRtlCopyRead, which is also a wrapper around CcCopyRead.

Also, Cc itself can send the “lower” IRPs back to the same FSD to populate the cache with data. They are sent as special “paging” IRPs, which go to the FSD’s lower edge.

With mmap, the page fault on the mapped area is handled by the Mm calling IoPageRead, which sends the “lower” IRP to the FSD without Cc involved, directly from Mm and Io.

Cc uses the same physical pages for the cache as Mm uses for mmaped areas. The objects involved here are the “shared cache map” (CcSc), 1 or 0 per FCB, which belongs to Cc, and the Mm structure called the “data control area” (MmCa), again 1 or 0 per FCB. CcSc holds a ref against MmCa. MmCa, in turn, holds a ref to the file object used for its creation. The set of physical pages is owned by MmCa and is used for both cache and mmap.

So, with mmap you see:

  • 1 paging (“lower”) read, then this data is in the page maintained by the Mm (and Cc also uses the same set of physical pages).

With usual read you see:

  • 1 “upper” IRP, which corresponds to the ReadFile call, and copies the data from the cache to the app, and also 1 “lower” (paging) IRP to populate the cache.
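To make the “upper”/“lower” distinction concrete, here is a minimal user-mode sketch in C of how a read could be sorted into the paths described above. The struct, enum, and function names are hypothetical. In a real minifilter the two booleans would presumably come from the IRP_PAGING_IO and IRP_NOCACHE bits in Data->Iopb->IrpFlags, and whether the request arrived as FastIo or as an IRP could be checked with FLT_IS_FASTIO_OPERATION(Data).

```c
#include <stdbool.h>

/* Hypothetical model of the properties of one read request. */
typedef struct {
    bool paging_io; /* issued by Mm or Cc to fill physical pages */
    bool no_cache;  /* request bypasses the cache (noncached open) */
} read_request;

typedef enum {
    READ_UPPER,          /* cache -> app copy, arrives as IRP or FastIo */
    READ_LOWER,          /* storage -> physical pages, a paging IRP */
    READ_UPPER_AND_LOWER /* noncached: app -> storage in a single IRP */
} read_path;

/* Sort a read into the paths described in the post above. */
read_path classify_read(read_request r)
{
    if (r.paging_io)
        return READ_LOWER;           /* page fault or cache fill */
    if (r.no_cache)
        return READ_UPPER_AND_LOWER; /* noncached ReadFile */
    return READ_UPPER;               /* cached ReadFile */
}
```

Under this model an mmap page fault shows up only as READ_LOWER, while a cached ReadFile shows up as READ_UPPER and, if the data is not yet in memory, triggers a separate READ_LOWER request.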

Now note that Cc has an optimization: it preserves the cache for already closed files in case of a quick reopen. This is done by some interesting lifetime management of CcSc objects: they are destroyed by a garbage collector after a long (hours) delay or in case of low memory.

So, after you close the file, CcSc survives, so MmCa survives, the file object survives (CLEANUP without CLOSE), and the physical pages with the file data survive.

Then, when you reopen the file and mmap it again, MmCa is reused (yes, the file object is new, but FCB is old - same pathname, and MmCa is weakly referenced by the FCB and reused), and the set of physical pages is also reused.

That’s why you do not see the second paging read on second mmap.
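The effect can be illustrated with a toy user-mode model in C. This is only a sketch under simplifying assumptions: all names are hypothetical, pages are read one at a time (the real Mm may read larger clusters), and a single struct stands in for the per-file control area that survives close and reopen.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGES 8 /* hypothetical tiny file, 8 pages long */

/* Toy stand-in for the per-file control area: it owns the physical
 * pages and outlives any single open of the file. */
typedef struct {
    bool resident[MODEL_PAGES];
    int paging_reads; /* "lower" reads a filter would have observed */
} control_area;

/* Touch one page through a mapped view. Returns true if the touch had
 * to issue a paging read, false if the page was already in memory.
 * Out-of-range pages are ignored. */
bool touch_page(control_area *ca, size_t page)
{
    if (page >= MODEL_PAGES || ca->resident[page])
        return false;
    ca->resident[page] = true;
    ca->paging_reads++;
    return true;
}

/* Model of a volume remount: every cached page is discarded. */
void tear_down(control_area *ca)
{
    for (size_t i = 0; i < MODEL_PAGES; i++)
        ca->resident[i] = false;
}
```

Because the same control_area is reused by every later mapping of the file, whether from the same process or a different one, only the first touch of a page counts as a paging read until tear_down is called.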

If you do not want this, remount the volume (chkdsk /f is the simplest way). This will tear down all of the volume’s objects, including all CcSc’s and MmCa’s.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Thanks a lot for the replies.

  1. So, from what I understand, if an mmaped page is modified, I should not see an IRP_MJ_WRITE (irrespective of whether it's the first modification or any subsequent modification).

  2. There is a call (FlushViewOfFile) to flush any changes. Each time the modifications are flushed, there should be a lower IRP.

  3. Every WriteFile request should result in a lower IRP.

Is my understanding correct?

Thanks.

> 1) So, from what I understand, if an mmaped page is modified, I should not see an IRP_MJ_WRITE
> (irrespective of whether it's the first modification or any subsequent modification).

Incorrect. Sooner or later it will be flushed to the disk, either under low-memory conditions by MiMappedPageWriter or by an explicit FlushViewOfFile. Both cases will send a paging write IRP to the FSD.

So, on mmaped data update:

  • synchronous paging write on FlushViewOfFile
  • or an async paging write from MiMappedPageWriter at some point in the future.

> 3) Every WriteFile request should result in a lower IRP.

No, not necessarily.

Noncached WriteFile: explicit IRP.
Cached WriteFile: “upper” IRP or FastIo to copy the data to the cache, then at some point in the future a “lower” paging write IRP from CcFlushCache (probably called by the lazy writer).
Mmap update: see above.
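As a rough illustration of the cached and mmap write paths listed above, the following user-mode C sketch models a single page with a dirty bit. All names are hypothetical, and flush stands in for whichever component ends up writing the page (FlushViewOfFile, the lazy writer, or the mapped page writer).

```c
#include <stdbool.h>

/* Toy model of one file page and the requests a filter would see. */
typedef struct {
    bool dirty;        /* modified in memory, not yet written to disk */
    int upper_writes;  /* WriteFile requests seen as IRP or FastIo */
    int paging_writes; /* "lower" paging write IRPs */
} file_model;

/* Cached WriteFile: copies into the cache and marks the page dirty. */
void cached_write(file_model *f)
{
    f->upper_writes++;
    f->dirty = true;
}

/* Store through a mapped view: no request is generated at this time. */
void mapped_store(file_model *f)
{
    f->dirty = true;
}

/* Flush: issues one paging write only if the page is dirty.
 * Returns true if a paging write was issued. */
bool flush(file_model *f)
{
    if (!f->dirty)
        return false;
    f->paging_writes++;
    f->dirty = false;
    return true;
}
```

In this model several modifications of the same page between two flushes result in a single paging write, which is why counting IRP_MJ_WRITE requests does not tell you how many times the data was modified.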


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Thanks again for the reply.
So, in case of fast I/O, will there be an upper “IRP”, or in other words, will the minifilter driver’s
IRP_MJ_READ callback routine be invoked if the request is of type fast I/O? (Assuming that the routine is registered without the flag FLTFL_OPERATION_REGISTRATION_SKIP_CACHED_IO)

Also, can a fast I/O operation fail and result in a lower paging IRP being sent to the disk?

> So, in case of fast I/O, will there be an upper “IRP”, or in other words, will the minifilter driver’s
> IRP_MJ_READ callback routine be invoked if the request is of type fast I/O? (Assuming that the
> routine is registered without the flag FLTFL_OPERATION_REGISTRATION_SKIP_CACHED_IO)

Yes, FltMgr provides such callbacks.

> Also, can a fast I/O operation fail and result in a lower paging IRP being sent to the disk?

FastIo can:

  • say “I cannot” (return FALSE)
  • fail (return TRUE and fill IO status block with failure)
  • succeed (return TRUE and fill IO status block with success)

Only in the first case does the OS create an IRP and send it to the stack as an “upper” (top-level, not paging) IRP.
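The three outcomes can be written down as a small decision function. This is a user-mode sketch in C with hypothetical type and function names. The status check mirrors the usual NT convention that a non-negative status value means success.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of what a FastIo routine reports back. */
typedef struct {
    bool handled;   /* the BOOLEAN returned by the FastIo routine */
    int32_t status; /* IO status block value, valid only if handled */
} fast_io_result;

typedef enum {
    DONE_FAST_SUCCESS, /* completed via FastIo, no IRP is built */
    DONE_FAST_FAILURE, /* failed via FastIo, still no IRP is built */
    RETRY_AS_IRP       /* "I cannot": the OS builds an upper IRP */
} io_outcome;

/* Decide what happens after the FastIo attempt. */
io_outcome after_fast_io(fast_io_result r)
{
    if (!r.handled)
        return RETRY_AS_IRP;
    return r.status >= 0 ? DONE_FAST_SUCCESS : DONE_FAST_FAILURE;
}
```

Note that a failure reported through FastIo is final in this model: only the “I cannot” answer leads to an IRP, and that IRP is an ordinary top-level one, not a paging IRP.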


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com