FsRtlCreateSectionForDataScan

destiny · December 7, 2017, 6:18am

Hi!

I need to read the whole file in the minifilter driver. For now I need it in the post-create callback, but I plan to sometimes do it in the pre-read callback. For Windows 8+ I want to use FltCreateSectionForDataScan, but the driver should work on Windows 7 as well.

As far as I understand after looking at the code of FltCreateSectionForDataScan, it is an extended version of FsRtlCreateSectionForDataScan with an additional IRP queuing mechanism that avoids access problems on the file for which the section is created. I don’t care about this that much, since I actually want to lock the file until I have processed it, so on Windows 7 I would just opt for the FsRtl function. What I am trying to understand is what this line on MSDN means:
“Important The FsRtlCreateSectionForDataScan routine should only be used in cases where a handle to the file object specified in the FileObject parameter has not yet been created (typically while processing a post-create operation)”
It makes me think that I should actually implement the following:

if FltCreateSectionForDataScan is available, use it;
if FO_HANDLE_CREATED is set in the post-create, or if I am in the pre-read, then use ObOpenObjectByPointer + ZwCreateSection on the file object;
if FO_HANDLE_CREATED is not set in the post-create, then use FsRtlCreateSectionForDataScan.
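
The three rules above can be written down as a small decision function. This is only a portable model of the plan, not driver code: the enum, the struct, and ChooseSectionMethod are hypothetical names, and the flag that says whether the Flt routine is available is assumed to be filled in elsewhere (for example by resolving the routine with FltGetRoutineAddress at load time).

```c
#include <stdbool.h>

/* Hypothetical names: they model the decision only and are not WDK types. */
typedef enum {
    SECTION_VIA_FLT_DATA_SCAN,   /* FltCreateSectionForDataScan (Windows 8+) */
    SECTION_VIA_HANDLE,          /* ObOpenObjectByPointer + ZwCreateSection */
    SECTION_VIA_FSRTL_DATA_SCAN  /* FsRtlCreateSectionForDataScan */
} SectionMethod;

typedef struct {
    bool fltRoutineAvailable; /* assumption: resolved once at driver load */
    bool handleCreated;       /* FO_HANDLE_CREATED is set on the file object */
    bool inPostCreate;        /* false means we are in the read path */
} ScanSituation;

/* Mirrors the three rules in the order they are listed in the post. */
SectionMethod ChooseSectionMethod(const ScanSituation *s)
{
    if (s->fltRoutineAvailable) {
        return SECTION_VIA_FLT_DATA_SCAN;
    }
    if (!s->inPostCreate || s->handleCreated) {
        return SECTION_VIA_HANDLE;
    }
    return SECTION_VIA_FSRTL_DATA_SCAN;
}
```

Keeping the choice in one pure function makes it easy to see that the FsRtl routine is only ever reached from post-create with no handle created yet, which is the case the MSDN note describes.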

Does it make sense, or did I miss some scenarios/possibilities?

Thanks!

Scott_Noone_OSR · December 11, 2017, 3:34pm

Yes

FO_HANDLE_CREATED won’t be set in post-create. The handle can’t be created
until the create request completes.

And what’s the point of doing this in pre-read? Why not wait until
post-read?

-scott
OSR
@OSRDrivers

destiny · December 12, 2017, 4:57am

Thanks, Scott.

Ok, good, so you say it is safe to assume this.

Well, it won’t make much of a difference. I assume you would suggest just using Data->Iopb->Parameters.Read in the post-read, but that does not work for me: I need the whole file, while the calling application might read just a portion of it.

Scott_Noone_OSR · December 13, 2017, 8:43am

You’re much better off scanning at time of open instead of trying to do this
in the read path (especially in the case of paging I/O).

-scott
OSR
@OSRDrivers

destiny · December 13, 2017, 9:40am

That is not true for network shares, is it? It would make sense for me to “scan” such a file only after some significant amount of it has been read.

Scott_Noone_OSR · December 13, 2017, 2:56pm

Yes, no, maybe…

It depends on what you’re doing. Are you blocking the read while you scan
the file or are you doing the scan asynchronously to the read? If the former,
then the application is still going to perceive a long delay in accessing
the file whether you block the open or the read. If the latter, then I’m not
as opposed, though whether or not it’s a reasonable design depends on what
you’re trying to do.

-scott
OSR
@OSRDrivers

destiny · December 13, 2017, 6:20pm

The scan is supposed to happen on the “real” access to the file. I am not sure about the details yet, but in the former case the application won’t see any delay if, say, explorer.exe only loads the icon from a 20 MB exe file. The problem is not only the delay, but also the network usage: I don’t want to fully download files that are only slightly touched… So I see it ending up in the pre- (or post-) read callback, where the file object is already there, and I am looking for the best way to work with it. It would be nice to avoid duplicate downloads as much as possible, so that if I download the file for my purposes, the system won’t have to download it again for the application.

Scott_Noone_OSR · December 13, 2017, 10:10pm

To back up a bit…You probably already know this, but there are three types of reads: cached, non-cached, and paging. The first two are lumped into the category of “user I/O” and paging is, er, “paging I/O”. Applications can directly generate cached or non-cached I/O. Paging I/O is sent indirectly as a result of cached I/O (to populate the file cache) or memory mapped I/O.
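
As a rough illustration of the three read types, a filter can tell them apart from the IRP flags of the read (in a minifilter these are in Data->Iopb->IrpFlags). The sketch below is a portable model, not driver code: the flag values are redefined locally to mirror wdm.h so it builds without the WDK, and ReadKind and ClassifyRead are hypothetical names.

```c
#include <stdint.h>

/* Values mirror the IRP flag definitions in wdm.h (to the best of my
 * knowledge); redefined here so the sketch builds without the WDK. */
#ifndef IRP_NOCACHE
#define IRP_NOCACHE               0x00000001u
#define IRP_PAGING_IO             0x00000002u
#define IRP_SYNCHRONOUS_PAGING_IO 0x00000040u
#endif

/* Hypothetical classification matching the three read types above. */
typedef enum { READ_CACHED, READ_NON_CACHED, READ_PAGING } ReadKind;

ReadKind ClassifyRead(uint32_t irpFlags)
{
    /* Check paging first: paging reads normally carry IRP_NOCACHE too. */
    if (irpFlags & (IRP_PAGING_IO | IRP_SYNCHRONOUS_PAGING_IO)) {
        return READ_PAGING;
    }
    if (irpFlags & IRP_NOCACHE) {
        return READ_NON_CACHED; /* user I/O that bypasses the cache */
    }
    return READ_CACHED;         /* ordinary cached user I/O */
}
```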

When a filter creates a data section for a file and accesses it, the pages will be read via paging I/O. The resulting pages will/can be used to populate the file cache OR to satisfy other memory mapped I/O in the future. So, scanning a file this way is optimal because once the scan is done the user’s cached or memory mapped I/O will soft fault in (modulo memory pressure, which might evict the pages).

Getting back to the point of scanning on read…One immediate annoyance is that you’re going to see your own paging reads. So, you need to know not to hold up your own paging I/Os that are generated as a result of holding up I/Os. BUT, you can’t just cheese out and ignore all paging I/Os, because then you’d miss the case where someone reads the file through a memory mapping (e.g. Notepad).

I also always like to think: what would happen if I put my filter above my filter? If a filter beneath you decided that an I/O was meaningful and tried to scan, its paging I/Os would recurse into your filter. Would those paging I/Os make you think the file was being meaningfully accessed, thus cause you to hold them up and try to scan the file? I don’t think this would end well for anyone…

In the interest of simplicity, what about scanning on IRP_MJ_CLEANUP? You could trigger based either on the fact that someone read from the file with user I/O or created a section (in which case you’re more speculative, but it’s in the interest of simplicity).
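
A minimal sketch of that cleanup trigger, as a portable model rather than driver code. StreamScanState and the three functions are hypothetical names; in a real minifilter the state would presumably live in a stream context, and section creation would be observed through IRP_MJ_ACQUIRE_FOR_SECTION_SYNCHRONIZATION.

```c
#include <stdbool.h>

/* Hypothetical per-file state; a real filter would keep it in a context. */
typedef struct {
    bool sawUserRead;      /* a cached or non-cached (non-paging) read */
    bool sawSectionCreate; /* someone created a section on the file */
    bool scanRequested;    /* a scan has already been triggered */
} StreamScanState;

/* Called from the read path; paging reads are deliberately ignored. */
void NoteRead(StreamScanState *st, bool isPagingIo)
{
    if (!isPagingIo) {
        st->sawUserRead = true;
    }
}

/* Called when a section is created on the file. */
void NoteSectionCreate(StreamScanState *st)
{
    st->sawSectionCreate = true;
}

/* Called from IRP_MJ_CLEANUP: returns true at most once per state, and
 * only if the file was read with user I/O or had a section created. */
bool ShouldScanOnCleanup(StreamScanState *st)
{
    if (st->scanRequested) {
        return false;
    }
    if (!st->sawUserRead && !st->sawSectionCreate) {
        return false;
    }
    st->scanRequested = true;
    return true;
}
```

Because paging reads never set the trigger, the filter’s own scan I/O (and that of any filter below it) cannot cause another scan in this model.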

-scott
OSR
@OSRDrivers

destiny · December 14, 2017, 6:18am

Well, I guess in the case of non-cached I/O, which hopefully does not happen too often, I cannot do much about it. Indeed, a non-cached access to a network location would imply that I have absolutely no ability to check whether the file contents have actually changed in the meantime. But I really would be happy to optimize the cached I/O.

I see what you mean by recursive paging I/O problems. I could use the File Context to store the state and let the requests pass through for this file object, but then I would have a problem in case of parallel usage of the file, the worst case being a duplicated handle. I could allow requests only from the current thread or from my usermode service, while pausing all other requests until the processing is finished, but then I might have a problem if an underlying driver wants to do some work with the file in the background before returning control to me, which is theoretically possible, though I cannot think of a real use case [probably some minifilter which would parallelize the reading of the file in case of RAID?]

By the way, as far as I understand, the Flt* functions are not supposed to touch anything above the current layer. In particular, that would imply that I am on the safe side if I use FltCreateSectionForDataScan in the Read callback, right? I think I would go for this option on Win8+ then, but do all the work in the post-create on Win7. Scanning on cleanup is unfortunately too late. Actually, you can think of it as a kind of AV driver.

I don’t see why creating a section should be treated as a special case. Wouldn’t it issue a sequence of read operations with a total read size >= the actual size requested by the application?

Scott_Noone_OSR · December 14, 2017, 9:17am

That’s fair. I’m not trying to be a dick about it, but many FS filter projects die on what appears to be a simple requirement. Just want to make sure that all the variables are considered and, for a security product, the optimizations you end up with fit within your threat model.

It does not save you in this case. If you’re using FltCreateSectionForDataScan to create a section to the user’s file object, then the paging I/O generated will go to the top of the stack. If you opened your own file object with FltCreateFile and then created a section the I/O wouldn’t go to the top.

Scanning in cleanup will be easier. If it’s too late then it’s too late, but in some cases where you’re just trying to be reactive it may be good enough (e.g. if you need to detect that someone is reading files that they’re not supposed to, then you don’t necessarily need to detect it on the read operation).

The standard Windows application pattern for memory mapped file access is:

1. CreateFile()
2. CreateFileMapping()
3. CloseHandle(hFile)
4. MapViewOfFile()
5. Access mapping

Thus, you get the paging reads AFTER the cleanup (Step 3). Or you get no paging reads at all if the data is already in memory.

If you want to stay out of the paging read path, or know every application that is reading the file, then you need to deal with the section creation case somehow.

-scott
OSR
@OSRDrivers

destiny · December 15, 2017, 4:34am

It is sad to realize that no matter what I do, I am screwed. Though I was thinking that the IRP queuing mechanism initiated inside FltCreateSectionForDataScan should take care of this issue. Thanks for your explanation, I will try to think of reasonable trade-offs that I can allow myself, like probably not blocking the file at all after I start scanning it and until I finish the scan.

The only abstract question I still have is what is in general better for the file object for which a handle has already been created: FltCreateSectionForDataScan or ObOpenObjectByPointer + ZwCreateSection. I guess the first, since inside it actually does more or less the second + some checks, while the second might be screwed in many ways, like for example if a parallel thread closes the handle.

Oh, I actually didn’t know one can close the file handle before mapping the file, interesting!

destiny · December 15, 2017, 5:06am

By the way, just a random crazy idea: what if I just copy the FILE_OBJECT structure into another memory location and pass it to FltCreateSectionForDataScan… I should not do this, right?

MBond · December 15, 2017, 7:33pm

Correct

The design of Windows as an OS does not allow for the possibility of the type of scanning that you propose. It can be done of course, but it is not a trivial exercise as you see.

Re the file mapping object, remember that it has a handle (and a reference) too. While that handle is open, the underlying objects in KM must stay open too. This may seem trivial, but it gets much more complex when you consider handle inheritance for child processes. Handle inheritance, especially unintentional inheritance, can have important side effects for you, as the sequence of events you see on your file object may be altered. A process may live for a long time holding a handle that it does not even know about, which prevents cleanup or even affects the ability of other processes to operate properly. Clearly this sort of effect would complicate your decision as to when a meaningful read of data has occurred.

CloseHandle on another thread from UM while you are processing an IO request should not screw with you in any case. Before the object can be destroyed, all of the references must be released, including all of the pending IO.
