
MmForceSectionClosed when both DataSectionObject and ImageSectionObject exist

Alan_Adams-2 Member Posts: 46
If both DataSectionObject and ImageSectionObject exist, it appears
that the intentional behavior of MmForceSectionClosed is to "only act
against ImageSectionObject." How can I most intelligently add the
DeleteOnClose flag to the DataSectionObject, at a time when the
ImageSectionObject also exists?

The intention is as you might expect: to get IRP_MJ_CLOSE as soon as
the reference counts from the mappings hit zero. A call to
MmForceSectionClosed achieves this when only one or the other of the
sections exists, but not when both exist at the same time.

Alan Adams
Client for Open Enterprise Server
Micro Focus
xxxxx@microfocus.com

Comments

  • Alan_Adams-2 Member Posts: 46
    > If both DataSectionObject and ImageSectionObject exist, it appears
    > that the intentional behavior of MmForceSectionClosed is to "only act
    > against ImageSectionObject."

    I wrote that backwards; when both sections are present, it's
    DataSectionObject that is acted on by MmForceSectionClosed, not
    ImageSectionObject.

    Alan Adams
    Client for Open Enterprise Server
    Micro Focus
    xxxxx@microfocus.com
  • Mike_Boucher Member - All Emails Posts: 13
    Will there be a race condition? If you start when you see that the
    reference counts are zero, can an intervening event occur that raises a
    count before your operation finishes?

  • Alan_Adams-2 Member Posts: 46
    > Will there be a race condition? If you start when you see that the
    > reference counts are zero, can an intervening event occur that raises a
    > count before your operation finishes?

    Thanks for jumping in.

    I expect the simple answer to your question is "Yes", but I believe
    that's MmForceSectionClosed's problem to deal with and not mine. He
    has his own set of locks that he takes when deciding whether or not to
    clean up and remove a pointer from the SECTION_OBJECT_POINTERS that
    was passed, which I expect addresses the race condition described.

    From the file system driver's perspective (actually a network
    redirector in our case), we don't "know" what the reference counts are
    to begin with. That's opaque nt!_CONTROL_AREA business, which you
    might know if debugging and using !ca. And whether the reference
    count is zero isn't really the key; we just care that the DeleteOnClose
    flag is added even if the reference count is not yet zero.

    After continued debugging today, although the question perhaps still
    stands for "how could I cause DeleteOnClose to be added to both
    sections when both sections are present", I think the root cause of my
    issue is actually more about "when" the sections become populated,
    rather than "I needed MmForceSectionClosed to mark both sections with
    DeleteOnClose."

    What currently happens is this:

    1. User mode application calls LoadLibrary.

    2. Windows loader opens FILE_OBJECT to DLL.

    3. Windows loader maps ImageSectionObject for this FILE_OBJECT/FCB.

    4. IRP_MJ_CLEANUP issued for this FILE_OBJECT, as Windows loader
    closes handle (presuming he's the only/last one).

    5. During IRP_MJ_CLEANUP our redirector invokes MmForceSectionClosed,
    which just marks the ImageSectionObject as DeleteOnClose, since the
    image mapping references are still non-zero at this time.

    6. A third-party kernel-mode caller invokes
    FsRtlCreateSectionForDataScan using the same FILE_OBJECT as step 3.

    7. DataSectionObject is now established on this FILE_OBJECT/FCB, in
    addition to the ImageSectionObject that already existed.

    8. A series of IRP_MJ_READs are issued as paging I/O, as the
    FsRtlCreateSectionForDataScan caller page-faults their way through the
    mapped file.

    9. The FsRtlCreateSectionForDataScan caller correctly closes the
    Section object handle, and ObDereferenceObject's the Section object
    returned. (Verified via !obtrace.)

    10. At some point later, the user mode application calls FreeLibrary.

    11. The ImageSectionObject reaches zero references, and is destroyed
    due to presence of the DeleteOnClose flag previously applied.

    12. Although the FsRtlCreateSectionForDataScan caller has
    de-referenced the Section they created, the DataSectionObject control
    area remains, because DeleteOnClose was never applied to that section.

    13. IRP_MJ_CLOSE for this FILE_OBJECT "never" comes (until memory
    manager is ready to destroy on his own terms, presumably), because the
    outstanding DataSectionObject keeps the FILE_OBJECT referenced.
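
    The timeline above can be replayed in a small user-mode C model. This
    is only an illustrative sketch: it is not kernel code, the types and
    model_* names are hypothetical, and the rule that the data section is
    acted on when both sections exist is taken from the behavior reported
    earlier in this thread, not from documentation.

```c
/* Toy user-mode model of steps 3-13; NOT kernel code.
   Every type and function name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool exists;          /* control area is currently allocated */
    int  refs;            /* outstanding section/mapping references */
    bool delete_on_close; /* set by the modeled force-close call */
} ControlArea;

typedef struct {
    ControlArea data;  /* stands in for DataSectionObject */
    ControlArea image; /* stands in for ImageSectionObject */
} SectionPointers;

/* Assumption from this thread: when both sections exist, only the data
   section is acted on; otherwise whichever one exists is acted on. */
static void model_force_section_closed(SectionPointers *sop)
{
    ControlArea *ca = sop->data.exists  ? &sop->data
                    : sop->image.exists ? &sop->image
                    : NULL;
    if (ca == NULL)
        return;
    if (ca->refs == 0)
        ca->exists = false;         /* nothing in use: tear down now */
    else
        ca->delete_on_close = true; /* in use: tear down on last deref */
}

static void model_reference(ControlArea *ca)
{
    ca->exists = true;
    ca->refs++;
}

static void model_dereference(ControlArea *ca)
{
    assert(ca->refs > 0);
    if (--ca->refs == 0 && ca->delete_on_close)
        ca->exists = false;
}

/* In this model, IRP_MJ_CLOSE cannot arrive while any control area remains. */
static bool model_file_still_referenced(const SectionPointers *sop)
{
    return sop->data.exists || sop->image.exists;
}

/* Replays the timeline; returns true if the file is still held at the end. */
static bool model_replay_timeline(void)
{
    SectionPointers sop = {0};
    model_reference(&sop.image);      /* step 3: loader maps the image */
    model_force_section_closed(&sop); /* step 5: only the image is marked */
    model_reference(&sop.data);       /* steps 6-7: data scan section created */
    model_dereference(&sop.data);     /* step 9: scanner drops its section */
    model_dereference(&sop.image);    /* steps 10-11: image torn down */
    return model_file_still_referenced(&sop); /* steps 12-13 */
}
```

    In this model the replay ends with the file still referenced, because
    the force-close call ran before the data section existed. Making the
    same call after the data section appears lets the last dereference
    tear it down.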

    So it seems like really, I need to start calling MmForceSectionClosed
    in the IRP_MJ_READ paging I/O path, because that's the first and
    "only" opportunity I have to see that DataSectionObject on this same
    FILE_OBJECT has become populated by FsRtlCreateSectionForDataScan.

    Some more context might be appropriate: We are a legacy file system
    redirector. Not a filter, and not mini-/RDBSS-based. The customer
    context is Windows 7 SP1, but we see the same in Windows 10 1703. For
    what it's worth we also have no Windows Cache Manager interaction;
    SharedCacheMap is always NULL in our SECTION_OBJECT_POINTERS.

    Finally, the customer issue is that the network-based .DLL file is
    being held open "indefinitely" even though the application process has
    completely terminated.

    We now understand this to be a side effect of the kernel-mode
    FsRtlCreateSectionForDataScan caller that established the
    DataSectionObject: this additional section never gets marked as
    DeleteOnClose by the MmForceSectionClosed call our redirector was
    already making in IRP_MJ_CLEANUP, because the DataSectionObject was not
    yet established at that time.

    Alan Adams
    Client for Open Enterprise Server
    Micro Focus
    xxxxx@microfocus.com
  • Scott_Noone_(OSR) Administrator Posts: 3,021
    Attempting to do the purge in the paging read path isn't really a fix.
    Nothing says that you're even going to receive paging reads in this case as
    the data could already be present in memory.

    There's a vague warning about calling FsRtlCreateSectionForDataScan on file
    objects post-cleanup:

    "Important: The FsRtlCreateSectionForDataScan routine should only be used in
    cases where a handle to the file object specified in the FileObject
    parameter has not yet been created (typically while processing a post-create
    operation). If the driver has a handle to the file object or can obtain a
    handle to the file object, the driver should use the ZwCreateSection routine
    instead."

    Can you determine the context in which FsRtlCreateSectionForDataScan is
    being called? A write access breakpoint on the control area should tell you
    (assuming this is reproducible enough).

    Two other questions:

    1. What failure are you ultimately seeing from this? I'm just curious what
    the manifestation is that would cause your users/customers to complain?

    2. Can you reproduce the behavior with the third party product and another
    file system?

    -scott
    OSR
    @OSRDrivers


  • Alan_Adams-2 Member Posts: 46
    Hello Scott. Thanks for the additional input.

    > Attempting to do the purge in the paging read path isn't really a fix.
    > Nothing says that you're even going to receive paging reads in this case as
    > the data could already be present in memory.

    Completely agreed. If the FsRtlCreateSectionForDataScan caller never
    actually faults in any file content, the scenario "1 through 13" I
    described would be back to not being able to force the
    DataSectionObject closed.

    The IRP_MJ_READ path just appears to be the only synchronous execution
    opportunity our redirector currently receives in the customer
    scenario, where we even /could/ call MmForceSectionClosed at a time
    when the DataSectionObject is populated. We are /sometimes/ seeing
    FileBasicInformation being queried against that same FILE_OBJECT, but
    not as consistently as the IRP_MJ_READ.

    > Can you determine the context in which FsRtlCreateSectionForDataScan is
    > being called?

    We believe that the FsRtlCreateSectionForDataScan caller's premise is
    correct and legitimate. We expect they are filtering IRP_MJ_CREATE
    and want to perform data inspection at a time when handles cannot yet
    be created, because the IRP_MJ_CREATE processing has not completed
    from the Windows I/O manager perspective.

    > 1. What failure are you ultimately seeing from this? I'm just curious what
    > the manifestation is that would cause your users/customers to complain

    So far, just the symptom of "the application-specific .DLL file is
    being held open across the network, even though we have completely
    exited the application and no process using that DLL remains running."

    The customer is also having a sporadic "attempting to re-open the
    application fails", but that's not been proven to be related to this
    DataSectionObject behavior (yet).

    The "files are being held open 100% of the time after we exit the
    application" is just the first behavior they noticed while attempting
    to investigate the failure. Which seems like a reasonable expectation
    and cause for the customer's concern, given that this behavior didn't
    happen until & unless the third-party FsRtlCreateSectionForDataScan
    caller is also present.

    But no, within just the "DataSectionObject control area remains unless
    we're able to force it closed with MmForceSectionClosed", there is no
    overt "failure" occurring from that scenario. Only the underlying
    network file handle(s) being left open, which are subjectively "not
    supposed to still be open."

    If some other client workstation wanted exclusive access to those
    files (as opposed to just shared read-only-execute), the fact that a
    workstation "holds those files open indefinitely, until the local
    workstation's memory manager decides it's appropriate to free up the
    control area" is probably the best way to cast this current behavior
    in the light of "being a problem."

    > 2. Can you reproduce the behavior with the third party product and
    > another file system?

    I haven't proven that yet; it's a scenario I was going to look at if
    we needed to tell the customer "this is just the way it's going to be"
    and assuage their concerns by demonstrating how it's happening with
    other redirectors, too. For a local file system, probably nobody
    cares if this is happening.

    Not sure what I would see from MRxSMB handling this situation across
    the wire. One thought I have along those lines is that any file
    system that uses Windows Cache Manager would have "an extra excuse"
    for the file still being open even after the application exited.
    Which, since we don't ever CcInitializeCacheMap in our file system,
    ostensibly doesn't apply in our case.

    Alan Adams
    Client for Open Enterprise Server
    Micro Focus
    xxxxx@microfocus.com
  • Alan_Adams-2 Member Posts: 46
    There was also a good suggestion to try
    FsRtlRegisterFileSystemFilterCallbacks and the
    PostReleaseForSectionSynchronization callback, to provide a more
    definitive opportunity to use MmForceSectionClosed to mark the created
    section with DeleteOnClose, rather than depending on the IRP_MJ_READ
    paging I/O path getting invoked.

    But what I've encountered when registering for
    FsRtlRegisterFileSystemFilterCallbacks from our legacy file system
    driver (network redirector) is that although I do receive the "Pre"
    callbacks for AcquireForSectionSync and ReleaseForSectionSync, we do
    not receive the corresponding "Post" callbacks.

    If I register for FsRtlRegisterFileSystemFilterCallbacks from a legacy
    FILTER driver attached to the same stack, I receive both the "Pre" and
    the "Post" filters for the same FILE_OBJECTs that the underlying
    network redirector driver only receives "Pre" callbacks for.

    As though nt!FsFilter* has some reason to think that the "Post"
    callbacks shouldn't be sent to the registrant who had a DEVICE_OBJECT
    that wasn't for a filter. I have in fact NULL'd out the
    FAST_IO_DISPATCH entries that correspond to the callbacks (e.g.
    ReleaseFileForNtCreateSection), so that there shouldn't be duplication
    or confusion on that front.

    Just wanted to mention the "no Post callbacks are received when
    registered for FsRtlRegisterFileSystemFilterCallbacks from the actual
    file system driver" behavior, in case someone has any experience with
    that, or knows why it would actually be by design, etc.

    Alan Adams
    Client for Open Enterprise Server
    Micro Focus
    xxxxx@microfocus.com
  • Scott_Noone_(OSR) Administrator Posts: 3,021
    I've never registered this post callback from a file system, so I can't
    comment on the behavior you're seeing from experience.

    However, the idea with the PostReleaseForSectionSynchronization callback is
    that it's called immediately after the file system returns from
    AcquireFileForNtCreateSection. This doesn't actually do you any good because
    you already know when you're at the end of the AcquireFileForNtCreateSection
    (it's your code :)). Also note that all of this happens before the section
    is actually created, so I don't think this actually helps you at all.

    Thinking back to your earlier description, if the filter above you is really
    calling the data scan API from PostCreate then you should see another
    IRP_MJ_CLEANUP happen at some point when the corresponding HANDLE is closed.
    Presumably you're not seeing this?

    -scott
    OSR
    @OSRDrivers


  • Scott_Noone_(OSR) Administrator Posts: 3,021
    >However, the idea with the PostReleaseForSectionSynchronization callback is
    >that it's called immediately after the file system returns from
    >AcquireFileForNtCreateSection.

    Sorry, disregard this...I read "PostReleaseForSectionSynchronization" as
    "PostAcquireFor...", was clearly not paying close enough attention.

    However, the point of you being a file system stands. The filter callbacks
    wrap around the calls into the file system, if you're the file system then
    you already know when these things are happening. The filter callbacks can
    be used and are useful to file systems for other reasons, but they don't
    tell you anything you don't already know.

    -scott
    OSR
    @OSRDrivers


  • Alan_Adams-2 Member Posts: 46
    > However, the point of you being a file system stands. The filter callbacks
    > wrap around the calls into the file system, if you're the file system then
    > you already know when these things are happening. The filter callbacks can
    > be used and are useful to file systems for other reasons, but they don't
    > tell you anything you don't already know.

    Oh no. I guess this means my next question is going to sound very
    dumb: Why would I know? The fact "the file system should already
    know" has come up in other discussion too, but I have yet to
    understand that point.

    To my knowledge, I'm not the one wanting to create the section, nor
    the one who does create the section. Aside from the fact our FCB must
    provide the SECTION_OBJECT_POINTERS storage which will be /used/ for
    managing the sections (by code that is NOT my file system driver), I'm
    not specifically aware that the FILE_OBJECT is being used for that
    purpose, except to infer it by inspecting the current state of the
    SECTION_OBJECT_POINTERS.

    So I'm apparently missing a big and basic piece of the
    puzzle as to why the file system driver will already know when a
    section is being created. But that certainly would fit with why
    nt!FsFilter* may be intentionally deciding the file system would not
    need to receive this callback.

    > Thinking back to your earlier description, if the filter above you is really
    > calling the data scan API from PostCreate then you should see another
    > IRP_MJ_CLEANUP happen at some point when the corresponding HANDLE is closed.
    > Presumably you're not seeing this?

    We do see an IRP_MJ_CLEANUP when the handle the Windows loader had
    opened is closed. At that point only the ImageSectionObject exists.
    So you're right, that probably does mean "it's not a straightforward
    blocking operation in PostCreate", else the DataSectionObject would
    have already been visible at the time of the IRP_MJ_CLEANUP, too.

    They do end up performing the FsRtlCreateSectionForDataScan with the
    same FILE_OBJECT we received IRP_MJ_CLEANUP for. But I suppose it's
    possible they're just scheduling something from PostCreate, rather
    than actually blocking and waiting there. Since the intention of the
    third-party relates to malware detection, I was just assuming they
    would want the ability to fail that create.

    Alan Adams
    Client for Open Enterprise Server
    Micro Focus
    xxxxx@microfocus.com
  • Scott_Noone_(OSR) Administrator Posts: 3,021
    FsRtlCreateSectionForDataScan calls the file system at the
    AcquireFileForNtCreateSection Fast I/O entry point. It then calls
    MmCreateSection to create the section, then calls the file system at the
    ReleaseFileForNtCreateSection Fast I/O entry point. There are filter
    callbacks around these as well, so the full sequence would be:

    FsRtlCreateSectionForDataScan

    ->FsFilter Callbacks for PreAcquireForSectionSynchronization

    ->File system's AcquireFileForNtCreateSection

    ->FsFilter Callbacks for PostAcquireForSectionSynchronization

    ->MmCreateSection

    ->FsFilter Callbacks for PreReleaseForSectionSynchronization

    ->File system's ReleaseFileForNtCreateSection

    ->FsFilter Callbacks for PostReleaseForSectionSynchronization

    You can see the result with an example.

    NTFS uses the FsFilter callback for PreAcquire and uses the Fast I/O entry
    point for release. Stopped at a call to FsRtlCreateSectionForDataScan, we
    have a SectionObjectPointer with just an ImageSectionObject:

    0: kd> r
    nt!FsRtlCreateSectionForDataScan:
    fffff800`02bb3480 488bc4 mov rax,rsp

    0: kd> ?? ((nt!_file_object *)@r9)->SectionObjectPointer
    struct _SECTION_OBJECT_POINTERS * 0xfffffa80`1b528ea8
    +0x000 DataSectionObject : (null)
    +0x008 SharedCacheMap : (null)
    +0x010 ImageSectionObject : 0xfffffa80`1bd23280 Void

    // Set some breakpoints and go
    0: kd> bp Ntfs!NtfsFilterCallbackAcquireForCreateSection
    0: kd> bp Ntfs!NtfsReleaseForCreateSection
    0: kd> g

    Breakpoint 1 hit
    Ntfs!NtfsFilterCallbackAcquireForCreateSection:
    fffff880`010d05d0 48895c2408 mov qword ptr [rsp+8],rbx

    // Hit acquire, still no section...
    0: kd> ?? ((nt!_fs_filter_callback_data *)@rcx)->FileObject->SectionObjectPointer
    struct _SECTION_OBJECT_POINTERS * 0xfffffa80`1b528ea8
    +0x000 DataSectionObject : (null)
    +0x008 SharedCacheMap : (null)
    +0x010 ImageSectionObject : 0xfffffa80`1bd23280 Void

    0: kd> g
    Breakpoint 2 hit
    Ntfs!NtfsReleaseForCreateSection:
    fffff880`01042e00 fff3 push rbx

    // DataSectionObject populated by the time we hit the release
    0: kd> ?? ((nt!_file_object *)@rcx)->SectionObjectPointer
    struct _SECTION_OBJECT_POINTERS * 0xfffffa80`1b528ea8
    +0x000 DataSectionObject : 0xfffffa80`1b587530 Void
    +0x008 SharedCacheMap : (null)
    +0x010 ImageSectionObject : 0xfffffa80`1bd23280 Void

    So, the file system knows that someone is trying to create a section to the
    stream specified by the file object. This file object might not end up
    backing the section (e.g. if a section already existed there would already
    be a file object), but the file system is involved in the operation.
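
    The ordering described above, and what the debugger session shows at
    each stop, can be summarized in a toy user-mode C model. This is a
    sketch only: it is not kernel code, all names are hypothetical, and it
    assumes the simple case traced above, where no data section existed
    beforehand.

```c
/* Toy user-mode model of the call order; NOT kernel code.
   Every type and function name here is hypothetical. */
#include <stdbool.h>
#include <stddef.h>

enum Step {
    PRE_ACQUIRE,       /* FsFilter PreAcquireForSectionSynchronization */
    FS_ACQUIRE,        /* file system's AcquireFileForNtCreateSection */
    POST_ACQUIRE,      /* FsFilter PostAcquireForSectionSynchronization */
    MM_CREATE_SECTION, /* memory manager creates the section */
    PRE_RELEASE,       /* FsFilter PreReleaseForSectionSynchronization */
    FS_RELEASE,        /* file system's ReleaseFileForNtCreateSection */
    POST_RELEASE,      /* FsFilter PostReleaseForSectionSynchronization */
    STEP_COUNT
};

typedef struct {
    bool data_section_present;   /* stands in for DataSectionObject != NULL */
    enum Step trace[STEP_COUNT]; /* order in which the steps ran */
    bool seen_at[STEP_COUNT];    /* data_section_present on entry to each step */
    size_t n;                    /* number of steps recorded so far */
} ScanModel;

static void model_record(ScanModel *m, enum Step s)
{
    m->seen_at[s] = m->data_section_present;
    m->trace[m->n++] = s;
}

/* Walks the sequence once, as for a single data scan section request. */
static void model_create_section_for_data_scan(ScanModel *m)
{
    model_record(m, PRE_ACQUIRE);
    model_record(m, FS_ACQUIRE);
    model_record(m, POST_ACQUIRE);
    model_record(m, MM_CREATE_SECTION);
    m->data_section_present = true; /* populated once creation completes */
    model_record(m, PRE_RELEASE);
    model_record(m, FS_RELEASE);
    model_record(m, POST_RELEASE);
}
```

    The point for a file system is that its own release entry point is the
    first place in this sequence where it runs with the data section
    already populated.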

    -scott
    OSR
    @OSRDrivers


  • Alan_Adams-2 Member Posts: 46
    > FsRtlCreateSectionForDataScan calls the file system at the
    > AcquireFileForNtCreateSection Fast I/O entry point. It then calls
    > MmCreateSection to create the section, then calls the file system at the
    > ReleaseFileForNtCreateSection Fast I/O entry point.

    Thanks Scott, got it. I was incorrectly interpreting that there was
    some actual active participation in the section creation process.

    As part of registering for FsRtlRegisterFileSystemFilterCallbacks, I
    was explicitly NULLing out the AcquireFileForNtCreateSection and
    ReleaseFileForNtCreateSection Fast I/O entry points. The legacy
    FILTER (not our file system / redirector) had code notes advising to
    do this, because asking for both the FsFilter callback and the Fast
    I/O callback had led to trouble.

    It was really a moot point for our file system / redirector driver,
    because we didn't provide Fast I/O support anyway.

    So ultimately the bottom line here is "your full file system driver /
    redirector already had access to a callback notification that would
    have told you about section creation, without needing to register for
    the more granular FsFilter callbacks instead." Not that the FsFilter
    callbacks "should" have been problematic, but apparently as a full
    file system driver, they're probably not the right choice.

    Thanks for clarifying, and I'll take the approach of implementing the
    Fast I/O callbacks instead.

    Alan Adams
    Client for Open Enterprise Server
    Micro Focus
    xxxxx@microfocus.com