What does it mean when the TargetFileObject or FileObject in a pre-create callback is on the kernel stack?

I was looking at this Microsoft sample:

https://github.com/microsoft/Windows-driver-samples/blob/main/filesys/miniFilter/avscan/filter/avscan.c

And saw this:

    //
    //  Stack file objects are never scanned.
    //

    IoGetStackLimits( &stackLow, &stackHigh );

    if (((ULONG_PTR)FltObjects->FileObject > stackLow) && 
        ((ULONG_PTR)FltObjects->FileObject < stackHigh)) {

        return FLT_PREOP_SUCCESS_NO_CALLBACK;
    }

And this:

    //
    //  Stack file objects are never scanned.
    //

    PFILE_OBJECT FileObject = Data->Iopb->TargetFileObject;
    IoGetStackLimits( &stackLow, &stackHigh );

    if (((ULONG_PTR)FileObject > stackLow) &&
        ((ULONG_PTR)FileObject < stackHigh)) {

        return FLT_PREOP_SUCCESS_NO_CALLBACK;
    }

So my question is: what does it mean for FltObjects->FileObject or Data->Iopb->TargetFileObject to be on the stack? What does that have to do with scanning the file?

Back in the days of XP (or possibly even NT4 SP2 - maybe even 3.7) someone thought it would be a cute way of saving nanoseconds and NPP(*) to allocate a FO on the stack to streamline some operations (I am sure someone out there will remember which but I would guess it was in a fastio path equivalent to the stat(3) callbacks we now have).

This worked really well until file system filters (no mini filters in those days) started doing real object manipulations on them like referencing them, and dereferencing them later. At that stage the dereference would turn into a decrement of a random bit of stack which would usually cause a series of difficult to diagnose crashes.
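
To make that failure mode concrete, here is a small user-mode model of it. Everything in it is hypothetical (FakeStack, write_slot, take_reference and so on are names invented for this sketch, and a byte array stands in for the kernel stack); it only illustrates how a late dereference ends up decrementing whatever now occupies the slot where the stack file object used to be.

```c
#include <stdint.h>
#include <string.h>

#define FAKE_STACK_BYTES 64

/* Hypothetical stand-in for a thread's kernel stack: just bytes. */
typedef struct {
    unsigned char bytes[FAKE_STACK_BYTES];
} FakeStack;

/* Overwrite the 32-bit slot at the given byte offset. */
void write_slot(FakeStack *stack, size_t offset, int32_t value)
{
    memcpy(&stack->bytes[offset], &value, sizeof value);
}

/* Read the 32-bit slot at the given byte offset. */
int32_t read_slot(const FakeStack *stack, size_t offset)
{
    int32_t value;
    memcpy(&value, &stack->bytes[offset], sizeof value);
    return value;
}

/* Model of a filter referencing the object: bump the count in place. */
void take_reference(FakeStack *stack, size_t offset)
{
    write_slot(stack, offset, read_slot(stack, offset) + 1);
}

/* Model of the later dereference: decrement whatever is at that address. */
void drop_reference(FakeStack *stack, size_t offset)
{
    write_slot(stack, offset, read_slot(stack, offset) - 1);
}
```

If the slot is reused for an unrelated value between take_reference and drop_reference (which is what happens once the function that owned the stack object returns), the decrement quietly corrupts that value instead of a reference count.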

So people got into the habit of saying ‘if this FO is on the stack keep clear’. Hence that code. I haven’t seen that since Vista or maybe Win7 but, given the difficulty of debugging this sort of crash, the code hangs around.
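
If you want the "keep clear" test in one place, a minimal sketch of the range check looks like this. The helper names are made up for illustration and are not part of any WDK header; in a minifilter the two limits would come from IoGetStackLimits, as in the sample, and the pointer would be FltObjects->FileObject or Data->Iopb->TargetFileObject.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when addr lies strictly between
 * low and high, the same exclusive comparison the avscan sample makes
 * against the limits reported by IoGetStackLimits.
 */
bool address_in_stack_range(uintptr_t addr, uintptr_t low, uintptr_t high)
{
    return (addr > low) && (addr < high);
}

/* Convenience wrapper for a pointer such as a file object. */
bool pointer_in_stack_range(const void *ptr, uintptr_t low, uintptr_t high)
{
    return address_in_stack_range((uintptr_t)ptr, low, high);
}
```

A pre-operation callback would return FLT_PREOP_SUCCESS_NO_CALLBACK when this kind of check is true and carry on as normal otherwise.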

It might be interesting to put in a PR to pull that code and see if it gets accepted; that would be a pretty clear indication of whether the stack-FO behaviour really has been expunged.

R

(*) Remember NT had to boot in 64MB of physical memory and run on what would now appear to be slow processors, so allocating pool had a real and (critically) measurable cost. Hence tricks like this had real value and were done within a reasonably ‘rigorous’ engineering process (his initials were DC and it might even have been his idea), so it’s not as daft as it might sound now, and certainly less strange than some things you see in an active kernel these days.

But should I also add this check in my minifilter callbacks? So far I have never used it and never had any problems, which makes me wonder why Microsoft would write this code in their new projects, considering the problem seems to have existed pre-Vista and minifilters are Vista+. Why would I put workaround code for a pre-Vista problem in a Vista+ project?

Why would I put workaround code for a pre-Vista problem in a Vista+ project?

I’d imagine cut and paste programming. Me? I’d add it, but I’d also put in a PR to remove it from the sample and base my behavior on whether the PR was accepted.

There are two things that came to mind:

  • Kernel stacks can be swapped out unless the thread is in a critical/guarded region, so waiting on a stack FILE_OBJECT can easily mean a page fault. I didn’t think through exactly how it would be an issue; it just came to mind.
  • The FILE_OBJECT structure is pretty heavy, kernel-stack-wise (216 bytes is a lot to place on a kernel stack willy-nilly), but that is more likely to have been taken care of by querying the remaining stack size.

I only recently noticed that a ZwCreateFile call can use up roughly 6KB of kernel stack space in a pretty common call path (i.e. not in a corner case, but in everyday use) on Win 11, and none of the calls in that path, until you reach FltMgr/NTFS, checks how much kernel stack is left.
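
To show what I mean by checking how much stack is left, here is a rough user-mode sketch of the arithmetic. The function names and the reserve parameter are my own invention for illustration; in a driver you would ask the kernel via IoGetRemainingStackSize rather than compute it yourself, and the sketch assumes a stack that grows downward towards its low limit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: on a downward-growing stack, the space left is
 * the distance from the current position down to the low limit.
 */
size_t remaining_stack_bytes(uintptr_t current, uintptr_t low_limit)
{
    return (current > low_limit) ? (size_t)(current - low_limit) : 0;
}

/*
 * Returns true when 'needed' bytes fit while still leaving 'reserve'
 * bytes untouched for whatever gets called further down the path.
 */
bool stack_has_room(uintptr_t current, uintptr_t low_limit,
                    size_t needed, size_t reserve)
{
    size_t left = remaining_stack_bytes(current, low_limit);

    if (left < reserve) {
        return false;
    }
    return (left - reserve) >= needed;
}
```

With 8KB left, for example, a 6KB call path fits if you only insist on a 1KB reserve, but not if you want 4KB kept back.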

Cheers, Deja.