Hi.
I have noticed the Ntfs driver changes the owner thread pointer of the main resource of a file control block. The driver modifies the address of the owner thread within the NtfsNonCachedIo function so that the lowest two bits of the address are set to 1.
I encountered this while debugging a legacy file system filter driver. The filter driver got the Ntfs driver’s Fcb from a file object and acquired the main resource of the Fcb shared. Then the filter driver called CcCopyRead on the file object, which caused a page fault. The page fault was serviced by the underlying Ntfs driver, which also acquired the resource shared and then modified its owner thread. After CcCopyRead returned, the filter driver tried to release the resource, which failed because the recorded owner thread was different from the current thread (bugcheck 0xE3, RESOURCE_NOT_OWNED).
This problem occurs on Windows 7; the filter driver works fine on earlier versions of Windows up to Windows XP.
I am really curious why the Ntfs driver does this. Does it change the owner thread’s address to store some extra information, does it change it to prevent evil filter drivers from locking the resources of its file control blocks, or is there some other reason for it?
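To make the observation above concrete, here is a minimal user-mode sketch in C. It is not kernel code, and `model_thread`, `tag_owner`, `is_tagged_owner` and `owner_matches_thread` are hypothetical names invented for illustration; none of them belong to NTFS or the Ex routines. It only shows that setting the two lowest bits of an aligned address produces a value that no longer compares equal to the thread's address, which is the kind of mismatch behind the failed release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel thread object; only its address matters. */
typedef struct { int id; } model_thread;

/* Set the two lowest bits of a pointer value, as described in the post.
   Assumes the pointer is at least 4-byte aligned, so both bits start clear. */
uintptr_t tag_owner(const void *p)
{
    assert(((uintptr_t)p & 0x3) == 0);
    return (uintptr_t)p | (uintptr_t)0x3;
}

/* An aligned object address never has both low bits set, so a tagged
   owner value can be told apart from a plain thread address. */
bool is_tagged_owner(uintptr_t owner)
{
    return (owner & 0x3) == 0x3;
}

/* Models the ownership check made at release time: the recorded owner
   must equal the address of the releasing thread. */
bool owner_matches_thread(uintptr_t owner, const model_thread *t)
{
    return owner == (uintptr_t)t;
}
```

In this model, an owner value recorded as the thread's address matches at release time, while the same value passed through `tag_owner` does not, even though it was derived from the same thread.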
In my opinion, when to call CcCopyRead and which resources to acquire should be left to the file system owning the file object, which has the best knowledge of the caching state. A filter should use a higher-level API to read data. The fact that it “worked” in XP does not mean it is correct, as the bugcheck you observe in Windows 7 now shows.
Lijun
From: “xxxxx@gmail.com”
To: Windows File Systems Devs Interest List
Sent: Tue, November 24, 2009 6:51:46 AM
Subject: [ntfsd] Ntfs driver modifies resource’s owner thread
Lijun Wang wrote:
> In my opinion, when to call CcCopyRead and acquire whatever resources
> should be left to the file system owning the file object who has the
> best knowledge of the caching state. A filter should use higher level
> API to read data. The fact it “worked” in XP does not mean it is correct
> – as indicated now by the bugcheck in Windows 7 you observe.
I don’t think I can reinforce this strongly enough.
File resources are not intended to be public things that filters can
acquire. Trying to do this requires intimate knowledge of exactly how
the filesystem will use its locks.
I don’t know exactly what this filter is trying to achieve. I do know
that in NTFS, acquiring main and calling CcCopyRead has always been
unsafe on user files, since it is not robust against collided page fault
deadlocks (my favorite topic!). The fact that Win7 deadlocks is just an
artifact of Cc/Mm issuing async reads, which were previously sync.
Why is this filter trying to muck with a filesystem-owned cache directly
anyway? If it wants a cached read, it can perform a cached read (which
really will work).
xxxxx@gmail.com wrote:
> I am really curious why the ntfs driver does this. Does it change the
> owner thread’s address to store some extra information or does it change
> it to prevent evil filter drivers from locking resources of its file control
> blocks or is there any other reason for it?
Filesystems have always reassigned resource ownership for async IO.
They need to hold a resource to guard against truncates, but it is
important that the issuing thread is also not able to initiate a
truncate. See the fastfat sample for how and when this happens
(ExSetResourceOwnerPointer.)
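A rough user-mode model of that handoff, in C, may help. This is a sketch under stated assumptions and not the fastfat or NTFS code: `model_resource`, `model_io_context`, `issue_async_read`, `thread_owns` and `complete_async_read` are hypothetical names, the resource has a single owner slot with no shared counts, and the release by owner value only loosely mirrors what ExReleaseResourceForThreadLite does in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins: a thread, a one-slot resource, an async I/O context. */
typedef struct { int id; } model_thread;
typedef struct { uintptr_t owner; } model_resource;
typedef struct {
    model_resource *resource;
    uintptr_t owner_token;   /* tagged value that completion will present */
} model_io_context;

/* Issue path: acquire as the issuing thread, then hand ownership to the
   I/O itself by storing a tagged context pointer as the owner. */
void issue_async_read(model_resource *r, const model_thread *issuer,
                      model_io_context *ctx)
{
    assert(((uintptr_t)ctx & 0x3) == 0);        /* context must be aligned */
    r->owner = (uintptr_t)issuer;               /* acquire */
    ctx->resource = r;
    ctx->owner_token = (uintptr_t)ctx | (uintptr_t)0x3;
    r->owner = ctx->owner_token;                /* reassign ownership */
}

/* After the handoff the issuing thread is no longer seen as the owner. */
bool thread_owns(const model_resource *r, const model_thread *t)
{
    return r->owner == (uintptr_t)t;
}

/* Completion path, possibly on another thread: release by presenting the
   token rather than the current thread. Returns false on a mismatch. */
bool complete_async_read(model_io_context *ctx)
{
    if (ctx->resource->owner != ctx->owner_token)
        return false;
    ctx->resource->owner = 0;
    return true;
}
```

In this model the resource stays held until completion, yet `thread_owns` is false for the issuing thread as soon as `issue_async_read` returns, so neither it nor a filter running on it can release the resource by thread identity.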
For guarding against evil filter drivers, a better approach would be to
move the real ERESOURCEs into a private allocation. Since FsRtl
acquires can be handled by the file system in its callbacks, no direct
acquires should ever occur. If the filter tries to acquire NULL, it
will bugcheck.
How very, very tempting.
> I encountered this when I was debugging a legacy file system filter driver. The filter driver got the Ntfs
> driver’s Fcb from a file object and it acquired the main resource of the Fcb shared. Then the filter
> driver called CcCopyRead
This is an amazingly buggy technique and is not supported at all.
I would strongly suggest using some other method of reading the file in the filter.
Filters must not (a) take the FCB locks or (b) call most of Cc’s functions, CcCopyRead/Write included.
–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com