Delayed AcquireForLazyWrite problem

OSR_Community_User · March 31, 2005, 10:58pm

Hi All,

I have written an FSD (not a filter driver) and am having some problems
with Cache Manager’s lazy writer.

Scenario -

From cmd.exe, copy fileA (on my volume) to fileB (on my volume)

From my logs (which are sequence numbered with an interlocked variable,
so the ordering here is correct), I see the following -

cmd.exe opens fileA & fileB
cmd.exe reads bit of fileA, then issues buffered write to fileB
(ie. a standard (while(read) { write(read_bytes) })

the cached write acquires & releases fileB FCB
mainResource exclusive

cmd.exe updates timestamps of fileB (IRP_MJ_SET_INFO / basic info)
cmd.exe sends cleanup & close of fileB (no more userhandles now)

… time passes …

AcquireForLazyWrite handler called for fileB FCB
(only cache manager has ref to FCB here, no user handles open)

picks up fcb mainResource exclusive

series of direct to disk (non-buffered) writes from lazy writer
occur for fileB

each write picks up / releases fcb pagingIoResource exclusive

ReleaseFromLazyWrite handler caller for fileB

releases fcb mainResource

… time passes - lazy writer hasn’t written entire file …

*asynchronous* paging I/O write occurs for fileB (lazy writer?)

IRP_MJ_WRITE pended (as I do some fancy stuff that blocks)

pended IRP from (8) picked up in system worker thread, IRP_MJ_WRITE
tries to pick up fcb pagingIoResource and BLOCKS

examination with debugger at this point reveals that
pagingIoResource is acquired SHARED by someone

AcquireForLazyWrite for fileB fcb called

fcb mainResource picked up exclusive

synchronous paging I/O write occurs for fileB (lazy writer)

we block again trying to get pagingIoResource

Steps 8 through 10 are shown below -

115: *** IRP_MJ_WRITE *** FO(864ED1D8)
116: WRITING 65536 bytes from 983040 FO(864ED1D8)
117: PENDING write of PFS file that can’t be blocked. FO(864ED1D8)
118: IRP_MJ_WRITE returning STATUS_PENDING with info 0 FO(864ED1D8)
119: *** IRP_MJ_WRITE *** FO(864ED1D8)
120: WRITING 65536 bytes from 983040 FO(864ED1D8)
121: ===> PAGING I/O (RE-ENTRANT / VMM) <=== FO(864ED1D8)
122: ACQUIRE EXCL 8550171C (fcb.pagingIoResource) for
PBoxWriteNormal (CAN block) … FO(864ED1D8)
123: ===> ACQUIRE FOR LAZY-WRITE (\fileB for System) <===
124: ACQUIRE EXCL 855016E4 (fcb.mainResource) for
PBoxAcquireForLazyWrite (CAN block) …
125: ACQUIRED EXCL 855016E4 (fcb.mainResource) for
PBoxAcquireForLazyWrite
126: *** IRP_MJ_WRITE *** FO(864ED1D8)
127: WRITING 65536 bytes from 786432 FO(864ED1D8)
128: ===> PAGING I/O (LAZY-WRITER) <=== FO(864ED1D8)
129: ACQUIRE EXCL 8550171C (fcb.pagingIoResource) for
PBoxWriteNormal (CAN block) … FO(864ED1D8)

So, stuff starts going wrong at Step 8. Some questions -

Why does this paging write (assumingly lazy writer) at Step 8 occur
BEFORE AcquireForLazyWrite is called at Step 10?
Why is paging write at Step 8 asynchronous, but paging write at Step
11 synchronous?

If I break in at line 129 the status of the two resources for the FCB
are as follows -

mainResource:

0: kd> dt -y -r ERESOURCE 0x855016e4
+0x000 SystemResourcesList : _LIST_ENTRY [0x8550171c - 0x8551d71c]
+0x000 Flink : 0x8550171c [0x8648ea68 - 0x855016e4]
+0x000 Flink : 0x8648ea68 [0x864e5290 - 0x8550171c]
+0x004 Blink : 0x855016e4 [0x8550171c - 0x8551d71c]
+0x004 Blink : 0x8551d71c [0x855016e4 - 0x8551d6e4]
+0x000 Flink : 0x855016e4 [0x8550171c - 0x8551d71c]
+0x004 Blink : 0x8551d6e4 [0x8551d71c - 0x8657c71c]
+0x008 OwnerTable : (null)
+0x00c ActiveCount : 1
+0x00e Flag : 0x80
+0x010 SharedWaiters : 0x862b4350
+0x000 Header : _DISPATCHER_HEADER
+0x000 Type : 0x5 ‘’
+0x001 Absolute : 0 ‘’
+0x002 Size : 0x5 ‘’
+0x003 Inserted : 0 ‘’
+0x004 SignalState : 0
+0x008 WaitListHead : _LIST_ENTRY [0x862b4358 - 0x862b4358]
+0x010 Limit : 2147483647
+0x014 ExclusiveWaiters : 0x863d1530
+0x000 Header : _DISPATCHER_HEADER
+0x000 Type : 0x1 ‘’
+0x001 Absolute : 0 ‘’
+0x002 Size : 0x4 ‘’
+0x003 Inserted : 0 ‘’
+0x004 SignalState : 0
+0x008 WaitListHead : _LIST_ENTRY [0x863d1538 - 0x863d1538]
+0x018 OwnerThreads : [2] _OWNER_ENTRY
+0x000 OwnerThread : 0x867cb340
+0x004 OwnerCount : 1
+0x004 TableSize : 1
+0x028 ContentionCount : 2
+0x02c NumberOfSharedWaiters : 0
+0x02e NumberOfExclusiveWaiters : 0
+0x030 Address : (null)
+0x030 CreatorBackTraceIndex : 0
+0x034 SpinLock : 0

pagingIoResource:

0: kd> dt -y -r ERESOURCE 0x8550171C
+0x000 SystemResourcesList : _LIST_ENTRY [0x8648ea68 - 0x855016e4]
+0x000 Flink : 0x8648ea68 [0x864e5290 - 0x8550171c]
+0x000 Flink : 0x864e5290 [0x86492b48 - 0x8648ea68]
+0x004 Blink : 0x8550171c [0x8648ea68 - 0x855016e4]
+0x004 Blink : 0x855016e4 [0x8550171c - 0x8551d71c]
+0x000 Flink : 0x8550171c [0x8648ea68 - 0x855016e4]
+0x004 Blink : 0x8551d71c [0x855016e4 - 0x8551d6e4]
+0x008 OwnerTable : (null)
+0x00c ActiveCount : 1
+0x00e Flag : 0
+0x010 SharedWaiters : (null)
+0x014 ExclusiveWaiters : 0x8664a148
+0x000 Header : _DISPATCHER_HEADER
+0x000 Type : 0x1 ‘’
+0x001 Absolute : 0 ‘’
+0x002 Size : 0x4 ‘’
+0x003 Inserted : 0 ‘’
+0x004 SignalState : 0
+0x008 WaitListHead : _LIST_ENTRY [0x867cb3b0 - 0x867cb8a0]
+0x018 OwnerThreads : [2] _OWNER_ENTRY
+0x000 OwnerThread : 0
+0x004 OwnerCount : 0
+0x004 TableSize : 0
+0x028 ContentionCount : 2
+0x02c NumberOfSharedWaiters : 0
+0x02e NumberOfExclusiveWaiters : 2
+0x030 Address : (null)
+0x030 CreatorBackTraceIndex : 0
+0x034 SpinLock : 0

pagingIoResource, which is what I am blocking on, is last acquired
shared by an IRP_MJ_READ, which just happens to be a recursive call into
my FSD on the *last* cached write to fileB -

187: *** IRP_MJ_WRITE *** FO(864ED1D8)
188: WRITING 22528 bytes from 1638400 FO(864ED1D8)
189: ACQUIRE EXCL 855016E4 (fcb.mainResource) for
PBoxWriteNormal (CAN block) … FO(864ED1D8)
190: ACQUIRED EXCL 855016E4 (fcb.mainResource) for
PBoxWriteNormal
191: ===> Using CACHE MANAGER to satisfy write <=== FO(864ED1D8)

–> start recursive FSD entry

192: *** IRP_MJ_READ *** FO(864ED1D8)
193: READING 4096 bytes from 1658880 FO(864ED1D8)
194: PENDING read of PFS file that can’t be blocked. FO(864ED1D8)
195: IRP_MJ_READ returning STATUS_PENDING with Information 0
FO(864ED1D8)
196: *** IRP_MJ_READ *** FO(864ED1D8)
197: READING 4096 bytes from 1658880 FO(864ED1D8)
198: ===> REQUEST IS FROM VMM (PAGING I/O) <=== FO(864ED1D8)
199: ACQUIRE SHRD 8550171C (fcb.pagingIoResource) for
PBoxReadNormal (CAN block) … FO(864ED1D8)
200: ACQUIRED SHRD 8550171C (fcb.pagingIoResource) for
PBoxReadNormal
201: ===> Using DISK DIRECT to satisfy read <=== FO(864ED1D8)
202: Read would exceed EOF, truncated. Reading 2049, returning
2048 bytes FO(864ED1D8)
203: RELEASED 8550171C (fcb.pagingIoResource) for (PBoxReadNormal)
204: IRP_MJ_READ returning STATUS_SUCCESS with Information 2048
FO(864ED1D8)

<– end recursive FSD entry

205: CcCopyWrite() returned SUCCESS FO(864ED1D8)
206: RELEASED 855016E4 (fcb.mainResource) for (PBoxWriteNormal)
207: IRP_MJ_WRITE returning STATUS_SUCCESS with Information 22528
FO(864ED1D8)

So, CcCopyWrite() is triggering a direct read (presumably to ensure an
overlapping page is valid in the cache, or something), which recursively
enters my filesystem. Whilst IRP_MJ_READ acquires pagingIoResource
shared (line 199), it DOES release it (line 203). I’ve actually checked
this with the debugger and ActiveCount drops from 1 to 0 after the release.

So, is Cache Manager using the resources in my FCB common header itself,
without calling my callbacks, and hence I am trying to acquire them when
it already has? Perhaps my implementation of
AcquireForLazyWrite/ReleaseFromLazyWrite is incorrect. I noticed in
FASTFAT they do something like -

IoSetTopLevelIrp((PIRP)FSRTL_CACHE_TOP_LEVEL_IRP);

In AcquireForLazyWrite(). They call it a kludge. I note also that they
rarely, if ever, check WHO is the Top Level Component, in terms of the
FSRTL constants – does Cache Manager use the TLC values?

I would really appreciate it if anyone has any ideas, I’ve been looking
at this for a day or so now.

Furthermore, all this stuff is protected by FsRtlEnterFileSystem()
EXCEPT the AcquireForLazyWrite() / ReleaseFromLazyWrite() callbacks –
they don’t appear to be protected by these in FastFat either.

Cheers,
Josh

OSR_Community_User · April 1, 2005, 2:14am

Josh S wrote:

… time passes - lazy writer hasn’t written entire file …

*asynchronous* paging I/O write occurs for fileB (lazy writer?)

IRP_MJ_WRITE pended (as I do some fancy stuff that blocks)

pended IRP from (8) picked up in system worker thread, IRP_MJ_WRITE
tries to pick up fcb pagingIoResource and BLOCKS

examination with debugger at this point reveals that
pagingIoResource is acquired SHARED by someone

AcquireForLazyWrite for fileB fcb called

fcb mainResource picked up exclusive

synchronous paging I/O write occurs for fileB (lazy writer)

we block again trying to get pagingIoResource

Steps 8 through 10 are shown below -

115: *** IRP_MJ_WRITE *** FO(864ED1D8)
116: WRITING 65536 bytes from 983040 FO(864ED1D8)
117: PENDING write of PFS file that can’t be blocked. FO(864ED1D8)
118: IRP_MJ_WRITE returning STATUS_PENDING with info 0 FO(864ED1D8)
119: *** IRP_MJ_WRITE *** FO(864ED1D8)
120: WRITING 65536 bytes from 983040 FO(864ED1D8)
121: ===> PAGING I/O (RE-ENTRANT / VMM) <=== FO(864ED1D8)
122: ACQUIRE EXCL 8550171C (fcb.pagingIoResource) for
PBoxWriteNormal (CAN block) … FO(864ED1D8)
123: ===> ACQUIRE FOR LAZY-WRITE (\fileB for System) <===
124: ACQUIRE EXCL 855016E4 (fcb.mainResource) for
PBoxAcquireForLazyWrite (CAN block) …
125: ACQUIRED EXCL 855016E4 (fcb.mainResource) for
PBoxAcquireForLazyWrite
126: *** IRP_MJ_WRITE *** FO(864ED1D8)
127: WRITING 65536 bytes from 786432 FO(864ED1D8)
128: ===> PAGING I/O (LAZY-WRITER) <=== FO(864ED1D8)
129: ACQUIRE EXCL 8550171C (fcb.pagingIoResource) for
PBoxWriteNormal (CAN block) … FO(864ED1D8)

In a somewhat strange manner of replying to myself, before anyone else
has, I thought I’d shed some light on what I *think* may be the problem.

The asynchronous write in Step 8 is actually the MPW coming along and
flushing the dirty page from the cache before the lazy writer gets a
chance to – further analysis of the problem shows it only occurs under
high load, hence when the MPW is most likely to kick in.

I hadn’t installed a handler for FastIoAcquireForModWrite – installing
one reveals that it’s only called (and my problem only occurs) when I
try to copy many files at once (and hence a bit of memory is being
used). When only copying one file at a time no stress is put on the
system, the MPW stays out of the picture and the normal lazy write
preacquisition works fine.

So, my question to someone with inside knowledge of the MPW is, which
resources does the *default* implementation of FastIoAcquireForModWrite
(used when my dispatch ptr is NULL) acquire, and in which order?

Why does this paging write (assumingly lazy writer) at Step 8 occur
BEFORE AcquireForLazyWrite is called at Step 10?

This assumption was, I think, incorrect. It’s probably the MPW, backed
up by the fact the write is asynchronous (as the MPW is frantically
trying to flush).

I still wouldn’t mind knowing why FastFAT does the “kludge” work in
FastIoAcquireForLazyWrite et al in setting the TLC to
FSRTL_CACHE_MANAGER – do any system components actually look at the TLC
and selectively (not) acquire resources?

Cheers,
Josh

OSR_Community_User · April 1, 2005, 7:34am

I actually discuss this exact scenario in file systems class:

If the Resource in your FsContext structure (using the advanced FCB
header) is NULL, no locking is performed.

There are two flags defined in ntifs.h:

#define FSRTL_FLAG_ACQUIRE_MAIN_RSRC_EX (0x08)
#define FSRTL_FLAG_ACQUIRE_MAIN_RSRC_SH (0x10)

If the Flags field has either of these set, it grabs the Resource in the
corresponding mode.

If neither flag is set it acquires your PagingIoResource in shared mode.

This fits perfectly with what you described (and indeed, after reading
your first note I suspected that you were seeing the modified page
writer). You can completely control this behavior.

If you have a fast I/O entry point (or register for filter callbacks an
unusual but theoretically possible step for a file system driver) then
that will be called BEFORE applying this logic.

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Josh S
Sent: Friday, April 01, 2005 2:14 AM
To: ntfsd redirect
Subject: Re: [ntfsd] Delayed AcquireForLazyWrite problem

Josh S wrote:

… time passes - lazy writer hasn’t written entire file …

*asynchronous* paging I/O write occurs for fileB (lazy writer?)

IRP_MJ_WRITE pended (as I do some fancy stuff that blocks)

pended IRP from (8) picked up in system worker thread, IRP_MJ_WRITE
tries to pick up fcb pagingIoResource and BLOCKS

examination with debugger at this point reveals that
pagingIoResource is acquired SHARED by someone

AcquireForLazyWrite for fileB fcb called

fcb mainResource picked up exclusive

synchronous paging I/O write occurs for fileB (lazy writer)

we block again trying to get pagingIoResource

Steps 8 through 10 are shown below -

115: *** IRP_MJ_WRITE *** FO(864ED1D8)
116: WRITING 65536 bytes from 983040 FO(864ED1D8)
117: PENDING write of PFS file that can’t be blocked. FO(864ED1D8)
118: IRP_MJ_WRITE returning STATUS_PENDING with info 0
FO(864ED1D8)
119: *** IRP_MJ_WRITE *** FO(864ED1D8)
120: WRITING 65536 bytes from 983040 FO(864ED1D8)
121: ===> PAGING I/O (RE-ENTRANT / VMM) <=== FO(864ED1D8)
122: ACQUIRE EXCL 8550171C (fcb.pagingIoResource) for
PBoxWriteNormal (CAN block) … FO(864ED1D8)
123: ===> ACQUIRE FOR LAZY-WRITE (\fileB for System) <===
124: ACQUIRE EXCL 855016E4 (fcb.mainResource) for
PBoxAcquireForLazyWrite (CAN block) …
125: ACQUIRED EXCL 855016E4 (fcb.mainResource) for
PBoxAcquireForLazyWrite
126: *** IRP_MJ_WRITE *** FO(864ED1D8)
127: WRITING 65536 bytes from 786432 FO(864ED1D8)
128: ===> PAGING I/O (LAZY-WRITER) <=== FO(864ED1D8)
129: ACQUIRE EXCL 8550171C (fcb.pagingIoResource) for
PBoxWriteNormal (CAN block) … FO(864ED1D8)

In a somewhat strange manner of replying to myself, before anyone else
has, I thought I’d shed some light on what I *think* may be the problem.

The asynchronous write in Step 8 is actually the MPW coming along and
flushing the dirty page from the cache before the lazy writer gets a
chance to – further analysis of the problem shows it only occurs under
high load, hence when the MPW is most likely to kick in.

I hadn’t installed a handler for FastIoAcquireForModWrite – installing
one reveals that it’s only called (and my problem only occurs) when I
try to copy many files at once (and hence a bit of memory is being
used). When only copying one file at a time no stress is put on the
system, the MPW stays out of the picture and the normal lazy write
preacquisition works fine.

So, my question to someone with inside knowledge of the MPW is, which
resources does the *default* implementation of FastIoAcquireForModWrite
(used when my dispatch ptr is NULL) acquire, and in which order?

Why does this paging write (assumingly lazy writer) at Step 8 occur
BEFORE AcquireForLazyWrite is called at Step 10?

This assumption was, I think, incorrect. It’s probably the MPW, backed
up by the fact the write is asynchronous (as the MPW is frantically
trying to flush).

I still wouldn’t mind knowing why FastFAT does the “kludge” work in
FastIoAcquireForLazyWrite et al in setting the TLC to
FSRTL_CACHE_MANAGER – do any system components actually look at the TLC

and selectively (not) acquire resources?

Cheers,
Josh

Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17

You are currently subscribed to ntfsd as: xxxxx@osr.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

OSR_Community_User · April 3, 2005, 7:43pm

Tony Mason wrote:

I actually discuss this exact scenario in file systems class:

If the Resource in your FsContext structure (using the advanced FCB
header) is NULL, no locking is performed.

There are two flags defined in ntifs.h:

#define FSRTL_FLAG_ACQUIRE_MAIN_RSRC_EX (0x08)
#define FSRTL_FLAG_ACQUIRE_MAIN_RSRC_SH (0x10)

If the Flags field has either of these set, it grabs the Resource in the
corresponding mode.

If neither flag is set it acquires your PagingIoResource in shared mode.

Tony,

Thank-you very much for this detail, that is very useful behaviour to
know about. As I was setting neither flag it does indeed match the
scenario perfectly – I’ve changed this and all seems to be well now.

Thanks once again.

Josh