This is really weird.
Am I understanding this correctly? The ‘do nothing’ code intermittently fails with a 30 second timeout.
What happens if you call FilterReplyMessage for a message that hasn’t been received? I guess that the function would ‘immediately’ return with an error like ‘No waiter for reply’. It seems really strange that the timestamp for the error return for FilterReplyMessage is after the driver’s timeout.
What doesn’t make sense is why FilterReplyMessage would take so long (seconds) to fail. Either the message is a reply to something that has been sent or it isn’t. It can’t wait for a message to be sent to match a reply :-). There is no reason for this function to block the thread for any more than normal scheduling time. It’s like the FilterReplyMessage is deadlocking internally.
Incidentally, my deadlock condition was intermittent as well. The fact that it works a couple hundred times eliminates a lot of things, but not a deadlock. My deadlock was caused by resources acquired by the OS (I assume). If the OS had not acquired certain resources at the time, everything worked correctly. In my case, things were much more repeatable if I cleared the cache (unmounted the volume).
The other strange thing is that your ‘cut down’ code sounds *really* bread and butter. Multiple IFS examples are doing almost exactly the same thing.
Oh well, I’m running out of things to try. Maybe something in the C# CLR is intermittently causing a deadlock underneath the covers. I guess you could make C# call a C DLL with a diagnostic print and FilterReplyMessage in it. There’s not much ambiguity in C. If you get to the C printf, you could eliminate the CLR from suspicion.
Maybe someone else out there has positive experiences with using C# that they can communicate.
I wish I could be of more help. Maybe I’m on totally the wrong path with the deadlock thoughts.
Good Luck,
Derek
Ken Cross wrote: Yep.
Ken
---------------------------------
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of Derek Dickinson
Sent: Wednesday, December 28, 2005 5:37 PM
To: Windows File Systems Devs Interest List
Subject: RE: [ntfsd] Synchronous operation of FltSendMessage
Are you setting the Status field of the FILTER_REPLY_HEADER to zero?
Derek
Ken Cross wrote: Derek:
Again, good thoughts.
I have indeed degenerated my user-mode routine to do-nothing – the
statements following the log of the “Received” message are (1) set the
MessageId in the reply, and (2) FilterReplyMessage. Then there’s a 3 second
timeout before the FilterReplyMessage returns with an error status.
The driver, of course, is just sitting at FltSendMessage waiting for the
reply.
I’ve tried longer timeouts, too – no change.
Remember that it works a long time before it breaks – I suspect some kind
of quirky timing issue. (The worst kind of thing to debug.)
Mysteriouser and mysteriouser…
Thanks,
Ken
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Derek Dickinson
Sent: Wednesday, December 28, 2005 4:58 PM
To: Windows File Systems Devs Interest List
Subject: RE: [ntfsd] Synchronous operation of FltSendMessage
Hello Ken,
Obviously, the message is getting to the user mode app. The reply is the
issue. The output you have is virtually identical to what I saw with my
deadlock condition. Consider:
1) Message sent by kernel
2) Message received by user mode.
3) User mode blocks waiting on resource (internally bound to the very write
you are blocking).
4) Kernel mode times out and releases the resource.
5) User mode now unblocks and tries to reply but no one is waiting.
The times seem to reflect this as well.
It looks like you are not displaying anything ‘immediately’ before the
FilterReplyMessage call, only after the FilterGetMessage return. Unless you
have a ‘do nothing’ filter there is no such thing as returns immediately.
I think you need a debug print immediately before the call to
FilterReplyMessage. You should also set the timeout up to 30 seconds. It
doesn’t slow anything down until things go awry. I don’t know if this is
really your problem, but you can easily eliminate it with a couple simple
checks.
The only other thing I can think of is that the FILTER_REPLY_MESSAGE
structure at the beginning of your buffer is corrupted during processing.
If this is used to correllate the messages you would see the same thing.
However, the timing would probably be different in that case (i.e. the
FilterReplyMessage would fail before the FltSendMessage timeout).
On another note, is it possible to remove ALL processing between the
FilterGetMessage and FilterReplyMessage? If it works when this is totally
inert, you can start adding things back in incrementally.
Good Luck,
Derek
Ken Cross wrote:
Derek:
Yes, I display debug messages before/after every operation and it
includes the MessageId (which is just a counter). I’ve also removed them to
make sure Heisenberg isn’t mucking things up – same results.
And yes, I meant FltSendMessage – sorry.
Here’s what I’m seeing, one success and one failure. From the
driver messages, viewed in Traceview:
12\28\2005-15:20:07:012 [OFP]: SENDING to port 800009D4, 512 @
88386048
12\28\2005-15:20:07:076 [OFP]: SUCCESSFUL reply to SendMessage to
port 800009D4, 512 @ 88386048
12\28\2005-15:20:07:268 [OFP]: SENDING to port 800009D4, 8192 @
42475520
12\28\2005-15:20:10:331 [OFP]: TIMEOUT 3 seconds to PID 3516, port
800009D4, IRQL 0, CPU 0
At that point the driver breaks the connection.
From the user-mode program logs:
2005-12-28 15:20:07.012; Received id 221, 512 @ 88386048 <– issue
FilterReplyMessage immediately
2005-12-28 15:20:07.268; Received id 222, 8192 @ 42475520 <– issue
FilterReplyMessage immediately
2005-12-28 15:20:11.065; FilterReplyMessage failed: Error code
0x801F0020 - No waiter for reply
The data (8192 @ 42475520) was received and a response sent back
immediately, but the FltSendMessage in the driver timed out. The previous
221 operations worked fine. Hence my frustration…
Ken
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Derek Dickinson
Sent: Wednesday, December 28, 2005 3:12 PM
To: Windows File Systems Devs Interest List
Subject: Re: [ntfsd] Synchronous operation of FltSendMessage
Hello Ken,
The FILTER_REPLY_HEADER has a MessageId. I assume this is similar
to the serial numbers that Mickey is using. I would include it in my debug
display along with the Status field. From what I’ve seen, it is just a
message counter. This appears to be how the Filter manager correlates
messages with replies.
Also, in your (Ken’s) last response you used FilterSendMessage when
I assume you meant FltSendMessage. Obviously, a FilterSendMessage cannot be
used in place of a FilterReplyMessage.
Good Luck,
Derek
Mickey & Eileen Lane wrote:
My user code issues N overlapped FilterGetMessage() calls
and waits on
an array of events to trigger the
data processing.
Whenever the kernel code calls FltSendMessage(), there’s
already
something waiting for it.
I’ve mucked around with values for N and 4 seems to work
pretty well. I
have a debug case where each
message contains a serial number so I can see when messages
get lost,
duplicated or processed out of order.
I mostly filter CreateFile and one of my tests is to open as
many big
applications as possible in the
shortest time possible (Word, Firefox, etc. etc.) and so far
it hasn’t
missed a beat.
Mickey.
Ken Cross wrote:
>Thanks. Good suggestions all, but:
>
>- I know that it’s not a paging operation (shouldn’t
matter, though)
>
>- I eliminated all activity but the operation,
>which is actually an ArrayList.Add() (in C#). It’s all
memory-resident
>although it may have to allocate more memory. No way that
could take 3
>seconds, though.
>
>The problem is that FilterSendMessage sent a message
waiting for
>FilterGetMessage to consume it, but FilterGetMessage is
sitting waiting for
>a message. But FilterSendMessage sent a message. This is
after they traded
>messages successfully hundreds of times, as recently as 50
ms earlier.
>
>
>Ken
>
>
>
>
>From: xxxxx@lists.osr.com
>[mailto:xxxxx@lists.osr.com] On Behalf Of
Derek Dickinson
>Sent: Wednesday, December 28, 2005 1:47 PM
>To: Windows File Systems Devs Interest List
>Subject: Re: [ntfsd] Synchronous operation of
FltSendMessage
>
>
>
>The scanner example in the IFS kit uses almost the same
logic in the
>pre-write. They specifically check for paging I/O AFTER
they have received
>the reply from user mode. That is, paging I/O should not be
a problem for
>the logic presented.
>
>Of couse, you could add a debug print of the TopLevelIrp
and extend your
>timeout to something huge. You may find that it is non-null
when the
>problem occurs.
>
>I had a similar problem in my minifilter. However, my
problem was due to an
>(accidental) attempt to perform synchronuos file system I/O
in the user mode
>code between the FilterGetMessage and FilterReplyMessage.
>
>Maybe these questions could help narrow things down:
>
>Is the a memory mapped file?
>
>Are you logging diagnostic information to a file?
>
>I’d make sure that the user mode code is really getting
back to the
>FilterSendMessage and/or FilterGetMessage. In my case, the
user mode code
>was locking up in a function that was supposed to happen
asynchronously but
>was internally being forced to operate synchronously.
>
>Good Luck,
>Derek
>
>Dejan Maksimovic wrote:
>
>
> Lurking in the dark here, perhaps, but maybe this is a
paging I/O
>write so you
> cannot get a reply?
>
> Ken Cross wrote:
>
> > NTFSD Folk:
> >
> > I’ve been fighting a problem with FltSendMessage() for
days and
>it’s just
> > about beaten me.
> >
> > The problem is this: The minifilter driver must send a
message to
>a user
> > program using FltSendMessage() in the pre-write callback
and wait
>for a
> > reply. The write operation must be held until the reply
is
>received.
> > Basically, in the driver:
> >
> > status = FltSendMessage(
> > FKFilter, // Filter
> > &ClientPort, // Client port
> > &xMessage, // Data
> > sizeof( xMessage ), // Length of data
> > &xReply, // Reply buffer
> > &iBytes, // Reply buffer length
> > &SendTimeout ); // Timeout (3 seconds)
> >
> > In the user code (simplified for brevity):
> >
> > while( true )
> > {
> > hResult = FilterGetMessage( hDriverHandle,
pMessageBuffer,
> > Marshal.SizeOf( xMessageBuffer ), IntPtr.Zero );
> >
> >
> >
> > hResult = FilterReplyMessage( hDriverHandle,
pReplyBuffer,
> > Marshal.SizeOf( xReply ) );
> > }
> >
> > The code I’ve written works great UNTIL the
FltSendMessage() times
>out (3
> > seconds) after many (300-400) successful operations.
> >
> > The docs say that FltSendMessage() will wait until
>FilterGetMessage() is
> > called in user mode, and FilterGetMessage() will wait
until a
>message
> > arrives. But something ain’t working.
> >
> > I know it’s a bit vague, but any suggestions?
> >
> > Ken
> >
> > —
> > Questions? First check the IFS FAQ at
>https://www.osronline.com/article.cfm?id=17
> >
> > You are currently subscribed to ntfsd as:
xxxxx@alfasp.com
> > To unsubscribe send a blank email to
>xxxxx@lists.osr.com
>
> –
> Kind regards, Dejan M.
> http://www.alfasp.com E-mail: xxxxx@alfasp.com
> Alfa Transparent File Encryptor - Transparent file
encryption
>services.
> Alfa File Protector - File protection and hiding library
for Win32
>developers.
> Alfa File Monitor - File monitoring library for Win32
developers.
>
>
>
>
> —
> Questions? First check the IFS FAQ at
>https://www.osronline.com/article.cfm?id=17
>
> You are currently subscribed to ntfsd as:
xxxxx@yahoo.com
> To unsubscribe send a blank email to
>xxxxx@lists.osr.com
>
>
>
>
>
>Yahoo! DSL
>>om/broadband/> Something to write home about. Just
$16.99/mo. or less —
>Questions? First check the IFS FAQ at
>https://www.osronline.com/article.cfm?id=17 You are
currently subscribed to
>ntfsd as: xxxxx@comcast.net To unsubscribe send a blank
email to
>xxxxx@lists.osr.com
>
>
>
>—
>Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17
>
>You are currently subscribed to ntfsd as:
xxxxx@earthlink.net
>To unsubscribe send a blank email to
xxxxx@lists.osr.com
>
>
>
—
Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17
You are currently subscribed to ntfsd as:
xxxxx@yahoo.com
To unsubscribe send a blank email to
xxxxx@lists.osr.com
Yahoo! DSL
om/broadband/> Something to write home about. Just $16.99/mo. or less —
Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17 You are currently subscribed to
ntfsd as: xxxxx@comcast.net To unsubscribe send a blank email to
xxxxx@lists.osr.com
—
Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17
You are currently subscribed to ntfsd as: unknown lmsubst tag
argument: ‘’
To unsubscribe send a blank email to
xxxxx@lists.osr.com
Yahoo! Shopping
Find Great Deals on Holiday Gifts at Yahoo! Shopping
_ylc=X3oDMTE2bzVzaHJtBF9TAzk1OTQ5NjM2BHNlYwNtYWlsdGFnBHNsawNob2xpZGF5LTA1>
— Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17 You are currently subscribed to
ntfsd as: xxxxx@comcast.net To unsubscribe send a blank email to
xxxxx@lists.osr.com
—
Questions? First check the IFS FAQ at https://www.osronline.com/article.cfm?id=17
You are currently subscribed to ntfsd as: xxxxx@yahoo.com
To unsubscribe send a blank email to xxxxx@lists.osr.com
---------------------------------
Yahoo! Photos
Ring in the New Year with Photo Calendars. Add photos, events, holidays, whatever. — Questions? First check the IFS FAQ at https://www.osronline.com/article.cfm?id=17 You are currently subscribed to ntfsd as: xxxxx@comcast.net To unsubscribe send a blank email to xxxxx@lists.osr.com
—
Questions? First check the IFS FAQ at https://www.osronline.com/article.cfm?id=17
You are currently subscribed to ntfsd as: unknown lmsubst tag argument: ‘’
To unsubscribe send a blank email to xxxxx@lists.osr.com
---------------------------------
Yahoo! Shopping
Find Great Deals on Holiday Gifts at Yahoo! Shopping