Could this be a synchronization problem....

Hi all,

I am having a problem with my encryption filter on NTFS. Let me first
explain how I am handling the write with some pseudo-code.

Write dispatch
{
    if this is an encrypted file
    {
        queue to worker thread so we are at passive level and I can
        talk to a USB device
        pend operation
    }
    else
    {
        call lower driver
    }
}

WorkerThread
{
    read encryption key from USB device

    if key read failed
    {
        deny write
        complete IRP with error
    }
    else
    {
        allocate our own buffer
        copy data to write into our buffer
        encrypt buffer
        replace buffer in IRP with our buffer
        set completion routine
        call lower driver
    }
}

CompletionRoutine
{
    tidy up the IRP, restoring the original buffer etc.
    clean everything up
    complete IRP with success
}

I hope all that makes sense!

This all seems to work fine on FAT/FAT32 drives, but when I move across to
NTFS it causes no end of problems.

Basically, when an application tries to write to the file (in response to a
FlushFileBuffers call -> IRP_MJ_FLUSH_BUFFERS) it locks the thread and sits
there forever.

Any ideas how I can even start to debug this? I can see that the thread has
hung during MmFlushImageSection from NTFS, but that’s about all.

Regards

Ben

You have omitted several items of interest. How do you determine that this
is an encrypted file? Why do you have to get the key during write
processing? If the write request was sent at APC_LEVEL it is not a good
idea to make it wait. Are you looking at the paging and nocache bits in
the write and create? I know of no way, except writing your own file
system, to keep the data in the cache manager’s buffers encrypted. The
buffer pointers, etc. are very critical, so make sure you are restoring them
exactly as they were when the request was received. If this is a sync
request, the current file offset must be updated too.
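
To make the "paging and nocache bits" question concrete, here is a minimal user-mode sketch of the flag test. The two flag values mirror the `IRP_NOCACHE` and `IRP_PAGING_IO` definitions in wdm.h; the `write_kind` enum and both helper functions are hypothetical names invented for this illustration, and the policy in `should_encrypt` (transform anything that bypasses the cache) is an assumption, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values of the IRP.Flags bits as defined in wdm.h, repeated here only so
   the sketch compiles outside the kernel. */
#define IRP_NOCACHE   0x00000001u
#define IRP_PAGING_IO 0x00000002u

/* Hypothetical classification used only in this sketch. */
typedef enum {
    WRITE_CACHED,     /* satisfied through the cache manager */
    WRITE_NONCACHED,  /* user request that bypasses the cache */
    WRITE_PAGING      /* issued by Mm/Cc to push dirty pages to disk */
} write_kind;

static write_kind classify_write(uint32_t irp_flags)
{
    /* Paging I/O is checked first because paging writes normally carry
       IRP_NOCACHE as well. */
    if (irp_flags & IRP_PAGING_IO)
        return WRITE_PAGING;
    if (irp_flags & IRP_NOCACHE)
        return WRITE_NONCACHED;
    return WRITE_CACHED;
}

/* Assumed policy: leave cached data in plaintext and only transform data
   that is on its way to the disk. */
static bool should_encrypt(uint32_t irp_flags)
{
    return classify_write(irp_flags) != WRITE_CACHED;
}
```

In a real filter the same test would be applied to `Irp->Flags` in the write dispatch routine; the sketch only shows the order of the checks.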

----- Original Message -----
From:
To: “File Systems Developers”
Sent: Thursday, September 26, 2002 11:32 AM
Subject: [ntfsd] Could this be a synchronization problem…

> I know of no way, except writing your own file
> system, to keep the data in the cache manager’s buffers
> encrypted. The buffer pointers, etc. are very critical, so
> make sure you are restoring it exactly as it was when it was
> received. If this is a sync request, the current file offset
> must be updated too.

Even if you were to write your own filesystem, I think this would be
impossible due to memory-mapping of files. When a file is memory-mapped,
the memory manager hands the application an address pointing directly
into a buffer owned by the cache-manager. If this data is encrypted, the
application sees garbage. I don’t know of a way for a filesystem to
modify this behavior outside of the binary choice of either
supporting/not supporting caching (and thus memory-mapping of files).

If you are the file system, you could (though I wouldn’t necessarily
recommend it) handle memory-mapped files without using the cache manager
itself, by intercepting all the mapping requests and handling them yourself.
It might take a source code license for the OS to find all the problems this
would create, but I would not say it is impossible. Impractical, unwise,
and a bucket of worms, but not impossible.

----- Original Message -----
From: “Nicholas Ryan”
To: “File Systems Developers”
Sent: Thursday, September 26, 2002 7:08 PM
Subject: [ntfsd] Re: Could this be a synchronization problem…

Hi David,

To clarify your questions:

  1. I identify an encrypted file through a linked list of file objects. The
     list is keyed on the FsContext value so that I don’t miss stream file
     objects. Basically, I have a list of FsContext values, and each node of
     that list holds a further linked list of the file objects associated
     with that FsContext.
  2. The reason for getting the key during the write is so that if the USB
     device is removed unexpectedly I can deny access to the file. I don’t
     really want to keep a list of cached keys in memory any longer than is
     needed.
  3. With regards to the APC_LEVEL observation, I think this is the problem ;)
  4. I only encrypt/decrypt paging I/O, so that data in the cache is left
     decrypted. I agree that keeping the data in the cache encrypted is
     practically impossible.
  5. All buffers and pointers checked and accounted for!
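
The two-level list described in point 1 can be sketched roughly as follows. This is a user-mode model, not the driver's actual code: every type and function name here (`stream_node`, `file_node`, `track_file`, and so on) is hypothetical, `malloc` stands in for pool allocation, and the locking a real filter would need around the list is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One entry per file object opened on a tracked stream (hypothetical). */
typedef struct file_node {
    const void       *file_object;
    struct file_node *next;
} file_node;

/* One entry per FsContext value, i.e. per tracked stream (hypothetical). */
typedef struct stream_node {
    const void         *fs_context;
    file_node          *files;
    struct stream_node *next;
} stream_node;

static stream_node *find_stream(stream_node *head, const void *fs_context)
{
    for (; head != NULL; head = head->next) {
        if (head->fs_context == fs_context)
            return head;
    }
    return NULL;
}

/* Record file_object under fs_context; returns false on allocation failure
   and leaves the list unchanged in that case. */
static bool track_file(stream_node **head, const void *fs_context,
                       const void *file_object)
{
    file_node *file = malloc(sizeof *file);
    if (file == NULL)
        return false;
    file->file_object = file_object;

    stream_node *stream = find_stream(*head, fs_context);
    if (stream == NULL) {
        stream = malloc(sizeof *stream);
        if (stream == NULL) {
            free(file);
            return false;
        }
        stream->fs_context = fs_context;
        stream->files = NULL;
        stream->next = *head;
        *head = stream;
    }
    file->next = stream->files;
    stream->files = file;
    return true;
}

/* Number of file objects recorded for fs_context (0 if it is not tracked). */
static size_t count_files(stream_node *head, const void *fs_context)
{
    stream_node *stream = find_stream(head, fs_context);
    size_t n = 0;
    if (stream != NULL) {
        for (file_node *f = stream->files; f != NULL; f = f->next)
            n++;
    }
    return n;
}

static void free_all(stream_node *head)
{
    while (head != NULL) {
        stream_node *next_stream = head->next;
        while (head->files != NULL) {
            file_node *next_file = head->files->next;
            free(head->files);
            head->files = next_file;
        }
        free(head);
        head = next_stream;
    }
}
```

Keying on the FsContext rather than on the file object is what lets a lookup succeed for a stream file object the filter never saw a create for, since it shares the FsContext of the user's file object.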

Regards

Ben

-----Original Message-----
From: David J. Craig [mailto:xxxxx@yoshimuni.com]
Sent: 26 September 2002 23:48
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

I am assuming that the FsContexts are captured during create completion
processing.

Is the SmartCard reader under the control of PC/SC? If so, the application
will be notified when it is removed. That is why the floppy-drive-based
SmartCard reader couldn’t get WHQL approval: there is no automatic
notification to the owning driver when a floppy or pseudo floppy is
removed.

Write requests may come in at APC_LEVEL and the completion routine might be
called at DISPATCH_LEVEL. You may have to queue to a worker thread at
PASSIVE_LEVEL. Luckily I have seen someone say that creates are always
completed at PASSIVE_LEVEL. My testing to date has not disproved this idea,
and it also appears that with the MS file systems you remain in the user’s
context in the completion routine, but I still capture all the info I
need during the create dispatch routine.
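
The "queue to a worker thread at PASSIVE_LEVEL" rule can be modelled in user mode as below. The IRQL constants match the wdm.h values; everything else (the fixed-size `work_queue`, the `dispatch_write` helper, the integer request ids) is a hypothetical stand-in for work items and IRPs, and the sketch assumes a single thread so it needs no locking.

```c
#include <assert.h>
#include <stdbool.h>

/* IRQL values as defined in wdm.h. */
#define PASSIVE_LEVEL  0
#define APC_LEVEL      1
#define DISPATCH_LEVEL 2

#define QUEUE_CAPACITY 8

/* Hypothetical fixed-size FIFO standing in for a kernel work queue. */
typedef struct {
    int ids[QUEUE_CAPACITY];
    int head;
    int count;
} work_queue;

typedef enum { HANDLED_INLINE, PENDED, REJECTED } dispatch_result;

static bool enqueue(work_queue *q, int request_id)
{
    if (q->count == QUEUE_CAPACITY)
        return false;
    q->ids[(q->head + q->count) % QUEUE_CAPACITY] = request_id;
    q->count++;
    return true;
}

static bool dequeue(work_queue *q, int *request_id)
{
    if (q->count == 0)
        return false;
    *request_id = q->ids[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}

/* Requests arriving above PASSIVE_LEVEL are pended and handed to the worker;
   requests already at PASSIVE_LEVEL can be processed where they are. */
static dispatch_result dispatch_write(work_queue *q, unsigned char irql,
                                      int request_id)
{
    if (irql == PASSIVE_LEVEL)
        return HANDLED_INLINE;
    return enqueue(q, request_id) ? PENDED : REJECTED;
}
```

The point of the model is only the decision in `dispatch_write`; in the driver the worker would dequeue each pended request at PASSIVE_LEVEL and do the blocking work there.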

----- Original Message -----
From:
To: “File Systems Developers”
Sent: Friday, September 27, 2002 6:33 AM
Subject: [ntfsd] Re: Could this be a synchronization problem…

Hi David,

FsContexts are indeed captured during create completion processing.

The USB device uses a smartcard microprocessor but that’s about as far
as it goes. I have implemented my own operating system in firmware so
no nice PC/SC! You are right to say that notification is sent when the
device is removed though.

I already queue write requests to a worker item so that I can talk to the
USB device. It only seems to be a problem when the underlying file system
calls FLUSH_BUFFERS. Here is the stack dump.

f356c3c8 804f6320 nt!KiSwapContext+0x2e (FPO: [EBP 0xf356c3fc] [0,0,4])
f356c3d4 804f04e8 nt!KiSwapThread+0x44 (FPO: [0,0,2])
f356c3fc 804fffa0 nt!KeWaitForSingleObject+0x1c0 (FPO: [Non-Fpo])
f356c4d4 805008ee nt!MiFlushSectionInternal+0x38a (FPO: [7,44,3])
f356c510 804dc340 nt!MmFlushSection+0x1e0 (FPO: [Non-Fpo])
f356c594 f5006090 nt!CcFlushCache+0x33e (FPO: [Non-Fpo])
f356c5c0 f5023628 Ntfs!NtfsFlushUserStream+0x6a (FPO: [Non-Fpo])
f356c630 f50239ca Ntfs!NtfsCommonFlushBuffers+0x12a (FPO: [Non-Fpo])
f356c694 804e5d53 Ntfs!NtfsFsdFlushBuffers+0x92 (FPO: [Non-Fpo])
f356c6a4 80623e10 nt!IopfCallDriver+0x31 (FPO: [0,0,1])
f356c6c8 baebd362 nt!IovCallDriver+0x9e (FPO: [Non-Fpo])
f356c6e4 baebca52 dlkpfsd!DLKPFSD_FlushBuffers_Dispatch+0x107 (CONV: stdcall)
f356c704 baebc8c5 dlkpfsd!DLKPFSD_FilterDispatch+0xa9 (CONV: stdcall)
f356c720 804e5d53 dlkpfsd!DLKPFSD_Default_DriverDispatch+0x72 (CONV: stdcall)
f356c730 80623e10 nt!IopfCallDriver+0x31 (FPO: [0,0,1])
f356c754 80556870 nt!IovCallDriver+0x9e (FPO: [Non-Fpo])
f356c768 8054f317 nt!IopSynchronousServiceTail+0x5e (FPO: [Non-Fpo])
f356c7e4 805283c1 nt!NtFlushBuffersFile+0x1ad (FPO: [Non-Fpo])

That KeWaitForSingleObject near the top is the problem: the event never gets
signaled!

NTFS calls flush buffers to write dirty data to disk, I queue the write to a
worker, and my worker gets called, but when I call IoCallDriver to do the
actual write it hangs the thread.

Regards

Ben

-----Original Message-----
From: David J. Craig [mailto:xxxxx@yoshimuni.com]
Sent: 28 September 2002 02:26
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

Ben,

This indicates that the I/O operation isn’t finishing. You need to find
where that I/O operation has gone astray (probably in a DIFFERENT thread
context.)

I have seen IRP/MDL handling issues cause mysterious hangs before. Are you
modifying the Irp->MdlAddress at all? Are you using fast mutexes? They
have a side-effect of raising to APC_LEVEL, although paging I/O operations
are completed differently because they cannot use the APC mechanism (those
calls ARRIVE at APC_LEVEL.)

For example, if you do change Irp->MdlAddress (as is often done in
encryption drivers, for example) then you MUST also update Irp->UserBuffer.
It doesn’t matter what that value is, or whether it is a valid ADDRESS, so
long as MmGetMdlVirtualAddress(Irp->MdlAddress) is the same as
Irp->UserBuffer. I’ve seen this particular problem hang systems before.
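
The invariant described above, together with the save and restore steps from the original pseudo-code, can be modelled in user mode like this. The `mock_mdl` and `mock_irp` structures are hypothetical stand-ins that carry only the fields under discussion (the real MDL and IRP are kernel structures), and `virtual_address` plays the role of the value `MmGetMdlVirtualAddress` would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an MDL: only the address it describes. */
typedef struct mock_mdl {
    void *virtual_address;  /* models MmGetMdlVirtualAddress(mdl) */
} mock_mdl;

/* Hypothetical stand-in for the two IRP fields that must stay in step. */
typedef struct mock_irp {
    mock_mdl *mdl_address;  /* models Irp->MdlAddress */
    void     *user_buffer;  /* models Irp->UserBuffer */
} mock_irp;

/* What a completion context would need to carry to undo the swap. */
typedef struct saved_buffers {
    mock_mdl *original_mdl;
    void     *original_user_buffer;
} saved_buffers;

/* The invariant: when an MDL is present, UserBuffer equals its address. */
static bool buffers_consistent(const mock_irp *irp)
{
    return irp->mdl_address == NULL ||
           irp->mdl_address->virtual_address == irp->user_buffer;
}

/* Before sending the request down: remember both fields, then replace both. */
static void swap_in(mock_irp *irp, mock_mdl *replacement, saved_buffers *saved)
{
    assert(replacement != NULL);
    saved->original_mdl = irp->mdl_address;
    saved->original_user_buffer = irp->user_buffer;
    irp->mdl_address = replacement;
    irp->user_buffer = replacement->virtual_address;
}

/* In the completion path: put back exactly what was received. */
static void restore(mock_irp *irp, const saved_buffers *saved)
{
    irp->mdl_address = saved->original_mdl;
    irp->user_buffer = saved->original_user_buffer;
}
```

Updating the two fields in a single helper is a design choice of the sketch: it makes it hard to change `MdlAddress` and forget `UserBuffer`.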

As always, more information is best.

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

-----Original Message-----
From: xxxxx@des.co.uk [mailto:xxxxx@des.co.uk]
Sent: Saturday, September 28, 2002 6:13 AM
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

Hi Tony,

Thanks for that. I have included the write worker routine below (minus all
the error checking and debug!). As you can see, I have (I think) updated the
MDL and UserBuffer address correctly!

It just hangs right down at the bottom, where I call IoCallDriver(…).

Any ideas on the best approach to tracking this down?

Regards

Ben

VOID DLKPFSD_Write_Worker( PVOID pContext )
{
    PDLKPFSD_WRITE_WORKERITEM pWorkItem = (PDLKPFSD_WRITE_WORKERITEM)pContext;
    PDLKPFSD_WRITECOMPLETION_CONTEXT pWriteContext = pWorkItem->ContextData;
    CDLKPFSD_Device_Extension* pExtension =
        DLKPFSD_GetDeviceExtension( pWriteContext->pDeviceObject );
    PIRP pIrp = pWriteContext->pOriginalIRP;
    PIO_STACK_LOCATION pCurrentIrpStack = IoGetCurrentIrpStackLocation( pIrp );
    CDLKPFSD_TrackedFileObject* pTrackedFile = pWriteContext->pTrackedFile;

    PMDL pMdl = pIrp->MdlAddress;
    PUCHAR pUserBuffer = NULL;
    ULONG uLength = pCurrentIrpStack->Parameters.Write.Length;

    /* Get an address for the caller's data: use the MDL if there is one,
       otherwise fall back to UserBuffer */
    if( pMdl )
    {
        pUserBuffer = (PUCHAR)MmGetSystemAddressForMdlSafe( pMdl, HighPagePriority );
    }
    else
    {
        pUserBuffer = ((PUCHAR)pIrp->UserBuffer);
    }

    /* Store original buffers so the completion routine can restore them */
    pWriteContext->pOriginalMdl = pIrp->MdlAddress;
    pWriteContext->pOriginalUserBuffer = pIrp->UserBuffer;

    /* Allocate storage for the encrypted copy */
    PVOID pOurBuffer = ExAllocatePool( NonPagedPool, uLength );

    if( pUserBuffer && pOurBuffer )
    {
        /* Copy supplied data to our buffer because we can't encrypt in place */
        RtlCopyBytes( pOurBuffer, pUserBuffer, uLength );

        /* Do the encryption */
        __try
        {
            DLP_MINI_KEY data;
            ULONG ulErr;

            RtlZeroMemory( &data, sizeof( DLP_MINI_KEY ) );

            CDLKPFSD_TrackedFolderObject* pFolderObject =
                (CDLKPFSD_TrackedFolderObject*)pTrackedFile->GetData();

            /* Read the key from the USB token */
            ulErr = TOKEN_ExtractKey( 0, pFolderObject->GetKeySerial(),
                                      pFolderObject->GetKeyIndex(), FALSE, &data );

            if( ulErr != DLPCLNT_ERROR_SUCCESS )
            {
                /* Cleanup */
                ExFreePool( pOurBuffer );
                ExFreePool( pWriteContext );
                ExFreePool( pWorkItem );

                /* Complete the IRP with error; no bytes were written */
                pIrp->IoStatus.Status = STATUS_ACCESS_DENIED;
                pIrp->IoStatus.Information = 0;

                IoCompleteRequest( pIrp, IO_NO_INCREMENT );

                return;
            }
            else
            {
                PLUGIN_Encrypt( pFolderObject->GetIV(),
                                data.bKey,
                                (PUCHAR)pOurBuffer,
                                (int)uLength,
                                (int)(pCurrentIrpStack->Parameters.Write.ByteOffset.QuadPart & 0x0FFFFFFFF),
                                data.bAlgoID );
            }
        }
        __except(EXCEPTION_EXECUTE_HANDLER)
        {
            DLKPFSD_ExceptionHandler( GetExceptionCode(), 0, 0, 0, 0, FALSE );
        }
    }
    else
    {
        DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE,
            ("[DLKPFSD]: pUserBuffer || pOurBuffer == NULL!\n" ) );
    }

    /* Allocate a new MDL describing our buffer (skipped if the buffer
       allocation failed) */
    PMDL pMdlReplacement = pOurBuffer
        ? IoAllocateMdl( pOurBuffer, uLength, FALSE, FALSE, NULL )
        : NULL;

    if( pMdlReplacement )
    {
        /* Build new MDL */
        MmBuildMdlForNonPagedPool( pMdlReplacement );

        /* Update IRP to reflect new MDL */
        pIrp->MdlAddress = pMdlReplacement;

        /* Update the pIrp->UserBuffer field too, so that it matches the MDL */
        pIrp->UserBuffer = MmGetMdlVirtualAddress( pMdlReplacement );
    }
    else
    {
        /* Without a replacement MDL the IRP keeps its original buffers */
        DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE,
            ("[DLKPFSD]: **** FAILED TO CREATE MDL REPLACEMENT ****\n" ) );
    }

    pWriteContext->pOriginalIRP = pIrp;

    /* Store our buffer away so the completion routine can free it */
    pWriteContext->pOurBuffer = pOurBuffer;

    IoCopyCurrentIrpStackLocationToNext( pIrp );

    /* Set up the completion routine */
    IoSetCompletionRoutine( pIrp, DLKPFSD_Write_Completion, pWorkItem,
                            TRUE, TRUE, TRUE );

    DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE,
        ("[DLKPFSD]: IoCallDriver from write start\n" ) );
    IoCallDriver( pExtension->pAttachedDeviceObject, pIrp );
    DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE,
        ("[DLKPFSD]: IoCallDriver from write done\n" ) );
}

-----Original Message-----
From: Tony Mason [mailto:xxxxx@osr.com]
Sent: 28 September 2002 11:37
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

Ben,

This indicates that the I/O operation isn’t finishing. You need to find
where that I/O operation has gone astray (probably in a DIFFERENT thread
context.)

I have seen IRP/MDL handling issues cause mysterious hangs before. Are you
modifying the Irp->MdlAddress at all? Are you using fast mutexes? They
have a side-effect of raising to APC_LEVEL, although paging I/O operations
are completed differently because they cannot use the APC mechanism (those
calls ARRIVE at APC_LEVEL.)

For example, if you do change Irp->MdlAddress (as is often done in
encryption drivers, for example) then you MUST also update Irp->UserBuffer.
It doesn’t matter what that value is, or whether it is a valid ADDRESS, so
long as MmGetMdlVirtualAddress(Irp->MdlAddress) is the same as
Irp->UserBuffer. I’ve seen this particular problem hang systems before.
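
The invariant described above can be illustrated with a small user-mode sketch. Everything here is a hypothetical stand-in: `MOCK_MDL`, `MOCK_IRP`, `SAVED_BUFFERS` and the helper functions are not kernel types or APIs, and the address arithmetic is assumed to mirror what the `MmGetMdlVirtualAddress` macro does (`StartVa` plus `ByteOffset`). It only shows the bookkeeping: set both fields together when swapping, and put both originals back on completion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel MDL and IRP; only the fields
   discussed in this thread are modelled. */
typedef struct { void *StartVa; unsigned long ByteOffset; } MOCK_MDL;
typedef struct { MOCK_MDL *MdlAddress; void *UserBuffer; } MOCK_IRP;
typedef struct { MOCK_MDL *OriginalMdl; void *OriginalUserBuffer; } SAVED_BUFFERS;

/* Assumed to mirror MmGetMdlVirtualAddress: page base plus byte offset. */
void *mock_mdl_virtual_address(const MOCK_MDL *mdl)
{
    return (char *)mdl->StartVa + mdl->ByteOffset;
}

/* Swap in a replacement MDL and keep UserBuffer consistent with it. */
void swap_buffer(MOCK_IRP *irp, MOCK_MDL *replacement, SAVED_BUFFERS *saved)
{
    assert(irp != NULL && replacement != NULL && saved != NULL);
    saved->OriginalMdl = irp->MdlAddress;
    saved->OriginalUserBuffer = irp->UserBuffer;
    irp->MdlAddress = replacement;
    irp->UserBuffer = mock_mdl_virtual_address(replacement);
}

/* Completion side: restore exactly what was received. */
void restore_buffer(MOCK_IRP *irp, const SAVED_BUFFERS *saved)
{
    irp->MdlAddress = saved->OriginalMdl;
    irp->UserBuffer = saved->OriginalUserBuffer;
}

/* Non-zero when UserBuffer matches the virtual address of the current MDL. */
int buffers_consistent(const MOCK_IRP *irp)
{
    return irp->MdlAddress != NULL &&
           irp->UserBuffer == mock_mdl_virtual_address(irp->MdlAddress);
}
```

A check like `buffers_consistent` is the sort of thing that could sit in a debug-only assertion just before the IRP is passed down.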

As always, more information is best.

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

-----Original Message-----
From: xxxxx@des.co.uk [mailto:xxxxx@des.co.uk]
Sent: Saturday, September 28, 2002 6:13 AM
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

Hi David,

FsContexts are indeed captured during create completion processing.

The USB device uses a smartcard microprocessor but that’s about as far
as it goes. I have implemented my own operating system in firmware so
no nice PC/SC! You are right to say that notification is sent when the
device is removed though.

I already queue write requests to a worker item so that I can talk to the
USB device. It only seems to be a problem when the underlying file system
calls FLUSH_BUFFERS. Here is the stack dump.

f356c3c8 804f6320 nt!KiSwapContext+0x2e (FPO: [EBP 0xf356c3fc] [0,0,4])
f356c3d4 804f04e8 nt!KiSwapThread+0x44 (FPO: [0,0,2])
f356c3fc 804fffa0 nt!KeWaitForSingleObject+0x1c0 (FPO: [Non-Fpo])
f356c4d4 805008ee nt!MiFlushSectionInternal+0x38a (FPO: [7,44,3])
f356c510 804dc340 nt!MmFlushSection+0x1e0 (FPO: [Non-Fpo])
f356c594 f5006090 nt!CcFlushCache+0x33e (FPO: [Non-Fpo])
f356c5c0 f5023628 Ntfs!NtfsFlushUserStream+0x6a (FPO: [Non-Fpo])
f356c630 f50239ca Ntfs!NtfsCommonFlushBuffers+0x12a (FPO: [Non-Fpo])
f356c694 804e5d53 Ntfs!NtfsFsdFlushBuffers+0x92 (FPO: [Non-Fpo])
f356c6a4 80623e10 nt!IopfCallDriver+0x31 (FPO: [0,0,1])
f356c6c8 baebd362 nt!IovCallDriver+0x9e (FPO: [Non-Fpo])
f356c6e4 baebca52 dlkpfsd!DLKPFSD_FlushBuffers_Dispatch+0x107 (CONV: stdcall)
f356c704 baebc8c5 dlkpfsd!DLKPFSD_FilterDispatch+0xa9 (CONV: stdcall)
f356c720 804e5d53 dlkpfsd!DLKPFSD_Default_DriverDispatch+0x72 (CONV: stdcall)
f356c730 80623e10 nt!IopfCallDriver+0x31 (FPO: [0,0,1])
f356c754 80556870 nt!IovCallDriver+0x9e (FPO: [Non-Fpo])
f356c768 8054f317 nt!IopSynchronousServiceTail+0x5e (FPO: [Non-Fpo])
f356c7e4 805283c1 nt!NtFlushBuffersFile+0x1ad (FPO: [Non-Fpo])

That KeWaitForSingleObject near the top is the problem: the event never gets
signaled!

NTFS calls flush buffers to write dirty data to disk, and I queue the write to a
worker. My worker gets called, but when I call IoCallDriver to do the actual
write, it hangs the thread.

Regards

Ben

-----Original Message-----
From: David J. Craig [mailto:xxxxx@yoshimuni.com]
Sent: 28 September 2002 02:26
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

I am assuming that the FsContexts are captured during create completion
processing.

Is the SmartCard reader under the control of PC/SC? If so, the application
will be notified when it is removed. The floppy-drive-based SmartCard reader
couldn’t get WHQL approval because there is no automatic notification to the
owning driver when a floppy or pseudo floppy is removed.

Write requests may come in at APC_LEVEL and the completion routine might be
called at DISPATCH_LEVEL. You may have to queue to a worker thread at
PASSIVE_LEVEL. Luckily I have seen someone say that creates are always
completed at PASSIVE_LEVEL. My testing to date has not disproved this idea,
and it also appears that with the MS file systems you remain in the user’s
context in the completion routine, but I still capture all the info I need
during the create dispatch routine.

----- Original Message -----
From:
To: “File Systems Developers”
Sent: Friday, September 27, 2002 6:33 AM
Subject: [ntfsd] Re: Could this be a synchronization problem…

>
> Hi David,
>
> To clarify your questions:
>
> 1. I determine an encrypted file by a linked list of file objects. This list
>    is keyed on the fscontext value so that I don’t miss stream file objects.
>    Basically, I have a list of fscontext values, and within each node of the
>    list I have a further linked list of fileobjects associated with the
>    fscontext.
> 2. The reason for getting the key during the write is so that if the usb
>    device is removed unexpectedly I can deny access to the file. I don’t
>    really want to keep a list of cached keys in memory any longer than is
>    needed.
> 3. With regards to the APC_LEVEL observation, I think this is the problem ;-)
> 4. I only encrypt/decrypt pagingio, so that data in the cache is left
>    decrypted. I agree that keeping the data in the cache encrypted is
>    practically impossible.
> 5. All buffers and pointers checked and accounted for!
>
> Regards
>
> Ben

Ben,

You are living dangerously here - you are using a non-paging I/O marked IRP
for satisfying what I assume is a paging I/O operation. That BEGS for
problems. Have you been following the discussion about I/O completion
processing? Well, I/O completion processing is different when done for
paging I/O operations.

I’m not saying you are “doing it wrong” but rather that this is another
potential area for errors.

Again - what are the other threads in the OS doing? You indicate that
IoCallDriver never returns. If so, how far along does your IRP go? Have
you walked into the underlying driver(s) to see if you can figure out what
has happened to them?

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com


You should not really be posting paging i/o to a worker thread. You’ll
step on a minefield if you do. For paging i/o, the buffers are already
locked down, so you should be able to do all your work in the dispatch.
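
As a rough sketch of that routing decision (not the filter's actual code): the function name `choose_write_action` and the `write_action` enum are made up for illustration, and the flag values are the ones I believe wdm.h defines for `Irp->Flags`. It assumes, as stated earlier in the thread, that only paging I/O is encrypted. How the key becomes available without blocking in the dispatch path is outside this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as I recall them from wdm.h (Irp->Flags). */
#define IRP_NOCACHE    0x00000001
#define IRP_PAGING_IO  0x00000002

/* Hypothetical outcome of the write dispatch decision. */
typedef enum {
    PASS_THROUGH,    /* hand the IRP to the lower driver untouched */
    ENCRYPT_INLINE   /* encrypt in the dispatch routine, in the caller's thread */
} write_action;

/* Paging writes to encrypted files are handled in the dispatch path itself
   rather than posted to a worker; everything else passes straight through,
   so cached data stays in the clear. */
write_action choose_write_action(unsigned long irp_flags, bool file_is_encrypted)
{
    if (file_is_encrypted && (irp_flags & IRP_PAGING_IO) != 0) {
        return ENCRYPT_INLINE;
    }
    return PASS_THROUGH;
}
```

The point of the design is that there is no "post to worker" outcome at all for paging writes.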

There is a very good reason why you would hit a deadlock when you post a
paging I/O write in this case. The file system can acquire the Scb
exclusive, as well as the paging resource, before issuing the flush. When
the write comes through, the paging resource is re-acquired. Normally this
succeeds because the same thread already holds it, but if you posted the
write, the worker thread does not hold the resource, so you will
deadlock.
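
To illustrate the ownership rule, here is a toy user-mode model. It is not how the file system's resources are implemented: `demo_resource`, the integer thread ids and the function names are all hypothetical. It models only one property, that an exclusively held resource can be re-acquired by its owner and by nobody else. Where the model returns `false`, a real thread would block forever.

```c
#include <assert.h>
#include <stdbool.h>

enum { NO_OWNER = -1 };

/* Toy model of an exclusively owned, recursively acquirable resource. */
typedef struct {
    int owner;  /* id of the owning thread, or NO_OWNER */
    int depth;  /* recursive acquisition count */
} demo_resource;

/* Succeeds if the resource is free or already owned by this thread. */
bool demo_try_acquire(demo_resource *r, int thread_id)
{
    if (r->owner == NO_OWNER || r->owner == thread_id) {
        r->owner = thread_id;
        r->depth++;
        return true;
    }
    return false;  /* a real acquire would wait here */
}

/* Drops one level of ownership; frees the resource at depth zero. */
void demo_release(demo_resource *r, int thread_id)
{
    assert(r->owner == thread_id && r->depth > 0);
    if (--r->depth == 0) {
        r->owner = NO_OWNER;
    }
}

/* The flush path takes the resource, then the paging write tries to take
   it again from write_thread. Returns whether the write could proceed. */
bool demo_flush_then_write(int flush_thread, int write_thread)
{
    demo_resource paging = { NO_OWNER, 0 };
    bool ok;

    demo_try_acquire(&paging, flush_thread);
    ok = demo_try_acquire(&paging, write_thread);
    if (ok) {
        demo_release(&paging, write_thread);
    }
    demo_release(&paging, flush_thread);
    return ok;
}
```

With the same id for both arguments the write goes through. With a different id for the write (the posted case) it cannot, while the flushing thread is itself waiting for that write to finish.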

I’ve been seeing a few filters doing this lately & this will surely lead
to a deadlock sooner or later.

Ravi

This posting is provided “AS IS” with no warranties, and confers no
rights.

-----Original Message-----
From: Tony Mason [mailto:xxxxx@osr.com]
Sent: Saturday, September 28, 2002 5:15 AM
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

Ben,

You are living dangerously here - you are using a non-paging I/O marked
IRP for satisfying what I assume is a paging I/O operation. That BEGS
for problems. Have you been following the discussion about I/O
completion processing? Well, I/O completion processing is different
when done for paging I/O operations.

I’m not saying you are “doing it wrong” but rather that this is another
potential area for errors.

Again - what are the other threads in the OS doing? You indicate that
IoCallDriver never returns. If so, how far along does your IRP go?
Have you walked into the underlying driver(s) to see if you can figure
out what has happened to them?

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

-----Original Message-----
From: xxxxx@des.co.uk [mailto:xxxxx@des.co.uk]
Sent: Saturday, September 28, 2002 6:43 AM
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

Hi Tony,

Thanks for that, I have included the Write worker routine below (minus
all the error checking and debug!), as you can see i have (i think)
updated the mdl & userbuffer address correctly!

It just hangs right down the bottom where i call IoCallDriver(…)

Any ideas on what the best approach to tracking this down is?

Regards

Ben

VOID DLKPFSD_Write_Worker( PVOID pContext )
{
PDLKPFSD_WRITE_WORKERITEM pWorkItem =
(PDLKPFSD_WRITE_WORKERITEM)pContext;
CDLKPFSD_Device_Extension* pExtension =
DLKPFSD_GetDeviceExtension( pWorkItem->ContextData->pDeviceObject );
PDLKPFSD_WRITECOMPLETION_CONTEXT pWriteContext =
pWorkItem->ContextData;
PIRP pIrp =
pWorkItem->ContextData->pOriginalIRP;
PIO_STACK_LOCATION pCurrentIrpStack =
IoGetCurrentIrpStackLocation( pIrp );
CDLKPFSD_TrackedFileObject* pTrackedFile =
pWorkItem->ContextData->pTrackedFile;

PMDL pMdl = pIrp->MdlAddress;
PUCHAR pUserBuffer = NULL;
ULONG uLength =
pCurrentIrpStack->Parameters.Write.Length;

if( pMdl )
{
pUserBuffer = (PUCHAR)MmGetSystemAddressForMdlSafe(
pMdl,
HighPagePriority );
}
else
{
pUserBuffer = ((PUCHAR)pIrp->UserBuffer);
}

/* Store Original Buffers */
pWriteContext->pOriginalMdl = pIrp->MdlAddress;
pWriteContext->pOriginalUserBuffer = pIrp->UserBuffer;

/* Allocate storage */
PVOID pOurBuffer = ExAllocatePool( NonPagedPool, uLength );

if( pUserBuffer && pOurBuffer )
{
/* Copy supplied data to our buffer coz we can`t do it
in place */
RtlCopyBytes( pOurBuffer, pUserBuffer, uLength );

/* Do the encryption */
__try
{
DLP_MINI_KEY data;
ULONG ulErr;

RtlZeroMemory( &data, sizeof( DLP_MINI_KEY ) );

CDLKPFSD_TrackedFileObject*
pTrackedFileObject =
pWorkItem->ContextData->pTrackedFile;
CDLKPFSD_TrackedFolderObject* pFolderObject =
(CDLKPFSD_TrackedFolderObject*)pTrackedFileObject->GetData();

ulErr = TOKEN_ExtractKey( 0,
pFolderObject->GetKeySerial(), pFolderObject->GetKeyIndex(), FALSE,
pFolderObject->&data );

if( ulErr != DLPCLNT_ERROR_SUCCESS )
{
/* Cleanup */
ExFreePool( pOurBuffer );
ExFreePool( pWriteContext );
ExFreePool( pWorkItem );

/* Complete the IRP with error */
pIrp->IoStatus.Status =
STATUS_ACCESS_DENIED;

IoCompleteRequest( pIrp, IO_NO_INCREMENT
);

return;
}
else
{
PLUGIN_Encrypt( pFolderObject->GetIV(),

data.bKey,

(PUCHAR)pOurBuffer,

(int)uLength,

(int)(pCurrentIrpStack->Parameters.Write.ByteOffset.QuadPart &
0x0FFFFFFFF),

data.bAlgoID
);
}
}
__except(EXCEPTION_EXECUTE_HANDLER)
{
DLKPFSD_ExceptionHandler( GetExceptionCode(), 0,
0,
0, 0, FALSE );
}
}
else
{
DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE, (“[DLKPFSD]:
pUserBuffer || pOurBuffer == NULL!\n” ) );
}

/* Allocate a new MDL */
PMDL pMdlReplacement = IoAllocateMdl( pOurBuffer, uLength,
FALSE, FALSE, NULL );

if( pMdlReplacement )
{
/* Build new mdl */
MmBuildMdlForNonPagedPool( pMdlReplacement );

/* Update IRP to reflect new MDL */
pIrp->MdlAddress = pMdlReplacement;

/* Update the pIrp->UserBuffer field to!! */
pIrp->UserBuffer = MmGetMdlVirtualAddress(
pMdlReplacement
);

}
else
{
DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE, (“[DLKPFSD]:
**** FAILED TO CREATE MDL REPLACEMENT ****\n” ) );
}

pWriteContext->pOriginalIRP = pIrp;

/* Store our buffer away */
pWriteContext->pOurBuffer = pOurBuffer;

IoCopyCurrentIrpStackLocationToNext( pIrp );

/* Setup the completion routine */
IoSetCompletionRoutine( pIrp, DLKPFSD_Write_Completion,
pWorkItem,
TRUE, TRUE, TRUE );

DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE, (“[DLKPFSD]:
IoCallDriver from write start\n” ) );
IoCallDriver( pExtension->pAttachedDeviceObject, pIrp );
DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE, (“[DLKPFSD]:
IoCallDriver from write done\n” ) ); }

-----Original Message-----
From: Tony Mason [mailto:xxxxx@osr.com]
Sent: 28 September 2002 11:37
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

Ben,

This indicates that the I/O operation isn’t finishing. You need to find
where that I/O operation has gone astray (probably in a DIFFERENT thread
context.)

I have seen IRP/MDL handling issues cause mysterious hangs before. Are
you modifying the Irp->MdlAddress at all? Are you using fast mutexes?
They have a side-effect of raising to APC_LEVEL, although paging I/O
operations are completed differently because they cannot use the APC
mechanism (those calls ARRIVE at APC_LEVEL.)

For example, if you do change Irp->MdlAddress (as is often done in
encryption drivers, for example) then you MUST also update
Irp->UserBuffer. It doesn’t matter what that value is, or whether it is
a valid ADDRESS, so long as MmGetMdlVirtualAddress(Irp->MdlAddress) is
the same as
Irp->UserBuffer. I’ve seen this particular problem hang systems before.

As always, more information is best.

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

-----Original Message-----
From: xxxxx@des.co.uk [mailto:xxxxx@des.co.uk]
Sent: Saturday, September 28, 2002 6:13 AM
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

Hi David,

FsContexts are indeed captured during create completion processing.

The USB device uses a smartcard microprocessor but that’s about as far
as it goes. I have implemented my own operating system in firmware so
no nice PC/SC! You are right to say that notification is sent when the
device is removed though.

I already queue write requests to a worker item so that I can talk to
the USB device. It only seems to be a problem when the underlying file
system calls FLUSH_BUFFERS. Here is the stack dump.

f356c3c8 804f6320 nt!KiSwapContext+0x2e (FPO: [EBP 0xf356c3fc] [0,0,4])
f356c3d4 804f04e8 nt!KiSwapThread+0x44 (FPO: [0,0,2]) f356c3fc 804fffa0
nt!KeWaitForSingleObject+0x1c0 (FPO: [Non-Fpo]) f356c4d4 805008ee
nt!MiFlushSectionInternal+0x38a (FPO: [7,44,3]) f356c510 804dc340
nt!MmFlushSection+0x1e0 (FPO: [Non-Fpo]) f356c594 f5006090
nt!CcFlushCache+0x33e (FPO: [Non-Fpo]) f356c5c0 f5023628
Ntfs!NtfsFlushUserStream+0x6a (FPO: [Non-Fpo]) f356c630 f50239ca
Ntfs!NtfsCommonFlushBuffers+0x12a (FPO: [Non-Fpo]) f356c694 804e5d53
Ntfs!NtfsFsdFlushBuffers+0x92 (FPO: [Non-Fpo]) f356c6a4 80623e10
nt!IopfCallDriver+0x31 (FPO: [0,0,1]) f356c6c8 baebd362
nt!IovCallDriver+0x9e (FPO: [Non-Fpo]) f356c6e4 baebca52
dlkpfsd!DLKPFSD_FlushBuffers_Dispatch+0x107 (CONV:
stdcall)
f356c704 baebc8c5 dlkpfsd!DLKPFSD_FilterDispatch+0xa9 (CONV: stdcall)
f356c720 804e5d53 dlkpfsd!DLKPFSD_Default_DriverDispatch+0x72 (CONV:
stdcall)
f356c730 80623e10 nt!IopfCallDriver+0x31 (FPO: [0,0,1]) f356c754
80556870 nt!IovCallDriver+0x9e (FPO: [Non-Fpo]) f356c768 8054f317
nt!IopSynchronousServiceTail+0x5e (FPO: [Non-Fpo]) f356c7e4 805283c1
nt!NtFlushBuffersFile+0x1ad (FPO: [Non-Fpo])

That KeWaitForSingleObject near the top is the problem, the event never
gets signaled!

NTFS Calls flush buffers to write dirty data to disk, i queue the write
to a worker, my worker gets called, but when I call IoCallDriver to do
the actual write it hangs the thread.

Regards

Ben

-----Original Message-----
From: David J. Craig [mailto:xxxxx@yoshimuni.com]
Sent: 28 September 2002 02:26
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…

I am assuming that the FsContexts are captured during create completion
processing.

Is the SmartCard reader under the control of PC/SC? If so, the
application will be notified when it is removed. That is why the
floppy-drive-based SmartCard reader couldn’t get WHQL approval: there is
no automatic notification to the owning driver when a floppy or pseudo
floppy is removed.

Write requests may come in at APC_LEVEL and the completion routine might
be called at DISPATCH_LEVEL. You may have to queue to a worker thread
at PASSIVE_LEVEL. Luckily I have seen someone say that creates are
always completed at PASSIVE_LEVEL. My testing to date has not disproved
this idea and it also appears that with the MS file systems you remain
in the user’s context in the completion routine also, but I still
capture all the info I needed during the create dispatch routine.

----- Original Message -----
From:
To: “File Systems Developers”
Sent: Friday, September 27, 2002 6:33 AM
Subject: [ntfsd] Re: Could this be a synchronization problem…

>
> Hi David,
>
> To clarify your questions:
>
> 1. I determine an encrypted file by a linked list of file objects. This
>    list is keyed on the FsContext value so that I don’t miss stream file
>    objects. Basically, I have a list of FsContext values, and within
>    each node of the list I have a further linked list of file objects
>    associated with that FsContext.
> 2. The reason for getting the key during the write is so that if the
>    USB device is removed unexpectedly I can deny access to the file. I
>    don’t really want to keep a list of cached keys in memory any longer
>    than is needed.
> 3. With regards to the APC_LEVEL observation, I think this is the
>    problem ;-)
> 4. I only encrypt/decrypt paging I/O, so that data in the cache is left
>    decrypted. I agree that keeping the data in the cache encrypted is
>    practically impossible.
> 5. All buffers and pointers checked and accounted for!
>
> Regards
>
> Ben
>
>


You are currently subscribed to ntfsd as: xxxxx@des.co.uk
To unsubscribe send a blank email to %%email.unsub%%



Why not just pass the flush buffers request on down to NTFS and not bother
with it? Since this is an encryption driver, you don’t really care about
the reads and writes that come to the file system until they get passed back
from the cache manager (this excludes non-cached requests, of course).
When a cleanup comes for a file, there may be a lot of dirty buffers for
that drive. You will get the various data buffers to be written whenever
the FSD, cache manager, and/or memory manager decide to flush them. If the
drive is removable, it can take over a minute before some of the data and
system areas are flushed. XP is supposed to be better at this,
but I haven’t seen it proven to my satisfaction.

It looks from the dump that something has caused NTFS to switch context
during the flush operation. Maybe since you queued the request to a worker
thread, the buffers are no longer valid in that context. That is one reason
I suggest that you not queue the request.
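
As a rough illustration of that triage, here is a user-mode model (not
driver code) of the decision an encryption filter might make per request.
The flag values are redefined locally and are meant to mirror the IRP flag
bits in wdm.h; the enums and the function name classify are hypothetical,
introduced only for this sketch.

```c
#include <assert.h>

/* Flag values redefined locally so this builds in user mode; they are
   meant to mirror the IRP flag bits in wdm.h (treat as an assumption). */
#define IRP_NOCACHE               0x00000001u
#define IRP_PAGING_IO             0x00000002u
#define IRP_SYNCHRONOUS_PAGING_IO 0x00000040u

/* Hypothetical stand-ins for the two major function codes of interest. */
typedef enum { MJ_WRITE, MJ_FLUSH_BUFFERS } major_t;
typedef enum { ACT_PASS_THROUGH, ACT_ENCRYPT } action_t;

/* Decide what the filter does with a request. */
action_t classify(major_t major, unsigned int irp_flags)
{
    /* Bits that mark a write as going to disk rather than into the cache. */
    const unsigned int on_disk_bits =
        IRP_NOCACHE | IRP_PAGING_IO | IRP_SYNCHRONOUS_PAGING_IO;

    assert(major == MJ_WRITE || major == MJ_FLUSH_BUFFERS);

    if (major == MJ_FLUSH_BUFFERS)
        return ACT_PASS_THROUGH;   /* flush carries no data to transform */
    if ((irp_flags & on_disk_bits) != 0)
        return ACT_ENCRYPT;        /* paging or non-cached write */
    return ACT_PASS_THROUGH;       /* cached write: data stays in the cache */
}
```

In this model a cached write and a flush are both passed straight down;
only paging and non-cached writes are candidates for encryption.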


> re-acquired - normally this would go through, but if you posted the
> write, since the worker thread does not hold the resource, you will
> deadlock.

Am I wrong that the notion of TopLevelIrp is to protect from this
deadlock?

IIRC, TopLevelIrp == NULL means that locks are not acquired, while
TopLevelIrp != NULL means that locks are already acquired for this
operation by some upper code, so just proceed without locking.

So, if the worker thread sets TopLevelIrp to FSRTL_FSP_TOP_LEVEL_IRP
before issuing the IRP to the FSD, things can be OK.

Max
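
The convention as recalled above can be modelled in user mode roughly as
follows. This is only a sketch of the idea, not of what NTFS or any real
FSD does: resource_t, fsd_write and MODEL_FSP_TOP_LEVEL are hypothetical
names, and the per-thread TopLevelIrp slot is passed in as a plain
parameter.

```c
#include <assert.h>
#include <stddef.h>

/* User-mode model only; none of these names are real kernel APIs. */
typedef struct {
    int owner;   /* thread id holding the resource, 0 when free */
    int depth;   /* recursive acquire count for the owner */
} resource_t;

#define MODEL_FSP_TOP_LEVEL ((const void *)1)  /* stand-in marker value */

/* Returns 1 if the write can proceed, 0 if the caller would block. */
int fsd_write(resource_t *res, int thread_id, const void *top_level_irp)
{
    assert(res != NULL && thread_id != 0);

    if (top_level_irp != NULL)
        return 1;               /* "locks already held above": skip locking */

    if (res->owner != 0 && res->owner != thread_id)
        return 0;               /* another thread owns it: would block */

    res->owner = thread_id;     /* free, or recursive acquire by the owner */
    res->depth++;
    /* ... the write itself would happen here ... */
    if (--res->depth == 0)
        res->owner = 0;         /* last release frees the resource */
    return 1;
}
```

With the resource held by thread 1, a call from worker thread 2 with a
NULL marker would block, while the same call with the marker set goes
through without touching the lock, which is the behaviour the proposal
relies on.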

Unfortunately, the TopLevelIrp field is used by filesystems in whatever
way they wish. It is not a good idea for filters to overload it and
send it down, as it can confuse FSDs. What a filesystem puts in the
TopLevelIrp field is very particular to that filesystem: NTFS, for
instance, usually puts in a pointer to an internal data structure.

Secondly, there are a few times when NTFS does not set the TopLevelIrp
field, but holds a lock before recursing down the stack. One example is
the FastIoCheckIfPossible() callback. Another is during a particular
paging I/O path. In reality, during paging I/O some lock is always held
by the filesystem, regardless of what the TopLevelIrp field says.

Ravi

This posting is provided “AS IS” with no warranties, and confers no
rights.
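
To make the consequence concrete, here is a small user-mode model of the
wait-for relationships involved when a paging write is issued while the
filesystem already holds a lock. It is a sketch under simplifying
assumptions (one resource, threads identified by small integers), and the
names has_cycle and paging_write_deadlocks are hypothetical.

```c
#include <assert.h>

#define MAX_THREADS 4
#define NOBODY (-1)

/* waits_for[t] is the thread that t is blocked on, or NOBODY.
   Follows the chain from `start` and reports whether it loops back. */
int has_cycle(const int waits_for[], int n, int start)
{
    int t = waits_for[start];
    for (int steps = 0; steps < n && t != NOBODY; steps++) {
        if (t == start)
            return 1;
        t = waits_for[t];
    }
    return 0;
}

/* `owner` holds the resource and is waiting for the paging write to
   finish. The write is sent down by `issuing_thread`. */
int paging_write_deadlocks(int owner, int issuing_thread)
{
    int waits_for[MAX_THREADS] = { NOBODY, NOBODY, NOBODY, NOBODY };

    assert(owner >= 0 && owner < MAX_THREADS);
    assert(issuing_thread >= 0 && issuing_thread < MAX_THREADS);

    if (issuing_thread != owner) {
        /* Posted write: the owner waits for the worker to complete the
           request, and the worker waits for the resource the owner holds. */
        waits_for[owner] = issuing_thread;
        waits_for[issuing_thread] = owner;
    }
    /* Inline write: the owner re-acquires its own resource recursively,
       so no wait-for edges are created. */
    return has_cycle(waits_for, MAX_THREADS, owner);
}
```

In this model the inline case has no wait-for edges at all, while the
posted case forms a two-thread cycle, whatever a TopLevelIrp-style marker
might say.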


Hi all,

Thanks to all who tried to help with my post. I have now decided that I
could spend forever trying to get this code working, especially when other
filter drivers are introduced to the stack.

So, I have decided to rewrite the complete routine without relying on
queuing to a worker! ;-)

Thanks

Ben
