You really should not be posting paging I/O to a worker thread. You’ll
step on a minefield if you do. For paging I/O, the buffers are already
locked down, so you should be able to do all your work in the dispatch
routine.
There is a very good reason why you hit a deadlock when you post a
paging I/O request in the case below. The file system can acquire the
Scb exclusive, as well as the paging resource, before issuing the flush.
When the write comes through, the paging resource is re-acquired.
Normally this goes through, because the write arrives on the thread that
already owns the resource. If you post the write, the worker thread does
not hold the resource, so you will deadlock.
I’ve been seeing a few filters doing this lately & this will surely lead
to a deadlock sooner or later.
Ravi
This posting is provided “AS IS” with no warranties, and confers no
rights.
-----Original Message-----
From: Tony Mason [mailto:xxxxx@osr.com]
Sent: Saturday, September 28, 2002 5:15 AM
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…
Ben,
You are living dangerously here - you are using a non-paging I/O marked
IRP for satisfying what I assume is a paging I/O operation. That BEGS
for problems. Have you been following the discussion about I/O
completion processing? Well, I/O completion processing is different
when done for paging I/O operations.
I’m not saying you are “doing it wrong” but rather that this is another
potential area for errors.
Again - what are the other threads in the OS doing? You indicate that
IoCallDriver never returns. If so, how far along does your IRP go?
Have you walked into the underlying driver(s) to see if you can figure
out what has happened to them?
Regards,
Tony
Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com
-----Original Message-----
From: xxxxx@des.co.uk [mailto:xxxxx@des.co.uk]
Sent: Saturday, September 28, 2002 6:43 AM
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…
Hi Tony,
Thanks for that. I have included the write worker routine below (minus
all the error checking and debug output!). As you can see, I have (I
think) updated the MDL and UserBuffer address correctly!
It just hangs right at the bottom, where I call IoCallDriver(…).
Any ideas on the best approach to tracking this down?
Regards
Ben
VOID DLKPFSD_Write_Worker( PVOID pContext )
{
    PDLKPFSD_WRITE_WORKERITEM pWorkItem =
        (PDLKPFSD_WRITE_WORKERITEM)pContext;
    CDLKPFSD_Device_Extension* pExtension =
        DLKPFSD_GetDeviceExtension( pWorkItem->ContextData->pDeviceObject );
    PDLKPFSD_WRITECOMPLETION_CONTEXT pWriteContext =
        pWorkItem->ContextData;
    PIRP pIrp = pWorkItem->ContextData->pOriginalIRP;
    PIO_STACK_LOCATION pCurrentIrpStack =
        IoGetCurrentIrpStackLocation( pIrp );
    CDLKPFSD_TrackedFileObject* pTrackedFile =
        pWorkItem->ContextData->pTrackedFile;
    PMDL pMdl = pIrp->MdlAddress;
    PUCHAR pUserBuffer = NULL;
    ULONG uLength = pCurrentIrpStack->Parameters.Write.Length;

    if( pMdl )
    {
        pUserBuffer = (PUCHAR)MmGetSystemAddressForMdlSafe(
            pMdl,
            HighPagePriority );
    }
    else
    {
        pUserBuffer = ((PUCHAR)pIrp->UserBuffer);
    }

    /* Store Original Buffers */
    pWriteContext->pOriginalMdl = pIrp->MdlAddress;
    pWriteContext->pOriginalUserBuffer = pIrp->UserBuffer;

    /* Allocate storage */
    PVOID pOurBuffer = ExAllocatePool( NonPagedPool, uLength );

    if( pUserBuffer && pOurBuffer )
    {
        /* Copy supplied data to our buffer coz we can't do it in place */
        RtlCopyBytes( pOurBuffer, pUserBuffer, uLength );

        /* Do the encryption */
        __try
        {
            DLP_MINI_KEY data;
            ULONG ulErr;

            RtlZeroMemory( &data, sizeof( DLP_MINI_KEY ) );

            CDLKPFSD_TrackedFileObject* pTrackedFileObject =
                pWorkItem->ContextData->pTrackedFile;
            CDLKPFSD_TrackedFolderObject* pFolderObject =
                (CDLKPFSD_TrackedFolderObject*)pTrackedFileObject->GetData();

            ulErr = TOKEN_ExtractKey( 0,
                                      pFolderObject->GetKeySerial(),
                                      pFolderObject->GetKeyIndex(),
                                      FALSE,
                                      &data );

            if( ulErr != DLPCLNT_ERROR_SUCCESS )
            {
                /* Cleanup */
                ExFreePool( pOurBuffer );
                ExFreePool( pWriteContext );
                ExFreePool( pWorkItem );

                /* Complete the IRP with error */
                pIrp->IoStatus.Status = STATUS_ACCESS_DENIED;
                IoCompleteRequest( pIrp, IO_NO_INCREMENT );
                return;
            }
            else
            {
                PLUGIN_Encrypt( pFolderObject->GetIV(),
                                data.bKey,
                                (PUCHAR)pOurBuffer,
                                (int)uLength,
                                (int)(pCurrentIrpStack->Parameters.Write.ByteOffset.QuadPart &
                                      0x0FFFFFFFF),
                                data.bAlgoID );
            }
        }
        __except(EXCEPTION_EXECUTE_HANDLER)
        {
            DLKPFSD_ExceptionHandler( GetExceptionCode(), 0, 0, 0, 0, FALSE );
        }
    }
    else
    {
        DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE,
            ("[DLKPFSD]: pUserBuffer || pOurBuffer == NULL!\n" ) );
    }

    /* Allocate a new MDL */
    PMDL pMdlReplacement = IoAllocateMdl( pOurBuffer, uLength,
                                          FALSE, FALSE, NULL );

    if( pMdlReplacement )
    {
        /* Build new mdl */
        MmBuildMdlForNonPagedPool( pMdlReplacement );

        /* Update IRP to reflect new MDL */
        pIrp->MdlAddress = pMdlReplacement;

        /* Update the pIrp->UserBuffer field too!! */
        pIrp->UserBuffer = MmGetMdlVirtualAddress( pMdlReplacement );
    }
    else
    {
        DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE,
            ("[DLKPFSD]: **** FAILED TO CREATE MDL REPLACEMENT ****\n" ) );
    }

    pWriteContext->pOriginalIRP = pIrp;

    /* Store our buffer away */
    pWriteContext->pOurBuffer = pOurBuffer;

    IoCopyCurrentIrpStackLocationToNext( pIrp );

    /* Setup the completion routine */
    IoSetCompletionRoutine( pIrp, DLKPFSD_Write_Completion,
                            pWorkItem,
                            TRUE, TRUE, TRUE );

    DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE,
        ("[DLKPFSD]: IoCallDriver from write start\n" ) );

    IoCallDriver( pExtension->pAttachedDeviceObject, pIrp );

    DLKPFSD_LOG_PRINT( DLKPFSD_DEBUG_WRITE,
        ("[DLKPFSD]: IoCallDriver from write done\n" ) );
}
-----Original Message-----
From: Tony Mason [mailto:xxxxx@osr.com]
Sent: 28 September 2002 11:37
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…
Ben,
This indicates that the I/O operation isn’t finishing. You need to find
where that I/O operation has gone astray (probably in a DIFFERENT thread
context.)
I have seen IRP/MDL handling issues cause mysterious hangs before. Are
you modifying the Irp->MdlAddress at all? Are you using fast mutexes?
They have a side-effect of raising to APC_LEVEL, although paging I/O
operations are completed differently because they cannot use the APC
mechanism (those calls ARRIVE at APC_LEVEL.)
For example, if you do change Irp->MdlAddress (as is often done in
encryption drivers) then you MUST also update Irp->UserBuffer. It
doesn’t matter what that value is, or whether it is a valid ADDRESS, so
long as MmGetMdlVirtualAddress(Irp->MdlAddress) is the same as
Irp->UserBuffer. I’ve seen this particular problem hang systems before.
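As a rough illustration of that invariant, here is a user-mode sketch. MOCK_MDL, MOCK_IRP, SWAP_CONTEXT and the helper names are hypothetical stand-ins, not the real kernel structures; the only thing borrowed from the DDK is that MmGetMdlVirtualAddress evaluates to the MDL’s StartVa plus its ByteOffset. It shows the save, replace and restore steps a filter would perform around the call to the lower driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-mode stand-ins; the real MDL and IRP carry many more
 * fields and must only be manipulated through the kernel's own routines. */
typedef struct { char *StartVa; size_t ByteOffset; } MOCK_MDL;
typedef struct { MOCK_MDL *MdlAddress; void *UserBuffer; } MOCK_IRP;
typedef struct { MOCK_MDL *OriginalMdl; void *OriginalUserBuffer; } SWAP_CONTEXT;

/* Mirrors what MmGetMdlVirtualAddress computes: StartVa + ByteOffset. */
static void *mock_mdl_virtual_address(const MOCK_MDL *mdl)
{
    return mdl->StartVa + mdl->ByteOffset;
}

/* Before calling the lower driver: remember the caller's values, then
 * install the replacement MDL and keep UserBuffer in step with it. */
static void swap_in_buffer(MOCK_IRP *irp, MOCK_MDL *replacement,
                           SWAP_CONTEXT *ctx)
{
    assert(replacement != NULL);
    ctx->OriginalMdl = irp->MdlAddress;
    ctx->OriginalUserBuffer = irp->UserBuffer;
    irp->MdlAddress = replacement;
    irp->UserBuffer = mock_mdl_virtual_address(replacement);
}

/* In the completion path: put both fields back exactly as received. */
static void restore_buffer(MOCK_IRP *irp, const SWAP_CONTEXT *ctx)
{
    irp->MdlAddress = ctx->OriginalMdl;
    irp->UserBuffer = ctx->OriginalUserBuffer;
}

/* Self-check: returns 1 if the invariant holds after the swap and the
 * original values are back after the restore, 0 otherwise. */
static int swap_round_trip_ok(void)
{
    static char original[64], encrypted[64];
    MOCK_MDL original_mdl = { original, 8 };
    MOCK_MDL replacement_mdl = { encrypted, 0 };
    MOCK_IRP irp = { &original_mdl, original + 8 };
    SWAP_CONTEXT ctx;

    swap_in_buffer(&irp, &replacement_mdl, &ctx);
    if (irp.UserBuffer != mock_mdl_virtual_address(irp.MdlAddress))
        return 0;

    restore_buffer(&irp, &ctx);
    return irp.MdlAddress == &original_mdl &&
           irp.UserBuffer == (void *)(original + 8);
}
```

Saving both fields together and restoring both together keeps the pair from drifting apart, which is the failure mode described above.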
As always, more information is best.
Regards,
Tony
Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com
-----Original Message-----
From: xxxxx@des.co.uk [mailto:xxxxx@des.co.uk]
Sent: Saturday, September 28, 2002 6:13 AM
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…
Hi David,
FsContexts are indeed captured during create completion processing.
The USB device uses a smartcard microprocessor but that’s about as far
as it goes. I have implemented my own operating system in firmware so
no nice PC/SC! You are right to say that notification is sent when the
device is removed though.
I already queue write requests to a worker item so that I can talk to
the USB device. It only seems to be a problem when the underlying file
system calls FLUSH_BUFFERS. Here is the stack dump.
f356c3c8 804f6320 nt!KiSwapContext+0x2e (FPO: [EBP 0xf356c3fc] [0,0,4])
f356c3d4 804f04e8 nt!KiSwapThread+0x44 (FPO: [0,0,2])
f356c3fc 804fffa0 nt!KeWaitForSingleObject+0x1c0 (FPO: [Non-Fpo])
f356c4d4 805008ee nt!MiFlushSectionInternal+0x38a (FPO: [7,44,3])
f356c510 804dc340 nt!MmFlushSection+0x1e0 (FPO: [Non-Fpo])
f356c594 f5006090 nt!CcFlushCache+0x33e (FPO: [Non-Fpo])
f356c5c0 f5023628 Ntfs!NtfsFlushUserStream+0x6a (FPO: [Non-Fpo])
f356c630 f50239ca Ntfs!NtfsCommonFlushBuffers+0x12a (FPO: [Non-Fpo])
f356c694 804e5d53 Ntfs!NtfsFsdFlushBuffers+0x92 (FPO: [Non-Fpo])
f356c6a4 80623e10 nt!IopfCallDriver+0x31 (FPO: [0,0,1])
f356c6c8 baebd362 nt!IovCallDriver+0x9e (FPO: [Non-Fpo])
f356c6e4 baebca52 dlkpfsd!DLKPFSD_FlushBuffers_Dispatch+0x107 (CONV: stdcall)
f356c704 baebc8c5 dlkpfsd!DLKPFSD_FilterDispatch+0xa9 (CONV: stdcall)
f356c720 804e5d53 dlkpfsd!DLKPFSD_Default_DriverDispatch+0x72 (CONV: stdcall)
f356c730 80623e10 nt!IopfCallDriver+0x31 (FPO: [0,0,1])
f356c754 80556870 nt!IovCallDriver+0x9e (FPO: [Non-Fpo])
f356c768 8054f317 nt!IopSynchronousServiceTail+0x5e (FPO: [Non-Fpo])
f356c7e4 805283c1 nt!NtFlushBuffersFile+0x1ad (FPO: [Non-Fpo])
That KeWaitForSingleObject near the top is the problem: the event never
gets signaled!
NTFS calls flush buffers to write dirty data to disk, I queue the write
to a worker, and my worker gets called, but when I call IoCallDriver to
do the actual write it hangs the thread.
Regards
Ben
-----Original Message-----
From: David J. Craig [mailto:xxxxx@yoshimuni.com]
Sent: 28 September 2002 02:26
To: File Systems Developers
Subject: [ntfsd] Re: Could this be a synchronization problem…
I am assuming that the FsContexts are captured during create completion
processing.
Is the SmartCard reader under the control of PC/SC? If so, the
application will be notified when it is removed. That is why the
floppy-drive-based SmartCard reader couldn’t get WHQL approval: there
are no automatic notifications to the owning driver when a floppy or
pseudo floppy is removed.
Write requests may come in at APC_LEVEL and the completion routine might
be called at DISPATCH_LEVEL. You may have to queue to a worker thread
at PASSIVE_LEVEL. Luckily I have seen someone say that creates are
always completed at PASSIVE_LEVEL. My testing to date has not disproved
this idea and it also appears that with the MS file systems you remain
in the user’s context in the completion routine also, but I still
capture all the info I needed during the create dispatch routine.
----- Original Message -----
From:
To: “File Systems Developers”
Sent: Friday, September 27, 2002 6:33 AM
Subject: [ntfsd] Re: Could this be a synchronization problem…
>
> Hi David,
>
> To clarify your questions:
>
> 1. I determine an encrypted file by a linked list of file objects. This
>    list is keyed on the fscontext value so that I don’t miss stream file
>    objects. Basically I have a list of fscontext values, and within each
>    node of the list I have a further linked list of fileobjects
>    associated with the fscontext.
> 2. The reason for getting the key during the write is so that if the
>    usb device is removed unexpectedly I can deny access to the file. I
>    don’t really want to keep a list of cached keys in memory any longer
>    than is needed.
> 3. With regards to the APC_LEVEL observation, I think this is the problem
> 4. I only encrypt/decrypt pagingio, so that data in the cache is left
>    decrypted. I agree that keeping the data in the cache is practically
>    impossible.
> 5. All buffers and pointers checked and accounted for!
>
> Regards
>
> Ben
>
> -----Original Message-----
> From: David J. Craig [mailto:xxxxx@yoshimuni.com]
> Sent: 26 September 2002 23:48
> To: File Systems Developers
> Subject: [ntfsd] Re: Could this be a synchronization problem…
>
>
> You have omitted several items of interest. How do you determine that
> this is an encrypted file? Why do you have to get the key during write
> processing? If the write request was sent at APC_LEVEL it is not a
> good idea to make it wait. Are you looking at the paging and
> nocaching bits in the write and create? I know of no way, except
> writing your own file system, to keep the data in the cache manager’s
> buffers encrypted. The buffer pointers, etc. are very critical, so
> make sure you are restoring them exactly as they were when the request
> was received. If this is a sync request, the current file offset must
> be updated too.
>
> ----- Original Message -----
> From:
> To: “File Systems Developers”
> Sent: Thursday, September 26, 2002 11:32 AM
> Subject: [ntfsd] Could this be a synchronization problem…
>
>
> >
> > Hi all,
> >
> > I am having a problem with my encryption filter on NTFS. Let me
> > first explain how I am handling the write with some pseudo-code.
> >
> > Write dispatch
> > {
> >     if this is an encrypted file
> >     {
> >         queue to worker thread so we are at passive level and I can
> >         talk to a usb device. pend operation
> >     }
> >     else
> >     {
> >         call lower driver
> >     }
> > }
> >
> > WorkerThread
> > {
> >     read encryption key from usb device
> >
> >     if key read failed
> >     {
> >         deny write
> >         complete irp with error
> >     }
> >     else
> >     {
> >         allocate our own buffer
> >         copy data to write into our buffer
> >         encrypt buffer
> >         replace buffer in irp with our buffer
> >         set completion routine
> >         call lower driver
> >     }
> > }
> >
> > completionroutine
> > {
> >     tidy up the irp, replacing the buffer etc.
> >     clean everything up
> >     complete irp with success
> > }
> >
> > I hope all that makes sense!
> >
> > This all seems to work fine on fat/fat32 drives, but when I move
> > across to NTFS it causes no end of problems.
> >
> > Basically when an application tries to write to the file (in response
> > to a FlushFileBuffers call -> IRP_MJ_FLUSH_BUFFERS) it locks the
> > thread and sits there for ever.
> >
> > Any ideas how I can even start to debug this? I can see that the
> > thread has hung during MmFlushImageSection from NTFS but that’s about
> > all.
> >
> > Regards
> >
> > Ben
> >
> >
> > —
> > You are currently subscribed to ntfsd as: xxxxx@yoshimuni.com To
> > unsubscribe send a blank email to %%email.unsub%%
> >
>
>
>
>