Sorry for the length of this post; it’s always hard to decide how much info is too much vs. too little…
I’ve recently had to revamp an FS mini-filter driver that redirects NTFS file I/O from one volume/drive (which I’ll call the “primary” drive) to another volume/drive (which I’ll call the “secondary” drive).
Originally we just fixed up the path/filename and returned STATUS_REPARSE during IRP_MJ_CREATE, which did the trick and saved us a lot of headache. This works nicely when both primary and secondary are local drives (C:, D:, etc.).
However, when the secondary drive is remote (i.e. \\remoteserver\share…), STATUS_REPARSE doesn’t work, and we’ve had to move to a more complex shadow/proxy file object architecture in which the mini-filter takes ownership of and manages some file objects that NTFS would normally handle. The filter then redirects any I/O on these shadow file objects to the “real” file of the same name on the remote server.
After much trial & error and pain & suffering we’ve managed to get things working well enough to do some stress testing. Now on one of our test servers we are consistently getting a BSOD that is proving a little tricky to dissect and determine exactly what the filter is doing wrong.
I’m almost certain it has to do with the filter not completely managing proxy/shadow file objects correctly and not doing something that the I/O subsystem is expecting us to do. But it’s hard to know exactly what since when the crash occurs our filter doesn’t show up in the call stack (although one of our proxy file objects does).
Although the call stacks vary from crash to crash, the faulting instruction is always the same one in KeWaitForSingleObject (KeWaitForSingleObject+0x17c to be exact, on Windows Server 2008 R2 x64): mov qword ptr [rax], r15. rax is 0 at that point, hence the crash.
From what I can tell, one of the filter’s proxy file objects is being passed to KeWaitForSingleObject, which assumes that the list link fields in fileObject->Lock are non-NULL. In this case they are NULL, so we get the NULL memory reference and crash.
The MSDN Library description of the FILE_OBJECT fields doesn’t reveal much. For the Lock field it just says the field is opaque and “used by the system”; it doesn’t say who is responsible for setting it up.
I’ve included stack trace and other useful info from the crash dump below. I also include a code snippet showing exactly what we do during IRP_MJ_CREATE when we take ownership of the shadow file object and how we are setting up the fields.
Any help/ideas are greatly appreciated. It’s really tough to find any information anywhere describing how to properly manage shadow/proxy file objects in a filter.
From the crash dump…
BugCheck A, {0, 2, 1, fffff8000168688c}
Probably caused by : srv.sys ( srv!QueryPathOrFileInformation+ca )
Followup: MachineOwner
kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 0000000000000000, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000001, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: fffff8000168688c, address which referenced memory
Debugging Details:
WRITE_ADDRESS: 0000000000000000
CURRENT_IRQL: 2
FAULTING_IP:
nt!KeWaitForSingleObject+17c
fffff800`0168688c 4c8938 mov qword ptr [rax],r15
DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT
BUGCHECK_STR: 0xA
PROCESS_NAME: System
TRAP_FRAME: fffff88005167610 -- (.trap 0xfffff88005167610)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=fffffa8004ff6b98
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8000168688c rsp=fffff880051677a0 rbp=0000000000000000
r8=fffff78000000008 r9=fffff88005167800 r10=0000000000000000
r11=fffff800017f7e80 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl zr na po nc
nt!KeWaitForSingleObject+0x17c:
fffff800`0168688c 4c8938 mov qword ptr [rax],r15 ds:fbc0:0000=???
Resetting default scope
LAST_CONTROL_TRANSFER: from fffff8000167cb69 to fffff8000167d600
STACK_TEXT:
fffff880`051674c8 fffff800`0167cb69 : 00000000`0000000a 00000000`00000000 00000000`00000002 00000000`00000001 : nt!KeBugCheckEx
fffff880`051674d0 fffff800`0167b7e0 : 00000000`00000001 00000000`00000000 fffff880`05167410 fffffa80`04db1c60 : nt!KiBugCheckDispatch+0x69
fffff880`05167610 fffff800`0168688c : fffffa80`00000000 fffffa80`018bcd80 fffffa80`051a4201 fffff880`05167a20 : nt!KiPageFault+0x260
fffff880`051677a0 fffff800`018e04d8 : 00000000`00000f00 00000000`00000000 fffff880`05167a00 00000000`00000000 : nt!KeWaitForSingleObject+0x17c
fffff880`05167840 fffff800`01961fc9 : fffffa80`04ff6b10 00000000`00000016 fffff8a0`0d463118 00000000`00000016 : nt!IopAcquireFileObjectLock+0x84
fffff880`05167880 fffff880`03a4c18a : ffffffff`80000b2c fffff880`05167a00 fffff8a0`0d463118 00000000`00000ffe : nt!NtQueryInformationFile+0x894
fffff880`051679c0 fffff880`03a4c05d : fffffa80`0279d730 00000000`00000000 fffffa80`05573820 fffff8a0`0d463010 : srv!QueryPathOrFileInformation+0xca
fffff880`05167a80 fffff880`03a4c2f5 : 00000000`00000016 fffff8a0`0d463114 fffffa80`02798220 00000000`00000004 : srv!SrvSmbQueryFileInformation+0x15d
fffff880`05167af0 fffff880`03a4a7a4 : fffffa80`02798220 00000000`00000000 00000000`fffffffc 00000000`00000004 : srv!ExecuteTransaction+0xc5
fffff880`05167b30 fffff880`03a02698 : fffff880`03a1d100 fffffa80`0279d701 fffff8a0`0d463118 00000000`00000000 : srv!SrvSmbTransaction+0x664
fffff880`05167c30 fffff880`03a025b3 : fffffa80`0279d730 00000000`00000006 00000000`00000006 fffffa80`0279d730 : srv!SrvProcessSmb+0xb8
fffff880`05167cb0 fffff880`03a47763 : fffffa80`02784b80 00000000`00000005 fffffa80`0279d730 fffffa80`0279d740 : srv!SrvRestartReceive+0xa3
fffff880`05167cf0 fffff800`01922a86 : fffffa80`0279d730 fffffa80`02557b60 00000000`00000080 fffffa80`0183c450 : srv!WorkerThread+0xed
fffff880`05167d40 fffff800`0165bb06 : fffff800`017f7e80 fffffa80`02557b60 fffffa80`04d4e680 fffff880`016746c0 : nt!PspSystemThreadStartup+0x5a
fffff880`05167d80 00000000`00000000 : fffff880`05168000 fffff880`05162000 fffff880`051679f0 00000000`00000000 : nt!KxStartSystemThread+0x16
STACK_COMMAND: kb
FOLLOWUP_IP:
srv!QueryPathOrFileInformation+ca
fffff880`03a4c18a 65488b0c2588010000 mov rcx,qword ptr gs:[188h]
SYMBOL_STACK_INDEX: 6
SYMBOL_NAME: srv!QueryPathOrFileInformation+ca
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: srv
IMAGE_NAME: srv.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 4b1e0f37
FAILURE_BUCKET_ID: X64_0xA_srv!QueryPathOrFileInformation+ca
BUCKET_ID: X64_0xA_srv!QueryPathOrFileInformation+ca
Followup: MachineOwner
kd> !fileobj fffffa8004ff6b10
udir\mill\0009.DIR\1808.FIL
Related File Object: 0xfffffa80027b7050
Device Object: 0xfffffa8001bdb750 \Driver\volmgr
Vpb: 0xfffffa8001bdb690
Access: Read SharedRead SharedWrite SharedDelete
Flags: 0x10c0002
Synchronous IO
Handle Created
Fast IO Read
Remote Origin
File Object is currently busy and has 1 waiters.
FsContext: 0xfffff8a00d475ae0 FsContext2: 0xfffff8a00ca9deb8
CurrentByteOffset: 0
Cache Data:
Section Object Pointers: fffffa8004ca15d8
Shared Cache Map: fffffa8004c4a270 File Offset: 0 in VACB number 0
Vacb: fffffa80018df3f0
Your data is at: fffff98023c80000
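For anyone cross-checking the !fileobj output above, the flag names it prints can be reproduced from the raw Flags value 0x10c0002. The following is a small user-mode C sketch, not driver code. The FO_* values are the ones defined in wdm.h (only a subset is listed), while the names fo_flag, fo_flags and decode_fo_flags are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Subset of the FO_* file object flags from wdm.h. */
#define FO_FILE_OPEN                 0x00000001u
#define FO_SYNCHRONOUS_IO            0x00000002u
#define FO_ALERTABLE_IO              0x00000004u
#define FO_NO_INTERMEDIATE_BUFFERING 0x00000008u
#define FO_WRITE_THROUGH             0x00000010u
#define FO_SEQUENTIAL_ONLY           0x00000020u
#define FO_CACHE_SUPPORTED           0x00000040u
#define FO_DELETE_ON_CLOSE           0x00010000u
#define FO_HANDLE_CREATED            0x00040000u
#define FO_FILE_FAST_IO_READ         0x00080000u
#define FO_RANDOM_ACCESS             0x00100000u
#define FO_REMOTE_ORIGIN             0x01000000u

/* Hypothetical helper type: one flag bit and its printable name. */
struct fo_flag { uint32_t value; const char *name; };

static const struct fo_flag fo_flags[] = {
    { FO_FILE_OPEN,                 "FO_FILE_OPEN" },
    { FO_SYNCHRONOUS_IO,            "FO_SYNCHRONOUS_IO" },
    { FO_ALERTABLE_IO,              "FO_ALERTABLE_IO" },
    { FO_NO_INTERMEDIATE_BUFFERING, "FO_NO_INTERMEDIATE_BUFFERING" },
    { FO_WRITE_THROUGH,             "FO_WRITE_THROUGH" },
    { FO_SEQUENTIAL_ONLY,           "FO_SEQUENTIAL_ONLY" },
    { FO_CACHE_SUPPORTED,           "FO_CACHE_SUPPORTED" },
    { FO_DELETE_ON_CLOSE,           "FO_DELETE_ON_CLOSE" },
    { FO_HANDLE_CREATED,            "FO_HANDLE_CREATED" },
    { FO_FILE_FAST_IO_READ,         "FO_FILE_FAST_IO_READ" },
    { FO_RANDOM_ACCESS,             "FO_RANDOM_ACCESS" },
    { FO_REMOTE_ORIGIN,             "FO_REMOTE_ORIGIN" },
};

/*
 * Writes a '|'-separated list of the known flag names set in `flags`
 * into `out` (lowest bit first) and returns the bits that were not
 * found in the table above.
 */
static uint32_t decode_fo_flags(uint32_t flags, char *out, size_t outlen)
{
    size_t i;

    if (outlen == 0)
        return flags;
    out[0] = '\0';
    for (i = 0; i < sizeof fo_flags / sizeof fo_flags[0]; i++) {
        if (flags & fo_flags[i].value) {
            if (out[0] != '\0')
                strncat(out, "|", outlen - strlen(out) - 1);
            strncat(out, fo_flags[i].name, outlen - strlen(out) - 1);
            flags &= ~fo_flags[i].value;
        }
    }
    return flags;
}
```

Calling decode_fo_flags(0x010c0002, buf, sizeof buf) with a 256-byte buffer yields FO_SYNCHRONOUS_IO|FO_HANDLE_CREATED|FO_FILE_FAST_IO_READ|FO_REMOTE_ORIGIN with no unknown bits left over, which matches the four lines !fileobj printed.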
kd> dd fffffa8004ff6b10
fffffa80`04ff6b10 00d80005 00000000 01bdb750 fffffa80
fffffa80`04ff6b20 01bdb690 fffffa80 0d475ae0 fffff8a0
fffffa80`04ff6b30 0ca9deb8 fffff8a0 04ca15d8 fffffa80
fffffa80`04ff6b40 00000000 00000000 00000000 00000000
fffffa80`04ff6b50 027b7050 fffffa80 00010000 01010100
fffffa80`04ff6b60 010c0002 00000000 00380036 00000000
fffffa80`04ff6b70 064be6f0 fffff8a0 00000000 00000000
fffffa80`04ff6b80 00000001 00000001 00000000 00000000
kd> d
fffffa80`04ff6b90 00000080 00000000 00000000 00000000
fffffa80`04ff6ba0 00000000 00000000 00060000 00000000
fffffa80`04ff6bb0 04ff6bb0 fffffa80 04ff6bb0 fffffa80
fffffa80`04ff6bc0 00000000 00000000 00000000 00000000
fffffa80`04ff6bd0 04ff6bd0 fffffa80 04ff6bd0 fffffa80
fffffa80`04ff6be0 00000000 00000000 00000000 00000000
fffffa80`04ff6bf0 02130013 e56c6946 00000000 00000000
fffffa80`04ff6c00 00000000 00000000 00000000 fffff880
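To tie the trap frame to the memory dump: rcx in the trap frame is fffffa8004ff6b98, which is 0x88 bytes past the start of the file object at fffffa8004ff6b10. The user-mode C sketch below maps such an address to a FILE_OBJECT field. The offset table is an assumption taken from the public x64 symbols for this OS generation (as printed by dt nt!_FILE_OBJECT) and should be re-checked against the actual build. The names field, file_object_fields and field_at are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a field name and its byte offset. */
struct field { uint32_t offset; const char *name; };

/*
 * ASSUMED x64 FILE_OBJECT layout (Windows 7 / Server 2008 R2 era),
 * sorted by offset. Verify with "dt nt!_FILE_OBJECT" on the target.
 */
static const struct field file_object_fields[] = {
    { 0x00, "Type/Size" },
    { 0x08, "DeviceObject" },
    { 0x10, "Vpb" },
    { 0x18, "FsContext" },
    { 0x20, "FsContext2" },
    { 0x28, "SectionObjectPointer" },
    { 0x30, "PrivateCacheMap" },
    { 0x38, "FinalStatus" },
    { 0x40, "RelatedFileObject" },
    { 0x48, "LockOperation..SharedDelete" },
    { 0x50, "Flags" },
    { 0x58, "FileName" },
    { 0x68, "CurrentByteOffset" },
    { 0x70, "Waiters" },
    { 0x74, "Busy" },
    { 0x78, "LastLock" },
    { 0x80, "Lock.Header" },
    { 0x88, "Lock.Header.WaitListHead" },
    { 0x98, "Event.Header" },
    { 0xa0, "Event.Header.WaitListHead" },
    { 0xb0, "CompletionContext" },
    { 0xb8, "IrpListLock" },
    { 0xc0, "IrpList" },
    { 0xd0, "FileObjectExtension" },
};

#define FILE_OBJECT_SIZE 0xd8u  /* matches the Size word (00d8) in the dump */

/*
 * Returns the name of the field that contains `addr`, given the file
 * object's base address, or NULL if `addr` lies outside the object.
 */
static const char *field_at(uint64_t base, uint64_t addr)
{
    const char *name = NULL;
    uint64_t off;
    size_t i;

    if (addr < base || addr - base >= FILE_OBJECT_SIZE)
        return NULL;
    off = addr - base;
    for (i = 0; i < sizeof file_object_fields / sizeof file_object_fields[0]; i++) {
        if (file_object_fields[i].offset <= off)
            name = file_object_fields[i].name;
    }
    return name;
}
```

Under that layout, field_at(0xfffffa8004ff6b10, 0xfffffa8004ff6b98) returns "Lock.Header.WaitListHead". In the dump, the 16 bytes at ...6b98 are all zero, whereas the corresponding list head of the Event field at ...6bb0 points back to itself, which is what an initialized, empty list head looks like. That would be consistent with Lock never having been initialized on this file object, but it is a reading of the dump, not a confirmed diagnosis.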
In our PostCreate handler for IRP_MJ_CREATE, we do the following when the local NTFS driver returns
(Data->IoStatus.Status == STATUS_OBJECT_NAME_NOT_FOUND || Data->IoStatus.Status == STATUS_OBJECT_PATH_NOT_FOUND):
1). Build the file/pathname on the remote server, call FltCreateFile, and then call ObReferenceObjectByHandle to get the actual file handle/object for the remote file (using MUP & the CIFS client/redirector).
2). Take ownership of and initialize the local NTFS shadow/proxy file object as shown below. Note that for the most part we are “borrowing” fields from the remote file object to try to make life easier for ourselves. We then hold a reference on the remote file object until we are completely finished with the shadow/proxy file object. Code snippet showing how we initialize the shadow file object:
if (Data->Iopb->TargetFileObject->Vpb == NULL &&
    Data->Iopb->TargetFileObject->RelatedFileObject != NULL)
{
    Data->Iopb->TargetFileObject->Vpb = Data->Iopb->TargetFileObject->RelatedFileObject->Vpb;
}
Data->Iopb->TargetFileObject->FsContext = remoteFileObject->FsContext;
Data->Iopb->TargetFileObject->FsContext2 = remoteFileObject->FsContext2;
Data->Iopb->TargetFileObject->SectionObjectPointer =
    remoteFileObject->SectionObjectPointer;
Data->Iopb->TargetFileObject->LockOperation =
    remoteFileObject->LockOperation;
Data->Iopb->TargetFileObject->DeletePending =
    remoteFileObject->DeletePending;
Data->Iopb->TargetFileObject->ReadAccess =
    remoteFileObject->ReadAccess;
Data->Iopb->TargetFileObject->WriteAccess =
    remoteFileObject->WriteAccess;
Data->Iopb->TargetFileObject->DeleteAccess =
    remoteFileObject->DeleteAccess;
Data->Iopb->TargetFileObject->SharedRead =
    remoteFileObject->SharedRead;
Data->Iopb->TargetFileObject->SharedWrite =
    remoteFileObject->SharedWrite;
Data->Iopb->TargetFileObject->SharedDelete =
    remoteFileObject->SharedDelete;
Data->Iopb->TargetFileObject->Flags = FileObject->Flags;
Data->Iopb->TargetFileObject->CurrentByteOffset =
    remoteFileObject->CurrentByteOffset;
We then change the IoStatus from failure to STATUS_SUCCESS and complete the I/O.
3). Intercept ALL I/O to the shadow file object and make proxy Flt/Zw calls to the remote server using the file handle/object obtained during Create time. We massage the returned data if necessary, pass it back to the original caller, and complete the I/O. We have to make sure that NTFS will NEVER see one of these proxy file objects or it will croak.
4). At Close time, tear down the relationship between the local proxy file object and the remote file object and finally do a deferred close on the remote file.
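One angle that may be worth checking is the wholesale Flags assignment in the initialization snippet. Assuming the FileObject in that line refers to the remote file object, the copy would carry FO_SYNCHRONOUS_IO over to the proxy whenever the remote file was opened for synchronous I/O, even if the proxy itself was opened asynchronously. The I/O manager only waits on FILE_OBJECT.Lock for file objects that have FO_SYNCHRONOUS_IO set, which is the path seen in the crash stack (IopAcquireFileObjectLock). The user-mode C sketch below shows a more selective merge. It is an illustration under those assumptions, not a confirmed fix: PROXY_OWNED_FLAGS and merge_proxy_flags are hypothetical names, and which bits belong in the mask is a guess.

```c
#include <stdint.h>

/* FO_* values from wdm.h. */
#define FO_SYNCHRONOUS_IO 0x00000002u
#define FO_ALERTABLE_IO   0x00000004u

/*
 * ASSUMPTION: bits that describe how the proxy file object itself was
 * opened, and therefore should not be overwritten with the remote
 * file object's values. The right set may well be larger.
 */
#define PROXY_OWNED_FLAGS (FO_SYNCHRONOUS_IO | FO_ALERTABLE_IO)

/*
 * Keeps the proxy's own synchronous/alertable bits and takes every
 * other bit from the remote file object's flags.
 */
static uint32_t merge_proxy_flags(uint32_t proxy_flags, uint32_t remote_flags)
{
    return (proxy_flags & PROXY_OWNED_FLAGS) |
           (remote_flags & ~PROXY_OWNED_FLAGS);
}
```

For example, with a proxy opened asynchronously (flags 0) and a remote file object whose flags are 0x000c0002, merge_proxy_flags returns 0x000c0000, so the synchronous bit is not carried over. In the filter, the equivalent would be applying such a mask where TargetFileObject->Flags is assigned.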
NOTE: There is also other stuff we’re doing related to managing the proxy file object that I haven’t shown here, as I don’t think it’s relevant to the problem and this post is already WAY too long. I’ll add more later if anyone thinks it would be helpful.
Thanks for your help and diligence in reading this far!!