I’m attempting to determine the cause of a verifier-induced crash in the
development version of the OpenAFS Windows redirector. I have a
hypothesis about the crash and would like to see if other people think
it is a reasonable explanation.
Configuration:
Standard Server 2008
Verifier monitoring the driver in question
Tried VMware & real hardware: no difference
Tried 32 & 64-bit: no difference
Test:
Start OpenAFS
Connect to a cell
Write a 25 Mbyte file to the cell
Boom
Without verifier, all indications are that the operation completed
without fault.
Crash excerpt:
DRIVER_VERIFIER_DETECTED_VIOLATION (c4)
A device driver attempting to corrupt the system has been caught. This is
because the driver was specified in the registry as being suspect (by the
Arguments:
Arg1: 000000c5, Thread APC disable count changed by driver dispatch routine.
Arg2: 95bc7df0, Driver dispatch routine address.
Arg3: 00000000, Current thread APC disable count.
Arg4: 0000ffff, Thread APC disable count before calling driver dispatch routine.
The APC disable count is decremented each time a driver calls
KeEnterCriticalRegion, KeInitializeMutex, or FsRtlEnterFileSystem. The APC
disable count is incremented each time a driver calls KeLeaveCriticalRegion,
KeReleaseMutex, or FsRtlExitFileSystem. Since these calls should always be in
pairs, this value should be zero when a thread exits. A negative value
indicates that a driver has disabled APC calls without re-enabling them. A
positive value indicates that the reverse is true.
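To make the Arg3/Arg4 arithmetic explicit, here is a small user-mode sketch that decodes the two values. It assumes the APC disable count is a signed 16-bit quantity, which is what the 0000ffff in Arg4 suggests but which the dump itself does not state. The helper names decode_apc_count and apc_count_delta are hypothetical, not WDK functions.

```c
#include <stdint.h>

/* Assumption: the count is a signed 16-bit value carried in the low
   16 bits of the bugcheck argument. Hypothetical helper, not a WDK API. */
int decode_apc_count(uint32_t bugcheck_arg)
{
    uint32_t low = bugcheck_arg & 0xFFFFu;
    /* Sign-extend by hand so the result does not rely on
       implementation-defined narrowing conversions. */
    return (low >= 0x8000u) ? (int)low - 0x10000 : (int)low;
}

/* Net change across the dispatch routine: count after minus count before. */
int apc_count_delta(uint32_t before_arg, uint32_t after_arg)
{
    return decode_apc_count(after_arg) - decode_apc_count(before_arg);
}
```

Under that assumption, decode_apc_count(0x0000ffff) is -1 and decode_apc_count(0x00000000) is 0, so apc_count_delta(0x0000ffff, 0x00000000) is +1. Read that way, the thread entered the dispatch routine with a count of -1 and the count was 0 when verifier looked again.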
FAULTING_SOURCE_CODE:
70: //
71: NTSTATUS
72: AFSWrite( IN PDEVICE_OBJECT DeviceObject,
73: IN PIRP Irp)
> 74: {
75: return AFSCommonWrite(DeviceObject, Irp, NULL);
76: }
77:
78: NTSTATUS
79: AFSCommonWrite( IN PDEVICE_OBJECT DeviceObject,
The driver supports the Cache Manager AcquireForLazyWrite callback and the
Fast I/O AcquireForModWrite callback.
It does not support Fast I/O write: the FastIoWrite callback is
implemented but returns FALSE. From a performance point of view, this is
not an issue. The I/O Manager frequently attempts Fast I/O and, when the
attempt is declined, generates an IRP instead.
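The fallback described above can be pictured with a tiny user-mode model. This is only a sketch of the decision, not I/O Manager code: the types and names (write_path, fast_io_write_fn, issue_write, fast_io_write_declines) are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Which path ended up servicing the write in this model. */
typedef enum { PATH_FAST_IO, PATH_IRP } write_path;

/* Stand-in for a Fast I/O write callback: true means "handled". */
typedef bool (*fast_io_write_fn)(const void *buf, size_t len);

/* Mirrors the driver as described: the callback exists but always declines. */
bool fast_io_write_declines(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return false;
}

/* Try the fast path first; if it is absent or declines, fall back to
   the IRP path (the IRP_MJ_WRITE dispatch routine in the real system). */
write_path issue_write(fast_io_write_fn fast_io, const void *buf, size_t len)
{
    if (fast_io != NULL && fast_io(buf, len)) {
        return PATH_FAST_IO;
    }
    return PATH_IRP;
}
```

With the declining callback, issue_write(fast_io_write_declines, buf, len) returns PATH_IRP, so in this model every write ends up at the dispatch routine.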
When either AcquireForLazyWrite or AcquireForModWrite is called, FCB
resource(s) are acquired as intended. Since the
ExAcquireResource…Lite functions have KeEnterCriticalRegion as a
prerequisite, that is called first. KeEnterCriticalRegion turns off normal
kernel APCs. As far as I can determine, this action is immediately followed
by an IRP_MJ_WRITE IRP, and verifier crashes the system because APCs are off.
Put differently, every time an IRP_MJ_WRITE IRP occurs with APCs
disabled, the preceding activity was a call to either
AcquireForLazyWrite or AcquireForModWrite.
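One way to test this hypothesis on paper is a toy user-mode model of the bookkeeping. It assumes that verifier's check is simply "the count after the dispatch routine differs from the count before it", which is how Arg1 (0xc5) is worded in the dump. Every name here is hypothetical, and none of these functions are kernel APIs.

```c
#include <stdbool.h>

/* Toy thread state: only the APC disable count is modelled. */
typedef struct { int apc_disable; } model_thread;

/* Per the bugcheck text: entering decrements, leaving increments. */
void model_enter_critical_region(model_thread *t) { t->apc_disable--; }
void model_leave_critical_region(model_thread *t) { t->apc_disable++; }

/* Models the acquire callback: the region is entered and stays entered
   until the matching release callback runs. */
void model_acquire_for_lazy_write(model_thread *t)
{
    model_enter_critical_region(t);
}

typedef void (*model_dispatch_fn)(model_thread *t);

/* A write dispatch routine whose own enter/leave calls are paired. */
void model_write_balanced(model_thread *t)
{
    model_enter_critical_region(t);
    model_leave_critical_region(t);
}

/* A write dispatch routine that undoes the region entered by the callback. */
void model_write_leaves_callers_region(model_thread *t)
{
    model_leave_critical_region(t);
}

/* Assumed verifier rule: flag the call if the count changed across it. */
bool model_verifier_flags(model_thread *t, model_dispatch_fn dispatch)
{
    int before = t->apc_disable;
    dispatch(t);
    return t->apc_disable != before;
}
```

In this model, calling model_acquire_for_lazy_write first puts the count at -1. model_write_balanced then runs without being flagged, while model_write_leaves_callers_region is flagged and leaves the count at 0. Whether the real verifier check behaves the same way is exactly the assumption stated above.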
Questions:
Am I missing something?
If Fast I/O write were implemented, the I/O Manager would probably use
that in place of the IRP. Would verifier still trigger because APCs are off?
Would the newer version of Server 2008 (with a presumably newer verifier)
behave differently? (I suspect not, but it doesn’t hurt to ask.)
Thanks,
Mickey Lane