Driver Problem (Blue Screen)

I have a KMDF USB driver that has been running fine for several months with no problems. Recently, our installer has installed our application, driver, etc. and I am seeing an issue on a few of the machines. Previously, I had manually installed the driver. What is strange is that even when I try and reinstall manually, i cannot get this machine out of the state that it is in.

On a few of the systems, the driver is running fine even after the install. On the other system, I see the application losing communication with the driver and then the system blue screens. I will attach the crash dump below. To me, this really points towards a driver bug. But, i don’t understand how we have run for months with no problems and the driver has not changed. But now this problem has came up.

My first thought is that maybe there is a mismatch between the WdmCoinstaller and the driver (checked vs. free). But I am pretty sure that both are the free builds.

Anyway, if anyone has any ideas that will help, I would greatly appreciate it. This problem has me stumped.

Here is the crash Dump and the code. I marked line 129 in the code which is a point of interest in the crash dump.

=========================================================
Loading Dump File [C:\Outbox\Driver Problem\CrashDumpFiles1\Mini030707-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: C:\Dev\LXR\Version_3.0\Source\Hardware\Drivers\UsbDriver\objchk_wxp_x86\i386;;C:\Outbox\Driver Problem;srv*c:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is: C:\WINDOWS\system32\drivers;C:\Dev\LXR\Version_3.0\Source\Hardware\Drivers\UsbDriver\objchk_wxp_x86\i386
Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 2600.xpsp_sp2_gdr.050301-1519
Kernel base = 0x804d7000 PsLoadedModuleList = 0x805624a0
Debug session time: Tue Mar 6 18:54:51.353 2007 (GMT-6)
System Uptime: 0 days 0:13:31.479
Loading Kernel Symbols

Loading User Symbols
Loading unloaded module list

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 1000008E, {c0000005, a9195fc0, a9fb4b14, 0}

Probably caused by : Wdf01000.sys ( Wdf01000!FxRequest::CompleteWithInformation+10 )

Followup: MachineOwner

0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but …
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: a9195fc0, The address that the exception occurred at
Arg3: a9fb4b14, Trap Frame
Arg4: 00000000

Debugging Details:

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at “0x%08lx” referenced memory at “0x%08lx”. The memory could not be “%s”.

FAULTING_IP:
Wdf01000!FxRequest::CompleteWithInformation+10
a9195fc0 89501c mov dword ptr [eax+1Ch],edx

TRAP_FRAME: a9fb4b14 – (.trap ffffffffa9fb4b14)
ErrCode = 00000002
eax=00000000 ebx=81c880c0 ecx=81c46f48 edx=00000000 esi=a91cb618 edi=81c88008
eip=a9195fc0 esp=a9fb4b88 ebp=a9fb4b90 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010282
Wdf01000!FxRequest::CompleteWithInformation+0x10:
a9195fc0 89501c mov dword ptr [eax+1Ch],edx ds:0023:0000001c=???
Resetting default scope

CUSTOMER_CRASH_COUNT: 1

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0x8E

PROCESS_NAME: LxService.exe

LAST_CONTROL_TRANSFER: from a91804a2 to a9195fc0

STACK_TEXT:
a9fb4b90 a91804a2 c0200204 00000000 f889df34 Wdf01000!FxRequest::CompleteWithInformation+0x10
a9fb4bac f889d128 81c46f48 7e3b90b0 c0200204 Wdf01000!imp_WdfRequestCompleteWithInformation+0x7f
a9fb4bd0 a91a3071 7e975fb0 7e3b90b0 00000800 LxNextGenUsbDriver!LxEvtIoRead+0x20e [c:\dev\lxr\version_3.0\source\hardware\drivers\nextgenusbdriver\bulkrwr.c @ 129]
a9fb4bec a91a5b20 7e975fb0 7e3b90b0 00000800 Wdf01000!FxIoQueueIoWrite::Invoke+0x2a
a9fb4c14 a91a7929 7e3b90b0 81c46f48 8168a048 Wdf01000!FxIoQueue::DispatchRequestToDriver+0x1dc
a9fb4c30 a91a8be0 8168a000 a91cd9b8 81b5fd10 Wdf01000!FxIoQueue::DispatchEvents+0x47d
a9fb4c4c a91aabb1 00000000 81543230 81f829f0 Wdf01000!FxIoQueue::QueueRequest+0x237
a9fb4c70 a919a272 8228c208 a9fb4ca0 804e13d9 Wdf01000!FxPkgIo::Dispatch+0x34a
a9fb4c7c 804e13d9 822f5260 8228c208 806ff410 Wdf01000!FxDevice::Dispatch+0x7f
a9fb4c8c 8056f50b 8228c2c0 8228c208 820578f0 nt!IopfCallDriver+0x31
a9fb4ca0 8057f13f 822f5260 8228c208 820578f0 nt!IopSynchronousServiceTail+0x60
a9fb4d38 804dd99f 0000025c 00000278 00000000 nt!NtReadFile+0x55d
a9fb4d38 7c90eb94 0000025c 00000278 00000000 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
02f1ef4c 00000000 00000000 00000000 00000000 0x7c90eb94

STACK_COMMAND: kb

FOLLOWUP_IP:
Wdf01000!FxRequest::CompleteWithInformation+10
a9195fc0 89501c mov dword ptr [eax+1Ch],edx

SYMBOL_STACK_INDEX: 0

SYMBOL_NAME: Wdf01000!FxRequest::CompleteWithInformation+10

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: Wdf01000

IMAGE_NAME: Wdf01000.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 444738f2

FAILURE_BUCKET_ID: 0x8E_Wdf01000!FxRequest::CompleteWithInformation+10

BUCKET_ID: 0x8E_Wdf01000!FxRequest::CompleteWithInformation+10

Followup: MachineOwner

VOID
LxEvtIoRead(
IN WDFQUEUE Queue,
IN WDFREQUEST Request,
IN size_t Length
)
{
WDFUSBPIPE pipe;
NTSTATUS status;
WDFMEMORY reqMemory;
PDEVICE_CONTEXT pDeviceContext;
PUNICODE_STRING fileName;
LX_ENUM_FILE_CREATE_INTERFACE eCreateFileInterface;
PFILE_OBJECT_CONTEXT fileContext;
WDFFILEOBJECT fileObject;
WDFDEVICE Device;

Device = WdfIoQueueGetDevice(Queue);
pDeviceContext = GetDeviceContext(Device);
fileObject = WdfRequestGetFileObject(Request);

if (fileObject != NULL)
{
fileContext = GetFileObjectContext(fileObject);

if (fileContext != NULL)
{
if (fileContext->pDataQueue == NULL)
{
TraceEvents(TRACE_LEVEL_VERBOSE, DBG_READ, “–>LxEvtIoRead fileContext->pDataQueue is NULL\n”);
status = STATUS_INVALID_DEVICE_REQUEST;
}
else if (fileContext->pDataQueue == &pDeviceContext->BeadDataQueue)
{
TraceEvents(TRACE_LEVEL_VERBOSE, DBG_READ, “LxEvtIoRead Forwarding Request to Bead Data QUEUE”);
//Forward the Request to the Appropriate Queue that Was Setup on the Device
//context during the call to EvtDeviceFileCreate.
status = WdfRequestForwardToIoQueue(Request,*(fileContext->pDataQueue));
if(!NT_SUCCESS(status))
{
TraceEvents(TRACE_LEVEL_VERBOSE, DBG_READ, “–>LxEvtIoRead WdfRequestForwardToIoQueue failed on eStatusInterface: Error =0x%x\n”, status);
WdfRequestCompleteWithInformation(Request, status, 0);
}
else
{
status = STATUS_PENDING;
}
}
else if (fileContext->pDataQueue == &pDeviceContext->StatusDataQueue)
{
TraceEvents(TRACE_LEVEL_VERBOSE, DBG_READ, “LxEvtIoRead Forwarding Request to Status Data QUEUE”);
//Forward the Request to the Appropriate Queue that Was Setup on the Device
//context during the call to EvtDeviceFileCreate.
status = WdfRequestForwardToIoQueue(Request,*(fileContext->pDataQueue));
if(!NT_SUCCESS(status))
{
TraceEvents(TRACE_LEVEL_VERBOSE, DBG_READ, “–>LxEvtIoRead WdfRequestForwardToIoQueue failed on eStatusInterface: Error =0x%x\n”, status);
WdfRequestCompleteWithInformation(Request, status, 0);
}
else
{
status = STATUS_PENDING;
}
}
else
{
TraceEvents(TRACE_LEVEL_VERBOSE, DBG_READ, “–>LxEvtIoRead fileContext->pDataQueue is UNKNOWN\n”);
status = STATUS_INVALID_DEVICE_REQUEST;
}
}
else
{
TraceEvents(TRACE_LEVEL_VERBOSE, DBG_READ, “–>LxEvtIoRead GetFileObjectContext failed.”);
status = STATUS_INVALID_DEVICE_REQUEST;
}
}
else
{
TraceEvents(TRACE_LEVEL_VERBOSE, DBG_READ, “–>LxEvtIoRead WdfRequestGetFileObject failed.”);
status = STATUS_INVALID_DEVICE_REQUEST;
}

if (status != STATUS_PENDING) {
//
// complete now if not pending
//
WdfRequestCompleteWithInformation(Request, status, 0);
LINE 129: TraceEvents(TRACE_LEVEL_INFORMATION, DBG_READ, “<– LxEvtIoRead Exiting: Error =0x%x\n”, status);
}

TraceEvents(TRACE_LEVEL_INFORMATION, DBG_READ, “<– LxEvtIoRead Exiting\n”);

return;
}

It’s called “double completion”, and you do it when various calls (such as WdfRequestForwardToIoQueue) fail- you call WdfRequestCompleteWithInformation with the failed status code, then proceed on down, see the status isn’t STATUS_PENDING, and call it again.

I would guess that you are double completing the WDFREQUEST. Have you turned on the KMDF verifier (http://blogs.msdn.com/doronh/archive/2006/05/17/600335.aspx) and driver verifier on your driver? That should help in narrowing down the problem

d

It took me a while to look up the status code from your dump. You are definitely failing one of the forwarding calls.

C0200204 is STATUS_WDF_BUSY- indicating one of the queues you forwarded to is not accepting requests. I believe this happens when you purge or drain the queue (that’s my reading of the source, but it was a bit too hasty for me to be absolutely certain).

Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
acceptlanguage: en-US
Content-Type: text/plain; charset=“us-ascii”
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Return-Path: xxxxx@microsoft.com

Sorry about my overhasty earlier post. I did a more thorough search- add w=
hile the queue is in the process of being destroyed and a power-managed que=
ue is powering down to the list. Given the symptoms, I’m guessing power-do=
wn is involved in your case.

Scratch the last part- it looked like power, but it is actually only in remove processing that the queue would go to that state.

Gotta cut back on the coffee (and leave this stuff to someone who knows the state machines better)…

But it definitely is a double complete, and you might check your log for purging or draining those queues you are forwarding to.

I saw that potential for the double complete when I first looked at the code and the trace. What is troublesome is that this driver has been running for months without incident. I guess the app changed and caused a problem or the checked build was getting lucky or a combination of both. Thanks for the reply.