Arrrg!

I’m going to apologize beforehand for this one… It’s been driving me nuts
for days now.

I have a mini-filter that sends information to a user application. Sometimes
a response is expected, sometimes not.
For the most part, the information comes from the pre-create callback and
the registry open callback. At the moment, every transaction proceeds
unmolested.

The bulk of the user code is a big C++ class. There are Start() and Stop()
functions that use the service manager to load or unload the driver and
start or stop a high-priority I/O thread. The I/O thread’s main job is to
service the FilterConnectCommunicationPort() (etc.) stuff. All of this code
is contained in a single library.

I have an MFC-based dialog program that hosts the library. Buttons tell it to
go, stop, etc. and a display prints out assorted messages. I can run this
MFC app for days without any noticeable effect on the system. I can stop the
filtering process (stop includes a complete unload and registry scrub)
followed by a start (basically install everything from scratch) as many
times as I want.

I have a 2nd application that runs either as a Windows process without any
UI or as a service. The 2nd app uses the same library image and the same
driver image as the MFC app. I can’t get it to run worth a hoot. (I’m not
sure I’ve ever seen it do the same thing twice…)

Every once in a while, I can get it to load and run OK. I can trace the
messages from driver to user space (i.e. it’s actually filtering). I can
tell it to shut down and see trace messages that appear to be normal. Very
shortly after that, the entire system enters this state where one after
another, the open applications freeze. After a few seconds, everything is
frozen. Sometimes, the system crashes, sometimes I need the reset button.

(Plaintive whining, rending of hair and gnashing of teeth continues
below…)

Microsoft (R) Windows Debugger Version 6.3.0017.0
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [F:\WINDOWS\MEMORY.DMP]
Kernel Complete Dump File: Full address space is available

Symbol search path is:
SRV*f:\Symbols*http://msdl.microsoft.com/download/symbols;F:\SYMBOLS
Executable search path is:
Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86
compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 2600.xpsp_sp2_gdr.050301-1519
Kernel base = 0x804d7000 PsLoadedModuleList = 0x805624a0
Debug session time: Mon Aug 01 11:10:40 2005
System Uptime: 0 days 0:43:59.911
Loading Kernel Symbols


Loading unloaded module list

Loading User Symbols

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck C5, {0, 2, 1, 80550ae2}

Probably caused by : Pool_Corruption ( nt!ExDeferredFreePool+107 )

Followup: Pool_corruption

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_CORRUPTED_EXPOOL (c5)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is
caused by drivers that have corrupted the system pool. Run the driver
verifier against any new (or suspect) drivers, and if that doesn’t turn up
the culprit, then use gflags to enable special pool.
Arguments:
Arg1: 00000000, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000001, value 0 = read operation, 1 = write operation
Arg4: 80550ae2, address which referenced memory

Debugging Details:

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xC5

LAST_CONTROL_TRANSFER: from 80550ac7 to 80550ae2

TRAP_FRAME: f0489930 -- (.trap fffffffff0489930)
ErrCode = 00000002
eax=858b8000 ebx=00000000 ecx=000001ff edx=858b8290 esi=805699c0 edi=00000000
eip=80550ae2 esp=f04899a4 ebp=f04899e4 iopl=0 nv up ei ng nz ac po cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010297
nt!ExDeferredFreePool+0x107:
80550ae2 893b mov [ebx],edi ds:0023:00000000=???
Resetting default scope

STACK_TEXT:
f04899e4 80550ac7 00000001 860b4c20 85dd5214 nt!ExDeferredFreePool+0x107
f0489a24 804ecd0f 85dd5008 85dbc938 85bdccf8 nt!ExFreePoolWithTag+0x47f
f0489a7c 804ecd6a 85bdccf8 f0489ac8 f0489abc nt!IopCompleteRequest+0xf4
f0489acc 806ffef2 00000000 00000000 f0489ae4 nt!KiDeliverApc+0xb3
f0489acc 806ffae4 00000000 00000000 f0489ae4 hal!HalpApcInterrupt+0xc6
f0489b54 804e5d26 85bdccf8 85bdccb8 00000000 hal!KeReleaseQueuedSpinLock+0x3c
f0489b74 804ecd84 85bdccf8 860b4c20 00000000 nt!KeInsertQueueApc+0x6d
f0489ba8 f2c986d2 00000000 85bdccb8 85bdcd28 nt!IopfCompleteRequest+0x1d8
f0489bcc f2c97193 85bdccb8 00bdcd28 00000000 tcpip!GetInterfaceInfo+0xf8
f0489bf0 f2cda9b6 85bdccb8 85bdcd28 85c09b20 tcpip!IPDispatchDeviceControl+0x61c
f0489c08 f2c97097 8659ff18 85bdccb8 85c09b20 tcpip!IPDispatch+0x52
f0489c40 804e13d9 8659ff18 85bdccb8 806ff410 tcpip!TCPDispatch+0x127
f0489c50 8056f50b 85bdcd28 860b4c20 85bdccb8 nt!IopfCallDriver+0x31
f0489c64 80580fb1 8659ff18 85bdccb8 860b4c20 nt!IopSynchronousServiceTail+0x60
f0489d00 8058709e 00000344 000004b8 00000000 nt!IopXxxControlFile+0x5ef
f0489d34 804dd99f 00000344 000004b8 00000000 nt!NtDeviceIoControlFile+0x2a
f0489d34 7c90eb94 00000344 000004b8 00000000 nt!KiFastCallEntry+0xfc
019bf4d4 7c90d8ef 76d627fa 00000344 000004b8 ntdll!KiFastSystemCallRet
019bf4d8 76d627fa 00000344 000004b8 00000000 ntdll!ZwDeviceIoControlFile+0xc
019bf520 76d628b4 00000344 00120040 00000000 iphlpapi!TCPSendIoctl+0x53
019bf5a8 76d62983 014838f8 019bf5dc 014d7028 iphlpapi!GetInterfaceInfo+0x86
019bf5e0 76d630d1 00000000 00000000 0000001b iphlpapi!GetAdapterOrderMap+0xb5
019bf834 76d6366c 019bf8dc 000efad8 00000000 iphlpapi!GetAdapterList+0x46
019bf86c 76d661af 01492a50 000efad8 00000000 iphlpapi!GetAdapterInfo+0x29
019bf8c0 76442c35 00000000 019bf8dc 00000000 iphlpapi!GetAdapterInfoEx+0x1c
019bfb40 7645f6dc 000efad8 01492a50 019bfc08 NETSHELL!HrGetDHCPAddressType+0x40
019bfbc8 76460d32 019bfbe4 00000001 000ea098 NETSHELL!CLanStatEngine::HrUpdateData+0x14c
019bfbec 7645d7a4 000efaa0 019bfc08 019bfc8c NETSHELL!CNetStatisticsEngine::UpdateStatistics+0x2d
019bfc10 7645e7eb 00284822 7645e7b5 014a9530 NETSHELL!CNetStatisticsCentral::RefreshStatistics+0x4e
019bfc24 77d48734 00000000 00000113 00007fef NETSHELL!CNetStatisticsCentral::TimerCallback+0x36
019bfc50 77d49857 7645e7b5 00000000 00000113 USER32!InternalCallWinProc+0x28
019bfcb8 77d49791 00000000 7645e7b5 00000000 USER32!UserCallWinProc+0xf3
019bfd10 77d48a10 019bfd68 00000000 019bfd8c USER32!DispatchMessageWorker+0x10e
019bfd20 7628155a 019bfd68 00000000 76280000 USER32!DispatchMessageW+0xf
019bfd8c 76283746 76280000 00000000 000101e6 stobject!SysTrayMain+0x177
019bffb4 7c80b50b 00000000 00000000 00000000 stobject!CSysTray::SysTrayThreadProc+0x4f
019bffec 00000000 762836f7 00000000 00000000 kernel32!BaseThreadStart+0x37

FOLLOWUP_IP:
nt!ExDeferredFreePool+107
80550ae2 893b mov [ebx],edi

SYMBOL_STACK_INDEX: 0

FOLLOWUP_NAME: Pool_corruption

SYMBOL_NAME: nt!ExDeferredFreePool+107

MODULE_NAME: Pool_Corruption

IMAGE_NAME: Pool_Corruption

DEBUG_FLR_IMAGE_TIMESTAMP: 0

STACK_COMMAND: .trap fffffffff0489930 ; kb

BUCKET_ID: 0xC5_nt!ExDeferredFreePool+107

Followup: Pool_corruption
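
One quick way to read a STACK_TEXT listing like the one above is to tally which
modules own the frames. The following is only a rough sketch in Python, assuming
the stack has been pasted into a string; the function name modules_on_stack and
the regular expression are illustrative, not part of any debugger tooling. It
matches on the module!symbol token, so a frame whose symbol wrapped onto its own
line would still be counted.

```python
import re

# A frame ends in module!symbol+0xoffset; capture the module name before "!".
# (Hypothetical helper pattern, written for this illustration only.)
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)!\S+")


def modules_on_stack(stack_text):
    """Count frames per module, keeping the order of first appearance."""
    counts = {}
    for line in stack_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            name = match.group(1)
            counts[name] = counts.get(name, 0) + 1
    return counts


# A few kernel-mode frames copied from the STACK_TEXT above.
SAMPLE = """\
f04899e4 80550ac7 00000001 860b4c20 85dd5214 nt!ExDeferredFreePool+0x107
f0489a24 804ecd0f 85dd5008 85dbc938 85bdccf8 nt!ExFreePoolWithTag+0x47f
f0489a7c 804ecd6a 85bdccf8 f0489ac8 f0489abc nt!IopCompleteRequest+0xf4
f0489acc 806ffef2 00000000 00000000 f0489ae4 nt!KiDeliverApc+0xb3
f0489acc 806ffae4 00000000 00000000 f0489ae4 hal!HalpApcInterrupt+0xc6
f0489bcc f2c97193 85bdccb8 00bdcd28 00000000 tcpip!GetInterfaceInfo+0xf8
f0489c40 804e13d9 8659ff18 85bdccb8 806ff410 tcpip!TCPDispatch+0x127
"""

summary = modules_on_stack(SAMPLE)
for module, frames in summary.items():
    print(module, frames)
```

Run over the whole listing, the only modules that show up are nt, hal, tcpip,
ntdll, iphlpapi, NETSHELL, USER32, stobject and kernel32, all operating system
components. That fits the debugger's Pool_Corruption follow-up: the thread that
tripped over the damaged pool is probably not the one that damaged it.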

The thing that’s got me bamboozled is the fact that the exact same library
bits will run no problem with one hosting app and they’ll take out
everything with the other.

I’ve run the 2nd app with the Start() call commented out and it runs fine.
Doesn’t do anything interesting but runs fine.

I know this can’t be resolved with the info given (and I’m not sure I can
determine what info would do so) so I guess I’m looking for a list of things
to check with system lockups in mind.

Thanks for listening to my rant. I feel better now.

Mickey.

> shortly after that, the entire system enters this state where one after
> another, the open applications freeze. After a few seconds, everything is
> frozen.

Clear sign of a deadlock. Use !process 0 7.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

Max’s point is simply that if the system isn’t making forward progress,
you need to figure out why not - so find the threads and figure out what
they are blocked waiting for.

While !process 0 7 will work, I find that many beginners find this much
detail to be daunting. Since everything is frozen, you can also just
choose one process and start analyzing why the threads are not making
forward progress.

Another command that works well in XP and above is “!stacks”. That tends
to generate a lot less output than !process 0 7 and usually identifies
interesting stacks for further investigation rather quickly.

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Maxim S. Shatskih
Sent: Monday, August 01, 2005 5:51 PM
To: ntfsd redirect
Subject: Re: [ntfsd] Arrrg!


Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17

You are currently subscribed to ntfsd as: xxxxx@osr.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

> While !process 0 7 will work, I find that many beginners find this much
> detail to be daunting.

Why? Bad stacks are clearly distinguishable from the good stacks - they look
too unlikely :-) especially in the last, bottom-most calls.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

You and I have been doing this for a very long time so it is no problem
for us, but a newbie who feels they must examine 300+ threads (the
number on an out-of-the-box W2K3 system these days) and figure out which
ones are relevant and which are not faces a daunting task. Those
of us doing this for a while can quickly scan through the list and pick
out those that are “interesting”.

I’ve tried to explain WHY a particular thread is interesting (teaching
debugging) and come to the conclusion that most users find it easier to
sift through a much smaller body of information; the process is
subjective and difficult to explain simply and coherently.

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Maxim S. Shatskih
Sent: Tuesday, August 02, 2005 5:10 PM
To: ntfsd redirect
Subject: Re: [ntfsd] Arrrg!

