Trouble finding a pool corruption bug

Eric_Diven · September 9, 2008, 12:45am

I made some changes to my control device object to support overlapped
calls of DeviceIoControl, and I seem to have introduced a pool
corruption bug somewhere. I have driver verifier on with pool tracking,
and I’ve enabled special pool for all of my tags, and I cannot find the
problem.

The symptom I’m seeing (3 times now, consistently) is that I get a
bugcheck DRIVER_CORRUPTED_EXPOOL (c5), with a very specific address
(0x107) and the problem is happening in ExAllocatePoolWithTag for a call
to NtQueryVolumeInformationFile. My driver isn’t on the call stack, so
I have to assume it corrupted the pool earlier. The code that’s
exposing the problem is our code, but it’s not mine and I don’t have
symbols at the moment. I am pretty sure it’s not directly invoking
anything of mine at any point though. I’ll fill in on this point
tomorrow morning once I can talk to the guy who does that stuff.

If anybody sees any glaring errors in the code below, or can glean any
useful information from the !analyze, please fill me in. At this point,
I can only guess that I’m corrupting the pool with my handling of
workitem allocation and freeing, but I can’t see a way to get a tag on
that allocation, despite the fact that pooltag.txt lists some tags for
various workitems.

Thanks,

~Eric

typedef struct write_ctx {
    PIRP irp;
    PIO_STACK_LOCATION irp_sp;
    EHR_WRITE_ARGS *args;
    PIO_WORKITEM wi;
} write_ctx_t;

static NTSTATUS ehr_file_write_async (PDEVICE_OBJECT device, PIRP irp,
                                      PIO_STACK_LOCATION irp_sp,
                                      EHR_WRITE_ARGS *args)
{
    // NOT STATUS_SUCCESS, we have to return S_P if we IoMarkIrpPending
    NTSTATUS status = STATUS_PENDING;
    write_ctx_t *wctx = NULL;

    // TODO: use a lookaside list
    wctx = ExAllocatePoolWithTag (PagedPool, sizeof (write_ctx_t),
                                  EHR_WRITE_CTX_TAG);
    TEST_ALLOC_AND_ABORT (wctx);

    wctx->wi = IoAllocateWorkItem (device);
    TEST_ALLOC_AND_ABORT (wctx->wi);

    wctx->irp = irp;
    wctx->irp_sp = irp_sp;
    wctx->args = args;

    IoMarkIrpPending (irp);

    IoQueueWorkItem (wctx->wi, ehr_file_write_async_worker,
                     DelayedWorkQueue, wctx);

out:
    if (status != STATUS_PENDING) {
        if (wctx->wi) {
            IoFreeWorkItem (wctx->wi);
        }
        if (wctx) {
            DBG_PRINT_WRAPPER (("\n\n>>> FREED wctx in write_async!!! <<<\n\n"));
            ExFreePoolWithTag (wctx, EHR_WRITE_CTX_TAG);
        }
    }
    return status;
}

static VOID ehr_file_write_async_worker (PDEVICE_OBJECT device, VOID *ctx)
{
    NTSTATUS status;
    write_ctx_t *wctx = (write_ctx_t *) ctx;

    status = ehr_file_write_sync (wctx->irp, wctx->irp_sp, wctx->args);

    wctx->irp->IoStatus.Status = status;

    IoFreeWorkItem (wctx->wi);
    IoCompleteRequest (wctx->irp, IO_NO_INCREMENT);
    ExFreePoolWithTag (wctx, EHR_WRITE_CTX_TAG);
}

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck C5, {107, d0000002, 1, 808933b7}

*** ERROR: Symbol file could not be found. Defaulted to export symbols for OurDLL.dll -
*** ERROR: Module load completed but symbols could not be loaded for OurService.exe
*************************************************************************
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information. Contact the group that       ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: kernel32!pNlsUserInfo                         ***
***                                                                   ***
*************************************************************************
Probably caused by : ntkrpamp.exe ( nt!ExAllocatePoolWithTag+83f )

Followup: MachineOwner

nt!RtlpBreakWithStatusInstruction:
80871f20 cc int 3
kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_CORRUPTED_EXPOOL (c5)
An attempt was made to access a pageable (or completely invalid) address
at an interrupt request level (IRQL) that is too high. This is caused
by drivers that have corrupted the system pool. Run the driver verifier
against any new (or suspect) drivers, and if that doesn’t turn up the
culprit, then use gflags to enable special pool.
Arguments:
Arg1: 00000107, memory referenced
Arg2: d0000002, IRQL
Arg3: 00000001, value 0 = read operation, 1 = write operation
Arg4: 808933b7, address which referenced memory

Debugging Details:

BUGCHECK_STR: 0xC5_D0000002

CURRENT_IRQL: 2

FAULTING_IP:
nt!ExAllocatePoolWithTag+83f
808933b7 897004 mov dword ptr [eax+4],esi

DEFAULT_BUCKET_ID: DRIVER_FAULT

PROCESS_NAME: OurService.exe

TRAP_FRAME: ba84bbfc -- (.trap 0xffffffffba84bbfc)
ErrCode = 00000002
eax=00000103 ebx=808aeae0 ecx=808b4180 edx=00000045 esi=808aed30 edi=81c8e488
eip=808933b7 esp=ba84bc70 ebp=ba84bcac iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202
nt!ExAllocatePoolWithTag+0x83f:
808933b7 897004 mov dword ptr [eax+4],esi ds:0023:00000107=???
Resetting default scope

LAST_CONTROL_TRANSFER: from 80826967 to 80871f20

STACK_TEXT:
ba84b7f8 80826967 00000003 00000000 00000000 nt!RtlpBreakWithStatusInstruction
ba84b844 8082786b 00000003 00000107 808933b7 nt!KiBugCheckDebugBreak+0x19
ba84bbdc 8088c963 0000000a 00000107 d0000002 nt!KeBugCheck2+0x5e1
ba84bbdc 808933b7 0000000a 00000107 d0000002 nt!KiTrap0E+0x2a7
ba84bcac 8087ed00 00000008 00000000 20206f49 nt!ExAllocatePoolWithTag+0x83f
ba84bcd0 808f1c0e 8196bd88 0000021a 20206f49 nt!ExAllocatePoolWithQuotaTag+0x5a
ba84bd48 8088978c 00000174 0013f150 0018c5a0 nt!NtQueryVolumeInformationFile+0x382
ba84bd48 7c8285ec 00000174 0013f150 0018c5a0 nt!KiFastCallEntry+0xfc
0013f10c 7c82772b 77e4b71e 00000174 0013f150 ntdll!KiFastSystemCallRet
0013f110 77e4b71e 00000174 0013f150 0018c5a0 ntdll!NtQueryVolumeInformationFile+0xc
0013f1a0 77e43b68 7ffdec00 001b0940 00000206 kernel32!GetVolumeInformationW+0x237
0013f220 110c4de2 0018a634 001842dc 00000103 kernel32!GetVolumeInformationA+0xf0
WARNING: Stack unwind information not available. Following frames may be wrong.
0013f334 110b9a79 00182610 0013f770 73649cec OurDLL!DllCanUnloadNow+0x80354
0013f8bc 0041847d 00182610 0013fb6c 0013fc3c OurDLL!DllCanUnloadNow+0x74feb
0013fb60 735c1fb3 0015f868 0013fb7c 004044d6 OurService+0x1847d
0013fb7c 735c22b4 004044d6 0013fc38 00000002 MSVBVM60!tagAPRINTER::QueryInterface+0x193
0013fb94 735c239a 0015f8d4 0013fc78 0013fc38 MSVBVM60!CTL::QueryInterface+0x97
0013fc9c 735c28e7 00e913a4 00e9034c 00e85af8 MSVBVM60!CTL::QueryInterface+0x17d
0013fcc0 7362bd94 00e913a4 00000000 00000000 MSVBVM60!CTL::InternalRelease+0x29
0013fcd8 735cd0c6 00e913a4 00010056 00000113 MSVBVM60!CVBApplication::LoadUnload+0xc5
0013fd00 735cf855 00e913a4 00010056 00000113 MSVBVM60!MainHwndCreate+0x72
0013fd5c 7739b6e3 00010056 00000113 00010056 MSVBVM60!ShowMethShow+0xde
0013fd88 7739b874 735cf626 00010056 00000113 USER32!InternalCallWinProc+0x28
0013fe00 7739ba92 00000000 735cf626 00010056 USER32!UserCallWinProcCheckWow+0x151
0013fe68 773a16e5 0013fe90 00000001 0013feb8 USER32!DispatchMessageWorker+0x327
0013fe78 7357a4a3 0013fe90 ffffffff 00e8373c USER32!DispatchMessageA+0xf
0013feb8 7357a41a ffffffff 00e83764 00e80000 MSVBVM60!CLIENT::`vftable'+0xb
0013fefc 7357a2f8 00e83834 ffffffff 000004cc MSVBVM60!FORM::`vftable'+0xa
0013ff18 7357a2c3 00e83760 00e83834 ffffffff MSVBVM60!CTLDOC::`vftable'+0x8
0013ff3c 7357361c ffffffff 00000000 00000000 MSVBVM60!CTLMENU::`vftable'+0x13
0013ffb8 00402222 00402aa0 77e6f23b 00000000 MSVBVM60!_CDIR_vtbl::`vftable'+0x144
0013fff0 00000000 00402218 00000000 78746341 OurService+0x2222

STACK_COMMAND: kb

FOLLOWUP_IP:
nt!ExAllocatePoolWithTag+83f
808933b7 897004 mov dword ptr [eax+4],esi

SYMBOL_STACK_INDEX: 4

SYMBOL_NAME: nt!ExAllocatePoolWithTag+83f

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: nt

IMAGE_NAME: ntkrpamp.exe

DEBUG_FLR_IMAGE_TIMESTAMP: 45ec0a19

FAILURE_BUCKET_ID: 0xC5_D0000002_VRF_nt!ExAllocatePoolWithTag+83f

BUCKET_ID: 0xC5_D0000002_VRF_nt!ExAllocatePoolWithTag+83f

Followup: MachineOwner
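
A worked reading of the trap frame above, offered as a hedged sketch and not a diagnosis. The faulting instruction is `mov dword ptr [eax+4],esi` with eax=00000103, which gives the 0x107 reported in Arg1. The value 0x103 is also the numeric value of STATUS_PENDING, so one possibility worth checking is that a status code was stored into memory that had already gone back to the pool; it could equally be coincidence. The helper names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Numeric value of STATUS_PENDING from ntstatus.h, repeated here so the
 * sketch builds outside the WDK. */
#define SKETCH_STATUS_PENDING 0x00000103u

/* Effective address written by `mov dword ptr [eax+4], esi`. */
static uint32_t fault_address(uint32_t eax)
{
    return eax + 4u;
}

/* Nonzero when a register value matches STATUS_PENDING. */
static int looks_like_status_pending(uint32_t value)
{
    return value == SKETCH_STATUS_PENDING;
}
```

With the register values from the dump, `fault_address(0x103)` yields 0x107, and `looks_like_status_pending(0x103)` is true.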

Eric_Diven · September 9, 2008, 12:56am

Figured out how to get a pool tag on a work item
(ExAllocatePoolWithTag/ExInitializeWorkItem). I’ll check that and the
Irp* tags tomorrow.

~Eric

Bronislav_Gabrhelik · September 9, 2008, 2:05am

Switch on Special Pool in Driver Verifier for your driver.
See http://msdn.microsoft.com/en-us/library/ms792863.aspx

This will detect buffer overruns. To catch buffer underruns, either use the gflags utility or manually add the PoolTagOverrun registry value.
See http://support.microsoft.com/default.aspx/kb/188831

-bg

Daniel_Terhell · September 9, 2008, 4:43am

Yes, I see a bug here. If IoAllocateWorkItem fails (and assuming
TEST_ALLOC_AND_ABORT jumps to out), then the following code does nothing,
because status is always equal to STATUS_PENDING:

out:
…
if (wctx)
{
    DBG_PRINT_WRAPPER (("\n\n>>> FREED wctx in write_async!!! <<<\n\n"));
    ExFreePoolWithTag (wctx, EHR_WRITE_CTX_TAG);
}

But that only means your wctx structure is not freed.
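
A user-mode model of the cleanup path under discussion, as a sketch and not the driver's actual code. It assumes the failure macro sets an error status before jumping to `out`, and it tests `wctx` before touching `wctx->wi`, since the posted cleanup reads `wctx->wi` first and would dereference NULL if the first allocation failed. Every name here (`model_alloc`, `model_free`, `queue_write_model`, the `ST_*` constants) is hypothetical; malloc-style stubs stand in for the pool and work-item routines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t status_t;            /* stand-in for NTSTATUS */
#define ST_PENDING   0x00000103u
#define ST_NO_MEMORY 0xC0000017u

typedef struct work_item { int unused; } work_item_t;  /* stand-in for IO_WORKITEM */
typedef struct write_ctx { work_item_t *wi; } write_ctx_t;

static int g_live_allocs = 0;         /* allocations not yet freed */

/* Allocate zeroed memory, or fail on demand to exercise the error paths. */
static void *model_alloc(size_t size, int fail)
{
    void *p = fail ? NULL : calloc(1, size);
    if (p != NULL) {
        g_live_allocs++;
    }
    return p;
}

/* Free a block from model_alloc; a NULL pointer is ignored. */
static void model_free(void *p)
{
    if (p != NULL) {
        free(p);
        g_live_allocs--;
    }
}

/* fail_ctx / fail_wi force the corresponding allocation to fail. */
static status_t queue_write_model(int fail_ctx, int fail_wi)
{
    status_t status = ST_PENDING;
    write_ctx_t *wctx = NULL;

    wctx = (write_ctx_t *)model_alloc(sizeof *wctx, fail_ctx);
    if (wctx == NULL) { status = ST_NO_MEMORY; goto out; }

    wctx->wi = (work_item_t *)model_alloc(sizeof *wctx->wi, fail_wi);
    if (wctx->wi == NULL) { status = ST_NO_MEMORY; goto out; }

    /* Success: in a driver, ownership would pass to the queued worker here.
     * The model has no worker, so it releases both objects itself. */
    model_free(wctx->wi);
    model_free(wctx);
    return status;

out:
    /* Error path: check the container before reading its member. */
    if (wctx != NULL) {
        model_free(wctx->wi);
        model_free(wctx);
    }
    return status;
}
```

Forcing either allocation to fail returns `ST_NO_MEMORY` with nothing left allocated, which is the property the error path is meant to have.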

Daniel_Terhell · September 9, 2008, 4:58am

Sorry about that, my message posted itself before I had finished. To help you
further, I would like to see your TEST_ALLOC_AND_ABORT macro (does it update
status?) as well as your ehr_file_write_sync routine.

//Daniel

Eric_Diven · September 9, 2008, 9:43am

Yeah, TEST_ALLOC_AND_ABORT sets status to STATUS_NO_MEMORY before it
bails. I should have included that salient bit of information. The
write_sync function calls MmGetSystemAddressForMdlSafe and then
FltWriteFile. It’s basically unchanged (apart from being factored out
of the common write dispatch) from what it was doing for
synchronous-only write.

get_fileo_for_filenum looks up the file object (and references it) that
we’re writing to, and has been stable for a while, everything else
should be pretty straightforward.

static NTSTATUS ehr_file_write_sync (PIRP irp, PIO_STACK_LOCATION irp_sp,
                                     EHR_WRITE_ARGS *args)
{
    NTSTATUS status = STATUS_SUCCESS;

    ULONG data_size = irp_sp->Parameters.DeviceIoControl.OutputBufferLength;
    VOID *data = NULL;
    FILE_OBJECT *fileo = NULL;
    eh_stream_context *sc = NULL;

    data = MmGetSystemAddressForMdlSafe (irp->MdlAddress, NormalPagePriority);
    TEST_ALLOC_AND_ABORT (data);

    status = get_fileo_for_filenum (&fileo, args->file_number, &sc);
    TEST_STATUS_AND_ABORT (("Couldn't get fileo for write\n"));

    status = FltWriteFile (args->iid, fileo, &args->offset,
                           data_size, data, 0,
                           NULL, NULL, NULL);
    TEST_STATUS_AND_ABORT (("Could not write data to file"));

out:
    if (fileo) {
        ObDereferenceObject (fileo);
    }
    return status;
}
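
For anyone following along without the macro in front of them, here is a rough user-mode sketch of the pattern being discussed: an allocation-check macro that sets status and jumps to out, plus a cleanup path that tests the context pointer before touching its members. The macro body is my guess from the description above, not the real definition. demo_ctx_t, demo_async and the fail_at / freed parameters are made up for illustration, and malloc/free stand in for the pool and work item routines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* User-mode stand-ins so the sketch compiles outside the kernel. */
typedef int32_t NTSTATUS;
#define STATUS_PENDING   ((NTSTATUS)0x00000103)
#define STATUS_NO_MEMORY ((NTSTATUS)0xC0000017)

/* Hypothetical reconstruction: set the local `status`, then jump to `out`. */
#define TEST_ALLOC_AND_ABORT(p)            \
    do {                                   \
        if ((p) == NULL) {                 \
            status = STATUS_NO_MEMORY;     \
            goto out;                      \
        }                                  \
    } while (0)

typedef struct demo_ctx {
    void *wi;   /* stands in for the PIO_WORKITEM */
} demo_ctx_t;

/* fail_at: 0 = no failure, 1 = first allocation fails, 2 = second fails.
   *freed counts how many allocations the error path released. */
NTSTATUS demo_async(int fail_at, int *freed)
{
    NTSTATUS status = STATUS_PENDING;
    demo_ctx_t *ctx = NULL;

    assert(freed != NULL);
    *freed = 0;

    ctx = (fail_at == 1) ? NULL : calloc(1, sizeof *ctx);
    TEST_ALLOC_AND_ABORT(ctx);

    ctx->wi = (fail_at == 2) ? NULL : malloc(16);
    TEST_ALLOC_AND_ABORT(ctx->wi);

    /* Success: in a driver, ownership would pass to the worker here.
       The demo releases immediately so it does not leak. */
    free(ctx->wi);
    free(ctx);
    return status;

out:
    /* Test ctx before ctx->wi: ctx is NULL when the first allocation failed. */
    if (ctx != NULL) {
        if (ctx->wi != NULL) {
            free(ctx->wi);
            (*freed)++;
        }
        free(ctx);
        (*freed)++;
    }
    return status;
}
```

Because the macro overwrites status before the jump, the error path is only reached with a failure status, so the cleanup does not need a separate status != STATUS_PENDING test.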

Thanks,

~Eric

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@resplendence.com
Sent: Tuesday, September 09, 2008 4:57 AM
To: Windows File Systems Devs Interest List
Subject: Re:[ntfsd] Trouble finding a pool corruption bug

Sorry about that, my message posted by itself. To help you further I
would like to see your TEST_ALLOC_AND_ABORT macro (does this update
status ?) as well as your ehr_file_write_sync routine.

//Daniel

wrote in message news:xxxxx@ntfsd…
> Yes I see a bug here, if IoAllocateWorkItem failed (and assuming
> TEST_ALLOC_AND_ABORT jumps to out) then the following code does
> nothing because status is always equal to STATUS_PENDING:
>
> ----
> out:
> …
> if (wctx)
> {
>     DBG_PRINT_WRAPPER (("\n\n>>> FREED wctx in write_async!!! <<<\n\n"));
>     ExFreePoolWithTag (wctx, EHR_WRITE_CTX_TAG);
> }
> ----
>
> But that only means your wctx structure is not freed.
>
>

Eric_Diven · September 9, 2008, 10:04am

Son of bitch. IoInitializeWorkItem is only available on Vista. Any
ideas on how to figure out if mishandling the workitem is causing my
problem?

~Eric

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Eric Diven
Sent: Tuesday, September 09, 2008 12:56 AM
To: Windows File Systems Devs Interest List
Subject: RE: [ntfsd] Trouble finding a pool corruption bug

Figured out how to get a pool tag on a work item
(ExAllocatePoolWithTag/ExInitializeWorkItem). I’ll check that and the
Irp* tags tomorrow.

~Eric

NTFSD is sponsored by OSR

For our schedule debugging and file system seminars (including our new
fs mini-filter seminar) visit:
http://www.osr.com/seminars

You are currently subscribed to ntfsd as: unknown lmsubst tag argument:
‘’
To unsubscribe send a blank email to xxxxx@lists.osr.com

Ian_Blake · September 9, 2008, 10:40am

> data = MmGetSystemAddressForMdlSafe (irp->MdlAddress, NormalPagePriority);

I would worry about that line.

Did you lock the Mdl?
You have not unlocked it in the code we have seen.

–

Visit Pipex Business: The homepage for UK Small Businesses

Go to http://www.pipex.co.uk/business-services

Eric_Diven · September 9, 2008, 10:47am

Do I have to? The dispatch routine should be called at PASSIVE_LEVEL
(http://msdn.microsoft.com/en-us/library/ms790762.aspx) as should the
workitem callback
(http://msdn.microsoft.com/en-us/library/ms795209.aspx). Since it’s an
MDL, it shouldn’t matter what thread context it’s in, correct? Or have
I made a completely flawed assumption about how something works here?
Please correct me if that’s the case.

Thanks,

~Eric

Eric_Diven · September 9, 2008, 10:52am

Ah blessed documentation, thank you kindly. I couldn’t find anything
last night other than “use gflags”. I’ve now tried special pool for my
pool tags with both overrun and underrun protection. No luck on either.

~Eric

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@xythos.com
Sent: Tuesday, September 09, 2008 2:04 AM
To: Windows File Systems Devs Interest List
Subject: RE:[ntfsd] Trouble finding a pool corruption bug

Switch on Special Pool in Driver Verifier for your driver.
see http://msdn.microsoft.com/en-us/library/ms792863.aspx

You will detect buffer overruns this way. For buffer underruns you can
either use the gflags utility or manually add the registry value PoolTagOverrun.
see http://support.microsoft.com/default.aspx/kb/188831

-bg

Ian_Blake · September 9, 2008, 11:03am

Quoting Eric Diven :

> Do I have to? The dispatch routine should be called at PASSIVE_LEVEL
> (http://msdn.microsoft.com/en-us/library/ms790762.aspx) as should the
> workitem callback
> (http://msdn.microsoft.com/en-us/library/ms795209.aspx). Since it’s an
> MDL, it shouldn’t matter what thread context it’s in, correct? Or have
> I made a completely flawed assumption about how something works here?
> Please correct me if that’s the case.
>
> Thanks,
>
> ~Eric
>

Yes you have to.

You are right that you are still at PASSIVE_LEVEL, but in a different
context. The address in the MDL refers to a particular user process; once
the context has changed, there is no longer any connection between the
pages and the data that must be mapped to them. So you need to lock the
MDL before queueing the work item (which looks OK to me).

If you look at the docs for MmGetSystemAddressForMdlSafe you will find
the following in the comments section

‘The input MDL must describe an already locked-down user-space buffer
that MmProbeAndLockPages returned, a locked-down buffer that
MmBuildMdlForNonPagedPool returned, or system-space memory that is
allocated from nonpaged pool, contiguous memory, or noncached memory’

Eric_Diven · September 9, 2008, 11:46am

Well bugger. That’s not it. Thanks for pointing it out though. I’m
sure it would have reared its head at some point. On the plus side,
I’ve now noticed that the crash is happening after I do stuff with
IRP_MJ_ACQUIRE_FOR_SECTION_SYNCHRONIZATION…

~Eric

Eric_Diven · September 9, 2008, 12:22pm

Well, I can duplicate the bug a little more quickly and reliably by
doing an Explorer drag and drop (because involving Explorer is the
surefire way to make debugging easier, of course). It looks like
the pool tag for the allocation that’s bugchecking is Io[sp][sp], that
is, “Io” followed by two spaces. I guess I’ll put special pool on that
tag, since I’m not seeing anything with it on mine.

~Eric

BUGCHECK_STR: 0xC5_D0000002

CURRENT_IRQL: 2

FAULTING_IP:
nt!ExAllocatePoolWithTag+83f
808933b7 897004 mov dword ptr [eax+4],esi

DEFAULT_BUCKET_ID: DRIVER_FAULT

PROCESS_NAME: explorer.exe

TRAP_FRAME: ba1e6bfc -- (.trap 0xffffffffba1e6bfc)
ErrCode = 00000002
eax=00000103 ebx=808aeae0 ecx=808b4180 edx=00000045 esi=808aed30 edi=81950b98
eip=808933b7 esp=ba1e6c70 ebp=ba1e6cac iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202
nt!ExAllocatePoolWithTag+0x83f:
808933b7 897004 mov dword ptr [eax+4],esi ds:0023:00000107=???
Resetting default scope

LAST_CONTROL_TRANSFER: from 80826967 to 80871f20

STACK_TEXT:
ba1e67f8 80826967 00000003 00000000 00000000 nt!RtlpBreakWithStatusInstruction
ba1e6844 8082786b 00000003 00000107 808933b7 nt!KiBugCheckDebugBreak+0x19
ba1e6bdc 8088c963 0000000a 00000107 d0000002 nt!KeBugCheck2+0x5e1
ba1e6bdc 808933b7 0000000a 00000107 d0000002 nt!KiTrap0E+0x2a7
ba1e6cac 8087ed00 00000008 00000000 20206f49 nt!ExAllocatePoolWithTag+0x83f
ba1e6cd0 808f1c0e 81f4e110 0000021c 20206f49 nt!ExAllocatePoolWithQuotaTag+0x5a
ba1e6d48 8088978c 00000518 01bdec24 01bde9a8 nt!NtQueryVolumeInformationFile+0x382
ba1e6d48 7c8285ec 00000518 01bdec24 01bde9a8 nt!KiFastCallEntry+0xfc
01bde988 7c82772b 7c9c4a7b 00000518 01bdec24 ntdll!KiFastSystemCallRet
01bde98c 7c9c4a7b 00000518 01bdec24 01bde9a8 ntdll!NtQueryVolumeInformationFile+0xc
01bdee90 7c99ea1d 0015e10c 01bdeed4 00000000 SHELL32!GetDownlevelCopyDataLossText+0x215
01bdf0e0 7c9a183f 0015e018 01bdf38c 00000300 SHELL32!AllConfirmations+0x2df
01bdf828 7c9a1d71 0015e018 00000000 001024a0 SHELL32!MoveCopyDriver+0x309
01bdf86c 7ca05f9d 00000000 0016acec 00153d34 SHELL32!SHFileOperationW+0x17b
01bdfce0 7ca062bb 00153d34 0016acec 00103c50 SHELL32!CFSDropTarget::_MoveCopy+0x1ff
01bdff38 7ca06361 00153d34 0016acec 00000000 SHELL32!CFSDropTarget::_DoDrop+0x270
01bdff54 77da3f12 0016acec 00000000 00000000 SHELL32!CFSDropTarget::_DoDropThreadProc+0x46
01bdffb8 77e64829 00000000 00000000 00000000 SHLWAPI!WrapperThreadProc+0x94
01bdffec 00000000 77da3ea5 0007eee0 00000000 kernel32!BaseThreadStart+0x34

STACK_COMMAND: kb

FOLLOWUP_IP:
nt!ExAllocatePoolWithTag+83f
808933b7 897004 mov dword ptr [eax+4],esi

SYMBOL_STACK_INDEX: 4

SYMBOL_NAME: nt!ExAllocatePoolWithTag+83f

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: nt

IMAGE_NAME: ntkrpamp.exe

DEBUG_FLR_IMAGE_TIMESTAMP: 45ec0a19

FAILURE_BUCKET_ID: 0xC5_D0000002_VRF_nt!ExAllocatePoolWithTag+83f

BUCKET_ID: 0xC5_D0000002_VRF_nt!ExAllocatePoolWithTag+83f

Followup: MachineOwner
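
A small aside on reading the numbers in that dump, as a sketch rather than anything authoritative. The value 20206f49 on the ExAllocatePoolWithTag and ExAllocatePoolWithQuotaTag frames is the pool tag with its first character in the low-order byte, and the faulting store is to eax+4 with eax=00000103. The helper names below (pool_tag_to_string, faulting_address) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Unpack a pool tag as it appears in a stack dump: the low-order byte
   is the first character of the tag. */
void pool_tag_to_string(uint32_t tag, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = (char)((tag >> (8 * i)) & 0xFF);
    }
    out[4] = '\0';
}

/* Address written by `mov dword ptr [eax+4],esi` for a given eax. */
uint32_t faulting_address(uint32_t eax)
{
    return eax + 4u;
}
```

Decoding 0x20206f49 this way gives "Io" followed by two spaces, and 0x103 + 4 gives 0x107, which matches Arg1 of the bugcheck. As it happens, 0x103 is also the numeric value of STATUS_PENDING. That may be a hint that a status value was written into pool memory that had already been freed, but the dump by itself does not prove it.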

Eric_Diven · September 9, 2008, 1:09pm

That found it. I was accessing something in AssociatedIrp.SystemBuffer
after it had been freed.

Thanks!

~Eric

(And yet another thread has degenerated into Eric talking to himself)
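
For the archives, a user-mode sketch of the general way to avoid this class of bug: copy whatever you need out of the buffer before completing the request, and touch only the copy afterwards. With buffered I/O the system buffer belongs to the I/O manager and is released as part of completing the IRP. Everything here is a stand-in and an assumption on my part: demo_irp_t, demo_args_t, demo_complete and demo_worker are hypothetical names, free() plays the role of the I/O manager releasing the buffer, and this is not the actual fix that went into the driver.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct demo_args {      /* hypothetical stand-in for EHR_WRITE_ARGS */
    unsigned file_number;
    long long offset;
} demo_args_t;

typedef struct demo_irp {       /* hypothetical stand-in for a buffered-I/O IRP */
    void *system_buffer;        /* owned by the "I/O manager" */
    long status;
} demo_irp_t;

/* Stand-in for request completion: after this call the buffer is gone. */
void demo_complete(demo_irp_t *irp)
{
    free(irp->system_buffer);
    irp->system_buffer = NULL;
}

/* Snapshot the arguments, complete the request, then use only the copy. */
unsigned demo_worker(demo_irp_t *irp)
{
    demo_args_t local;

    assert(irp != NULL && irp->system_buffer != NULL);
    memcpy(&local, irp->system_buffer, sizeof local);

    irp->status = 0;            /* set status before completing */
    demo_complete(irp);

    return local.file_number;   /* safe: reads the local copy */
}
```

In a real driver the IRP itself must not be touched after completion either, so anything needed later has to be captured before the completion call.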

Daniel_Terhell · September 9, 2008, 4:23pm

If you are writing a minifilter then you should be using
FltAllocateDeferredIoWorkItem / FltQueueDeferredIoWorkItem and for IRP based
operations only. I will take a look at your code tomorrow.

//Daniel

Eric_Diven · September 10, 2008, 12:53am

Before you get too far into my code, take a look at the mini filter cdo
example (and marvel at the perversity of the whole scheme). The control
device object is a device all its own with a little legacy driver
attached, which is basically what I’ve implemented. I’m using it to
issue writes to minifilters below mine to avoid sharing and reentrancy
issues. Sorry if this is all familiar to you, I figured I’d spare you
some digging in case it isn’t. Okay, as it’s now tomorrow for me, I’m
heading home for some sleep.

Thanks,

~Eric
