PseudoRandom Bugchecks around ExFreePool

Hello all,

I’ve really appreciated searching through this list and gleaning some
knowledge off of your previous answers. I hope someone may be able to
give me a new approach to my current problem.

I’m writing a minifilter driver for file systems. Everything has been
working fine, but a bugcheck will occur sometimes with
IRQL_NOT_LESS_OR_EQUAL. Now, it appears that during some of my
ExFreePoolWithTag calls (or a bit later during what seems to be a
deferred free), the pointers will be less than 0x0000FFFF. I find this
to be very strange and I do not know where this might originate, because
I check for allocation failures after each of my allocate calls.

The Driver Verifier is turned on, and I can see the calls for the
special pool (VerifierFreePoolWithTag). The bugchecks seem to have no
link to heavy or light usage, but will always occur if I wait long
enough. They sometimes occur as soon as the driver starts (during
Instance setup). There does not seem to be a pattern as to where the
bugchecks occur.

I’ve attached two of the !analyze outputs below. If anyone can suggest
a good direction for investigation, I would greatly appreciate it.

Thanks for your time,

Justin

– I apologize for the lack of my own driver symbols. I recompiled and
lost the ability to load symbols for these dumps.

Justin M. Walker, CDIA+, ECMp
Performance Testing Engineer
Hyland Software
Office: 440.788.5461
Fax: 440.788.5561

Driver is called Windtalk

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 0000b130, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 80548e03, address which referenced memory

Debugging Details:
------------------

READ_ADDRESS: 0000b130

CURRENT_IRQL: 2

FAULTING_IP:
nt!ExpCheckForResource+4f
80548e03 8b36 mov esi,[esi]

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xA

LAST_CONTROL_TRANSFER: from 80548e03 to 804e0aac

STACK_TEXT:
b9a82a2c 80548e03 badb0d00 00000000 00000021 nt!KiTrap0E+0x238
b9a82ab0 8067efdd 00378fc0 00000040 818d4f38 nt!ExpCheckForResource+0x4f
b9a82ac8 806731fb 83378fc0 b9a82b50 b98004b1 nt!ExFreePoolSanityChecks+0x4d
b9a82ad4 b98004b1 83378fc0 4f424550 81ed4550 nt!VerifierFreePoolWithTag+0x1c
b9a82b50 ba6f803d 00000000 00326cc0 0000000c WindTalk+0x34b1
b9a82b7c ba7029ac 81d36540 00326cc0 0000000c fltMgr!FltpFilterMessage+0x45
b9a82ba4 ba6f69e7 81d36540 832d8f00 00326cc0 fltMgr!FltpMsgDeviceControl+0x7a
b9a82be8 ba6f6f3d 8229c3e8 832d8f68 8229c3e8 fltMgr!FltpMsgDispatch+0x87
b9a82c1c 804e13d9 8229c3e8 832d8f68 806ff428 fltMgr!FltpDispatch+0x35
b9a82c2c 80672145 81cdb808 806ff410 832d8f68 nt!IopfCallDriver+0x31
b9a82c50 8056f50b 832d8fd8 81d36540 832d8f68 nt!IovCallDriver+0xa0
b9a82c64 80580fb1 8229c3e8 832d8f68 81d36540 nt!IopSynchronousServiceTail+0x60
b9a82d00 8058709e 0000004c 00000000 00000000 nt!IopXxxControlFile+0x5ef
b9a82d34 804dd99f 0000004c 00000000 00000000 nt!NtDeviceIoControlFile+0x2a
b9a82d34 7c90eb94 0000004c 00000000 00000000 nt!KiFastCallEntry+0xfc
0012f748 00000000 00000000 00000000 00000000 0x7c90eb94

STACK_COMMAND: kb

FOLLOWUP_IP:
WindTalk+34b1
b98004b1 0fb74ddc movzx ecx,word ptr [ebp-0x24]

FAULTING_SOURCE_CODE:

SYMBOL_STACK_INDEX: 4

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: WindTalk+34b1

MODULE_NAME: WindTalk

IMAGE_NAME: WindTalk.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 44dce93d

FAILURE_BUCKET_ID: 0xA_VRF_WindTalk+34b1

BUCKET_ID: 0xA_VRF_WindTalk+34b1

Followup: MachineOwner

===============================================================================

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 000017f0, memory referenced
Arg2: 0000001c, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 805371f2, address which referenced memory

Debugging Details:
------------------

READ_ADDRESS: 000017f0

CURRENT_IRQL: 1c

FAULTING_IP:
nt!KeCheckForTimer+35
805371f2 8b3f mov edi,[edi]

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xA

LAST_CONTROL_TRANSFER: from 805371f2 to 804e0aac

STACK_TEXT:
f88ea6d8 805371f2 badb0d00 000017d8 804d95fa nt!KiTrap0E+0x238
f88ea758 8067efd6 00a4cf00 00000270 81809bf0 nt!KeCheckForTimer+0x35
f88ea770 806731fb 82a4cf00 f88eaa14 b977cb0a nt!ExFreePoolSanityChecks+0x46
f88ea77c b977cb0a 82a4cf00 4f425653 b977cad2 nt!VerifierFreePoolWithTag+0x1c
f88eaa14 b9784709 f88eaa7c 804da3a4 81809bf0 WindTalk+0xb0a
f88eaa48 ba70c121 f88eaa7c 00000001 00000014 WindTalk+0x8709
f88eaa60 ba703ebb f88eaa7c 00000001 00000014 fltMgr!FltvInstanceSetup+0x1b
f88eaa94 ba704442 81962840 00000001 80550005 fltMgr!FltpDoInstanceSetupNotification+0x4b
f88eaaf4 ba7047cd 8220e1e8 81809bf0 00000001 fltMgr!FltpInitInstance+0x272
f88eab64 ba7048d8 8220e1e8 81809bf0 00000001 fltMgr!FltpCreateInstanceFromName+0x295
f88eabcc ba70b6af 8220e1e8 81809bf0 00000001 fltMgr!FltpEnumerateRegistryInstances+0xf4
f88eac18 ba70a703 8220e1e8 8196c680 e22d123a fltMgr!FltpDoVolumeNotificationForNewFilter+0xbd
f88eac4c b97862bb 8220e1e8 00000001 00000000 fltMgr!FltStartFiltering+0x35
f88eac84 805a42e5 8196c680 8135f000 00000000 WindTalk+0xa2bb
f88ead54 805acb11 0000023c 00000001 00000000 nt!IopLoadDriver+0x66c
f88ead7c 804e23b5 0000023c 00000000 822cf020 nt!IopLoadUnloadDriver+0x45
f88eadac 80574128 b9f67cf4 00000000 00000000 nt!ExpWorkerThread+0xef
f88eaddc 804efc81 804e22f1 00000001 00000000 nt!PspSystemThreadStartup+0x34
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

STACK_COMMAND: kb

FOLLOWUP_IP:
WindTalk+b0a
b977cb0a c3 ret

FAULTING_SOURCE_CODE:

SYMBOL_STACK_INDEX: 4

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: WindTalk+b0a

MODULE_NAME: WindTalk

IMAGE_NAME: WindTalk.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 44e06b13

FAILURE_BUCKET_ID: 0xA_VRF_WindTalk+b0a

BUCKET_ID: 0xA_VRF_WindTalk+b0a

Followup: MachineOwner
---------

-----------------------------------------
CONFIDENTIALITY NOTICE: This message and any attached documents may
contain confidential information from Hyland Software, Inc. The
information is intended only for the use of the individual or
entity named above. If the reader of this message is not the
intended recipient, or an employee or agent responsible for the
delivery of this message to the intended recipient, the reader is
hereby notified that any dissemination, distribution or copying of
this message or of any attached documents, or the taking of any
action or omission to take any action in reliance on the contents
of this message or of any attached documents, is strictly
prohibited. If you have received this communication in error,
please notify the sender immediately by e-mail or telephone, at
(440) 788-5000, and delete the original message immediately. Thank
you.

So the low value for the faulting address is typically (for
IRQL_NOT_LESS_OR_EQUAL) an indication of dereferencing a NULL pointer to
a structure. The low value is the offset of the accessed field within
the struct.
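
To make the "NULL base plus field offset" reading concrete, here is a minimal user-mode sketch. EXAMPLE_RECORD and faulting_address are hypothetical names invented for illustration, and the 0x17f0 offset is simply borrowed from the second dump; nothing here claims that any real kernel structure has this layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure for illustration only; not a real kernel type. */
typedef struct EXAMPLE_RECORD {
    uint8_t  Header[0x17f0];  /* padding so the next field sits at offset 0x17f0 */
    uint32_t Link;            /* the field the faulting instruction would read */
} EXAMPLE_RECORD;

/* Address that is touched when `Link` is read through a record at `base`. */
static uintptr_t faulting_address(uintptr_t base)
{
    /* Guard against wraparound before adding the field offset. */
    assert(base <= UINTPTR_MAX - offsetof(EXAMPLE_RECORD, Link));
    return base + offsetof(EXAMPLE_RECORD, Link);
}
```

With a NULL base, faulting_address(0) is just the field offset, which is why a small "memory referenced" value often points at a NULL structure pointer rather than at a bad allocation size.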

Given that you are crashing inside kernel routines, this indicates that
you are passing stale pointers or that you have managed to trample over
internal structures before your call to ExFreePool. If you just hand a
bogus or stale pool allocation pointer to ExFreePool I think verifier
will catch that and you will get a different bugcheck, so I am guessing
that you are stepping over internal kernel data structures.

What verifier settings are enabled? At this point I would turn on
everything and run against the checked kernel.

How come your symbols for your windtalk minifilter aren’t showing up?

Runtime tracing using either DbgPrint or ETW can help as well to uncover
a pattern in the failures by understanding what your driver was doing in
the recent past.
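
As a complement to DbgPrint or ETW, a small in-memory trace ring can be inspected from the debugger after a crash. The sketch below is a user-mode illustration using C11 atomics, not the DbgPrint or ETW mechanism itself; TRACE_ENTRY, trace_event and trace_latest are hypothetical names, and a driver would use its own interlocked primitives and nonpaged storage instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TRACE_SLOTS 8u  /* small power of two; older entries are overwritten */

typedef struct TRACE_ENTRY {
    uint32_t  seq;       /* global sequence number of the event */
    uint32_t  event_id;  /* caller-defined event code */
    uintptr_t arg;       /* e.g. the pointer being allocated or freed */
} TRACE_ENTRY;

static TRACE_ENTRY g_trace[TRACE_SLOTS];
static atomic_uint g_trace_next;  /* next sequence number to hand out */

/* Record one event; each caller claims a distinct sequence number atomically. */
static void trace_event(uint32_t event_id, uintptr_t arg)
{
    uint32_t seq = atomic_fetch_add(&g_trace_next, 1u);
    TRACE_ENTRY *e = &g_trace[seq % TRACE_SLOTS];
    e->seq = seq;
    e->event_id = event_id;
    e->arg = arg;
}

/* Most recently claimed entry; only meaningful after at least one event. */
static const TRACE_ENTRY *trace_latest(void)
{
    uint32_t next = atomic_load(&g_trace_next);
    assert(next > 0);
    return &g_trace[(next - 1u) % TRACE_SLOTS];
}
```

Logging every allocate and free with the pointer value makes it easier to see, after a bugcheck, which code path last touched the block being freed.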

Run your driver through PREfast and fix everything that it reasonably
complains about.

Look for footprints of your data structures scattered in the mess around
where you are crashing.


Hi Mark,

Looking at the bugcheck dumps, I see a value of 0xB130 for one of the
memory locations, which is 45,360 bytes (about 44 KB). I definitely do
not allocate anything this size, especially since most of these are
string buffers of 20 characters or less. In one of the cases, all I’m
doing is copying a string into the memory location, passing it to a
function that does not change it, then freeing it. The peak allocation
shown by the driver verifier is around 8 KB.

I have run PreFast against all of my code, and the driver verifier is
enabled for all except for low resource simulation.

As for looking for a pattern, the only one I’ve seen is that it happens
around the freeing of the pools. Sometimes it occurs during the
Instance setup (where all it does is copy into and out of the memory
location), sometimes during a Pre or Post read function, and sometimes
during the teardown.

I have extensive DbgPrint statements throughout the code, none of which
have helped find the specific culprit.

Does anyone know if a call to any of the RtlCopy functions could do
anything like this? All of the memory is nonpaged, so the IRQL
shouldn’t matter, but I’m still learning, so I could be wrong.
Is there a specific type of copy I should be using (such as
RtlCopyBytes, RtlCopyMemory or RtlCopyMemory32)? None of the memory is
overlapping in the copy statement, so I do not use the move function.

As for the symbols not showing up, I recompiled and lost the PDBs for
the runs the dumps are for. If you’d like I can run a few more times
and get some fresh dumps.

Thanks for the ideas, and keep them coming.

Justin

Justin M. Walker, CDIA+, ECMp  
Performance Testing Engineer  
440.788.5461  
  
Reality is just Chaos with better lighting.  
  

In my experience this means you are walking off the end of an allocation
somewhere. You mention these are string buffers. My first bet is that
somewhere, on at least one string (and possibly only one string) you are
allocating a size in characters and then storing unicode data, which walks
off the end of the buffer. Or perhaps you forgot a trailing null character
and are going 2 bytes too far.
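
A small user-mode sketch of the sizing mistake described above. WCHAR16 is a stand-in for the 2-byte kernel WCHAR, and both function names are hypothetical; the point is only the arithmetic, not any code from the driver in question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t WCHAR16;  /* stand-in for the 2-byte kernel WCHAR */

/* Bytes needed for `chars` UTF-16 code units plus a terminating null. */
static size_t unicode_buffer_bytes(size_t chars)
{
    assert(chars < SIZE_MAX / sizeof(WCHAR16));  /* guard against overflow */
    return (chars + 1) * sizeof(WCHAR16);
}

/* Bytes written past the end when the buffer was sized in characters. */
static size_t overrun_bytes(size_t chars)
{
    size_t allocated = chars;  /* the mistake: character count used as byte count */
    return unicode_buffer_bytes(chars) - allocated;
}
```

For a 20-character string, the correct size is 42 bytes, so a 20-byte allocation would be overrun by 22 bytes, enough to damage whatever follows the block in pool.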

I’d guess that those clobbered pointers are getting the top 2 bytes cleared
somehow, and possibly the bottom two overwritten. The B130 would be a
little hard to generate as a unicode character, and it’s a little strange on
a little-endian machine that you would clobber the top bytes and not the
bottom. This suggests that an over-long structure is getting modified
rather than a string.

If you have the kernel debugger set up, I’d consider a breakpoint on the
sanity check call. Then look in memory at the stuff that is being returned,
and in the pool just in front of it. Maybe you will recognize something
there that looks like one of your allocations and see something wrong with
it.

Loren

You say that you see pointer values below 0x0000FFFF, but it’s not clear
to me whether these are the values you’re handing to ExFreePool, or
whether they’re the values found in the bugcheck report.

Have you verified whether the pointers you’re handing into ExFreePool
are in fact the same pointers you allocated? Have you verified that
you’re always handing in a pointer to pool rather than a pointer to a
global buffer or an on-stack buffer?

The first thing is obviously to make sure you’re handing in valid
parameters. If you aren’t then it’s a matter of tracking the invalid
parameter back to its source, be it an internal corruption or an
incorrect source (i.e. not pool).

If you are handing in a valid pool parameter then use !pool to see what
the pool block looks like and see if you can identify the corruption
there. See if you are overwriting the end of your pool block (and since
you’re working with strings that’s incredibly likely) or if you’re
perhaps underwriting from the beginning, particularly if you have any
pointer math that involves subtraction or intermediate values of types
smaller than PVOID.
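
The remark about intermediate values narrower than PVOID can be illustrated with a short user-mode sketch. The function names are hypothetical and the address is just a value taken from the first dump as sample input; the dumps do not show that this is what actually happened in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Offset arithmetic routed through a 16-bit intermediate:
   the upper bits of the address are silently discarded. */
static uintptr_t advance_truncating(uintptr_t base, uint16_t delta)
{
    uint16_t narrow = (uint16_t)(base + delta);  /* bug: result narrowed to 16 bits */
    return (uintptr_t)narrow;
}

/* The same arithmetic kept at pointer width. */
static uintptr_t advance_correct(uintptr_t base, uint16_t delta)
{
    assert(base <= UINTPTR_MAX - delta);  /* guard against wraparound */
    return base + delta;
}
```

The truncated result always falls below 0x10000, which is one way a kernel address can turn into a small, invalid pointer.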

-p


Thanks for everyone’s help.

It turned out to be some code that wasn’t thread safe. Why I didn’t see
it before I don’t know.
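
The post does not say what the thread-unsafe code was. As one hypothetical illustration of the general class of bug, the user-mode sketch below uses a C11 atomic exchange so that only one caller ever frees a shared buffer; g_shared_buffer, publish_buffer and release_buffer are invented names. In a driver, InterlockedExchangePointer plays a similar role.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared slot that several threads may try to release. */
static _Atomic(void *) g_shared_buffer;

/* Publish a freshly allocated buffer; returns 0 on allocation failure. */
static int publish_buffer(size_t bytes)
{
    assert(bytes > 0);
    void *fresh = malloc(bytes);
    if (fresh == NULL) {
        return 0;
    }
    /* Swap the new buffer in and free whatever was there before. */
    void *old = atomic_exchange(&g_shared_buffer, fresh);
    free(old);
    return 1;
}

/* Returns 1 if this caller freed the buffer, 0 if it was already gone. */
static int release_buffer(void)
{
    /* Take ownership atomically: only one caller can observe non-NULL. */
    void *owned = atomic_exchange(&g_shared_buffer, NULL);
    if (owned == NULL) {
        return 0;
    }
    free(owned);
    return 1;
}
```

This only prevents two threads from freeing the same block. Code that still reads or writes the buffer while another thread releases it needs a lock or a reference count as well.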

Thanks again!

Justin M. Walker, CDIA+, ECMp  
Performance Testing Engineer  
440.788.5461  
  
Reality is just Chaos with better lighting.  
  