Hi All,
Over the last few months I’ve developed a file system filter driver and
a few other security drivers that work together as part of our AV/AS
product. Once in a while I get nasty random BSODs. I’m sure they have
something to do with my filter driver, as they happen only when the
filter routine is active. The call stacks are different each time, and
none of them shows my code. What they appear to have in common is that
they bugcheck in normal NT kernel functions. For instance:
KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
Arg1: c0000005, The exception code that was not handled
Arg2: 805b2caf, The address that the exception occurred at
Arg3: ef872ae0, Trap Frame
Arg4: 00000000

ef8726a8 804fc973 0000008e c0000005 805b2caf nt!KeBugCheckEx+0x1b
ef872a70 8053d251 ef872a8c 00000000 ef872ae0 nt!KiDispatchException+0x3b1
ef872ad8 8053d202 ef872b78 805b2caf badb0d00 nt!CommonDispatchException+0x4d
ef872b08 805b93a7 00000000 00000023 00000023 nt!KiExceptionExit+0x18a
ef872b78 805b34f6 e1c37ac8 ef872ba0 00000064 nt!ObpParseSymbolicLink+0x30f
ef872bec 805b76a5 00000000 82350334 00000080 nt!ObpLookupObjectName+0x41e
ef872cf0 806039f6 82350368 ef872c2c 00000000 nt!ObInsertObject+0x299
ef872d48 8053c808 0013b118 001f0003 0013b0f8 nt!NtCreateEvent+0xc2
ef872d48 7c90eb94 0013b118 001f0003 0013b0f8 nt!KiFastCallEntry+0xf8
0013b0cc 7c90d664 7c80a6da 0013b118 001f0003 ntdll!KiFastSystemCallRet
0013b0d0 7c80a6da 0013b118 001f0003 0013b0f8 ntdll!NtCreateEvent+0xc
0013b11c 015ed0b0 0013b174 00000000 00000000 kernel32!CreateEventW+0x67
This particular BSOD happened during normal NtCreateEvent request
processing. Judging by the stack, the exception occurred during the NT
object namespace lookup (ObpLookupObjectName / ObpParseSymbolicLink).
I enabled Driver Verifier with all options switched on except low
resource simulation. It didn’t catch the bug in my code. Although these
BSODs surface in unrelated code, I actually suspect that pool memory
allocated by other drivers or by the kernel is being overwritten by my
code. The fact that none of my functions is listed in the BSOD call
stacks suggests that the crash usually happens some time after the
dodgy bit of my code has run.
Having read the Driver Verifier documentation, I don’t believe it would
detect this kind of memory corruption.
My question is: is there any way to detect this type of memory
overwrite?
Thanks for your invaluable advice.
Sean Park
Kernel Driver Developer
PCTools Research Pty Ltd.
www.pctools.com
Hi,
This bug check might indicate a stack overflow problem.
You can try running PrefastDrv with a small stack threshold when you compile; it might help you.
Another thing you can do about the suspected memory overwrite: put some memory breakpoints on your driver's structures, and you will notice if another driver writes to your memory.
Thanks for the comment.
It’s a typical scenario. When running with only the standard Windows
drivers, Windows doesn’t crash. But it crashes a while later when I run
my drivers. I know what the call stack looks like when a stack overflow
occurs. This isn’t a stack overflow issue at all.
Also, I suspect that my driver is overwriting someone else’s memory,
not that another driver is overwriting my driver’s memory pool.
Is there any useful method to detect this sort of thing? I’m looking
forward to the experts’ ideas; they will be greatly appreciated.
Thanks.
Sean Park
Kernel Driver Developer
PCTools Research Pty Ltd.
www.pctools.com
Hi there,
I’ve run into similar things in my own driver work. Some of them came
from not realizing the effect certain calls have on the IRQL.
Some of the things that you might check:
- If you’re using any FAST_MUTEXes, make sure every function called
  while the mutex is held can run at APC_LEVEL, as acquiring the mutex
  raises the IRQL.
- Treat string functions as PASSIVE_LEVEL only unless the documentation
  says otherwise. That includes DbgPrint statements that print out a
  string variable, Unicode ones (%wZ, %ws) in particular (constant
  strings are okay).
- Check any globals and make sure you have a mutex around them if
  necessary.
- Understand what IRQL each of your functions needs to run at.
Unfortunately, I don’t know of any foolproof way to find this type of
bug. I don’t claim to be an expert, but looking through these things
helped me out. Hopefully they will help you too.
Good luck,
Justin
Justin M. Walker, CDIA+, ECMp
Performance Testing Engineer
440.788.5461
Reality is just Chaos with better lighting.
Hi,
Aside from the obvious suggestions of pulling your hair out and drinking
heavily, here are some things that might help you out:
1) Collect as many of these crashes as possible and do your darnedest to
see some sort of pattern. Even if the crashes occur at random places,
look at the corruption that caused the crash and see if anything jumps
out. For example, what’s the output of .trap ef872ae0 on the crash below?
2) DbgPrint every single last pool address your driver comes in contact
with (file objects, device objects, IRPs, allocations, string buffers,
etc). Seems obvious, but as a last resort this can help lead you to the
right piece of code.
3) Take a lot of breaks
HTH,
-scott
–
Scott Noone
Software Engineer
OSR Open Systems Resources, Inc.
http://www.osronline.com
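
One way to make suggestion 2 easier to act on is to keep the logged addresses in a small table as well, so that a suspicious address pulled from a crash dump can be matched against everything the driver touched. What follows is only a user-mode sketch of that idea, not driver code: `track_address`, `track_lookup`, and `TRACK_SLOTS` are hypothetical names, and a real driver would also emit a DbgPrint line per entry and protect the table with a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACK_SLOTS 64              /* hypothetical ring size */

typedef struct {
    uintptr_t   base;               /* start of the logged block */
    size_t      size;               /* its length in bytes */
    const char *what;               /* label, e.g. "file object" */
} track_entry;

static track_entry g_track[TRACK_SLOTS];
static size_t      g_next;

/* Record one address the driver came into contact with.
 * The oldest entry is overwritten once the ring is full. */
void track_address(const void *base, size_t size, const char *what)
{
    assert(what != NULL);
    track_entry *e = &g_track[g_next++ % TRACK_SLOTS];
    e->base = (uintptr_t)base;
    e->size = size;
    e->what = what;
}

/* Return the label of a logged block that contains addr, or that lies
 * within `slack` bytes of it; NULL if nothing logged is that close. */
const char *track_lookup(uintptr_t addr, size_t slack)
{
    for (size_t i = 0; i < TRACK_SLOTS; i++) {
        const track_entry *e = &g_track[i];
        if (e->what == NULL)
            continue;               /* slot never used */
        if (addr >= e->base) {
            if (addr - e->base < e->size + slack)
                return e->what;     /* inside, or just past the end */
        } else if (e->base - addr <= slack) {
            return e->what;         /* just before the start */
        }
    }
    return NULL;
}
```

With something like this, a corrupted address seen in the .trap output could be checked with a small slack value; a hit just past the end of one of your own buffers would point at an overrun of that buffer.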
Related to Scott’s comments, consider using ExFreePoolWithTag and having
your tags include the PROTECTED_POOL flag. I have found some nasty crashes
this way, where a driver I inherited was every so often freeing memory it
did not allocate. Under Driver Verifier, this will catch those errors at
the free.
–
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
http://www.windrvr.com
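
To illustrate the protected-tag idea above, here is a user-mode sketch of the check as I understand it: a block allocated with a tag that carries the protected bit can only be freed by a caller that supplies the same tag. It is a simulation, not kernel code. `sim_alloc_with_tag` and `sim_free_with_tag` are hypothetical stand-ins for ExAllocatePoolWithTag and ExFreePoolWithTag, `MY_TAG` is a made-up tag, and the bit value is defined locally so the example builds outside the WDK. In the kernel a mismatch would bugcheck rather than return an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local copy of the protected-tag bit (assumed to match the WDK's
 * PROTECTED_POOL, 0x80000000). */
#define PROTECTED_POOL_BIT 0x80000000u
#define MY_TAG (0x31746C46u | PROTECTED_POOL_BIT)  /* hypothetical 'Flt1' */

/* Header placed in front of each block; the union keeps the user
 * pointer suitably aligned for any type. */
typedef union {
    uint32_t    tag;
    max_align_t align;
} sim_header;

/* Stand-in for ExAllocatePoolWithTag: remember the tag in the header. */
void *sim_alloc_with_tag(size_t size, uint32_t tag)
{
    sim_header *hdr = malloc(sizeof *hdr + size);
    if (hdr == NULL)
        return NULL;
    hdr->tag = tag;
    return hdr + 1;
}

/* Stand-in for ExFreePoolWithTag. Returns 0 on success and -1 when a
 * protected block is freed with a tag that does not match. */
int sim_free_with_tag(void *p, uint32_t tag)
{
    assert(p != NULL);
    sim_header *hdr = (sim_header *)p - 1;
    if ((hdr->tag & PROTECTED_POOL_BIT) != 0 && hdr->tag != tag)
        return -1;                  /* someone else's block: refuse */
    free(hdr);
    return 0;
}
```

The point of the design is that code freeing a pointer it does not own is unlikely to know the owner's tag, so the mistake shows up at the free itself instead of as a random crash later.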