C++ makes no guarantees about initialization, but this has been answered
extensively in the intervening messages. I use C++ on a daily basis in app
space (not kernel space) and have taught courses in it, and know that it
does not have any such guarantees.
Whatever you may have learned from gdb is irrelevant. I worked with early
Unix debuggers (in fact, I wrote a couple to replace the horrors that Unix
folk thought constituted “debuggers”) and the details of interaction between
the debugger and its child process on Unix are completely different in every
way from the details of how the debugger and the child process interact on
Windows. Windows is cleaner, and as far as I have been able to determine
does not change the stack of the child. Gdb, and its predecessor, dbx, are
really poor models on which to base your experience. They are the most
inhumane debugging systems I have ever had the misfortune to be subjected to
(cdb looks good by comparison!). I’ve also written debugger-like programs on
Windows, so I know the details are quite different.
joe
-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of James Harper
Sent: Saturday, January 15, 2011 4:45 AM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] bug check 0x19 (0x22,0,0,0)
Thanks for taking the time to write this all out Joe. I eventually found the
error, and it was indeed a stupid mistake - I’d removed some code when I
migrated from scsiport to storport and put it back again later but forgot to
increase the size of my device extension accordingly, so whenever I used the
buffer I allocated at the end of the device extension, it overwrote
something.
Generally, this error is almost always the consequence of taking a long walk
off a short pier, that is, a buffer overrun on an array, memcpy, memmove,
their Rtl equivalents, etc. Essentially, you have allocated a block of
storage smaller than was required and you overwrote the header of a storage
block.
Note that the point at which this is detected can be billions of
instructions after the damage actually happens. The error indicates the
damage has finally been *detected*, but this is no indication of what
actually caused it.
Yes. It was being detected in NDIS.
Look out for sizeof(PWHATEVER) if sizeof(WHATEVER) was intended (e.g.,
allocation of your device extension!)
Done that before.
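To illustrate the sizeof mix-up mentioned above, here is a minimal user-mode C sketch. WHATEVER, its fields, and the helper names are made up for the example, and malloc stands in for whatever allocator a driver would really use. Writing sizeof *p ties the size to the variable being assigned, which makes the pointer-versus-structure slip harder to commit.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure; the name and fields are illustrative only. */
typedef struct _WHATEVER {
    unsigned long Flags;
    unsigned char Buffer[256];
} WHATEVER, *PWHATEVER;

/* The bug pattern: this is the size of a pointer (typically 4 or 8 bytes),
   not the size of the structure it points to. */
static size_t wrong_allocation_size(void)
{
    return sizeof(PWHATEVER);
}

/* What was intended: the size of the structure itself. */
static size_t right_allocation_size(void)
{
    return sizeof(WHATEVER);
}

/* Safer idiom: derive the size from the variable being assigned, so the
   type name never has to be spelled out a second time. */
static PWHATEVER allocate_whatever(void)
{
    PWHATEVER p = NULL;

    p = malloc(sizeof *p);
    if (p != NULL) {
        memset(p, 0, sizeof *p);
    }
    return p;
}
```

An allocation made with wrong_allocation_size() would be far too small for the 256-byte Buffer, so the first write into it would run past the end of the block.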
It is not a combination of errors; it is a genuine error in its own right.
Why it is not documented is not explicable. It is extremely unlikely it is
a combination of two other errors. But just check out where you allocate
storage, and make sure all the allocations are large enough. I generally
don’t even pay attention to these parameters, and look for the erroneous
storage allocation or the erroneous overrun and I find the problem quickly
using only “desktop” analysis.
There are a couple other potential causes:
Uninitialized local variable pointers
Uninitialized heap structure pointers
Key here is that you must always write
PWHATEVER p = NULL;
for every local declaration that is not initialized immediately. Otherwise,
if you have a path that can access the uninitialized variable it could be
pointed to free storage in the heap, left over from whatever garbage has
been left on the stack. Note that arguments about “efficiency” are
meaningless and without value, since an optimizing compiler will detect that
the NULL assignment was unnecessary. Also, compile at level W4 (never at
W3) so the compiler can do better analysis. Forget the code you see in
unoptimized code, since unoptimized code is not a good metric for production
code.
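A small sketch of the NULL-at-declaration habit described above, in plain user-mode C. The WHATEVER structure, the use_buffer function, and its allocate flag are all invented for illustration. The point is that the path which never assigns p is caught by an ordinary NULL check instead of dereferencing stack garbage.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical structure, for illustration only. */
typedef struct _WHATEVER {
    int Value;
} WHATEVER, *PWHATEVER;

/* Returns the stored value, or -1 if no buffer was available.
   The 'allocate' flag is a made-up way to create a code path on which
   the pointer is never assigned. */
static int use_buffer(int allocate)
{
    PWHATEVER p = NULL;   /* initialized at the point of declaration */
    int result = -1;

    if (allocate) {
        p = malloc(sizeof *p);
        if (p != NULL) {
            p->Value = 42;
        }
    }

    /* Because p started out NULL, this test is reliable on every path. */
    if (p != NULL) {
        result = p->Value;
    }

    free(p);              /* free(NULL) is defined to do nothing */
    return result;
}
```

Without the initializer, the allocate == 0 path would test and free whatever value happened to be on the stack.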
One of the finer points of difference between C and C++ is that C++
guarantees that variables are initialised to 0, while C doesn’t guarantee
anything. Even worse, one of the first times I used a debugger
(gdb) it did initialise everything to NULL so my bug didn’t happen!
Also, remember this is executing on a pipelined superscalar architecture
which can execute about 6 instructions per nanosecond.
For heap-allocated data, you should zero out the entire structure, which
erases any pointer data left in it from its previous life. You *might*
consider whether or not you want to put this under conditional compilation,
but its cost is so small there is rarely a need to remove it in release
mode. See reference to pipelined superscalar and add “multilevel cache” to
the description, so you realize that zeroing blocks of contiguous memory is
very fast.
But far and away, the most common cause is overwriting blocks of memory.
And also note that just because your driver detected it does not mean your
driver caused it [although the truth is, it is almost always the case that
it was your driver].
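The advice above about zeroing heap-allocated structures might look like the following in user-mode C. This is only a sketch: NODE and allocate_node are hypothetical names, and malloc plus memset stand in for the pool allocator and RtlZeroMemory that kernel code would use.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap structure containing pointers, for illustration only. */
typedef struct _NODE {
    struct _NODE *Next;
    void *Context;
    unsigned long Length;
} NODE, *PNODE;

/* Allocate a node and wipe it, so no pointer values survive from the
   block's previous life. Returns NULL if the allocation fails. */
static PNODE allocate_node(void)
{
    PNODE node = malloc(sizeof *node);

    if (node != NULL) {
        /* All-bits-zero reads back as a NULL pointer on mainstream
           platforms, Windows included; this is assumed here. */
        memset(node, 0, sizeof *node);
    }
    return node;
}
```

In user-mode code, calloc(1, sizeof *node) would do the allocation and the zeroing in one call.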
Thanks again Joe!
James