We have our crash dump driver working, but I just want to confirm some of my
understanding about the execution environment.
When a crash dump miniport is processing crash dump writes (and any other
miniport requests):
- the processor is at IRQL HIGH_LEVEL
- all other processors are stopped (looping at HIGH_LEVEL?)
- devices on the PCI bus have not been turned off yet (via clearing
  the PCI command register or asking other drivers to stop or power down,
  unless they have explicitly asked for a crash dump callback and turned off
  the hardware)
- the state of the interrupt controller is? There were some cases
  where spinlocks were acquired, lowering the IRQL to DISPATCH_LEVEL, which I
  believe is really bad if interrupts are pending; we are fixing this
- pretty much EVERY kernel API is bad (except a few like
  reading/writing ports/memory), including WMI/WPP tracing calls
- all threads except the one doing the crash dump are stopped
  (implied if all processors are locked at HIGH_LEVEL)
- all faults will be bad, so use of SEH is pointless and doesn’t work
- the crash dump ends with a processor reset? (but perhaps not a PCI
  bus reset, as our boot BIOS code sometimes dies after a crash dump and
  doesn’t after a normal shutdown??) This implies boot BIOS code should
  assume boot hardware is NOT freshly powered on or hardware bus reset, and
  may be in an ugly state that needs resetting via software
- IRQL == HIGH_LEVEL is never encountered in normal operation, so we can
  assume that if we are at HIGH_LEVEL we must be crashing (this makes
  bypassing spinlocks much easier)
- can crash dumps happen when the original fault is at elevated IRQL or
  only PASSIVE_LEVEL? (DISPATCH or DIRQL or ???) This matters because if they
  never happen at elevated IRQL, our code can help protect the crash dump
  driver by raising the IRQL when changing data structures that might be used
  by the crash dump driver.
- crashing with no dump is much better than risking system disk
  corruption
- I’ve noticed 1394 debugging doesn’t exactly single step correctly
  while in a crash dump; there was recently a comment here that serial
  debugging was much better in that case?
- we felt that if we were going to boot using our storage device, not
  making crash dumps also work was pretty unacceptable
- does all this apply to writing hibernation data? Seems like there are
  hiber_xxx drivers
- We have done all this in Win 2003; what changes about crash dumps in
  Win 2008?
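
To make the spinlock-bypass point above concrete, here is a minimal sketch of
the kind of wrapper I mean. It is a user-mode mock so it compiles outside the
kernel: mock_current_irql() stands in for KeGetCurrentIrql(), the "held" flag
stands in for a real KSPIN_LOCK, and DEV_LOCK, dev_lock_acquire and
dev_lock_release are hypothetical names, not anything from the WDK. It also
assumes that HIGH_LEVEL really does mean "we are in the dump path", which is
one of the things I am asking about.

```c
#include <assert.h>
#include <stdbool.h>

/* User-mode stand-ins; in a real driver these come from wdm.h. */
typedef unsigned char KIRQL;
#define PASSIVE_LEVEL   0
#define DISPATCH_LEVEL  2
#define HIGH_LEVEL      31   /* x86 value; x64 uses 15 */

/* Hypothetical mock of the current processor IRQL. */
static KIRQL g_mock_irql = PASSIVE_LEVEL;

/* Stand-in for KeGetCurrentIrql(). */
static KIRQL mock_current_irql(void)
{
    return g_mock_irql;
}

/* Hypothetical lock wrapper; 'held' stands in for a real KSPIN_LOCK. */
typedef struct {
    bool  held;       /* true while the mock spinlock is owned */
    bool  took_lock;  /* did acquire actually take the lock? */
    KIRQL old_irql;   /* IRQL to restore on release */
} DEV_LOCK;

static void dev_lock_acquire(DEV_LOCK *l)
{
    if (mock_current_irql() == HIGH_LEVEL) {
        /* Assumed dump path: single threaded, so skip the lock and
           leave the IRQL alone rather than dropping to DISPATCH_LEVEL. */
        l->took_lock = false;
        return;
    }

    /* Normal path; a driver would call KeAcquireSpinLock() here,
       which leaves the processor at DISPATCH_LEVEL. */
    l->old_irql = g_mock_irql;
    g_mock_irql = DISPATCH_LEVEL;
    assert(!l->held);            /* the mock does not model contention */
    l->held = true;
    l->took_lock = true;
}

static void dev_lock_release(DEV_LOCK *l)
{
    if (!l->took_lock) {
        return;                  /* nothing was taken on the dump path */
    }

    /* Normal path; a driver would call KeReleaseSpinLock() here. */
    l->held = false;
    l->took_lock = false;
    g_mock_irql = l->old_irql;
}
```

The point of recording took_lock is that release never has to re-check the
IRQL, so acquire and release stay paired even if the assumption about
HIGH_LEVEL turns out to be wrong somewhere.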
So what am I missing?
Jan