Hi,
I have a physical machine that I need to debug before the Inaccessible boot device BSOD happens, so I can understand what's going on.
My question is: how should I approach this? I have never debugged a physical machine before, let alone an Inaccessible boot device BSOD, which means that even the disk stack is probably not initialized.
So how can I debug this kernel and find out what is causing the BSOD? Should I use the KDNET debugging method, connect another machine via Ethernet, and then turn on debug mode on the target machine during boot? Is the network stack even working at that stage? And how can I generate the network key to give to WinDbg when the target machine is not booting at all?
I basically just want to know what is causing this, and since this is an Inaccessible boot device BSOD, no memory.dmp is generated.
Comments
Also, I should note that when I boot from a Windows disk into repair mode, open a command prompt, and run bcdedit /debug on, it says the system cannot find the file specified. The bcdedit /dbgsettings net hostip:w.x.y.z port:n command works and gives me a key, but when I try to use that key on the host to connect to the target, it doesn't work and I can't connect. (I press F8 during boot and turn on debugging mode.)
Yes, you need to debug the target machine… using windbg and preferably via Ethernet.
In addition to /debug on, you’ll want to specify /bootdebug on. Without the /bootdebug switch, you can’t connect until system start time.
Peter Viscarola
OSR
@OSRDrivers
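For reference, the settings mentioned so far in this thread can be collected into a short sketch. It only builds the command strings and does not run anything. The helper name `kdnet_commands`, the host IP, the port, and the key are all placeholder assumptions, and the optional `entry` argument (for example `{default}`) is likewise an assumption about which boot entry needs to be targeted from a recovery prompt.

```python
def kdnet_commands(host_ip, port, key=None, entry=None):
    """Build (not run) the bcdedit lines for the target and the WinDbg line for the host.

    Hypothetical helper for illustration only; all argument values are placeholders.
    """
    # Optional boot entry identifier, e.g. "{default}"; omitted means the current entry.
    ident = f" {entry}" if entry else ""

    dbgsettings = f"bcdedit /dbgsettings net hostip:{host_ip} port:{port}"
    if key:
        # If no key is given, bcdedit generates one and prints it.
        dbgsettings += f" key:{key}"

    target = [
        dbgsettings,
        f"bcdedit /debug{ident} on",      # kernel debugging
        f"bcdedit /bootdebug{ident} on",  # boot debugging, so the connection is possible earlier
    ]

    # The host side needs the same port and key that the target was configured with.
    host = f"windbg -k net:port={port},key={key}" if key else None
    return target, host


# Placeholder values, not taken from the thread.
target_lines, host_line = kdnet_commands("192.168.1.10", 50000, key="1.2.3.4")
for line in target_lines:
    print(line)
print(host_line)
```

The port and key passed to WinDbg on the host have to match what was set on the target, which is why the sketch derives both sides from the same arguments.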
So I managed to attach to the target machine using KDNET over Ethernet. The only suspicious thing I found was that when I brought up the command prompt in repair mode and ran list disk in diskpart, there was no * in the Gpt column for any disk, even though the system is UEFI. Is this normal? If not, what does it mean?
And how do we usually pinpoint what is causing the INACCESSIBLE_BOOT_DEVICE BSOD? I looked through the upper and lower filters of the disk class and no third-party driver was installed. There was also nothing interesting on the stack of any core when the BSOD happens (it happens in PnpBootDeviceWait).
The first argument of the BSOD is just the ARC string of the boot disk ("\ArcName\multi..) and the second is 0xC0000034: STATUS_OBJECT_NAME_NOT_FOUND.
Nothing was changed in the BIOS settings recently either, and there was no hardware change or anything like that.
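As a side note on the second bugcheck argument quoted above: an NTSTATUS value packs a severity, a facility, and a code into 32 bits, and a minimal sketch like the following splits 0xC0000034 into those fields. The function name and the tiny name table are assumptions for illustration; the only status name included is the one already given in this thread.

```python
# Severity values from the top two bits of an NTSTATUS.
SEVERITY_NAMES = {0: "SUCCESS", 1: "INFORMATIONAL", 2: "WARNING", 3: "ERROR"}

# Only the status named in this thread; not a complete table.
KNOWN_NAMES = {0xC0000034: "STATUS_OBJECT_NAME_NOT_FOUND"}


def decode_ntstatus(status):
    """Split a 32-bit NTSTATUS into its bit fields (hypothetical helper)."""
    status &= 0xFFFFFFFF
    return {
        "severity": SEVERITY_NAMES[(status >> 30) & 0x3],  # bits 31-30
        "customer": bool((status >> 29) & 0x1),            # bit 29
        "facility": (status >> 16) & 0xFFF,                # bits 27-16
        "code": status & 0xFFFF,                           # bits 15-0
        "name": KNOWN_NAMES.get(status, "unknown"),
    }


decoded = decode_ntstatus(0xC0000034)
print(decoded)
```

The severity field confirms that the value is an error status rather than a warning or informational one, which matches a lookup that failed to find the named object.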
What do !storagekd.storunit and !storagekd.storclass say?
-scott
OSR
Hi Scott, this is the output of the commands (this is actually from a VMware-based guest that had the same problem, not the physical machine):
OK, so the storage adapter is enumerating the LUN but the disk driver failed to start for some reason. Does !devnode 0 21 say anything? Also, check whether any upper or lower filters are registered for disk:
!reg querykey \REGISTRY\MACHINE\SYSTEM\ControlSet001\Control\Class\{4d36e967-e325-11ce-bfc1-08002be10318}
-scott
OSR
So how did you figure this out (that the storage adapter is enumerating the LUN but the disk driver failed to start)? I'm asking because I want to learn what storage experts look for in the output of these commands in situations like this. There is a lot in the output of these commands that I don't understand.
This is the output of the command that you asked for:
The storage adapter enumerates the bus and creates a PDO for each storage device it finds. StorPort calls these "units", so !storunit shows you things the storage adapter found:
The function driver for a disk is going to be the Disk Class Driver. It gets notified of the arrival of the disk and then creates an FDO for the disk device. This is what you'll see in !storclass:
You can see the relation between the unit PDO and disk FDO with !devstack:
Two other things I can think of:
-scott
OSR
Thank you for the detailed answer Scott,
This is the output I get when I dump the system event log:
Unfortunately, there doesn't seem to be much useful information in it, and I couldn't find anything useful about the "Event metadata not found" error in the log by googling, only one OSR thread without any answer.
Also, disk!DiskAddDevice never gets called when I put a breakpoint on it (I set the breakpoint very early, with the help of cycling the initial break), although its DriverEntry does get called, so at least the driver gets loaded.
Very mysterious... Any chance you can put the dump somewhere that I can take a look? Not sure what I'm looking for yet, but it's a strange one.
Also: The GUIDs are the providers but their manifests aren't registered for some reason. You can use logman to see if the provider is registered on your host:
You can extract the manifest with the PerfView utility:
PerfView userCommand DumpRegisteredManifest {15CA44FF-4D7A-4BAA-BBA5-0998955E531E}
https://github.com/microsoft/perfview/releases/tag/v3.0.4
That being said, it doesn't look like any of those messages are interesting...
-scott
OSR
Unfortunately, we are not allowed to share the dump files, as they might contain customer data.
Can this happen because of a corrupted GPT partition? Note that when I boot the machine using a live Windows disk, the boot partition and its files and folders are detected without any problems.
Also, could this be happening because of a UEFI bootkit? Any suggestions on what other commands I should try?
Have you checked your IRP_MN_START_DEVICE routine? Have you handled it properly? I suspect this is causing the inaccessible boot device error.
That can't be the issue, because we currently do not have any filter driver registered on this machine. You can check the
!reg querykey
output that I shared above.