I’m having some Windows driver BSOD issues and I’m looking for suggestions.
I’m working on a Windows 7 (64 bit) PC. I’m using VS2015 and WDK-10.
I used VS2015 to write a very minimal driver (debug - x64): it basically is just a “DriverEntry” and a “DriverUnload” function, only doing some DbgPrints. I built it (test signed it, and running Windows 7 in Test Mode), I used OSRLOAD to register and load it, and everything works fine. I can load and unload it at will, and I see my DbgPrints in DebugView.
So far so good.
Then I wanted to add some more functionality and found the famous WDK ‘Toaster’ driver sample source.
So, I decided to first build that, and then study it.
I used the provided solution file, and I built the project wdfSimple.
I used the same set-up as above; the ONLY things I changed were a) I added test-signing certificate data to the properties, and b) I changed the default setting from debug-x32 to debug-x64.
It built fine! But when I use OSRLOAD to load it (after registering it), it bluescreens.
Since I’m not yet on a 2 machine develop-test set-up (I just wanted to do some quick tests), and thus can’t debug it well, I simply started trimming it down to see what caused the bluescreen (I know, not a very professional procedure).
It kept dying on me. So at last I simply commented out ALL the code in my toaster.c and pasted in the code that worked just fine in my own minimal driver. Built it, signed it, loaded … and it bluescreens as well! Same code!
I have been comparing project properties and nothing jumps out at me as being different between the projects. The generated .sys files both have the exact same number of bytes (in their ‘data’ section).
I uploaded the minidump to OSR’s analyzer and it tells me the following, basically suggesting there’s a break point being set.
"
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M (1000007e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG.
"
I have no breakpoints set, VS2015 has ‘delete all breakpoints’ grayed out, and the address of where the violation occurs (602d) is way outside the memory used by my driver, based on the link map I had it generate for this purpose.
So it really looks like my code is stomping on something it shouldn’t be accessing.
I’m out of ideas … and I’m also wondering: Why can’t I build and run the ‘toaster’ sample driver, as is, out of the box? It must, surely, have been built by tens of thousands of people! Can’t find anything with Google on a blue-screening toaster sample driver either!
It MUST be something in the build procedure/project properties, because the source code is now identical between ‘my’ toaster.c and my little minimal driver that works just fine.
Somehow this feels like it’s something simple, but I can’t think of anything.
Any ideas will be very very welcome.
The toaster sample drivers are all plug and play which OSRLOADER does not
support. You are going to have to use the INF file with either device
manager or devcon to load the driver.
Thanks for your response.
Which triggers 2 more questions
Is the PnP nature of the driver reflected somewhere in the project’s properties? (And can I change that?). After all, the code, right now, has been reduced to a simple DriverEntry and DriverUnload … copied from a driver that DOES work just fine (when loaded with OSRLOAD)
Doesn’t the blue screen indicate that OSR actually DID manage to load it?
> Is the PnP nature of the driver reflected somewhere in the project’s properties? (And can I change that?). After all, the code, right now, has been reduced to a simple DriverEntry and DriverUnload … copied from a driver that DOES work just fine (when loaded with OSRLOAD)
No, there’s no external property about this. Essentially, a non-PnP
driver creates its device object within DriverEntry. A PnP driver
registers an AddDevice handler and creates its device objects within
AddDevice. The AddDevice handler gets called when a hardware ID match
is found within an INF file.
> Doesn’t the blue screen indicate that OSR actually DID manage to load it?
Yes, although a mismatch between “thought i was PnP” and “not really
PnP” can cause it. You might post the entire dump analysis.
By the way, you can configure your test machine to create a full dump
(“memory.dmp”) during a BSOD, then analyze that with “windbg -z”. In
many cases, that’s almost as productive as live kernel debugging.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
Your “a mismatch between ‘thought i was PnP’ and ‘not really PnP’ can cause it” … sounds promising … but where does this mismatch come from, if not from the source?
Also, and this is what makes it such a mystery: it works fine when built in one VS2015 project, but not when I build this with the MS provided toaster project, when I replace their code with the code above: it STILL blue-screens.
As for posting the entire analysis … I see that I “may not” post attachments here: what can I do to get that permission? (Posting the analysis in a reply is probably a bit much).
Okay, here goes (didn't expand the raw stack, since that would make the post too big for posting here):
Instant Online Crash Analysis, brought to you by OSR Open Systems Resources, Inc.
Crash Dump Analysis provided by OSR Open Systems Resources, Inc. (http://www.osr.com)
Online Crash Dump Analysis Service
See http://www.osronline.com for more information
Windows 7 Kernel Version 7601 (Service Pack 1) MP (4 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 7601.23807.amd64fre.win7sp1_ldr.170512-0600
Machine Name:
Kernel base = 0xfffff80002e61000 PsLoadedModuleList = 0xfffff800030a3750
Debug session time: Tue Aug 22 12:02:17.631 2017 (UTC - 4:00)
System Uptime: 0 days 2:11:11.631
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M (1000007e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: ffffffff80000003, The exception code that was not handled
Arg2: fffff8800dca602d, The address that the exception occurred at
Arg3: fffff880033a1728, Exception Record Address
Arg4: fffff880033a0f90, Context Record Address
Debugging Details:
TRIAGER: Could not open triage file : e:\dump_analysis\program\triage\modclass.ini, error 2
EXCEPTION_CODE: (HRESULT) 0x80000003 (2147483651) - One or more arguments are invalid
FAULTING_IP:
wdfsimple+602d
fffff880`0dca602d cc int 3
FOLLOWUP_IP:
wdfsimple+602d
fffff880`0dca602d cc int 3
SYMBOL_STACK_INDEX: 0
SYMBOL_NAME: wdfsimple+602d
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: wdfsimple
IMAGE_NAME: wdfsimple.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 599c5533
STACK_COMMAND: .cxr 0xfffff880033a0f90 ; kb
FAILURE_BUCKET_ID: X64_0x7E_wdfsimple+602d
BUCKET_ID: X64_0x7E_wdfsimple+602d
Followup: MachineOwner
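For anyone following the analysis, the arithmetic behind these fields can be sketched in a few lines of user-mode C. This is only an illustration: the function and macro names are made up here, and the constants come from the dump above. It shows that Arg1 is the 32-bit status 0x80000003 (a breakpoint) sign-extended to 64 bits, and that subtracting the reported offset 0x602d from the faulting address gives the base the analyzer attributed to wdfsimple. In other words, the faulting instruction (opcode cc, int 3) is reported as being inside wdfsimple.sys.

```c
#include <stdint.h>
#include <stdio.h>

/* NTSTATUS value for STATUS_BREAKPOINT and the x86/x64 "int 3" opcode.
   The SKETCH_ names are illustrative, not taken from any header. */
#define SKETCH_STATUS_BREAKPOINT 0x80000003u
#define SKETCH_INT3_OPCODE       0xCCu

/* Bugcheck arguments are 64-bit; the exception code is the low 32 bits. */
uint32_t exception_code_from_arg1(uint64_t arg1)
{
    return (uint32_t)(arg1 & 0xFFFFFFFFu);
}

/* "module+offset" means: faulting address = module base + offset. */
uint64_t module_base_from_fault(uint64_t fault_addr, uint64_t offset)
{
    return fault_addr - offset;
}

/* Returns 1 when the exception is a breakpoint raised by an int 3 byte. */
int is_hardcoded_breakpoint(uint32_t code, uint8_t opcode)
{
    return code == SKETCH_STATUS_BREAKPOINT && opcode == SKETCH_INT3_OPCODE;
}

/* Prints the decoded values for one set of bugcheck fields. */
void print_decoded(uint64_t arg1, uint64_t fault_addr,
                   uint64_t offset, uint8_t opcode)
{
    uint32_t code = exception_code_from_arg1(arg1);

    printf("exception code : 0x%08X\n", (unsigned)code);
    printf("module base    : 0x%016llX\n",
           (unsigned long long)module_base_from_fault(fault_addr, offset));
    printf("int 3 in image : %s\n",
           is_hardcoded_breakpoint(code, opcode) ? "yes" : "no");
}
```

For the dump above, calling `print_decoded(0xffffffff80000003ULL, 0xfffff8800dca602dULL, 0x602d, 0xcc)` from a small `main` reports a breakpoint located in the image, which fits the /NODEBUG explanation in the bugcheck text.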
You should learn to set up a test machine for kernel debugging and attach a debugger. Use a virtual machine if you don’t have two physical machines. Start here:
Are you sure you are building a Windows 7 compatible binary? Open the project’s property pages, then in ‘Driver Settings’, make sure the ‘Target OS Version’ property is set to ‘Windows 7’ and the ‘Target Platform’ property is set to ‘Desktop’. It is best to copy the current configuration to a new one first. For instance, copy the ‘Debug’ configuration to a new one named ‘Win7 Debug’, select this new configuration, and then set the properties cited above to the correct values.
> Your “a mismatch between ‘thought i was PnP’ and ‘not really PnP’ can cause it” … sounds promising … but where does this mismatch come from, if not from the source?
> All my driver does is this:
>
>     #include <ntddk.h>
>
>     /* Called when the driver is unloaded. */
>     void DriverUnload(PDRIVER_OBJECT pDriverObject)
>     {
>         /* %p prints the full 64-bit pointer; %X would truncate it on x64. */
>         DbgPrint("drpaul: Driver unloading. [0x%p]\n", pDriverObject);
>     }
>
>     /* Entry point: registers the unload routine and logs the registry path. */
>     NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
>     {
>         DriverObject->DriverUnload = DriverUnload;
>         DbgPrint("drpaul: Driver started. %wZ\n", RegistryPath);
>
>         return STATUS_SUCCESS;
>     }
>
> Also, and this is what makes it such a mystery: it works fine when built in one VS2015 project, but not when I build this with the MS provided toaster project, when I replace their code with the code above: it STILL blue-screens.
But in the second case, you’ve installed this with an INF file that has a PnP ID, right? So, the I/O manager is going to try to call your AddDevice handler, and you don’t have one.
> As for posting the entire analysis … I see that I “may not” post attachments here: what can I do that get that permission? (Posting the analysis in a reply is probably a bit much).
Not at all. Most of us love digging through a dump analysis looking for a smoking gun.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
So far I have exclusively used osrload to load and unload these drivers.
I assumed that no .inf files were involved in that. Am I wrong?
So, what makes a driver a PnP driver, when I only have the two simple functions (DriverEntry and DriverUnload) in my driver?
Besides, for an ‘on demand’ driver, other than osrload or ‘net start’ I wouldn’t know how to load a driver, supposedly through an .inf file. Any hints there?
One more observation, if I move the original toaster source from my failing project, to the project that worked just fine with the two above mentioned functions, it builds fine, but now osrload gives me the error “The service cannot be started, either because it is disabled or it has no enabled devices associated with it”
So,
a. Why is this?
and
b. and what could possibly be the difference between the 2 projects/builds, where one won’t load the driver, and the other will, but blue-screens?
What am I missing?
Isn’t osrload the proper way to get this driver loaded?
I even did this: In VS2015, I started a new solution: picked as template: Driver, KMDF.
Only changed the solution from 32 to 64 bit, target OS = Windows 7, and added my certificate for signing. Then I built all the VS2015 generated code … didn’t do anything to the code.
When I load it with osrload, I get that same error “The service cannot be started, either because it is disabled or it has no enabled devices associated with it” … Again, this is just standard generated code. Out of the box … didn’t touch it. Why doesn’t that work?
> So far I have exclusively used osrload to load and unload these drivers.
> I assumed that no .inf files were involved in that. Am I wrong?
If you built the “toaster” sample, that driver has an INF file, because
it is PnP. You can’t use osrload to load and unload it. It expects to
be loaded by the PnP system in response to its hardware ID appearing.
> So, what makes a driver a PnP driver, when I only have the two simple functions (DriverEntry and DriverUnload) in my driver?
Nothing, but the registry plays a huge part here. There is an entry in
HKLM\System\CurrentControlSet\Services that describes this driver.
“osrload” creates that key (using CreateService) if it does not already
exist. But if the system has ever seen the INF file for the “toaster”
version, then it also has an entry in HKLM\System\CurrentControlSet\Enum
that names the hardware ID from the INF file, and that is the basis for
triggering the load of a PnP device. If that key still exists, it could
be causing your driver to get loaded as a PnP driver.
> Besides, for an ‘on demand’ driver, other than osrload or ‘net start’ I wouldn’t know how to load a driver, supposedly through an .inf file. Any hints there?
“osrload” and “net start” and “sc start” all load legacy drivers, which
are managed and loaded via the Service Manager. Most drivers today are
PnP drivers, which are loaded through their INF file, after the driver
package (INF plus SYS) is pre-installed in the driver store, or loaded
through Device Manager.
> One more observation, if I move the original toaster source from my failing project, to the project that worked just fine with the two above mentioned functions, it builds fine, but now osrload gives me the error “The service cannot be started, either because it is disabled or it has no enabled devices associated with it”
> So,
> a. Why is this?
> and
> b. and what could possibly be the difference between the 2 projects/builds, where one won’t load the driver, and the other will, but blue-screens?
Check “link /dump /imports”. See if the second project is adding a
reference to a driver that isn’t loaded.
I suppose you could send me the directories, and I’ll see which of the
settings are weird.
> Isn’t osrload the proper way to get this driver loaded?
If it is a legacy driver, “osrload” is fine. Osrload does both the
“install” and “load” process. Once it is installed (by copying into
\windows\system32\drivers and creating the Services entry), it can be
loaded through “net start”.
> I even did this: In VS2015, I started a new solution: picked as template: Driver, KMDF.
> Only changed the solution from 32 to 64 bit, target OS = Windows 7, and added my certificate for signing. Then build all the VS2015 generated code … didn’t do anything to the code.
> When I load it with osrload, I get that same error “The service cannot be started, either because it is disabled or it has no enabled devices associated with it” … Again, this is just standard generated code. Out of the box … didn’t touch it. Why doesn’t that work?
You’re asking for KMDF here, even though you aren’t using KMDF. What
operating system are you loading on? If you’re running Windows 7 but
building for Windows 10, it could be that the wrong version of KMDF is
in place.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
I don’t know if you can set the DRIVER_OBJECT unload routine in a KMDF driver. As far as I know, this field is reserved for KMDF.
You have to create a WDF driver object with WdfDriverCreate and set the WDF unload routine in the WDF_DRIVER_CONFIG structure. You don’t want PnP, so you could probably call WDF_DRIVER_CONFIG_INIT with a NULL EvtDriverDeviceAdd routine, as this parameter is optional.
I would use the KMDF driver template provided with VS 2015. You would have a PnP driver that would install very easily with the command (admin):
> So,
> a. Why is this?
> and
> b. and what could possibly be the difference between the 2 projects/builds, where one won’t load the driver, and the other will, but blue-screens?
> What am I missing?
There’s another possibility. You are building your bare minimum code
with a “KMDF” project. The KMDF project sets the initial starting
address to gDriverEntry inside of KMDF, and then KMDF redirects things
to your DriverEntry. Your DriverEntry is returning without calling
WdfDriverCreate, and the framework is probably throwing up at that.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
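A rough sketch of what the WdfDriverCreate suggestions above could look like for a software-only (non-PnP) KMDF driver follows. It assumes the KMDF headers from the WDK; `SketchEvtDriverUnload` is a made-up name, and the `drpaul` prefix is borrowed from the code posted earlier in the thread. The idea is that DriverEntry calls WdfDriverCreate, passes no EvtDriverDeviceAdd callback, sets the WdfDriverInitNonPnpDriver flag, and registers the unload routine through WDF_DRIVER_CONFIG rather than through DriverObject->DriverUnload. It has not been built or loaded here, and whether it starts also depends on the KMDF version available on the target machine.

```c
#include <ntddk.h>
#include <wdf.h>

DRIVER_INITIALIZE     DriverEntry;
EVT_WDF_DRIVER_UNLOAD SketchEvtDriverUnload;   /* hypothetical name */

/* Framework-level unload callback; replaces DriverObject->DriverUnload. */
VOID SketchEvtDriverUnload(WDFDRIVER Driver)
{
    UNREFERENCED_PARAMETER(Driver);
    DbgPrint("drpaul: Driver unloading.\n");
}

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    WDF_DRIVER_CONFIG config;
    NTSTATUS          status;

    /* No EvtDriverDeviceAdd callback: this driver has no PnP devices. */
    WDF_DRIVER_CONFIG_INIT(&config, WDF_NO_EVENT_CALLBACK);

    /* Tell the framework that this is a non-PnP driver. */
    config.DriverInitFlags |= WdfDriverInitNonPnpDriver;
    config.EvtDriverUnload  = SketchEvtDriverUnload;

    /* The framework driver object must be created before returning. */
    status = WdfDriverCreate(DriverObject,
                             RegistryPath,
                             WDF_NO_OBJECT_ATTRIBUTES,
                             &config,
                             WDF_NO_HANDLE);
    if (!NT_SUCCESS(status)) {
        DbgPrint("drpaul: WdfDriverCreate failed 0x%08X\n", status);
        return status;
    }

    DbgPrint("drpaul: Driver started. %wZ\n", RegistryPath);
    return STATUS_SUCCESS;
}
```

The design point is the one raised in the replies: in a KMDF project the framework's entry point runs first and hands control to DriverEntry, so a DriverEntry that returns success without ever calling WdfDriverCreate leaves the framework in a state it does not expect.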
Thanks for your generous and extensive comments.
I’ll have quite a few things to try now … and somehow I get the feeling that this whole PnP thing is what’s causing me problems.
But just one more question about loading a driver: You mention that some, non-legacy, drivers can’t be loaded by osrload (net start, etc). And you mention using the .inf file and device manager.
The problem is that my drivers don’t have any hardware associated with them. They’re just software drivers. How do I load those?
I guess the answer is probably going to be: if they don’t ‘drive’ hardware, they don’t need to be PnP, and thus they will be legacy drivers, so you CAN load it with osrload.
But apparently my drivers ARE somehow PnP. I have no control over that. As soon as WDF functions are called, or even linked in, apparently, things go sideways.
Here’s a simple scenario, which is probably quick and easy to follow if you happen to have VS2015 (maybe earlier versions do the same) and WDK 10 installed.
Open a New -> Project
Select: Installed->Templates->Other Languages->Visual C++->Windows Driver->WDF and select Kernel Mode Driver
Let VS generate all the code. Select the proper CPU setting, and if you need a test-signed driver, add your certificate info to the project file
Build it. You now have a nice driver, built okay, based on Microsoft-generated code.
Now my question is … HOW do I load this driver? osrload won’t do it (after registering), and there are no devices involved.
Excellent! That was the information I was looking for! (For this ‘loading’ part)
As for taking your guys’ driver seminar … What? Are you suggesting that reading your excellent book wasn’t enough? Okay, just kidding, besides, I read it some 18 years ago, and I MAY have forgotten a few details here and there after I quickly (after reading the book) moved to firmware development, plus another few details about Windows drivers have changed a bit too, I’m now finding out.
And I would LOVE to take your seminars. If only my boss would let me (time and money wise!).
(And I know … those seminars are, in the end, money saving investments … how do I convince upper management of that, I don’t know).