NDIS driver causing bugcheck on reboot of Windows upgrade

We have a case where our NDIS filter driver appears to be causing a bugcheck during bootup after Windows has been upgraded. Here’s the breakdown:

* Our NDIS filter driver has been in service for several years without a single incident.
* Customer has our driver installed on Windows Server 2008R2.
* Customer upgrades Windows to Windows Server 2012, choosing the upgrade path.
* On the final boot of Windows Server 2012 of the upgrade process, the kernel bugchecks giving the message: SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (ourdriver.sys)

I’m unable to get a kernel dump of this to investigate. Obviously it appears our driver isn’t properly catching an exception in one of the startup routines but I don’t understand why it would be different after the upgrade. If the driver is installed on Windows Server 2012 it works fine - it’s only with the upgrade process it has an apparent problem.

Is it possible to get a kernel dump generated in a case like this? I’ve tried attaching the driver to another VM but the dump wasn’t written after the reboot. Is it possible to kernel debug the machine during the upgrade process?

Any information is greatly appreciated.
Thank you,
Chris

The name “SYSTEM_THREAD_EXCEPTION_NOT_HANDLED” is misleading. (Like most bugcheck codes, sigh.) Here’s what the name really means. When somebody spins up a system thread (PsCreateSystemThread) that thread’s real entrypoint is an internal kernel function. That function looks like this:

VOID PspThisIsTheRealThreadEntrypoint(. . .)
{
    __try {
        (ThreadStart)(StartContext);
    } __except (KeBugCheck(SYSTEM_THREAD_EXCEPTION_NOT_HANDLED)) {
        NOTHING;
    }
}

So if any SEH exception unwinds up to this thread entrypoint, you get a bugcheck SYSTEM_THREAD_EXCEPTION_NOT_HANDLED. It’s not saying “gosh you should handle more exceptions” or “something’s wrong with the system thread pool”. And because all sorts of work happens on a system thread, knowing that the exception happened on one tells you nearly nothing. If the exception had happened in a thread that was started from usermode, you probably would have gotten a SYSTEM_SERVICE_EXCEPTION. Or if the exception happened elsewhere, a KMODE_EXCEPTION_NOT_HANDLED. The message is only that *somebody* had an exception and the thread that was below it happened to be a system thread.

Anyway, that’s a good story. But what you wanted to know was how to debug issues during setup. And the best way to do that is to attach a kernel debugger during setup. Use the /1394debug, /netdebug, or /debug (for COM) flags to setup.exe. http://technet.microsoft.com/en-us/library/hh824834.aspx

Thank you, that is actually very helpful. Also thanks for the link, I didn’t realize that could be done.

I was able to get a debugger attached to the upgraded OS. I did it by simply having the original 2008R2 install set up for debugging; that configuration carries over to the upgraded OS when it boots. I stepped through my DriverEntry, and the exception was caused by my DriverEntry not properly handling a case where it returns a failure and thus causes the driver to be unloaded. My driver didn’t properly clean things up, which resulted in nasty things after the unload. I resolved that issue, but…

The reason my DriverEntry is returning a failure result in the first place is because NdisFRegisterFilterDriver() is returning NDIS_STATUS_FAILURE. My DriverEntry returns the failure in this case, which seems reasonable. The good thing is my driver now properly handles this and nasty things no longer happen. But, once the upgraded Win2012 starts up, my LWF is no longer installed/registered. We only see NdisFRegisterFilterDriver() fail in this upgrade case.
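To illustrate the failure-path fix described above, here is a minimal user-mode sketch of the pattern, not the actual driver code. Every name in it (sim_driver_entry, fake_register_filter, driver_globals, and the status constants) is a hypothetical stand-in; in the real driver the failing call is NdisFRegisterFilterDriver(). The point it shows: when DriverEntry fails, the unload routine is not called, so DriverEntry must release everything it acquired before returning the error.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel status codes. */
typedef int status_t;
#define STATUS_OK      0
#define STATUS_FAILURE (-1)

/* Hypothetical globals standing in for resources a driver sets up early. */
struct driver_globals {
    void *work_buffer;       /* e.g. a pool allocation or lookaside list */
    int   device_created;    /* e.g. a control device object */
    int   filter_registered; /* set only once registration succeeds */
};

static struct driver_globals g;

/* Simulates the registration call: fails when the registry data is missing. */
static status_t fake_register_filter(int registry_ok)
{
    return registry_ok ? STATUS_OK : STATUS_FAILURE;
}

/* Simulated DriverEntry: acquire resources, register, undo on failure. */
status_t sim_driver_entry(int registry_ok)
{
    status_t status;

    g.work_buffer = malloc(64);
    if (g.work_buffer == NULL) {
        return STATUS_FAILURE;
    }
    g.device_created = 1;

    status = fake_register_filter(registry_ok);
    if (status != STATUS_OK) {
        goto fail;
    }
    g.filter_registered = 1;
    return STATUS_OK;

fail:
    /* No unload callback will run after a failed entry, so undo everything
       here, in reverse order of acquisition. */
    g.device_created = 0;
    free(g.work_buffer);
    g.work_buffer = NULL;
    return status;
}

/* Simulated unload routine: only reached after a successful entry. */
void sim_driver_unload(void)
{
    g.filter_registered = 0;
    g.device_created = 0;
    free(g.work_buffer);
    g.work_buffer = NULL;
}

/* Returns nonzero if any resource is still held. */
int sim_resources_outstanding(void)
{
    return g.work_buffer != NULL || g.device_created || g.filter_registered;
}
```

Calling sim_driver_entry(0) models the upgrade case: registration fails, the error is returned, and sim_resources_outstanding() reports nothing left behind.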

So I’d like to ask:
* Is it expected that NdisFRegisterFilterDriver() will return NDIS_STATUS_FAILURE in some cases, for example in an upgrade process? Or might there be something wrong with our driver that is causing it? Again, it only fails in this upgrade. The driver works fine if I install it after the upgrade is complete and has always worked.
* Is the NdisFRegisterFilterDriver() failure related to my LWF no longer being installed after the OS upgrade or might that be a separate issue?
* How might I debug why NdisFRegisterFilterDriver() is failing?

I do appreciate your help.
–Chris

An update to this: I stepped through NdisFRegisterFilterDriver and observed the following:

NdisFRegisterFilterDriver calls ndisReadFilterDriverRegistry. ndisReadFilterDriverRegistry calls RtlQueryRegistryValuesEx with the following key:
“Network\{4d36e974-e325-11ce-bfc1-08002be10318}\{D8E7DFB3-0301-4EA9-ABA6-6B61C815482E}\Ndi”, which is our LWF.

RtlQueryRegistryValuesEx fails with STATUS_OBJECT_NAME_NOT_FOUND. As a result ndisReadFilterDriverRegistry returns NDIS_STATUS_FAILURE.

Is it expected the registry key won’t exist in this stage of the upgrade?

Thank you,
Chris

I’m glad to hear you tracked down the crash issue.

Unfortunately, Windows does not preserve the registration of NDIS LWFs or NDIS protocol drivers across an OS upgrade. I hate to say it, but it’s “by design” that your filter’s registration vanished.

You can blame me for that. I’m currently the engineer who owns this area. And I hear loud and clear that this is a vexing limitation.

The best thing you can do is have your usermode application detect that the filter got deregistered “somehow”, and offer to reinstall it.
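As a rough sketch of how a usermode check might start, the helper below builds the registry subkey whose absence made registration fail in this thread. The function name build_filter_ndi_subkey is hypothetical, and the "SYSTEM\CurrentControlSet\Control\Network" prefix is an assumption about where the "Network\..." path quoted earlier is rooted under HKEY_LOCAL_MACHINE; only the two GUID components and the trailing "Ndi" come from the thread.

```c
#include <stdio.h>
#include <string.h>

/* Class GUID taken from the key quoted earlier in this thread. */
#define NETSERVICE_CLASS_GUID "{4d36e974-e325-11ce-bfc1-08002be10318}"

/*
 * Hypothetical helper: writes the subkey path for a filter's Ndi key into
 * 'out'. Returns 0 on success, -1 on bad arguments or if 'out' is too small.
 * The path prefix is an assumption; see the note above.
 */
int build_filter_ndi_subkey(const char *instance_guid, char *out, size_t out_len)
{
    int n;

    if (instance_guid == NULL || out == NULL || out_len == 0) {
        return -1;
    }

    n = snprintf(out, out_len,
                 "SYSTEM\\CurrentControlSet\\Control\\Network\\%s\\%s\\Ndi",
                 NETSERVICE_CLASS_GUID, instance_guid);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= out_len) {
        return -1;
    }
    return 0;
}
```

On Windows, the application could pass the resulting path to RegOpenKeyExA under HKEY_LOCAL_MACHINE and treat ERROR_FILE_NOT_FOUND as "the filter is no longer registered", then offer to reinstall. Querying the network configuration through INetCfg would be another way to make the same check.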

Ok, that’s cool, as long as there’s an explanation and we’re not chasing down a ghost. I really appreciate your help. Do you know if there’s a KB article that we can point to just so we can CYA?

Thanks!
–Chris

I’m aware of KBs for Windows Vista & Windows 7. We haven’t updated the KB for Windows 8, but the underlying deficiency is still the same.

http://support.microsoft.com/kb/968216
http://support.microsoft.com/kb/928229

Very cool. You saved me a long weekend of debugging.

Much appreciated.
–Chris

Thanks for the kb pointers.

Regards
Dave Cattley


From: Jeffrey Tippet
Sent: 2/27/2015 7:00 PM
To: Windows System Software Devs Interest List
Subject: RE: RE:[ntdev] NDIS driver causing bugcheck on reboot of Windows upgrade

