Handle NMI interrupt in Device driver

Hello,

My SBC’s vendor (Adlink) suggested generating an NMI when a GPIO pin on the chipset reads ‘1’.

Can you please explain what I should write in the INF and the .sys driver in order to catch this interrupt?

OS: Windows 7 (64-bit)

Can you please tell me if using an NMI is a wise step? Is this interrupt used for other critical purposes, such as application debugging?

Thank you,
Zvika

> Can you please tell me if using an NMI is a wise step?

Well, apparently, not…

An NMI is meant to be used either to indicate a critical hardware failure that requires immediate attention from the system (for example, a parity error detected by the memory controller), or to break into the system with a kernel debugger (i.e. signaling it via the NMI pin) when no other option is available because of some unrecoverable system software bug. For example, if a CPU executes a HLT instruction (or just goes into an infinite loop) while interrupts are disabled, an NMI is the only way to get the system back.

Although Linux uses NMIs for watchdog timers and performance counters, in the Windows world an NMI is, IIRC, firmly associated with unrecoverable system failures and always results in a bugcheck. You can check the following article for more details:

https://support.microsoft.com/en-us/help/2750146/nmi-hardware-failure-error-when-an-nmi-is-triggered-on-windows-8-and-w

Therefore, using an NMI is unwise in the Windows world unless a bugcheck is your intended behavior (i.e. you want to use it for diagnostic purposes).

Anton Bassov

PVOID KeRegisterNmiCallback(
    _In_ PNMI_CALLBACK CallbackRoutine,
    _In_opt_ PVOID Context
);

Read the docs on how to use it.
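
A minimal sketch of how such a registration might look is below. It is a model, not drop-in driver code: so that it builds outside the WDK, the kernel types and the two Ke* routines are replaced by stand-ins that mirror the documented prototypes, and a real driver would include <ntddk.h> and delete that section. MY_NMI_CONTEXT, MY_GPIO_NMI_BIT, MyNmiCallback, MyStartNmi, MyStopNmi and the status-register layout are all hypothetical names chosen for illustration; the actual acknowledge sequence depends on the chipset.

```c
#include <stddef.h>

/* --- Stand-ins so the sketch builds outside the WDK. Assumption: a real
 * driver includes <ntddk.h> and deletes everything down to "End stand-ins". */
typedef void *PVOID;
typedef unsigned char BOOLEAN;
typedef long NTSTATUS;
#define TRUE 1
#define FALSE 0
#define STATUS_SUCCESS      ((NTSTATUS)0)
#define STATUS_UNSUCCESSFUL ((NTSTATUS)0xC0000001L)
typedef BOOLEAN NMI_CALLBACK(PVOID Context, BOOLEAN Handled);
typedef NMI_CALLBACK *PNMI_CALLBACK;

static PNMI_CALLBACK g_StubRoutine;   /* the stub keeps a single registration */
static PVOID g_StubContext;

static PVOID KeRegisterNmiCallback(PNMI_CALLBACK CallbackRoutine, PVOID Context)
{
    g_StubRoutine = CallbackRoutine;
    g_StubContext = Context;
    return &g_StubRoutine;            /* opaque, non-NULL handle */
}

static NTSTATUS KeDeregisterNmiCallback(PVOID Handle)
{
    (void)Handle;
    g_StubRoutine = NULL;
    g_StubContext = NULL;
    return STATUS_SUCCESS;
}
/* --- End stand-ins. --- */

#define MY_GPIO_NMI_BIT 0x1u           /* hypothetical "GPIO raised NMI" bit */

typedef struct _MY_NMI_CONTEXT {
    volatile unsigned int *GpioStatus; /* hypothetical mapped status register */
    PVOID Handle;                      /* value returned by registration */
} MY_NMI_CONTEXT;

/* Runs in NMI context: touches only preallocated state and the register. */
static BOOLEAN MyNmiCallback(PVOID Context, BOOLEAN Handled)
{
    MY_NMI_CONTEXT *ctx = (MY_NMI_CONTEXT *)Context;

    if ((*ctx->GpioStatus & MY_GPIO_NMI_BIT) == 0) {
        return Handled;                /* not ours: pass the verdict along */
    }
    *ctx->GpioStatus &= ~MY_GPIO_NMI_BIT; /* hypothetical acknowledge */
    return TRUE;                       /* claim the NMI */
}

/* Registers the callback; fails if no handle comes back. */
NTSTATUS MyStartNmi(MY_NMI_CONTEXT *ctx)
{
    ctx->Handle = KeRegisterNmiCallback(MyNmiCallback, ctx);
    return (ctx->Handle != NULL) ? STATUS_SUCCESS : STATUS_UNSUCCESSFUL;
}

/* Deregisters the callback if it was registered. */
void MyStopNmi(MY_NMI_CONTEXT *ctx)
{
    if (ctx->Handle != NULL) {
        (void)KeDeregisterNmiCallback(ctx->Handle);
        ctx->Handle = NULL;
    }
}
```

The return value matters on real hardware: as far as I understand the documentation, if no registered callback returns TRUE, the system falls through to its default NMI handling, which normally ends in a bugcheck.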

Mark Roddy

On Sun, Nov 26, 2017 at 3:52 PM, xxxxx@hotmail.com <
xxxxx@lists.osr.com> wrote:

> Can you please tell me if using an NMI is a wise step?

Well, apparently, not…

NTDEV is sponsored by OSR

MONTHLY seminars on crash dump analysis, WDF, Windows internals and software drivers!

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

The documentation is perhaps not as cautionary as it should be on this topic. Essentially no system APIs can be safely called from an NMI callback (including but not limited to: acquiring locks, queueing a DPC, etc.), because an NMI can interrupt any code, even code that holds an interrupts-disabled or HIGH_LEVEL lock.

Since an NMI can interrupt anything (and there is almost no code that can safely be called from an NMI callback), synchronizing with code that handles an NMI without risking deadlock is difficult and cumbersome. On top of that, NMI code typically cannot be debugged in the usual fashion, as the debugger itself relies on NMIs to function (at least on AMD64).

Unless something very special-purpose such as a debugging watchdog to catch, say, a hung system is being implemented, NMIs are best avoided.
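
The deadlock risk described above can be modeled in a few lines of portable C11. This is a user-mode illustration under stated assumptions, not kernel code: g_lock stands in for a spinlock that ordinary code takes with interrupts disabled, and the "NMI" is simply a function call made while the lock is held. All names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Models a spinlock that ordinary code takes with interrupts disabled. */
static atomic_flag g_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and now belongs to the caller. */
static bool try_acquire(void)
{
    return !atomic_flag_test_and_set_explicit(&g_lock, memory_order_acquire);
}

static void release(void)
{
    atomic_flag_clear_explicit(&g_lock, memory_order_release);
}

/* Models an NMI callback that wants the lock. It only tries once; a real
 * callback that spun on a held lock would never return. */
bool nmi_callback_can_take_lock(void)
{
    if (try_acquire()) {
        release();
        return true;
    }
    return false; /* the owner is the very code this NMI interrupted */
}

/* Models ordinary code hit by an NMI in the middle of its critical section. */
bool nmi_arrives_inside_critical_section(void)
{
    bool nmi_ok;

    while (!try_acquire()) {
        /* spinning is fine here: the owner is another CPU that keeps running */
    }
    nmi_ok = nmi_callback_can_take_lock(); /* the "NMI" preempts us here */
    release();
    return nmi_ok;
}
```

In the model, nmi_arrives_inside_critical_section() reports false: the lock owner cannot make progress until the callback returns, so a callback that waited for the lock would hang that processor.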

- S (Msft)

From: xxxxx@lists.osr.com on behalf of xxxxx@gmail.com
Sent: Monday, November 27, 2017 3:41:26 AM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Handle NMI interrupt in Device driver

PVOID KeRegisterNmiCallback(
    _In_ PNMI_CALLBACK CallbackRoutine,
    _In_opt_ PVOID Context
);

Read the docs on how to use it.

Mark Roddy


Well, presumably he is just going to read/write a register on his device and dismiss the interrupt. Indeed, if he is doing anything else, it will be a disaster.

Mark Roddy

On Mon, Nov 27, 2017 at 5:02 PM, xxxxx@valhallalegends.com <
xxxxx@lists.osr.com> wrote:

The documentation is perhaps not as cautionary as it should be on this
topic. Essentially no system APIs can be safely called from an NMI
callback (including but not limited to: acquiring locks, queueing a DPC,
etc.), because an NMI can interrupt any code, even code that holds an
interrupts-disabled or HIGH_LEVEL lock.


> Well, presumably he is just going to read/write a register on his device and dismiss the interrupt.

In that case, what is the point of relying on an NMI, rather than a “regular” IOAPIC/MSI interrupt signaled by the device, in the first place? As I said in my previous post, you would normally use an NMI for working around chipset bugs or for debugging the kernel.

Anton Bassov

xxxxx@hotmail.com wrote:

>> Well presumably he just is going to read/write a register on his device and dismiss the interrupt.
>
> In such case, what is the point of relying upon NMI, rather than a “regular” IOAPIC/MSI interrupt that gets signaled by the device, in the first place??? As I said in my previous post, normally you would want to use it for working around the chipset bugs or debugging the kernel

Looping back around to the original request: he is working with an
embedded system, and the vendor has a GPIO pin that can trigger an NMI.
It’s an easy path to get external input for debugging. The embedded
world is just not as “pure” as the desktop world.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi Anton, Tim, Mark, Ken,

In my system, when the GPIO raises an NMI, the driver should signal it to the application.
The application will send an IOCTL and wait for the NMI handler to complete it.
Based on your answers, I will ask the SBC’s vendor for an alternative.

Thank you very much for the detailed information.

Best regards,
Zvika

Yes - once you need to involve any sort of acknowledgement or processing of your interrupt that is more than just twiddling bits in hardware registers, you really will want something other than an NMI.

There aren’t any good mechanisms for communicating from an NMI handler to the “outside world” (e.g. a user-mode application with a pending IOCTL), on account of the relatively hostile environment that NMIs run in. Without the ability to signal synchronization objects, queue a DPC, or complete an I/O request directly from an NMI handler, any component that wants to consume data from an NMI is forced to rely on suboptimal and awkward mechanisms like polling (and even then, it would need to be very careful to handle a second NMI that arrives at the same time and modifies the data structures being examined by the polling logic).

And if polling is sufficient, you might as well just dispense with the interrupt entirely. Otherwise, you would likely be much better off with a conventional hardware interrupt that can be masked, so that your ISR can queue a DPC to perform whatever software processing is needed to, say, inform your application that an interrupt has been handled.
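
If polling is used anyway, one way to cope with a second NMI arriving while the poller is looking at the data is to make both sides single atomic operations on a preallocated counter. The sketch below is a user-mode C11 model with hypothetical names; in a driver, InterlockedIncrement and InterlockedExchange would be the natural counterparts. It only counts events and does not show how the application is eventually notified.

```c
#include <stdatomic.h>

/* Shared, preallocated state: the only thing the NMI side ever touches. */
static atomic_uint g_nmi_events;

/* Models the NMI callback body: one lock-free increment, no OS calls. */
void nmi_side_record_event(void)
{
    atomic_fetch_add_explicit(&g_nmi_events, 1u, memory_order_release);
}

/* Models the poller, running later at an ordinary interrupt level.
 * The exchange reads and clears the counter in one step, so an event
 * recorded concurrently is either returned now or left for the next poll. */
unsigned int poller_take_events(void)
{
    return atomic_exchange_explicit(&g_nmi_events, 0u, memory_order_acquire);
}
```

A separate read followed by a separate clear would lose any event recorded between the two steps, which is the hazard mentioned above.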

- S (Msft)

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@gmail.com
Sent: Tuesday, November 28, 2017 8:46 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Handle NMI interrupt in Device driver

Hi Anton, Tim, Mark, Ken,

In my system, when the GPIO raises an NMI, the driver should signal it to the application.
The application will send an IOCTL and wait for the NMI handler to complete it.
Based on your answers, I will ask the SBC’s vendor for an alternative.
