Strange debugger breakpoints

Hi all,

I am developing a driver for a NIC adapter and have run into a strange
debugger breakpoint that fires over and over. It seems to appear when I am
working the driver hard, i.e. sending a lot of packets. Below is the
debugger’s output. My driver has nothing to do with the intelppm module
whatsoever.

1: kd> g
Break instruction exception - code 80000003 (first chance)
nt!RtlpBreakWithStatusInstruction:
80832de6 cc int 3
1: kd> g
Break instruction exception - code 80000003 (first chance)
nt!RtlpBreakWithStatusInstruction:
80832de6 cc int 3
1: kd> knf
 #   Memory  ChildEBP RetAddr
00           f772d28c 8087b1cf nt!RtlpBreakWithStatusInstruction
01        4c f772d2d8 8087b3a8 nt!KiBugCheckDebugBreak+0x19
02         8 f772d2e0 80a80e0f nt!KeEnterKernelDebugger+0x3d
03        3c f772d31c 80834b73 hal!HalHandleNMI+0x1bd
04         0 f772d31c f7649ca2 nt!KiTrap02+0x136
05        8c f772d3a8 00000000 intelppm!AcpiC1Idle+0x12

Can someone point out to me what is happening? I am at my wits’ end.

Alex

Alex,
The processor was idling in the C1 state when an NMI occurred, causing the bugcheck. You need to find out who’s pulling the NMI line. I found this article on NMIs: http://blogs.msdn.com/oldnewthing/archive/2007/02/27/1769274.aspx
Srikanth
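
For anyone reading the stack the same way: nt!KiTrap02 is the handler for x86 vector 2, the NMI, so the frame directly below it is the code that happened to be running when the NMI arrived. Here is a rough sketch of that reading in C. The symbol names are copied from the knf output in the original post; the array and the function name find_interrupted_frame are made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Symbol names copied from the knf output in the original post. */
static const char *const kFrames[] = {
    "nt!RtlpBreakWithStatusInstruction",
    "nt!KiBugCheckDebugBreak+0x19",
    "nt!KeEnterKernelDebugger+0x3d",
    "hal!HalHandleNMI+0x1bd",
    "nt!KiTrap02+0x136",
    "intelppm!AcpiC1Idle+0x12",
};

/*
 * Hypothetical helper: returns the index of the frame that was interrupted
 * by the NMI, i.e. the frame just below nt!KiTrap02. Returns -1 if there is
 * no KiTrap02 frame or nothing lies below it.
 */
static int find_interrupted_frame(const char *const frames[], size_t count)
{
    assert(frames != NULL);
    for (size_t i = 0; i + 1 < count; ++i) {
        if (strstr(frames[i], "nt!KiTrap02") != NULL) {
            return (int)(i + 1);
        }
    }
    return -1;
}
```

Run against the frames above, the helper points at intelppm!AcpiC1Idle. That only says the processor was idle at the time, not that intelppm raised the NMI.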

From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of Alex Lee
Sent: Wednesday, May 09, 2007 9:17 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] Strange debugger breakpoints

— Questions? First check the Kernel Driver FAQ at http://www.osronline.com/article.cfm?id=256 To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

intelppm has nothing to do with the crash here; it is just where the code that runs the processor’s idle logic lives. An NMI occurred. I would guess that your PCI device caused a PCI bus error, which gets mapped to an NMI. Others on the list could probably give you a good list of the various reasons why this would occur.

d
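
If a PCI bus error is the suspect, one place to look is the 16-bit Status register at offset 0x06 of each device's configuration space, where the standard PCI error bits are latched. Below is a minimal sketch in C that decodes those bits from a value you have already read by whatever means you have. The struct, table, and function names are hypothetical, and whether a given bit actually ends up as an NMI depends on the chipset and how the BIOS configured it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One error bit of the PCI configuration-space Status register (offset 0x06). */
struct pci_status_flag {
    uint16_t mask;
    const char *name;
};

/* Error-related bits of the Status register as defined by conventional PCI. */
static const struct pci_status_flag kPciErrorFlags[] = {
    { 0x0100, "Master Data Parity Error" },
    { 0x0800, "Signaled Target Abort" },
    { 0x1000, "Received Target Abort" },
    { 0x2000, "Received Master Abort" },
    { 0x4000, "Signaled System Error (SERR#)" },
    { 0x8000, "Detected Parity Error" },
};

/*
 * Hypothetical helper: prints the name of every error bit set in 'status'
 * and returns how many were set.
 */
static int report_pci_status_errors(uint16_t status, FILE *out)
{
    int count = 0;
    assert(out != NULL);
    for (size_t i = 0; i < sizeof kPciErrorFlags / sizeof kPciErrorFlags[0]; ++i) {
        if (status & kPciErrorFlags[i].mask) {
            fprintf(out, "  %s\n", kPciErrorFlags[i].name);
            ++count;
        }
    }
    return count;
}
```

A device or bridge showing Signaled System Error or Detected Parity Error after the break would be a reasonable first place to dig.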

If you’re running on Vista, you might want to check out WHEA. In either
case (XP or Vista, that is), tracking down the cause of an NMI is
difficult, because, as Doron mentioned, bus errors (probably the most
common case) and other errors can be mapped to an NMI. Depending on
your chipset, it might be possible to make some changes to sections of
the PCI chipset that affect the way the NMI/SCI/TCO/SMI are triggered
that may help disambiguate the source; this may or may not require some
assistance from some sort of bus analyzer or the like. I can’t say that
I have ever tried to track down the source of an NMI, but I’ve done a
lot of work involving chipset settings that I think would fit the bill.
There may very well be a better way.

In any case, if you can tell me a little bit about your chipset, I may
be able to provide you with a place to start. It would have to be one
that I am familiar with, or a very similar one, because the manuals for
some Intel ICHs (I/O Controller Hubs) exceed 1000 pages.

mm
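
As one example of the kind of chipset register meant here: on PC-compatible chipsets, including the Intel ICH family, the NMI Status and Control register at I/O port 0x61 reports two legacy NMI sources in its top two bits. The sketch below, in C, decodes a value already read from that port. The enum and function names are invented for illustration, and the bit layout is an assumption to be checked against the datasheet for the chipset in question.

```c
#include <assert.h>
#include <stdint.h>

/* Legacy NMI sources reported by the NMI Status and Control register (port 0x61). */
enum nmi_source {
    NMI_SOURCE_NONE  = 0,
    NMI_SOURCE_SERR  = 1 << 0, /* bit 7: SERR# (PCI system error / memory parity) */
    NMI_SOURCE_IOCHK = 1 << 1  /* bit 6: IOCHK# (I/O channel check) */
};

/*
 * Hypothetical helper: maps a raw port 0x61 value to a set of nmi_source
 * flags. Assumes the conventional layout (bit 7 = SERR#, bit 6 = IOCHK#).
 */
static unsigned decode_nmi_status(uint8_t port61)
{
    unsigned sources = NMI_SOURCE_NONE;
    if (port61 & 0x80) {
        sources |= NMI_SOURCE_SERR;
    }
    if (port61 & 0x40) {
        sources |= NMI_SOURCE_IOCHK;
    }
    /* Only the two flags defined above can ever be reported. */
    assert((sources & ~(unsigned)(NMI_SOURCE_SERR | NMI_SOURCE_IOCHK)) == 0);
    return sources;
}
```

If neither bit is set, the NMI came from somewhere else (a watchdog or a platform-specific source, for instance), and that is where the chipset-specific registers come in.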


My power supply just died, and I am wondering whether that has anything to do
with the NMI I caught.

On 5/10/07, Martin O’Brien wrote:


I’m sorry to hear that, but, assuming that the error was not
consistently reproducible, it certainly does sound like you found a very
likely cause of the error. Is it at least a standard motherboard with a
cheap, readily available generic equivalent?

mm


As far as I know, the server is not cheap. I would rather not name it.
I will update you all after my new power supply arrives tomorrow.


Servers normally have redundant power supplies, for example three with only two required to keep the system running. A flaky power supply could cause memory parity errors, which are reported via NMI.
