Re: [ntdev] Any tools available to debug system hang? (other than Driver Verifier?)

Reassess your assumptions. The way an ISR interacts with hardware is inherently asynchronous and subject to race conditions and other timing effects. Those effects expose edge conditions that, non-deterministically, reveal flaws in driver and hardware logic. As with all bugs of this sort, the more times you execute the ISR, the greater the chance that you hit the problem, and the effect is not necessarily linear.

I have no idea what your problem is, but you need to reassess your assumptions.
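To illustrate why the effect of running the ISR more often need not be linear, here is a minimal sketch. It assumes, purely for illustration, that each ISR execution independently hits the race with some small probability `p`; the function name and the independence assumption are not from the thread, and real timing bugs are rarely this tidy.

```c
#include <assert.h>

/* Probability of hitting a race at least once in n ISR executions,
 * ASSUMING each execution independently hits it with probability p.
 * This is an illustrative model, not a measured property of any driver. */
double hit_probability(double p, unsigned long n)
{
    assert(p >= 0.0 && p <= 1.0);

    double miss = 1.0;                /* chance of never hitting the race */
    for (unsigned long i = 0; i < n; i++)
        miss *= 1.0 - p;              /* must miss on every single execution */

    return 1.0 - miss;
}
```

Under that assumption, a race that is hit once in a thousand executions (`p = 0.001`) has roughly a 63% chance of showing up at least once in 1000 executions, so a higher interrupt rate shortens the time to failure without making the failure deterministic.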


From: xxxxx@gmail.com
Sent: Thursday, September 11, 2014 10:39 AM
To: Windows System Software Devs Interest List

Also, I believe that an interrupt storm is not happening, because this issue is
reproduced only when the UART port settings are changed to generate interrupts
frequently.

I find that statement astounding. If you KNOW the issue changes based
on interrupt frequency, then how can you possibly deny that the problem
is interrupt related?

The problem may be related to interrupts, but it seems to me that failing to clear a particular interrupt condition would be a logic bug in the code, and so it should reproduce every time that particular interrupt is generated. However, the issue reproduces only randomly. (On a very few occasions I have even seen the issue not reproduce at all.)

I will assume for a moment that, with a faster rate of interrupts, some sort of data error is occurring on the hardware side, so the hardware raises a particular interrupt to signal that condition, and perhaps I am not handling it. However:

  1. When an interrupt storm really did happen (I ran into this problem during the early stages of development and subsequently fixed it), the system slowed down for a while and eventually crashed. More importantly, while the system was slowed down, the keyboard and mouse were still responding.

  2. The hardware has a register that reports the interrupt status. Reading this register clears the interrupt condition on the hardware (regardless of which interrupt fired). I read this register at the beginning of my ISR, so the interrupt condition on the hardware must be getting cleared.
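One consequence of a read-to-clear status register is that the value returned by the single read is the only record of which sources fired. The sketch below models that behaviour in plain C. The bit names, the `fake_uart` type, and both functions are hypothetical, since the real register layout is not given in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments; the real register layout is not known. */
#define IRQ_RX_FULL  0x01u
#define IRQ_TX_EMPTY 0x02u
#define IRQ_LINE_ERR 0x04u

/* Simulated device: 'pending' stands in for the hardware's latched state. */
typedef struct {
    uint8_t pending;
} fake_uart;

/* Models a read-to-clear status register: returns the latched bits and
 * clears all of them, regardless of which source fired. */
uint8_t read_status(fake_uart *dev)
{
    assert(dev != NULL);
    uint8_t snapshot = dev->pending;
    dev->pending = 0;
    return snapshot;
}

/* Services every source found in ONE snapshot and returns the bits handled.
 * Reading the register a second time in here would return 0 and silently
 * drop any source that had not been serviced yet. */
uint8_t service_once(fake_uart *dev)
{
    uint8_t status = read_status(dev);
    uint8_t handled = 0;

    if (status & IRQ_RX_FULL)
        handled |= IRQ_RX_FULL;    /* a real driver would drain the RX FIFO */
    if (status & IRQ_TX_EMPTY)
        handled |= IRQ_TX_EMPTY;   /* a real driver would refill the TX FIFO */
    if (status & IRQ_LINE_ERR)
        handled |= IRQ_LINE_ERR;   /* a real driver would record the error */

    return handled;
}
```

The point of the model is only that clearing the status register and servicing the underlying condition are two separate things: the register read removes the evidence, but a full FIFO or an error state still has to be dealt with by the code that follows.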

Do all of your serial ports share a single interrupt? In your interrupt
handler, do you check for all potential interrupt sources before
returning? That is, if you happen to find that “port 1 Rx FIFO full”
fired, do you handle it and then immediately return, or do you continue
to check and clear the other interrupt sources? Is your interrupt
shared with other devices? Are you handling that possibility?

Eight of my serial ports share a single interrupt because they are all on a single controller. After handling one particular interrupt in the ISR (such as “port 1 Rx FIFO full” above), I still go on to check the remaining interrupt sources.

My interrupt is shared between devices. As soon as I receive an interrupt, I check one of my hardware registers to see whether there is an interrupt condition on my hardware. If there is, I handle all of the pending conditions and return TRUE from the ISR. If not, I return FALSE from the ISR straight away. I hope this is what is meant by handling an interrupt that is shared between devices.
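The shared-interrupt logic described above can be sketched as a small simulation. This is not WDK code: it uses the standard C `bool` in place of the `BOOLEAN` TRUE/FALSE a real ISR returns, and the `fake_controller` layout (a summary register plus one read-to-clear status register per port), the function names, and the `MAX_PASSES` bound are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PORT_COUNT 8   /* from the thread: 8 ports on one controller */
#define MAX_PASSES 4   /* hypothetical bound so the ISR cannot spin forever */

/* Hypothetical controller model, not the poster's real hardware. */
typedef struct {
    uint8_t  summary;                  /* bit n set: port n has work pending */
    uint8_t  port_status[PORT_COUNT];  /* per-port read-to-clear status */
    unsigned serviced[PORT_COUNT];     /* how many times each port was handled */
} fake_controller;

/* Reading a port's status clears it and drops the port's summary bit. */
static uint8_t read_port_status(fake_controller *c, int port)
{
    uint8_t status = c->port_status[port];
    c->port_status[port] = 0;
    c->summary &= (uint8_t)~(1u << port);
    return status;
}

/* Returns false at once if the device did not interrupt, so that the next
 * handler on the shared line gets its turn. Otherwise it services every
 * pending port, re-checks the summary in case more work arrived in the
 * meantime, and returns true. */
bool shared_isr(fake_controller *c)
{
    assert(c != NULL);

    if (c->summary == 0)
        return false;                  /* not our interrupt */

    for (int pass = 0; pass < MAX_PASSES && c->summary != 0; pass++) {
        for (int port = 0; port < PORT_COUNT; port++) {
            if (c->summary & (1u << port)) {
                uint8_t status = read_port_status(c, port);
                if (status != 0)
                    c->serviced[port]++;   /* stand-in for real handling */
            }
        }
    }
    return true;                       /* we claimed the interrupt */
}
```

Re-checking the summary before returning is a common pattern for the case where a new condition appears while earlier ones are being handled; whether it matters here depends on how the real controller latches and signals interrupts, which the thread does not say.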

What is the data rate? How many interrupts per second are you
handling? And, by the way, you really should know that number right off
the top of your head.

The data rate is 38400 bps. I will check the number of interrupts per second using some performance analyzer tool and will update.
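A rough upper bound on the receive interrupt rate can be worked out from the data rate alone, before any measurement. The sketch below assumes 8N1 framing (10 bits on the wire per character) and one interrupt per FIFO trigger level's worth of characters. The framing, the trigger levels, and the function names are assumptions, not details given in the thread.

```c
#include <assert.h>

/* Characters per second on one port, in one direction. */
unsigned chars_per_second(unsigned bits_per_second, unsigned bits_per_char)
{
    assert(bits_per_char > 0);
    return bits_per_second / bits_per_char;
}

/* Upper bound on receive interrupts per second when every active port is
 * saturated and the hardware interrupts once per 'fifo_trigger' characters
 * (ASSUMED behaviour; timeout and error interrupts are ignored). */
unsigned interrupts_per_second(unsigned bits_per_second,
                               unsigned bits_per_char,
                               unsigned fifo_trigger,
                               unsigned active_ports)
{
    assert(fifo_trigger > 0);
    return chars_per_second(bits_per_second, bits_per_char)
           / fifo_trigger * active_ports;
}
```

With those assumptions, 38400 bps is 3840 characters per second per port. If all 8 ports receive flat out and interrupt on every character, that is up to 30720 interrupts per second; with a trigger level of 8 characters it drops to 3840.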

