Interrupt Routing -- was: Question about masking interrupt

Especially since the HAL kit is no longer available, there are no custom
HALs for current OSes.


Don Burn (MVP, Windows DDK)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

“Joseph M. Newcomer” wrote in message
news:xxxxx@ntdev…
> I’d hardly call it “completely false” if the only solution is to write a
> custom HAL!
>
> Is it true or false in an out-of-the-box Windows installation that setting
> device priority is entirely fixed by the time the device driver writer
> sees
> it?
> joe
>
> -----Original Message-----
> From: xxxxx@lists.osr.com
> [mailto:xxxxx@lists.osr.com] On Behalf Of
> xxxxx@hotmail.com
> Sent: Wednesday, July 29, 2009 5:30 PM
> To: Windows System Software Devs Interest List
> Subject: RE:[ntdev] Question about masking interrupt
>
>> Only that just hasn’t been true for a long time for device interrupt
>> levels, which are just more or less arbitrarily assigned based on PCI
>> wiring on the board and not on any importance of a device that happens
>> to be plugged into one slot or another.
>
>
> It depends on the interrupt controller on the target motherboard. If it is
> the “good old PIC”, then interrupt priority is, indeed, implied by the IRQ
> number, so your statement about priority depending “on pci wiring on the
> board” applies. However, the IOAPIC breaks this dependency completely - it
> allows you to map an IRQ to any vector above 32. Since priority is
> associated with the vector, rather than the IRQ, on an APIC-based system
> you can assign any priority to a given IRQ.
>
> In any case, Joe’s statement about device priority is completely false -
> if you want to assign a certain priority to all devices of a given class,
> you will have to implement the custom HAL that I mentioned in my previous
> post (and deal with the dilemma of handling the situation when devices of
> different classes happen to signal interrupts via the same pin)…
>
> Anton Bassov
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>
> –
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>
>
>
> Information from ESET NOD32 Antivirus, version of virus
> signature database 4291 (20090730)

>
> The message was checked by ESET NOD32 Antivirus.
>
> http://www.eset.com
>
>
>


If it were based on the PCI wiring, then presumably changing the slot into
which the card is plugged should change the interrupt priority;
experimentation on one motherboard indicated that this did not occur. The
board always appeared to have the same interrupt level no matter what slot
it was plugged into. The PCI BIOS and/or Windows always seemed to assign
the same priority value. Alas, it was a realtime board with tight
constraints on interrupt latency, and they never could get it to work right.
My own observation was that since (as they had told me) the engineer based
the design on his ability to reprogram the APIC on a bare (well, MS-DOS)
x86, he had erroneously assumed that this was going to be universally
possible on all machine architectures, for all operating systems. The
problem was solved by a board redesign with an onboard FIFO, but I didn’t
hear about this for a couple more years. That’s why I was curious as to
whether there really was any method available to reassign interrupt
priorities.

I have found that far too many hardware designers take what they can do on
a bare or MS-DOS x86 as their guideline for doing minimum-cost design,
letting “the software” (whatever THAT means!) solve the rest of the
problems. The design I just described is only one of several that I have
seen over the last decade or so that suffered from this particular disease
of hardware designers. Ed Dekker has even more frightening stories about
hardware designers. I’m sure many of the other participants in this group
do, also.

This is not a new phenomenon; I saw the same problems on the PDP-11. I
remember having to open “priority windows” in the middle of one device
driver so the (lower-priority, real-time) tape controller could detect the
BOT mark and stop the reels before the tape spun off. Of course, this meant
I could get recursive interrupts. ARGH! (Funny, hardware designers in 2009
seem to make the same design errors that were made in 1975…doesn’t anyone
LEARN? Doesn’t anyone TEACH principles of Bad Hardware Design?)
joe

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Mark Roddy
Sent: Wednesday, July 29, 2009 5:00 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Question about masking interrupt

Only that just hasn’t been true for a long time for device interrupt
levels, which are just more or less arbitrarily assigned based on pci
wiring on the board and not on any importance of a device that happens
to be plugged into one slot or another.

Mark Roddy

On Wed, Jul 29, 2009 at 12:01 PM, Joseph M.
Newcomer wrote:
> Generally, the idea is that the reason the interrupt is at a “higher
> priority” is because it is more important than the lower-priority
> interrupts. The consequence of allowing lower-priority interrupts is that
> an unimportant device that is interrupting frequently can consume all the
> CPU cycles, thus delaying the response to the more important device.
>
> This is a variant of what is called the “priority inversion” problem, where
> a low-priority thread sets a lock, then is preempted by higher-priority
> threads, but then a high-priority thread tries to acquire the lock and ends
> up being blocked for an indeterminate and perhaps indefinitely long time by
> the lower-priority thread. The result of this is that the high-priority
> thread effectively gets cycles only when the lower-priority thread runs,
> thus effectively making it a low-priority thread (hence the term “priority
> inversion”).
>
> I have encountered several situations in which the priority assigned by the
> system BIOS was inappropriate for the device. Changing priorities is a
> difficult, or perhaps impossible, task.
>
> Note that on a multiprocessor, in general, if an interrupt is blocked on CPUn
> because CPUn is running at a certain DIRQL level, the interrupt is rerouted
> by the hardware to interrupt CPUm for m != n, if it is interruptible. So it
> is possible to have as many interrupts running as CPUs in a fully-symmetric
> multiprocessor system (the world changes in Vista, which supports asymmetric
> device connections that might interrupt only a subset of the CPUs, as
> specified by the “affinity mask” that specifies the CPUs that are allowed
> to/able to handle the interrupt).
>
> For PCI, interrupts are level-triggered, so as long as the device holds the
> interrupt line low, the interrupt is held pending. There is some analogous
> mechanism for handling message-based interrupts (only Vista and beyond have
> support for message-based interrupts). It is not clear that there was ever
> a situation in which edge-triggered interrupts could be lost (certainly not
> when I was doing MS-DOS device drivers for ISA cards! I quite often had to
> deal with arbitrarily-long-delayed edge-triggered interrupts!) It is
> unlikely that something that worked back in the days of ISA would be lost in
> modern architectures when such a failure could be disastrous.
>
> Note also that in order to prevent running a given ISR concurrently, or
> sequentially before it has completed, a combination of CPU masking of
> this-or-lower interrupts and the interrupt spin lock in the
> KINTERRUPT object is used.
>
> Personally, I would be interested if someone knows how to change interrupt
> priorities on a given device.
>
joe
>
>
>
>
>
>
>
> From: xxxxx@lists.osr.com
> [mailto:xxxxx@lists.osr.com] On Behalf Of Skywing
> Sent: Friday, July 24, 2009 12:06 PM
>
> To: Windows System Software Devs Interest List
> Subject: RE: [ntdev] Question about masking interrupt
>
>
>
> That is the intention of the word “masked” in this particular instance.
>
> - S
>
>
>

>
> From: sivakumar thulasimani
> Sent: Friday, July 24, 2009 01:55
> To: Windows System Software Devs Interest List
> Subject: Re: [ntdev] Question about masking interrupt
>
> Being at any (X) IRQL does not mask any interrupts that are handled at a
> lower (Y) IRQL. All the system does is make sure that your code which is
> running at IRQL X will continue to run till it finishes its function or
> till it receives another interrupt which is handled at an even higher
> IRQL. The interrupts for any lower (Y) IRQL are still “registered” (I
> don’t know the exact technical word here, so I am using my own) and will
> be handled when the IRQL is reduced to the appropriate level. Hope that
> clears your doubt.
>
>
>
> rtshiva
>
> 2009/7/24
>
> I found this on wiki; it seems to answer my previous question:
> However, it is fairly easy for an edge-triggered interrupt to be missed -
> for example if interrupts have to be masked for a period - and unless
> there is some type of hardware latch that records the event it is
> impossible to recover. Such problems caused many “lockups” in early
> computer hardware because the processor did not know it was expected to do
> something. More modern hardware often has one or more interrupt status
> registers that latch the interrupt requests; well-written edge-driven
> interrupt software often checks such registers to ensure events are not
> missed.
>
> -----Original Message-----
> From: xxxxx@lists.osr.com
> [mailto:xxxxx@lists.osr.com] On Behalf Of
> xxxxx@viatech.com.cn
> Sent: Friday, July 24, 2009 4:05 PM
> To: Windows System Software Devs Interest List
>
> Subject: RE: [ntdev] Question about masking interrupt
>
> Another question:
> If we mask the interrupts at and below the level of the one currently
> being serviced, is there any possibility that an EDGE-triggered
> lower-priority interrupt is LOST? (It seems a level-triggered interrupt
> will not be lost.)
>
> Thanks.
> HW
>
> -----Original Message-----
> From: xxxxx@lists.osr.com
> [mailto:xxxxx@lists.osr.com] On Behalf Of
> xxxxx@viatech.com.cn
> Sent: Friday, July 24, 2009 3:52 PM
> To: Windows System Software Devs Interest List
> Subject: [ntdev] Question about masking interrupt
>
> Hello everyone.
> I am a newbie in the Windows kernel and am now reading “Windows Internals”
> by Mark E. Russinovich and David A. Solomon.
> I have a question about masking interrupts.
> In chapter 3 the book says:
> /*Quote begin
> Interrupts from a source with an IRQL above the current level interrupt the
> processor, whereas interrupts from sources with IRQLs equal to or below the
> current level are masked until an executing thread lowers the IRQL.
> Quote end */
>
> Here is my question:
> Why bother masking the interrupts whose priority is lower than the current
> interrupt level?
> A lower-priority interrupt can’t preempt a higher-priority interrupt per se.
> Does masking make any difference?
>
> Can anyone help me on this question? Thanks!
>
> Best regards,
> HW
>
> Email secured by Check Point at OSR.COM



> Is it true or false in an out-of-the-box Windows installation that setting
> device priority is entirely fixed by the time the device driver writer
> sees it?

This one is true - this part gets established during HAL initialization, which, IIRC, is the very first thing the kernel does upon initialization. However, device drivers (including boot drivers) come into play at a much later stage, when the kernel is already up and running…

However, interrupt priority has nothing to do with the importance of the device itself - the HAL cannot assign
a hardware interrupt priority to any given device. The only thing it can assign a priority to is an IOAPIC pin (I assume an APIC HAL - on a PIC-based machine the HAL has no discretion even here). For PCI devices it may also try to assign a routing group to the device, so that all devices of a given group signal interrupts via the same pin and hence have the same interrupt priority. Again, though, it does not always have discretion even here - some devices may simply be physically wired together and hence bound to signal interrupts via the same pin.

Therefore, if you want to be able to assign a priority to a device, you have to write a custom HAL that implements interrupt priorities purely in software. With such an approach you can go as far as assigning different priorities to devices that share the same pin (although, since you cannot discover which particular device has actually interrupted without invoking all the appropriate ISRs, it does not really seem to make any practical sense to do so).

Anton Bassov

> If it were based on the PCI wiring, then presumably changing the slot into
> which the card is plugged should change the interrupt priority;

…but only on a PIC-based machine…

> experimentation on one motherboard indicated that this did not occur. The
> board always appeared to have the same interrupt level no matter what slot
> it was plugged into.

This is because the OS can assign a priority to an IOAPIC pin. However, what you have described is not necessarily going to be the case even on an APIC-based machine if different devices are physically bound to share the same pin (unless the device is MSI-capable and the OS decides to take advantage of its MSI capability rather than making it signal interrupts via a pin - in that case it will be able to get a dedicated vector and, hence, priority)…

Anton Bassov

“Joseph M. Newcomer” wrote in message
news:xxxxx@ntdev…
> When you refer to “target set” are you referring to the hardware
> architecture or can this be controlled by the use of the affinity mask?
> joe
>

I’m asserting that what he is referring to is the ProcessorEnableMask
affinity that you specify when calling IoConnectInterrupt, which causes
HalEnable(System)Interrupt to be called for each processor.
HalDisable(System)Interrupt would give you some control over which
interrupts were connected to a specific CPU, but these functions no longer
seem to be documented since they changed names and switched to a new format.

//Daniel

Totally. +1

This seems so obvious to me, that I figure it MUST have been tried and proven unacceptable for some reason. I’d love to know some of the back-story, if somebody in the know wants to provide it…

Peter
OSR

> I’m asserting that what he is referring to is the ProcessorEnableMask
> affinity that you specify when calling IoConnectInterrupt, which causes
> HalEnable(System)Interrupt to be called for each processor.

Please note that it is the IOAPIC’s redirection table entry that specifies the CPUs that may get interrupted by a given
source, and not the other way around - otherwise you could get into a situation where the same interrupt source maps to different vectors on different CPUs, which implies the same source could have different priorities on different CPUs…

Anton Bassov

Peter,

This seems so obvious to me, that I figure it MUST have been tried and
proven unacceptable for some reason.

Doesn’t the APIC bus arbitration protocol ensure that the interrupt gets dispensed to the least busy CPU among the ones that are allowed to be interrupted by a given source?

Anton Bassov

Sorry, wrong usage of ‘assert’. It was more like I wanted to see whether
this throws an exception or not.

Is it not the case that the I/O APIC (which belongs to the chipset) directs
interrupts to the local APICs based on their IDTs? Or is the I/O APIC
programmed separately in this manner? Unfortunately there is not much
information on the I/O APIC or this redirection table in the public Intel
manuals; they say you need to contact them manually.

//Daniel

wrote in message news:xxxxx@ntdev…
>> I 'm asserting that what he is referring to is the ProcessorEnableMask
>> affinity that you specify
>> when calling IoConnectInterrupt which causes HalEnable(System)Interrupt
>> to be
>> called for each processor.
>
> Please note that it is IOAPIC’s redirection table entry that specifies
> CPUs that may get interrupted by a given
> source and not the other way around - otherwise you could get into a
> situation when the same interrupt source maps to different vectors on
> different CPUs, which implies the same source could have different
> priorities for different CPUs…
>
>
> Anton Bassov
>

Daniel,

> Unfortunately there is not much information on the I/O APIC or this
> redirection table in the public Intel manuals,

In fact, they have a separate manual for the IOAPIC, so they don’t go into too much detail about it in
their three-volume developer’s manuals. Please find a link to the IOAPIC manual below - this doc goes into all the details of the IOAPIC, including even the pin layout.

they say you need to contact them manually.

Luckily, it is not as bad as that - they’ve got quite a few docs in the public domain. Just enter “IOAPIC” into the search box on the Intel site, and the very first link that you will get is http://www.intel.com/design/chipsets/datashts/290566.htm

If you want to discover the mapping of a particular device to a particular IOAPIC pin, then you have to read the BIOS-related docs as well. The link below may be quite helpful:
http://www.intel.com/design/archives/processors/pro/docs/242016.htm

Although, according to Jake, Windows does not even look at the tables that the above-mentioned doc describes and instead goes straight to the ACPI ones, I believe it may still be helpful - after all, in terms of complexity, the layout of these tables does not come anywhere close to the ACPI ones, and there is a good chance that the info
found in these tables is valid even on machines with an ACPI HAL…

If you want more than that, then you can download the ACPI specs. However, I must warn you in advance that, unlike the Intel manuals, they are not the easiest of reads…

Anton Bassov

The IOAPIC has a mode called lowest priority delivery, where the chipset chooses the right destination from a set of processors based on the interrupt priority of the various processors. The problem is that interrupt priority changes a lot. Like a whole lot. Acquiring spinlocks changes it, timers and other DPCs change it, and of course interrupts change it. And the information about priority is kept in the processor, but the chipset needs to act on it. The latency and overhead of transmitting this information meant that it was basically always stale. And so many systems gave up trying.

Some chipset/processor combinations forward to just one processor. Some round robin or hash based on vector number. All of these options are simpler than perfect lowest priority, but have negative effects in some workloads. But they exist, and new processors/chipsets are continuing to do this differently, so there is still a desire to get the distribution right.

Dave

> Totally. +1
>
> This seems so obvious to me, that I figure it MUST have been tried and
> proven unacceptable for some reason. I’d love to know some of the
> back-story, if somebody in the know wants to provide it…

Having a read of the Linux mailing list archives where this is discussed
is an interesting (but lengthy) thing to do. Some of the more specific
discussions are obviously not relevant to windows but still interesting.

James

Simply put, no it doesn’t.

First of all, none of us have seen a machine with an APIC bus in recent
memory. Second, when the APIC bus did exist, it guaranteed only that the
processor with the lowest TPR value got interrupted. But NT idles
processors (for various reasons) at DISPATCH_LEVEL. So the least busy
processor gets interrupted well after processors doing useful work at
PASSIVE_LEVEL.


Jake Oshins
Hyper-V I/O Architect
Windows Kernel Group

This post implies no warranties and confers no rights.


wrote in message news:xxxxx@ntdev…
> Peter,
>
>> This seems so obvious to me, that I figure it MUST have been tried and
>> proven unacceptable
>> for some reason.
>
> Does not APIC bus arbitration protocol ensure that interrupt gets
> dispensed to the least busy CPU among the ones that are allowed to be
> interrupted by a given source???
>
> Anton Bassov
>

All of the relevant information is publicly available from Intel. In
particular, see Volume 3, Chapter 8, Section 11 of the Programmer’s
Reference Manual. For the I/O APIC, any chipset datasheet will do.

The I/O APIC redirection table sends interrupts in various modes. The mode
we’ve been implicitly describing (lowest priority mode) describes a set of
processors that an interrupt is sent to. One of them gets the interrupt.
Which one is really up to the north bridge.


Jake Oshins
Hyper-V I/O Architect
Windows Kernel Group

This post implies no warranties and confers no rights.


wrote in message news:xxxxx@ntdev…
> Sorry, wrong usage of ‘assert’. It was more like I wanted to see if this
> throws an exception or not.
>
> Is it not that the I/O APIC (which belongs to the chipset) directs
> interrupts to the local APICs based on their IDTs ? Or is it that the I/O
> APIC is programmed separately in this manner ? Unfortunately there is not
> much information on the I/O APIC or this redirection table in the public
> Intel manuals, they say you need to contact them manually.
>
> //Daniel
>
>
> wrote in message news:xxxxx@ntdev…
>>> I 'm asserting that what he is referring to is the ProcessorEnableMask
>>> affinity that you specify
>>> when calling IoConnectInterrupt which causes HalEnable(System)Interrupt
>>> to be
>>> called for each processor.
>>
>> Please note that it is IOAPIC’s redirection table entry that specifies
>> CPUs that may get interrupted by a given
>> source and not the other way around - otherwise you could get into a
>> situation when the same interrupt source maps to different vectors on
>> different CPUs, which implies the same source could have different
>> priorities for different CPUs…
>>
>>
>> Anton Bassov
>>
>

In the Bad Old Days, when the interrupt line correlated 1:1 with priority, a
number of cards would interrupt on two lines: one for the less critical
interrupts, and one for the really critical ones.

Assuming that all devices are equally important can lead to the priority
inversion problem.

What I don’t understand is how a PCI BIOS can *a priori* determine how
important a device is relative to other devices.

Also note that using a priority-ordered DPC queue simply changes the problem
slightly; the issue of how priorities are established remains a problem.
joe

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@hotmail.com
Sent: Thursday, July 30, 2009 10:36 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Interrupt Routing – was: Question about masking interrupt

I am dubious about the merits of a general-purpose OS that provides a
configurable interrupt priority scheme. My devices would always want the
highest priority available.

…which, in turn, raises questions about the reasoning behind prioritizing
hardware interrupts relative to one another in the first place. To begin
with, the same device may interrupt for different reasons (for example, send
completion and data arrival on a NIC), and sometimes these reasons may, from
the logical point of view, imply different priorities of interrupt handling.
I think it would be much more reasonable to treat all hardware interrupts
(apart from the timer, of course) as equals, while allowing a wide range of
priorities for the software interrupts that ISRs defer the work to, and
enforcing the requirement for ISRs to do as little work as possible (i.e.
check the reason for the interrupt, program the device registers to stop it
from interrupting, and request a software interrupt that does the further
processing)…

Anton Bassov



> First of all, none of us have seen a machine with an APIC bus in recent memory.

Indeed, starting with the Pentium 4, the system bus is used for IOAPIC-to-local-APIC communications.
However, the way I understand it (probably wrongly anyway), particular details may be chipset-specific. Therefore, in the context of Peter’s question the very first thing that came into my head was the APIC bus arbitration protocol used by the P6 family and Pentium processors - the Intel developer’s manual describes it in great detail. Judging from this description, I somehow arrived at the conclusion that the least busy processor is guaranteed to be chosen on these systems. More on it below…

Second, when the APIC bus did exist, it guaranteed only that the processor with the lowest
TPR value got interrupted. But NT idles processors (for various reasons) at DISPATCH_LEVEL.
So the least busy processor gets interrupted well after processors doing useful
work at PASSIVE_LEVEL.

Sorry, but this is already a software-related issue - objectively, the hardware just cannot make any
judgement about the actual importance of the task that a given CPU performs, so it has to decide based only on the information in hardware registers. However, if I got it right, Peter was speaking about the guarantees provided by the hardware platform…

Anton Bassov

> Assuming that all devices are equally important can lead to the priority inversion problem.

If you think about it carefully you will realize that the above statement contradicts itself - priority
inversion just does not make sense when everyone is equally important, don’t you think…

What I don’t understand is how a PCI BIOS can *a priori* determine how important a device
is relative to other devices.

The BIOS does not assign priorities to devices…

Also note that using a priority-ordered DPC queue simply changes the problem slightly; the issue
of how priorities are established remains a problem.

I think this decision can be left to driver writers in pretty much the same way that decisions about thread priorities are left to app writers. The argument that everyone would want the highest possible priority
seems faulty - if your device’s minimal latency comes at the price of an unresponsive GUI, you will, presumably, think twice about trying it…

Anton Bassov

Note that I said “assuming all devices are equally important”, which is
often an invalid assumption; perhaps the correct statement would have been

“Assuming that all devices are equally important belies the fact that they
are, in real systems, *not* equally important; consequently, this assumption
leads to a problem analogous to the priority-inversion problem when a device
which actually *is* more important (in terms of meeting an interrupt-latency
requirement) is blocked for an indeterminate time because it is erroneously
treated as a peer of all the other devices.”

I didn’t realize I needed to be so precise in stating what seemed obvious.

The PCI BIOS certainly does assign priorities; I know this because in some
bizarre cases I’ve had to go into the PCI BIOS setup to reserve a
non-exclusive interrupt line, and my choice of reserved line impacted the
priority; it took some experimentation to discover which line gave the best
performance (although this was far enough in the past that I was probably
working with a PIC system). Embedded systems, including MS-DOS systems, do
not have any code to assign device priorities, yet the device priorities are
nonetheless assigned. I never had to assign device priorities when working
on embedded x86 systems, yet they were clearly assigned. One of the
problems we had was that for devices that required low-latency service, the
priorities were assigned incorrectly relative to what we needed. But
sometimes we just couldn’t get the priorities assigned in the way we wanted.

Note that in a single app, the designer of the app gets to assign the thread
priorities based on an understanding of the relative importance of the
threads. This is also what we did in real-time operating systems;
techniques such as rate-monotonic analysis take these priorities and thread
execution times and can be used to determine the balance of the thread mix,
and the priorities can be adjusted to achieve a feasible solution.

A device driver, on the other hand, works in isolation; it cannot determine
its importance relative to a set of unknown device drivers in any given
situation. Therefore, it is not as simple as in working with a mix of
threads in a single app. Note also that the “thread priority” game only
works well in vertical market systems where the entire collection of apps is
predetermined; the impact of manipulating thread priorities in a system with
an unknown set of applications running is also unsolvable. I’m not sure how
any application writer can magically determine the correct thread priorities
to set to achieve a specified performance without (a) interfering with
unknown and unknowable apps that may coexist or (b) being interfered with by
unknown and unknowable apps.

In one case, we gave up 25% of the CPU resources to guarantee both that the
GUI remained responsive and that the realtime-constrained threads handled
their workload correctly; the trick was to SetThreadAffinityMask the GUI to
one CPU (arbitrarily, CPU0) and the worker threads to the system mask & ~1
(that is, to never run on CPU0). Note that this rested on some serious
assumptions that are not necessarily valid on a general-purpose system, but
we were working in a vertical-market, turnkey-system style environment.
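As a sketch of the mask arithmetic behind that trick (the thread handles are
hypothetical and omitted; on a real Windows system the system mask would
come from GetProcessAffinityMask and be applied with
SetThreadAffinityMask):

```c
/* CPU-partitioning masks: GUI on CPU0, workers everywhere else. */

typedef unsigned long long affinity_t;

/* GUI threads are pinned to CPU0 only. */
static affinity_t gui_mask(void)
{
    return 1ULL;
}

/* Worker threads get every processor EXCEPT CPU0 (systemMask & ~1),
   so they can never crowd the GUI off its CPU. */
static affinity_t worker_mask(affinity_t system_mask)
{
    return system_mask & ~1ULL;
}
```

On a 4-CPU system (mask 0xF) the GUI gets 0x1 and the workers get 0xE; the
two masks are disjoint by construction, which is the whole point of the
partitioning.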

When I deliver an app, and I have discovered the (very rare) need to
manipulate thread priorities, I always make these choices settable so an end
user can adjust them to make sure they work in the real environment in which
the app must live. Ultimately, the priorities can only have meaning in
context. A given app may work well with one priority assignment and fail in
another, and each such instance would be determinable only in the context of
the system which is actually running with its existing mix of applications.
In the case of device drivers, the importance of an interrupt may also
profoundly affect my application’s responsiveness, which is what I’ve
encountered in practice. But I didn’t know how to solve the problem.
joe

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@hotmail.com
Sent: Friday, July 31, 2009 6:52 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Interrupt Routing – was: Question abuot masking
interrupt

> Assuming that all devices are equally important can lead to the priority
> inversion problem.

If you think about it carefully you will realize that the above statement
contradicts itself - priority inversion just
does not make sense when everyone is equally important, don’t you think…

> What I don’t understand is how a PCI BIOS can *a priori* determine how
> important a device is relative to other devices.

BIOS does not assign priorities to devices …

> Also note that using a priority-ordered DPC queue simply changes the
> problem slightly; the issue of how priorities are established remains a
> problem.

I think this decision can be left to driver writers in much the same way
that decisions about thread priorities are left to application writers. The
argument that everyone would want the highest possible priority seems
faulty - if your device’s minimal latency comes at the price of an
unresponsive GUI, you will, apparently, think twice about trying it…

Anton Bassov


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer



> Assuming that all devices are equally important belies the fact that they
> are, in real systems, *not* equally important;

What you should have added is “IN A GIVEN CONTEXT” - after all, device
priority is a relative thing and may depend on various factors like the type
of operation, the priorities of the apps waiting for data from the device,
etc. For example, the priority of a USB controller when an isochronous
transfer is in the schedule should, apparently, be different from when only
bulk transfers are in sight…

> …this assumption leads to a problem analogous to the priority-inversion
> problem when a device which actually *is* more important (in terms of
> meeting an interrupt latency requirement) is blocked for an indeterminate
> time because it is erroneously treated as a peer of all the other devices.

Actually, assigning fixed priorities to the devices/controllers themselves
does not seem to eliminate this problem either (just look at the USB
controller example above). Therefore, I think that treating all devices
(apart from the timer, of course) as equals at interrupt time, while giving
a wider range of priorities to the deferred tasks that ISRs queue, addresses
this problem better…

> A device driver, on the other hand, works in isolation; it cannot
> determine its importance relative to a set of unknown device drivers in
> any given situation.

In fact, this part is relatively easy - all that is needed is a predefined
set of priority constants like INTERACTIVE_EVENT, NIC_DATA_ARRIVAL,
NIC_SEND_COMPLETE, DISK_IO, so that a driver writer can schedule a job with
the priority appropriate to a given situation…
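As a sketch of that idea, a deferred-work queue keyed by such constants
might look like the following. The constant names come from the post above,
but their numeric ordering, the queue structure, and the dispatch policy are
all assumptions for illustration - this is not any actual Windows API:

```c
#include <stddef.h>

/* Hypothetical priority classes; lower value = more urgent is an
   assumption made for this sketch. */
enum dpc_class {
    INTERACTIVE_EVENT = 0,
    NIC_DATA_ARRIVAL  = 1,
    NIC_SEND_COMPLETE = 2,
    DISK_IO           = 3,
    DPC_CLASS_COUNT
};

struct work_item {
    void (*fn)(void *ctx);   /* deferred routine an ISR would queue */
    void *ctx;
    struct work_item *next;
};

/* One FIFO per class; dispatch drains the most urgent non-empty class
   first - a priority-ordered deferred-work queue in miniature. */
static struct work_item *queues[DPC_CLASS_COUNT];

static void queue_work(enum dpc_class cls, struct work_item *item)
{
    struct work_item **tail = &queues[cls];
    while (*tail)
        tail = &(*tail)->next;
    item->next = NULL;
    *tail = item;
}

static struct work_item *dequeue_work(void)
{
    for (int cls = 0; cls < DPC_CLASS_COUNT; cls++) {
        if (queues[cls]) {
            struct work_item *item = queues[cls];
            queues[cls] = item->next;
            return item;
        }
    }
    return NULL;   /* nothing pending in any class */
}
```

With this shape, a DISK_IO item queued before an INTERACTIVE_EVENT item is
still dispatched after it, which is exactly the behavior the constants are
meant to buy.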

Anton Bassov

On Sun, Aug 2, 2009 at 9:58 AM, wrote:
> Therefore, I think treating all devices (apart from timer, of course) as equals at the time of interrupt while giving a wider range of priorities to deferred tasks that ISRs queue seems to address this problem better.

Which is approximately what NT tries to do with its ISR/DPC design and the
addition of threaded DPCs and increased DPC priority granularity in current
releases.

Has anyone actually ever used a threaded DPC?

Mark Roddy