RE: interrupt handshaking - was other crap

Jake_Oshins Member Posts: 1,058
You ask for a lot. But, in order to put this to bed, here it is.

APIC Case (I'll stick to the single-processor case for brevity):

Let me add some hypothetical details. When I use "Time A" or "Time B,"
assume that time passes in alphabetical order.

Card 1 is attached to I/O APIC input #21.
Card 2 is attached to I/O APIC input #20.
Card 3 and 4 are attached to I/O APIC input #19.

The OS has assigned IDT entry 0x71 (IRQL 0xB) to I/O APIC input #21.
The OS has assigned IDT entry 0xa1 (IRQL 0xE) to I/O APIC input #20.
The OS has assigned IDT entry 0x93 (IRQL 0xD) to I/O APIC input #19.
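For anyone who wants to play with this setup, the wiring and vector assignments above fit in a small table. This is only an illustrative sketch in Python; `Route`, `ROUTING` and `by_priority` are names made up for this example and are not HAL structures.

```python
from typing import NamedTuple


class Route(NamedTuple):
    cards: tuple   # cards wired to this I/O APIC input
    vector: int    # IDT entry the OS assigned to the input
    irql: int      # IRQL the OS associates with that vector


# Hypothetical table keyed by I/O APIC input number; values are from the post.
ROUTING = {
    21: Route(cards=("card1",), vector=0x71, irql=0xB),
    20: Route(cards=("card2",), vector=0xA1, irql=0xE),
    19: Route(cards=("card3", "card4"), vector=0x93, irql=0xD),
}


def by_priority(routing):
    """Return input numbers ordered from highest to lowest IRQL."""
    return sorted(routing, key=lambda pin: routing[pin].irql, reverse=True)


if __name__ == "__main__":
    for pin in by_priority(ROUTING):
        r = ROUTING[pin]
        print(f"input #{pin}: vector {r.vector:#x}, IRQL {r.irql:#x}, {r.cards}")
```

In this example, ordering the inputs by IRQL gives the same result as ordering them by vector number.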

Card 2 asserts INTA# (by grounding it) at Time A.
Card 1 asserts INTA# at Time C.
Card 4 asserts INTA# at Time D.
Card 3 asserts INTA# at Time E.

Assume the processor is running at PASSIVE_LEVEL at Time A.

Now for the flow:

The I/O APIC will send a message to the Local APIC in the processor
shortly after Time A telling it that a level-triggered interrupt
occurred on vector 0xa1.

The Local APIC will set the bit corresponding to 0xa1 in its Trigger
Mode Register. Then set the 0xa1 bit in its IRR register, meaning that
it received the interrupt.

At Time B, the Local APIC then asserts an interrupt at the processor
core. The processor core responds by reading the vector from the Local
APIC, dumping context on the stack and jumping through the IDT entry at
0xa1. This causes the Local APIC to set the 0xa1 bit in its ISR
register.
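The register bookkeeping up to this point can be sketched as a toy model. It follows the description in this post (TMR and IRR set on receipt, ISR set when the core takes the vector) and deliberately ignores priority masking. The class and method names are invented for illustration.

```python
class LocalApic:
    """Toy model of the three Local APIC registers named in the post."""

    def __init__(self):
        self.tmr = set()  # vectors that arrived as level-triggered
        self.irr = set()  # vectors that have been received
        self.isr = set()  # vectors the core is currently servicing

    def receive(self, vector, level=True):
        """Message from the I/O APIC: note trigger mode, then the request."""
        if level:
            self.tmr.add(vector)
        self.irr.add(vector)

    def dispatch(self):
        """Core reads the highest received vector that is not yet in service."""
        pending = self.irr - self.isr
        if not pending:
            return None
        vector = max(pending)
        self.isr.add(vector)
        return vector


if __name__ == "__main__":
    apic = LocalApic()
    apic.receive(0xA1)           # shortly after Time A
    print(hex(apic.dispatch()))  # Time B: the core jumps through IDT entry 0xa1
```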

The NT kernel has placed an architecture-specific interrupt pre-amble at
that address, which raises IRQL to 0xE and then issues a "sti"
instruction, re-enabling interrupts.

Just before it reaches the sti, we hit Time C. The I/O APIC sends a
message to the local APIC telling it that a level-triggered interrupt
occurred on vector 0x71.

The Local APIC sets the IRR and TMR bits associated with 0x71 and does
nothing more, since IRQL has already been raised above this vector's level.
(IRQL, on APIC systems, is maintained directly in the Local APIC's Task
Priority Register.)
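Here is a minimal sketch of the check that holds 0x71 back. It uses the Local APIC rule that a vector's priority class is its upper four bits, and that the vector must outrank both the Task Priority Register and anything in service. How the HAL translates an IRQL into a TPR value is not covered in this post, so the sketch simply assumes the TPR holds the vector being serviced. That is an assumption for illustration only, and the function names are made up.

```python
def priority_class(value):
    """Upper four bits of an 8-bit vector or TPR value."""
    return (value >> 4) & 0xF


def can_interrupt(vector, tpr, in_service=()):
    """True if `vector` outranks the TPR and every in-service vector."""
    floor = max([priority_class(tpr)] + [priority_class(v) for v in in_service])
    return priority_class(vector) > floor


if __name__ == "__main__":
    # Time A: PASSIVE_LEVEL, nothing in service, so 0xa1 gets through.
    print(can_interrupt(0xA1, tpr=0x00))
    # Time C: 0xa1 is in service (TPR assumed to be 0xa1), so 0x71 waits.
    print(can_interrupt(0x71, tpr=0xA1, in_service={0xA1}))
    # Later: 0x93 is allowed to preempt the handling of 0x71.
    print(can_interrupt(0x93, tpr=0x71, in_service={0x71}))
```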

The processor issues the "sti" instruction mentioned above. The
interrupt pre-amble code then starts looking for
architecture-independent ISRs connected to vector 0xa1 that are there as
a result of drivers calling IoConnectInterrupt.

Time D arrives. The I/O APIC detects that card 4 has asserted INTA#.
It sends a message to the Local APIC telling it that a level-triggered
interrupt occurred on vector 0x93. The Local APIC sets 0x93's IRR
and TMR bits.

The processor executes more of card2's driver's ISR. When this
completes, the ISR returns "TRUE, it was my interrupt and I handled it."
This prompts the NT kernel to quit processing the ISR chain, ACK the
interrupt and drop IRQL. This will cause the ISR and IRR bits
corresponding to 0xa1 in the Local APIC to be cleared. Because 0xa1's
TMR bit is set, this ACK will also cause a message to be sent to the I/O
APIC, telling it to re-sample vector 0xa1.
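The ACK step can be sketched the same way. This toy model follows the post's description: the ACK retires the highest in-service vector, clears its ISR and IRR bits, and, if its TMR bit is set, notifies the I/O APIC. The `notify_io_apic` callback is a hypothetical stand-in for the message sent to the I/O APIC.

```python
class LocalApic:
    """Toy model of the ACK (EOI) handling described in the post."""

    def __init__(self, notify_io_apic):
        self.tmr = set()
        self.irr = set()
        self.isr = set()
        self.notify_io_apic = notify_io_apic  # hypothetical callback

    def eoi(self):
        """Retire the highest in-service vector and return it."""
        if not self.isr:
            return None
        vector = max(self.isr)
        self.isr.discard(vector)
        self.irr.discard(vector)
        if vector in self.tmr:
            # Level-triggered: ask the I/O APIC to re-sample the input.
            self.notify_io_apic(vector)
        return vector


if __name__ == "__main__":
    resampled = []
    apic = LocalApic(resampled.append)
    apic.tmr.update({0xA1, 0x93, 0x71})
    apic.irr.update({0xA1, 0x93, 0x71})
    apic.isr.add(0xA1)
    apic.eoi()
    print([hex(v) for v in resampled], sorted(hex(v) for v in apic.irr))
```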

Now that IRQL has been lowered, the Local APIC will interrupt the
processor again, this time with vector 0x93, which is currently the
highest priority in the IRR register.

The processor will again jump through the IDT to the pre-amble code.
Again, the code will raise IRQL, this time to level 0xD, and start
executing ISRs.

About this time the ACK message reaches the I/O APIC. The I/O APIC
re-samples input #20. It's not asserted, since card2's driver
just ran its ISR. No new interrupt occurs here.

Back to the processor. It starts to execute the first ISR on 0x93's
chain. Assume that it finds card3's driver's ISR first on the list. It
will execute that ISR, which will clear the interrupting condition in
card3 and return "TRUE - that was my device, and the condition has been
handled." The processor then sends an ACK and drops IRQL. This causes
0x93's IRR and ISR bits to be cleared, and a message to be sent to the
I/O APIC, since 0x93's TMR bit is set.

With IRQL lowered, the processor will accept another interrupt from the
Local APIC, this time vector 0x71. The processor probably won't get very
far through the pre-amble code before the ACK message gets to the I/O
APIC. Imagine it gets no further than raising to IRQL 0xB and issuing a
"sti."


At that point, the I/O APIC will re-sample input #19. Since card4 is
still asserting INTA#, this input will still be active. The I/O APIC
will send a message back to the Local APIC telling it that vector 0x93
was triggered, level-style.
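The reason input #19 is still active is that a shared, level-triggered line stays asserted for as long as any card on it is pulling it low. A minimal sketch, with an invented `SharedInput` class standing in for one I/O APIC input:

```python
class SharedInput:
    """Toy model of one level-triggered I/O APIC input shared by several cards."""

    def __init__(self, vector):
        self.vector = vector
        self.asserting = set()  # cards currently grounding INTA#

    def assert_line(self, card):
        self.asserting.add(card)

    def deassert_line(self, card):
        self.asserting.discard(card)

    def resample(self):
        """On an ACK: return the vector if the line is still active, else None."""
        return self.vector if self.asserting else None


if __name__ == "__main__":
    pin19 = SharedInput(0x93)
    pin19.assert_line("card4")    # Time D
    pin19.assert_line("card3")    # Time E
    pin19.deassert_line("card3")  # card3's ISR cleared its condition
    print(pin19.resample())       # card4 is still asserting
```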

The Local APIC will then interrupt the processor with vector 0x93 again.
The processor will jump through the IDT and start executing pre-amble
code, raising back to IRQL 0xD. It will call card3's ISR again, since
it is first on 0x93's chain. Card3's ISR will return "FALSE - it wasn't
me." The NT kernel will then call the next ISR on the chain, that of
card4. Card4's ISR will run and return "TRUE - that was mine and I've
cleared the interrupting condition." This will cause the kernel to issue
another ACK and drop IRQL.
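The two passes over 0x93's chain can be sketched as follows. The walker stops at the first ISR that returns TRUE, as described above for this level-triggered case. `make_isr` and `run_isr_chain` are made-up helpers, and the `pending` dictionary stands in for each card's interrupt status.

```python
def make_isr(pending, card):
    """Build a toy ISR that claims the interrupt only if its card is pending."""
    def isr():
        if pending[card]:
            pending[card] = False  # clear the interrupting condition
            return True            # "that was mine"
        return False               # "it wasn't me"
    return isr


def run_isr_chain(chain):
    """Call ISRs in order, stopping at the first one that returns True."""
    called = []
    for name, isr in chain:
        called.append(name)
        if isr():
            break
    return called


if __name__ == "__main__":
    pending = {"card3": True, "card4": True}
    chain = [(c, make_isr(pending, c)) for c in ("card3", "card4")]
    print(run_isr_chain(chain))  # first interrupt on 0x93
    print(run_isr_chain(chain))  # second interrupt, after the re-sample
```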

The Local APIC will then clear 0x93's ISR and IRR bits again and issue
an ACK to the I/O APIC. Then the I/O APIC will re-sample input #19.
Now it's de-asserted, since neither card3 nor card4 is interrupting.

The processor will now continue executing the interrupt pre-amble code
for vector 0x71, which will call card1's ISR. Card1's ISR will return
"TRUE - that was mine and I've cleared the interrupting condition." The
kernel will ACK the interrupt and drop IRQL.

The Local APIC will clear the ISR and IRR bits associated with 0x71.
This will cause an ACK to be sent to the I/O APIC. The I/O APIC will
then re-sample input #21. It is now deasserted.

The processor will go back to executing lower-priority code. In all
likelihood, these ISRs queued up a bunch of DPCs. Now those DPCs will
be executed at DISPATCH_LEVEL in the order that they were queued. Then
the processor will drop back to PASSIVE_LEVEL and look for threads to
run. It may or may not find any.
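The DPC ordering can be pictured as a plain FIFO. This is only a sketch: `DpcQueue` is an invented class, and it ignores things like DPC importance (see KeSetImportanceDpc), which can change where an entry lands in the real queue.

```python
from collections import deque


class DpcQueue:
    """Toy FIFO standing in for a processor's DPC queue."""

    def __init__(self):
        self._queue = deque()

    def insert(self, name, routine):
        """Called from an ISR: queue deferred work."""
        self._queue.append((name, routine))

    def drain(self):
        """Run every queued routine in insertion order; return the names."""
        ran = []
        while self._queue:
            name, routine = self._queue.popleft()
            routine()
            ran.append(name)
        return ran


if __name__ == "__main__":
    dpcs = DpcQueue()
    # ISRs claimed their interrupts in this order in the flow above.
    for card in ("card2", "card3", "card4", "card1"):
        dpcs.insert(card, lambda: None)
    print(dpcs.drain())
```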


I'm really tired of typing at the moment. So I challenge somebody else
to do either PIC case. (They are much more alike than different. With
respect to the device's ISRs, they are indistinguishable.)

This is the last that I will write on this topic. I refer any further
inquiries to Intel's Programmer's Reference Manuals for the Pentium
Family. See Volume 3, chapter 8.

http://developer.intel.com/design/pentium4/manuals/

Jake Oshins
Windows Kernel Group Interrupt Guy

This posting is provided "AS IS" with no warranties, and confers no
rights.


-----Original Message-----
Subject: RE: Device Interrupt priority - Reviewing Jose Flores
From: "Christiaan Ghijselinck" <xxxxx@CompaqNet.be>
Date: Wed, 11 Dec 2002 20:38:24 +0100
X-Message-Number: 35

Who will rise to the next challenge:


"Four PCI cards fire an interrupt at quasi the same moment (assume 100 ns
time difference). Card1, card2 and card3 have different IRQs; card4
shares the same IRQ with card3.

I would like to see a detailed flow of all handshaking actions between
the Cards <--> [ PCI bus <--> ] PIC/APIC <--> CPU with bus <--> OS/ISRs
in both the "Lazy" and the "Strict Model". The flow ends when the last
(fourth) IRQ has been completely serviced."

Such a description would have an incredible value.

Comments

  • Thank you, Jake,

    This is just what I expected to see. Someone should put this
    information into an article for MSDN Magazine, or a KB article.

    Once again, thank you very much,

    Christiaan



  • OSR_Community_User Member Posts: 110,217
    If I understand this correctly, the OS reflects the IRQL in the APIC's
    priority register. Higher priority interrupts will be masked out until the
    OS issues the sti. Lower priority interrupts will be masked out until the OS
    acknowledges the interrupt. There's a window in there where even a higher
    IRQL interrupt won't get through, and another window where no lower priority
    interrupts will get through. This may or may not be a problem, depending on
    the application and on the interrupt volume.


    Alberto.



  • Doug Member Posts: 83
    A second interrupt will only come in when the interrupt flag is set (STI)
    and the (A)PIC has been issued an 'End Of Interrupt' (EOI) command to clear
    the currently requesting interrupt. When an EOI is sent the highest
    currently active interrupt gets serviced. If a lower priority interrupt is
    being serviced it is possible a higher priority interrupt could come in
    before the EOI is sent for the lower priority interrupt. This is true of any
    priority interrupt scheme.

    That is why the kernel separates out real IRQ routines (owned by the kernel)
    from Interrupt Service Routines (ISRs - owned by the device driver writer).
    Kernel IRQ routines are designed to be very small and very fast so the time
    used to service an IRQ (i.e. send EOI, STI, and queue ISR) is minimal.

    Doug
  • OSR_Community_User Member Posts: 110,217
    > acknowledges the interrupt. There's a window in there where even a
    > higher IRQL interrupt won't get through

    It will just be delayed until IRQL is lowered. It will not be missed.

    Max
  • OSR_Community_User Member Posts: 110,217
    A second interrupt will come when it comes - that's dictated by the
    peripheral subsystem. If the APIC or the processor are not in a state that
    the interrupt can get through, it'll back up at the peripheral subsystem,
    that is, to the extent it can back up. High throughput systems can be time
    critical! The key issue is interrupt latency: the time elapsed between the
    time the peripheral issues an interrupt and the time the peripheral can
    issue another interrupt.


    Alberto.


    -----Original Message-----
    From: Doug [mailto:xxxxx@hotmail.com]
    Sent: Friday, December 13, 2002 10:36 AM
    To: NT Developers Interest List
    Subject: [ntdev] Re: interrupt handshaking - was other crap


    An second interrupt will only come in when the interrupt flag is set (STI)
    and the (A)PIC has been issued an 'End Of Interrupt' (EOI) command to clear
    the currently requesting interrupt. When an EOI is sent the highest
    currently active interrupt gets serviced. If a lower priority interrupt is
    being serviced it is possible a higher priority interrupt could come in
    before the EOI is sent for the lower priority interrupt. This is true of any
    priority interrupt scheme.

    That is why the kernel separates out real IRQ routines (owned by the kernel)
    from Interrupt Service Routines (ISRs - owned by the device driver writer).
    Kernel IRQ routines are designed to be very small and very fast so the time
    used to service an IRQ (i.e. send EOI, STI, and queue ISR) is minimal.

    Doug

    "Moreira, Alberto" <xxxxx@compuware.com> wrote in message
    news:xxxxx@ntdev...
    >
    > If I understand this correctly, the OS reflects the IRQL in the APIC's
    > priority register. Higher priority interrupts will be masked out until the
    > OS issues the sti. Lower priority interrupts will be masked out until the
    OS
    > acknowledges the interrupt. There's a window in there where even a higher
    > IRQL interrupt won't get through, and another window where no lower
    priority
    > interrupts will get through. This may or may not be a problem, depending
    on
    > the application and on the interrupt volume.
    >
    >
    > Alberto.
    >
    >
    >
    >
    >
  • DougDoug Member Posts: 83
    A higher-priority IRQ may be slightly delayed (and I mean slightly) while the
    IRQ routine clears a lower-priority IRQ out of the APIC, but as soon as
    the higher-priority IRQ makes it through the APIC, NT should get interrupted
    and preempt any lower-IRQL ISR that is running so that the higher-IRQL
    ISR can run. Then it is up to the device driver writer to be as efficient as
    possible.
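
The preemption described above can be sketched with a toy model of the Local APIC's priority logic. This is a simplified sketch under stated assumptions, not NT or HAL code: the class `LocalApic`, its method names, and `demo` are hypothetical, the priority class is taken to be the upper four bits of the vector, and the Task Priority Register is left at 0 instead of tracking IRQL. The vectors are the ones used in the example earlier in the thread.

```python
class LocalApic:
    """Toy model of Local APIC interrupt arbitration (hypothetical names)."""

    def __init__(self):
        self.irr = set()    # vectors requested but not yet dispatched
        self.isr = set()    # vectors currently being serviced
        self.tpr_class = 0  # task priority class; an OS would raise this with IRQL

    @staticmethod
    def priority_class(vector):
        # The upper four bits of the vector give its priority class.
        return vector >> 4

    def request(self, vector):
        """An interrupt message arrives from the I/O APIC."""
        self.irr.add(vector)

    def processor_priority(self):
        in_service = max((self.priority_class(v) for v in self.isr), default=0)
        return max(self.tpr_class, in_service)

    def deliver(self):
        """Dispatch the highest pending vector if it outranks current work."""
        if not self.irr:
            return None
        vector = max(self.irr)
        if self.priority_class(vector) <= self.processor_priority():
            return None  # stays pending in the IRR
        self.irr.discard(vector)
        self.isr.add(vector)
        return vector

    def eoi(self):
        """End of interrupt: retire the highest in-service vector."""
        if self.isr:
            self.isr.discard(max(self.isr))


def demo():
    apic = LocalApic()
    events = []
    apic.request(0x71)
    events.append(apic.deliver())  # 0x71 starts running
    apic.request(0xA1)
    events.append(apic.deliver())  # 0xA1 preempts 0x71
    apic.request(0x93)
    events.append(apic.deliver())  # None: 0x93 waits behind 0xA1
    apic.eoi()                     # 0xA1 finishes
    events.append(apic.deliver())  # 0x93 now outranks the suspended 0x71
    return events
```

Calling `demo()` walks through one preemption and one deferred delivery; the model ignores the `sti` window and the I/O APIC re-sampling step.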

    The only way an IRQ could get lost on a peripheral is if the sequence Peripheral
    INT -> INTA# -> APIC IRQ -> CPU IRQ -> ISR -> Clear Peripheral INT is not completed
    before the peripheral generates the next one. If that is the case, the
    peripheral should employ some sort of queueing to absorb the latency. For
    example, this is why UARTs started having 16-byte FIFO queues, 256-byte
    queues, etc. Beating the CPU with IRQs is not good for the CPU or the
    peripheral.
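
To make the FIFO argument concrete, here is a rough tick-based simulation. It is only a sketch under assumed numbers: the function `bytes_lost` and all of its parameters are hypothetical, the peripheral produces one byte every `arrival_period` ticks, and the ISR is assumed to empty the whole FIFO once every `service_interval` ticks.

```python
def bytes_lost(fifo_depth, arrival_period, service_interval, total_bytes):
    """Count bytes dropped when the ISR drains the FIFO only periodically."""
    fifo = 0       # bytes currently queued in the peripheral
    lost = 0       # bytes that arrived while the FIFO was full
    produced = 0   # bytes generated so far
    tick = 0
    while produced < total_bytes:
        tick += 1
        if tick % arrival_period == 0:
            produced += 1
            if fifo < fifo_depth:
                fifo += 1
            else:
                lost += 1  # overrun: nowhere to put the byte
        if tick % service_interval == 0:
            fifo = 0  # assumed: the ISR empties the FIFO in one visit
    return lost
```

With one byte per tick and service every 10 ticks, a depth of 1 overruns constantly while a depth of 16 keeps up; stretching the service interval to 20 ticks makes even the 16-byte FIFO overrun.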

    Doug

  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    So, we agree: if the peripheral generates interrupts fast enough, they'll
    back up unless the peripheral can queue them. That's been my point from the
    start. And I agree that beating the CPU with IRQs is not good - interrupts
    should be handled by peripheral processors. The best machines I've ever
    worked with didn't pass peripheral interrupts on to the CPU; in fact, I/O
    was not an issue for the kernel but for the peripheral processors.

    Alberto.



    > > -----Original Message-----
    > > Subject: RE: Device Interrupt priority - Reviewing Jose Flores
    > > From: "Christiaan Ghijselinck" <xxxxx@CompaqNet.be>
    > > Date: Wed, 11 Dec 2002 20:38:24 +0100
    > > X-Message-Number: 35
    > >
    > > Who will rise to the next challenge:
    > >
    > >
    > > "Four PCI cards fire an interrupt at almost the same moment (assume a
    > > 100 ns time difference). Card1, card2 and card3 have different IRQs;
    > > card4 shares the same IRQ with card3.
    > >
    > > I would like to see a detailed flow of all handshaking actions between
    > > the cards <--> [PCI bus <-->] PIC/APIC <--> CPU with bus <--> OS/ISRs in
    > > both the "Lazy" and the "Strict" model. The flow ends when the last
    > > (fourth) IRQ has been completely serviced.
    > >
    > > Such a description would have incredible value."
  • James_AntogniniJames_Antognini Member Posts: 263
    S390 is like what you describe: "Channel engines" (something like
    peripheral CPUs) do most of the talking to devices. In my opinion, the
    x86 architecture, the whole approach, is weaker in the domain of I/O.

    --
    If replying by e-mail, please remove "nospam." from the address.

    James Antognini
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    > The only way an IRQ could get lost on a peripheral is if the Peripheral
    > INT->INTA#->APIC IRQ->CPU IRQ->ISR->Clear Peripheral INT is not handled
    > before the next one is generated by the peripheral. If that is the case,
    > the peripheral should employ some sort of queueing to handle the latency.

    Not necessarily; a smart peripheral will tolerate interrupt collapsing
    (replacing a sequence of 2 interrupts with only the second one).

    Max
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    > that is, to the extent it can back up. High throughput systems can be
    > time critical!

    If they use chained DMA with a long enough buffer list in host memory,
    then they can tolerate long interrupt latencies without data loss.

    Max
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    > S390 is like what you describe: "Channel engines" (something like
    > peripheral CPUs) do most of the talking to devices.

    Am I wrong that any chain-DMA-capable PCI device (UHCI, IDE DMA,
    OHCI1394 and so on) is a kind of "channel engine"?

    Max
  • David_J._CraigDavid_J._Craig Member Posts: 1,885
    Yes, but look at the costs involved in having I/O channel controllers that
    are full blown multiprocessor systems themselves.

    The newer versions of EIDE/ATAPI with full DMA and even the old SCSI-2
    standard allow some of the capabilities of the mainframes on the PC, though
    not quite as complete an offload of I/O processing. I remember the Sperry
    1100/60/90 series where the computer terminals were attached to a DCP that
    had multiple processors, mass storage, and a full OS just to offload the
    communications overhead. The mainframe had from one to four processors, but
    they were not true SMP. Some I/O devices could only be reached by each pair
    of CPUs, so if a request needed to be issued to an I/O controller on the
    other pair, a pass off was required.

    One thing that the "Channel engines" don't have to contend with in the
    mainframe world is the variety of devices that can be attached. The
    mainframe companies design the "engine" and all the devices that can be
    attached. Any third-party hardware company that wants to sell into that
    market has to be compatible to a greater degree than in the PC world. There
    is more standardization with the Microsoft/Intel hardware design specs, but
    it is not nearly as restrictive. You can fix a lot of the problems on the
    PC with a driver, but Microsoft is trying to force most of the market to
    write to more restrictive standards than before. One example is the first
    flash memory readers that required drivers. The newest versions are "mass
    storage compliant" so the standard Microsoft drivers can access the devices.
    That compliance is more expensive in that enough memory must be internal to
    the device to handle two flash memory blocks which are usually much larger
    than the 512 byte sector addressing the OS uses.

    ----- Original Message -----
    From: "James Antognini" <xxxxx@mindspring.nospam.com>
    Newsgroups: ntdev
    To: "NT Developers Interest List" <xxxxx@lists.osr.com>
    Sent: Friday, December 13, 2002 12:26 PM
    Subject: [ntdev] Re: interrupt handshaking - was other crap


    > S390 is like what you describe: "Channel engines" (something like
    > peripheral CPUs) do most of the talking to devices. In my opinion, the
    > x86 architecture, the whole approach, is weaker in the domain of I/O.
    >
    > --
    > If replying by e-mail, please remove "nospam." from the address.
    >
    > James Antognini
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    The DCP only handled communications, other peripherals went to the 1100
    directly. The I/O controllers on the mainframe side were a bit more limited,
    and as far as I remember they didn't have as much functionality as the IBM
    Channels.

    As far as costs go, processors are cheap. Yesterday I went to a presentation
    of a NUON 4x VLIW microprocessor: $10 or less. And multiple PCI buses
    allow for open connectivity: the interrupt stops here!


    Alberto.



    -----Original Message-----
    From: David J. Craig [mailto:xxxxx@yoshimuni.com]
    Sent: Friday, December 13, 2002 1:02 PM
    To: NT Developers Interest List
    Subject: [ntdev] Re: interrupt handshaking - was other crap


  • DougDoug Member Posts: 83
    That is a form of queueing. Internally, the peripheral would still have to
    present information about the two things that caused the interrupt.

    "Maxim S. Shatskih" wrote in message
    news:xxxxx@ntdev...
    >
    > > The only way an IRQ could get lost on a peripheral is if the Peripheral
    > > INT->INTA#->APIC IRQ->CPU IRQ->ISR->Clear Peripheral INT is not handled
    > > before the next one is generated by the peripheral. If that is the case,
    > > the peripheral should employ some sort of queueing to handle the latency.
    >
    > Not necessarily; a smart peripheral will tolerate interrupt collapsing
    > (replacing a sequence of 2 interrupts with only the second one).
    >
    > Max
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    No, I don't agree.

    Devices typically queue data, not interrupts. If the device has an external interface that supports a flow control mechanism the device should naturally flow control data so that it is not lost as its data queues reach their high water mark.

    For those interfaces without flow control (or s/w based flow control), the device has no choice other than to discard data when its queues get full. This condition (i.e. data discarded) is often one of the interrupt status bits that can assert the interrupt signal (i.e. buffer overrun).
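
    The behaviour described above can be sketched roughly as follows. This is only a toy model in Python, not any real device: the class name RxFifo, its depth parameter, and the drain() method are all hypothetical names chosen for illustration. It shows a device that queues data (not interrupts), discards data once the queue is full, and reports the loss through an overrun status bit that also keeps the interrupt asserted.

```python
from collections import deque


class RxFifo:
    """Toy receive FIFO without flow control (hypothetical, for illustration)."""

    def __init__(self, depth):
        self.depth = depth
        self.data = deque()
        self.overrun = False  # status bit: set when incoming data was discarded

    def receive(self, byte):
        if len(self.data) >= self.depth:
            # Queue is full and there is no flow control: drop the data and
            # record the loss in a status bit.
            self.overrun = True
            return
        self.data.append(byte)

    @property
    def interrupt_asserted(self):
        # The device is either interrupting or not; there is no interrupt count.
        return bool(self.data) or self.overrun

    def drain(self):
        """What an ISR/DPC would do: take all queued data and clear the status."""
        queued = list(self.data)
        self.data.clear()
        overrun, self.overrun = self.overrun, False
        return queued, overrun
```

    With a depth of 4 and six bytes arriving before the host drains the queue, the host gets the first four bytes plus the overrun indication, and the interrupt drops once the queue is drained.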

    Duane J. McCrory
    InfiniCon Systems

    -----Original Message-----
    From: Moreira, Alberto [mailto:xxxxx@compuware.com]
    Sent: Friday, December 13, 2002 11:34 AM
    To: NT Developers Interest List
    Subject: [ntdev] Re: interrupt handshaking - was other crap


    So, we agree: if the peripheral generates interrupts fast enough, they'll
    back up unless the peripheral can queue them. That's been my point from the
    start. And I agree that beating the CPU with IRQs is not good - interrupts
    should be handled by peripheral processors. The best machines I've ever
    worked with didn't pass on peripheral interrupts to the CPU, in fact, I/O
    was not an issue for the kernel but for the peripheral processors.

    Alberto.
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    If the device can't queue the condition that will generate the interrupt -
    data is one of them - something will have to give, either flow control will
    have to be applied or data will start to get lost. But some devices can't
    flow control, and in many cases flow control means lower throughput or
    slower response times, which will be unacceptable. So, if a device has to
    discard data because the queue is full and it cannot generate interrupts
    fast enough - due to processor bottlenecks - this has the same effect as if
    the interrupts themselves got lost. I'm using "interrupt" here to mean a
    condition that requires intervention by the processor, and a buffer overflow
    caused by a traffic jam is one such condition.

    Alberto.


    -----Original Message-----
    From: McCrory, Duane [mailto:xxxxx@infiniconsys.com]
    Sent: Friday, December 13, 2002 1:31 PM
    To: NT Developers Interest List
    Subject: [ntdev] Re: interrupt handshaking - was other crap


  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    Surely, by means of a status register.

    "Doug" <xxxxx@hotmail.com> wrote in message
    news:LYRIS-542-88032-2002.12.13-13.23.36--maxim#xxxxx@lists.osr.com...
    > That is a form of queueing. Internally, the peripheral would still have to
    > present information about the two things that caused the interrupt.
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    Let's not overload the term interrupt. A device is either interrupting or it is not. The condition where you are interrupting but not getting serviced quickly enough is a real-time problem, where the system is periodically oversubscribed. In this situation you are not losing interrupts, so for the sake of keeping this forum technically accurate, please don't misuse terms.

    -----Original Message-----
    From: Moreira, Alberto [mailto:xxxxx@compuware.com]
    Sent: Friday, December 13, 2002 1:38 PM
    To: NT Developers Interest List
    Subject: [ntdev] Re: interrupt handshaking - was other crap


  • David_J._CraigDavid_J._Craig Member Posts: 1,885
    "Cheap" is an EXTREMELY relative term. I know of development where even $.05 was
    a problem. Don't forget that for most cases the cost of a part is
    multiplied by four before you see the "suggested retail price". There is
    also the support circuitry, pads, and other components so I think your $10
    CPU costs about $50 in the final product.

    Jake Oshins said that there are no notebooks that have APICs. He didn't say
    why, but since most notebooks use Intel chipsets with few additional
    features there must be some other reason. Could it be cost? That may be
    why Dell notebooks haven't been supporting USB 2.0 on the built-in USB
    port(s). I haven't seen a notebook with a DVD+RW/-RW burner yet either.
    Sony has the only drive with that support that I know of, but it is about
    $100 more than some of the straight DVD+RW burners I see in Sam's Club.

    As to the Sperry, I think the controllers were more complex than you
    mentioned, but probably not as self-standing as the IBM. The only IBM I
    ever worked on was an AN/FSQ-7 and it didn't have any "channels". It didn't
    even have memory protection.

    ----- Original Message -----
    From: "Moreira, Alberto" <xxxxx@compuware.com>
    To: "NT Developers Interest List" <xxxxx@lists.osr.com>
    Sent: Friday, December 13, 2002 1:21 PM
    Subject: [ntdev] Re: interrupt handshaking - was other crap


  • Jake_OshinsJake_Oshins Member Posts: 1,058
    I'm surprised at you Alberto. As much as I disagree with your
    philosophy regarding the way drivers should be written, your
    understanding of processor architecture is usually dead-on.

    All interrupts are masked at the processor core before the STI
    instruction is issued. But the Local APIC is still accepting interrupt
    messages from the I/O APICs. This means that a higher priority
    interrupt will be accepted, including setting the TMR and IRR bits. As
    soon as the processor issues the STI, the processor will be interrupted
    by the Local APIC with that high-priority interrupt, and the Local APIC
    will then set the ISR bit.

    The same will happen for lower-priority interrupts, except that they
    won't be delivered to the processor until the OS writes a lower-IRQL
    value into the TPR register in the Local APIC.
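
    A rough way to picture the two paragraphs above is the following Python sketch. It is a deliberately simplified toy, not a description of real hardware or of the NT kernel: the class LocalApic and its fields are hypothetical names, the TMR is left out, and priority is reduced to comparing the upper four bits of the vector (the priority class) against the TPR and the highest in-service vector.

```python
class LocalApic:
    """Toy model of Local APIC state (simplified, hypothetical names)."""

    def __init__(self):
        self.irr = set()      # vectors accepted but not yet delivered to the core
        self.isr = set()      # vectors delivered and not yet EOI'd
        self.tpr = 0          # task priority written by the OS when it changes IRQL
        self.if_flag = False  # models the core's interrupt flag (set by sti)

    def accept(self, vector):
        # Messages from the I/O APIC are latched even while the core has
        # interrupts disabled, so nothing is lost.
        self.irr.add(vector)

    def deliver(self):
        """Return the vector handed to the core, or None if none is deliverable."""
        if not self.if_flag or not self.irr:
            return None
        vector = max(self.irr)
        in_service = max(self.isr, default=0)
        # Compare priority classes (upper four bits of the vector).
        if (vector >> 4) <= max(self.tpr >> 4, in_service >> 4):
            return None
        self.irr.discard(vector)
        self.isr.add(vector)
        self.if_flag = False  # the core enters the handler with interrupts off
        return vector

    def eoi(self):
        # End of interrupt: retire the highest in-service vector.
        if self.isr:
            self.isr.discard(max(self.isr))
```

    In this model a higher-priority vector that arrives before the sti sits in the IRR and is delivered as soon as if_flag is set again, while a lower-priority vector stays in the IRR until the TPR is lowered.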

    None will be lost. All of the examples that I have given so far have
    been for level-triggered interrupts. So, to put your mind at ease, I'll
    quickly describe the protocol for edge-triggered interrupts.

    When the OS receives a level-triggered interrupt, it sends the EOI after
    a driver-specific ISR has claimed it, or after it hits the end of the
    ISR chain. This causes the I/O APIC to re-sample the interrupt and send
    another one if the line is still active.

    Edge-triggered interrupts can't work this way. So the OS sends the EOI
    after raising IRQL and before issuing the STI instruction, which is
    before it calls the device-specific ISR chain. That way any new edge
    triggered events will be accepted at the Local APIC. Duplicate
    edge-triggered interrupt events may be collapsed into a single
    edge-triggered event. But that doesn't matter because we call every ISR
    on the chain every time the interrupt occurs. Your ISR may be called
    when your device didn't interrupt. But it will definitely be called if
    it did interrupt.
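
    The difference between the two protocols can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Card, service_level_triggered and service_edge_triggered are hypothetical names, each "ISR" is reduced to checking and clearing a pending flag, and the EOI is represented only by a comment.

```python
from dataclasses import dataclass


@dataclass
class Card:
    name: str
    pending: bool  # True while the card holds its interrupt condition


def service_level_triggered(chain):
    """Walk the chain until one ISR claims, EOI, re-sample the line, repeat."""
    calls = []
    while any(card.pending for card in chain):  # the line is still active
        for card in chain:
            calls.append(card.name)
            if card.pending:
                card.pending = False  # ISR clears its device and returns TRUE
                break
        # EOI goes here; the loop condition models the I/O APIC re-sampling.
    return calls


def service_edge_triggered(chain):
    """EOI is sent before the chain runs, then every ISR is called once."""
    calls = []
    for card in chain:
        calls.append(card.name)  # called whether or not this card interrupted
        if card.pending:
            card.pending = False
    return calls
```

    With card3 and card4 sharing a level-triggered input and both pending, the call order comes out as card3, card3, card4, which matches the flow in my earlier message: card3 claims the first interrupt, then says "it wasn't me" on the second pass before card4 claims it.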

    Jake Oshins
    Windows Kernel Group Interrupt Guy

    This posting is provided "AS IS" with no warranties, and confers no
    rights.


    -----Original Message-----
    Subject: RE: interrupt handshaking - was other crap
    From: "Moreira, Alberto" <xxxxx@compuware.com>
    Date: Fri, 13 Dec 2002 10:19:06 -0500
    X-Message-Number: 8

    If I understand this correctly, the OS reflects the IRQL in the APIC's
    priority register. Higher priority interrupts will be masked out until the
    OS issues the sti. Lower priority interrupts will be masked out until the OS
    acknowledges the interrupt. There's a window in there where even a higher
    IRQL interrupt won't get through, and another window where no lower priority
    interrupts will get through. This may or may not be a problem, depending on
    the application and on the interrupt volume.


    Alberto.



    -----Original Message-----
    From: Jake Oshins [mailto:xxxxx@windows.microsoft.com]
    Sent: Thursday, December 12, 2002 11:03 PM
    To: NT Developers Interest List
    Subject: [ntdev] RE: interrupt handshaking - was other crap


    You ask for a lot. But, in order to put this to bed, here it is.

    <stuff deleted>
  • Jake_OshinsJake_Oshins Member Posts: 1,058
    I don't completely agree. The serial port FIFO example given below is
    spurious. The problem with a serial port with no FIFO is that the data
    may be overwritten so quickly by new incoming data that you would need
    fantastically small interrupt latency to handle it at 115200 BAUD.

    With level-triggered devices (see my earlier messages) you end up doing
    exactly the sort of "queuing" that you're talking about just by holding
    your interrupt in the asserted state. The OS will call your ISR
    repeatedly until you finally release the signal. (In PCI devices, this
    is the INTx# signal.)

    With edge-triggered devices, your ISR must be able to handle every event
    that your device currently has pending, since it may only be called
    once. This may or may not involve internal queuing. It may just
    involve setting individual status bits for each class of event.
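
    As a minimal sketch of the status-bit approach (Python, with entirely hypothetical names: EdgeDevice, the three bit constants, and the handlers dictionary are assumptions made for illustration, not any real device or driver interface), an ISR for an edge-triggered device might read and clear the whole status register once and then handle every class of event it finds:

```python
RX_READY = 0x1
TX_DONE = 0x2
OVERRUN = 0x4


class EdgeDevice:
    """Toy device that latches one status bit per class of event."""

    def __init__(self):
        self.status = 0

    def raise_event(self, bit):
        # Two events of the same class collapse into a single status bit.
        self.status |= bit

    def read_and_clear_status(self):
        status, self.status = self.status, 0
        return status


def isr(device, handlers):
    """Handle every pending event in one call, since there may be only one call."""
    status = device.read_and_clear_status()
    if status == 0:
        return False  # "it wasn't me"
    for bit, handler in handlers.items():
        if status & bit:
            handler()
    return True
```

    Here two receive events and one overrun that occur before the ISR runs result in a single ISR call that services both classes of event, and a second call finds nothing pending.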

    Jake Oshins
    Windows Kernel Group Interrupt Guy

    This posting is provided "AS IS" with no warranties, and confers no
    rights.

    -----Original Message-----
    Subject: Re: interrupt handshaking - was other crap
    From: "Moreira, Alberto" <xxxxx@compuware.com>
    Date: Fri, 13 Dec 2002 11:34:23 -0500
    X-Message-Number: 15

    So, we agree: if the peripheral generates interrupts fast enough, they'll
    back up unless the peripheral can queue them. That's been my point from the
    start. And I agree that beating the CPU with IRQs is not good - interrupts
    should be handled by peripheral processors. The best machines I've ever
    worked with didn't pass on peripheral interrupts to the CPU, in fact, I/O
    was not an issue for the kernel but for the peripheral processors.

    Alberto.


    -----Original Message-----
    From: Doug [mailto:xxxxx@hotmail.com]
    Sent: Friday, December 13, 2002 11:24 AM
    To: NT Developers Interest List
    Subject: [ntdev] Re: interrupt handshaking - was other crap


    A higher priority IRQ may be slightly delayed (and I mean slightly) when the
    IRQ routine clears out a lower priority IRQ from the APIC, but as soon as
    the higher priority IRQ makes it through the APIC, NT should get interrupted
    and swap out any lower IRQL ISR that is running to allow the higher priority
    IRQL's ISR to run. Then it is up to the DD writer to be as efficient as
    possible.

    The only way an IRQ could get lost on a peripheral is if the Peripheral
    INT->INTA#->APIC IRQ->CPU IRQ->ISR->Clear Peripheral INT is not handled
    before the next one is generated by the peripheral. If that is the case, the
    peripheral should employ some sort of queueing to handle the latency. For
    example, this is why UARTs started having 16 byte FIFO queues, 256 byte
    queues, etc. Beating the CPU with IRQs is not good for the CPU or the
    peripheral.

    Doug
  • James_AntogniniJames_Antognini Member Posts: 263
    From what I understand of chained DMA (having just read a little about
    it on the Web), it is indeed an outboard engine. That is, it has its own
    processor and can move data to/from memory of the host computer. The
    S390 channel engine is probably more powerful (it should be, since it
    costs a lot more), e.g., it accepts channel programs (real program entities,
    that is), can disconnect from and reconnect (upon being presented with
    an interrupt) to a device, can present "interim" interrupts (before the
    I/O required by the channel program has completed but at a point when
    some subset of data are written/read) and probably has a wider I/O path
    (8 bytes?). I'm sure there are other things in S390 I don't know about
    (it's been 10+ years since I did stuff like channel programming), but I
    think those capture the essence. That said, a chained DMA engine is in
    the same ballpark as an S390 engine. I am impressed, because for a long
    time a definite weakness of x86 architecture was the I/O aspect. Things
    have moved on.

    --
    If replying by e-mail, please remove "nospam." from the address.

    James Antognini
  • DougDoug Member Posts: 83
    Sorry, but I disagree with your disagreement :-).

    If a UART had no FIFO it would
    a) Interrupt on every character received, possibly flooding the processor
    with interrupts at high baud rates.
    b) Lose characters if the interrupt is not handled before the next
    character comes in to the UART.

    With a FIFO you can
    a) Interrupt after X number of characters reducing the number of interrupts
    to the processor
    b) Queue up characters so they are not lost by a delay in emptying a queue.

    This does not really have anything to do with edge versus level triggered
    interrupts. You cannot queue anything if your queue or buffer is only one
    character. That is not a queue - it is a register. I will agree that level
    triggered interrupts are MUCH better than edge triggered though.
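
The difference between a one-character register and a real FIFO can be shown with a small simulation. This is only a sketch under simple assumptions: characters arrive at fixed times, the ISR empties the buffer whenever it runs, and the `simulate` function and all timings are invented for illustration.

```python
def simulate(depth, arrivals, service_times):
    """Count characters received and lost by a receiver with `depth` slots.

    arrivals: times (in microseconds) at which characters arrive.
    service_times: times at which the ISR runs and empties the buffer.
    """
    events = [(t, 1) for t in arrivals] + [(t, 0) for t in service_times]
    events.sort()  # at equal times, service (0) sorts before arrival (1)
    buffered = lost = received = 0
    for _, kind in events:
        if kind == 0:
            received += buffered  # ISR drains whatever is buffered
            buffered = 0
        elif buffered < depth:
            buffered += 1
        else:
            lost += 1  # overrun: no room for the new character
    return received, lost


# 16 characters spaced 70 microseconds apart; the ISR runs only twice.
arrivals = [70 * i for i in range(16)]
isr_runs = [500, 1200]

no_fifo = simulate(1, arrivals, isr_runs)
with_fifo = simulate(16, arrivals, isr_runs)
```

With these made-up timings the one-slot receiver keeps 2 characters and loses 14, while the 16-slot FIFO keeps all 16 with the same two ISR invocations.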

    Doug

    "Jake Oshins" wrote in message
    news:xxxxx@ntdev...

    I don't completely agree. The serial port FIFO example given below is
    spurious. The problem with a serial port with no FIFO is that the data
    may be overwritten so quickly by new incoming data that you would need
    fantastically small interrupt latency to handle it at 115200 BAUD.

    With level-triggered devices (see my earlier messages) you end up doing
    exactly the sort of "queuing" that you're talking about just by holding
    your interrupt in the asserted state. The OS will call your ISR
    repeatedly until you finally release the signal. (In PCI devices, this
    is the INTx# signal.)

    With edge-triggered devices, your ISR must be able to handle every event
    that your device currently has pending, since it may only be called
    once. This may or may not involve internal queuing. It may just
    involve setting individual status bits for each class of event.

    Jake Oshins
    Windows Kernel Group Interrupt Guy

    This posting is provided "AS IS" with no warranties, and confers no
    rights.
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    Good Lord, if a company marks up 4x, they do it at their own risk. I once
    was told by a Taiwan executive, "You know how I compute my price ? I take
    the cost and divide by .85".

    And then, you get what you pay for, right ?


    Alberto.


    -----Original Message-----
    From: David J. Craig [mailto:xxxxx@yoshimuni.com]
    Sent: Friday, December 13, 2002 5:17 PM
    To: NT Developers Interest List
    Subject: [ntdev] Re: interrupt handshaking - was other crap


    "Cheap" is EXTREMELY relative. I know of development where even $.05 was
    a problem. Don't forget that for most cases the cost of a part is
    multiplied by four before you see the "suggested retail price". There is
    also the support circuitry, pads, and other components so I think your $10
    CPU costs about $50 in the final product.

    Jake Oshins said that there are no notebooks that have APICs. He didn't say
    why, but since most notebooks use Intel chipsets with few additional
    features there must be some other reason. Could it be cost? That may be
    why Dell notebooks haven't been supporting USB 2.0 on the built-in USB
    port(s). I haven't seen a notebook with a DVD+RW/-RW burner yet either.
    Sony has the only drive with that support that I know of, but it is about
    $100 more than some of the straight DVD+RW burners I see in Sam's Club.

    As to the Sperry, I think the controllers were more complex than you
    mentioned, but probably not as self-standing as the IBM. The only IBM I
    ever worked on was an AN/FSQ-7 and it didn't have any "channels". It didn't
    even have memory protection.

    ----- Original Message -----
    From: "Moreira, Alberto" <xxxxx@compuware.com>
    To: "NT Developers Interest List" <xxxxx@lists.osr.com>
    Sent: Friday, December 13, 2002 1:21 PM
    Subject: [ntdev] Re: interrupt handshaking - was other crap


    > The DCP only handled communications, other peripherals went to the 1100
    > directly. The I/O controllers on the mainframe side were a bit more limited,
    > and as far as I remember they didn't have as much functionality as the IBM
    > Channels.
    >
    > As far as costs go, processors are cheap. Yesterday I went to a presentation
    > of a NUON 4x VLIW microprocessor: $10 or less. And multiple PCI buses
    > allow for open connectivity: the interrupt stops here!
    >
    >
    > Alberto.
    >
    >
    >
    > -----Original Message-----
    > From: David J. Craig [mailto:xxxxx@yoshimuni.com]
    > Sent: Friday, December 13, 2002 1:02 PM
    > To: NT Developers Interest List
    > Subject: [ntdev] Re: interrupt handshaking - was other crap
    >
    >
    > Yes, but look at the costs involved in having I/O channel controllers that
    > are full blown multiprocessor systems themselves.
    >
    > The newer versions of EIDE/ATAPI with full DMA and even the old SCSI-2
    > standard allows some of the capabilities of the mainframes on the PC. Not
    > quite as complete an offload of I/O processing. I remember the Sperry
    > 1100/60/90 series where the computer terminals were attached to a DCP that
    > had multiple processors, mass storage, and a full OS just to offload the
    > communications overhead. The mainframe had from one to four processors, but
    > they were not true SMP. Some I/O devices could only be reached by each pair
    > of CPUs, so if a request needed to be issued to an I/O controller on the
    > other pair, a pass off was required.
    >
    > One thing that the "Channel engines" don't have to contend with in the
    > mainframe world is the variety of devices that can be attached. The
    > mainframe companies design the "engine" and all the devices that can be
    > attached. Any third-party hardware company that wants to sell into that
    > market has to be compatible to a greater degree than in the PC world. There
    > is more standardization with the Microsoft/Intel hardware design specs, but
    > it is not nearly as restrictive. You can fix a lot of the problems on the
    > PC with a driver, but Microsoft is trying to force most of the market to
    > write to more restrictive standards than before. One example is the first
    > flash memory readers that required drivers. The newest versions are "mass
    > storage compliant" so the standard Microsoft drivers can access the devices.
    > That compliance is more expensive in that enough memory must be internal to
    > the device to handle two flash memory blocks which are usually much larger
    > than the 512 byte sector addressing the OS uses.
    >
    > ----- Original Message -----
    > From: "James Antognini" <xxxxx@mindspring.nospam.com>
    > Newsgroups: ntdev
    > To: "NT Developers Interest List" <xxxxx@lists.osr.com>
    > Sent: Friday, December 13, 2002 12:26 PM
    > Subject: [ntdev] Re: interrupt handshaking - was other crap
    >
    >
    > > S390 is like what you describe: "Channel engines" (something like
    > > peripheral CPUs) do most of the talking to devices. In my opinion, the
    > > x86 architecture, the whole approach, is weaker in the domain of I/O.
    > >
    > > --
    > > If replying by e-mail, please remove "nospam." from the address.
    > >
    > > James Antognini
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    If there are lots of agents sharing that interrupt line, level triggering may
    not be feasible for electrical reasons. But to me, "interrupt" is a
    condition where a peripheral needs attention by the processor, and that
    attention must usually be granted within a time window or the peripheral
    risks losing data. People use buffering - fifos and what not - to stretch
    those windows and reduce the frequency that the peripheral needs the
    processor's attention. But at some point in time that attention must be
    there, or else.

    Alberto.


    -----Original Message-----
    From: Doug [mailto:xxxxx@hotmail.com]
    Sent: Monday, December 16, 2002 8:51 AM
    To: NT Developers Interest List
    Subject: [ntdev] Re: interrupt handshaking - was other crap


    Sorry, but I disagree with your disagreement :-).

    If a UART had no FIFO it would
    a) Interrupt on every character received, possibly flooding the processor
    with interrupts at high baud rates.
    b) Lose characters if the interrupt is not handled before the next
    character comes in to the UART.

    With a FIFO you can
    a) Interrupt after X number of characters reducing the number of interrupts
    to the processor
    b) Queue up characters so they are not lost by a delay in emptying a queue.

    This does not really have anything to do with edge versus level triggered
    interrupts. You cannot queue anything if your queue or buffer is only one
    character. That is not a queue - it is a register. I will agree that level
    triggered interrupts are MUCH better than edge triggered though.

    Doug

    "Jake Oshins" <xxxxx@windows.microsoft.com> wrote in message
    news:xxxxx@ntdev...

    I don't completely agree. The serial port FIFO example given below is
    spurious. The problem with a serial port with no FIFO is that the data
    may be overwritten so quickly by new incoming data that you would need
    fantastically small interrupt latency to handle it at 115200 BAUD.

    With level-triggered devices (see my earlier messages) you end up doing
    exactly the sort of "queuing" that you're talking about just by holding
    your interrupt in the asserted state. The OS will call your ISR
    repeatedly until you finally release the signal. (In PCI devices, this
    is the INTx# signal.)

    With edge-triggered devices, your ISR must be able to handle every event
    that your device currently has pending, since it may only be called
    once. This may or may not involve internal queuing. It may just
    involve setting individual status bits for each class of event.

    Jake Oshins
    Windows Kernel Group Interrupt Guy

    This posting is provided "AS IS" with no warranties, and confers no
    rights.



  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    So, let's see. 115,200 bits per second is 14,400 8-bit characters per
    second. If one interrupt per character will flood the system, the peak
    interrupt rate of the system is 14,400 interrupts per second. That gives us
    around 70 microseconds per interrupt. At one gigahertz, that is 10^9 / 10^6
    = 1000 instructions per microsecond (assuming one instruction per cycle), so
    that's around 70,000 instructions. Even if we use a fraction of the CPU power
    to handle the interrupt, say 7,000 instructions, that's a lot of instructions
    to handle one character, no? I'd expect a bit more from a modern processor.
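
The arithmetic above can be checked with a few lines of Python. This is a back-of-the-envelope sketch: the one-instruction-per-cycle figure is the same simplifying assumption the post makes, and the 10-bits-per-character case (8 data bits plus start and stop bits, as in common 8N1 framing) is an added comparison that the post does not mention.

```python
BAUD = 115_200              # bits per second on the wire
CLOCK_HZ = 1_000_000_000    # 1 GHz, assuming one instruction per cycle


def per_char_budget(bits_per_char):
    """Return (chars/sec, microseconds per char, instructions per char)."""
    chars_per_sec = BAUD / bits_per_char
    usec_per_char = 1_000_000 / chars_per_sec
    instr_per_char = CLOCK_HZ / chars_per_sec
    return chars_per_sec, usec_per_char, instr_per_char


raw = per_char_budget(8)      # the post's figure: 8 bits per character
framed = per_char_budget(10)  # assumption: start + 8 data + stop bits

print(f"8 bits/char:  {raw[0]:.0f} chars/s, {raw[1]:.1f} us, {raw[2]:.0f} instr")
print(f"10 bits/char: {framed[0]:.0f} chars/s, {framed[1]:.1f} us, {framed[2]:.0f} instr")
```

Counting framing bits stretches the per-character window from about 69 to about 87 microseconds, so the post's "around 70 microseconds" is the tighter of the two estimates.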

    Alberto.


    -----Original Message-----
    From: Jake Oshins [mailto:xxxxx@windows.microsoft.com]
    Sent: Saturday, December 14, 2002 2:29 PM
    To: NT Developers Interest List
    Subject: [ntdev] RE: interrupt handshaking - was other crap


    I don't completely agree. The serial port FIFO example given below is
    spurious. The problem with a serial port with no FIFO is that the data
    may be overwritten so quickly by new incoming data that you would need
    fantastically small interrupt latency to handle it at 115200 BAUD.

    With level-triggered devices (see my earlier messages) you end up doing
    exactly the sort of "queuing" that you're talking about just by holding
    your interrupt in the asserted state. The OS will call your ISR
    repeatedly until you finally release the signal. (In PCI devices, this
    is the INTx# signal.)

    With edge-triggered devices, your ISR must be able to handle every event
    that your device currently has pending, since it may only be called
    once. This may or may not involve internal queuing. It may just
    involve setting individual status bits for each class of event.

    Jake Oshins
    Windows Kernel Group Interrupt Guy

    This posting is provided "AS IS" with no warranties, and confers no
    rights.

    -----Original Message-----
    Subject: Re: interrupt handshaking - was other crap
    From: "Moreira, Alberto" <xxxxx@compuware.com>
    Date: Fri, 13 Dec 2002 11:34:23 -0500
    X-Message-Number: 15

    So, we agree: if the peripheral generates interrupts fast enough, they'll
    back up unless the peripheral can queue them. That's been my point from the
    start. And I agree that beating the CPU with IRQs is not good - interrupts
    should be handled by peripheral processors. The best machines I've ever
    worked with didn't pass on peripheral interrupts to the CPU, in fact, I/O
    was not an issue for the kernel but for the peripheral processors.

    Alberto.


    -----Original Message-----
    From: Doug [mailto:xxxxx@hotmail.com]
    Sent: Friday, December 13, 2002 11:24 AM
    To: NT Developers Interest List
    Subject: [ntdev] Re: interrupt handshaking - was other crap


    A higher priority IRQ may be slightly delayed (and I mean slightly) when the
    IRQ routine clears out a lower priority IRQ from the APIC, but as soon as
    the higher priority IRQ makes it through the APIC, NT should get interrupted
    and swap out any lower IRQL ISR that is running to allow the higher priority
    IRQL's ISR to run. Then it is up to the DD writer to be as efficient as
    possible.

    The only way an IRQ could get lost on a peripheral is if the Peripheral
    INT->INTA#->APIC IRQ->CPU IRQ->ISR->Clear Peripheral INT is not handled
    before the next one is generated by the peripheral. If that is the case, the
    peripheral should employ some sort of queueing to handle the latency. For
    example, this is why UARTs started having 16 byte FIFO queues, 256 byte
    queues, etc. Beating the CPU with IRQs is not good for the CPU or the
    peripheral.

    Doug



  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    Well, I teach computer architecture, so, I'd better have my act sort of
    together or my students will shoot me. :-)

    We may be talking about different things: you're looking at the processor
    and the OS, I'm looking at the I/O subsystem. If the OS uses an interrupt
    gate, interrupts to the processor are indeed masked, but not at the APIC.
    Indeed, you're right: if I get a higher priority peripheral interrupt before
    the processor's ISR issues that STI, the APIC will still handle it and another
    processor can catch it. However, if interrupts come at a fast enough pace,
    they'll back up at the peripheral because of the global latency issue. For
    lower priority interrupts, the local APIC will mask them, so issuing the
    STI is not enough: the higher priority interrupt must be cleared before a
    lower priority interrupt can get through.

    Whether an interrupt will be lost depends entirely on the nature of the
    peripheral subsystem. If my peripheral has a deadline for that interrupt to
    be cleared, I will lose data unless I have buffering, and I will lose data
    if the interrupt latency is larger than what my traffic needs: in the end it's
    an M/M/1 queue. Point being, the issue of losing interrupts cannot be totally
    solved at the processor level; it must take the nature of the particular I/O
    subsystem into account. So, depending on the peripheral, it may not matter
    whether the local APIC accepts a higher priority interrupt while the
    processor is running with interrupts disabled! Some interrupts are only
    cleared when the ISR pokes at the peripheral, and not doing that fast enough
    may lead to traffic backing up at the processor. So, when you say interrupts
    won't be lost, what you're really saying is, the processor won't lose any
    interrupts that the APIC sees - but those are already the survivors. In the
    end it's a sequential chain: peripheral, I/O APIC, Local APIC, processor,
    ISR, I/O, peripheral, and the total latency as far as the peripheral sees is
    the sum of them all. Unless that latency is smaller than what the peripheral
    can tolerate, data will be lost.
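
For readers who want numbers behind the M/M/1 remark, the standard steady-state formulas are easy to evaluate. The sketch below assumes Poisson arrivals and exponentially distributed service times, which real interrupt traffic may not follow; the function name `mm1` and the example rates are chosen only for illustration.

```python
def mm1(arrival_rate, service_rate):
    """Steady-state M/M/1 figures for rates given in events per second.

    Returns (utilization, mean number in system, mean time in system in seconds).
    """
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrivals outpace service")
    rho = arrival_rate / service_rate                 # utilization
    mean_in_system = rho / (1 - rho)                  # L = rho / (1 - rho)
    mean_time = 1 / (service_rate - arrival_rate)     # W = 1 / (mu - lambda)
    return rho, mean_in_system, mean_time


# Hypothetical example: 14,400 interrupts/s arriving at a path that can
# complete 20,000 interrupts/s end to end.
rho, in_system, wait = mm1(14_400, 20_000)
print(f"utilization {rho:.2f}, mean in system {in_system:.2f}, "
      f"mean latency {wait * 1e6:.1f} us")
```

The mean latency grows without bound as the arrival rate approaches the service rate, which is the formal version of interrupts "backing up" at the peripheral.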

    I still believe that interrupts should be very short lived things: get one,
    get control into the ISR as soon as possible, let the ISR clear the
    interrupt as fast as possible without being interrupted, enqueue the
    interrupt and get out - let the DPC handle the rest. I believe IRQL should
    be handled entirely by the APIC hardware, with no need for the OS to bother
    itself with it. I also believe that if APICs had hardware interrupt queueing,
    the net I/O throughput of the system would go up, making the machine a
    better server.

    -----Original Message-----
    From: Jake Oshins [mailto:xxxxx@windows.microsoft.com]
    Sent: Saturday, December 14, 2002 2:14 PM
    To: NT Developers Interest List
    Subject: [ntdev] RE: interrupt handshaking - was other crap


    I'm surprised at you, Alberto. As much as I disagree with your
    philosophy regarding the way drivers should be written, your
    understanding of processor architecture is usually dead-on.

    All interrupts are masked at the processor core before the STI
    instruction is issued. But the Local APIC is still accepting interrupt
    messages from the I/O APICs. This means that a higher priority
    interrupt will be accepted, including setting the TMR and IRR bits. As
    soon as the processor issues the STI, the processor will be interrupted
    by the Local APIC with that high-priority interrupt, and the Local APIC
    will then set the ISR bit.

    The same will happen for lower-priority interrupts, except that they
    won't be delivered to the processor until the OS writes a lower-IRQL
    value into the TPR register in the Local APIC.

    None will be lost. All of the examples that I have given so far have
    been for level-triggered interrupts. So, to put your mind at ease, I'll
    quickly describe the protocol for edge-triggered interrupts.

    When the OS receives a level-triggered interrupt, it sends the EOI after
    a driver-specific ISR has claimed it, or after it hits the end of the
    ISR chain. This causes the I/O APIC to re-sample the interrupt and send
    another one if the line is still active.

    Edge-triggered interrupts can't work this way. So the OS sends the EOI
    after raising IRQL and before issuing the STI instruction, which is
    before it calls the device-specific ISR chain. That way any new edge
    triggered events will be accepted at the Local APIC. Duplicate
    edge-triggered interrupt events may be collapsed into a single
    edge-triggered event. But that doesn't matter because we call every ISR
    on the chain every time the interrupt occurs. Your ISR may be called
    when your device didn't interrupt. But it will definitely be called if
    it did interrupt.
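
The acceptance and delivery behaviour described in this message can be mimicked with a toy model. This is a heavily simplified sketch, not a description of real hardware or of the NT kernel: the `ToyLocalApic` class is invented for illustration, the TMR is left out, and the TPR is stored directly as a priority class (vector >> 4). The vectors 0xa1 and 0x71 are borrowed from the example earlier in the thread.

```python
class ToyLocalApic:
    """Toy model of IRR/ISR/TPR bookkeeping; real hardware tracks more state."""

    def __init__(self):
        self.irr = set()  # vectors accepted but not yet delivered
        self.isr = set()  # vectors delivered but not yet EOI'd
        self.tpr = 0      # task priority, kept as a priority class (simplification)

    def accept(self, vector):
        # Accepting a vector that is already pending changes nothing,
        # which is how duplicate edge events collapse into one.
        self.irr.add(vector)

    def deliver(self, interrupts_enabled):
        """Return the vector handed to the core, or None if nothing is deliverable."""
        if not interrupts_enabled or not self.irr:
            return None
        best = max(self.irr)
        floor = max([self.tpr] + [v >> 4 for v in self.isr])
        if best >> 4 <= floor:
            return None  # masked by TPR or by a higher in-service vector
        self.irr.remove(best)
        self.isr.add(best)
        return best

    def eoi(self):
        if self.isr:
            self.isr.remove(max(self.isr))


apic = ToyLocalApic()
apic.accept(0xA1)
first = apic.deliver(True)            # core takes vector 0xa1
apic.tpr = 0xA                        # OS raises the priority threshold
apic.accept(0x71)
apic.accept(0x71)                     # duplicate collapses into one pending bit
before_sti = apic.deliver(False)      # interrupts still disabled at the core
after_sti = apic.deliver(True)        # enabled, but 0x71 is below the threshold
pending = set(apic.irr)               # 0x71 is held, not lost
apic.eoi()
apic.tpr = 0                          # OS lowers the threshold again
second = apic.deliver(True)           # now 0x71 is delivered
```

The point of the model is that the lower-priority vector sits in the pending set until the threshold drops; it is delayed rather than dropped, and the duplicate request is merged rather than queued.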

    Jake Oshins
    Windows Kernel Group Interrupt Guy

    This posting is provided "AS IS" with no warranties, and confers no
    rights.


    -----Original Message-----
    Subject: RE: interrupt handshaking - was other crap
    From: "Moreira, Alberto" <xxxxx@compuware.com>
    Date: Fri, 13 Dec 2002 10:19:06 -0500
    X-Message-Number: 8

    If I understand this correctly, the OS reflects the IRQL in the APIC's
    priority register. Higher priority interrupts will be masked out until the
    OS issues the sti. Lower priority interrupts will be masked out until the OS
    acknowledges the interrupt. There's a window in there where even a higher
    IRQL interrupt won't get through, and another window where no lower priority
    interrupts will get through. This may or may not be a problem, depending on
    the application and on the interrupt volume.


    Alberto.



    -----Original Message-----
    From: Jake Oshins [mailto:xxxxx@windows.microsoft.com]
    Sent: Thursday, December 12, 2002 11:03 PM
    To: NT Developers Interest List
    Subject: [ntdev] RE: interrupt handshaking - was other crap


    You ask for a lot. But, in order to put this to bed, here it is.

    <stuff deleted>


  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    > Whether an interrupt will be lost depends entirely on the nature of the
    > peripheral subsystem. If my peripheral has a deadline for that interrupt to
    > be cleared

    PCI devices must not have such a deadline.

    >, I will lose data unless I have buffering

    ...or chain DMA. The on-card FIFOs are to "buffer out" the PCI bus
    grant latency only.

    > I still believe that interrupts should be very short lived things: get one,
    > get control into the ISR as soon as possible

    NT works this way.

    Max
  • David_J._CraigDavid_J._Craig Member Posts: 1,885
    A Microsoft OS has a $1.50 cost structure for every $100 charged according
    to an article I read recently. I guess they had to release some of that
    info because of the anti-trust case.

    In the hardware arena where items are patented, the 4x is "standard". I saw
    it at two different companies. Sony seems to have something close to that
    in most of their consumer products. When products become commodities, I
    have seen quite a disparity in pricing from Sam's Club, Sears, and big name
    labels. Look at Intel products. If their stuff isn't at least 4x, I would
    be very surprised.

    ----- Original Message -----
    From: "Moreira, Alberto" <xxxxx@compuware.com>
    To: "NT Developers Interest List" <xxxxx@lists.osr.com>
    Sent: Monday, December 16, 2002 9:09 AM
    Subject: [ntdev] Re: interrupt handshaking - was other crap


    > Good Lord, if a company marks up 4x, they do it at their own risk. I once
    > was told by a Taiwan executive, "You know how I compute my price ? I take
    > the cost and divide by .85".
    >
    > And then, you get what you pay for, right ?
    >
    >
    > Alberto.
    >
    >
    > -----Original Message-----
    > From: David J. Craig [mailto:xxxxx@yoshimuni.com]
    > Sent: Friday, December 13, 2002 5:17 PM
    > To: NT Developers Interest List
    > Subject: [ntdev] Re: interrupt handshaking - was other crap
    >
    >
    > "Cheap" is EXTREMELY relative. I know of development where even $.05 was
    > a problem. Don't forget that for most cases the cost of a part is
    > multiplied by four before you see the "suggested retail price". There is
    > also the support circuitry, pads, and other components so I think your $10
    > CPU costs about $50 in the final product.
    >
    > Jake Oshins said that there are no notebooks that have APICs. He didn't say
    > why, but since most notebooks use Intel chipsets with few additional
    > features there must be some other reason. Could it be cost? That may be
    > why Dell notebooks haven't been supporting USB 2.0 on the built-in USB
    > port(s). I haven't seen a notebook with a DVD+RW/-RW burner yet either.
    > Sony has the only drive with that support that I know of, but it is about
    > $100 more than some of the straight DVD+RW burners I see in Sam's Club.
    >
    > As to the Sperry, I think the controllers were more complex than you
    > mentioned, but probably not as self-standing as the IBM. The only IBM I
    > ever worked on was an AN/FSQ-7 and it didn't have any "channels". It didn't
    > even have memory protection.
    >
    > ----- Original Message -----
    > From: "Moreira, Alberto" <xxxxx@compuware.com>
    > To: "NT Developers Interest List" <xxxxx@lists.osr.com>
    > Sent: Friday, December 13, 2002 1:21 PM
    > Subject: [ntdev] Re: interrupt handshaking - was other crap
    >
    >
    > > The DCP only handled communications, other peripherals went to the 1100
    > > directly. The I/O controllers on the mainframe side were a bit more limited,
    > > and as far as I remember they didn't have as much functionality as the IBM
    > > Channels.
    > >
    > > As far as costs go, processors are cheap. Yesterday I went to a presentation
    > > of a NUON 4x VLIW microprocessor: $10 or less. And multiple PCI buses
    > > allow for open connectivity: the interrupt stops here!
    > >
    > >
    > > Alberto.
    > >
    > >
    > >
    > > -----Original Message-----
    > > From: David J. Craig [mailto:xxxxx@yoshimuni.com]
    > > Sent: Friday, December 13, 2002 1:02 PM
    > > To: NT Developers Interest List
    > > Subject: [ntdev] Re: interrupt handshaking - was other crap
    > >
    > >
    > > Yes, but look at the costs involved in having I/O channel controllers that
    > > are full blown multiprocessor systems themselves.
    > >
    > > The newer versions of EIDE/ATAPI with full DMA and even the old SCSI-2
    > > standard allows some of the capabilities of the mainframes on the PC. Not
    > > quite as complete an offload of I/O processing. I remember the Sperry
    > > 1100/60/90 series where the computer terminals were attached to a DCP that
    > > had multiple processors, mass storage, and a full OS just to offload the
    > > communications overhead. The mainframe had from one to four processors, but
    > > they were not true SMP. Some I/O devices could only be reached by each pair
    > > of CPUs, so if a request needed to be issued to an I/O controller on the
    > > other pair, a pass off was required.
    > >
    > > One thing that the "Channel engines" don't have to contend with in the
    > > mainframe world is the variety of devices that can be attached. The
    > > mainframe companies design the "engine" and all the devices that can be
    > > attached. Any third-party hardware company that wants to sell into that
    > > market has to be compatible to a greater degree than in the PC world. There
    > > is more standardization with the Microsoft/Intel hardware design specs, but
    > > it is not nearly as restrictive. You can fix a lot of the problems on the
    > > PC with a driver, but Microsoft is trying to force most of the market to
    > > write to more restrictive standards than before. One example is the first
    > > flash memory readers that required drivers. The newest versions are "mass
    > > storage compliant" so the standard Microsoft drivers can access the devices.
    > > That compliance is more expensive in that enough memory must be internal to
    > > the device to handle two flash memory blocks which are usually much larger
    > > than the 512 byte sector addressing the OS uses.
    > >
    > > ----- Original Message -----
    > > From: "James Antognini" <xxxxx@mindspring.nospam.com>
    > > Newsgroups: ntdev
    > > To: "NT Developers Interest List" <xxxxx@lists.osr.com>
    > > Sent: Friday, December 13, 2002 12:26 PM
    > > Subject: [ntdev] Re: interrupt handshaking - was other crap
    > >
    > >
    > > > S390 is like what you describe: "Channel engines" (something like
    > > > peripheral CPUs) do most of the talking to devices. In my opinion, the
    > > > x86 architecture, the whole approach, is weaker in the domain of I/O.
    > > >
    > > > --
    > > > If replying by e-mail, please remove "nospam." from the address.
    > > >
    > > > James Antognini