MSI-X Latency

Alex_Chang · October 8, 2010, 7:02pm

Hi,

I have a question regarding a Storport miniport driver. I measured the time between our device writing an MSI-X vector and the miniport driver’s ISR being called, and it is around 45 us, which is unacceptably long. Why is that? And is there any way to shorten it in a Storport miniport driver?

Thanks.

Tim_Roberts · October 8, 2010, 7:18pm

xxxxx@yahoo.com wrote:

> Hi,
>
> I have a question regarding a Storport miniport driver. I measured the time between our device writing an MSI-X vector and the miniport driver’s ISR being called, and it is around 45 us, which is unacceptably long. Why is that? And is there any way to shorten it in a Storport miniport driver?

How did you measure that? Did you write a register and watch a scope to
see when it happened?

--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

James_Harper · October 8, 2010, 8:28pm

> Hi,
>
> I have a question regarding a Storport miniport driver. I measured the
> time between our device writing an MSI-X vector and the miniport driver’s
> ISR being called, and it is around 45 us, which is unacceptably long. Why
> is that? And is there any way to shorten it in a Storport miniport driver?

Can you measure the minimum and maximum time values too?

What is the priority of your interrupt? What other interrupts are there
of higher priority and how often are they being hit?

James
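
James's request for minimum and maximum values, not just a typical figure, can be handled offline once the per-interrupt latencies have been exported from the measurement device. The following is a minimal C sketch under that assumption. `latency_stats` and `summarize_latency` are hypothetical names chosen for illustration, not part of any Windows or Storport API, and the samples are assumed to be in microseconds.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of a set of latency samples, in microseconds.
 * Hypothetical helper type, not a Windows or Storport structure. */
typedef struct {
    double min_us;
    double max_us;
    double mean_us;
} latency_stats;

/* Compute the minimum, maximum, and arithmetic mean of n samples.
 * The caller must pass at least one sample. */
latency_stats summarize_latency(const double *samples_us, size_t n)
{
    assert(samples_us != NULL && n > 0);

    latency_stats s = { samples_us[0], samples_us[0], 0.0 };
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (samples_us[i] < s.min_us) s.min_us = samples_us[i];
        if (samples_us[i] > s.max_us) s.max_us = samples_us[i];
        sum += samples_us[i];
    }
    s.mean_us = sum / (double)n;
    return s;
}
```

A wide gap between the minimum and the maximum would suggest intermittent interference (for example, other interrupts at the same or a higher level) rather than a fixed dispatch cost.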

Alex_Chang · October 11, 2010, 12:23pm

I use a LeCroy device to measure it, with a register (memory) write issued when my ISR gets called.

Hi James,
How do I find out the priority of my interrupt?

Thanks.

Alex_Chang · October 11, 2010, 1:02pm

My bad. I think the interrupt level is DIRQL.

James_Harper · October 11, 2010, 9:49pm

> My bad. I think the interrupt level is DIRQL.

I assume that, in these days of MSI, DIRQL still means that you can
be interrupted by anything with a higher priority and delayed by
anything with the same priority (e.g., an interrupt at your DIRQL that is
already in progress). I’m not sure what control you have over the
ordering of DIRQLs, though. I don’t think you can say “my interrupt is
high importance”, etc. If the people who know such things don’t notice
your question, I can make an obviously false statement and they’re sure
to jump in and correct me.

James

Jake_Oshins · October 13, 2010, 12:54am

To confirm, James, you’re right. You have no control over which priority
your device gets assigned.

You might find an old whitepaper on the net from an early beta of Longhorn
which implied that you could have some control. This was true at that stage
in Longhorn. When I started working on VMs, another guy picked up all of my
code and simplified out the ability to express a priority, mostly on the
assumption that it would become meaningless as everybody demanded high
priority for their driver. By the time Windows Vista shipped, it was still
possible to express a preference, but that preference is ignored.

Jake Oshins
Hyper-V I/O Architect (former interrupt guy)
Windows Kernel Group

This post implies no warranties and confers no rights.

Alex_Chang · October 22, 2010, 6:14pm

Thanks a lot, guys.
I have finally narrowed down what causes the long latency: when my application calls CreateFile and ReadFile to read from the device asynchronously (i.e., with FILE_FLAG_OVERLAPPED specified), the latency drops below 2 us. With synchronous I/O, it can be anywhere between 3 and 90 us. Can somebody explain this?

Doron_Holan · October 22, 2010, 6:17pm

When the file is not opened as overlapped, the I/O manager provides serialization, and that requires a lock. My guess is that the increased latency you are seeing comes from acquiring that lock.

d

NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Pavel_A1 · October 22, 2010, 7:10pm

"Doron Holan" wrote in message news:xxxxx@ntdev…
> When the file is not opened as overlapped, the io manager provides
> serialization and that requires a lock. My guess is that the increased
> latency you are seeing is acquiring that lock
>
> d

How can this lock (on the file object?) affect the time between the MSI request
and entry into the ISR?
Also, the OP’s driver is a Storport miniport: is the “ISR” in that context
really an ISR, or a DPC?
--pa

Doron_Holan · October 22, 2010, 7:19pm

Maybe I misread his question. It sounded like he was basing his time measurements on API calls, not hardware events.

d

dent from a phpne with no keynoard

Alex_Chang · October 22, 2010, 8:11pm

Doron,

You did not misread my question. I am talking about the interrupt latency (i.e., from the MSI write that triggers the interrupt to the invocation of my ISR), and that latency is accurately measured with a LeCroy device. I am surprised that a flag set at the user-application level can cause this. I really wonder how differently the Windows kernel handles the two cases. The locking you mentioned may make sense; however, doesn’t the kernel also do some locking for asynchronous accesses?

Thanks.

Alex_Chang · October 25, 2010, 12:24pm

Hi Jake,

I have a couple of questions and thought you might have answers to them:

Does the kernel’s interrupt dispatcher know whether the request associated with the interrupt it is currently handling is a synchronous or an asynchronous I/O request?
If it does, does it consider other things before invoking the device’s ISR?

Thank you in advance.

Tim_Roberts · October 25, 2010, 1:28pm

xxxxx@yahoo.com wrote:

> I have a couple of questions and thought you might have answers to them:
>
> Does the kernel’s interrupt dispatcher know whether the request associated with the interrupt it is currently handling is a synchronous or an asynchronous I/O request?
>
> If it does, does it consider other things before invoking the device’s ISR?

The question doesn’t make sense. Interrupts have nothing to do with I/O
requests. The fact that your driver happens to complete an I/O request
during its ISR or ISRDPC is merely an implementation detail of your driver.

When there’s an interrupt, the dispatcher invokes the ISR chain until
someone accepts it. There’s nothing else to consider.

--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Alex_Chang · October 25, 2010, 5:19pm

I agree, Tim, that interrupt latency should not have anything to do with the type of I/O request.
However, when I test it, only one request is issued at a time, so the ISR is always ready for the next request. I am using MSI-X, which means there is no interrupt sharing or ISR chain, if I understand it correctly.

Thanks.

Maxim_S_Shatskih · October 26, 2010, 6:44am

> I am using MSI-X, which means there is no interrupt sharing or ISR chain if I understand it correctly.

With MSI, the hardware itself chooses which CPU to interrupt, not the kernel.

--
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

anton_bassov · October 26, 2010, 9:19am

>> I am using MSI-X, which means there is no interrupt sharing or ISR chain if I understand it correctly.
>
> With MSI, the hardware itself chooses which CPU to interrupt, not the kernel.

For both MSI and “legacy” interrupts, the OS software has to set up certain things: the Message Address and Message Data registers for MSI, and the redirection table for IOAPIC-based interrupts. Both require the same information, such as destination ID, destination mode, and vector. Therefore, interrupts of both types get raised by the hardware the way the OS software has specified. There is no difference whatsoever between MSI and “legacy” interrupts in this respect.

The only difference is how interrupts actually get raised. In the “legacy” case, an interrupt gets raised via an IOAPIC pin that may be shared by multiple devices (the OS may have no chance to do anything about it, since everything depends on the BIOS and motherboard wiring), so the same vector may be shared by multiple devices. Raising interrupts via a simple memory write, rather than signaling them via pins, removes this limitation, so the OS has a chance to give every MSI-capable device its own dedicated vector…

Anton Bassov
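
The Message Address and Message Data fields Anton mentions can be illustrated for x86/x64 systems, where the layout is defined by the Intel architecture manuals rather than by Windows. The sketch below decodes a raw address/data pair into destination ID, destination mode, redirection hint, vector, and delivery mode. `msi_message` and `msi_decode` are hypothetical names used only for illustration. The OS programs these values, and a miniport does not normally read or write them itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded x86/x64 MSI message fields (hypothetical helper type). */
typedef struct {
    uint8_t destination_id;   /* address bits 19:12 */
    uint8_t redirection_hint; /* address bit 3 */
    uint8_t destination_mode; /* address bit 2: 0 = physical, 1 = logical */
    uint8_t vector;           /* data bits 7:0 */
    uint8_t delivery_mode;    /* data bits 10:8: 0 = fixed, 1 = lowest priority */
} msi_message;

/* Decode an MSI address/data pair.
 * Returns 1 on success, or 0 if the address is not in the 0xFEExxxxx
 * range that x86 reserves for interrupt messages. */
int msi_decode(uint32_t address, uint32_t data, msi_message *out)
{
    assert(out != NULL);

    if ((address >> 20) != 0xFEEu) {
        return 0;
    }
    out->destination_id   = (uint8_t)((address >> 12) & 0xFFu);
    out->redirection_hint = (uint8_t)((address >> 3) & 0x1u);
    out->destination_mode = (uint8_t)((address >> 2) & 0x1u);
    out->vector           = (uint8_t)(data & 0xFFu);
    out->delivery_mode    = (uint8_t)((data >> 8) & 0x7u);
    return 1;
}
```

For example, an address of 0xFEE0100C with data 0x0151 decodes to destination ID 1, logical destination mode, redirection hint set, vector 0x51, and lowest-priority delivery. With that combination the hardware is allowed to pick among the processors in the destination set, while a fixed delivery mode with the hint clear targets exactly the processor the OS named.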

Alex_Chang · October 26, 2010, 12:10pm

Completely agreed, Anton.
Again, back to the latency (i.e., from the MSI memory write to the ISR being invoked): who is in charge of when the ISR eventually gets called, after the handshaking between the CPUs and IOAPICs that determines which CPU is supposed to service the interrupt?

Thanks.

Peter_Viscarola_OSR · October 26, 2010, 12:37pm

It should be a straightforward dispatching operation through the kernel, which is why I don’t understand how the latency could change based on whether a File Object indicates synchronous or asynchronous I/O.

With no disrespect intended, I suspect an error in the OP’s methodology. What he’s observing just doesn’t make any sense… at least to me.

Peter
OSR

Maxim_S_Shatskih · October 26, 2010, 1:26pm

> interrupts of both types will get raised by the hardware the way OS software had specified - there is no difference between MSI and “legacy” interrupts in this respect whatsoever.

With MSI, the settings programmed by the OS can be flexible and allow the device to choose which CPU to interrupt.

This (along with the ability to eliminate the IOAPIC altogether) is one of the major advantages of MSI.

--
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com