…
For ordinary (non-threaded) DPCs **(A)**, KeSetImportanceDpc also determines whether KeInsertQueueDpc or IoRequestDpc will immediately begin processing the DPC queue. The following list describes the rules for processing the queue:
Processing of the DPC queue begins ***immediately*** if the DPC is assigned to the current processor and Importance is not equal to LowImportance.
<
What does "immediately" mean above?
Does the DPC start running in the same thread as the one that queued it, or is it just run as soon as possible, which could be any time from now (and can that time be non-deterministic, etc.)?
(A above) Does that explicit qualification ("ordinary DPCs") mean that what was said applies only to ordinary DPCs, or does it also apply to threaded DPCs (TDPCs)?
> 1) Processing of the DPC queue begins ***immediately*** if the DPC is assigned to the current processor and Importance is not equal to LowImportance.
> What does "immediately" mean above?
It means that the DPC will get invoked before the code that runs at low IRQL on the given CPU gets a chance to run. There are two scenarios:
1. Interrupted code ran at low IRQL at the time of the interrupt. In this case the interrupt handler stub will forward execution to the software interrupt 0x41 handler (the one that handles DPCs) before returning control to the interrupted code - KeInsertQueueDpc() does not need to request software interrupt 0x41 under these circumstances.
2. Interrupted code ran at elevated IRQL at the time of the interrupt. In this case KeInsertQueueDpc() will request software interrupt 0x41, which will be kept pending while IRQL is elevated and fire immediately when IRQL drops below DISPATCH_LEVEL. Therefore, in this case execution will return immediately to the interrupted code.
In order to distinguish between these scenarios, one of the very first things the interrupt handler stub does is check the IRQL and save a flag in the PCR…
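To make scenario 2 concrete, here is a minimal sketch of a DPC queued at elevated IRQL staying pending until IRQL drops. All names (g_DemoDpc, DemoDpcRoutine, g_DpcRan) are invented for illustration, and it assumes the default target (current processor) and default importance (Medium) that KeInitializeDpc/KeInsertQueueDpc give you:

```c
#include <ntddk.h>

static KDPC g_DemoDpc;                 // hypothetical demo objects
static volatile LONG g_DpcRan = 0;

_Function_class_(KDEFERRED_ROUTINE)
static VOID DemoDpcRoutine(
    _In_ PKDPC Dpc,
    _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1,
    _In_opt_ PVOID SystemArgument2)
{
    UNREFERENCED_PARAMETER(Dpc);
    UNREFERENCED_PARAMETER(DeferredContext);
    UNREFERENCED_PARAMETER(SystemArgument1);
    UNREFERENCED_PARAMETER(SystemArgument2);
    InterlockedExchange(&g_DpcRan, 1); // runs at DISPATCH_LEVEL
}

VOID DemoQueueWhileElevated(VOID)      // assume called at PASSIVE_LEVEL
{
    KIRQL oldIrql;

    KeInitializeDpc(&g_DemoDpc, DemoDpcRoutine, NULL);

    // Scenario 2: we are at elevated IRQL, so the DPC software
    // interrupt is requested but kept pending.
    KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);
    KeInsertQueueDpc(&g_DemoDpc, NULL, NULL);
    ASSERT(g_DpcRan == 0);             // cannot have run yet on this CPU

    // Lowering IRQL below DISPATCH_LEVEL lets the pending interrupt
    // fire; the DPC queue is drained before execution resumes here.
    KeLowerIrql(oldIrql);
    ASSERT(g_DpcRan == 1);
}
```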
Thanks. Let me know if the below is a (reasonable/sensible :-)) approximation of what you explained…
>> Interrupted code ran at low IRQL at the time of interrupt. <<
Which interrupt is being referred to above (the scheduler's?)
(Is "interrupted code" the thread doing KeInsertQueueDpc()?)
Also
Say thread T on core C does:
a) Create DPC object D etc
b) KeSetImportanceDpc(D,MediumImportance), KeSetTargetProcessorDpc (D, C)
c) KeInsertQueueDpc(D) to core C, i.e. the same core
(Assuming no other (device/higher-IRQL) interrupts happened between a-f above in both cases, etc. A code sketch of steps a)-c) follows the two cases below.)
Case - IRQL = 0
d) Go to the 0x41 software interrupt handler
e) "Thread T" itself will service the pending DPCs queued on core C, including this DPC D that is now at the tail
f) Once the DPC list is serviced, return to the instruction following KeInsertQueueDpc()
??
Case - IRQL = 2
Core C is still running thread T
c.1) Core C went to IRQL = 0 (say T released a lock, etc.)
d) Go to the 0x41 software interrupt handler
Which thread will go to the 0x41 handler - can it be T itself? I guess not/cannot?
e) Service the pending DPCs queued on core C, including this DPC D that is now at the tail
f) Once the DPC list is serviced, return to the instruction following KeInsertQueueDpc() (on thread T)
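For reference, here is how steps a)-c) would look in code - a minimal sketch with invented names (g_DpcD, DpcDRoutine, QueueDpcToCurrentCore), assuming it runs in "thread T" on "core C":

```c
#include <ntddk.h>

static KDPC g_DpcD;   // "DPC object D" from the steps above (hypothetical)

_Function_class_(KDEFERRED_ROUTINE)
static VOID DpcDRoutine(
    _In_ PKDPC Dpc,
    _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1,
    _In_opt_ PVOID SystemArgument2)
{
    UNREFERENCED_PARAMETER(Dpc);
    UNREFERENCED_PARAMETER(DeferredContext);
    UNREFERENCED_PARAMETER(SystemArgument1);
    UNREFERENCED_PARAMETER(SystemArgument2);
    // deferred work would go here
}

VOID QueueDpcToCurrentCore(VOID)   // running in "thread T" on "core C"
{
    // a) Create DPC object D
    KeInitializeDpc(&g_DpcD, DpcDRoutine, NULL);

    // b) Medium importance, targeted at the current core C
    KeSetImportanceDpc(&g_DpcD, MediumImportance);
    KeSetTargetProcessorDpc(&g_DpcD, (CCHAR)KeGetCurrentProcessorNumber());

    // c) Queue it; the target is the current processor and the
    //    importance is not Low, so the queue is drained as soon as
    //    IRQL permits (immediately if we are below DISPATCH_LEVEL)
    KeInsertQueueDpc(&g_DpcD, NULL, NULL);
}
```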
Basically, if I enable the code that queues the DPC onto the same core with medium importance (i.e. "run immediately" above), I am getting weird and, as of now, seemingly random BSODs…
If I disable the code, it runs forever…
My thread could be at IRQL <= 2. I am assuming my above issue (BSOD) will be happening only at a certain IRQL? (I could have enough DPCs already queued onto the core, etc.)
Probably I will step in and see how the stack goes (at different IRQLs) starting from KeInsertQueueDpc()…?
Are your DPCs also being scheduled from an ISR? If they are, then I think you need to protect the DPC queue itself by synchronising your schedule call.
Otherwise my guess would be that there is some shared resource between the code that schedules and the DPC itself that is being corrupted via simultaneous access.
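For example, a rough sketch of the sort of synchronisation being suggested, using a spin lock shared by the scheduling code and the DPC routine (g_WorkLock, g_WorkList, QueueWorkItem and WorkDpcRoutine are all made-up names):

```c
#include <ntddk.h>

// Hypothetical shared state touched both by the code that queues the
// DPC and by the DPC routine itself.
static KSPIN_LOCK g_WorkLock;
static LIST_ENTRY g_WorkList;

// Producer side (e.g. the code that schedules the DPC). KeAcquireSpinLock
// raises to DISPATCH_LEVEL, so the DPC routine cannot preempt us on this
// CPU, and the lock keeps the other CPUs out.
VOID QueueWorkItem(PLIST_ENTRY Item)
{
    KIRQL oldIrql;
    KeAcquireSpinLock(&g_WorkLock, &oldIrql);
    InsertTailList(&g_WorkList, Item);
    KeReleaseSpinLock(&g_WorkLock, oldIrql);
}

// Consumer side: the DPC routine already runs at DISPATCH_LEVEL, so the
// cheaper AtDpcLevel variants suffice.
_Function_class_(KDEFERRED_ROUTINE)
VOID WorkDpcRoutine(
    _In_ PKDPC Dpc,
    _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1,
    _In_opt_ PVOID SystemArgument2)
{
    PLIST_ENTRY item;

    UNREFERENCED_PARAMETER(Dpc);
    UNREFERENCED_PARAMETER(DeferredContext);
    UNREFERENCED_PARAMETER(SystemArgument1);
    UNREFERENCED_PARAMETER(SystemArgument2);

    KeAcquireSpinLockAtDpcLevel(&g_WorkLock);
    while (!IsListEmpty(&g_WorkList)) {
        item = RemoveHeadList(&g_WorkList);
        (VOID)item; // real processing of 'item' would go here
                    // (drop and re-acquire the lock if it is long)
    }
    KeReleaseSpinLockFromDpcLevel(&g_WorkLock);
}
```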
> a) Create DPC object D etc
> b) KeSetImportanceDpc(D,MediumImportance), KeSetTargetProcessorDpc (D, C)
> c) KeInsertQueueDpc(D) to core C, i.e. the same core
> (Assuming no other (device/higher-IRQL) interrupts happened between a-f above in both cases, etc.)
You seem to be missing a crucial point - the very idea of a DPC is to avoid doing too much work at DIRQL and to do it at DPC level instead. Therefore, it is going to get queued only by an ISR - although you can make a DPC requeue itself right from the DPC routine, that does not seem to be a really practical idea, unless you just want to take the StarForce approach and hang the CPU intentionally. Doing it in some context other than that of an ISR just defeats common sense, and thread context is simply irrelevant here.
Your subsequent post makes me suspect that you simply misuse the DPC - it gives me a weird feeling that you are trying to queue a DPC from a non-ISR context…
> This also reduces the kernel stack usage and saves from kernel stack overflow sometimes.
It could work this way if low-priority DPCs were guaranteed to run only in the context of the idle thread but, IIRC, this is not how things work under Windows - Windows will drain the whole DPC queue before the interrupted thread is allowed to resume execution at low IRQL anyway…
>>>
Are your DPCs also being scheduled from an ISR? If they are, then I think you need to protect the DPC queue itself by synchronising your schedule call.
<<<
I am a protocol driver. I am scheduling DPCs in my ProtocolReceivePacket and ProtocolReceiveNetBufferLists handlers. I am ensuring, again, that the data touched by my DPC callback is protected well…
Reason below
>
Your subsequent post makes me suspect that you simply misuse the DPC - it gives me a weird feeling that you are trying to queue a DPC from a non-ISR context…
<<
Yes. Since I am a protocol driver, IRQL <= 2.
I need to queue DPCs because the miniport is invoking my callback only on core 0. I need to do CRC calculations on every frame (not IP, not checksums, no available offloads, etc.).
If I do everything in that callback context on core 0, it pretty much pegs up soon, the mouse is stuck, and anyway the other cores are useless and idle.
I tried scheduling items onto a thread to be processed later, but the latency increases so much that the miniport starts complaining that I am not returning its RECV NBLs/MDLs fast enough.
I run on a server system, and my NIC/protocol receives too many packets, too fast.
I twiddled with the NDIS ProcessorAffinityMask (set to all 1s), but the miniport BSODs during init or protocol binding, so I stopped pursuing that approach.
Coming back to my question above: I started queuing the DPC to the same core (in my case core 0) because I observed that, if I return control to the miniport immediately (rather than trying to process too many frames on core 0 in that callback context itself), the perf seems to be better.
Probably the miniport, on getting control back, could re-enable interrupts, etc.
But now I am starting to see a weird issue when I enable queuing onto the same processor.
So I just wanted to know if there is anything weird happening above from the Windows standpoint; otherwise I guess something is wrong in my semantics, but the BSOD dump is not that helpful yet.
>
DPCs have their personal per-CPU stacks. They do not use the thread stacks.
<<
Does this add something to my questions above - which thread is running the DPCs? Can it be my thread T above in all/both cases, or can it not be determined?
Are you sure you don't confuse it with the per-CPU idle thread??? Although a low-priority DPC may get dispatched in the context of the idle thread, queuing a "normal" one would result in draining the whole queue before lowering IRQL below DPC level, so that the whole thing will happen in the context of an arbitrary thread. Concerning the stack switch, it depends on whether the interrupted thread was running in kernel mode or user mode. In the former case no stack switch will occur; in the latter, ESP will be set to the value that is specified in the TSS. This will be done by the CPU itself, transparently to the software…
Concerning a stack for DPCs, I vaguely recall one of Jake's posts where he said they had introduced the feature of dispatching DPCs in the context of a dedicated thread, but, IIRC, quite recently. However, I may be wrong…
I think that you are confusing threaded DPCs with the dedicated per-CPU DPC stack (a pointer to which is hung off the PCR/PRCB). The former is a relatively recent (for some definitions of recent) addition; the latter is not.
I seem to recall something very similar to what Anton is talking about being 'Vista forward'. Clarification on how DPCs are handled differently in Vista and later would be nice.
> I think that you are confusing threaded DPCs with the dedicated per-CPU DPC stack
Could very well be the case, but I have a weird feeling that we were not speaking about threaded DPCs in the above-mentioned discussion. A threaded DPC (which, indeed, turned up only under Vista) is a totally different thing that is more similar to BSD/Solaris interrupt threads than to Windows/Linux DPCs and tasklets (i.e. atomic execution units that don't allow blocking calls) - it gets processed in the context of a dedicated thread which is subject to the dispatcher's policies, so that you can make blocking calls in it. From the Windows programmer's perspective, a threaded DPC is a combination of a DPC and a workitem, so that it has features of both. Unlike a workitem, it can be queued from an ISR, which makes it DPC-like. However, unlike "conventional" DPCs, it runs at PASSIVE_LEVEL, which makes it workitem-like.
If we ignore the dispatcher-related stuff that does not apply to "conventional" DPCs anyway, the only thing that makes a dedicated kernel thread different from an arbitrary one is the presence of a dedicated stack.
Once the idle thread has nothing else to do apart from dispatching DPCs, you can think of the above-mentioned dedicated per-CPU DPC stack as the idle thread's one, and I believe this is what Maxim refers to. However, IIRC, if your DPC runs in the context of an arbitrary kernel thread it is going to use this thread's stack (although I may be wrong here)…
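For reference, this is roughly what using a threaded DPC looks like from the driver side on Vista and later - same KDPC object and same queuing call as an ordinary DPC, just a different initializer. Names (g_ThreadedDpc, MyThreadedDpcRoutine) are invented for this sketch:

```c
#include <ntddk.h>

static KDPC g_ThreadedDpc;   // hypothetical

_Function_class_(KDEFERRED_ROUTINE)
static VOID MyThreadedDpcRoutine(
    _In_ PKDPC Dpc,
    _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1,
    _In_opt_ PVOID SystemArgument2)
{
    UNREFERENCED_PARAMETER(Dpc);
    UNREFERENCED_PARAMETER(DeferredContext);
    UNREFERENCED_PARAMETER(SystemArgument1);
    UNREFERENCED_PARAMETER(SystemArgument2);
    // normally dispatched at PASSIVE_LEVEL in a dedicated thread
}

VOID InitAndQueueThreadedDpc(VOID)
{
    // Threaded-DPC initializer (Vista+); the queuing call is unchanged
    // and, like an ordinary DPC, may be made from an ISR.
    KeInitializeThreadedDpc(&g_ThreadedDpc, MyThreadedDpcRoutine, NULL);
    KeInsertQueueDpc(&g_ThreadedDpc, NULL, NULL);
}
```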
> Are you sure you don't confuse it with per-CPU idle thread???
At least starting from Win2K there was a pre-allocated DPC stack associated with the PCR, and DPC execution was started by switching to this stack and then switching back. I think you can always see KiRetireDpcList at the top of the DPC stack, but you cannot list the stack above KiRetireDpcList in WinDbg.
Monitoring IoGetRemainingStackSize and re-queuing the DPC if the stack is too low really works.
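A minimal sketch of that re-queuing trick (the threshold and routine name are made up); re-inserting a DPC from its own routine is legal because the DPC is dequeued before the routine runs:

```c
#include <ntddk.h>

#define MIN_DPC_STACK_NEEDED 4096   // hypothetical threshold, in bytes

_Function_class_(KDEFERRED_ROUTINE)
VOID DeepWorkDpcRoutine(
    _In_ PKDPC Dpc,
    _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1,
    _In_opt_ PVOID SystemArgument2)
{
    UNREFERENCED_PARAMETER(DeferredContext);

    // If we were dispatched on a nearly exhausted stack, give up and
    // re-queue ourselves to run again later on a fresher stack.
    if (IoGetRemainingStackSize() < MIN_DPC_STACK_NEEDED) {
        KeInsertQueueDpc(Dpc, SystemArgument1, SystemArgument2);
        return;
    }

    // ... stack-hungry work goes here ...
}
```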
> result in draining the whole queue before lowering IRQL below DPC level, so that the whole thing will happen in context of an arbitrary thread.
Yes, and this “whole thing” starts with a stack switch.
> Concerning the stack switch, it depends on whether interrupted thread was running in kernel mode or user one.
DPCs are delivered when an attempt is made for the IRQL to drop to < DISPATCH_LEVEL, which implies kernel mode.
Max is entirely correct… Note also that the DPC list is examined/retired as part of the idle thread.
As a complete coincidence, the pending issue of The NT Insider has an article explaining all about DPCs… and answers many of these questions.
But I'm not at all sure we've helped the OP or answered his question: ordinarily, people don't fool with KeSetImportanceDpc and KeSetTargetProcessorDpc. When people DO fool with these, they almost always end up creating more overhead rather than less. Cross-processor DPC queuing either creates significant overhead on the target processor or creates significant latency for the DPC being queued. If the target processor is idle, that's not really a problem, of course… this has led folks to attempt to devise their own DPC distribution schemes, in an attempt to fan their DPCs out across processors that are idle. This usually doesn't work out that well either… it's harder than it first sounds.
Glad to know there is an upcoming article on DPCs. I am not getting my NT Insider for some reason; I missed a couple of recent ones, I think. Any idea how to get that resolved (osronline.com says I am already registered to receive it, etc.)? Will see if anyone is getting their hands on it before I do.
>
Is any of this answering your question, msr???
<<
I am doing TDPCs onto other cores as well, which is fine for now. I am seeing the issue only when I do a TDPC onto the same core. So I want to know whether I am breaking anything from the Windows standpoint. I will test with ordinary DPCs also.
My takeaways at this point are below.
[1]
>Note also that the DPC list is examined/retired as part of the idle thread.
<<
The thread that calls KeInsertQueueDpc() will never retire the DPC list (for the DPCs queued to the same processor).
It is the per-core idle thread (same for TDPCs - I am using TDPCs because they seem to satisfy my system-experience requirement (mouse not stuck, etc.; with ordinary DPCs it seems to get stuck) while maintaining decent latency), using a pre-allocated DPC stack associated with the PCR.
[2]
>
Cross-processor DPC queuing either creates significant overhead on the target processor, or creates significant latency for the DPC being queued.
<<
Yes - given that I need to do CRC calculations on every incoming frame (10G NIC), doing everything on core 0 makes the system unusable. Redirecting DPCs to other cores seems to do the trick most of the time, except that at times, like you say, I see the remaining cores not really doing that much either - they are just at <20% utilization compared to core 0, which is at ~100%. (For this discussion I am assuming for now that DPC redirection is worth the overhead, etc., because I am able to push more work onto other cores without hitting my protocol timeouts.)
I will turn off DPC redirection entirely, do everything on core 0 itself, and see how the perf/system experience (mouse, etc.) behaves.
[3]
>
Unlike workitem, it can be queued from ISR, which makes it DPC-like. However, unlike “conventional” DPCs, it runs at PASSIVE_LEVEL, which makes it workitem-like.
<<
I am using TDPCs. On Win2K8 Server, I have seen a TDPC being scheduled both at PASSIVE and at DISPATCH (once, in the DpcRoutine, I did KeGetCurrentIrql()). I do not know why.
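For what it's worth, the WDK documentation notes that threaded DPCs can be disabled system-wide, in which case a threaded DPC runs at DISPATCH_LEVEL like an ordinary DPC - which would explain seeing both levels. A minimal sketch of a routine written to tolerate both (MyTdpcRoutine is a hypothetical name):

```c
#include <ntddk.h>

_Function_class_(KDEFERRED_ROUTINE)
VOID MyTdpcRoutine(
    _In_ PKDPC Dpc,
    _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1,
    _In_opt_ PVOID SystemArgument2)
{
    KIRQL irql = KeGetCurrentIrql();

    UNREFERENCED_PARAMETER(Dpc);
    UNREFERENCED_PARAMETER(DeferredContext);
    UNREFERENCED_PARAMETER(SystemArgument1);
    UNREFERENCED_PARAMETER(SystemArgument2);

    if (irql == PASSIVE_LEVEL) {
        // the normal threaded-DPC case: blocking calls are allowed here
    } else {
        ASSERT(irql == DISPATCH_LEVEL);
        // threaded DPCs disabled: ordinary-DPC rules apply
        // (no waits, nonpaged data only)
    }
}
```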
Threaded DPCs are really just worker threads. To the best of my knowledge, there’s really nothing special about the way they’re scheduled.
This is not correct. Ignoring THREADED DPCs, the thread that calls KeInsertQueueDpc will retire the DPC List if (a) the DPC is queued on the current processor AND EITHER (b) the DPC is at least MEDIUM_IMPORTANCE (thus causing an IRQL DISPATCH_LEVEL software interrupt to be generated) OR (c) the DPC scheduling parameters (rate or length) result in an IRQL DISPATCH_LEVEL software interrupt being generated.
Sorry, I know all these details are annoying… and networking introduces lots of unusual DPC situations (due to the way that NDIS wants to handle interrupts and DPCs).
Again, ignoring threaded DPCs (which are indeed a very unusual case): the issue I'd be most concerned about in targeting DPCs to other processors is LATENCY. When you queue a DPC to a processor other than the current one, how and when will that remote processor notice that there's a new DPC in its DPC list?? Answer: it won't, until another DPC is generated (or the DPC queue is retired on that processor by the idle thread)… UNLESS the DPC that you queue is marked HIGH_IMPORTANCE. In that case, when the DPC is queued to the remote processor, an interprocessor interrupt is generated… which is not exactly a low-overhead event.
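A sketch of what such a remote-targeted, high-importance queue looks like (the function and parameter names are hypothetical); whether the IPI cost is worth the reduced latency is exactly the trade-off described above:

```c
#include <ntddk.h>

// Fan a caller-initialized DPC out to another core. HighImportance
// makes the queuing CPU send an IPI so the remote core notices the new
// entry right away -- at the cost of the IPI itself. Anything less
// leaves the DPC sitting until something else drains that core's queue.
VOID QueueToRemoteCore(PKDPC Dpc, CCHAR RemoteCore)
{
    KeSetTargetProcessorDpc(Dpc, RemoteCore);
    KeSetImportanceDpc(Dpc, HighImportance);
    KeInsertQueueDpc(Dpc, NULL, NULL);
}
```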
Nor do I. Either this is not possible and you're thus not using threaded DPCs when you think you are, or I do not understand how threaded DPCs work. Sorry, but I'm not in a position right at this moment to check the code.