CPU pinning in Windows

Hello Team,

I have a question regarding CPU pinning in Windows. Can a DPC, such as the one used by a NIC driver, be pinned to a particular (v)CPU?

Orthogonal to the previous question, is there a way to devote that (v)CPU to running only the DPC and no other threads? (Okay, maybe as much as possible: the Windows CPU scheduler should avoid putting other threads on it.)

The reason I ask is that, after looking at PerfMon stats for the adapter, I see that the DPC deferred rate is going up. I think it may be due to other threads/processes running. So, I wanted to know if there is any way (steps or tools) of manually pinning a CPU for a particular process/thread.

Thanks,
Ronak
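
For the thread half of the question, a minimal user-mode sketch, assuming the Win32 call SetThreadAffinityMask is acceptable for the use case. The helper names affinity_mask_for_cpu and pin_current_thread_to_cpu are hypothetical, and the non-Windows branch is only a stub so the file compiles elsewhere. Note that affinity only restricts where this thread may run; it does not keep other threads, ISRs, or DPCs off that CPU.

```c
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Build a one-bit affinity mask for a zero-based logical CPU index.
 * Returns 0 when the index does not fit in a 64-bit mask. */
uint64_t affinity_mask_for_cpu(unsigned cpu)
{
    return cpu < 64 ? (UINT64_C(1) << cpu) : 0;
}

/* Hypothetical helper: restrict the calling thread to one logical CPU of
 * its current processor group. Returns 1 on success, 0 otherwise. */
int pin_current_thread_to_cpu(unsigned cpu)
{
    uint64_t mask = affinity_mask_for_cpu(cpu);
    if (mask == 0)
        return 0;
#ifdef _WIN32
    DWORD_PTR native = (DWORD_PTR)mask;  /* may truncate on 32-bit builds */
    if (native == 0)
        return 0;
    /* SetThreadAffinityMask returns the previous mask, or 0 on failure. */
    return SetThreadAffinityMask(GetCurrentThread(), native) != 0;
#else
    return 0;  /* stub so the sketch still compiles off Windows */
#endif
}
```

A caller would invoke pin_current_thread_to_cpu(3) early in the thread and check the return value before relying on the placement.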

KeSetTargetProcessorDpc

You can’t really dedicate a CPU to running just your DPC. You will run into watchdog issues for starters. If you can configure ISR and DPC affinity, you will get your interrupt path processing on whatever gang of CPUs you specify.

Mark Roddy

DPCs are naturally affinitized to where they were queued. Actually the opposite is the weird case – DPCs live above the scheduler, so it doesn’t really make much sense to “schedule” them around to a different processor like threads get scheduled.

Like Mark said, Windows doesn’t currently expose a way to exclusively monopolize a CPU. (You can imagine the arms race that that leads to…). We are thinking about this space – see the “CPU Sets” on Microsoft Docs for some of our early thinking.

However, we in the Networking team don’t think you should have to become experts in the Windows scheduler / DPC. We are tinkering with ways to remove the burden of scheduling from the NICs, so you don’t have to worry about how/when/where to execute your datapath. If you’re interested in learning more, check out the “NetAdapter” project. (If you squint, there are vague similarities to DPDK or NAPI, where the upper level drives all the execution.)

check out the “NetAdapter” project

Read this about NetAdapter. SUPER interesting and exciting.

Peter

Thank you for the information. Will go through the NetAdapter project.

Ronak

@“Jeffrey_Tippet_[MSFT]” said:
Windows doesn’t currently expose a way to exclusively monopolize a CPU. (You can imagine the arms race that that leads to…).
… see the “CPU Sets” on Microsoft Docs for some of our early thinking.

So, this is very interesting. It sounds like Windows is finally getting on the page of accommodating real-time solutions.


_SYSTEM_CPU_SET_INFORMATION

BYTE EfficiencyClass;

BYTE RealTime : 1;

BYTE SchedulingClass;

One way we could reserve CPUs for a thread or process was by using the undocumented SchedulingClass feature of the scheduler (with a process job object) and by booting the system with special parameters for processor group awareness testing (which wasn’t intended for this purpose). This would not always work and as such wasn’t suitable for off-the-shelf products that should work on every system.

Great news.

//Daniel

On Nov 4, 2018, at 4:58 AM, Daniel_Terhell wrote:
>
> So, this is very interesting. It sounds like Windows is finally getting on the page of accommodating real-time solutions.
>
> _SYSTEM_CPU_SET_INFORMATION
> …
> BYTE EfficiencyClass;
> …
> BYTE RealTime : 1;
> …
> BYTE SchedulingClass;

Well, not really. A true real-time system requires the enforcement of mandatory timing limits on ISRs and DPCs, and Windows has no way of doing that right now.

Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

Well, not really. A true real-time system requires the enforcement of mandatory timing limits on ISRs and DPCs, and Windows has no way of doing that right now.

Not sure if I understand what you mean. Windows does have timing limits for ISRs and DPCs that are enforced, but by monopolizing a CPU, these can be completely avoided.

//Daniel

Windows does have timing limits for ISRs and DPCs

Well, there are guidelines. The timing limits that are enforced are for gross violations.

But, no matter: The primary characteristics that people care about in a real-time system are latency guarantees – to get to these may require some timing limits, but they are secondary. If the OS can guarantee me maximum bounds on latencies, then I can plan my real-time processing accordingly.

Peter

The NT kernel is not a realtime kernel. However, if your hardware platform has multiple CPUs, and the scheduler has a process isolation policy like CPU Sets, and all the inbox drivers are enlightened to stay away from the isolated processors, and if you have tight control over any 3rd-party drivers… then you can start talking about soft realtime.

The current version of NT is not in the situation I’ve described: several inbox drivers do not steer their ISRs/DPCs away from isolated CPUs. But scheduling+isolation+latency is an area some of us are thinking about.

So, this is very interesting. It sounds like Windows is finally getting on the page of accommodating real-time solutions.

Nope…

In order to get anywhere close to RTOS capabilities, Windows needs to get rid of DPCs (i.e. the code that runs in the context of a software interrupt) and replace them with interrupt threads that are subject to the dispatcher’s policies.

In the ideal case the same should apply to ISRs as well. This part may be a bit problematic, because in order to put ISRs under the control of a dispatcher, the interrupt handler stub has to ACK an interrupt to the controller before invoking an ISR. As long as we are speaking about level-triggered interrupts, the interrupt source has to be disabled in the stub if you ACK an interrupt prior to actually handling it; otherwise the machine will get into an interrupt storm. Although it may work fine for most platforms and interrupt controllers, taking this approach with the APIC on x86 and x86_64 is probably not the best idea, because you may get quite a few spurious interrupts if you attempt something like that.

Never mind. Let’s imagine the guys in Redmond try to do it at some point in some OS version. You would not want to rewrite the entire OS, including all drivers, would you? Therefore, they have to maintain backwards compatibility.

When it comes to interrupt threads, this part can be introduced transparently to the drivers: they can easily be made to believe that their ISRs and DPCs run in the context of a hardware/software interrupt while actually running in the context of a high-priority thread. Then, these dispatcher-controlled threads have to minimise their use of spinlocks and replace them with mutexes. Certainly, as long as driver code treats spinlocks and mutexes as opaque data structures, you may be able to keep everything backwards-compatible at the source level if you do a bit of header editing, so that a call to KeAcquireSpinLock() will, in actuality, lock a real-time mutex rather than a spinlock behind the scenes.

However, no matter how you look at it, you would be unable to maintain backwards compatibility at the binary level. Taking into consideration that binary-level backwards compatibility is taken almost as a religious thing in Redmond, such “bold and revolutionary” changes are very unlikely to ever take place…

Anton Bassov

Anton, did you comment on the wrong thread? By mistake perhaps?

Anton, did you comment on the wrong thread? By mistake perhaps?

Well, I just explained to you what the Windows kernel guys would, at the very minimum, have to do if they were to start, in your words, “getting on the page of accommodating real-time solutions”. This is the very, very, very minimum that would have to be done: as long as you allow certain execution units to be out of the scheduler’s control, you have absolutely no means of providing the latency guarantees that are required for RT tasks. Therefore, they would have to ensure that no execution unit in existence is out of the scheduler’s control (and, hence, that no one has a chance, at any point, to disable it)…

Anton Bassov

Your message sounds a bit like a rant and doesn’t seem to touch on the topic of reserving/isolating CPUs.

to be able to get anywhere close to RTOS capabilities Windows needs to get rid of DPCs

That is exactly why CPU isolation is a great idea. You get rid of unwanted interrupts and DPCs on your isolated CPU. Of course unless the DPCs are yours and part of your solution, then you have them executed exactly where you want.

//Daniel

Your message sounds a bit like a rant…

Not at all. You know, this posting platform reportedly has an extensive list of troll-managing capabilities, so these days I try my best to avoid going into rant mode in order to avoid “The Hanging Judge’s” wrath. What you have read is just a technical analysis of the problem.

…and doesn’t seem to touch on the topic of reserving/isolating CPUs.

Indeed, it does not, simply because reserving CPUs has nothing to do with RT, contrary to what you seem to believe. Although it may have a wide range of applications, covering anything from load balancing and improving CPU utilisation to making use of the NUMA capabilities of the target machine, handling RT tasks is not among them…

You get rid of unwanted interrupts and DPCs on your isolated CPU. Of course unless the DPCs are yours and part of your solution,
then you have them executed exactly where you want.

OK, let’s say you “got rid of unwanted interrupts and DPCs on your isolated CPU”. You also got an extra scheduler class that allows you to do RT scheduling, so that your RT threads can be scheduled according to your requirements. Even more, let’s assume that the OS has also introduced RT mutexes that support priority inheritance protocols. Do you really think you are in a position to handle RT tasks now?

Well, let’s look at a practical example. Let’s say one of your high-priority RT threads, running on an “ISR/DPC-free” CPU A, makes a system call that relies upon a mutex behind the scenes. This mutex is currently held by a thread running on another CPU, B, that allows ISRs and DPCs. Certainly, since we are speaking about an RT mutex, we can always adjust the owner thread’s priority accordingly. However, if an interrupt occurs while the owner thread holds the mutex, that thread is going to be unable to resume its execution (and, hence, unable to release the mutex) until the CPU flushes the entire DPC queue, and things are going to work this way regardless of the owner thread’s priority.
Therefore, your RT thread running on CPU A will have to wait until the DPC queue on CPU B gets flushed.

In other words, as you can see for yourself, a delay introduced by handling ISRs and DPCs on CPU B may easily propagate to CPU A in some cases, even though the latter is spared from handling interrupts and DPCs, and scheduler policies like adjusting thread priorities cannot do anything about it. As I have already said in my previous post, as long as you allow certain execution units to be out of the scheduler’s control, you have absolutely no means of providing the latency guarantees that are required for RT tasks. Full stop.
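
The scenario above can be put into numbers with a toy model. This is only a sketch under stated assumptions: the function name rt_wait_us and every duration are hypothetical, and the model assumes one interrupt arrives while the mutex owner is inside its critical section and that the whole DPC queue on the owner's CPU drains before the owner runs again.

```c
#include <stddef.h>

/* Toy model of the stall seen by the RT waiter, in microseconds.
 * The mutex owner cannot run until the ISR and every queued DPC on its
 * CPU have finished, and only then completes its critical section.
 * Thread priority appears nowhere in the sum, which is the point. */
unsigned long rt_wait_us(unsigned long owner_remaining_us,
                         unsigned long isr_us,
                         const unsigned long *dpc_us,
                         size_t dpc_count)
{
    unsigned long total = isr_us;
    for (size_t i = 0; i < dpc_count; i++)
        total += dpc_us[i];          /* DPCs run above the scheduler */
    return total + owner_remaining_us;
}
```

With hypothetical figures of 20 µs left in the critical section, a 10 µs ISR, and queued DPCs of 50, 120 and 30 µs, the waiter stalls for 230 µs instead of 20 µs, and boosting the owner's priority changes none of the terms.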

Now let’s assume something that is much, much, much worse as far as RT requirements are concerned: let’s say the target system call relies upon an event or a semaphore, rather than a mutex, behind the scenes. Unlike mutexes (and potentially RW locks), these constructs don’t have owners, so you just don’t know who is going to signal them, and the priority inversion problems that they introduce cannot be resolved. At this point all your RT expectations literally go to the dogs: your high-priority thread may be left waiting for a low-priority one to signal an event or semaphore, and you are unable to do anything about it, despite all your CPU reservations and RT scheduling classes. Therefore, in order to avoid this unfortunate scenario, the system call in question has to be rewritten with RT requirements in mind. The same applies to all other system calls, as well as the kernel subsystems that they rely upon.

In other words, in order to be in a position to provide RT guarantees, the entire Windows kernel has to be re-written. How come that you don’t understand it???

I am really, really sorry for saying it, Daniel, but sometimes I am just amazed how superficial your knowledge of the system-level concepts still is - after all, you had been posting here for more than 10 years…

Anton Bassov

Let’s say one of your high-priority RT threads running on an “ISR/DPC-free” CPU makes a system call that relies upon a mutex behind the scenes.
let’s say the target system call relies upon an event or a semaphore, rather than a mutex, behind the scenes

Yes, you typically want to design your solution so that it does not have dependencies on the rest of the OS. And if it does, you make sure you work your way around them. You’re probably not going to make file system calls from an RT thread either. Not every service in the OS needs to respond within a guaranteed timeframe for it to accommodate real-time solutions.

You could have an RT thread responding to interrupts (and DPCs). Or a thread which only does a DeviceIoControl periodically. Or even a thread which only reads and writes device registers, without being dependent on what the rest of the OS is doing.

If you need to share or save data, you delegate that work to another CPU+thread. If a normal lock would ever become a problem, you could simply use a try/lock or even a lock-free structure.
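
A minimal sketch of the try-lock idea, assuming C11 atomics are available. The types try_lock_t and shared_sample_t and the function names are hypothetical, chosen for illustration. The time-critical side never waits: if the lock is busy it records a drop and moves on.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag held;
} try_lock_t;

#define TRY_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Attempt to take the lock; returns immediately either way. */
bool try_lock_acquire(try_lock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void try_lock_release(try_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical state shared between the RT thread and a helper thread. */
typedef struct {
    try_lock_t lock;
    long latest;    /* protected by lock */
    long dropped;   /* written only by the RT side */
} shared_sample_t;

/* RT side: publish a sample if the lock is free, otherwise count a drop. */
bool publish_sample(shared_sample_t *s, long value)
{
    if (!try_lock_acquire(&s->lock)) {
        s->dropped++;
        return false;
    }
    s->latest = value;
    try_lock_release(&s->lock);
    return true;
}
```

The trade-off is that a busy lock costs a lost update rather than a stall, which only suits data where the newest value supersedes older ones.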

Sometimes I am just amazed how superficial your knowledge of the system-level concepts still is - after all, you had been posting here for more than 10 years…

I can take that, for I know who’s saying it.

//Daniel

Yes, you typically want to design your solution so that it does not have dependencies on the rest of the OS. And if it does,
you make sure you work your way around them. You’re probably not going to make file system calls either from a RT thread.
Not every service in the OS needs to respond within a guaranteed timeframe for it to accommodate real-time solutions.

True, but there is a “minor” problem here. Although it may all sound fine in theory, in practical terms you are limited, at the very most, to the subset of the API that may be called at DIRQL if you want to run your RT tasks within Windows. Even API functions that are callable at DISPATCH_LEVEL are, apparently, out of your reach, because their possible use of spinlocks behind the scenes is not adjusted to RT requirements. All API calls that are not allowed at elevated IRQL (and, hence, all userland APIs in existence) are most definitely out of your reach because of the possibility of blocking calls and page faults behind the scenes.

In other words, you don’t seem to have that many practical possibilities if you go this way under Windows in its currently existing form, don’t you think?

In any case, you have expressed a reasonable (apparently, for the first time in this discussion) idea that RT apps need a well-defined set of special RT API functions that they are allowed to call, while all other API calls have to be prohibited for them. More on it below.

You could have a RT thread responding to interrupts (and DPCs).

Well, in order to be in a position to do so, you need to put ISR and DPC handling under the control of a dispatcher, and to handle interrupts and DPCs in the context of dedicated threads, rather than in the context of a hardware/software interrupt that runs outside the dispatcher’s control. Now go and read again my very first post on this thread, i.e. the one that you seem to be arguing with, for some reason…

Or a thread which only does a DeviceIoControl periodically

DeviceIoControl() is just a userland function and, hence, may touch pageable memory or block behind the scenes. Therefore, you cannot use it from an app that requires any RT guarantees. The same is true for any other userland API in existence…

Or even a thread which only reads and writes device registers.

This part, indeed, may be done. However, “if (interrupt_occurs) goto statement_above;”

If you need to share or save data, you delegate that work to another CPU+thread. If a normal lock would ever become a problem,
you could simply use a try/lock or even a lock-free structure.

What you really need here is a well-defined set of communication channels and API functions between RT and non-RT tasks, callable from both sides, which ensures that RT and non-RT tasks can communicate in such a way that a non-RT task can never block an RT one.
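
One common shape for such a channel is a single-producer/single-consumer ring buffer. The following is a minimal sketch, assuming C11 atomics, exactly one RT producer and one non-RT consumer, and a power-of-two capacity. The names spsc_ring_t, ring_push and ring_pop are hypothetical; this is not an existing Windows facility.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_CAPACITY 8u   /* must be a power of two */

typedef struct {
    long slots[RING_CAPACITY];
    atomic_size_t head;    /* next slot to write; advanced only by the producer */
    atomic_size_t tail;    /* next slot to read; advanced only by the consumer */
} spsc_ring_t;

void ring_init(spsc_ring_t *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer (RT side): returns false instead of waiting when the ring is full. */
bool ring_push(spsc_ring_t *r, long value)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_CAPACITY)
        return false;
    r->slots[head & (RING_CAPACITY - 1)] = value;
    /* Release so the consumer sees the slot contents before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer (non-RT side): returns false when the ring is empty. */
bool ring_pop(spsc_ring_t *r, long *out)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;
    *out = r->slots[tail & (RING_CAPACITY - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Neither side ever spins or sleeps on the other: a slow consumer can make the producer see a full ring, but it cannot make the producer wait.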

Sure, there are some ways to make a GPOS and an RTOS coexist on the same machine, and even work alongside one another within exactly the same application. Basically, you have to design a small kernel with its own RT scheduling/synch/etc. API that is in full control of interrupts and timers. This kernel has to treat the rest of the OS as a low-priority task, and give it a chance to run whenever no RT tasks are in sight.

As long as we are speaking about third-party solutions, this kernel is, from the GPOS’s perspective, just a module/driver that does not make any calls to the OS. All its client apps also have to be KM drivers. I am not sure you can do it with Windows these days: in order to do it you need to write your own HAL, and I am not sure the HAL Developer Kit is still available. If it came from MSFT, it could be done simply as a kernel extension/modification, i.e. the one that I spoke about in my very first post on this thread. In such a case one could, possibly, even expect to do RT programming in userland. This part would require quite a bit of integration with the existing OS and, hence, quite a few modifications to the existing kernel, so I am not sure it would be reasonable to allow RT tasks to get out of the kernel.

However, you just cannot run your RT tasks within GPOS, and simply pinning tasks to certain CPUs while still relying upon GPOS scheduler and interrupt handling mechanism is not going to help you here in any possible way.

I can take that, for I know who’s saying it.

Well, such a reaction explains quite a lot…

You know, literally the other day I heard a truly brilliant thing from someone. The bloke said “When you speak you are just repeating something that you know already, and, possibly, re-iterating your misconceptions and erroneous beliefs. However, when you listen you may learn something new”…

Anton Bassov

sometimes I am just amazed how superficial your knowledge of the system-level concepts still is

And sometimes I’m amazed at what an absolute asshole you can be, Anton.

Mr. Terhell is one of the preeminent authorities on the practical aspects of meeting time-critical constraints on Windows systems. Of this there can be no doubt. And you and I both can learn a lot from him on this subject.

Despite the (rude and annoying) personal attacks and some random utter stupidity, this is actually a pretty-good discussion.

There are of course real-time extensions to Windows. Several are commercially available. There’s absolutely nothing that would prevent Microsoft from building such extensions and making them extremely effective.

But that’s not the point, I don’t think.

The point, rather, is how you can provide the ability to meet some soft real-time constraints on a general purpose system, without writing your own “OS within an OS.” The ability to do this would make a lot of people happy, not the least of which would be the audio production/editing community (who almost exclusively use Macs today).

Peter

And sometimes I’m amazed at what an absolute asshole you can be, Anton.

Well, certainly you are going to blame everything on me, no matter what and no matter how; this is beyond question. Who would even have any doubts about it…

Mr. Terhell is one of the preeminent authorities on the practical aspects of meeting time-critical constraints on Windows systems. Of this there can be no doubt.

I dunno, but the very phrase “meeting time-critical constraints on Windows systems” (as well as on ANY system that is not designed for this purpose) sounds sort of nonsensical in itself, don’t you think? I would rather refer you to your own article in “NT Insider” about sticking wings to a pig. I don’t know about you, but if I were in Mr. Terhell’s place I would take this “title” as complete piss-taking, derision and mockery, rather than a compliment…

And you and I both can learn a lot from him on this subject.

Well, I don’t know about you, but I don’t really want to learn anything on a topic that can be easily PROVED with mathematical certainty to be infeasible in itself. Just consider the scenario when your payment depends upon the delivery of a workable product. Would you really want to waste your time on something that you know for sure cannot be made to work all the time, or would you prefer to explain to your clients the amount of work that the delivery of a truly workable product involves, and give them a chance to consider all pros and cons?

The point, rather, is how you can provide the ability to meet some soft real-time constraints on a general purpose system,
without writing your own “OS within an OS.”

Again, go and check the above mentioned article in “NT Insider”…

OTOH, everything depends on what you define as “meeting soft real-time constraints”…

If you define it as a “failure rate of anything from 1% to 99% on any particular run of a program, depending solely upon external factors/pure luck, and being totally out of your control”… well, then you can discuss it with Mr. Terhell. Otherwise, I would refer you to Mr. Tippet’s post that makes it clear that Windows, in its current form, is unsuitable for any RT requirements, including even the soft ones.

The ability to do this would make a lot of people happy, not the least of which would be the audio production/editing community

Again, I would refer you to Mr. Tippet’s post: he makes it clear that making Windows suitable for soft RT tasks would require tight integration of the third-party drivers. Therefore, even if it gets introduced at some point, it will affect mainly SoCs and other tightly-coupled systems; you are very unlikely to see it anywhere on commodity-grade systems…

Despite the (rude and annoying) personal attacks

Actually, I did not attack anyone. Objectively, all the things that Mr. Terhell has said on this thread reveal his lack of in-depth understanding of general OS concepts, which I find surprising for someone who has been working in this area for 10+ years. Probably, I should not have pointed it out to him, but it was not me who started speaking about “rants” and “posts on the wrong thread”, was it…

Anton Bassov