Regarding CPU Usage and Threads

Uzair_Lakhani · June 6, 2013, 3:46am

Dear All,

I want to find out what APIs are available in the DDK for determining current CPU usage. For example, if CPU usage is low, then a heavy process can be started.

Secondly, if my driver creates four threads on a system with two virtual processors, is it possible to assign three light threads to one processor and one heavy thread to the second processor?

Thirdly, if we do not manually assign threads to specific processors, how does the OS assign threads to processors?

Finally, any good references on the above would be appreciated.

Thanks,
Uzair Lakhani

Doron_Holan · June 6, 2013, 5:30am

Read the Windows Internals book by Russinovich for how the scheduler works. Why are you trying to write your own load balancer?

d

Bent from my phone

—
NTDEV is sponsored by OSR

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Uzair_Lakhani · June 6, 2013, 8:32am

Dear Doron,

Thanks for the input. Hardware keeps progressing, and a normal machine now comes with multiple virtual processors. New frameworks are giving programmers more control over the processors themselves, instead of strict control by the operating system. I wanted to find out whether the same is true of the DDK.

I attended a lecture about the multithreading capabilities in Java, where new frameworks give more control to programmers instead of strict control by the operating system.

Thanks,
Uzair Lakhani

Don_Burn · June 6, 2013, 8:46am

There has always been support for thread affinity in Windows, in particular
for threads, DPC routines, and interrupts. But it is rare that it is used
well. Even if you know what the load of your application and driver is, you
don’t have control over the rest of the system, so trying to lock things to
particular processors typically makes performance worse.

I’ve watched this argument over who should control multi-processor work for
a long time, and with every generation of MP systems someone thinks “the
application can do it better”. In the long run, except for dedicated systems
for HPC, that never really works out as true. I first dealt with MP systems
over 40 years ago, and the world keeps trying to solve the same problems.

Don Burn
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

Doron_Holan · June 6, 2013, 9:48am

You have it a little backwards. When a framework like Java exposes more control, it is not that the OS is abdicating that control. Rather, the framework is exposing more than it used to. The framework can only do what the OS allows. Furthermore, the point of virtualization is for only the necessary small subset of the OS to know it is virtualized, while the rest is happily oblivious.

d

Bent from my phone

OSR_Community_User · June 6, 2013, 1:18pm

> Dear All,
>
> I want to find out what APIs are available in the DDK for determining
> current CPU usage. For example, if CPU usage is low, then a heavy
> process can be started.

How do you tell a “heavy” process from a “light” one?

> Secondly, if my driver creates four threads on a system with two virtual
> processors, is it possible to assign three light threads to one processor
> and one heavy thread to the second processor?

You have not defined what is a “light” thread and what is a “heavy”
thread. Yes, you can call SetThreadAffinityMask to bind a thread to a core,
but this is not always the best approach; in general, unless you are
fiendishly clever and careful, there’s a good chance that making demands
as to which processors can run which threads will end up slowing everyone
down, since your choices will prohibit the thread from running on an
available core.

Note that there are times when this /can/ improve some factor; for
example, we bound the high-priority video threads to cores 1-3 and bound
the GUI thread to core 0; otherwise, the GUI was effectively dead for the
app, and the entire user interface was totally dead. By binding the
high-priority threads to all but one core, that core was able to run IE,
Windows Explorer, Task Manager, Solitaire and email, as well as the GUI
for the video app. Users never noticed that all the apps they were using
were running on only a single core.
joe
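
To make the affinity idea concrete, here is a minimal sketch in Python of how an affinity mask describes the split above. It is a toy model, not Windows code: an affinity mask is treated as a bitmask with one bit per logical processor, and the helper names (`affinity_mask`, `may_run_on`, `runnable_cores`) and the four-core machine are assumptions made for illustration.

```python
def affinity_mask(cores):
    """Return a bitmask with bit n set for each core number n in cores."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def may_run_on(mask, core):
    """True if a thread with this affinity mask is allowed on the given core."""
    return bool(mask & (1 << core))


def runnable_cores(mask, idle_cores):
    """Of the currently idle cores, return those the mask permits."""
    return [core for core in idle_cores if may_run_on(mask, core)]


# The split described above, assuming a four-core machine (cores 0..3).
VIDEO_MASK = affinity_mask([1, 2, 3])  # 0b1110: video threads avoid core 0
GUI_MASK = affinity_mask([0])          # 0b0001: GUI thread only on core 0
```

The cost mentioned earlier shows up directly: `runnable_cores(GUI_MASK, [1, 2, 3])` is an empty list, so a thread bound to core 0 has to wait even when three other cores are idle.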

> Thirdly, if we do not manually assign threads to specific processors,
> how does the OS assign threads to processors?

Scheduler algorithm: the highest priority thread wins
Generalization: in a system of N cores, the N highest priority threads win

The system finds a core with nothing to do and schedules the thread on
that core. The details have changed a lot in Win7+ to try to minimize
lock conflicts on a single master scheduler queue. In short, the OS
assigns a waiting thread to an idle core.
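
The "N highest priority threads win" rule can be sketched in a few lines of Python. This is a deliberately simplified model, not the Windows algorithm: it assumes a flat list of ready threads as `(name, priority)` pairs where a larger number means higher priority, and it ignores affinity and everything else a real scheduler weighs. The function name `pick_running` is hypothetical.

```python
def pick_running(ready, n_cores):
    """Return the n_cores highest-priority threads from the ready list.

    ready is a list of (name, priority) pairs; a larger number means
    higher priority. Threads that do not make the cut simply wait.
    """
    by_priority = sorted(ready, key=lambda thread: thread[1], reverse=True)
    return by_priority[:n_cores]


ready_threads = [("a", 8), ("b", 15), ("c", 4), ("d", 10)]
running = pick_running(ready_threads, 2)
```

With two cores, `running` holds threads `b` and `d`; `a` and `c` wait until a core frees up or a higher-priority thread blocks.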

OSR_Community_User · June 6, 2013, 1:28pm

Don,

We all learned a lot in our years of programming. Sadly, the next
“generation” (about ten years in this field) never bothers to ask us about
what we learned, but goes haring off to make mistakes we either did
not commit or committed and learned from. What is sadder is that these
ideas often make it into products.

The myth that the app can do load balancing is one of those urban fables
that keeps crawling out from under the rocks where we buried these bad
ideas decades ago. Other than the fact that this idea is designed not
just to ignore reality, but to aggressively reject reality, there’s
nothing wrong with it. Until, of course, it meets reality.

joe

Tim_Roberts · June 6, 2013, 3:30pm

xxxxx@gmail.com wrote:

> I want to find out what APIs are available in the DDK for determining current CPU usage. For example, if CPU usage is low, then a heavy process can be started.

You need to think about what you’re asking. Remember, at any given
time, a particular CPU’s utilization is either 0% or 100%. When you see
another number, that value has been integrated over time. So, it’s
possible to compute a CPU’s average utilization over the last second, or
over the last 5 seconds, but any information you get is history, and
doesn’t reflect future performance. You can’t base a scheduling
decision on that.
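
A small Python sketch of that point, under the assumption that utilization is computed by sampling a CPU as busy (1) or idle (0) at regular intervals and averaging over a window. The function name and the sample data are made up for illustration.

```python
def utilization(samples):
    """Average a window of busy (1) / idle (0) samples into a percentage."""
    return 100.0 * sum(samples) / len(samples)


# Two very different workloads produce the same averaged number.
bursty = [1] * 5 + [0] * 5   # flat out, then completely idle
steady = [1, 0] * 5          # alternating busy and idle

# A CPU that was idle for one window and saturated for the next.
history = [0] * 10 + [1] * 10
last_window = utilization(history[:10])
next_window = utilization(history[10:])
```

Both `bursty` and `steady` average to 50%, and `last_window` is 0% immediately before a window that is 100% busy, which is why the averaged figure is a poor basis for deciding what to start next.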

If you have a CPU-intensive process and you don’t want it to interfere,
just start it up and reduce its priority. The system will schedule it
when no other threads are ready.

> Secondly, if my driver creates four threads on a system with two virtual processors, is it possible to assign three light threads to one processor and one heavy thread to the second processor?

The Windows thread scheduler is the result of 25 years of development,
research, testing, experimentation, and improvement. There is
ABSOLUTELY no way you can do a better job.

> Thirdly, if we do not manually assign threads to specific processors, how does the OS assign threads to processors?

When a thread reaches the end of its time slice, or gives up its time
slice by waiting for some resource, the scheduler runs. The scheduler
maintains a list of all of the threads in the system that are
“ready-to-run”, sorted by priority. It just picks the top of the list
and starts it. The thread that had been running goes back to the bottom
of the list (at its priority level).
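
The description above can be sketched as a toy single-processor model in Python. It is only an illustration of the idea, not the Windows implementation: it assumes one FIFO queue per priority level, treats a larger number as higher priority, and leaves out things a real scheduler does, such as priority adjustments, waits, and multiple processors. The class and function names are hypothetical.

```python
from collections import deque


class ToyScheduler:
    """One FIFO ready queue per priority level; higher number wins."""

    def __init__(self):
        self.ready = {}  # priority -> deque of thread names

    def make_ready(self, name, priority):
        """Put a thread at the tail of the queue for its priority level."""
        self.ready.setdefault(priority, deque()).append(name)

    def pick_next(self):
        """Remove and return (name, priority) from the head of the highest
        non-empty priority level, or None if nothing is ready."""
        for priority in sorted(self.ready, reverse=True):
            queue = self.ready[priority]
            if queue:
                return queue.popleft(), priority
        return None


def run(scheduler, slices):
    """Run for a number of time slices; each thread uses its whole slice
    and then goes back to the tail of its own priority level."""
    order = []
    for _ in range(slices):
        picked = scheduler.pick_next()
        if picked is None:
            break
        name, priority = picked
        order.append(name)
        scheduler.make_ready(name, priority)
    return order


sched = ToyScheduler()
sched.make_ready("A", 8)
sched.make_ready("B", 8)
sched.make_ready("low", 4)
order = run(sched, 4)
```

Here `order` alternates between `A` and `B`, and `low` never runs while they stay ready. That is the same effect as the earlier advice about reducing the priority of a CPU-intensive process: in this model it only gets the processor when nothing of higher priority is ready.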

> I attended a lecture about the multithreading capabilities in Java, where new frameworks give more control to programmers instead of strict control by the operating system.

This is an almost uniformly bad idea, because of the assertion I made
above: there is no way that even a very good Java programmer can do a
better job of thread scheduling than any of the mainstream operating
systems.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

OSR_Community_User · June 6, 2013, 6:27pm

+1

“We know the future will be just like the past because in the past, the
future was just like the past”

When applied to the formation of geological features of the Earth, it became
known as “uniformitarianism”. The counterargument, that features on the
Earth formed as the result of major upheavals, was known as
“catastrophism”.

They were both right.

I believe it was the late Stephen J. Gould who introduced the idea of
“punctuated equilibrium”, which can be summed up as “Things are pretty
stable until there is a major disruption, then they settle down to a new
stable state”. In the case of computers, the integrated-over-time
utilization does not tell you the performance over the next five seconds,
and the programmer has no synoptic view of what is going on. In addition,
CPU utilization does not indicate how much time is spent in the kernel
doing kernel things, how much time is spent in the kernel acting as the
(privileged) agent of a user thread, and how much utilization is in user
space. While I believe the kernel/user split can now be determined, the
kernel-as-kernel and kernel-as-agent numbers are much harder to determine.
Basing any thread assignments on an improperly-interpreted set of numbers
means your results are going to be more than a bit flaky, especially if
you have large, multithreaded apps with “bursty” behavior, e.g. SQL Server
getting a complex query, or IIS serving up a dynamic Web page, or dealing
with server-side scripting in response to some external user action.

Anyone who thinks building a scheduler is easy has never built a
scheduler. I worked with two teams on two different operating systems in
my career, and I helped design some of the data collection experiments. In
our multiprocessor OS, we went through a period of several months where we
were putting up a new scheduler every week, each one tweaked to deliver
something better than the previous version. Mostly, such tweaks did make
an improvement, but in some cases the performance was substantially worse,
and in other cases we solved problem A only to uncover problem B, which in
turn had to be solved.

IBM avoided this issue in their TSS/360 OS: their scheduler was a state
machine, and by loading a new state transition table, you could change its
behavior. And when we profiled the OS, we found that 37% of the CPU
cycles were going to the scheduler, on a giant mainframe with about the
same computing power as a 286, but with less memory (512K for the 360,
640K for the 286), and we were supporting 60 concurrent users!

In addition, it was not just the scheduler. The scheduler and the virtual
memory manager were goombahs, and held complex dialogs trying to optimize
both system responsiveness and overall system performance. For example, we
discovered certain aspects of the Working Set model required subtle changes
in our VM manager and in our scheduler, and how they interacted.

If you think LRU and Working Set solved all the problems of VM management,
have I got a deal for you! My late uncle had been a high-ranking officer
in the Nigerian army, and has this… (no, I can’t give you any details
until you buy into this project).

So I have the same reaction when somebody not only makes the claim that
they can do “optimal scheduling” for an app, but also starts asking “how
do I do this?”. My first response should simply be “ARE YOU OUT OF YOUR
FREAKIN’ MIND!!!”

My second reaction is: “do you have any data that suggests that the
current scheduler is doing a poor job?” Each “scheduler meeting” we held
had, across the width of the room, a 20’ or so section of the system
performance graph (for grins, he printed out the graph for an hour; it was
something close to 100 feet). Between six and ten people who lived in the
kernel source code watched while he pointed out where improvements could
be made, or where the new scheduler introduced problems, and we would then
propose various alternate implementations that might fix that problem without
introducing new ones. Sometimes we were even right.

So we have a newbie, who has ZERO data to back up the idea that the
Windows scheduler is inadequate, proposing to write a scheduler, and
furthermore has no idea of how to demonstrate that this frankenscheduler
is an improvement over the existing one. Yeah, right.
joe

Alex_Grig · June 6, 2013, 7:52pm

@Dr. Joe:

But there is no question that the Windows (still in 7+) *virtual memory* manager does an extremely poor job. There is no reason for a 4+ GB box to ever discard much of the application code/data on a typical dev box. Still, I see it all the time: my Visual Studio and other apps freeze when I switch to them, to page stuff in.

OSR_Community_User · June 6, 2013, 8:57pm

Our biggest “scheduler” problem was our VM manager. But the OP didn’t
even hint at trying to improve the VM manager, but posed this weird idea
about binding threads to cores as somehow being a way to improve
performance. It’s sort of like the compiler treating the ‘register’
keyword as other than syntactic noise; with very few exceptions, obeying
this directive results in poorer code than if it is treated properly as
just legacy noise.
joe
