> A thread can be requested to terminate at any time, without regard to
> whether it is currently executing in kernel mode or user mode.
Some years ago, some folks at ISI had a lot of problems because of this.
ASKING a thread to terminate is not the same as HAVING the thread
terminate, as noted below.
> Threads are terminated only when they are ready to return to user mode
> from kernel mode, were they in kernel mode.
> A well-behaved kernel mode component that does inline work on behalf of a
> user thread will configure their long-running waits such that they can
> receive notification that a thread termination has been requested, and
> subsequently break out of their processing in a timely fashion.
Their experience was that, if we accept the premise that “well-behaved”
components will do this, then many parts of the kernel were not
well-behaved. I no longer recall all the details, but they could get
indefinitely-long hangs. I used to be able to get it just by blocking on
a semaphore; nothing short of TerminateThread could blast it loose. I long
ago recognized that I had to multiwait on the synchronization primitive
AND a “shutdown event”. Timed waits were far too messy because of the
necessary cleanup involved.
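The "wait on the primitive AND a shutdown event" idea can be sketched roughly as follows. This is a portable standard-C++ analogue, not the Win32 code I actually used; in Win32 the same shape would be a WaitForMultipleObjects call on the primitive's handle plus a manual-reset event. The names InterruptibleQueue, request_shutdown, and run_worker are made up for illustration.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical work queue whose blocking pop() also wakes on shutdown.
class InterruptibleQueue {
public:
    void push(int item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(item);
        }
        cv_.notify_one();
    }

    // Plays the role of setting the "shutdown event".
    void request_shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            shutdown_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until an item is available OR shutdown has been requested.
    // Returns false on shutdown so the caller can unwind normally.
    bool pop(int& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return shutdown_ || !items_.empty(); });
        if (shutdown_) {
            return false;  // shutdown takes priority over pending items
        }
        out = items_.front();
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<int> items_;
    bool shutdown_ = false;
};

// Worker loop: it returns to its caller when shut down, so any cleanup
// code after the loop (or in destructors) still runs.
int run_worker(InterruptibleQueue& queue) {
    int processed = 0;
    int item = 0;
    while (queue.pop(item)) {
        ++processed;
    }
    return processed;
}
```

The point of the design is that the wait has exactly two ways out, and both of them return control to the thread's own code. No timed wait is needed, and nothing has to be torn down from outside.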
Cleanly terminating threads was something I taught in my classes on
Advanced Windows System Programming. It is /always/ possible to tell a
thread to cleanly terminate, unless it is stuck on some blocking call that
effectively overrides the request. Key here is to avoid blocking calls.
Network APIs are the worst offenders, but even file system calls can be
hung up on a non-responding server, and device calls to custom devices,
likewise.
Some years ago, we had to do DNS name resolution on program startup, so it
could re-establish communication. There were no asynchronous DNS calls
then, so I did it all in a background thread, with an unobtrusive progress
bar displayed on the status bar. An attempt to shut the program down
required putting up a dialog that said “waiting for network response” and
I later added a progress bar that counted down 70 seconds (and recorded
the time right before the call so it was mostly accurate). That thread
could not be broken out of the gethostbyname (if I’ve remembered the call
correctly) function even by a call to ExitThread.
When I talk about “erroneous programs”, I am basing this on many decades
of programming experience (and almost 20 years in Win32) of the kind of
code that gives the illusion of working without actually being correct.
Tiny changes in the assumptions mean the program stops working (sometimes
in subtle ways) and the programmer often ignores certain issues I know to
look for, because that part of the program is “known to work”. Tossing in
gratuitous “ExitThread” and “ExitProcess” calls is among the most common
failure modes. I’ve even had a programmer tell me “I was told that
ExitProcess was bad because the OnExit calls aren’t made, so now I call
exit(n)”. Of course, when I go looking for the error, the first place I
look is to make sure that all cleanup paths are properly executed. They
usually aren’t. Consider the model:
UINT threadfunc(LPVOID p)
{
    // allocate thread state structure and initialize it
    whatever->runthread(threadstate);
    // deallocate thread state structure
    // notify someone the thread has cleanly terminated
    return 0;
}
Note that if someone writes an ExitThread call (or one of its many
synonyms) then the lines following the runthread() call are never
executed. But they’re not important anyway (and if you believe this, my
uncle was in the Nigerian army and I need your help in getting his assets
out of the country). Since many of these calls are in (untested) error
handling paths, and they leak small amounts of memory, the field report
says “once every 10-14 days we have to restart the service because its
memory footprint has become ridiculously large”. You can’t get this
failure in-house. It is worse if the whatever->runthread() call is being
used to “enter C++ space”. When the ExitThread call happens, all existing
objects in the heap, that would have been cleaned up by their end-of-scope
destructors, are left behind. My “opinions” are backed by years of
experience in finding and fixing these kinds of errors.
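One way to make the cleanup in that model hard to skip is to hang it on destructors and have error paths return (or throw) instead of exiting the thread. What follows is a minimal sketch in portable C++, under the assumption that the error is reported by an exception; ThreadState, ExitNotifier, run_thread, and the log vector are all hypothetical stand-ins, not anything from the client code.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical per-thread state; the log records the order of events.
struct ThreadState {
    std::vector<std::string>& log;
    explicit ThreadState(std::vector<std::string>& l) : log(l) {
        log.push_back("allocated");
    }
    ~ThreadState() { log.push_back("deallocated"); }
};

// Stands in for "notify someone the thread has cleanly terminated".
// Its destructor runs on every path that unwinds the stack.
class ExitNotifier {
public:
    explicit ExitNotifier(std::vector<std::string>& log) : log_(log) {}
    ~ExitNotifier() { log_.push_back("notified"); }
    ExitNotifier(const ExitNotifier&) = delete;
    ExitNotifier& operator=(const ExitNotifier&) = delete;

private:
    std::vector<std::string>& log_;
};

// Stand-in for whatever->runthread(); the failure is simulated.
void run_thread(ThreadState& state, bool fail) {
    state.log.push_back("running");
    if (fail) {
        throw std::runtime_error("simulated error path");
    }
}

unsigned thread_func(std::vector<std::string>& log, bool fail) {
    ExitNotifier notifier(log);  // constructed first, destroyed last
    ThreadState state(log);      // constructed second, destroyed first
    try {
        run_thread(state, fail);
    } catch (const std::exception&) {
        return 1;  // error path: a return still unwinds through both destructors
    }
    return 0;
}
```

Both the success path and the error path leave the same trail in the log: allocated, running, deallocated, notified. An ExitThread or exit() call dropped into run_thread would bypass both destructors, which is exactly the leak described above.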
I once had a client insist that C++ collections were unreliable because
they leaked memory (I was being asked to recode my solution because I had
used std::map). I said this was simply not true, and he countered by
presenting me with “proof”, a program that leaked massive amounts of heap
space when it had used C++ collections. In less than five minutes I found
an exit() call buried in an error path, and eventually found a couple
dozen. They used aalloc compulsively, “because using malloc() gave us the
same problems” (why the failures were attributed solely to C++ in the
presence of this contradictory information escapes me).
So I have taken to referring to these time bombs as “erroneous code”
because, in fact, many of them /are/ erroneous, and those that are not
will become so under maintenance.
It’s easy to write code that gives the illusion of working. It’s a lot
harder to write code that is actually correct, and will remain so for
years. Key is to not commit certain errors, because if you do, you will
eventually commit them in a context where certain necessary preconditions
are violated by their usage. I call such practices “erroneous”.
joe
-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@flounder.com
Sent: Thursday, November 22, 2012 12:10 PM
To: Windows System Software Devs Interest List
Subject: Re:[ntdev] Unique process id
>>> Note that Windows has no concept of a “main thread”; in general, the
>>> main thread can exit any time, and the process is live as long as
>>> there is any potentially runnable thread in the process.
>
> ???
>
> The thread executing main() or WinMain() calls ExitProcess after return.
Which will not kill threads that are blocked in the kernel. Or at least,
it used to be that way.
joe
>
> –
> Maxim S. Shatskih
> Windows DDK MVP
> xxxxx@storagecraft.com
> http://www.storagecraft.com
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>