> Hi Joseph,
>Key here is that you have to try all the accepted “good”
>ways of asking a process to close; if they all fail, then you
>can consider TerminateProcess.
Never said anything different, …
>and putting a WM_QUIT message in the queue
Posting this to a foreign process is for sure deadly, but not if you are
prepared to handle it in your own app!
****
Actually, this is very hard to deal with. Suppose you need the message
pump during cleanup? The alternative is to carry around massive amounts
of state in global variables (or CWinApp-derived-class variables)
complicating the coding and maintenance. I’ve seen the results of this,
and they Aren’t Pretty.
****
>Note that anything that causes a long delay in message processing
That's why I am using threads, synchronization, and a signaling mechanism
to keep false positives to a minimum.
****
Threads in your app? And what is a “signaling mechanism”? The choice of
implementation makes a huge difference in the robustness of the app suite.
****
>Registered Window Message requesting emergency shutdown, which
>is semi-isomorphic to WM_CLOSE, except it knows there is no user to
>respond to confirmation
If your application has its own mechanism to handle this, that's fine, but
the application must still be responding, or no message of any kind will
arrive once its message pump, threads, or polling queue for commands is
frozen! Here Application Recovery and Restart (ARR) comes in quite handy,
or another watchdog can be of help, e.g. a service, but there you have the
session barriers, etc., …
****
If the message pump freezes, it means the app is defective in design.
This is something you have to make sure cannot happen. Note that, as you
point out, you cannot use a service to do this. Therefore, there is
nothing to prevent the user from killing your watchdog app. Which is why
my system had mutual auto-restart.
Note that some of the mechanisms which purport to save a program’s state
during a restart either cannot do so, or cannot restore the state they
think they saved, or in fact are saving the very state that caused the
lockup in the first place, none of which are particularly good.
****
>suites of programs that are robust and can make guarantees about
> correctness.
There is simply no …
****
Actually, unless someone uses task manager to kill one of my apps, we
guarantee correctness. I spent weeks making sure that worked. Once
TerminateProcess is in the picture, though, there can never be guarantees.
****
I think we can debate this “best practices” and “design concepts” topic
endlessly and still find no 100% failsafe solution. It's impossible to
handle all kinds of failures in an application, because there are things
that can be out of the scope of your app, e.g. the runtime, external
situations, operating system failure, power loss, and so on. IMHO there is
NO 100% safe way to make an application work, even using transactional
processes; there is always something that can fail. I am sure you know who
Edward A. Murphy is.
****
Been there, done that, could even recover from memory parity errors. One
app restarted itself about once a day when its heap got corrupted; we
never did find the cause of the heap corruption, but we could and did (I
did) recover from it. I will admit, I had not expected to get a memory
parity error reflected to the process by the kernel, but it did, and I
handled it correctly. The process promised “best attempt” delivery of
messages to its communicating processes, but if there was a failure, both
sender and receiver got a notification of the error. MTBF went from 45
minutes to six weeks (when a campus-wide power failure shut down
everything). I spent on the order of a year working on this project,
which was an OS-critical component which, when it failed, required a
complete reboot to fix. I could guarantee that either every packet got
through or there was a notification of failure.
Yes, if the power fails while the disk directory is being written, you
have potential problems, but a transacted file system (which we had)
catches those as well. So there is a way, and I’ve done it, and therefore
I don’t believe the assertion that it cannot be done.
****
I would like to ask something about DuplicateHandle and handle leaks. What
if I have a Process A and a Process B, and Process A accesses an object
handle in Process B with DuplicateHandle and DUPLICATE_SAME_ACCESS, but
Process A dies before Process B can close the handle to Process A's object
handle? Am I right to say that the kernel object is not freed unless
Process B releases the handle to the object in dead Process A? Isn't that
a “sort” of handle leak? As long as there is a reference count > 0, the
object and its memory stay alive, but the object does not belong to
Process B. Or am I wrong, and the handle to the object goes down with
Process A, leaving the duplicate handle in Process B invalid?
****
Duplicate handles will leak only if the process into which they have been
duplicated keeps running without closing them. ALL handles are closed
when a process terminates. The logic does not care what kind of handle
it is, original or duplicate; it is all maintained by reference counts in
the kernel. In fact, it is frequently the case that once process A
creates a duplicate handle in B that A will terminate. Now there is only
one handle left, the handle in B. When B terminates, if the handle has
not been closed, it will be forced closed. It is not a “sort” of handle
leak; it is not a leak at all. The system is behaving correctly. If B is
using the handle, the handle is validly in use, and must NOT be forced
closed just because A terminated. And the handle DOES belong to process
B. The internal logic does not care in the slightest that process A
*opened* the handle; what the kernel sees is two references to an object,
then one reference to that object. There is no way for it to tell WHICH
reference (original or duplicate) is outstanding, and, frankly, nobody
cares. All that matters is that there is a reference.
If you have a long-term process B, and A creates a duplicate handle in B’s
handle space, and B never closes it, then B is defective. Fix the bug.
There is no concept of “belonging to” at this level, just raw “reference
count”. A handle belongs to the process that can address it. As long as
any one handle remains in use, the objects managed by that handle remain
valid. Closing the “original” handle has ZERO impact on the validity of
any and all duplicates which may have been made of it.
Consider the classic example: using an unnamed pipe to deal with a child
process writing to stdout/stderr. The protocol goes like this:
* Create an anonymous pipe, getting two handles, one for its input side
and one for its output side
* Create an inheritable duplicate of the input handle
* Place the handle value in the process creation block for stdout
* Create an inheritable duplicate of the input handle (a second one!)
* Place the second handle value in the process creation block for stderr
* CreateProcess, specifying that inheritable handles be inherited.
* Close the non-inheritable input handle
* Close the inheritable input handle
* Close the second inheritable input handle
* ReadFile from the output handle until you get ERROR_BROKEN_PIPE,
indicating the child process has closed the handle
Note that the non-inheritable and inheritable input handles must all be
closed so that the ONLY remaining valid handles are in the child process.
When the child process terminates, the result will be that those handles
are forced closed and the ERROR_BROKEN_PIPE will result. Failure to close
these handles (the most common error my students make, even though I give
them code examples on the slides) means the ReadFile will hang, because
there are still valid handles to the input side of the pipe, even though
the child process has exited.
The reason for two duplicates is this: some apps do not use stderr, so the
first thing they do is close the stderr handle. Suppose the numeric value
of that handle was 12345 (just for discussion). If I put the same handle
value in for stdout and stderr, then when the child process closes stderr,
it closes handle 12345. The kernel does not care that this is also the
value for the stdout handle; it closes the handle. The caller sees broken
pipe; the child process will get an error writing to stdout. So you need
two duplicates. Upon completion of the second DuplicateHandle, there are
three handles to the input side: the original non-inheritable handle and
two inheritable handles. After CreateProcess, there are five handles to
the input side: the three aforementioned parent handles, and the two
handles for stdout and stderr in the child process. Now, let’s say the
child process closes stderr. Now there are four handles. The parent
closes the non-inheritable input handle and the two inheritable input
handles, leaving one valid handle, the child process’s stdout. When that
handle is closed, either by the child process issuing a CloseHandle or the
process just terminating (say, for an error), then there are zero handles
left, and the ReadFile gets the ERROR_BROKEN_PIPE condition. Note that
the parent process no longer possesses ANY handles to the input side of
the pipe, yet the pipe remains perfectly valid, because there is one valid
handle.
I find a lot of confusion about “ownership” issues. Processes do not own
other processes (unless you create a process group, which is a rare
occurrence), and the termination of one process has no consequences in
terms of the kernel forcing termination of any processes it created. The
concepts of “parent” and “child” exist for conversational purposes, but the
kernel does not care in the slightest. Similarly, processes own threads;
threads do not own threads, and termination of one thread has zero kernel
impact on the status of any threads it created (your program may hang
because it is waiting for something from that recently-deceased thread,
but that represents a bug in your program, not the kernel). Processes own
handles. They do not care if these are “original” handles from
CreateFile, or “duplicate” handles from DuplicateHandle. A handle is a
handle. FAILING to close a handle can have an impact on correctness of a
program (the most common error I see is a student says “It hangs on the
ReadFile call” which is true because there is nothing to read, but the
presence of valid handles to the input side means that there WILL be input
someday, except that the parent process is not going to write to those
handles, but the kernel does not know or care. The code is simply
incorrect as written).
So don’t think that a duplicate handle is a second-class citizen, whose
existence depends on the continuing existence of the original handle. It
is a full-fledged handle, with all the rights thereunto, one of which is
to keep the file or device open until some data is read or written via
that handle. Only when ALL outstanding handles are closed will you get
IRP_MJ_CLOSE in your driver.
joe
****
NTDEV is sponsored by OSR
For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars
To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer