All threads in user mode process dump waiting

Hi All,

I am analyzing a user-mode process dump. When I look at the thread stacks, all threads appear to be waiting, and every call stack shows NtWaitForMultipleObjects on top. I know it is not really true that all threads are waiting, so why does the dump file show this? What happens when the user initiates a dump from Task Manager?

Thanks,

I know it is not really true that all threads are waiting.

Why do you think so? Most applications spend the vast majority of their time with all threads waiting.

It would be surprising if every thread in a UM process was waiting in NtWaitForMultipleObjects. That would usually point to a deadlock of some kind. Normally, at least some threads will wait via other APIs, especially the GetMessage family and often the IOCP or thread pool APIs.

It also depends greatly on how the dump was triggered. A dump triggered from Task Manager or VS (create dump) at some arbitrary point in the program's execution is different from a Dr. Watson (WER) generated dump.

Most applications spend the vast majority of their time with all threads waiting.

True, but it does not necessarily have to be the explicit wait request, at least as far as the target app/thread is concerned.
For example, any blocking file IO or registry-related call is going to put the target thread into the wait state at some point.
Actually, I think it would not be a gross exaggeration to say that in most cases threads enter the wait state behind the scenes, either because of a page fault or due to some system call other than an actual WaitXXX one.

However, in this particular case the call stacks seem to suggest that they have all entered the wait state explicitly, and did it right from userland by calling WaitForMultipleObjects(), which in turn has called NtWaitForMultipleObjects() behind the scenes.

Why do you think so?

Try to think logically. If all threads have explicitly entered the wait state by calling WaitForMultipleObjects(), who is going to release them (I am making certain assumptions here, more on that below)? Therefore, if you look at the whole thing purely from the logical perspective, the whole app is going to stay frozen until the end of time, whatever this term may mean in this context. This is just a logical deadlock, and there is absolutely nothing that may resolve it.

Certainly, I am making some assumptions here. I assume, first, that these are not time-bound waits; second, that no cooperating apps or KM drivers are involved; and third, that these threads haven’t made async IO requests that are still outstanding prior to entering an alertable wait (i.e. these threads don’t expect their waits to be interrupted by user APCs upon async IO completion).

Anton Bassov

Try to think logically. If all threads have explicitly entered the wait state by calling WaitForMultipleObjects(), who is going to release them (I am making certain assumptions here, more on that below)?

Your assumptions refute your own assertion. I stand by my original statement, and I’m not sure why you’re arguing. Most apps spend the majority of their time with all threads in a wait state. The main thread is waiting for something to pop into the message queue. Some threads wait on timers. Some threads wait on network activity. Some are thread pools waiting for work. It is a very common condition for a Windows application.

What Mr. Roberts said.

And it’s not at all unusual for every thread in a process to be waiting.

Peter

I’m not sure why you’re arguing.

Read my post carefully, and you will (hopefully) understand it…

Most apps spend the majority of their time with all threads in a wait state.

Who would ever argue about this part…

However, it does not necessarily imply that all these threads have entered the wait state by calling WaitXXX() in userland. This is the only thing that I am saying. Again, read my post carefully.

Your assumptions refute your own assertion.

Not at all. Again, read my post carefully.

The main thread is waiting for something to pop into the message queue. Some threads wait on timers. Some threads wait on network activity. Some are thread pools waiting for work.

However, in none of the above-mentioned cases will you see NtWaitForMultipleObjects() on top of the userland call stack, because the target thread has entered the wait state as a result of some IO-related call that has implicitly called KeWaitXXX behind the scenes, rather than by directly calling WaitXXX() in userland.

The fact that all threads blocked in NtWaitForMultipleObjects() strongly suggests a deadlock scenario, because all threads seem to be waiting on some event(s) that one of these threads is supposed to signal. This is the only thing that I am saying. Again, read my post carefully…
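To make the scenario above concrete, here is a minimal sketch in Python (not Win32) of two threads that each wait indefinitely on an event the other is supposed to signal. The names `run_workers`, `ev_a`, and `ev_b` are made up for illustration, and `threading.Event` only stands in for a Windows event object. With no outside signaler, neither thread ever wakes; once something external sets one event, the cycle unwinds.

```python
from __future__ import annotations

import threading


def run_workers(with_signaler: bool, timeout: float = 0.5) -> list[bool]:
    """Start two workers that each wait on an event the other must set.

    Returns, per worker, whether it finished within `timeout` seconds.
    """
    ev_a = threading.Event()
    ev_b = threading.Event()

    def worker(wait_on: threading.Event, then_set: threading.Event) -> None:
        wait_on.wait()   # indefinite wait, analogous to an INFINITE timeout
        then_set.set()   # never reached if nobody signals wait_on

    # Daemon threads so a deadlocked pair does not keep the process alive.
    threads = [
        threading.Thread(target=worker, args=(ev_a, ev_b), daemon=True),
        threading.Thread(target=worker, args=(ev_b, ev_a), daemon=True),
    ]
    for t in threads:
        t.start()

    if with_signaler:
        ev_a.set()  # an outside party (another app, a driver) breaks the cycle

    for t in threads:
        t.join(timeout)
    return [not t.is_alive() for t in threads]
```

Calling `run_workers(False)` leaves both workers blocked, while `run_workers(True)` lets both finish. That is the difference between the deadlock case and the "external component" case.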

Anton Bassov

Tim,

As a follow-up to my previous post, I have to admit that I have overlooked one more scenario (albeit a hypothetical one). What I had forgotten is that one may THEORETICALLY use WaitForMultipleObjects() to simulate the semantics of the poll()/select() UNIX calls. Certainly, this is not the most reasonable way of doing things on a system that supports IO completion ports, but technically there is no reason why it cannot be done. If the app writer decides to do things this not-so-reasonable way, you may, indeed, see NtWaitForMultipleObjects() on top of the userland call stacks of all waiting threads, but it would not necessarily imply that the target app is deadlocked.
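For readers less familiar with the poll()/select() pattern mentioned above, here is a rough Python sketch of a thread that blocks on several handles at once and wakes when any of them has IO ready. It uses `select.select` on socket pairs; the helper names `wait_for_any` and `demo` are invented for this example, and the mapping to WaitForMultipleObjects() is only an analogy. A dump taken while `waiter` is blocked would show it sitting in a multi-object wait even though nothing is deadlocked.

```python
import select
import socket
import threading


def wait_for_any(socks, timeout=None):
    """Block until at least one socket is readable; return the ready ones."""
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable


def demo():
    # Two independent "IO sources"; the waiter watches the receive ends.
    a_recv, a_send = socket.socketpair()
    b_recv, b_send = socket.socketpair()
    result = {}

    def waiter():
        ready = wait_for_any([a_recv, b_recv], timeout=5.0)
        result["data"] = [s.recv(16) for s in ready]

    t = threading.Thread(target=waiter)
    t.start()
    b_send.sendall(b"io-done")  # IO completes on the second source only
    t.join()

    for s in (a_recv, a_send, b_recv, b_send):
        s.close()
    return result["data"]
```

The waiting thread is idle by design here: it is released by IO arriving from outside, not by one of its sibling threads.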

Anton Bassov

call that has implicitly called KeWaitXXX behind the scenes

If you wait “behind the scenes” then you’re going to see KeWaitXXX on the stack, right?

Oh, whatever. This has ceased to help the OP, I’m sure.

Peter

To the OP: can you analyze the call stacks above this wait? That’s where you are likely to find useful information. If the number of threads is small enough, you can post them here and we can try to help you.
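If there are too many threads to read by hand, one way to look "above the wait" is to group threads by the first frame that does not belong to a system DLL. The sketch below assumes the stacks have already been extracted into lists of `module!function` strings, top of stack first. The `SAMPLE` data, the `myapp` module, and the choice of which modules count as "system" are all assumptions for illustration, not output from a real dump.

```python
from collections import defaultdict

# Assumption: frames in these modules are wait plumbing, not app logic.
SYSTEM_MODULES = {"ntdll", "kernelbase", "kernel32"}

# Hypothetical stacks, keyed by thread id, top of stack first.
SAMPLE = {
    1: ["ntdll!NtWaitForMultipleObjects",
        "KERNELBASE!WaitForMultipleObjectsEx",
        "myapp!WorkerLoop"],
    2: ["ntdll!NtWaitForMultipleObjects",
        "KERNELBASE!WaitForMultipleObjectsEx",
        "myapp!WorkerLoop"],
    3: ["ntdll!NtWaitForMultipleObjects",
        "KERNELBASE!WaitForMultipleObjectsEx",
        "myapp!ShutdownAndJoin"],
}


def first_app_frame(frames):
    """Return the first frame outside SYSTEM_MODULES, or None if there is none."""
    for frame in frames:
        module = frame.split("!", 1)[0].lower()
        if module not in SYSTEM_MODULES:
            return frame
    return None


def group_by_waiter(stacks):
    """Map each first application frame to the thread ids waiting under it."""
    groups = defaultdict(list)
    for tid, frames in stacks.items():
        groups[first_app_frame(frames)].append(tid)
    return dict(groups)
```

With the sample data, `group_by_waiter(SAMPLE)` puts threads 1 and 2 under `myapp!WorkerLoop` and thread 3 under `myapp!ShutdownAndJoin`, which quickly shows whether the waits are ordinary idle workers or something less expected.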

If you wait “behind the scenes” then you’re going to see KeWaitXXX on the stack, right?

This is EXACTLY my point. If it was the OP’s case, it would be just a perfectly standard situation that one would expect to see 99+% of the time.

However, according to the OP, what he actually sees is NtWaitXXX (i.e. the system call exported via the SSDT), rather than KeWaitXXX() (i.e. the call that one would normally expect a KM component to use for entering the wait state). This, in turn, leads us to the logical conclusion that absolutely all threads of the app have entered the wait state explicitly, at userland's request.

I can see 4 possibilities here:

  1. The target app is waiting for some event from some external component (i.e. an app or driver) that it is tightly coupled with. Such apps are relatively rare.
  2. The target app is polling the events that it expects to get signaled upon completion of outstanding IO, i.e. it tries to implement the semantics of the poll()/select() UNIX calls. This is a pretty bizarre and inefficient way of doing things on a system that supports IO completion ports.
  3. The target app is deadlocked.
  4. The OP has just provided us with the wrong description of the situation, and in actuality it is, indeed, KeWaitXXX and not NtWaitXXX on top of the stack. If this is the case…well, then it means that I have started “a heated debate” that eventually involved “The Hanging Judge” himself, on a thread that has been nonsensical since its very inception. Nothing particularly new here, don’t you think…

Anton Bassov