Troubleshooting a deadlock, I have a bunch of threads stuck in ExAcquireResourceSharedLite(). With a live debugger, it’s taking a long time to return from the !locks command as the debugger fetches data across a named pipe between 2 VMs.
If I have a dump of one of the threads’ call stacks on an x64 machine, do any of the stack arguments tell me which resource it’s waiting for? So far, I haven’t found anything useful. Seems like !locks is my only choice.
For instance, this one:
fffffa8006d26170 fffffa8006d26170 0000000000000000 fffffa800000000d : nt!KiSwapContext+0x7a
fffffa800ac14b50 fffff80001839e80 fffff8800000004b fffff80001839e80 : nt!KiCommitThreadWait+0x1d2
0000000000000000 fffffa800000001b 0000000000000000 fffff80001839e00 : nt!KeWaitForSingleObject+0x19f
fffffffffd9da600 fffffa800934c520 fffffa800a201880 0000000000000000 : nt!ExpWaitForResource+0xae
0000000000000000 fffffa8012c62a10 fffff80001864280 fffff880009c0180 : nt!ExAcquireResourceSharedLite+0x2c6
fffffa8012c62a10 fffff8a018c6d140 fffff8a018c6d010 fffffa800a201180 : Ntfs!NtfsCommonClose+0x87
0000000000000000 fffff800019b6d00 fffffa8006d26100 000d697000000005 : Ntfs!NtfsFspClose+0x15f
000d70700014f1e4 fffffa8006d26170 0000000000000080 fffffa8006d13040 : nt!ExpWorkerThread+0x111
fffff880009c0180 fffffa8006d26170 fffff880009caf40 000d7452000d73b0 : nt!PspSystemThreadStartup+0x5a
fffff880021bb000 fffff880021b5000 fffff880021ba8a0 0000000000000000 : nt!KiStartSystemThread+0x16
The problem is that on x64 the arguments are recorded on the stack in the home locations (those being shown by the debugger).
The usual technique we use is to try and figure out where the parameters were saved or loaded from (so either up or down the stack from the call of interest).
So (working from a random crash dump of my own so I can see the functions you’re looking at):
Ntfs!NtfsCommonClose+0x58:
fffff880014dce98 410fba6424040b bt dword ptr [r12+4],0Bh
fffff880014dce9f 0f820f0a0000 jb Ntfs!NtfsCommonClose+0xa71 (fffff880014dd8b4)
fffff880014dcea5 4439b3c0000000 cmp dword ptr [rbx+0C0h],r14d
fffff880014dceac 0f8593420600 jne Ntfs! ?? ::NNGAKEGL::`string'+0x1b2c0 (fffff88001541145)
fffff880014dceb2 0fb65308 movzx edx,byte ptr [rbx+8]
fffff880014dceb6 80e201 and dl,1
fffff880014dceb9 498d8c2400070000 lea rcx,[r12+700h]
fffff880014dcec1 ff15f18dfaff call qword ptr [Ntfs!_imp_ExAcquireResourceSharedLite (fffff88001485cb8)]
In this case RCX (the ERESOURCE address) was computed relative to the value in R12.
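As a sanity check that this disassembly lines up with the stack in the question, the return address of that call can be worked out from the addresses in the listing. A minimal sketch in Python, using only values copied from the listing (the variable names are just for illustration):

```python
# Values copied from the disassembly listing above.
listing_start = 0xfffff880014dce98         # shown as Ntfs!NtfsCommonClose+0x58
listing_start_offset = 0x58
call_address = 0xfffff880014dcec1          # the call to ExAcquireResourceSharedLite
call_bytes = bytes.fromhex("ff15f18dfaff")  # opcode bytes of that call

# Base of NtfsCommonClose in this dump.
function_base = listing_start - listing_start_offset

# The return address is the instruction immediately after the call.
return_address = call_address + len(call_bytes)
return_offset = return_address - function_base

print(f"Ntfs!NtfsCommonClose+{return_offset:#x}")
```

This prints Ntfs!NtfsCommonClose+0x87, the same offset as the NtfsCommonClose frame in the posted stack. A matching offset suggests, but does not prove, that both machines run the same Ntfs.sys build, so it is still worth disassembling on the target itself.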
So, either R12 was saved on the stack somewhere OR it is still in its original register… the next step is to find it.
We’re in luck:
0: kd> u nt!ExAcquireResourceSharedLite
nt!ExAcquireResourceSharedLite:
fffff800016e00a0 48895c2410 mov qword ptr [rsp+10h],rbx
fffff800016e00a5 48896c2418 mov qword ptr [rsp+18h],rbp
fffff800016e00aa 56 push rsi
fffff800016e00ab 57 push rdi
fffff800016e00ac 4154 push r12 ;; <<<<<<--------------------------- BINGO!
fffff800016e00ae 4156 push r14
fffff800016e00b0 4157 push r15
fffff800016e00b2 4883ec40 sub rsp,40h
See the push? It’s on the stack. R12 is the third push after entry (RSI, RDI, then R12), so all you need to do then is find the call frame and look at the value that is 0x18 below the stack slot holding the return address (the return address will be Ntfs!NtfsCommonClose+0x87, the instruction right after the call).
Take the value in that stack location and add 0x700 to it… Now you have the address of the ERESOURCE. Feed that into “!locks -v” and you should get what you are seeking.
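The arithmetic can be sketched as follows. This is only an illustration in Python: the push order and the 0x700 offset come from the two disassembly listings, while the stack addresses and the saved R12 value are made up, and the dict stands in for memory you would read in the debugger (for example with dq). It assumes the target runs a build with the same prologue.

```python
PUSH_SIZE = 8  # each push stores one 8-byte register on x64

# Order of the pushes in the ExAcquireResourceSharedLite prologue shown above.
PROLOGUE_PUSHES = ["rsi", "rdi", "r12", "r14", "r15"]

# From "lea rcx,[r12+700h]" in NtfsCommonClose.
ERESOURCE_OFFSET = 0x700


def saved_register_slot(return_address_slot, register):
    """Address of the slot where the prologue pushed `register`.

    On entry RSP points at the return address, and each push moves
    RSP down by 8 bytes before storing the register.
    """
    index = PROLOGUE_PUSHES.index(register)
    return return_address_slot - PUSH_SIZE * (index + 1)


def eresource_address(stack, return_address_slot):
    """Recover the ERESOURCE address from the saved copy of R12."""
    r12 = stack[saved_register_slot(return_address_slot, "r12")]
    return r12 + ERESOURCE_OFFSET


# Hypothetical values, NOT taken from the posted stack.
return_address_slot = 0xfffff88000001000
stack = {return_address_slot - 0x18: 0xfffffa8000002000}

print(hex(eresource_address(stack, return_address_slot)))
```

With these made-up values the sketch prints 0xfffffa8000002700. On a real dump, the slot address is where the Ntfs!NtfsCommonClose+0x87 return address sits on the waiting thread's stack, and the result is what goes to !locks -v.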
Tony
OSR
Let’s try that again.
“The problem is that on x64 the arguments *are not* recorded on the stack in the home locations (those being shown by the debugger).”
There, that’s better… Now to proofread it one more time before sending it.
And in the spirit of being pedantic: debug builds do copy them back to the home location. Useful for debugging your own drivers, but not so useful for the OS typically.
Tony
OSR
Thanks, Tony. Always nice to get a lesson from the master.
Now if you could fix my deadlock, that would be sweet. But I’m afraid it’s my problem. User-mode process serving requests from my driver. A vast pool of opportunity for problems. 
Good luck - those user mode callback deadlock problems are always a bit tricky (it’s not as simple as “well, do the operation unlocked and then deal with the race possibility” normally).
Tony
OSR
Unfortunately, it’s not me doing the locking. It’s NTFS in both cases (owner and multiple waiters). I can’t just tell NTFS not to lock stuff, now can I? And it’s not just my file system with its user-mode service. There’s also a 3rd party volume storage driver with user-mode service calls.