Deadlock while calling ZwOpenKey from PostOpCreate

Hello,

I am calling ZwOpenKey from PostOpCreate to read certain registry keys.

While it works fine most of the time, I ran into a deadlock early in the boot sequence, while the registry hives are being loaded. A typical stack trace looks like the following.

4.000168 86099380 0000100 Blocked nt!KiSwapContext+0x25
nt!KiSwapThread+0x83
nt!KeWaitForSingleObject+0x2e0
nt!ExpWaitForResource+0xd3
nt!ExAcquireResourceSharedLite+0xd9
nt!CmpLockRegistry+0x25
nt!CmpBuildHashStackAndLookupCache+0x43
nt!CmpParseKey+0x110
nt!ObpLookupObjectName+0x5b0
nt!ObOpenObjectByName+0xea
nt!NtOpenKey+0x1ad
nt!KiFastCallEntry+0xf8
nt!ZwOpenKey+0x11
nt!VfZwOpenKey+0x6d



fltmgr!FltvPostOperation+0x4d
fltmgr!FltpPerformPostCallbacks+0x1c5
fltmgr!FltpProcessIoCompletion+0x10
fltmgr!FltpPassThroughCompletion+0x89
fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x269
fltmgr!FltpCreate+0x26a
nt!IovCallDriver+0x112
nt!IofCallDriver+0x13
nt!IopParseDevice+0xa35
nt!ObpLookupObjectName+0x5b0
nt!ObOpenObjectByName+0xea
nt!IopCreateFile+0x447
nt!IoCreateFile+0xa3
nt!NtCreateFile+0x30
nt!KiFastCallEntry+0xf8
nt!ZwCreateFile+0x11
nt!CmpOpenHiveFiles+0x117

The FILE_OBJECT in question points to a registry hive file, viz. \windows\system32\config\system, sam, software, etc.

It seems to me that the registry lock is already held by the time our PostOpCreate is called, so when our filter calls ZwOpenKey it blocks trying to acquire the registry lock, causing a deadlock.

Questions for the experts:

  1. Is it possible to detect that the FILE_OBJECT in question is a registry hive?
  2. Is there a way to check, without blocking, whether the registry lock is held?
  3. Is there any other way to solve this issue apart from not reading the registry at all from PostOpCreate?

Any advice would be much appreciated!

Thanks.
-Prasad

Who is holding the exclusive lock? It can’t be the current thread, because
another ExAcquire- call wouldn’t block that thread (use !locks with the Cm- lock
address stored in the nt!CmpLockRegistry fn).

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@vmware.com
Sent: Wednesday, March 21, 2012 7:34 AM
To: Windows File Systems Devs Interest List
Subject: [ntfsd] Deadlock while calling ZwOpenKey from PostOpCreate



NTFSD is sponsored by OSR


Hi Petr,

I traced this backwards and here is what I find.

This is the NtInitializeRegistry code which is triggering the deadlock.

808b2803 e8d0af0000 call nt!CmpLockRegistryExclusive (808bd7d8)
808b2808 ff7508 push dword ptr [ebp+8]
808b280b e850020100 call nt!CmpCmdInit (808c2a60)
808b2810 e871ea0000 call nt!CmpSetVersionData (808c1286)
808b2815 e8daaf0000 call nt!CmpUnlockRegistry (808bd7f4)

CmpCmdInit calls down to CmpInitializeHiveList, which in turn creates a new thread (CmpLoadHiveThread) to load each hive. CmpLoadHiveThread calls ZwCreateFile to open the registry hive file, which lands in our filter driver’s PostOpCreate.

As you can see, CmpCmdInit runs under the exclusive global registry lock (between nt!CmpLockRegistryExclusive and nt!CmpUnlockRegistry). When we call ZwOpenKey from PostOpCreate, it internally tries to acquire the registry lock shared and blocks forever, since the lock is already held exclusively by the thread that called NtInitializeRegistry.

NOTE: This behavior was observed on a 32-bit Windows 2003 SP2 VM. When I tried a Windows 2008 R2 SP1 VM, it was not reproducible. I am going to try Vista next. Maybe this is the behavior of older versions of Windows, and registry locking was improved later?

Thanks.
-Prasad

> I am calling ZwOpenKey from PostOpCreate to read certain registry keys.

Aren’t you violating the IRQL restrictions due to this? :)


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Registry locking is very complex and I’ve seen many filters (including a couple of my own) get into the type of deadlock you’ve encountered. The registry had (maybe still has?) this locking model where it would acquire an exclusive lock when “extending” the registry (loading a new hive or even increasing the size of a registry file) and if you happen to try to access the registry during that path you’re out of luck.

As far as I remember, postCreate or preCreate doesn’t matter, the lock is acquired before the create is issued so you can still deadlock in preCreate.

I am not aware of any documented way to identify a file as a registry file. You can use the name but that’s rather finicky (there are multiple files, transactions add even more files and so on; the file extensions aren’t very specific anyway).
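For what it's worth, here is a rough sketch of the name-based check, with all the caveats above. It is plain portable C over an ASCII, volume-relative path so that it can be compiled and tried outside a driver; in a real minifilter the name would come from FltGetFileNameInformation as a UNICODE_STRING and the comparison would have to be done on that. The helper names (ascii_ncasecmp, is_probable_hive_path) are made up for illustration, and the check assumes the system root is \Windows and only covers files directly inside the config directory. It will also match the log and backup files that live there, and it will miss hives stored elsewhere (per-user hives, for example).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n characters (ASCII only). */
static int ascii_ncasecmp(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);
        if (ca != cb)
            return ca - cb;
        if (ca == '\0')
            return 0;
    }
    return 0;
}

/*
 * Hypothetical helper: returns 1 if `path` names a file directly inside
 * \Windows\System32\config\ (where the SYSTEM, SAM, SOFTWARE hives live),
 * 0 otherwise. Assumption: `path` is volume-relative and the system root
 * is \Windows.
 */
int is_probable_hive_path(const char *path)
{
    static const char dir[] = "\\windows\\system32\\config\\";
    size_t dir_len = sizeof(dir) - 1;

    /* Must be strictly longer than the directory prefix to name a file. */
    if (path == NULL || strlen(path) <= dir_len)
        return 0;
    if (ascii_ncasecmp(path, dir, dir_len) != 0)
        return 0;

    /* Reject anything in a subdirectory, e.g. ...\config\systemprofile\... */
    return strchr(path + dir_len, '\\') == NULL;
}
```

A filter could use a check like this to skip its registry access for matching files, erring on the side of skipping too often rather than too rarely.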

Newer OS releases did a lot of work to improve locking (making it more granular and less deadlock prone) but I think there are still cases where a global lock is held.

Based on past experience I would definitely try to avoid calling ZwOpenKey from a postCreate (or preCreate).
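One way to follow that advice is to read the settings once, from a context that is not inside a create, and have the create path consult only the cached copy. Below is a minimal user-mode model of that pattern, not driver code. The names (filter_config, config_load_once, config_flags_for_create, fake_registry_reader) and the single scan_flags setting are all hypothetical; the registry_reader callback stands in for whatever ZwOpenKey / ZwQueryValueKey sequence the driver would run, for example during driver initialization. This assumes the keys in question are readable at that point, which may not hold for hives that are loaded later in boot.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical settings the filter needs while handling a create. */
typedef struct {
    bool     loaded;
    unsigned scan_flags;
} filter_config;

/* Stand-in for the code that would actually read the registry. */
typedef bool (*registry_reader)(unsigned *scan_flags_out);

static filter_config g_config; /* zero-initialized: loaded == false */

/*
 * Call from a context that is not inside a create callback (assumption:
 * e.g. driver initialization, before callbacks are registered, so no
 * locking is shown here). Returns true once the cache is populated.
 */
bool config_load_once(registry_reader read_registry)
{
    unsigned flags = 0;

    if (g_config.loaded)
        return true;
    if (read_registry == NULL || !read_registry(&flags))
        return false;

    g_config.scan_flags = flags;
    g_config.loaded = true;
    return true;
}

/* Called on the create path: reads the cache, never the registry. */
unsigned config_flags_for_create(unsigned default_flags)
{
    return g_config.loaded ? g_config.scan_flags : default_flags;
}

/* Illustrative reader that pretends the registry value is 0x5. */
bool fake_registry_reader(unsigned *scan_flags_out)
{
    *scan_flags_out = 0x5u;
    return true;
}
```

If the cache has not been populated yet, the create path falls back to a default instead of blocking, which is the property that matters for the deadlock discussed here.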

Thanks,
Alex.

@Maxim, the ZwOpenKey documentation lists the requirement as IRQL = PASSIVE_LEVEL, and PostOpCreate is called at PASSIVE_LEVEL. It doesn’t say “and with APCs enabled” as it does for some other Zwxx calls, e.g. ZwCreateFile. Given this, I am not sure why you say that I am violating IRQL restrictions.

Alex, yes you are correct. As mentioned in my comment #3, the exclusive lock on registry is taken even before calling ZwCreateFile on the hive file.

On another note, when I loaded a new hive using regedit.exe (after logging in etc.), the ZwOpenKey call made from PostOpCreate for the new hive file did not trigger a deadlock. Hence, it seems that loading a hive this way takes a different code path from the one taken during early boot by NtInitializeRegistry.

Thanks.
-Prasad