Reading registry during boot, for hostid.

Jorgen_Lundman Member - All Emails Posts: 45

Hello list,

So ZFS has an idea of "hostid", just a 32-bit unique identifier that is stamped into the pool, so it can issue a warning when
you attempt to import a pool on a "different" host. Hostid should stay the same between reboots, but where to draw the line in the sand
beyond that is flexible (to me). Reinstalling the OS could change it, or changing parts of the hardware.

Anyway, I looked around for various things to use as hostid, and settled on
"\REGISTRY\MACHINE\SOFTWARE\Microsoft\Cryptography\MachineGuid".
I then use fnv_32a hash, and set it. All is well, I thought.
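
For reference, a minimal sketch of the hashing step, written as portable C rather than driver code. It assumes the GUID is available as a NUL-terminated byte string (the registry stores it as a wide string, so which bytes get hashed is a choice that has to be made consistently). `hostid_from_guid` is a hypothetical helper name, not something taken from the ZFS sources.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime. */
static uint32_t fnv1a_32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t hash = 2166136261u;          /* FNV-1a 32-bit offset basis */

    for (size_t i = 0; i < len; i++) {
        hash ^= p[i];
        hash *= 16777619u;                /* FNV 32-bit prime */
    }
    return hash;
}

/* Hypothetical helper: derive a hostid from a NUL-terminated GUID string. */
static uint32_t hostid_from_guid(const char *guid)
{
    size_t len = 0;

    while (guid[len] != '\0')
        len++;
    return fnv1a_32(guid, len);
}
```

The hash is deterministic, so the same GUID bytes give the same hostid on every boot; it changes only if the GUID itself changes.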

Then I rebooted with my driver installed and unfortunately RtlQueryRegistryValues() returns
0xC0000034 STATUS_OBJECT_NAME_NOT_FOUND.

My guess is that not all of the Registry is loaded and ready to be used during boot. My driver uses
ServiceType = 1
StartType = 1
ErrorControl = 1
LoadOrderGroup = "File System"

Naturally, I could simply probe until it is available, then set it, if I was desperate. But perhaps
there is an event or callback that happens when the Registry is available, that I could plug into and
do all my Registry reads there (I only have one other,
"\REGISTRY\MACHINE\HARDWARE\RESOURCEMAP\System Resources\Physical Memory"
which thankfully appears to work, even during boot).

But after Googling for a bit, it seems perhaps I should not be reading the Registry at all? Would
reading MachineGuid be considered "uncool" ?

Another option would be to use the Driver's own Registry (currently not used at all), and if hostid
does not exist, randomly generate one, and write it in for future reboots. (Assuming I am allowed to write to it
during boot, or I have the same "event" question as above.)

What is recommended here, as best practise? I rely on you fine Windows-devops to show
me the right path :)

Lund

Comments

  • anton_bassov Member Posts: 4,984

    My guess is that not all of the Registry is loaded and ready to be used during boot.

    Of course - the Registry is physically stored in certain files in a system folder. Therefore, you have to wait until both the whole storage stack
    and NTFS are up and running if you want to make the full use of it.

    My driver uses
    ServiceType = 1
    StartType = 1
    ErrorControl = 1
    LoadOrderGroup = "File System"

    Well, the very first question that comes to mind is why one wants his driver to be loaded at boot time, given that this driver neither controls any hardware device nor implements a system service that may be essential for booting, and does not have any dependent drivers or services either. Change this part, and the problem will go away by itself....

    Anton Bassov

  • Jorgen_Lundman Member - All Emails Posts: 45

    Booting Windows on ZFS will be something to work on soon.

    But, on the original question and lack of replies, I think I will just use the driver's registry and generate a hostid instead of taking advantage of existing information.

  • anton_bassov Member Posts: 4,984
    edited May 16

    Booting Windows on ZFS will be something to work on soon.

    BTW, I had a brief look at your project. Sorry, but the things that you do in your code seem to be just horrible.

    For example, look at the snippet from spl_mutex_enter()

    https://github.com/openzfsonwindows/ZFSin/blob/master/ZFSin/spl/module/spl/spl-mutex.c

        if (mp->m_owner == current_thread())
            panic("mutex_enter: locking against myself!");
    
    #ifdef DEBUG
        if (*((uint64_t *)mp) == 0xdeadbeefdeadbeef) {
            panic("SPL: mutex_enter");
        }
    #endif
        //lck_mtx_lock((FAST_MUTEX *)&mp->m_lock);
        ExAcquireFastMutex((FAST_MUTEX *)&mp->m_lock);
        //KeWaitForSingleObject((KMUTEX *)&mp->m_lock, Executive, KernelMode, FALSE, NULL);
        mp->m_owner = current_thread();
    
        // Windows increases irql in fastmutex, this is not how
        // we want to use mutex with unix
        // We should research and check if ExAcquireResourceExclusiveLite() is better for this
        KeLowerIrql(PASSIVE_LEVEL);
    
    
    

    You have arbitrarily(!!!) lowered IRQL from APC_LEVEL (which prevents the code from re-entering itself by disabling APC delivery to the target thread) down to PASSIVE_LEVEL, effectively enabling APC delivery. Now consider what happens if an APC gets delivered to the thread that owns a mutex, and it tries to acquire this mutex recursively.....

    To be honest, after having seen this "engineering feat" (and particularly the explanation of your reasoning behind it) I immediately lost my interest in looking any further....

    Anton Bassov

    [MODS: Edit for clarity, because Anton needs to learn how to use Markdown.]

    Post edited by Peter_Viscarola_(OSR) on
  • Jorgen_Lundman Member - All Emails Posts: 45

    Thanks for taking a look anyway, despite your bad experience, and I agree - that was a hack to get long-hold mutex code to work, in the very early days of porting the code.

    It seems Unix and Windows have very different ideas of how a mutex can be used. A mutex in illumos can be held for a long time, even days. That is a perfectly valid way to use mutex locking in a Unix kernel.

    Whereas I found that in Windows, it fiddles with the IRQL level even if I don't want it to, and is expected to be released asap, and by the very same thread.

    I suspect Windows simply does not have any kernel built-in options that will fit with Unix usage, and I will simply have to implement my own mutex code. It is on the TODO - but became much lower priority as the hack happens to work (for now).

    When I started this port, I did not know if it could be done, let alone by me as I am new to Windows kernels - but since it has become clear that it can be done, it is time to go back and make things proper. But it has been surprisingly hard to get "Windows Best Practises". Even here, it took you being revolted to increase the mutex TODO entry :)

  • Jorgen_Lundman Member - All Emails Posts: 45

    You also mention "recursive mutex" usage; I would guess that is another place where Windows differs. Take a Unix mutex, for example SunOS/Solaris/illumos:
    https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/os/mutex.c#L411

    Simply not a valid usage - I did not write that panic test above, it came from SYSV.

  • anton_bassov Member Posts: 4,984

    Even here, it took you being revolted to increase the mutex TODO entry :)

    Well, this is not the question of TODO entry....

    Basically, what you did is, in UNIX/Linux terms, exactly the same thing as enabling interrupts in a function that is meant to run with interrupts disabled. In other words, you have intentionally violated one of the most basic rules of Windows kernel programming, and, to make it much worse, found a justification for it....

    Simply not a valid usage - I did not write that panic test above, it came from SYSV.

    Well, if there was no "panic" part it would be even worse - you would simply deadlock, because fast_mutex cannot be acquired recursively....

    It seems Unix and Windows have very different ideas of how a mutex can be used. A mutex in illumos can be held for a long time, even days. That is a perfectly valid way to use mutex locking in a Unix kernel.

    Whereas I found that in Windows, it fiddles with the IRQL level even if I don't want it to, and is expected to be released asap,
    and by the very same thread.

    .....which reveals more and more about the level of your current understanding of Windows kernel...

    In actuality, a mutex can be held as long as you wish, which is true of both "regular" mutexes and fast ones. A fast mutex is "fast" because it relies upon a test-and-set operation, which means it cannot be acquired recursively by the same thread. This is why it elevates IRQL to APC_LEVEL, which, however, has absolutely nothing to do with timing. A "regular" mutex can be acquired recursively...

    as I am new to Windows kernels

    ....but still you are trying to do something that happens to be, apparently, one of the most complex Windows projects in existence......

    In any case, it seems to me that you are taking the wrong approach from the very beginning. What I would do in your place is discard the ZFS POSIX layer altogether, and write a Windows-centric filesystem that just makes use of the DMU/SPA/MOS/ZIL layers behind the scenes, effectively storing data in ZFS format and presenting ZFS objects (i.e. filesystems/snapshots/clones) as volumes that a Windows filesystem may be mounted on. However, the way all these objects are accessed has to be a Windows-centric one....

    Anton Bassov

  • Jorgen_Lundman Member - All Emails Posts: 45

    @anton_bassov said:
    A fast mutex is "fast" because it relies upon a test-and-set operation, which means it cannot be acquired recursively by the same thread.
    This is why it elevates IRQL to APC_LEVEL, which, however, has absolutely nothing to do with timing. A "regular" mutex can be acquired recursively...

    "test-and-set" is how I would implement it, should I have to do so.

    Seeing as recursive call is not allowed on Unix, I have no need for that ability. You are suggesting here that fast_mutex can then be used, any hints as to how? If I don't have the hack, it dies fairly early on.

    .....which reveals more and more about the level of your current understanding of Windows kernel...
    ....but still you are trying to do something that happens to be, apparently, one of the most complex Windows projects in existence......

    Of course - why would you ever try to dissuade someone from wanting to learn? Have you no patience for those with less experience than you?
    Either way, I see it as a great way to get familiar with another platform - I am already familiar with SYSV, BSD, and MACH kernels, why not one more.

    In any case, it seems to me that you are taking the wrong approach from the very beginning. What I would do in your place is discard the ZFS POSIX layer altogether, and write a Windows-centric filesystem that just makes use of the DMU/SPA/MOS/ZIL layers behind the scenes, effectively storing data in ZFS format and presenting ZFS objects (i.e. filesystems/snapshots/clones) as volumes that a Windows filesystem may be mounted on. However, the way all these objects are accessed has to be a Windows-centric one....

    Yes, that is how you would do it, and perhaps any Windows dev would do it that way. Never going to happen though, and in 15 years, hasn't. Unfortunately, I know Unix inside-and-out but not Windows, so I ported it the way I know, the same way I ported it to OSX.

    Anyway, in the end I changed the code to use the driver's registry, and create a hostid if not present, it was quite an easy thing to do.
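
The generate-once-and-persist approach can be sketched roughly as follows. This is portable C with an in-memory struct standing in for the driver's registry value; in a real driver the load and save steps would go through the registry APIs against the driver's own key. The names `hostid_store` and `hostid_load_or_create`, and the convention that 0 means "unset", are assumptions made for illustration. The caller supplies a freshly generated random candidate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a persisted value (e.g. under the driver's key). */
struct hostid_store {
    bool     present;   /* does the value exist in the store? */
    uint32_t value;     /* stored hostid, valid only if present */
};

/*
 * Return the stored hostid if one exists; otherwise adopt the caller's
 * random candidate, persist it, and return it. Assumption: 0 is treated
 * as "no hostid", so a zero candidate is bumped to 1.
 */
static uint32_t hostid_load_or_create(struct hostid_store *store,
                                      uint32_t candidate)
{
    if (store->present && store->value != 0)
        return store->value;              /* stable across later calls */

    if (candidate == 0)
        candidate = 1;

    store->value = candidate;             /* "write back" for next boot */
    store->present = true;
    return candidate;
}
```

Once a value has been stored, later candidates are ignored, which is what keeps the hostid stable between reboots.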

  • Jorgen_Lundman Member - All Emails Posts: 45

    Actually, just to clarify, when you/Windows talk about "recursive mutex" we are talking about
    mutex_enter(A);
    ..mutex_enter(A);
    ....code
    ..mutex_exit(A);
    mutex_exit(A);

    Right? Not the standard nesting use:
    mutex_enter(A);
    ..mutex_enter(B);
    ....code
    ..mutex_exit(B);
    mutex_exit(A);

    If it is the case that fast_mutex is not able to do the latter example, then it is actually more like a spinlock, and not a mutex at all.
    If that was the case, I would totally understand your reaction, that would be a terrible, terrible use.

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 513

    The SYSTEM hive (mounted at \Registry\Machine\SYSTEM) is loaded by the boot loader, so it's available to boot start drivers. However, most other hives are loaded later by the normal storage stack.

    Typically, if you have something that's critical for a boot driver to start, you'd put it in the SYSTEM hive.

    At present, there's no notification API when other hives are loaded. A boot driver can poll, or you can orchestrate things so that needing to read the SOFTWARE hive is driven by some other event that guarantees the SOFTWARE hive is already loaded. For example, usermode doesn't start until hives are loaded, so you can assume that if you get an ioctl from usermode, you're ready to read other hives.

    Whereas I found that in Windows, it fiddles with the IRQL level even if I don't want it to, and is expected to be released asap, and by the very same thread.

    Yeah, Windows locks don't really expect to be held for "days", and are typically thread- or processor-affinitized. You can, of course, build your own lock with an InterlockedCompareExchange. One gotcha though: if you're running on some usermode process's context, you need to prevent the current thread from getting suspended while you hold a lock. (Imagine if usermode ioctls into your driver, you grab your lock, then someone suspends that process. The lock is still held, so nobody else can make forward progress, but the thread that holds it isn't going to run.) The solution to this is KeEnterCriticalRegion(), which postpones thread suspension. If you implement a "thread-neutral lock", you should think carefully about where to enter the critical region: whenever a thread needs to make forward progress to prevent the lock from getting starved, that thread needs to be in a critical region.

  • anton_bassov Member Posts: 4,984

    If it is the case that fast_mutex is not able to do the latter example, then it is actually more like a spinlock, and not a mutex at all.

    Well, to begin with, scenario B (i.e. the nested acquisition of 2 different locks) is perfectly fine with ANY construct, including spinlocks. The only condition is that the target OS has to take the original IRQL in the case of Windows (or the original state of the interrupt flag in the case of Linux) into consideration when releasing a lock. As long as the target OS does that, there is no problem here whatsoever.

    However, scenario A is a totally different story as far as a spinlock owner is concerned - if you try something like that you are bound to deadlock because of the test-and-set. This is why spinlock owners have to ensure that this unfortunate scenario cannot occur. In the Windows world they do so by elevating IRQL to DISPATCH_LEVEL before attempting test-and-set. The OSes that allow the use of spinlocks in ISRs have to disable interrupts before attempting test-and-set....
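
To make the difference between the two scenarios concrete, here is a small portable C11 sketch of a bare test-and-set lock. It is not Windows code; `tas_lock` and its functions are hypothetical names. Only a try-acquire is shown, so that the recursive case can be observed as a failed attempt instead of an endless spin.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A bare test-and-set lock: one flag, no owner tracking. */
typedef struct {
    atomic_flag held;
} tas_lock;

#define TAS_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the flag was clear and is now set by this caller. */
static bool tas_try_acquire(tas_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void tas_release(tas_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

With two locks A and B, acquiring A and then B succeeds (scenario B). Trying A again while it is still held fails, even from the same thread, because the flag records no owner; a blocking acquire would spin or wait forever at that point (scenario A).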

    Concerning the fast mutex, you can think of it as just a combination of a mutex and a spinlock.

    First ExAcquireFastMutex() tries test-and-set, and if it fails, it goes blocking on the dispatcher object (probably after having polled the target flag for some reasonable number of iterations). This is why it is "fast" - if the contention is low and/or the mutex is released quickly, neither an owner nor a contender needs to go to the system dispatcher upon acquisition/release, effectively reducing the acquisition effort to a simple test-and-set. In this sense, it is conceptually similar to the adaptive mutex on Illumos. The only difference is that it polls the state of the flag, rather than that of the owner thread. However, if you look at the whole thing from the owner's perspective, it is, for all practical purposes, just a spinlock that cannot be acquired recursively: a recursive acquisition attempt by the owner thread guarantees a deadlock because of the test-and-set. Therefore, in order to avoid this unfortunate scenario, ExAcquireFastMutex() elevates IRQL to APC_LEVEL before attempting test-and-set....

    If that was the case, I would totally understand your reaction, that would be a terrible, terrible use.

    As I told you already, what you do is equivalent, in Linux terms, to enabling interrupts while holding a spinlock. I really hope there is no need to explain to you the gravity of this mistake.....

    Yes, that is how you would do it, and perhaps any Windows dev, would do it that way.
    ...
    Unfortunately, I know Unix inside-and-out but not Windows, so I ported it the way I know, the same way I ported it to OSX.

    I hope you DO realise that ZFS is a world of its own, with no easy mapping of its operations to ANY major OS in existence, including even Solaris, which it was developed on. The ZFS POSIX layer was designed in order to make it usable on the host OS. What it does is just interface with ZFS DSL objects and present them as files and directories to the target OS. It makes certain assumptions about the target OS, particularly about the way it handles file system operations. Since ZFS was originally designed for Solaris, the more similar to Solaris the target OS is in this respect, the fewer modifications porting ZFS to it requires.

    OSX is more or less similar to FreeBSD by virtue of being a BSD derivative. Therefore, the ZFS POSIX layer naturally maps to OSX file system operations, and does not require significant modifications when it gets ported to OSX. However, the Windows NT kernel is a totally different world, and the Solaris-targeting ZFS POSIX layer is completely foreign to it. This is why you are more than likely to get into trouble if you try to bluntly and dumbly adjust it to Windows. Therefore, assuming that you want it to work fine under Windows, it would be better to interface ZFS objects to Windows in a way that naturally maps to the Windows filesystem operations, rather than trying to emulate ways that are totally unnatural and foreign to it.....

    Certainly, the above does not necessarily imply that your code is going to crash and burn straight away - as I can see, you have managed to get away with a deadly serious bug so far. However, it does not necessarily imply that your code is perfectly fine either, does it...

    Anton Bassov

  • Peter_Viscarola_(OSR) Administrator Posts: 7,213

    Either way, I see it as a great way to get familiar with another platform

    Hmmmmm.... kind of a big project for doing that. A bit like “I wanted to learn about boats, so I decided to build an aircraft carrier.”

    Also, getting familiar with another platform entails learning “the ways”... the underlying architecture... the overall design of that platform and the way its pieces interwork. The flavor, the approach... even the coding style. It doesn’t seem to me that taking a chainsaw and a hammer to force some pre-existing code to work “somehow” is the way to achieve the goal of learning.

    I would think you would WANT to come out the other side of your learning activity with a series of modules that reflect the best principles and practices of the host OS. But that’s not what you’re doing. So, with all due respect, that’s not really learning... it’s more like “bludgeoning.”

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Jorgen_Lundman Member - All Emails Posts: 45

    At present, there's no notification API when other hives are loaded. A boot driver can poll,

    I didn't really expect there to be, thought I would ask just in case. Thanks for letting me know.

    Yeah, Windows locks don't really expect to be held for "days", and are typically thread- or processor-affinitized. You can, of course, build your own lock with an InterlockedCompareExchange.

    They do seem to be a little more "specialised" than I was expecting, and perhaps leaning toward "critical section" usage, rather than the more generic "protecting variable access" as they are used in this code.

    I actually had "regular" (proper? full?) mutex code initially, but it turned out not to work due to "The kernel never permits a thread that owns a mutex to cause a transition to user mode without first releasing the mutex", which would not work here. The code totally expects to be able to do that.

    Skimming the documentation, perhaps InitializeSRWLock could be used in this situation, but it is starting to look like I might have to implement my own mutex code.

  • Jorgen_Lundman Member - All Emails Posts: 45

    Concerning the fast mutex, you can think of it as just a combination of a mutex and a spinlock.

    Thank you, that is helpful.

    enabling interrupts while holding a spinlock. I really hope there is no need to explain to you the gravity of this mistake.....

    Of course. From my POV though, I was calling the Windows-equivalent of "pthread_enter()" only to find it disables interrupts as well, a weird thing to do that breaks the specifications. Naturally, Windows isn't POSIX, nor does it have pthreads. That is my bad, and I'll own that. Of course I thought we had already settled that that code is wrong, and we had moved on to looking at alternatives.

  • anton_bassov Member Posts: 4,984

    From my POV though, I was calling the Windows-equivalent of "pthread_enter()" only to find it disables interrupts as well,

    Well, it does not disable interrupts. What it disables is, in UNIX terms, signal delivery to the target thread. The only difference is that we are speaking about kernel mode, rather than userland, here.

    Naturally, Windows isn't POSIX

    Here we go....

    Now go and read my previous post again.....

    Anton Bassov

  • Jorgen_Lundman Member - All Emails Posts: 45

    Here we go....
    Now go and read my previous post again.....

    And if you'd read what I said, you would have seen I was merely re-enacting my thoughts at the time, and pointing out that "of course" it will not work as I expect, as it is not the system I am used to. It's as if you guys just read keywords, and assume I'm on some crusade. I am not.

    Still, I checked out SRWLock - but it's userland. Poked at Pushlocks, but they don't have a *Try method. So in the end, I went with simple Events. I suspect that KeWaitForSingleObject() is not "free", so I attempt to avoid calling it by using CAS. It is not "sexy", but it works.

    https://github.com/openzfsonwindows/ZFSin/pull/138
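
As an illustration of the general technique described here (a CAS fast path with an event for the contended case), below is a portable sketch using C11 atomics, with a pthread mutex and condition variable standing in for an auto-reset kernel event. It is not the code from the pull request above; `cas_mutex`, `event_wait`, and `event_set` are hypothetical names, and the three-state scheme (0 free, 1 held, 2 held with possible waiters) is one common way to do it, assumed here for the example.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Auto-reset event (stand-in for a kernel synchronization event). */
struct event {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    bool            signaled;
};

static void event_wait(struct event *e)
{
    pthread_mutex_lock(&e->mtx);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->mtx);
    e->signaled = false;              /* auto-reset: one wake per signal */
    pthread_mutex_unlock(&e->mtx);
}

static void event_set(struct event *e)
{
    pthread_mutex_lock(&e->mtx);
    e->signaled = true;               /* stays set until a waiter consumes it */
    pthread_cond_signal(&e->cond);
    pthread_mutex_unlock(&e->mtx);
}

/* state: 0 = free, 1 = held, 2 = held and a waiter may be blocked */
struct cas_mutex {
    atomic_int   state;
    struct event ev;
};

#define CAS_MUTEX_INIT \
    { 0, { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false } }

static bool cas_mutex_tryenter(struct cas_mutex *m)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&m->state, &expected, 1);
}

static void cas_mutex_enter(struct cas_mutex *m)
{
    if (cas_mutex_tryenter(m))
        return;                       /* fast path: no blocking call made */
    /* Slow path: mark the lock contended; seeing 0 means we now own it. */
    while (atomic_exchange(&m->state, 2) != 0)
        event_wait(&m->ev);
}

static void cas_mutex_exit(struct cas_mutex *m)
{
    /* Touch the event only if some thread may be waiting. */
    if (atomic_exchange(&m->state, 0) == 2)
        event_set(&m->ev);
}
```

The uncontended enter and exit are each a single atomic operation, so the blocking primitive is only reached under contention. The sketch deliberately ignores ownership tracking and the thread-suspension concern raised earlier in the thread.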

    The fast_mutex is just one of the two hacks I wasn't happy with, the other one is in vdev_disk.c - I will hopefully find a better way around that one too.

    Anyway, that's all the beating I can take from you guys for now, thanks for the information shared.

  • Martin_Dráb Member - All Emails Posts: 35

    Seeing as recursive call is not allowed on Unix, I have no need for that ability. You are suggesting here that fast_mutex can then be used, any hints as to how? If I don't have the hack, it dies fairly early on.

    Well, executive resources may be the right primitives for you. They are reader-writer locks but can be used as sort of mutexes. And they do not raise IRQL to APC level.

    Why do you need to hold a lock for a long time (even across usermode trips of the holding thread)? A Windows driver is something like a dynamic link library – the kernel just calls one of its callbacks, the driver does its job and returns control back to the kernel. If you need to wait for something that happens in usermode, use appropriate synchronization primitives (events, semaphores...).

    Waiting for the registry hive to appear

    Well, you may be able to detect the SOFTWARE hive appearing by tracking changes to the HKLM\System\CurrentControlSet\Control\hivelist key. I think it should be present even at boot time. I don't remember whether a specific ZwXxx routine is documented for this purpose. Either way, you can use CmRegisterCallback to do the waiting in a passive manner. But personally, I would opt for storing all necessary information in the driver's registry key.

  • anton_bassov Member Posts: 4,984

    Well, executive resources may be the right primitives for you. They are reader-writer locks but can be used as sort of mutexes.
    And they do not raise IRQL to APC level.

    However, they still impose certain limitations and requirements (like, for example, disabling normal kernel APC delivery before calling
    ExAcquireResourceXxxLite()). Therefore, I believe the best option here is just to implement mutexes, condvars and rwlocks yourself on top of KEVENTs and InterlockedXxx() functions...

    Anton Bassov

  • Martin_Dráb Member - All Emails Posts: 35

    However, they still impose certain limitations and requirements (like, for example, disabling normal kernel APC delivery before calling
    ExAcquireResourceXxxLite()). Therefore, I believe the best option here is just to implement mutexes, condvars and rwlocks yourself on top of KEVENTs and InterlockedXxx() functions...

    I probably missed some paragraphs but I got the impression that the main problem is the APC_LEVEL IRQL (since it restricts the callable API set), not the fact that normal kernel APCs are disabled.

    Of course, KEVENT-based mutexes would work fine.

  • anton_bassov Member Posts: 4,984

    I probably missed some paragraphs but I got an intention that the main problem is the APC_LEVEL IRQL (since it restricts callable API set),

    Well, actually, the real problem is that the OP tries to use code that was written for Solaris under Windows, despite the HUGE differences between these systems. His requirement has absolutely nothing to do with the objective APC_LEVEL restrictions - it is based upon the assertion that, in his words, "Windows increases irql in fastmutex, this is not how we want to use mutex with unix" (the reasoning behind the second part is not so clear either).

    Therefore, in his situation it would be better to implement everything himself in order to ensure that his emulated API behaves as much Solaris-like as possible......

    Anton Bassov

  • Tim_Roberts Member - All Emails Posts: 12,969
    via Email
    Martin_Dráb wrote:
    >
    > Windows driver is something like a dynamic link library – the kernel just calls one of its callbacks, the driver does its job and returns the control back to the kernel.

    This, I think, is one of the fundamental architectural tenets of Windows
    drivers, and it is not intuitively obvious to user-mode programmers.  In
    user-mode programming, flow comes in at main(), and until we exit out
    the bottom of main(), there's always at least one thread that has its
    program counter in our code.  This is even true in a GUI app; although
    we may be waiting for events, we're waiting because we called GetMessage
    and DispatchMessage.

    With drivers, that's not the case.  The "steady state" for a driver is
    not to have any code active at all.  A driver consists of a set of
    callbacks.  The callbacks get fired to respond to events, either in the
    system or in our hardware.  The callback runs, does its work, and
    returns as quickly as possible.  A driver shouldn't wait; it should
    remember its state, and return until some other event triggers a change
    to the next state.  That's a very different way to think about programming.

    Tim Roberts
    Providenza & Boekelheide, Inc.
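
The "remember state and return" style described above can be illustrated with a tiny event-driven state machine. This is a generic portable C sketch, not driver code; the states, events, and the two-chunk threshold are made up for the example.

```c
/* Hypothetical states and events for a small transfer state machine. */
enum xfer_state { XFER_IDLE, XFER_IN_PROGRESS, XFER_COMPLETE };
enum xfer_event { EV_START, EV_DATA_READY, EV_RESET };

#define XFER_CHUNKS_NEEDED 2u   /* assumption: a transfer is two chunks */

/* All progress lives in the context, never on a waiting thread's stack. */
struct xfer_ctx {
    enum xfer_state state;
    unsigned        chunks_seen;
};

/* Callback: handle one event, update the saved state, return at once. */
static void xfer_on_event(struct xfer_ctx *ctx, enum xfer_event ev)
{
    switch (ctx->state) {
    case XFER_IDLE:
        if (ev == EV_START) {
            ctx->chunks_seen = 0;
            ctx->state = XFER_IN_PROGRESS;
        }
        break;
    case XFER_IN_PROGRESS:
        if (ev == EV_DATA_READY) {
            ctx->chunks_seen++;
            if (ctx->chunks_seen == XFER_CHUNKS_NEEDED)
                ctx->state = XFER_COMPLETE;
        } else if (ev == EV_RESET) {
            ctx->state = XFER_IDLE;
        }
        break;
    case XFER_COMPLETE:
        if (ev == EV_RESET)
            ctx->state = XFER_IDLE;
        break;
    }
}
```

Between events no code is running on behalf of the transfer; the next callback picks up from whatever `ctx->state` says.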
