Windows System Software -- Consulting, Training, Development -- Unique Expertise, Guaranteed Results

Before Posting...
Please check out the Community Guidelines in the Announcements and Administration Category.

ExInterlockedXxx functionality

Jamey_KirbyJamey_Kirby Member - All Emails Posts: 436
edited November 2018 in NTDEV

I've been reviewing the WRK, and ReactOS to see how interlocked SLIST works. I'm trying to tidy up my XLIST. I see that on entry to these functions, in 32 bit mode, the sequence PUSHFD, _disable(), grab spinlock, do work, POPFD. The spinlock is grabbed in the same way as if you had called KeAcquireSpinlockAtDpcLevel(); IRQL is not raised, &c. So, I threw together this bit of code:

#define EFLAGS_IF_MASK 0x00000200
#define EFLAGS_IF_SHIFT 9

PXLIST_ENTRY InterlockedInsertTailXList(PXLIST_HEADER List,
    PXLIST_ENTRY Entry, PKSPIN_LOCK Lock) {
#if (1)
    unsigned int Flags = GetCallersEflags();
    _disable();
    KeAcquireSpinLockAtDpcLevel(Lock);
#else
    KLOCK_QUEUE_HANDLE LockHandle;
    KeAcquireInStackQueuedSpinLock(Lock, &LockHandle);
#endif
    InsertTailXList(List, Entry);
#if (1)
    KeReleaseSpinLockFromDpcLevel(Lock);
    if (Flags & EFLAGS_IF_MASK) {
        _enable();
    }
#else
    KeReleaseInStackQueuedSpinLock(&LockHandle);
#endif
    return Entry;
}

I know that there are amd64 interlocked instructions for just about everything (xor, &c.), and I'll get around to that at some point if they are practical. I am just tinkering at the moment. Although pushfq and popfq are valid in 64 bit, Visual Studio coughs at __asm pushfq and __asm popfq. In 32 bit mode, all is OK using __asm pushfd and __asm popfd. I threw the above code together.

How bad is this?

Post edited by Peter_Viscarola_(OSR) on

Comments

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 436

    Sorry, I need to understand how better to post code in this new web interface.

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,380

    I've been reviewing the WRK, and ReactOS

    /rolls eyes

    I don't like seeing the WRK license terms violated, and I find ReactOS... perplexing. Let's leave it at that.

    Visual Studio coughs at __asm pushfq and __asm popfq

    Well, you can't inline assembler in the 64-bit C compiler. It's not allowed.

    Is that your issue?

    How bad is this?

    I'm not sure what question you're asking, really.

    I need to understand how better to post code in this new web interface.

    I fixed it for you. You have two choices for formatting:

    • Write the Markdown yourself
    • Paste what you want and then select it, and apply the appropriate format from the ribbon.

    Either works pretty well -- but there can be odd, unanticipated, side-effect at times.

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassovanton_bassov Member Posts: 5,038

    Although pushfq and popfq are valid in 64 bit, Visual Studio coughs at __asm pushfq and __asm popfq.

    IIRC, if you compile to x86_64, you cannot use inline asm in .c files with either MSFT or Intel C compilers (although you can still use it with GCC). However, with MSFT and Intel compilers you have to put all your asm code in a separate .s file if you want to build a 64-bit executable. Therefore, what your compiler "coughs at" is the very __asm statement in a .c file......

    Anton Bassov

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 436
    via Email
    I guess my question is this: Is my code acceptable? It is processor
    specific, but so is SLIST. I want to emulate how SLIST does it's
    interlocking, and it is not just simply acquiring and releasing the
    spinlock using documented APIs.

    As for WRK, I stumbled across it on GitHub, and use it as a resource. It's
    been there for quite a long time. Should I avoid it? My guess is that if MS
    had a real problem with it on GitHub, then they would have removed it. They
    own GitHub after all.




    On Tue, Nov 20, 2018 at 1:54 PM anton_bassov
    wrote:

    > OSR https://community.osr.com/
    > anton_bassov commented on ExInterlockedXxx functionality
    >
    > > Although pushfq and popfq are valid in 64 bit, Visual Studio coughs at
    > __asm pushfq and __asm popfq.
    >
    > IIRC, if you compile to x86_64, you cannot use inline asm in .c files
    > with either MSFT or Intel C compilers (although you can still use it with
    > GCC). However, with MSFT and Intel compilers you have to put all your asm
    > code in a separate .s file if you want to build a 64-bit executable.
    > Therefore, what your compiler "coughs at" is the very __asm statement in a
    > .c file......
    >
    > Anton Bassov
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291515#Comment_291515
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291515#Comment_291515
    >
  • anton_bassovanton_bassov Member Posts: 5,038

    I guess my question is this: Is my code acceptable? It is processor specific, but so is SLIST.

    Well, everything is acceptable as long as it is not buggy, but, to be honest, the whole thing seems (at least to me - probably, I just don't understand something) unnecessarily complex and convoluted. For example what about the following schematic code (I assume the list that works on FIFO basis)

    struct slist_entry { struct slist_entry * next;};

    struct slist {spinlock_t lock; struct slist_entry * first; struct slist_entry * last;};

    void slist_init(struct slist *list)
    {
    list->first=list->last=NULL;
    spinlock_init(&list->lock);
    }

    void slist_add (struct slist *list, struct slist_entry * new_entry)
    {
    new_entry->next=NULL;

    spin_lock(&list->lock);

    if(!list->first) list->first=list->last=new_entry;

    else {list->last->next=new_entry; list->last=new_entry;}

    spin_unlock(&list->lock);

    }

    void slist_remove (struct slist *list, struct slist_entry ** ptr)
    {

    spin_lock(&list->lock);

    *ptr=list->first;
    list->first=list->first->next;
    if(!list->first) list->last=NULL;

    spin_unlock(&list->lock);

    }

    This one seems to be straightforward and arch-agnostic - the only arch-specific part is a spinlock,which is wrapped in arch-independent constructs. In other words( unless I am just missing something, of course), you seem to be adding some unnecessary complications....

    Anton Bassov

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 436
    via Email
    As you can see in the code, I was using an InStackQueueSpinlock, and that
    code works in my code. I want to make XLIST a drop-in replacement for
    SLIST, and LIST_ENTRY where a stack, queue, or dequeue are required. Every
    dequeue and many queue implementations I've seen uses a double linked list.
    The double linked list adds the niceness of removing elements from within
    the list, but those features are still outside the definition of these data
    structures.

    - A queue needs to add to the tail, and remove from the head.
    - A dequeue needs to add to the head, and remove from the tail; and visa
    versa.

    So, if this is the scope of the need, it can be implemented using one less
    pointer with an xor chain. You can also emulate a stack by adding to the
    front, and removing from the front, and visa versa keeping the node size
    the same as if you were using SLIST. So, I think implementing an XLIST is
    useful. I'd like it to be as flexible as possible, and something that can
    be used anywhere you can use the interlocked SLIST and interlocked
    LIST_ENTRY functions to implement stack, queue, and dequeue.

    Like I said, I am just tinkering, and having some fun; and thought you
    folks might be interested in my poking around.


    On Tue, Nov 20, 2018 at 3:50 PM anton_bassov
    wrote:

    > OSR https://community.osr.com/
    > anton_bassov commented on ExInterlockedXxx functionality
    >
    > > I guess my question is this: Is my code acceptable? It is processor
    > specific, but so is SLIST.
    >
    > Well, everything is acceptable as long as it is not buggy, but, to be
    > honest, the whole thing seems (at least to me - probably, I just don't
    > understand something) unnecessarily complex and convoluted. For example
    > what about the following schematic code (I assume the list that works on
    > FIFO basis)
    >
    > struct slist_entry { struct slist_entry * next;};
    >
    > struct slist {spinlock_t lock; struct slist_entry * first; struct
    > slist_entry * last;};
    >
    > void slist_init(struct slist *list)
    >
    > {
    >
    > list->first=list->last=NULL;
    >
    > spinlock_init(&list->lock);
    >
    > }
    >
    > void slist_add (struct slist *list, struct slist_entry * new_entry)
    >
    > {
    >
    > new_entry->next=NULL;
    >
    > spin_lock(&list->lock);
    >
    > if(!list->first) list->first=list->last=new_entry;
    >
    > else {list->last->next=new_entry; list->last=new_entry;}
    >
    > spin_unlock(&list->lock);
    >
    > }
    >
    > void slist_remove (struct slist *list, struct slist_entry ** ptr)
    >
    > {
    >
    > spin_lock(&list->lock);
    >
    > *ptr=list->first;
    >
    > list->first=list->first->next;
    >
    > if(!list->first) list->last=NULL;
    >
    > spin_unlock(&list->lock);
    >
    > }
    >
    > This one seems to be straightforward and arch-agnostic - the only
    > arch-specific part is a spinlock,which is wrapped in arch-independent
    > constructs. In other words( unless I am just missing something, of course),
    > you seem to be adding some unnecessary complications....
    >
    > Anton Bassov
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291526#Comment_291526
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291526#Comment_291526
    >
  • anton_bassovanton_bassov Member Posts: 5,038

    Every dequeue and many queue implementations I've seen uses a double linked list.
    ...

    • A queue needs to add to the tail, and remove from the head.
    • A dequeue needs to add to the head, and remove from the tail; and visa versa.

    Well, but then you can you the XOR-linked list that you had presented to us few weeks ago - when it comes to adding to/removing from the ends it seems to be really efficient, so that it seems to be a perfect queue/dequeue that does not require the target structures that you are linking to even know about getting linked. Furthermore, once you don't need to store a link pointer in the structure itself you can have exactly the same instance of a structure on multiple lists...

    In other words,XOR-linked list seems to be superior to a simple SLIST in absolutely all respects.....

    Anton Bassov

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,380
    edited November 2018

    I guess my question is this: Is my code acceptable? It is processor specific, but so is SLIST.

    Well, if your stated goal is to emulate what Interlocked SLIST does, then it must be acceptable, assuming it’s doing what SLIST does. Doing anything else wouldn’t meet that goal.

    If your question is a collegial “How do we all feel, in general, about a architecture-specific implementations for the sake of optimization?” I would answer that they are, IMHO, rarely worthwhile ... this being 2018 and not 1988 (when interlocked SLIST was invented, at least notionally). The implementation of Interlocked SLIST (and it’s brother) are notable in that they can be used to share data at any IRQL, and thus can be used for sharing data between your ISR and DPC. I really don’t see any other use for them.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassovanton_bassov Member Posts: 5,038

    it seems to be a perfect queue/dequeue that does not require the target structures that you are linking to even know about
    getting linked. Furthermore, once you don't need to store a link pointer in the structure itself you can have exactly the same
    instance of a structure on multiple lists...

    Ooops, guys, I have to retract the above statement - it looks like I spouted nonsense yet another time, and, on this particular occasion, did it on a truly grand-scale. If it could only work the above way......well, then it would be just a perpetual-engine-grade compressor algorithm that allowed to compress infinite amounts of arbitrary data to a couple of pointers. Certainly you still need to embed a link in a target structure....

    Anton Bassov

Sign In or Register to comment.

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Upcoming OSR Seminars
Developing Minifilters 29 July 2019 OSR Seminar Space
Writing WDF Drivers 23 Sept 2019 OSR Seminar Space
Kernel Debugging 21 Oct 2019 OSR Seminar Space
Internals & Software Drivers 18 Nov 2019 Dulles, VA