Windows System Software -- Consulting, Training, Development -- Unique Expertise, Guaranteed Results

Before Posting...
Please check out the Community Guidelines in the Announcements and Administration Category.

Windows kernel XOR chain list

Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429

I've implemented an xor chain double linked list. Using this code, you can maintain a double linked list using only 8 bytes of node overhead (x64). Using LIST_ENTRY requires 16 bytes of node overhead. Please feel free to use, comment, or modify. This is my first pass. I've tested it and it seems to be solid.

https://github.com/jameykirby/xorlist

«1

Comments

  • Tim_RobertsTim_Roberts Member - All Emails Posts: 12,982
    via Email
    Jamey_Kirby wrote:
    > I've implemented an xor chain double linked list. Using this code, you can maintain a double linked list using only 8 bytes of node overhead (x64). Using LIST_ENTRY requires 16 bytes of node overhead. Please feel free to use, comment, or modify. This is my first pass. I've tested it and it seems to be solid.
    >
    > https://github.com/jameykirby/xorlist

    I think it is worthwhile to point out what you're giving up in exchange
    for those 8 bytes.  This is not a list in the STL sense, this is a
    deque.  Adding and removing at the ends is easy, but adding and removing
    in the middle is expensive, because it requires traversing the list. 
    Further, given a list entry, it is not possible to find the next or the
    previous entry; you have to start from one of the ends until you find
    the entry.

    That's fine, of course; there are many situations where a deque is the
    right choice.

    The functions you have provided make it a queue; by adding InsertHead
    and RemoveTail (which are equally easy), you'd have a deque.  You also
    might add some commented code showing how to traverse the list -- your
    basic "for each" loop.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • Phil_BarilaPhil_Barila Member - All Emails Posts: 148

    @Jamey_Kirby said:
    I've implemented an xor chain double linked list. Using this code, you can maintain a double linked list using only 8 bytes of node overhead (x64). Using LIST_ENTRY requires 16 bytes of node overhead. Please feel free to use, comment, or modify. This is my first pass. I've tested it and it seems to be solid.

    https://github.com/jameykirby/xorlist

    It's such a small implementation (some would call it trivial, but that refers to size, rather than value or complexity), it seems like an excellent candidate for a header-only implementation.

  • Peter_Viscarola_(OSR)Peter_Viscarola_(OSR) Administrator Posts: 7,223

    First... Thank you, Jamey, for once again openly sharing your work with the Windows driver development community. That's darn nice of you.

    Now... Being basically ignorant, I'm not going to claim I understand how this thing works.

    But I did notice that there's an ExInterlockedRemoveHeadXorList implementation... and I wanted to note that this is not similar to the standard Windows ExInterlockedRemoveHeadList function in that it can only be called at IRQL <= DISPATCH_LEVEL. I think using the same naming convention, but a very different set of functionalities and constraints is unwise.

    I always wanted to point out that this implementation unfortunately lacks SAL annotations. Which is unfortunate, as these types of projects are cool ways to show people what SAL is about and how it should be used.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    I remember reading about xor chaining in a Dr. Dobb's article in the late
    80's or early 90's, and have always wanted to find a use for one.

    Tim: You are correct, I only implemented a deque (add to tail, remove from
    head), as that was my immediate requirement. Yes, I do want to implements a
    fuller set of functions to traverse and such, but, as you mentioned, those
    are expensive operations, and probably not very practical for large lists.
    Adding RemoveTail, and InsertHead are trivial.

    Peter: Yes, I considered using different naming, or do something more like
    the kernel does (disable interrupts, etc.). I am not adverse to others
    contributing to the code either ;)

    Phil: I did put two functions in the header. Good suggestion though. The
    interlocked versions are probably best left as functions.

    Thanks for the feedback.


    On Thu, Oct 18, 2018 at 4:57 PM Peter_Viscarola_(OSR)
    wrote:

    > OSR https://community.osr.com/
    > Peter_Viscarola_(OSR) commented on Windows kernel XOR chain list
    >
    > First... Thank you, Jamey, for once again openly sharing your work with
    > the Windows driver development community. That's darn nice of you.
    >
    > Now... Being basically ignorant, I'm not going to claim I understand how
    > this thing works.
    >
    > But I did notice that there's an ExInterlockedRemoveHeadXorList
    > implementation... and I wanted to note that this is not similar to the
    > standard Windows ExInterlockedRemoveHeadList function in that it can only
    > be called at IRQL <= DISPATCH_LEVEL. I think using the same naming
    > convention, but a very different set of functionalities and constraints is
    > unwise.
    >
    > I always wanted to point out that this implementation unfortunately lacks
    > SAL annotations. Which is unfortunate, as these types of projects are cool
    > ways to show people what SAL is about and how it should be used.
    >
    > Peter
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291010#Comment_291010
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291010#Comment_291010
    >
  • MBondMBond Member - All Emails Posts: 846
    via Email
    Yes names like xxxInterlockedxxx should actually be interlocked – as opposed to software lock based 😉



    Also, on Amazon I can buy RAM at the rate of about $10 per GB. This means that the ‘value’ of using code like this that saves 8 bytes per entry, is only significant when the size of the list is absolutely enormous – say 100 million items or greater. But this is absolutely exactly the kind of list that should not be stored using this kind of structure so it is kind of a catch 22.



    Of much more interest is the relative speed of the operations



    In the insert path, you have one branch and either two assignments or three + 3 xors. A standard implementation can have zero branches and 4 assignments (two of which can be done via interlockedCompareExchange).



    In the remove path, you have two branches, two assignments and one or 3 xors. A standard implementation can have zero branches and 3 assignments (two of which can be done via interlockedCompareExchange).



    The branches are going to kill your performance, but I think if you work at it you can eliminate them. Usually a good way to eliminate the branches is to make the list circularly linked to an ‘anchor’ or ‘root’ node so all of the operations always have both a previous and next node – but that probably does not work well here



    Sent from Mail for Windows 10



    ________________________________
    From: Jamey_Kirby
    Sent: Thursday, October 18, 2018 5:16:57 PM
    To: MBond
    Subject: Re: [NTDEV] Windows kernel XOR chain list

    OSR http://osr.vanillacommunities.com/
    Jamey_Kirby commented on Windows kernel XOR chain list

    I remember reading about xor chaining in a Dr. Dobb's article in the late
    80's or early 90's, and have always wanted to find a use for one.

    Tim: You are correct, I only implemented a deque (add to tail, remove from
    head), as that was my immediate requirement. Yes, I do want to implements a
    fuller set of functions to traverse and such, but, as you mentioned, those
    are expensive operations, and probably not very practical for large lists.
    Adding RemoveTail, and InsertHead are trivial.

    Peter: Yes, I considered using different naming, or do something more like
    the kernel does (disable interrupts, etc.). I am not adverse to others
    contributing to the code either

    Phil: I did put two functions in the header. Good suggestion though. The
    interlocked versions are probably best left as functions.

    Thanks for the feedback.


    On Thu, Oct 18, 2018 at 4:57 PM Peter_Viscarola_(OSR)
    wrote:

    > OSR https://community.osr.com/
    > Peter_Viscarola_(OSR) commented on Windows kernel XOR chain list
    >
    > First... Thank you, Jamey, for once again openly sharing your work with
    > the Windows driver development community. That's darn nice of you.
    >
    > Now... Being basically ignorant, I'm not going to claim I understand how
    > this thing works.
    >
    > But I did notice that there's an ExInterlockedRemoveHeadXorList
    > implementation... and I wanted to note that this is not similar to the
    > standard Windows ExInterlockedRemoveHeadList function in that it can only
    > be called at IRQL <= DISPATCH_LEVEL. I think using the same naming
    > convention, but a very different set of functionalities and constraints is
    > unwise.
    >
    > I always wanted to point out that this implementation unfortunately lacks
    > SAL annotations. Which is unfortunate, as these types of projects are cool
    > ways to show people what SAL is about and how it should be used.
    >
    > Peter
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291010#Comment_291010
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291010#Comment_291010
    >
  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    Thanks for the input. Feel free to make changes yourself :)


    On Thu, Oct 18, 2018 at 5:54 PM MBond
    wrote:

    > OSR http://osr.vanillacommunities.com/
    > MBond commented on Windows kernel XOR chain list
    >
    > Yes names like xxxInterlockedxxx should actually be interlocked – as
    > opposed to software lock based 😉
    >
    >
    >
    > Also, on Amazon I can buy RAM at the rate of about $10 per GB. This means
    > that the ‘value’ of using code like this that saves 8 bytes per entry, is
    > only significant when the size of the list is absolutely enormous – say 100
    > million items or greater. But this is absolutely exactly the kind of list
    > that should not be stored using this kind of structure so it is kind of a
    > catch 22.
    >
    >
    >
    > Of much more interest is the relative speed of the operations
    >
    >
    >
    > In the insert path, you have one branch and either two assignments or
    > three + 3 xors. A standard implementation can have zero branches and 4
    > assignments (two of which can be done via interlockedCompareExchange).
    >
    >
    >
    > In the remove path, you have two branches, two assignments and one or 3
    > xors. A standard implementation can have zero branches and 3 assignments
    > (two of which can be done via interlockedCompareExchange).
    >
    >
    >
    > The branches are going to kill your performance, but I think if you work
    > at it you can eliminate them. Usually a good way to eliminate the branches
    > is to make the list circularly linked to an ‘anchor’ or ‘root’ node so all
    > of the operations always have both a previous and next node – but that
    > probably does not work well here
    >
    >
    >
    > Sent from Mail for Windows 10
    >
    >
    >
    > ________________________________
    > From: Jamey_Kirby
    > Sent: Thursday, October 18, 2018 5:16:57 PM
    > To: MBond
    > Subject: Re: [NTDEV] Windows kernel XOR chain list
    >
    > OSR http://osr.vanillacommunities.com/
    > Jamey_Kirby commented on Windows kernel XOR chain list
    >
    > I remember reading about xor chaining in a Dr. Dobb's article in the late
    > 80's or early 90's, and have always wanted to find a use for one.
    >
    > Tim: You are correct, I only implemented a deque (add to tail, remove from
    > head), as that was my immediate requirement. Yes, I do want to implements a
    > fuller set of functions to traverse and such, but, as you mentioned, those
    > are expensive operations, and probably not very practical for large lists.
    > Adding RemoveTail, and InsertHead are trivial.
    >
    > Peter: Yes, I considered using different naming, or do something more like
    > the kernel does (disable interrupts, etc.). I am not adverse to others
    > contributing to the code either
    >
    > Phil: I did put two functions in the header. Good suggestion though. The
    > interlocked versions are probably best left as functions.
    >
    > Thanks for the feedback.
    >
    >
    > On Thu, Oct 18, 2018 at 4:57 PM Peter_Viscarola_(OSR)
    > wrote:
    >
    > > OSR https://community.osr.com/
    > > Peter_Viscarola_(OSR) commented on Windows kernel XOR chain list
    > >
    > > First... Thank you, Jamey, for once again openly sharing your work with
    > > the Windows driver development community. That's darn nice of you.
    > >
    > > Now... Being basically ignorant, I'm not going to claim I understand how
    > > this thing works.
    > >
    > > But I did notice that there's an ExInterlockedRemoveHeadXorList
    > > implementation... and I wanted to note that this is not similar to the
    > > standard Windows ExInterlockedRemoveHeadList function in that it can only
    > > be called at IRQL <= DISPATCH_LEVEL. I think using the same naming
    > > convention, but a very different set of functionalities and constraints
    > is
    > > unwise.
    > >
    > > I always wanted to point out that this implementation unfortunately
    > lacks
    > > SAL annotations. Which is unfortunate, as these types of projects are
    > cool
    > > ways to show people what SAL is about and how it should be used.
    > >
    > > Peter
    > >
    > > --
    > > Reply to this email directly or follow the link below to check it out:
    > > https://community.osr.com/discussion/comment/291010#Comment_291010
    > >
    > > Check it out:
    > > https://community.osr.com/discussion/comment/291010#Comment_291010
    > >
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > http://osr.vanillacommunities.com/discussion/comment/291012#Comment_291012
    >
    > Check it out:
    > http://osr.vanillacommunities.com/discussion/comment/291012#Comment_291012
    >
  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    I've updated it with a few of the simple suggestions:

    1) Added example in the header (commented out) to show how to enumerate the
    list.

    // Example to enumerate the list from head to tail.
    //
    // EnumerateXorList(PXOR_LIST List) {
    // PXOR_LIST_ENTRY Current = List->Head;
    // PXOR_LIST_ENTRY Previous = NULL;
    // PXOR_LIST_ENTRY Next = NULL;
    // while (Current != NULL) {
    // KdPrintEx((DPFLTR_IHVDRIVER_ID, DPFLTR_INFO_LEVEL,
    // "Item pointer: %p\n", Current));
    // Next = _xor_(Previous, Current->Pointer);
    // Previous = Current;
    // Current = Next;
    // }
    // }

    2) Added deque functions InsertHead...() and RemoveTail...().

    Anyway, I hope someone finds it useful. I'd appreciate any enhancements or
    updates. Let me know, and I can add you as a contributor. Over and out.


    On Thu, Oct 18, 2018 at 6:10 PM Jamey Kirby wrote:

    > Thanks for the input. Feel free to make changes yourself :)
    >
    >
    > On Thu, Oct 18, 2018 at 5:54 PM MBond
    > wrote:
    >
    >> OSR http://osr.vanillacommunities.com/
    >> MBond commented on Windows kernel XOR chain list
    >>
    >> Yes names like xxxInterlockedxxx should actually be interlocked – as
    >> opposed to software lock based 😉
    >>
    >>
    >>
    >> Also, on Amazon I can buy RAM at the rate of about $10 per GB. This
    >> means that the ‘value’ of using code like this that saves 8 bytes per
    >> entry, is only significant when the size of the list is absolutely enormous
    >> – say 100 million items or greater. But this is absolutely exactly the
    >> kind of list that should not be stored using this kind of structure so it
    >> is kind of a catch 22.
    >>
    >>
    >>
    >> Of much more interest is the relative speed of the operations
    >>
    >>
    >>
    >> In the insert path, you have one branch and either two assignments or
    >> three + 3 xors. A standard implementation can have zero branches and 4
    >> assignments (two of which can be done via interlockedCompareExchange).
    >>
    >>
    >>
    >> In the remove path, you have two branches, two assignments and one or 3
    >> xors. A standard implementation can have zero branches and 3 assignments
    >> (two of which can be done via interlockedCompareExchange).
    >>
    >>
    >>
    >> The branches are going to kill your performance, but I think if you work
    >> at it you can eliminate them. Usually a good way to eliminate the branches
    >> is to make the list circularly linked to an ‘anchor’ or ‘root’ node so all
    >> of the operations always have both a previous and next node – but that
    >> probably does not work well here
    >>
    >>
    >>
    >> Sent from Mail for Windows 10
    >>
    >>
    >>
    >> ________________________________
    >> From: Jamey_Kirby
    >> Sent: Thursday, October 18, 2018 5:16:57 PM
    >> To: MBond
    >> Subject: Re: [NTDEV] Windows kernel XOR chain list
    >>
    >> OSR http://osr.vanillacommunities.com/
    >> Jamey_Kirby commented on Windows kernel XOR chain list
    >>
    >> I remember reading about xor chaining in a Dr. Dobb's article in the late
    >> 80's or early 90's, and have always wanted to find a use for one.
    >>
    >> Tim: You are correct, I only implemented a deque (add to tail, remove from
    >> head), as that was my immediate requirement. Yes, I do want to implements
    >> a
    >> fuller set of functions to traverse and such, but, as you mentioned, those
    >> are expensive operations, and probably not very practical for large lists.
    >> Adding RemoveTail, and InsertHead are trivial.
    >>
    >> Peter: Yes, I considered using different naming, or do something more like
    >> the kernel does (disable interrupts, etc.). I am not adverse to others
    >> contributing to the code either
    >>
    >> Phil: I did put two functions in the header. Good suggestion though. The
    >> interlocked versions are probably best left as functions.
    >>
    >> Thanks for the feedback.
    >>
    >>
    >> On Thu, Oct 18, 2018 at 4:57 PM Peter_Viscarola_(OSR)
    >> wrote:
    >>
    >> > OSR https://community.osr.com/
    >> > Peter_Viscarola_(OSR) commented on Windows kernel XOR chain list
    >> >
    >> > First... Thank you, Jamey, for once again openly sharing your work with
    >> > the Windows driver development community. That's darn nice of you.
    >> >
    >> > Now... Being basically ignorant, I'm not going to claim I understand
    >> how
    >> > this thing works.
    >> >
    >> > But I did notice that there's an ExInterlockedRemoveHeadXorList
    >> > implementation... and I wanted to note that this is not similar to the
    >> > standard Windows ExInterlockedRemoveHeadList function in that it can
    >> only
    >> > be called at IRQL <= DISPATCH_LEVEL. I think using the same naming
    >> > convention, but a very different set of functionalities and constraints
    >> is
    >> > unwise.
    >> >
    >> > I always wanted to point out that this implementation unfortunately
    >> lacks
    >> > SAL annotations. Which is unfortunate, as these types of projects are
    >> cool
    >> > ways to show people what SAL is about and how it should be used.
    >> >
    >> > Peter
    >> >
    >> > --
    >> > Reply to this email directly or follow the link below to check it out:
    >> > https://community.osr.com/discussion/comment/291010#Comment_291010
    >> >
    >> > Check it out:
    >> > https://community.osr.com/discussion/comment/291010#Comment_291010
    >> >
    >>
    >> --
    >> Reply to this email directly or follow the link below to check it out:
    >> http://osr.vanillacommunities.com/discussion/comment/291012#Comment_291012
    >>
    >> Check it out:
    >> http://osr.vanillacommunities.com/discussion/comment/291012#Comment_291012
    >>
    >
    >
    > --
    > Jamey Kirby
    > Disrupting the establishment since 1964
    >
    > *This is a personal email account and as such, emails are not subject to
    > archiving. Nothing else really matters.*
    >
  • anton_bassovanton_bassov Member Posts: 4,991

    I've implemented an xor chain double linked list.

    Well, the only thing that I can see in so far is a SINGLE-linked list,rather than a DOUBLE-linked one. Just in case if you don't believe me, try removing an entry from the middle of the list, and you will see that it cannot be done without traversing the list from its first entry...

    The only advantage to a "conventional" single-linked list that it offers is saving you the necessity to embed a pointer to the next list element in every the structure that you are linking this way, effectively saving you 8 bytes of RAM (although in some cases it may save you the entire 64 bytes - for example, consider the scenario when your target structure size is a multiple of the one of a cache line, and,hence, adding an extra pointer to it would imply increasing the structure size by 64 bytes, rather than just by 8, if you want to maintain cache-line alignment).
    Not really a big deal for x86_64-based system in the year 2018, don't you think.....

    Taking into consideration that using a single-linked list in the kernel is, probably, not the most convenient/reasonable approach in 99% of cases anyway, the whole thing seems to be of very limited practical usefulness in the kernel.

    I hate sounding like a killjoy who attacks an "unconventional" approach, but that's the way it is....

    Anton Bassov

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    It is a double linked list, not a single. The pointer in the node is the
    previous node xored with the next node. So, if you know one of the values,
    you can extract the other. You have to walk the list to insert and remove
    elements from inside the list, however, doing head and tail operations are
    efficient.

    And you are correct about cache alignment. It is one of the reasons I
    implemented the code. After dropping 8 bytes per node, my data structure
    dropped to 32 bytes. This plays nice with the cache, and with the memory
    allocator.

    The code can surely be optimized as Marion pointed out, but this is a solid
    reference implementation. I've pushed millions of nodes through the code
    with no issues. Performance is surely acceptable.




    On Thu, Oct 18, 2018 at 11:26 PM anton_bassov
    wrote:

    > OSR https://community.osr.com/
    > anton_bassov commented on Windows kernel XOR chain list
    >
    > > I've implemented an xor chain double linked list.
    >
    > Well, the only thing that I can see in so far is a SINGLE-linked
    > list,rather than a DOUBLE-linked one. Just in case if you don't believe me,
    > try removing an entry from the middle of the list, and you will see that it
    > cannot be done without traversing the list from its first entry...
    >
    > The only advantage to a "conventional" single-linked list that it offers
    > is saving you the necessity to embed a pointer to the next list element in
    > every the structure that you are linking this way, effectively saving you 8
    > bytes of RAM (although in some cases it may save you the entire 64 bytes -
    > for example, consider the scenario when your target structure size is a
    > multiple of the one of a cache line, and,hence, adding an extra pointer to
    > it would imply increasing the structure size by 64 bytes, rather than just
    > by 8, if you want to maintain cache-line alignment).
    >
    > Not really a big deal for x86_64-based system in the year 2018, don't you
    > think.....
    >
    > Taking into consideration that using a single-linked list in the kernel
    > is, probably, not the most convenient/reasonable approach in 99% of cases
    > anyway, the whole thing seems to be of very limited practical usefulness in
    > the kernel.
    >
    > I hate sounding like a killjoy who attacks an "unconventional" approach,
    > but that's the way it is....
    >
    > Anton Bassov
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291016#Comment_291016
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291016#Comment_291016
    >
  • Tim_RobertsTim_Roberts Member - All Emails Posts: 12,982
    via Email
    On Oct 18, 2018, at 8:26 PM, anton_bassov wrote:
    >
    >> I've implemented an xor chain double linked list.
    >
    > Well, the only thing that I can see in so far is a SINGLE-linked list,rather than a DOUBLE-linked one. Just in case if you don't believe me, try removing an entry from the middle of the list, and you will see that it cannot be done without traversing the list from its first entry...

    It is doubly-linked, not singly-linked. However as I mentioned earlier, it's a deque. You can run the list from either end, but you must start from the ends. You can't start from the middle.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • anton_bassovanton_bassov Member Posts: 4,991

    It is a double linked list, not a single. The pointer in the node is the
    previous node xored with the next node. So, if you know one of the values,
    you can extract the other. You have to walk the list to insert and remove
    elements from inside the list, however, doing head and tail operations are efficient.

    Just open any introduction-to-computer-science book,and you are going to see that what you say above provides just a classical description of a SINGLE -linked list that does not allow addition to/removals from the middle entries without traversing the list. Although it,indeed, allows additions/removals from either end, as well as traversals in either direction, it is still does not really qualify for a double-linked list. If it was a truly DOUBLE-linked list, adding and removing entries to/from the middle of the list would be a painless exercise. This is the exactly the feature that makes a double-linked list are so popular, compared to a single-linked one....

    And you are correct about cache alignment. It is one of the reasons I implemented the code. After dropping 8 bytes per node,
    my data structure dropped to 32 bytes. This plays nice with the cache, and with the memory allocator.

    Fair enough - as long as you are OK with the idea of efficient additions/removals being possible only at the ends of the list the whole thing makes a perfect sense....

    Anton Bassov

  • Tim_RobertsTim_Roberts Member - All Emails Posts: 12,982
    via Email
    On Oct 18, 2018, at 9:39 PM, anton_bassov wrote:
    >
    >
    > Just open any introduction-to-computer-science book,and you are going to see that what you say above provides just a classical description of a SINGLE -linked list that does not allow addition to/removals from the middle entries without traversing the list.

    You are making up your own definitions, and you are incorrect. The list can be trivially traversed in both directions, beginning from the ends. It IS a doubly-linked


    > Although it,indeed, allows additions/removals from either end, as well as traversals in either direction, it is still does not really qualify for a double-linked list.

    By applying your definition, then it also does not qualify as a single-linked list. That's petty semantics.

    It is a deque, and one that happens to be implemented with bi-directional linking.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • anton_bassovanton_bassov Member Posts: 4,991

    You are making up your own definitions, and you are incorrect. The list can be trivially traversed in both directions,
    beginning from the ends. It IS a doubly-linked

    Well, if one looks at the whole thing objectively, we both act like visitors to the zoo who argue whether the animal in question is a wolf or a cheetah while looking at the hyena.....

    This is a XOR-linked list, i.e. a data structure in its own right. This data structure happens to have features of both single and double linked list, as well as its very own specifics. It allows traversals in either direction and removals from either end the way double-linked list does.
    However, unlike a full-blown double-linked list it does not allow arbitrary additions/removals to/from the middle of the list. In order to do so you have to traverse the list exactly the same way you would if you were adding to/removing from the middle of a single-linked list. What makes it distinct from both single and double linked list is the possibility of linking data structures (potentially even unrelated ones) without embedding a pointer to the next list element in it, which may have its own applications in some (admittedly rare) cases.

    BTW, once we are at it, you can implement the same concept using additions and subtractions instead of XOR. This kind of list is slightly different from the XOR linked list, because the instruction sequences needed to traverse the list forwards is different from the one needed to do the same thing but in the reverse direction.....

    Anton Bassov

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    It's like blockchain version 0.1</>



    On Fri, Oct 19, 2018 at 9:32 AM anton_bassov
    wrote:

    > OSR https://community.osr.com/
    > anton_bassov commented on Windows kernel XOR chain list
    >
    > > You are making up your own definitions, and you are incorrect. The list
    > can be trivially traversed in both directions,
    > >
    > > beginning from the ends. It IS a doubly-linked
    >
    > Well, if one looks at the whole thing objectively, we both act like
    > visitors to the zoo who argue whether the animal in question is a wolf or
    > a cheetah while looking at the hyena.....
    >
    > This is a XOR-linked list, i.e. a data structure in its own right. This
    > data structure happens to have features of both single and double linked
    > list, as well as its very own specifics. It allows traversals in either
    > direction and removals from either end the way double-linked list does.
    >
    > However, unlike a full-blown double-linked list it does not allow
    > arbitrary additions/removals to/from the middle of the list. In order to do
    > so you have to traverse the list exactly the same way you would if you were
    > adding to/removing from the middle of a single-linked list. What makes it
    > distinct from both single and double linked list is the possibility of
    > linking data structures (potentially even unrelated ones) without
    > embedding a pointer to the next list element in it, which may have its own
    > applications in some (admittedly rare) cases.
    >
    > BTW, once we are at it, you can implement the same concept using
    > additions and subtractions instead of XOR. This kind of list is slightly
    > different from the XOR linked list, because the instruction sequences
    > needed to traverse the list forwards is different from the one needed to
    > do the same thing but in the reverse direction.....
    >
    > Anton Bassov
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291025#Comment_291025
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291025#Comment_291025
    >
  • Prokash_SinhaProkash_Sinha Member - All Emails Posts: 197

    Ah how hard would it be to port to kernel :-)
    https://www.linuxjournal.com/article/6828

  • Prokash_SinhaProkash_Sinha Member - All Emails Posts: 197

    And for that to make it x-platform ???
    -Pro

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    Nice!!! I read this article a few weeks ago. That function would be quite
    easy to add to my implementation.

    I have even considered an enumerate function that passes a context
    structure with prev, curr, and next to restart where you left off.

    On Sat, Oct 20, 2018 at 11:58 AM Prokash_Sinha-2
    wrote:

    > OSR https://community.osr.com/
    > Prokash_Sinha-2 commented on Windows kernel XOR chain list
    >
    > And for that to make it x-platform ???
    >
    > -Pro
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291049#Comment_291049
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291049#Comment_291049
    >
  • Prokash_SinhaProkash_Sinha Member - All Emails Posts: 197

    :D

  • Prokash_SinhaProkash_Sinha Member - All Emails Posts: 197

    :D

  • anton_bassovanton_bassov Member Posts: 4,991

    It's like blockchain version 0.1

    Well, this is, probably, not the best comparison in context of this discussion (i.e. whether the list under consideration is a single or a double linked one) . Please note that, unlike XOR-linked list, blockchain is a STRICTLY single-linked list that allows traversals only in one direction and modification only to the end of the list. If you try adding or removing any of its entries ( apart from the last one, of course) you are going to get a checksum mismatch. Actually, this protection against modifications happens to be the underlying idea behind the very concept of a blockchain, in the first place.

    In fact, I try my best to avoid using this term - on my books, it is closely associated with business types and other laymen who like speaking about some supposedly "revolutionary blockchain technology", and want to go as far as even start teaching it in the universities. Being a technically-minded person, I realise that, in actuality, the whole "revolutionary technology" is nothing more than just a special form of a Merkle tree that had been invented (and patented) by Ralph Merkle back in 1979, i.e. 30 years before this term even came to the existence, and found a wide range of practical applications since. ZFS and BTRFS file systems, as well as various file-sharing protocols, are the very first examples that come up to me mind.

    Therefore, I would say the best term to describe a blockchain is either a "hash-linked list" or a "Merkle single-linked list" , although neither of the above is, apparently, too palatable to the business/marketing types......

    Anton Bassov

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    Hence the </> I see you left that out of the quote :)




    On Sat, Oct 20, 2018 at 8:57 PM anton_bassov
    wrote:

    > OSR https://community.osr.com/
    > anton_bassov commented on Windows kernel XOR chain list
    >
    > > It's like blockchain version 0.1
    >
    > Well, this is, probably, not the best comparison in context of this
    > discussion (i.e. whether the list under consideration is a single or a
    > double linked one) . Please note that, unlike XOR-linked list, blockchain
    > is a STRICTLY single-linked list that allows traversals only in one
    > direction and modification only to the end of the list. If you try adding
    > or removing any of its entries ( apart from the last one, of course) you
    > are going to get a checksum mismatch. Actually, this protection against
    > modifications happens to be the underlying idea behind the very concept of
    > a blockchain, in the first place.
    >
    > In fact, I try my best to avoid using this term - on my books, it is
    > closely associated with business types and other laymen who like speaking
    > about some supposedly "revolutionary blockchain technology", and want to go
    > as far as even start teaching it in the universities. Being a
    > technically-minded person, I realise that, in actuality, the whole
    > "revolutionary technology" is nothing more than just a special form of a
    > Merkle tree that had been invented (and patented) by Ralph Merkle back in
    > 1979, i.e. 30 years before this term even came to the existence, and found
    > a wide range of practical applications since. ZFS and BTRFS file systems,
    > as well as various file-sharing protocols, are the very first examples that
    > come up to me mind.
    >
    > Therefore, I would say the best term to describe a blockchain is either
    > a "hash-linked list" or a "Merkle single-linked list" , although neither of
    > the above is, apparently, too palatable to the business/marketing
    > types......
    >
    > Anton Bassov
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291057#Comment_291057
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291057#Comment_291057
    >
  • anton_bassovanton_bassov Member Posts: 4,991

    Hence the </> I see you left that out of the quote

    As I have experimentally established, the new hosting platform seems, for some reason, to be automatically removing <>-enclosed XML- style stuff. For example, on a couple occasions it removed the opening statement while preserving the closing one
    (I am pretty sure it is about to do it with this post as well). Therefore, when I saw </> I thought that it could well be just a mangled paragraph-closing part, so that I did not give it any interpretation. As I can see now, it was meant to be some kind of an emoticon....

    BTW, once we are at it, I think that if some independent observers look at all these emoticons, they may suspect that the world is trying to develop a new glyph (or,probably, even a cuneiform) -based writing system of the XXI century. Call me a bore of you wish, but on my books this "writing system" firmly belongs in the same class with "2B/"4U/etc" one, i.e. something that is more likely to be associated with the kids texting one another, rather than with the mature technical-minded people posting to the technical NGs.....

    Anton Bassov

  • anton_bassovanton_bassov Member Posts: 4,991

    on a couple occasions it removed the opening statement while preserving the closing one
    (I am pretty sure it is about to do it with this post as well)

    As I can see, the platform 'improved" itself even further - now it seems to be removing both....

    Anton Bassov

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    Here is a situation where this has been useful:

    Data structure with LIST_ENTRY was 40 bytes. Since memory manager aligns on
    16 byte boundary, this is actually taking 48 bytes. By removing 8 bytes by
    using the XOR list, my 40 byte structure is now 32 bytes. Although this
    seems trivial, it is not.

    1) I save 8 bytes from the node link, and 8 bytes of overflow from the
    memory manager. That's a savings of 16 bytes (33%).
    2) Two nodes can reside in the cache. Before, only one full node would fit.

    Although using XOR list may add CPU cycles, I suspect I gain more from
    being cache friendly than i lose in CPU cycles. I'd be happy if it were
    close in either direction.


    On Sat, Oct 20, 2018 at 10:47 PM anton_bassov
    wrote:

    > OSR https://community.osr.com/
    > anton_bassov commented on Windows kernel XOR chain list
    >
    > > on a couple occasions it removed the opening statement while preserving
    > the closing one
    > >
    > > (I am pretty sure it is about to do it with this post as well)
    >
    > As I can see, the platform 'improved" itself even further - now it seems
    > to be removing both....
    >
    > Anton Bassov
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291060#Comment_291060
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291060#Comment_291060
    >
  • Prokash_SinhaProkash_Sinha Member - All Emails Posts: 197

    I fail to understand your logic though. The way I think about, is that whatever the underlying data structure, a node is a node, randomly allocated ( as we assume under any dynamic structure ), hence how the cache will get them in cache memory, so that accessing those nodes will have cache hit. Usually cache fill is sequential prefetch, and the nodes are not. So for data cache, how could you expect they would be in cache ? Or more hits than misses ? 'm missing something ...
    -Pro

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    When accessing the nodes to xor the pointers, two full nodes can reside in
    the cache at the same time. This is not guaranteed, but the opportunity for
    two nodes to reside in the cache at the same time is possible, and when
    processing millions of records through the list, I'd expect to see an
    accumulated benefit. I could setup a test and measure performance between
    XLIST and LIST_ENTRY. Maybe i'll do that next weekend when I have more
    tinker time. With a 40 byte node, I am guaranteed that two full nodes will
    never exist in the cache at the same time.

    I know for sure that I am saving memory. I have an odd situation where I am
    supporting an old design (not mine). Occasionally the software is paused,
    and lots of nodes are queued in NPP. This can be done for an extended
    period of time, and on systems without loads of RAM. On rare occasions, it
    can be millions of nodes. It is an issue. A 33% savings in this
    pathological situation is a temporary solution (tested) until I can re
    architect that aspect of the code. This is a very rare situation (right
    now), but one that needs addressing. When the software was designed (years
    ago), apparently no one was thinking about the impact of 20+ TB drives.

    And I've always wanted to play around with an xor chain, so now I have.





    On Sun, Oct 21, 2018 at 12:39 AM Prokash_Sinha-2
    wrote:

    > OSR https://community.osr.com/
    > Prokash_Sinha-2 commented on Windows kernel XOR chain list
    >
    > I fail to understand your logic though. The way I think about, is that
    > whatever the underlying data structure, a node is a node, randomly
    > allocated ( as we assume under any dynamic structure ), hence how the cache
    > will get them in cache memory, so that accessing those nodes will have
    > cache hit. Usually cache fill is sequential prefetch, and the nodes are
    > not. So for data cache, how could you expect they would be in cache ? Or
    > more hits than misses ? 'm missing something ...
    >
    > -Pro
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291062#Comment_291062
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291062#Comment_291062
    >
  • anton_bassovanton_bassov Member Posts: 4,991

    the way I think about, is that whatever the underlying data structure, a node is a node, randomly allocated ( as we assume under
    any dynamic structure ), hence how the cache will get them in cache memory, so that accessing those nodes will have cache hit.

    The very first thing that gets into my head is a custom slab allocator - first you allocate an array of needed structures at once by calling ExAllocatePool()), and then allocate/deallocate them from/to this array, rather than going to the system allocator upon every allocation/deallocation. You may get a noticeable performance improvement in some cases if you do it this way. In such case aligning your target structures on the cache-line boundary is going to enhance the performance even further.

    Anton Bassov

  • Prokash_SinhaProkash_Sinha Member - All Emails Posts: 197

    Yep, that's the approach to take :D . Surely, Kirby has a wonderful observation already, the system alignment. Big pain is debugging, often most debugger gives linked structures in debugger pane, and traversing is easy. Here no such luck :D

  • Prokash_SinhaProkash_Sinha Member - All Emails Posts: 197

    One thing to notice is that, dynamic structures ( their arrival, existence, and duration ) are assumed to be random --- So your apparent neighbors could change randomly, which would make cache misses more probable...

    -Pro

  • Jamey_KirbyJamey_Kirby Member - All Emails Posts: 429
    via Email
    I am using it to replace LIST_ENTRY to save space, so the code was already
    using a dynamic list; with a lookaside list.

    I was reviewing the WRK last night, and I see the default setting for
    lookaside lists is 256 entries. Most of the time I will have way less than
    that in my queue. Only in certain situations will the driver have to
    collect a lot of data in memory; which by that time, it is
    allocating/freeing from/to NPP anyway.

    I considered a slab allocator. I could then ensure my data was cache
    aligned without sacrificing too much memory. In reality, I am in upper
    region of the block storage stack, not sitting on a hardware interrupt, so
    I am not going to waste too much time right now on any major modifications
    or changes for performance. I am only collecting metadata, not block data.
    However, I am more than happy to add contributors to the project if anyone
    wants to tinker, and push better code :)

    If I am not mistaken, the Linux kernel has an xor based queue
    implementation that is frequently used; maybe in the storage stack. I think
    I remember seeing it some years back.

    I have made several modification based on some of the lists input; like
    renaming ExInterlockedXxx to InterlockedXxx, added traversal examples, and
    I even added PushXList() and PopXList() macros to mimic how SLIST works
    (kinda silly, but I added them anyway). When i get some time, I'll add an
    enumerator that has an _Out_ parameter that contains a state structure with
    prev, cur, and next pointers so that you can restart an enumeration where
    you left off; in either direction.



    On Sun, Oct 21, 2018 at 10:51 AM Prokash_Sinha-2
    wrote:

    > OSR https://community.osr.com/
    > Prokash_Sinha-2 commented on Windows kernel XOR chain list
    >
    > One thing to notice is that, dynamic structures ( their arrival,
    > existence, and duration ) are assumed to be random --- So your apparent
    > neighbors could change randomly, which would make cache misses more
    > probable...
    >
    > -Pro
    >
    > --
    > Reply to this email directly or follow the link below to check it out:
    > https://community.osr.com/discussion/comment/291066#Comment_291066
    >
    > Check it out:
    > https://community.osr.com/discussion/comment/291066#Comment_291066
    >
Sign In or Register to comment.

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Upcoming OSR Seminars
Developing Minifilters 29 July 2019 OSR Seminar Space
Writing WDF Drivers 23 Sept 2019 OSR Seminar Space
Kernel Debugging 21 Oct 2019 OSR Seminar Space
Internals & Software Drivers 18 Nov 2019 Dulles, VA