Windows kernel XOR chain list

Thanks for the input. Feel free to make changes yourself :slight_smile:

On Thu, Oct 18, 2018 at 5:54 PM MBond
wrote:

> OSR http://osr.vanillacommunities.com/
> MBond commented on Windows kernel XOR chain list
>
> Yes names like xxxInterlockedxxx should actually be interlocked – as
> opposed to software lock based ?
>
>
>
> Also, on Amazon I can buy RAM at the rate of about $10 per GB. This means
> that the ‘value’ of using code like this that saves 8 bytes per entry, is
> only significant when the size of the list is absolutely enormous – say 100
> million items or greater. But this is absolutely exactly the kind of list
> that should not be stored using this kind of structure so it is kind of a
> catch 22.
>
>
>
> Of much more interest is the relative speed of the operations
>
>
>
> In the insert path, you have one branch and either two assignments or
> three + 3 xors. A standard implementation can have zero branches and 4
> assignments (two of which can be done via interlockedCompareExchange).
>
>
>
> In the remove path, you have two branches, two assignments and one or 3
> xors. A standard implementation can have zero branches and 3 assignments
> (two of which can be done via interlockedCompareExchange).
>
>
>
> The branches are going to kill your performance, but I think if you work
> at it you can eliminate them. Usually a good way to eliminate the branches
> is to make the list circularly linked to an ‘anchor’ or ‘root’ node so all
> of the operations always have both a previous and next node – but that
> probably does not work well here
>
>
>
> Sent from Mail for Windows 10
>
>
>
> ________________________________
> From: Jamey_Kirby
> Sent: Thursday, October 18, 2018 5:16:57 PM
> To: MBond
> Subject: Re: [NTDEV] Windows kernel XOR chain list
>
> OSR http://osr.vanillacommunities.com/
> Jamey_Kirby commented on Windows kernel XOR chain list
>
> I remember reading about xor chaining in a Dr. Dobb’s article in the late
> 80’s or early 90’s, and have always wanted to find a use for one.
>
> Tim: You are correct, I only implemented a deque (add to tail, remove from
> head), as that was my immediate requirement. Yes, I do want to implements a
> fuller set of functions to traverse and such, but, as you mentioned, those
> are expensive operations, and probably not very practical for large lists.
> Adding RemoveTail, and InsertHead are trivial.
>
> Peter: Yes, I considered using different naming, or do something more like
> the kernel does (disable interrupts, etc.). I am not adverse to others
> contributing to the code either
>
> Phil: I did put two functions in the header. Good suggestion though. The
> interlocked versions are probably best left as functions.
>
> Thanks for the feedback.
>
>
> On Thu, Oct 18, 2018 at 4:57 PM Peter_Viscarola_(OSR)
> wrote:
>
> > OSR https://community.osr.com/
> > Peter_Viscarola_(OSR) commented on Windows kernel XOR chain list
> >
> > First… Thank you, Jamey, for once again openly sharing your work with
> > the Windows driver development community. That’s darn nice of you.
> >
> > Now… Being basically ignorant, I’m not going to claim I understand how
> > this thing works.
> >
> > But I did notice that there’s an ExInterlockedRemoveHeadXorList
> > implementation… and I wanted to note that this is not similar to the
> > standard Windows ExInterlockedRemoveHeadList function in that it can only
> > be called at IRQL <= DISPATCH_LEVEL. I think using the same naming
> > convention, but a very different set of functionalities and constraints
> is
> > unwise.
> >
> > I always wanted to point out that this implementation unfortunately
> lacks
> > SAL annotations. Which is unfortunate, as these types of projects are
> cool
> > ways to show people what SAL is about and how it should be used.
> >
> > Peter
> >
> > –
> > Reply to this email directly or follow the link below to check it out:
> > https://community.osr.com/discussion/comment/291010#Comment_291010
> >
> > Check it out:
> > https://community.osr.com/discussion/comment/291010#Comment_291010
> >
>
> –
> Reply to this email directly or follow the link below to check it out:
> http://osr.vanillacommunities.com/discussion/comment/291012#Comment_291012
>
> Check it out:
> http://osr.vanillacommunities.com/discussion/comment/291012#Comment_291012
>

I’ve updated it with a few of the simple suggestions:

  1. Added example in the header (commented out) to show how to enumerate the
    list.

// Example to enumerate the list from head to tail.
//
// EnumerateXorList(PXOR_LIST List) {
// PXOR_LIST_ENTRY Current = List->Head;
// PXOR_LIST_ENTRY Previous = NULL;
// PXOR_LIST_ENTRY Next = NULL;
// while (Current != NULL) {
// KdPrintEx((DPFLTR_IHVDRIVER_ID, DPFLTR_INFO_LEVEL,
// “Item pointer: %p\n”, Current));
// Next = xor(Previous, Current->Pointer);
// Previous = Current;
// Current = Next;
// }
// }

  1. Added deque functions InsertHead…() and RemoveTail…().

Anyway, I hope someone finds it useful. I’d appreciate any enhancements or
updates. Let me know, and I can add you as a contributor. Over and out.

On Thu, Oct 18, 2018 at 6:10 PM Jamey Kirby <kirby.jamey> wrote:

> Thanks for the input. Feel free to make changes yourself :slight_smile:
>
>
> On Thu, Oct 18, 2018 at 5:54 PM MBond
> wrote:
>
>> OSR http://osr.vanillacommunities.com/
>> MBond commented on Windows kernel XOR chain list
>>
>> Yes names like xxxInterlockedxxx should actually be interlocked – as
>> opposed to software lock based ?
>>
>>
>>
>> Also, on Amazon I can buy RAM at the rate of about $10 per GB. This
>> means that the ‘value’ of using code like this that saves 8 bytes per
>> entry, is only significant when the size of the list is absolutely enormous
>> – say 100 million items or greater. But this is absolutely exactly the
>> kind of list that should not be stored using this kind of structure so it
>> is kind of a catch 22.
>>
>>
>>
>> Of much more interest is the relative speed of the operations
>>
>>
>>
>> In the insert path, you have one branch and either two assignments or
>> three + 3 xors. A standard implementation can have zero branches and 4
>> assignments (two of which can be done via interlockedCompareExchange).
>>
>>
>>
>> In the remove path, you have two branches, two assignments and one or 3
>> xors. A standard implementation can have zero branches and 3 assignments
>> (two of which can be done via interlockedCompareExchange).
>>
>>
>>
>> The branches are going to kill your performance, but I think if you work
>> at it you can eliminate them. Usually a good way to eliminate the branches
>> is to make the list circularly linked to an ‘anchor’ or ‘root’ node so all
>> of the operations always have both a previous and next node – but that
>> probably does not work well here
>>
>>
>>
>> Sent from Mail for Windows 10
>>
>>
>>
>> ________________________________
>> From: Jamey_Kirby
>> Sent: Thursday, October 18, 2018 5:16:57 PM
>> To: MBond
>> Subject: Re: [NTDEV] Windows kernel XOR chain list
>>
>> OSR http://osr.vanillacommunities.com/
>> Jamey_Kirby commented on Windows kernel XOR chain list
>>
>> I remember reading about xor chaining in a Dr. Dobb’s article in the late
>> 80’s or early 90’s, and have always wanted to find a use for one.
>>
>> Tim: You are correct, I only implemented a deque (add to tail, remove from
>> head), as that was my immediate requirement. Yes, I do want to implements
>> a
>> fuller set of functions to traverse and such, but, as you mentioned, those
>> are expensive operations, and probably not very practical for large lists.
>> Adding RemoveTail, and InsertHead are trivial.
>>
>> Peter: Yes, I considered using different naming, or do something more like
>> the kernel does (disable interrupts, etc.). I am not adverse to others
>> contributing to the code either
>>
>> Phil: I did put two functions in the header. Good suggestion though. The
>> interlocked versions are probably best left as functions.
>>
>> Thanks for the feedback.
>>
>>
>> On Thu, Oct 18, 2018 at 4:57 PM Peter_Viscarola_(OSR)
>> wrote:
>>
>> > OSR https://community.osr.com/
>> > Peter_Viscarola_(OSR) commented on Windows kernel XOR chain list
>> >
>> > First… Thank you, Jamey, for once again openly sharing your work with
>> > the Windows driver development community. That’s darn nice of you.
>> >
>> > Now… Being basically ignorant, I’m not going to claim I understand
>> how
>> > this thing works.
>> >
>> > But I did notice that there’s an ExInterlockedRemoveHeadXorList
>> > implementation… and I wanted to note that this is not similar to the
>> > standard Windows ExInterlockedRemoveHeadList function in that it can
>> only
>> > be called at IRQL <= DISPATCH_LEVEL. I think using the same naming
>> > convention, but a very different set of functionalities and constraints
>> is
>> > unwise.
>> >
>> > I always wanted to point out that this implementation unfortunately
>> lacks
>> > SAL annotations. Which is unfortunate, as these types of projects are
>> cool
>> > ways to show people what SAL is about and how it should be used.
>> >
>> > Peter
>> >
>> > –
>> > Reply to this email directly or follow the link below to check it out:
>> > https://community.osr.com/discussion/comment/291010#Comment_291010
>> >
>> > Check it out:
>> > https://community.osr.com/discussion/comment/291010#Comment_291010
>> >
>>
>> –
>> Reply to this email directly or follow the link below to check it out:
>> http://osr.vanillacommunities.com/discussion/comment/291012#Comment_291012
>>
>> Check it out:
>> http://osr.vanillacommunities.com/discussion/comment/291012#Comment_291012
>>
>
>
> –
> Jamey Kirby
> Disrupting the establishment since 1964
>
> This is a personal email account and as such, emails are not subject to
> archiving. Nothing else really matters.

></kirby.jamey>

I’ve implemented an xor chain double linked list.

Well, the only thing that I can see in so far is a SINGLE-linked list,rather than a DOUBLE-linked one. Just in case if you don’t believe me, try removing an entry from the middle of the list, and you will see that it cannot be done without traversing the list from its first entry…

The only advantage to a “conventional” single-linked list that it offers is saving you the necessity to embed a pointer to the next list element in every the structure that you are linking this way, effectively saving you 8 bytes of RAM (although in some cases it may save you the entire 64 bytes - for example, consider the scenario when your target structure size is a multiple of the one of a cache line, and,hence, adding an extra pointer to it would imply increasing the structure size by 64 bytes, rather than just by 8, if you want to maintain cache-line alignment).
Not really a big deal for x86_64-based system in the year 2018, don’t you think…

Taking into consideration that using a single-linked list in the kernel is, probably, not the most convenient/reasonable approach in 99% of cases anyway, the whole thing seems to be of very limited practical usefulness in the kernel.

I hate sounding like a killjoy who attacks an “unconventional” approach, but that’s the way it is…

Anton Bassov

It is a double linked list, not a single. The pointer in the node is the
previous node xored with the next node. So, if you know one of the values,
you can extract the other. You have to walk the list to insert and remove
elements from inside the list, however, doing head and tail operations are
efficient.

And you are correct about cache alignment. It is one of the reasons I
implemented the code. After dropping 8 bytes per node, my data structure
dropped to 32 bytes. This plays nice with the cache, and with the memory
allocator.

The code can surely be optimized as Marion pointed out, but this is a solid
reference implementation. I’ve pushed millions of nodes through the code
with no issues. Performance is surely acceptable.

On Thu, Oct 18, 2018 at 11:26 PM anton_bassov
wrote:

> OSR https://community.osr.com/
> anton_bassov commented on Windows kernel XOR chain list
>
> > I’ve implemented an xor chain double linked list.
>
> Well, the only thing that I can see in so far is a SINGLE-linked
> list,rather than a DOUBLE-linked one. Just in case if you don’t believe me,
> try removing an entry from the middle of the list, and you will see that it
> cannot be done without traversing the list from its first entry…
>
> The only advantage to a “conventional” single-linked list that it offers
> is saving you the necessity to embed a pointer to the next list element in
> every the structure that you are linking this way, effectively saving you 8
> bytes of RAM (although in some cases it may save you the entire 64 bytes -
> for example, consider the scenario when your target structure size is a
> multiple of the one of a cache line, and,hence, adding an extra pointer to
> it would imply increasing the structure size by 64 bytes, rather than just
> by 8, if you want to maintain cache-line alignment).
>
> Not really a big deal for x86_64-based system in the year 2018, don’t you
> think…
>
> Taking into consideration that using a single-linked list in the kernel
> is, probably, not the most convenient/reasonable approach in 99% of cases
> anyway, the whole thing seems to be of very limited practical usefulness in
> the kernel.
>
> I hate sounding like a killjoy who attacks an “unconventional” approach,
> but that’s the way it is…
>
> Anton Bassov
>
> –
> Reply to this email directly or follow the link below to check it out:
> https://community.osr.com/discussion/comment/291016#Comment_291016
>
> Check it out:
> https://community.osr.com/discussion/comment/291016#Comment_291016
>

On Oct 18, 2018, at 8:26 PM, anton_bassov wrote:
>
>> I’ve implemented an xor chain double linked list.
>
> Well, the only thing that I can see in so far is a SINGLE-linked list,rather than a DOUBLE-linked one. Just in case if you don’t believe me, try removing an entry from the middle of the list, and you will see that it cannot be done without traversing the list from its first entry…

It is doubly-linked, not singly-linked. However as I mentioned earlier, it’s a deque. You can run the list from either end, but you must start from the ends. You can’t start from the middle.

Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

It is a double linked list, not a single. The pointer in the node is the
previous node xored with the next node. So, if you know one of the values,
you can extract the other. You have to walk the list to insert and remove
elements from inside the list, however, doing head and tail operations are efficient.

Just open any introduction-to-computer-science book,and you are going to see that what you say above provides just a classical description of a SINGLE -linked list that does not allow addition to/removals from the middle entries without traversing the list. Although it,indeed, allows additions/removals from either end, as well as traversals in either direction, it is still does not really qualify for a double-linked list. If it was a truly DOUBLE-linked list, adding and removing entries to/from the middle of the list would be a painless exercise. This is the exactly the feature that makes a double-linked list are so popular, compared to a single-linked one…

And you are correct about cache alignment. It is one of the reasons I implemented the code. After dropping 8 bytes per node,
my data structure dropped to 32 bytes. This plays nice with the cache, and with the memory allocator.

Fair enough - as long as you are OK with the idea of efficient additions/removals being possible only at the ends of the list the whole thing makes a perfect sense…

Anton Bassov

On Oct 18, 2018, at 9:39 PM, anton_bassov wrote:
>
>
> Just open any introduction-to-computer-science book,and you are going to see that what you say above provides just a classical description of a SINGLE -linked list that does not allow addition to/removals from the middle entries without traversing the list.

You are making up your own definitions, and you are incorrect. The list can be trivially traversed in both directions, beginning from the ends. It IS a doubly-linked

> Although it,indeed, allows additions/removals from either end, as well as traversals in either direction, it is still does not really qualify for a double-linked list.

By applying your definition, then it also does not qualify as a single-linked list. That’s petty semantics.

It is a deque, and one that happens to be implemented with bi-directional linking.

Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

You are making up your own definitions, and you are incorrect. The list can be trivially traversed in both directions,
beginning from the ends. It IS a doubly-linked

Well, if one looks at the whole thing objectively, we both act like visitors to the zoo who argue whether the animal in question is a wolf or a cheetah while looking at the hyena…

This is a XOR-linked list, i.e. a data structure in its own right. This data structure happens to have features of both single and double linked list, as well as its very own specifics. It allows traversals in either direction and removals from either end the way double-linked list does.
However, unlike a full-blown double-linked list it does not allow arbitrary additions/removals to/from the middle of the list. In order to do so you have to traverse the list exactly the same way you would if you were adding to/removing from the middle of a single-linked list. What makes it distinct from both single and double linked list is the possibility of linking data structures (potentially even unrelated ones) without embedding a pointer to the next list element in it, which may have its own applications in some (admittedly rare) cases.

BTW, once we are at it, you can implement the same concept using additions and subtractions instead of XOR. This kind of list is slightly different from the XOR linked list, because the instruction sequences needed to traverse the list forwards is different from the one needed to do the same thing but in the reverse direction…

Anton Bassov

It’s like blockchain version 0.1>

On Fri, Oct 19, 2018 at 9:32 AM anton_bassov
wrote:

> OSR https://community.osr.com/
> anton_bassov commented on Windows kernel XOR chain list
>
> > You are making up your own definitions, and you are incorrect. The list
> can be trivially traversed in both directions,
> >
> > beginning from the ends. It IS a doubly-linked
>
> Well, if one looks at the whole thing objectively, we both act like
> visitors to the zoo who argue whether the animal in question is a wolf or
> a cheetah while looking at the hyena…
>
> This is a XOR-linked list, i.e. a data structure in its own right. This
> data structure happens to have features of both single and double linked
> list, as well as its very own specifics. It allows traversals in either
> direction and removals from either end the way double-linked list does.
>
> However, unlike a full-blown double-linked list it does not allow
> arbitrary additions/removals to/from the middle of the list. In order to do
> so you have to traverse the list exactly the same way you would if you were
> adding to/removing from the middle of a single-linked list. What makes it
> distinct from both single and double linked list is the possibility of
> linking data structures (potentially even unrelated ones) without
> embedding a pointer to the next list element in it, which may have its own
> applications in some (admittedly rare) cases.
>
> BTW, once we are at it, you can implement the same concept using
> additions and subtractions instead of XOR. This kind of list is slightly
> different from the XOR linked list, because the instruction sequences
> needed to traverse the list forwards is different from the one needed to
> do the same thing but in the reverse direction…
>
> Anton Bassov
>
> –
> Reply to this email directly or follow the link below to check it out:
> https://community.osr.com/discussion/comment/291025#Comment_291025
>
> Check it out:
> https://community.osr.com/discussion/comment/291025#Comment_291025
>

Ah how hard would it be to port to kernel :slight_smile:
https://www.linuxjournal.com/article/6828

And for that to make it x-platform ???
-Pro

Nice!!! I read this article a few weeks ago. That function would be quite
easy to add to my implementation.

I have even considered an enumerate function that passes a context
structure with prev, curr, and next to restart where you left off.

On Sat, Oct 20, 2018 at 11:58 AM Prokash_Sinha-2
wrote:

> OSR https://community.osr.com/
> Prokash_Sinha-2 commented on Windows kernel XOR chain list
>
> And for that to make it x-platform ???
>
> -Pro
>
> –
> Reply to this email directly or follow the link below to check it out:
> https://community.osr.com/discussion/comment/291049#Comment_291049
>
> Check it out:
> https://community.osr.com/discussion/comment/291049#Comment_291049
>

:smiley:

:smiley:

It’s like blockchain version 0.1

Well, this is, probably, not the best comparison in context of this discussion (i.e. whether the list under consideration is a single or a double linked one) . Please note that, unlike XOR-linked list, blockchain is a STRICTLY single-linked list that allows traversals only in one direction and modification only to the end of the list. If you try adding or removing any of its entries ( apart from the last one, of course) you are going to get a checksum mismatch. Actually, this protection against modifications happens to be the underlying idea behind the very concept of a blockchain, in the first place.

In fact, I try my best to avoid using this term - on my books, it is closely associated with business types and other laymen who like speaking about some supposedly “revolutionary blockchain technology”, and want to go as far as even start teaching it in the universities. Being a technically-minded person, I realise that, in actuality, the whole “revolutionary technology” is nothing more than just a special form of a Merkle tree that had been invented (and patented) by Ralph Merkle back in 1979, i.e. 30 years before this term even came to the existence, and found a wide range of practical applications since. ZFS and BTRFS file systems, as well as various file-sharing protocols, are the very first examples that come up to me mind.

Therefore, I would say the best term to describe a blockchain is either a “hash-linked list” or a “Merkle single-linked list” , although neither of the above is, apparently, too palatable to the business/marketing types…

Anton Bassov

Hence the > I see you left that out of the quote :slight_smile:

On Sat, Oct 20, 2018 at 8:57 PM anton_bassov
wrote:

> OSR https://community.osr.com/
> anton_bassov commented on Windows kernel XOR chain list
>
> > It’s like blockchain version 0.1
>
> Well, this is, probably, not the best comparison in context of this
> discussion (i.e. whether the list under consideration is a single or a
> double linked one) . Please note that, unlike XOR-linked list, blockchain
> is a STRICTLY single-linked list that allows traversals only in one
> direction and modification only to the end of the list. If you try adding
> or removing any of its entries ( apart from the last one, of course) you
> are going to get a checksum mismatch. Actually, this protection against
> modifications happens to be the underlying idea behind the very concept of
> a blockchain, in the first place.
>
> In fact, I try my best to avoid using this term - on my books, it is
> closely associated with business types and other laymen who like speaking
> about some supposedly “revolutionary blockchain technology”, and want to go
> as far as even start teaching it in the universities. Being a
> technically-minded person, I realise that, in actuality, the whole
> “revolutionary technology” is nothing more than just a special form of a
> Merkle tree that had been invented (and patented) by Ralph Merkle back in
> 1979, i.e. 30 years before this term even came to the existence, and found
> a wide range of practical applications since. ZFS and BTRFS file systems,
> as well as various file-sharing protocols, are the very first examples that
> come up to me mind.
>
> Therefore, I would say the best term to describe a blockchain is either
> a “hash-linked list” or a “Merkle single-linked list” , although neither of
> the above is, apparently, too palatable to the business/marketing
> types…
>
> Anton Bassov
>
> –
> Reply to this email directly or follow the link below to check it out:
> https://community.osr.com/discussion/comment/291057#Comment_291057
>
> Check it out:
> https://community.osr.com/discussion/comment/291057#Comment_291057
>

Hence the </> I see you left that out of the quote

As I have experimentally established, the new hosting platform seems, for some reason, to be automatically removing <>-enclosed XML- style stuff. For example, on a couple occasions it removed the opening

one
(I am pretty sure it is about to do it with this post as well). Therefore, when I saw </> I thought that it could well be just a mangled paragraph-closing part, so that I did not give it any interpretation. As I can see now, it was meant to be some kind of an emoticon…

BTW, once we are at it, I think that if some independent observers look at all these emoticons, they may suspect that the world is trying to develop a new glyph (or,probably, even a cuneiform) -based writing system of the XXI century. Call me a bore of you wish, but on my books this “writing system” firmly belongs in the same class with "2B/“4U/etc” one, i.e. something that is more likely to be associated with the kids texting one another, rather than with the mature technical-minded people posting to the technical NGs…

Anton Bassov

on a couple occasions it removed the opening statement while preserving the closing one
(I am pretty sure it is about to do it with this post as well)

As I can see, the platform 'improved" itself even further - now it seems to be removing both…

Anton Bassov

Here is a situation where this has been useful:

Data structure with LIST_ENTRY was 40 bytes. Since memory manager aligns on
16 byte boundary, this is actually taking 48 bytes. By removing 8 bytes by
using the XOR list, my 40 byte structure is now 32 bytes. Although this
seems trivial, it is not.

  1. I save 8 bytes from the node link, and 8 bytes of overflow from the
    memory manager. That’s a savings of 16 bytes (33%).
  2. Two nodes can reside in the cache. Before, only one full node would fit.

Although using XOR list may add CPU cycles, I suspect I gain more from
being cache friendly than i lose in CPU cycles. I’d be happy if it were
close in either direction.

On Sat, Oct 20, 2018 at 10:47 PM anton_bassov
wrote:

> OSR https://community.osr.com/
> anton_bassov commented on Windows kernel XOR chain list
>
> > on a couple occasions it removed the opening statement while preserving
> the closing one
> >
> > (I am pretty sure it is about to do it with this post as well)
>
> As I can see, the platform 'improved" itself even further - now it seems
> to be removing both…
>
> Anton Bassov
>
> –
> Reply to this email directly or follow the link below to check it out:
> https://community.osr.com/discussion/comment/291060#Comment_291060
>
> Check it out:
> https://community.osr.com/discussion/comment/291060#Comment_291060
>

I fail to understand your logic though. The way I think about, is that whatever the underlying data structure, a node is a node, randomly allocated ( as we assume under any dynamic structure ), hence how the cache will get them in cache memory, so that accessing those nodes will have cache hit. Usually cache fill is sequential prefetch, and the nodes are not. So for data cache, how could you expect they would be in cache ? Or more hits than misses ? 'm missing something …
-Pro