RB trees have their own set of flaws ? the largest underreported one is the assumption of entirely symmetric memory access and the very CPU cache unfriendly behaviours that can ensue ? this is also true of most tree implementations and does not necessarily reflect poorly on this one as the current state of the art has exactly these problems. This can be even worse on NUMA machines and no current production OS provides an adequate facility to identify or resolve these problems IMHO
Large CPU caches reduce the severity of the impact of data structures that have poor locality of reference, and indeed this is a secondary effect when compared with the efficiency of the algorithms, but even still secondary effects can still play a substantial role.
A significantly better tree algorithm that can be both compare and cache efficient does exist but I am not at liberty to disclose any of its details. I may flatter myself by mentioning it, but as the cancer doctors will shortly decide if my demise is imminent or at some protracted point in the future, I don?t care so much
In keeping with these challenges, I did buy a tablet (a surface) and learned to type left handed so that I can keep supplying all of the form members some increasingly erratic posts on religious, political, moral and by the way technical issues too. I apologise in advance to all of those whom I have learned from for squandering that wisdom
As these issues are totally OT, I hope we can get some better ones soon.
In the remote event that anyone wants to know about binary tree stability on cache coherent NUMA machines spanning millions of nodes in Windows KM, let me know and I?ll work on writing something that may make sense.
Sent from Mailhttps: for Windows 10
From: Maxim S. Shatskihmailto:xxxxx
Sent: July 25, 2016 6:55 AM
To: Windows System Software Devs Interest Listmailto:xxxxx
Subject: Re:[ntdev] Need to Sample source for _RTL_AVL_TABLE
If you delete a lot, then maybe red-black tree is better then AVL one.
That’s why most STL implementations use RB trees for sets and maps.
–
Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.commailto:xxxxx
http://www.storagecraft.com
“Marion Bond” > wrote in message news:xxxxx@ntdev…
It is well known that amongst all binary trees, AVL is a compromise. I suspect that the engineers at Microsoft chose this one because is exactly that.
Their counterparts in the SQL server team long ago made the choice to use b-trees instead (which are not binary however confusing the name). IMHO, primarily because of the relative ease of writing the data structures to disk in an atomic way ? and I stress relative ease
Skiplists are not a tree at all, but a modified form of a queue or linked list. They are especially useful if the data added to the list is in a particular order and the cost of comparison between elements is high. Depending on the depth of the skip, and the distribution of data, and the access pattern, they can improve performance dramatically since often far fewer comparisons are required to find the desired data. Order of elements is especially important when using skiplists as they are totally ineffective on lists with arbitrary order.
Again, depending upon the data insertion and access patterns, as well as the relative cost of comparisons, there are many different data structures that may be better or worse for a particular purpose. This is no new wisdom.
If the OP has already decided to use the AVL APIs (DDIs) presumably he knows that his particular usage pattern will benefit from this kind of data structure. We may presume that at least his boss suspects that the usage pattern will benefit from this. It may be possible to use an alternate data structure to a greater effect, but we have no possible way to know. One important factor is that AFAIK the only data structures that have built-in implementations in Windows for KM are the linked list and the AVL tree. I have no problem implementing generic data structures in KM, but others may wish to rely on the built-in functions as they have been well tested and peer reviewed
Sent from Mailhttps: for Windows 10
From: Jamey Kirbymailto:xxxxx
Sent: July 24, 2016 3:34 PM
To: Windows System Software Devs Interest Listmailto:xxxxx
Subject: Re: [ntdev] Need to Sample source for _RTL_AVL_TABLE
AVL is a bit heavy. If you want something just as fast, open source, and extremely easy to understand, try my kernel-mode skiplist implementation. I modeled it to use nearly the same interface semantics as RTL_GENERIC_TABLE. It is available on GitHub. https://github.com/jameykirby/public
On Thu, Jul 21, 2016 at 7:10 PM > wrote:
At one point, The Microsoft’s AVL tree implementation was the one described
in Knuth, “The Art of Computer Programming, Volume 3, Sorting and Searching”.
I don’t know if that has changed over time but the important think is that
the implementation is exported to usermode with NTDLL.DLL. You can test the
API in usermode with large collections of data. The WIN32 UI and the console
are probably easier to use for this purpose. Here is a header (AVL.H) you can
use to test the API:
/* AVL.H
Copyright (c) Microsoft Corporation. All rights reserved.
For testing purpose only !
This material was copied from the WDK’s NTDDK.H header
to use Microsoft’s AVL tree implementation in usermode.
Functions whose prototype begins with NTSYSAPI are exported
by NTDLL.DLL. ntdll.lib must be added to the list of libraries
imported by the project or any (.c or .cpp) source file that
calls one of these functions must contain the line:
#pragma comment(lib,“ntdll.lib”)
/
#ifndef AVL_TREE_HEADER
#define AVL_TREE_HEADER
/ We need Windows Internal defs and typedefs here*/
#include <winternl.h>
#ifdef __cplusplus
extern “C” {
#endif
//AVL Tree defs and typedefs
typedef enum _RTL_GENERIC_COMPARE_RESULTS {
GenericLessThan,
GenericGreaterThan,
GenericEqual
} RTL_GENERIC_COMPARE_RESULTS;
typedef
RTL_GENERIC_COMPARE_RESULTS
NTAPI
RTL_AVL_COMPARE_ROUTINE (
In struct _RTL_AVL_TABLE *Table,
In PVOID FirstStruct,
In PVOID SecondStruct
);
typedef RTL_AVL_COMPARE_ROUTINE *PRTL_AVL_COMPARE_ROUTINE;
typedef
PVOID
NTAPI
RTL_AVL_ALLOCATE_ROUTINE (
In struct _RTL_AVL_TABLE *Table,
In ULONG ByteSize
);
typedef RTL_AVL_ALLOCATE_ROUTINE *PRTL_AVL_ALLOCATE_ROUTINE;
typedef
VOID
NTAPI
RTL_AVL_FREE_ROUTINE (
In struct _RTL_AVL_TABLE *Table,
In PVOID Buffer
);
typedef RTL_AVL_FREE_ROUTINE *PRTL_AVL_FREE_ROUTINE;
typedef struct _RTL_BALANCED_LINKS {
struct _RTL_BALANCED_LINKS *Parent;
struct _RTL_BALANCED_LINKS *LeftChild;
struct _RTL_BALANCED_LINKS *RightChild;
CHAR Balance;
UCHAR Reserved[3];
} RTL_BALANCED_LINKS;
typedef RTL_BALANCED_LINKS *PRTL_BALANCED_LINKS;
typedef struct _RTL_AVL_TABLE {
RTL_BALANCED_LINKS BalancedRoot;
PVOID OrderedPointer;
ULONG WhichOrderedElement;
ULONG NumberGenericTableElements;
ULONG DepthOfTree;
PRTL_BALANCED_LINKS RestartKey;
ULONG DeleteCount;
PRTL_AVL_COMPARE_ROUTINE CompareRoutine;
PRTL_AVL_ALLOCATE_ROUTINE AllocateRoutine;
PRTL_AVL_FREE_ROUTINE FreeRoutine;
PVOID TableContext;
} RTL_AVL_TABLE;
typedef RTL_AVL_TABLE *PRTL_AVL_TABLE;
NTSYSAPI
VOID
NTAPI
RtlInitializeGenericTableAvl (
Out PRTL_AVL_TABLE Table,
In PRTL_AVL_COMPARE_ROUTINE CompareRoutine,
In PRTL_AVL_ALLOCATE_ROUTINE AllocateRoutine,
In PRTL_AVL_FREE_ROUTINE FreeRoutine,
In_opt PVOID TableContext
);
NTSYSAPI
PVOID
NTAPI
RtlInsertElementGenericTableAvl (
In PRTL_AVL_TABLE Table,
In_reads_bytes(BufferSize) PVOID Buffer,
In ULONG BufferSize,
Out_opt PBOOLEAN NewElement
);
NTSYSAPI
BOOLEAN
NTAPI
RtlDeleteElementGenericTableAvl (
In PRTL_AVL_TABLE Table,
In PVOID Buffer
);
NTSYSAPI
VOID
NTAPI
RtlDeleteElementGenericTableAvlEx (
In PRTL_AVL_TABLE Table,
In PVOID NodeOrParent
);
NTSYSAPI
PVOID
NTAPI
RtlLookupElementGenericTableAvl (
In PRTL_AVL_TABLE Table,
In PVOID Buffer
);
NTSYSAPI
PVOID
NTAPI
RtlEnumerateGenericTableAvl (
In PRTL_AVL_TABLE Table,
In BOOLEAN Restart
);
NTSYSAPI
PVOID
NTAPI
RtlEnumerateGenericTableWithoutSplayingAvl (
In PRTL_AVL_TABLE Table,
Inout PVOID *RestartKey
);
NTSYSAPI
ULONG
NTAPI
RtlNumberGenericTableElementsAvl (
In PRTL_AVL_TABLE Table
);
#ifdef__cplusplus
}
#endif
#endif //#ifndef __AVL_TREE_HEADER
//AVL.H ends here.
======================================
The allocate and free routines are simple. You just have to do
what you are supposed to do: allocating a chunk of memory of the given
size and freeing memory:
PVOID
NTAPI
AllocateRoutine (
In struct _RTL_AVL_TABLE *Table,
In ULONG ByteSize
){
return HeapAlloc(GetProcessHeap(),HEAP_ZERO_MEMORY,(SIZE_T)ByteSize);
}
VOID
NTAPI
FreeRoutine(
In struct _RTL_AVL_TABLE *Table,
In PVOID Buffer
){
HeapFree(GetProcessHeap(),0,Buffer);
}
The compare function is dependant on the data type of the node you insert in the tree.
For example if you insert nodes of the following type:
typedef struct _MYNODE{
UINT Index;
LPTSTR Name;
} MYNODE, *PMYNODE;
and if the Index field is used to compare nodes than the compare routine would be:
RTL_GENERIC_COMPARE_RESULTS
NTAPI CompareRoutine (
In struct _RTL_AVL_TABLE *Table,
In PVOID FirstStruct,
In PVOID SecondStruct
){
PMYNODE Node1, Node2;
Node1 = (PMYNODE)FirstStruct;
Node2 = (PMYNODE)SecondStruct;
if(Node1->Index > Node2->Index){
return GenericGreaterThan;
}
else if(Node1->Index < Node2->Index){
return GenericLessThan;
}
return GenericEqual;
}
Note that strings can be compared as well (lexicographical or dictionary order)
and could be used for indexing purpose. But for large collections, working with
string hashes is probably better and a hashing function is required.
—
NTDEV is sponsored by OSR
Visit the list online at: http:
MONTHLY seminars on crash dump analysis, WDF, Windows internals and software drivers!
Details at http:
To unsubscribe, visit the List Server section of OSR Online at http:
— NTDEV is sponsored by OSR Visit the list online at: MONTHLY seminars on crash dump analysis, WDF, Windows internals and software drivers! Details at To unsubscribe, visit the List Server section of OSR Online at
—
NTDEV is sponsored by OSR
Visit the list online at: http:
MONTHLY seminars on crash dump analysis, WDF, Windows internals and software drivers!
Details at http:
To unsubscribe, visit the List Server section of OSR Online at http:</http:></http:></http:></http:></http:></http:></winternl.h></mailto:xxxxx></mailto:xxxxx></https:></mailto:xxxxx></mailto:xxxxx></mailto:xxxxx></https:>