"Thread-safe" hashing of FsContext

Hi,

Many of us use hashing tables for tracking fileobjects. What would be
the correct way (read: thread-safe) to “lock” a fileobject while the
filter driver is using/analysing the IRP_MJ_* actions that use such a
fileobject. Also, does this work while analysing the file (content) that
is associated with the fileobject ?


Bartjan.

Bartjan:

First of all, I’m using (on the pitch from Max Shatskih) generic tables to
track
FOs. But the problem is common. So, I guard adds/lookups/removes in my
tracking
list with fast mutexes (just make sure that your code/data is in non-paged
pool
while you’re holding the lock). To make sure that data associated with the
FO
“survives” when I release the lock and continue working with it outside the
“guarded” code I use addref/release mechanizm, so my associated data will be
destroyed only when refcount drops to zero. And addref and release are also
performed in the guarded code. This method has been working for me so far so
good.

BTW: Has anybody really evaluated efficiency of generic tables against
hash tables
for FO tracking?

Regards,

Vladimir

-----Original Message-----
From: Bartjan Wattel [mailto:xxxxx@zeelandnet.nl]
Sent: Monday, April 15, 2002 7:27 AM
To: File Systems Developers
Subject: [ntfsd] “Thread-safe” hashing of FsContext

Hi,

Many of us use hashing tables for tracking fileobjects. What would be
the correct way (read: thread-safe) to “lock” a fileobject while the
filter driver is using/analysing the IRP_MJ_* actions that use such a
fileobject. Also, does this work while analysing the file (content) that
is associated with the fileobject ?


Bartjan.


You are currently subscribed to ntfsd as: xxxxx@Starbase.com
To unsubscribe send a blank email to %%email.unsub%%

Most approximately balanced binary trees have the advantage of a worst case lookup time of O(2 * log n + 1). I believe that is what the generic table package uses, but I’m not certain. Hash tables in the worst case may degrade to linear behavior, or O(n) if everything hashes to the same slot. The question is what kind of worst case performance is acceptable to your users?

-----Original Message-----
From: Chtchetkine, Vladimir [mailto:xxxxx@starbase.com]
Sent: Monday, April 15, 2002 9:44 AM
To: File Systems Developers
Subject: [ntfsd] RE: “Thread-safe” hashing of FsContext

Bartjan:

First of all, I’m using (on the pitch from Max Shatskih) generic tables to track
FOs. But the problem is common. So, I guard adds/lookups/removes in my tracking
list with fast mutexes (just make sure that your code/data is in non-paged pool
while you’re holding the lock). To make sure that data associated with the FO
“survives” when I release the lock and continue working with it outside the
“guarded” code I use addref/release mechanizm, so my associated data will be
destroyed only when refcount drops to zero. And addref and release are also
performed in the guarded code. This method has been working for me so far so good.

BTW: Has anybody really evaluated efficiency of generic tables against hash tables
for FO tracking?

Regards,

Vladimir

-----Original Message-----
From: Bartjan Wattel [mailto:xxxxx@zeelandnet.nl]
Sent: Monday, April 15, 2002 7:27 AM
To: File Systems Developers
Subject: [ntfsd] “Thread-safe” hashing of FsContext

Hi,

Many of us use hashing tables for tracking fileobjects. What would be
the correct way (read: thread-safe) to “lock” a fileobject while the
filter driver is using/analysing the IRP_MJ_* actions that use such a
fileobject. Also, does this work while analysing the file (content) that
is associated with the fileobject ?


Bartjan.


You are currently subscribed to ntfsd as: xxxxx@Starbase.com
To unsubscribe send a blank email to %%email.unsub%%


You are currently subscribed to ntfsd as: xxxxx@inin.com
To unsubscribe send a blank email to %%email.unsub%%

>> BTW: Has anybody really evaluated efficiency of generic tables
against hash tables
for FO tracking? <<

Yes. Hashing adds little performance that is not eaten up with the math.
Generic tables are implemented as a splay tree (MRU binary tree). For
file objects or FsContext pointers, splay trees seem to work relatively
fast. However, we have implemented the following index methods to
evaluate performance in general:

  • Splay tree
  • Red Black tree
  • AVL tree
  • BTREE

Our results show that the performance is in this order:

  • BTREE
  • AVL
  • RedBlack
  • Splay

So, needless to say, we are using our BTREE implementation. The nice
thing about BTREEs are that they are tweakable; changing page sizes to
select the best performance for your specific implementation.

As a general rule, we tend to steer clear of hashing. With hashing,
there is always the potential for generating an ambiguous key and having
to handle this duplicate case. With a binary tree, this is not an issue.

Jamey

-----Original Message-----
From: Bartjan Wattel [mailto:xxxxx@zeelandnet.nl]
Sent: Monday, April 15, 2002 7:27 AM
To: File Systems Developers
Subject: [ntfsd] “Thread-safe” hashing of FsContext

Hi,

Many of us use hashing tables for tracking fileobjects. What would be
the correct way (read: thread-safe) to “lock” a fileobject while the
filter driver is using/analysing the IRP_MJ_* actions that use such a
fileobject. Also, does this work while analysing the file (content) that

is associated with the fileobject ?


Bartjan.


You are currently subscribed to ntfsd as:
xxxxx@Starbase.com
To unsubscribe send a blank email to %%email.unsub%%


You are currently subscribed to ntfsd as: xxxxx@storagecraft.com
To unsubscribe send a blank email to %%email.unsub%%

Generic tables are not balanced trees; to the contrary, they are only
balanced (if balanceable at all) when the data access patterns are as
random as possible.

Generic table is implemented as a splay tree. Splay trees are MRU; it
puts the most recently used node at the top of the table on insert and
lookup. If you do lots of linear accesses to the table, the tree will
flatten and become as slow as using a linked list.

Tracking file objects via a splay tree is fairly fast because you do see
a some-what random pattern and a relatively small tree (a few hundred
items), but it would not take much to throw it off.

AVL trees are very fast and very efficient: O(log N) if I remember
correctly, but a BTREE will provide the best performance at the expense
of a more complex development effort.

Jamey

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Fuller, Rob
Sent: Monday, April 15, 2002 8:09 AM
To: File Systems Developers
Subject: [ntfsd] RE: “Thread-safe” hashing of FsContext

Most approximately balanced binary trees have the advantage of a worst
case lookup time of O(2 * log n + 1). I believe that is what the
generic table package uses, but I’m not certain. Hash tables in the
worst case may degrade to linear behavior, or O(n) if everything hashes
to the same slot. The question is what kind of worst case performance
is acceptable to your users?

-----Original Message-----
From: Chtchetkine, Vladimir [mailto:xxxxx@starbase.com]
Sent: Monday, April 15, 2002 9:44 AM
To: File Systems Developers
Subject: [ntfsd] RE: “Thread-safe” hashing of FsContext

Bartjan:

First of all, I’m using (on the pitch from Max Shatskih) generic tables
to track
FOs. But the problem is common. So, I guard adds/lookups/removes in my
tracking
list with fast mutexes (just make sure that your code/data is in
non-paged pool
while you’re holding the lock). To make sure that data associated with
the FO
“survives” when I release the lock and continue working with it outside
the
“guarded” code I use addref/release mechanizm, so my associated data
will be
destroyed only when refcount drops to zero. And addref and release are
also
performed in the guarded code. This method has been working for me so
far so good.

BTW: Has anybody really evaluated efficiency of generic tables against
hash tables
for FO tracking?

Regards,

Vladimir

-----Original Message-----
From: Bartjan Wattel [mailto:xxxxx@zeelandnet.nl]
Sent: Monday, April 15, 2002 7:27 AM
To: File Systems Developers
Subject: [ntfsd] “Thread-safe” hashing of FsContext

Hi,

Many of us use hashing tables for tracking fileobjects. What would be
the correct way (read: thread-safe) to “lock” a fileobject while the
filter driver is using/analysing the IRP_MJ_* actions that use such a
fileobject. Also, does this work while analysing the file (content) that

is associated with the fileobject ?


Bartjan.


You are currently subscribed to ntfsd as:
xxxxx@Starbase.com
To unsubscribe send a blank email to %%email.unsub%%


You are currently subscribed to ntfsd as: xxxxx@inin.com
To unsubscribe send a blank email to %%email.unsub%%


You are currently subscribed to ntfsd as: xxxxx@storagecraft.com
To unsubscribe send a blank email to %%email.unsub%%

RE: [ntfsd] “Thread-safe” hashing of FsContextNo.
Rtl generic table is “splay tree”. It always pulls the object which was hit by the lookup to the very top.
It is not balanced.

Under strict performance requirements, we implemented AVL and red-black trees instead of using the Rtl generic table. Its another drawback is that it cannot allow the items to be allocated from the lookaside.

So, for very performance-critical paths, do not use Rtl generic table. AVL tree seems to be great. Much (times) faster if you have at least 10.000 items in the table which are accessed on most read/write operations.
Red-black trees are slower then AVL ones. Maybe under heavy-on-delete pattern, RB tree would be faster, since AVL can require huge number of tree turns during delete.

Max

----- Original Message -----
From: Fuller, Rob
To: File Systems Developers
Sent: Monday, April 15, 2002 7:09 PM
Subject: [ntfsd] RE: “Thread-safe” hashing of FsContext

Most approximately balanced binary trees have the advantage of a worst case lookup time of O(2 * log n + 1). I believe that is what the generic table package uses, but I’m not certain. Hash tables in the worst case may degrade to linear behavior, or O(n) if everything hashes to the same slot. The question is what kind of worst case performance is acceptable to your users?

-----Original Message-----
From: Chtchetkine, Vladimir [mailto:xxxxx@starbase.com]
Sent: Monday, April 15, 2002 9:44 AM
To: File Systems Developers
Subject: [ntfsd] RE: “Thread-safe” hashing of FsContext

Bartjan:

First of all, I’m using (on the pitch from Max Shatskih) generic tables to track
FOs. But the problem is common. So, I guard adds/lookups/removes in my tracking
list with fast mutexes (just make sure that your code/data is in non-paged pool
while you’re holding the lock). To make sure that data associated with the FO
“survives” when I release the lock and continue working with it outside the
“guarded” code I use addref/release mechanizm, so my associated data will be
destroyed only when refcount drops to zero. And addref and release are also
performed in the guarded code. This method has been working for me so far so good.

BTW: Has anybody really evaluated efficiency of generic tables against hash tables
for FO tracking?

Regards,

Vladimir

-----Original Message-----
From: Bartjan Wattel [mailto:xxxxx@zeelandnet.nl]
Sent: Monday, April 15, 2002 7:27 AM
To: File Systems Developers
Subject: [ntfsd] “Thread-safe” hashing of FsContext

Hi,

Many of us use hashing tables for tracking fileobjects. What would be
the correct way (read: thread-safe) to “lock” a fileobject while the
filter driver is using/analysing the IRP_MJ_* actions that use such a
fileobject. Also, does this work while analysing the file (content) that
is associated with the fileobject ?


Bartjan.


You are currently subscribed to ntfsd as: xxxxx@Starbase.com
To unsubscribe send a blank email to %%email.unsub%%


You are currently subscribed to ntfsd as: xxxxx@inin.com
To unsubscribe send a blank email to %%email.unsub%%

You are currently subscribed to ntfsd as: xxxxx@storagecraft.com
To unsubscribe send a blank email to %%email.unsub%%