The simplest implementation is using a hash table with linked lists in
each hash bucket. My general rule of thumb is that I don’t want more
than around 10 entries in any hash bucket for acceptable performance.
If I’m getting more entries than that, either the table is too small
(lots of hash buckets are over the threshold) or the hash algorithm is
bad (a few hash buckets are over the threshold). To solve this issue
you can increase the size of the hash table, change your hash function
or move to trees in the individual hash buckets (e.g., splay trees,
which work fairly well in a situation like this.)
We published an article with sample code for a fixed size (256 entry in
the sample code) hash table that employs linked lists in the hash
buckets recently (http://www.osronline.com/article.cfm?article=443) and
the motivation was use in a file system. The table uses ERESOURCE
objects (one to cover the entire table in case someone needs to walk the
entire table and make changes to it, and then one per hash table
bucket.) The sample code doesn’t protect against duplicates (which is a
lot more complex to make efficient and correct) but if that’s a
requirement, I can explain it further.
Regards,
Tony
Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of amitr0
Sent: Thursday, May 18, 2006 8:58 AM
To: ntfsd redirect
Subject: Re: [ntfsd] list of FOs
aah, thanks yet again.
But tell me, in the hashing approach even, there would be collisions,
and I need to maintain a linked list for each hash bucket, so I cannot
avoif O(n) all the way, can I?
On 5/18/06, Dan Kyler wrote:
Multiple threads from the same process can be scheduled on different
processors.
----- Original Message -----
From: amitr0 mailto:xxxxx
To: Windows File Systems Devs Interest List mailto:xxxxx
Sent: Thursday, May 18, 2006 6:41 AM
Subject: Re: [ntfsd] list of FOs
Dan,
Thanks for the wonderful explanation. It helps.
"BTW, is this list a list of all the open file objects on the system?
If so, I would suggest that a linked list is the wrong approach (use a
hash table). Your performance will suffer greatly doing a O(n) linked
list search on every I/O. "
Yes, it is, and I am planning to implement the hashing mechanism, I
believe one such already exists in the filespy code fro reference. Just
that this was a prototype and was being used a a proof of concept.
Though I do absolutely agree that, even for a prototype it IS BAD
design.
one more thing, though this is offtopic in this thread, still:
in Windwos NT/2K/XP, how is multiprocessing implemented, the minimum
schedulable entity being a thread here, on a ,ultiprocessor machine,
does NT schedule all threads of a process on a single processor, or can
a single process also span over different processors while execution?
Sorry if it is a stupid question, but better learn late than learn
never.
amitr0
— Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17 You are currently subscribed
to ntfsd as: unknown lmsubst tag argument: ‘’ To unsubscribe send a
blank email to xxxxx@lists.osr.com
—
Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17
You are currently subscribed to ntfsd as: unknown lmsubst tag argument:
‘’
To unsubscribe send a blank email to xxxxx@lists.osr.com
–
- amitr0 — Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17 You are currently subscribed
to ntfsd as: unknown lmsubst tag argument: ‘’ To unsubscribe send a
blank email to xxxxx@lists.osr.com</mailto:xxxxx></mailto:xxxxx>