RE: Object oriented programming in NDIS (OFFTOPIC)

Interesting point you brought up, especially in the caching area! There is
a large amount of effort going on in that area right now. Mainly:

Cache caching
RAM caching
Disk caching

The CPU cache, be it primary or secondary, is expensive, hence in the
kilobytes range! And being the oldest form of cache it is well understood:
X-way associativity, sizing, management, and so on.

RAM cache, being cheap, is large, on the order of MB or GB; it could
probably lay out my DNA :-)

Disk cache is in between, some MBs.

RAM and disk caching, being rather larger spaces, are the targets of the
current efforts. A wide variety of access patterns is being observed, and
the control logic is becoming ever more sophisticated. If you have seen or
heard of hiccup probability, then you know :). And I'm trying to use some of
the arsenal I learnt a long time back; lucky indeed. I thought we were done
with caching, but it seems like there is always food for thought…

My apologies for being off-topic, but this is one heck of an intellectual
set of hackers :-).

-pro

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@3Dlabs.com
Sent: Friday, May 14, 2004 3:32 AM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Object oriented programming in NDIS

Hi, I am new to programming NDIS drivers… Can you tell me if I can use OO
programming? What are the disadvantages/advantages of doing that? Also, what
will be the fastest data structure to use for drivers? Can we use
associative binary trees? Thanks for your help.

You can use OO in drivers, but there are a few limitations, as well as
complications. For instance, it’s not possible to have “global” objects: all
objects have to be created dynamically, at runtime.
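
For illustration, a rough sketch of what that means in practice (the
Tracker class and names here are hypothetical, not from the DDK): there is
no CRT startup code in a driver to run constructors for file-scope objects,
so you keep a pointer and construct explicitly:

#include <ntddk.h>

class Tracker { /* ... whatever state you need ... */ };

/* a file-scope "Tracker g_tracker;" would never have its constructor
   run, so keep a pointer and construct it at runtime */
Tracker *g_tracker;

extern "C" NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject,
                                PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(DriverObject);
    UNREFERENCED_PARAMETER(RegistryPath);

    /* relies on a kernel operator new, as discussed below */
    g_tracker = new Tracker();
    return g_tracker ? STATUS_SUCCESS : STATUS_INSUFFICIENT_RESOURCES;
}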

You will also have to roll your own versions of some of the runtime library,
as it’s not supported by MS for the driver libraries. For instance, you’ll
need to create a pair of “new” and “delete” operators.
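
A minimal sketch of such a pair, assuming the DDK’s ExAllocatePoolWithTag
and ExFreePoolWithTag; the pool tag 'xDyM' is just a placeholder for
whatever four-character tag your driver uses:

#include <ntddk.h>

/* non-paged pool so the objects stay usable at elevated IRQL */
void *__cdecl operator new(size_t size)
{
    return ExAllocatePoolWithTag(NonPagedPool, size, 'xDyM');
}

void __cdecl operator delete(void *p)
{
    if (p)                            /* delete on NULL must be a no-op */
        ExFreePoolWithTag(p, 'xDyM');
}

A real driver would also want the array forms (new[]/delete[]), and perhaps
a per-class choice between paged and non-paged pool.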

Of course, as Guy suggested, you can use DriverStudio from NuMega to get
started.

Or you can use plain C.

We’ve got a large simulator written in C++ which is used to simulate our
next-generation graphics processor, and it interfaces to the plain C driver
that we have. There have been times when I’ve thought “this would be much
easier to do in C++” in the driver, but our driver is several dozen files,
and a lot of it is fairly complex, so it would take quite a lot of work to
re-engineer it to C++ all the way through; and if you don’t do it properly,
there’s not much point.

I’d say that the advantages of C are:

  • It interfaces directly to the DDK; you don’t need any ‘interface
    layers’.
  • I personally think it’s easier for the programmer to see what the code
    does.

The advantages of C++ would be:

  • Generic classes, for encapsulation and code reuse.
  • Better type checking.

Of course you can use associative binary trees. It’s generally more popular
to use hash tables or other methods that are more “directly accessible”, but
it all depends on how much data you’re trying to store and retrieve, what
the access patterns are, and so on. All algorithms for storing and
retrieving data are dependent on how you access the data and how much of it
is being stored.

Also, if you’re going to access a lot of the data at short intervals, and
want the cache to help the performance of the driver, you may want to
consider something that isn’t a linked list (which binary trees are a form
of). Linked lists are perfect for thrashing the cache, as each entry is
very likely to be on a different cache line. Consider the following code:

struct x
{
    struct x *next;              /* link to the next list entry */
    struct y *data;              /* separately allocated data */
} *list;

struct y
{
    int key;
    /* … some more data */
};

struct y *findKey(int key)
{
    struct x *p;

    /* each iteration touches both the link node and the data it
       points to: two cache lines per entry */
    for (p = list; p; p = p->next)
    {
        if (p->data->key == key)
            return p->data;
    }
    return NULL;
}

Now, it’s very likely that each allocation of struct x is on a different
cache line from the previous one. And if the data is not allocated close to
its struct x, it’s very likely not on the same cache line as the owning
struct x either. So every time through the loop we fetch TWO cache lines,
just to get a couple of pointers and one integer. On a modern processor that
means reading 128 or 256 bytes each time. And the only time the extra data
fetched is of any help is on the final iteration, where we find the correct
key; all the others are most likely just polluting the cache.

Of course, a slightly different (less generic) way of structuring the data,
as a single struct, would help a lot here. The above example is very likely
what a generic C++ implementation of a linked list would produce, while a C
implementation may well use a single struct for the data and the link(s),
and that would help a fair bit (half the memory bandwidth needed). Another
way would be to copy the key from the data into struct x, so that it’s easy
to access without the extra dereference.
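
To make that concrete, here is a sketch of the single-struct version of the
example above (the struct and variable names are my own):

#include <stddef.h>

struct node
{
    struct node *next;  /* link and data share one allocation, so the
                           search touches ONE cache line per entry */
    int key;
    /* … some more data */
} *list;

struct node *findKey(int key)
{
    struct node *p;

    for (p = list; p; p = p->next)
    {
        if (p->key == key)
            return p;
    }
    return NULL;
}

The key-copying variant buys the same thing during the search while keeping
the data allocation separate; the second dereference then happens only once,
on the matching entry.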

But a much better option would be a hash table where we can find the object
based on the key. If done correctly, the lookup phase itself causes no
memory accesses, since computing the hash is pure arithmetic on the key;
memory is only touched when probing the bucket.
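
A minimal sketch of that idea; the bucket count, the multiplicative hash,
and the names are my own choices, not anything from the DDK:

#include <stddef.h>

#define NBUCKETS 256               /* power of two keeps the masking cheap */

struct entry
{
    struct entry *next;            /* chain for colliding keys */
    int key;
    /* … some more data */
};

static struct entry *table[NBUCKETS];

static unsigned hashKey(int key)
{
    /* pure arithmetic on the key: no memory touched yet */
    return ((unsigned)key * 2654435761u) >> 24;
}

struct entry *lookup(int key)
{
    struct entry *e;

    /* one memory access to read the bucket head, then (with a decent
       hash and load factor) usually zero or one chain steps */
    for (e = table[hashKey(key)]; e; e = e->next)
    {
        if (e->key == key)
            return e;
    }
    return NULL;
}

With a reasonable load factor, a hit typically costs one or two cache-line
fetches in total, against one per node for the list walk above.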

It’s important to remember that modern processors have a very high
instruction throughput, but can easily be slowed to a crawl by accessing
memory “badly”. The latency of reading something that isn’t in any cache is
about 45 ns on an Opteron/Athlon64. If the processor runs at 2 GHz and has 3
execution units, it can issue roughly 3 * 2 * 45 = 270 instructions in that
time. So even a long instruction such as a divide or multiply is a lot
quicker than an uncached memory operation. Of course, if the data is already
in the L1 cache, that’s a completely different story.

