Don,
These fixations are a common problem; I see them in applications all the
time. The horror stories of bad code based on “that’s how we always did
it” are too numerous to go into. The unstated qualifier is “On Unix” or “On VMS”.
And they are usually based on specious interpretations of requirements.
For example, “We don’t use short functions, because function calls are too
expensive” (175ps is expensive?), or “macros are better than inline
functions because…” (some silly reason), or “We don’t use threads
because they’re too expensive” (unstated: “We don’t use threads because
our programmers are clueless” or “fork() was expensive, so threads must be
expensive”). I find the attitudes about programming are those of people who
learned on PDP-11s, which is not surprising, because most of them were
either taught by old PDP-11 programmers or learned from books
written by PDP-11 programmers.
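To put the macro-versus-inline complaint in concrete terms, here is a
minimal sketch (MAX_MACRO and max_inline are names made up for
illustration). The inline function normally generates the same code as the
macro on any current compiler, and it does not have the macro’s
double-evaluation trap:

    #include <stdio.h>

    /* Function-like macro: arguments are textually substituted, so an
       argument with a side effect can be evaluated more than once. */
    #define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

    /* Equivalent inline function: each argument is evaluated exactly once,
       and the call itself is normally folded away by the compiler. */
    static inline int max_inline(int a, int b) { return a > b ? a : b; }

    int main(void)
    {
        int i = 0, j = 0;
        int m = MAX_MACRO(i++, -5);   /* i++ runs twice: i ends up 2, m is 1 */
        int n = max_inline(j++, -5);  /* j++ runs once:  j ends up 1, n is 0 */
        printf("macro:  i=%d m=%d\n", i, m);
        printf("inline: j=%d n=%d\n", j, n);
        return 0;
    }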
I get questions all the time of the form “How do I efficiently do X on
a large file?” So I ask, “How large is a ‘large’ file?” and get answers
like “Ten megabytes”. To which I reply, “Oh, you mean a TINY file. I
thought you meant a LARGE file. Read the whole file into a buffer and
work on it in memory.” To me, large files are measured in gigabytes,
medium files in hundreds of megabytes. Ten megabytes is tiny. There is
simply no understanding of the consequences of
multiple-orders-of-magnitude changes in memory speed, instruction
execution speed, or address space size on solving a problem. This is why
we get “patch the IDT” solutions, because they’re “faster”.
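For the tiny-file case, the whole technique is a few lines of portable C.
A minimal sketch (read_whole_file is a name made up for illustration,
error handling abbreviated):

    #include <stdio.h>
    #include <stdlib.h>

    /* Read an entire file into one heap buffer; the caller frees it.
       For a "tiny" ten-megabyte file this is a single allocation and a
       single read, and everything after that is an in-memory problem. */
    static char *read_whole_file(const char *path, size_t *out_size)
    {
        FILE *f = fopen(path, "rb");
        if (f == NULL)
            return NULL;

        fseek(f, 0, SEEK_END);
        long size = ftell(f);           /* file length in bytes */
        fseek(f, 0, SEEK_SET);

        char *buf = (size >= 0) ? malloc((size_t)size) : NULL;
        if (buf != NULL && fread(buf, 1, (size_t)size, f) != (size_t)size) {
            free(buf);
            buf = NULL;
        }
        fclose(f);

        if (buf != NULL && out_size != NULL)
            *out_size = (size_t)size;
        return buf;
    }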
You are precisely right to ask for the specs; many of these designs are
based on solving nonexistent performance problems caused by a failure to
understand either the performance requirements of the problem domain or
the capabilities of the machines. It’s like everyone thinks we’re still
programming PDP-11s. The concepts of 2.8GHz pipelined superscalar
architectures with speculative execution, multilevel caches, and dozens of
other performance enhancements do not seem to have impacted their
consciousness.
In my teaching, I had one C instructor catch me at a break and ask me to
explain the difference between stack storage, static storage, and heap
storage (scary!). With disconnects like this, it is not surprising that
many programmers are confused.
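For the record, the distinction fits in a dozen lines of C (the names are
made up for illustration):

    #include <stdlib.h>

    static int call_count;          /* static storage: one copy for the whole
                                       program, zero-initialized, lives until
                                       the program exits                     */
    void example(void)
    {
        int local = 42;             /* stack storage: created when the
                                       function is entered, gone when it
                                       returns                               */
        int *heap = malloc(sizeof *heap);   /* heap storage: lives until it
                                               is explicitly freed           */
        if (heap != NULL) {
            *heap = local + call_count++;
            free(heap);             /* the caller controls the lifetime      */
        }
    }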
This is one of the reasons I tend to make statements of the form “X is
bad” (for some value of X), because it limits the scope of mistakes. By
the time they figure out that X may be the only solution, they have enough
years of experience to do the job right, and the judgment to have
determined that they need to do X.
I find that the less experience a programmer has, the more likely he/she
is to create a complex solution to a simple problem. Years of untraining
new programmers taught me this part.
joe
What bothers me about this whole discussion is that we have never gotten
the constraints of the problem:
- How big are the data blocks being transferred? With today’s systems
  a buffer copy is pretty cheap (see the sketch after this list).
- What is the rate of data requests? I can easily get 100K IOCTLs
  through a KMDF driver, double that with WDM preprocessing, and an
  order of magnitude improvement over the original with FastIO.
- Can the hardware read partial data sets (be they packets or
  whatever)? I’ve seen a situation where I could read the data in chunks
  as a software-driver form of scatter/gather.
- How CPU-bound are the system(s) in question? Even with the overhead,
  depending on the data size and request rate it may be better to copy
  things.
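On the buffer-copy point, a quick measurement shows what “pretty cheap”
means. This is a rough sketch in plain C; the buffer size and iteration
count are made up for illustration:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    int main(void)
    {
        size_t size = 64 * 1024;                /* one transfer-sized buffer */
        int    iterations = 100000;
        char  *src = malloc(size), *dst = malloc(size);
        if (src == NULL || dst == NULL)
            return 1;
        memset(src, 0xAB, size);

        clock_t start = clock();
        for (int i = 0; i < iterations; i++)
            memcpy(dst, src, size);
        clock_t end = clock();

        double total = (double)(end - start) / CLOCKS_PER_SEC;
        printf("%d copies of %zu bytes: %.3f s (%.2f us per copy), last byte %d\n",
               iterations, size, total, total * 1e6 / iterations,
               dst[size - 1]);
        free(src);
        free(dst);
        return 0;
    }

On typical current hardware this lands in the low microseconds per copy,
which is the scale “pretty cheap” refers to.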
Basically, this seems to be another “We did it this way on Linux, so we
have to do it this way on Windows” situation. In the last of these I
encountered, I was told that the application that accessed the shared
memory was so complex that they had to do it that way. I asked for a day
to review the application. The next day they said, “You see, it was
terrible,” and I replied, “Here’s a modified source that does it with
IOCTLs; once I analyzed the code, it took me an hour to change your
application and get it to run on Windows!” It is disturbing how many
times I have encountered this sort of situation.
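For anyone who has not made that conversion, the user-mode side of an
IOCTL-based design is not much code. A minimal sketch, with a hypothetical
device name (\\.\MyDevice) and control code (IOCTL_MYDEV_GET_DATA) standing
in for whatever the real driver defines:

    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>

    /* Hypothetical control code; the real driver defines its own. */
    #define IOCTL_MYDEV_GET_DATA \
        CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_READ_ACCESS)

    int main(void)
    {
        HANDLE hDevice = CreateFileW(L"\\\\.\\MyDevice",
                                     GENERIC_READ | GENERIC_WRITE, 0, NULL,
                                     OPEN_EXISTING, 0, NULL);
        if (hDevice == INVALID_HANDLE_VALUE) {
            printf("CreateFile failed: %lu\n", GetLastError());
            return 1;
        }

        char  buffer[4096];
        DWORD bytesReturned = 0;

        /* One round trip to the driver; with METHOD_BUFFERED the I/O manager
           does the user/kernel buffer copy and the probing for you. */
        if (DeviceIoControl(hDevice, IOCTL_MYDEV_GET_DATA,
                            NULL, 0, buffer, sizeof(buffer),
                            &bytesReturned, NULL))
            printf("got %lu bytes\n", bytesReturned);
        else
            printf("DeviceIoControl failed: %lu\n", GetLastError());

        CloseHandle(hDevice);
        return 0;
    }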
Don Burn
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr