<rant:auxiliary>
I have often had to tell my students: “You can’t debug a concurrent system
into correctness. You can only design it correctly. Then, when it fails,
you know you made an error in the implementation of your design.”
The number of times I’ve seen concurrent code in application space
sprinkled with Sleep(1) calls is frighteningly high. If I ask “Why is
this Sleep() call here?” the answer is “It won’t work without it”, at which
point I say “No, it won’t work. The appearance of working is an
accident, but the fundamental design is wrong.” Often, I can find the
error within an hour; sometimes in under five minutes.
Or the “Why do you have a timeout on this WaitFor() call on the Mutex?”
and the answer is “Our program hangs if we don’t have a timeout”. Of
course, what they have written is
WaitForSingleObject(mutex, 5000);
// do things to the shared resource here…
which would be ROTFLMFAO if it weren’t so heartbreakingly sad that someone
would actually write code like this and think it made perfect sense.
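To spell out what is wrong with that fragment: the return value of the timed wait is thrown away, so after a timeout the code touches the shared resource without owning the mutex. Below is a minimal sketch of the checked version. It uses POSIX pthread_mutex_timedlock() as a portable stand-in for WaitForSingleObject() on a mutex handle, which is my substitution and not the code from the post; struct shared_counter and counter_increment() are hypothetical names made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical shared resource guarded by a mutex (illustration only). */
struct shared_counter {
    pthread_mutex_t lock;
    int value;
};

/* Increment the counter, waiting at most timeout_ms for the lock.
 * Returns 0 on success, or the error code from pthread_mutex_timedlock()
 * (ETIMEDOUT on timeout). On failure the counter is NOT touched. */
int counter_increment(struct shared_counter *c, long timeout_ms)
{
    struct timespec deadline;
    int rc;

    assert(c != NULL);

    /* pthread_mutex_timedlock() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    rc = pthread_mutex_timedlock(&c->lock, &deadline);
    if (rc != 0)
        return rc;      /* timed out or failed: we do not own the lock */

    c->value += 1;      /* safe: we hold the lock here */
    pthread_mutex_unlock(&c->lock);
    return 0;
}
```

The point is only that the wait result decides whether the protected code runs at all. Whether a timeout on a mutex makes sense in the first place is a design question the sketch does not answer.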
One of my classic exercises is a producer/consumer model with a shared
queue between processes, using a mutex and semaphore on the free pool and
the queue. I tell them to write
switch(WaitForSingleObject(Semaphore, SEM_TIMEOUT))
   { /* WFSO sem */
    case WAIT_TIMEOUT:
       return NULL; // This is the spec I give them: return NULL on timeout
    case WAIT_OBJECT_0:
       break;
    default:
       // report error via a specified interface here
       return NULL;
   } /* WFSO sem */
switch(WaitForSingleObject(Mutex, INFINITE))
   { /* WFSO mutex */
    case WAIT_OBJECT_0:
       break;
    default:
       // report error via a specified interface here
       // Do appropriate recovery
       return NULL;
   } /* WFSO mutex */
It is sad and amusing to watch them take the second one and replace the
switch with an if, because they only see two options at the moment, so
those must be the only two options. And for whom “do appropriate
recovery” means “do nothing”. When they get the project done, I create a
case in which the mutex gets abandoned, and ask what they plan to do about
it, and then I say “Now add a timeout to the mutex” and ask what they plan
to do about it. In 15 years, nobody has EVER reset the semaphore! And
this is after two days of lectures on concurrency, maintaining invariants,
and graceful error recovery! And deadlock, and deadlock detection
(heuristic) and deadlock recovery.
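The "reset the semaphore" point can be sketched as follows. Once the semaphore wait has succeeded, the caller has consumed one count; if the mutex wait then fails, that count has to be put back, or the semaphore no longer matches the number of queued elements. This is a simplified illustration under stated assumptions, not the exercise solution: it uses POSIX sem_timedwait(), sem_post() and pthread_mutex_timedlock() in place of the Win32 calls, it tracks only the "items" semaphore and leaves out the free pool, it handles a timeout but not an abandoned mutex, and struct queue, queue_init(), queue_put() and queue_get() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <time.h>

#define QUEUE_CAP 8                 /* hypothetical capacity */

struct queue {
    sem_t items;                    /* number of queued elements */
    pthread_mutex_t lock;           /* protects slots, head, count */
    void *slots[QUEUE_CAP];
    size_t head, count;
};

/* Absolute CLOCK_REALTIME deadline ms milliseconds from now. */
static struct timespec deadline_after_ms(long ms)
{
    struct timespec t;
    clock_gettime(CLOCK_REALTIME, &t);
    t.tv_sec += ms / 1000;
    t.tv_nsec += (ms % 1000) * 1000000L;
    if (t.tv_nsec >= 1000000000L) { t.tv_sec += 1; t.tv_nsec -= 1000000000L; }
    return t;
}

int queue_init(struct queue *q)
{
    q->head = q->count = 0;
    if (sem_init(&q->items, 0, 0) != 0)
        return -1;
    return pthread_mutex_init(&q->lock, NULL) == 0 ? 0 : -1;
}

/* Append an item; returns 0 on success, -1 if the queue is full or on error. */
int queue_put(struct queue *q, void *item)
{
    if (pthread_mutex_lock(&q->lock) != 0)
        return -1;
    if (q->count == QUEUE_CAP) {
        pthread_mutex_unlock(&q->lock);
        return -1;
    }
    q->slots[(q->head + q->count) % QUEUE_CAP] = item;
    q->count++;
    pthread_mutex_unlock(&q->lock);
    return sem_post(&q->items);     /* publish the new element */
}

/* Remove the oldest item, or return NULL on timeout or error. */
void *queue_get(struct queue *q, long timeout_ms)
{
    struct timespec deadline;
    void *item;

    assert(q != NULL);

    deadline = deadline_after_ms(timeout_ms);
    while (sem_timedwait(&q->items, &deadline) != 0) {
        if (errno != EINTR)
            return NULL;            /* timeout or error: nothing was claimed */
    }

    /* We now hold one count from the semaphore. */
    deadline = deadline_after_ms(timeout_ms);
    if (pthread_mutex_timedlock(&q->lock, &deadline) != 0) {
        sem_post(&q->items);        /* restore the invariant: give the count back */
        return NULL;
    }

    item = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return item;
}
```

In Win32 terms, the sem_post() on the failure path corresponds to a ReleaseSemaphore() call before returning NULL from the second switch. What "appropriate recovery" means for an abandoned mutex depends on what the previous owner was in the middle of doing, so the sketch does not attempt it.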
It is really, really important to understand the basic mechanisms involved
before haring off and throwing code at the problem in the hope that you
will stumble over the correct code sequence. This is most obvious in the
number of newbies who want to allocate kernel memory and map it into the
process, without even understanding how kernel memory works.
</rant:auxiliary>
Not to be a pedantic asshole, but I disagree. This isn’t a KMDF vs WDM
issue at all. It’s a lack of fundamental understanding of Windows I/O
Subsystem architecture. Context, DPCs, IRQLs, and one driver calling
another aren’t WDM concepts. They’re Windows I/O Subsystem concepts.
I’m not saying there aren’t some rough edges of WDM that leak into KMDF…
there are. But there are few.
This.
To me, this is just ONE MORE EXAMPLE of why you can’t learn to write
drivers the way you learn to write C# applications: By figuring it out as
you go along and cutting and pasting random shit from samples. I realize
that’s how a whole generation of software developers have learned how to
code. But that fundamentally DOES NOT WORK for driver development. There
are overall, basic, system-wide, architectural concepts that you need to
understand before you can “hack and whack” a driver into existence.
Without knowledge of these concepts, you have no hope of writing a driver
correctly.
OP: Do some reading about basic I/O Subsystem architecture and what the
KMDF constructs that you’re using (such as synch scope and execution
level) actually DO before you go off and try to write a driver. Did you
try to drive a car without first learning the rules of the road (which
side to drive on, which side to pass on, the speed limit, etc.)? Because
that’s the equivalent of what you’re doing here in driver land.
Peter
OSR
NTDEV is sponsored by OSR
For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars
To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer