Hi, Feng Xinyu,
Besides being a kernel developer at Compuware/Numega, I’m also a professor
of computer science. So, I’m interested in your research ! If you want, you
can check out my web page at http://www.rivier.edu/faculty/amoreira/web.
I find that bugs related to shared multiprocessor systems are the hardest
ones to pinpoint. There’s a pretty decent tool out there, it’s called
“TotalView”, that gives a shot at exposing some of the inner workings of an
SMP system and at presenting a coherent vision of the system’s
multiprocessing operation. If you look it up with Google you will find their
web site without a problem.
The other problem of course is that hardware is non-deterministic, and it
can - and does - fail. The issue isn’t so much to diagnose frequent
problems, it’s really the odd infrequent ones that eat up much resources.
For example, I once had a graphics chip that misbehaved whenever there were
two back-to-back Interrupt Acknowledge cycles on the I/O bus, and that was a
pig to find. So, maybe you want to consider techniques that not only work in
C or C++, but also HDL or VHDL ? Mind you, debugging Verilog code is way
harder than debugging C or C++, and the parallel nature of hardware
definition languages makes it even harder.
Memory safety is also important, but here there’s some pretty decent tools
out there. If however you can find a new way of doing it, for example,
something that can be used without writing to that memory, I bet a lot of
people will raise their eyebrows. In fact, one of the issues we have with
debugging today is that most debugging techniques are destructive: they
require the debugging software to plant “hooks” inside the code, so that the
debugger can seize control at key points and do things such as breakpoints,
single stepping, or even debug output. However, this is a jolly pain, we run
into multiprocessor synchronization issues all the time because planting
those hooks is a delicate operation and very error prone, we keep being
surprised by multiprocessor synchronization problems very often !
However, this is a bit far from formal verification methods, unless of
course you want to be able to verify the health of a debugger’s hooking
mechanism.
Another issue we bump into over and again is the issue of an API call
failing. For example, you try to allocate memory and there isn’t enough, and
the driver is ill-prepared to handle that failure. A formal method of
evaluating driver correctness under stress conditions would be an invaluable
tool: you run the tool on the driver, and it pinpoints potential failures
due to API calls failing. For example, sometimes we have to run stress test
on a piece of code, and it takes hours: so, we leave it running overnight,
come back the next morning, and we find that the code failed about 4:30am
and somehow we failed to catch much of the failure’s original state. If a
verification method can expose these problems, it’ll be an instant success !
In that line, anything that contributes to “unintermittentlizing” an
intermittent problem will also be a great contribution, many of the problems
we get hammered on by our customers tend to be of the intermittent sort,
which makes them very hard to duplicate. Right now, we must duplicate every
error we find before we can fix it - but if a formal verification tool could
expose risky code and forecast those for us, again, that would be a great
contribution to the state of the art.
If you want, I’ll be here, you can email me privately at xxxxx@ieee.org
if you want to talk further about this issue. Feel free to contact me !
Alberto.
-----Original Message-----
From: Xinyu Feng [mailto:xxxxx@yale.edu]
Sent: Wednesday, November 05, 2003 2:43 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] Verification of Device Drivers
Dear All:
I’m a Ph.D. student at the Department of Computer Science, Yale
Univ. One of our ongoing project is to use formal methods to verify the
correctness of the Device drivers/System BIOS. My question is, as device
driver developers/programmers, what kind of properties do you interested
in your program? In other words, what kind of problems/bugs do you
encounter most frequently? If I have an ideal tool that can verify any
properties you want (e.g. memory safety, type safety, deadlock free,
etc), what kind of properties do you want to verify?
You response would be highly appreciated!
Best,
– Xinyu
Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256
You are currently subscribed to ntdev as: xxxxx@compuware.com
To unsubscribe send a blank email to xxxxx@lists.osr.com
The contents of this e-mail are intended for the named addressee only. It
contains information that may be confidential. Unless you are the named
addressee or an authorized designee, you may not copy or use it, or disclose
it to anyone else. If you received it in error please notify us immediately
and then destroy it.