Peter,
But, sadly, you’re right. I think we’ve reached the time to end the SSDT thread.
It saddens me to do so, though. I’m kinda interested in microkernel architectures.
Therefore, let’s start another thread - I am quite interested in this as well…
Consider the model where an application doesn’t allocate the buffer in an arbitrary location,
but rather allocates it from a central pool of such buffers managed by the system.
This type of scheme has the advantage that you can enforce constraints such as "one owner at a time"… all very clever stuff, and the data is effectively "pre-marshaled" and can be passed around without copying.
IMHO, currently known microkernel implementations, with all their ports, pipes, messages, etc., rely upon a concept that is flawed in itself, and trying to improve it is a pretty much useless idea. I think what we really need in order to design a high-performance microkernel is simply to rethink the concepts of the process/thread and of the address space (ironically, the former part has already been done in Linux, so microkernels may take advantage of it).
First of all, let's ask ourselves a question: assuming that the address space is shared, can module X make any use of the fact that it has full access to module Y's code and data, provided that they are not in the same compilation unit and the code/data in question is not exported by module Y? Obviously not. Therefore, even if we unmap module Y's code and data from the target address space while module X runs, we are not going to limit the latter's functionality in any way (apart, of course, from the ability of modules to register callbacks directly with one another).
Every schedulable execution unit in existence can have its own page directory and descriptor table. Therefore, you can think of threads as independent processes that just happen to share the address space with their parent for read-write access (i.e. the way Linux thinks of threads) only because their PDEs and PTEs are set up this way. However, modifying thread X's PDE is not going to affect thread Y's in any way.
From now on, I am going to refer to any schedulable execution unit with the term "task", and to the code that it currently runs with the term "module". Every task can have its address space subdivided into a module area, a task-private area and a kernel area, with all these areas residing at pre-defined base addresses. Only one module may be mapped into a given task's PDE at any particular moment:
- Module area. This area holds the currently running module's code and data (including dynamic allocations and mappings of shared memory segments), and is shared by all tasks that currently run a given module, which may mean either multiple requests processed by a driver/server module, or multiple threads of a multithreaded application's module. Therefore, all tasks that execute a given module's code have to synchronize their accesses to the module's global/static data. Since they share the same address space, they also happen to share semaphore indexes, so there is no problem here whatsoever. If the module in question happens to be a device driver, device memory may be mapped into this area.
- Task-private area. This area holds the call stack, plus the IO buffers that server and client receive upon read()/write()/ioctl() requests and mmap() requests, respectively.
- Kernel area.
Please note that apps/servers/drivers are totally unaware of the very existence of the module and task-private areas - the only one who knows about them is the kernel itself.
What has to be done by the kernel when app/server/driver X calls server/driver Y under this model? Not that much. For example, let's look at what the kernel has to do when processing read/write/ioctl calls:
- Check whether the IO buffers are in the module area and, if they are, map them into the task-private area and record this fact - we will need it upon returning. Please note that they may already be resident in the task-private area (for example, passed on the stack), in which case no mapping is needed.
- Replace the module area in the task's page directory with module Y's one, and forward execution to module Y's code (the addresses of the IO buffers that module Y receives upon this call are already in the task-private area).
- After module Y returns control, restore the module area in the task's page directory and make it refer to module X's area again. If we had to map IO buffers into the task-private area before calling module Y (this is why we recorded it), unmap them.
- Return to the original caller.
As you can see, the whole thing involves just updating a few pointers in the target task's page directory (the call chain may be pretty long, so we will have to record callers in a stack-like fashion).
If you think about it carefully, you will realize that basically it is still the same monolithic kernel (at least in terms of performance), because we don't need to wait for port messages, copy data around or block execution - everything happens in the same address space and in the context of the same task. At the same time, it is a microkernel, because all drivers/servers are isolated from one another. In the above example, module Y has no chance to screw up module X's code and data whatsoever, because module X's area is unmapped from the task's address space while module Y executes, which is particularly useful if module X happens to be a driver with device memory mapped into its module area.
Anton Bassov