Hi everyone. I am developing a device driver for a
speed-hungry device that sends and receives large amounts
of data. This made me search for ways to avoid copying
between the application and the driver. I control the
development of both the app and the driver.
My first version of the code used METHOD_NEITHER
on both the transmit and the receive side, and completed
the IRPs in the dispatch functions (synchronously). Putting
the transmit aside (it is easy to code), the code for the
IOCTL_RECEIVE dispatch function COPIED the data into the
app’s buffer from a queue of ExAllocatePool’d buffers that
were allocated from within the DPC (in response to
interrupts indicating that data has arrived). In pseudo-code:
DPC:
    Allocate buffer
    Read data from card into buffer
    Add buffer into queue
IOCTL_RECEIVE:
    Extract buffer from queue
    Copy data from buffer into user space buffer
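To make the cost of this first approach concrete, here is a minimal user-mode model of it in portable C. It is only a sketch, not driver code: malloc/free stand in for the pool allocations, the names Packet, PacketQueue, dpc_receive and ioctl_receive are hypothetical, and the locking a real DPC and dispatch routine would need is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One received packet held in a driver-owned buffer (hypothetical layout). */
typedef struct Packet {
    struct Packet *next;
    size_t len;
    unsigned char data[];   /* flexible array member (C99) */
} Packet;

/* FIFO of packets waiting for the app to ask for them. */
typedef struct {
    Packet *head;
    Packet *tail;
} PacketQueue;

/* Models the DPC: allocate a buffer, copy the data "from the card", enqueue.
 * Returns 0 on success, -1 if the allocation fails. */
int dpc_receive(PacketQueue *q, const void *card_data, size_t len)
{
    Packet *p;
    assert(q != NULL && card_data != NULL);
    p = malloc(sizeof *p + len);
    if (p == NULL)
        return -1;
    p->next = NULL;
    p->len = len;
    memcpy(p->data, card_data, len);    /* copy 1: card -> queued buffer */
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    return 0;
}

/* Models the IOCTL_RECEIVE dispatch: dequeue, copy to the app, free.
 * Returns the number of bytes copied, or 0 if the queue is empty. */
size_t ioctl_receive(PacketQueue *q, void *user_buf, size_t user_len)
{
    Packet *p;
    size_t n;
    assert(q != NULL && user_buf != NULL);
    p = q->head;
    if (p == NULL)
        return 0;
    q->head = p->next;
    if (q->head == NULL)
        q->tail = NULL;
    n = p->len < user_len ? p->len : user_len;
    memcpy(user_buf, p->data, n);       /* copy 2: queued buffer -> app */
    free(p);
    return n;
}
```

Every packet pays for one allocation, two copies and one free, which is the overhead the rest of this post tries to remove.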
This of course made the receiving side ‘heavier’
than the transmitting one, because of the extra
allocate-copy-free cycle involved for each data packet.
I thus decided to change the IOCTL_RECEIVE mode to
METHOD_OUT_DIRECT and the usage model from synchronous
to asynchronous. The app issues a number of
IOCTL_RECEIVEs at the beginning, and then waits in
WaitForMultipleObjects. The IOCTL_RECEIVE dispatch
function puts the IRP in a queue of pending receive
requests and returns STATUS_PENDING. When receive-data
interrupts arrive from the card, the DPC removes a
pending IRP from the queue, reads the data into its
(locked, due to METHOD_OUT_DIRECT) buffer,
and completes the IRP. In pseudo-code:
IOCTL_RECEIVE:
    Put IRP in receive-requests queue
    return STATUS_PENDING
DPC:
    Take IRP from queue
    Read data directly from card into buffer
    IoCompleteRequest
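The pending-request scheme can be modeled the same way in portable C. Again this is only a sketch under assumptions: RecvRequest stands in for a pended IRP and its locked buffer, MAX_PENDING is an arbitrary depth, and all names are hypothetical. The point it illustrates is that the driver side takes requests strictly in FIFO order and performs a single copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PENDING 8   /* hypothetical queue depth */

/* Stand-in for a pended IRP: the locked user buffer plus completion state. */
typedef struct {
    void  *buffer;      /* destination the DPC writes into */
    size_t capacity;
    size_t bytes_done;  /* filled in when the request completes */
    int    completed;
} RecvRequest;

/* Fixed-size circular FIFO of pending requests. */
typedef struct {
    RecvRequest *slot[MAX_PENDING];
    size_t head;        /* index of the oldest pending request */
    size_t count;
} PendingQueue;

/* Models the IOCTL_RECEIVE dispatch: queue the request and "return pending".
 * Returns 0 on success, -1 if the queue is full. */
int ioctl_receive_pend(PendingQueue *q, RecvRequest *r)
{
    assert(q != NULL && r != NULL);
    if (q->count == MAX_PENDING)
        return -1;
    r->completed = 0;
    r->bytes_done = 0;
    q->slot[(q->head + q->count) % MAX_PENDING] = r;
    q->count++;
    return 0;
}

/* Models the DPC: take the oldest request, write the data straight into
 * its buffer (the single copy), and mark it complete.
 * Returns the completed request, or NULL if nothing was pending. */
RecvRequest *dpc_complete_next(PendingQueue *q, const void *card_data, size_t len)
{
    RecvRequest *r;
    assert(q != NULL && card_data != NULL);
    if (q->count == 0)
        return NULL;    /* no pending request to receive into */
    r = q->slot[q->head];
    q->head = (q->head + 1) % MAX_PENDING;
    q->count--;
    r->bytes_done = len < r->capacity ? len : r->capacity;
    memcpy(r->buffer, card_data, r->bytes_done);
    r->completed = 1;
    return r;
}
```

In a real driver the queue would also need a spin lock and cancel handling, both omitted here.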
Naturally, when the WaitForMultipleObjects wakes up,
the app quickly issues another pending request
(IOCTL_RECEIVE) to compensate for the completed one,
before it starts reading the arrived data.
This seemed to work (in terms of the rate achieved),
but unfortunately I quickly realized that it couldn’t
be relied on. Why? Because the order in which data arrives
IS IMPORTANT, and sadly, the I/O manager doesn’t
‘SetEvent’ in the same order in which the driver calls
IoCompleteRequest. Even though the machine is a
uniprocessor, for reasons unknown to me, under
moderate CPU load, something like this eventually happens:
App     IOCTL_RECEIVE bufferA
Driver  (Dispatch function) Queue IRPofBufferA
App     IOCTL_RECEIVE bufferB
Driver  (Dispatch function) Queue IRPofBufferB
App     WaitForMultipleObjects(2, …)
Driver  (DPC)
            IoCompleteRequest(IRPofBufferA)
            IoCompleteRequest(IRPofBufferB)
App
            WaitForMultipleObjects returns with B signaled
            WaitForMultipleObjects returns with A signaled
This reversal of notifications of course messes
things up, requires special code in the app to cope with it,
and generally makes the whole thing not worthwhile.
(Take into account that the first approach doesn’t
require cancel processing…)
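For illustration, here is one shape the ‘special code’ in the app might take. It assumes that the driver really does fill and complete requests in the order it queued them, so the data order equals the issue order. The app then tracks its outstanding requests in issue order and only consumes the oldest one, no matter which notification it happens to see first. This is a portable C model with hypothetical names (Outstanding, IssueRing, RING_SIZE); the signaled flag stands in for the request’s event, and the ids are assumed to be non-negative.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 4     /* hypothetical number of outstanding requests */

/* App-side view of one outstanding IOCTL_RECEIVE. */
typedef struct {
    int signaled;       /* stands in for the request's event being set */
    int id;             /* label for the buffer, illustration only (>= 0) */
} Outstanding;

/* Outstanding requests, kept in the order they were issued. */
typedef struct {
    Outstanding slot[RING_SIZE];
    size_t oldest;      /* index of the oldest request not yet consumed */
    size_t count;
} IssueRing;

/* Record a newly issued request at the tail. Returns 0, or -1 if full. */
int ring_issue(IssueRing *r, int id)
{
    size_t tail;
    assert(r != NULL && id >= 0);
    if (r->count == RING_SIZE)
        return -1;
    tail = (r->oldest + r->count) % RING_SIZE;
    r->slot[tail].signaled = 0;
    r->slot[tail].id = id;
    r->count++;
    return 0;
}

/* Mark the request with the given id as signaled; any order is accepted. */
void ring_signal(IssueRing *r, int id)
{
    size_t i;
    assert(r != NULL);
    for (i = 0; i < r->count; i++) {
        Outstanding *o = &r->slot[(r->oldest + i) % RING_SIZE];
        if (o->id == id)
            o->signaled = 1;
    }
}

/* Consume the oldest request, but only if it has been signaled.
 * Returns its id, or -1 if the oldest request is still pending. */
int ring_consume(IssueRing *r)
{
    Outstanding *o;
    assert(r != NULL);
    if (r->count == 0)
        return -1;
    o = &r->slot[r->oldest];
    if (!o->signaled)
        return -1;
    r->oldest = (r->oldest + 1) % RING_SIZE;
    r->count--;
    return o->id;
}
```

With bufferA issued before bufferB, a notification for B that is seen first is simply held back until A has been consumed. Whether this bookkeeping is cheap enough to keep the asynchronous approach worthwhile is exactly the open question.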
Which brings us to the subject: Are Single-Copy
receives possible in NT/2000, when order of
arrival is vital? Or does one have to use intermediate
queues, like I did in the first approach?
Thanks for reading this (rather long) question.
Thanassis Tsiodras
xxxxx@4Plus.com
4Plus Technologies