We are writing a Virtual Storport Miniport driver for a custom device. There is a requirement to off-load the Read/Write SRBs from StartIO onto a driver thread. We return SRB_STATUS_PENDING from StartIO and complete the SRBs from the driver thread.
We tried this approach on a RAM disk implementation. We observed that performance (measured with IOMeter) dropped by half once we started off-loading the work to a driver thread, compared to completing the requests in StartIO itself. Is this because StartIO runs at DISPATCH_LEVEL while the driver thread runs at PASSIVE_LEVEL, so the thread gets very little time to run?
We tried setting the thread's affinity to a specific processor, but that didn't help either. Is there anything we are missing?