why ssd response too slowly for my filter driver?

i was writing cache filter driver for windows, which use ssd as the HDD cache.
my filter driver locate between disk.sys and partmgr.sys. when it receive IRP_MJ_READ/WRITE, it will do cache hit testing. if there are cache hit, it will allocate a new irp and send it to ssd to realize a cache read or write.
However, when i do sequencial read or write test repeatly , there are a mass of cache hit in a short time. but the IoCompletion routine was called very slowly, worse still, sometimes there are more than 100 IoCompetion called one by one after request send for a long time. as result, the pool resource may be exausted in a peak.
I expect that one request followed by one IoCompletion returned.

HDDReadWrite: PID 1836, run in DiskMark.exe process.
CsampPollDevice: IRP_MJ_WRITE 0x90d3d910,dbn=0x31db22000,index=6920, Buflen=0x100000.
CsampollDevice:CacheUpdate Allocate 0n1048576 Bytes,nSequenceBlocks=0n255, Block Start Index=0n3543053.
<–HDDReadWrite: Insert pending irp 0x90d3d910 to csq.
IoCallDriver Irp 0x90d3d910.
–>HDDReadWrite: Irp=0x90d3d910 Write to 0x31dc22000 ,len=0x100000.
HDDReadWrite: PID 1836, run in DiskMark.exe process.
CsampPollDevice: IRP_MJ_WRITE 0x90d3d910,dbn=0x31dc22000,index=5283, Buflen=0x100000.
CsampollDevice:CacheUpdate Allocate 0n1048576 Bytes,nSequenceBlocks=0n255, Block Start Index=0n2704899.
<–HDDReadWrite: Insert pending irp 0x90d3d910 to csq.
IoCallDriver Irp 0x90d3d910.
–>HDDReadWrite: Irp=0x8e71bc08 Write to 0x113000 ,len=0x1000.
HDDReadWrite: PID 1836, run in DiskMark.exe process.
<–HDDReadWrite: Insert pending irp 0x8e71bc08 to csq.
CsampPollDevice: IRP_MJ_WRITE 0x8e71bc08,dbn=0x113000,index=7198, Buflen=0x1000.
CsampollDevice:CacheUpdate Allocate 0n4096 Bytes,nSequenceBlocks=0n0, Block Start Index=0n3685376.
IoCallDriver Irp 0x8e71bc08.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n0,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>HDDReadWrite: Irp=0x8e71bc08 Read from 0xc2a2600 ,len=0x1000.
HDDReadWrite: PID 1752, run in taskhost.exe process.
<–HDDReadWrite: Insert pending irp 0x8e71bc08 to csq.
CsampPollDevice: IRP_MJ_READ 0x8e71bc08, dbn=0xc2a2600,BufLen=0x1000,index=7480.
CsampPollDevice: Asynchronously Read Pending Return.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x8e71bc08,nSequencialBlocks=0n0,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n0,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n255,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n254,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x93db9db8,nSequencialBlocks=0n70,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x93ad18a8,nSequencialBlocks=0n242,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n254,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n255,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.

You have a bug. Or multiple bugs.

You haven’t given us enough information, or asked a question that is sufficiently specific, to allow us to help you.

Your question is, basically, “My driver doesn’t work… please tell me why.”

We want to help you, but you have to give us a more specific question to answer. Your debug output is not helpful. I have no idea what it means.

Peter
OSR
@OSRDrivers

Do you have verifier enabled? If so, disable it before doing any type of performance measurement.

~kenny

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@gmail.com
Sent: Wednesday, February 26, 2014 11:51 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] why ssd response too slowly for my filter driver?

HDDReadWrite: PID 1836, run in DiskMark.exe process.
CsampPollDevice: IRP_MJ_WRITE 0x90d3d910,dbn=0x31db22000,index=6920, Buflen=0x100000.
CsampollDevice:CacheUpdate Allocate 0n1048576 Bytes,nSequenceBlocks=0n255, Block Start Index=0n3543053.
<–HDDReadWrite: Insert pending irp 0x90d3d910 to csq.
IoCallDriver Irp 0x90d3d910.
–>HDDReadWrite: Irp=0x90d3d910 Write to 0x31dc22000 ,len=0x100000.
HDDReadWrite: PID 1836, run in DiskMark.exe process.
CsampPollDevice: IRP_MJ_WRITE 0x90d3d910,dbn=0x31dc22000,index=5283, Buflen=0x100000.
CsampollDevice:CacheUpdate Allocate 0n1048576 Bytes,nSequenceBlocks=0n255, Block Start Index=0n2704899.
<–HDDReadWrite: Insert pending irp 0x90d3d910 to csq.
IoCallDriver Irp 0x90d3d910.
–>HDDReadWrite: Irp=0x8e71bc08 Write to 0x113000 ,len=0x1000.
HDDReadWrite: PID 1836, run in DiskMark.exe process.
<–HDDReadWrite: Insert pending irp 0x8e71bc08 to csq.
CsampPollDevice: IRP_MJ_WRITE 0x8e71bc08,dbn=0x113000,index=7198, Buflen=0x1000.
CsampollDevice:CacheUpdate Allocate 0n4096 Bytes,nSequenceBlocks=0n0, Block Start Index=0n3685376.
IoCallDriver Irp 0x8e71bc08.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n0,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>HDDReadWrite: Irp=0x8e71bc08 Read from 0xc2a2600 ,len=0x1000.
HDDReadWrite: PID 1752, run in taskhost.exe process.
<–HDDReadWrite: Insert pending irp 0x8e71bc08 to csq.
CsampPollDevice: IRP_MJ_READ 0x8e71bc08, dbn=0xc2a2600,BufLen=0x1000,index=7480.
CsampPollDevice: Asynchronously Read Pending Return.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x8e71bc08,nSequencialBlocks=0n0,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n0,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n255,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n254,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x93db9db8,nSequencialBlocks=0n70,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x93ad18a8,nSequencialBlocks=0n242,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n254,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.
–>CacheWriteCompletion.
CacheWriteCompletion: Deal with Irp 0x90d3d910,nSequencialBlocks=0n255,cache update.
<–CacheWriteCompletion: Irp.status=0x00000000.


NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Driver verifer are all disabled.
Just as the log, my filter driver will allocate a IRP_MJ_WRITE, and set an IoCompletion routine, pass it down to next device object (It’s an SSD disk). However, if i pass down a lot of IRP asynchronously last for some time, the corresponding IoCompletion routine will not be called immediately one by one. It seems like, these irp are blocked for a long time, then part of them will be executed quickly for a moment. Next, disk driver stop for a moment, then pick another part of irp got executed. they exectue like this:
irp1
irp2
irp3
irp4
irp5
irp1complete
irp2complete
irp3complete
irp6
irp7
irp4complete
irp5complete
irp6complete
irp7complete

I suspect that disk.sys didn’t send irp immediately down to storage port ?
And, i observe that irp complete order is not the same with irp sent order

There is no guarantee that IO to a SCSI device will be completed in the same order you sent them. Completing out of order is normal.

So far, you haven’t shown any data that explains a problem. If IO is stalling when your driver is loaded, then there is a bug in your driver.

I recommend using BusTrace to obtain an IO trace when your driver is NOT loaded, and then again when it is. That will help you understand the semantics of IO processing order.

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@gmail.com
Sent: Friday, February 28, 2014 1:44 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] why ssd response too slowly for my filter driver?

I suspect that disk.sys didn’t send irp immediately down to storage port ?
And, i observe that irp complete order is not the same with irp sent order


NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

xxxxx@gmail.com wrote:

Driver verifer are all disabled.
Just as the log, my filter driver will allocate a IRP_MJ_WRITE, and set an IoCompletion routine, pass it down to next device object (It’s an SSD disk). However, if i pass down a lot of IRP asynchronously last for some time, the corresponding IoCompletion routine will not be called immediately one by one. It seems like, these irp are blocked for a long time, then part of them will be executed quickly for a moment. Next, disk driver stop for a moment, then pick another part of irp got executed.

Does this surprise you? In CPU terms, even SSD disks are very, very,
very slow. A seek can take 0.1ms. A modern processor runs 300,000
cycles in that time. You would not want all of your processing to come
to a halt while it processes one request. So, it queues them up and
completes them when the I/O finally completes.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

xxxxx@gmail.com wrote:

I suspect that disk.sys didn’t send irp immediately down to storage port ?

Wrong. What reason would it have for holding it?

And, i observe that irp complete order is not the same with irp sent order

That’s quite possible. It was common with rotating hard disks for the
driver to shuffle the transfers to avoid long seeks.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> I suspect that disk.sys didn’t send irp immediately down to storage port ?

And, i observe that irp complete order is not the same with irp sent order

I can’t comment on the disk.sys behavior, but out-of-order IRP completion
is normal for any storage device that has latency, and ssd drives have
latency, will complete “opportunistically”, that is, the lowest-level
driver is free to rearrange the order of completion to maximize
throughput, since FIFO completion order has never been part of the spec.
In fact, for any device with multiple channels (A/D and D/A converters,
for example) or multiple processes accessing it (such as file systems), is
free to, and even expected to, complete opportunistically to maximize
channel bandwidth. In addition, if you are doing asynch I/O, a thread
(probably close to a year old), explains why APC will not guarantee FIFO
delivery of IRPs even if the device handles them in FIFO order.

What are the specified latencies of your ssd array? Don’t say “it doesn’t
have any”, because it certainly does; they are just an order of magnitude
or several faster than a mechanical rotating disk drive. But they exist.
Until an ssd runs at CPU clock speeds, it has latencies. And there are
other optimizations an ssd has to do; because the memory has an expected
upper bound on the number of writes, either the driver or even the ssd
controller will tend to “batch up” requests to minimize the number of
times a block of data is written. So your CPU will see this as “latency”
in the completion of one or more IRPs.

Do you have any actual numbers you can give us, and tell how they compare
to the specified ssd latencies?
joe


NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

The biggest time spent is 22.88s. my filter driver allocated a irp with 216mb buffers then sent it down asychronously. it will free these buffer in it’s iocompletion routine.

obviously, it’s tool slow . However i did’t get an avaliable BusTrace to monitor what happened.

append info :
from my log file , I can found that, from my filter driver send down this irp to it’s completion roution, the target device processed another 1292 my allocated irps. so, why did disk driver delay my allocated irp?

I’m sorry, it’d not 216mb, but only 864kb?Every irp allocated by myself will have it’s own buffer , range from 4kb to 1mb. I know disk will resort io request, but it should’t delay too mush.

So what? the classic starvation scenario, in Windows, it is unavoidable unless you will use IO priorities.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

wrote in message news:xxxxx@ntdev…
> append info :
> from my log file , I can found that, from my filter driver send down this irp to it’s completion roution, the target device processed another 1292 my allocated irps. so, why did disk driver delay my allocated irp?
>

Are you in a PASSIVE_LEVEL thread when you do this? If so, what is its
priority? The other 1292 IRPs: who issues those? From what IRQL level?
Note that if they are from PASSIVE_LEVEL, and the thread is a higher
priority than your issuing thread, you could simply be suffering scheduler
latency. If the other IRPs are issued from DISPATCH_LEVEL, there is a
chance that your thread does not get a chance to process your IRP because
it has been preempted by other threads and DPC routines.

So it may have nothing to do with ssd as such, but just be an issue of
thread priorities and scheduler delays/preemption.
joe

I’m sorry, it’d not 216mb, but only 864kb?Every irp allocated by myself
will have it’s own buffer , range from 4kb to 1mb. I know disk will
resort io request, but it should’t delay too mush.


NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Yes, i do these operation at PASSIVE_LEVEL in my dedicate thread which have LOW_REALTIME_PRIORITY. what’s more, the other 1292 IRP are also issue by the same thread at the same IRQL.

so,how to change priority to complete every one of my allocated irp as quick as possible?

On Mar 3, 2014, at 7:04 PM, xxxxx@gmail.com wrote:

Yes, i do these operation at PASSIVE_LEVEL in my dedicate thread which have LOW_REALTIME_PRIORITY. what's more, the other 1292 IRP are also issue by the same thread at the same IRQL.

so,how to change priority to complete every one of my allocated irp as quick as possible?

Any time you find yourself dining with priorities, you are using the wrong approach. If all of your work is ?high priority?, then none of it gets to be high priority.

Seriously. The Windows scheduler is very, very, very good at getting work done. When you manually adjust priorities, ALL you are doing is interfering with the 25 years of research and fine tuning that has gone into the scheduler.

Your IRPs are being completed as quickly as possible, but if the user-mode thread you have that is reaping those completed requests is running at a lower priority than the thread that is generating the requests, you will get the results you see.

Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> so,how to change priority to complete every one of my allocated irp as quick as possible?

I think it is just plain impossible. You will have better latency, but worse total throughput, or something like this.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

On 3/3/2014 11:57 PM, Tim Roberts wrote:

Any time you find yourself dining with priorities, you are using the wrong approach.

My vote for “Phrase of the week”.

Works well for so many things in addition to “priories”.

– mkj


//
// Michael K. Jones
// Stone Hill Consulting, LLC
// http://www.stonehill.com
//_______________________________________________

I still have another question:
when i send my allocated irp too quickly,like 5ms per IRP, the target device (SSD) will response more and more slowly. Seemslike , it have exceive it’s ability to deal with.