Read Performance of Driver

I am developing a virtual volume manager on Windows Server 2008 R2.

I implemented the IRP_MJ_READ major function like this:

MyCompletion (pDeviceObject, irp, context) {

    if (irp->MdlAddress != NULL) {
        IoFreeMdl();
        irp->MdlAddress = NULL;
    }

    FreeIrp();

    free(context);
    IoCompleteRequest(irp);
    return STATUS_MORE_PROCESSING_REQUIRED;
}

DispatchRead (pDeviceObject, irp) {

    malloc(context);

    pBuffer = MmGetSystemAddressForMdlSafe(irp->MdlAddress, );
    newIrp = IoAllocateIrp(target_dev->StackSize, );
    newIrp->MdlAddress = IoAllocateMdl(pBuffer, Size, …);
    MmBuildMdlForNonPagedPool(newIrp->MdlAddress);
    next_stack = IoGetNextIrpStackLocation(newIrp);
    next_stack->MajorFunction = …;
    next_stack->Parameters.Read.Length = …;
    next_stack->Parameters.Read.ByteOffset = …;
    IoSetCompletionRoutine(newIrp, MyCompletion, context, TRUE, TRUE, TRUE);
    status = IoCallDriver(target_dev, newIrp);
    return STATUS_PENDING;
}

I think this driver sits between the file system driver and the storage driver.

I tested the driver's performance using perfmon.

When reading an 8 GB file with a 1 MB buffer, I/O bandwidth is about 170 MB/s without my driver,
but about 120 MB/s with my driver.

With a 512 KB buffer, I/O bandwidth is about 150 MB/s without my driver
and 120 MB/s with my driver.

With a 4 KB buffer, it is about 120 MB/s without my driver
and 120 MB/s with my driver.

I don't understand why performance improves as the buffer size grows when my driver is absent, while with my driver it cannot reach the same performance.

I don't think the dispatch routine itself is the performance problem.
Is there anything I need to do to improve performance?
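
For reference, the perfmon figures in the question work out to a throughput loss of roughly 29% with the 1 MB buffer, 20% with the 512 KB buffer and 0% with the 4 KB buffer. Below is a minimal sketch of that arithmetic in C. The helper name throughput_loss_percent is hypothetical and only restates the numbers quoted above; it does not measure anything.

```c
#include <assert.h>

/* Percentage of throughput lost when the driver is in the stack.
 * Both inputs are MB/s figures, e.g. the perfmon numbers from the post. */
static double throughput_loss_percent(double without_driver, double with_driver)
{
    assert(without_driver > 0.0);
    return (without_driver - with_driver) * 100.0 / without_driver;
}
```

For example, throughput_loss_percent(150.0, 120.0) gives 20.0 for the 512 KB case.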

Storage drivers can usually handle a buffer larger than 4 KB in a single request. How large depends on the hardware. This means that a bigger buffer gives higher throughput. It is likely that your driver restricts the buffer length. What do you put in next_stack->Parameters.Read.Length and next_stack->Parameters.Read.ByteOffset in your DispatchRead?

Igor Sharovar
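
The point about request size can be illustrated with a toy model in C. It is only a sketch under stated assumptions: each request is assumed to cost a fixed overhead plus the transfer time, and the function names, the overhead value and the media rate are all hypothetical inputs, not measurements from this thread. The first helper counts how many lower-level requests a read turns into if something in the stack caps the length.

```c
#include <assert.h>
#include <stddef.h>

/* Number of lower-level requests needed when each one is capped at
 * max_len bytes (ceiling division). */
static size_t requests_needed(size_t total_len, size_t max_len)
{
    assert(max_len > 0);
    return (total_len + max_len - 1) / max_len;
}

/* Toy model: every request pays a fixed overhead (microseconds) plus
 * the time to move its bytes at media_mb_s. Both values are
 * illustrative assumptions supplied by the caller. */
static double modeled_mb_per_s(size_t request_bytes, double overhead_us,
                               double media_mb_s)
{
    assert(request_bytes > 0 && overhead_us >= 0.0 && media_mb_s > 0.0);
    double mb = (double)request_bytes / (1024.0 * 1024.0);
    double seconds = overhead_us / 1e6 + mb / media_mb_s;
    return mb / seconds;
}
```

With any non-zero overhead, the modeled rate for a 1 MB request is higher than for a 4 KB request, and a 1 MB read capped at 64 KB becomes 16 requests, each paying the overhead again.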

Just a general observation: in Windows driver dev you should use the ExAllocatePoolWithTag and ExFreePool/ExFreePoolWithTag APIs to handle memory allocation/deallocation in lieu of malloc and free.

I doubt it is the cause of your performance bottleneck, but you could
avoid the system VA allocation that your call to
MmGetSystemAddressForMdlSafe performs by using IoBuildPartialMdl instead of
MmBuildMdlForNonPagedPool.

Mark Roddy

On Tue, Aug 30, 2011 at 8:19 AM, wrote:
> I develop Virtual Volume Manager on Windows Server 2008 R2.

Sorry for the late reply.

To Igor Sharovar

Because of the file system cache, my driver always receives buffers larger than 4 KB.

To Rob

I didn't use the real malloc function; it's just pseudocode. I used ExAllocateFromNPagedLookasideList.
It's more efficient than the ExAllocatePoolWithTag function.
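
The idea behind a lookaside list can be sketched in user-mode C as a cache of fixed-size blocks. This is only an analogy under stated assumptions, not the kernel API: the lookaside type and its functions below are hypothetical, use malloc/free as the backing allocator, and are not thread-safe, whereas the kernel routines handle synchronization themselves.

```c
#include <assert.h>
#include <stdlib.h>

/* A freed block is reused as a list node while it sits in the cache. */
typedef struct free_node { struct free_node *next; } free_node;

typedef struct {
    size_t block_size;  /* size of every block handed out */
    size_t depth;       /* blocks currently cached */
    size_t max_depth;   /* cache limit; extra frees go to free() */
    free_node *head;
} lookaside;

static void lookaside_init(lookaside *l, size_t block_size, size_t max_depth)
{
    assert(l != NULL);
    /* A block must be able to hold a list node while cached. */
    l->block_size = block_size < sizeof(free_node) ? sizeof(free_node) : block_size;
    l->depth = 0;
    l->max_depth = max_depth;
    l->head = NULL;
}

static void *lookaside_alloc(lookaside *l)
{
    assert(l != NULL);
    if (l->head != NULL) {          /* fast path: pop a cached block */
        free_node *n = l->head;
        l->head = n->next;
        l->depth--;
        return n;
    }
    return malloc(l->block_size);   /* slow path: general allocator */
}

static void lookaside_free(lookaside *l, void *p)
{
    assert(l != NULL);
    if (p == NULL)
        return;
    if (l->depth < l->max_depth) {  /* keep the block for reuse */
        free_node *n = (free_node *)p;
        n->next = l->head;
        l->head = n;
        l->depth++;
    } else {
        free(p);
    }
}

static void lookaside_destroy(lookaside *l)
{
    assert(l != NULL);
    while (l->head != NULL) {
        free_node *n = l->head;
        l->head = n->next;
        free(n);
    }
    l->depth = 0;
}
```

A block freed to the list is handed back by the next allocation without touching the general allocator, which is where the saving for per-IRP contexts of one fixed size comes from.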