Failure in ScsiPortGetDeviceBase

My driver is failing in “Common Scenario Stress with I/O” on W2K8-32 after exactly 100 iterations due to a failure in StorPortGetDeviceBase. I haven’t yet tried running memory leak detectors (poolmon), but the driver is passing the job on all other Windows Server versions.

Also, I never call StorPortFreeDeviceBase (e.g. from ScsiStopAdapter). Is that a problem? The docs don’t say that I should.

What could be the cause of StorPortGetDeviceBase failing to run? Actually, how can I find the cause of the failure, since the function doesn’t return an NTSTATUS?

At what time do you call StorPortGetDeviceBase? You can only do that from the FindAdapter function.

The function’s SCSIPORT equivalent only says the miniport should call the Free function if it finds out the BAR doesn’t belong to a supported device. This only applies to non-PnP miniports, though.

Sorry for following up to myself. I noticed that every time the test restarts the driver, the NNN in “\Device\RaidPortNNN” assigned to the driver increases. This could be the reason why the test fails after exactly 100 iterations (I certainly have no nice powers of ten in my driver). Do you have any idea if that is my own leak or storport’s?

> At what time do you call StorPortGetDeviceBase? You can only do that from the FindAdapter function.

Yes, of course I only do that in FindAdapter, passing the ACCESS_RANGE that I get from storport.

What test is failing? Does it simulate surprise removal?

It’s “Common Scenario Stress with I/O” that is failing. It doesn’t do surprise removal (pnpdtest passes with flying colors, even the stress test that WHQL doesn’t run), just repeated disable/enable. The problem code changes from 0 to 10 (failure to start) after 100 iterations. The failure to start is due to StorPortGetDeviceBase.

After the test fails, do !drvobj for your driver. See how many device objects are in the list. Check the device names.
After reboot, do a few enable/disable cycles manually, without I/O. Do !drvobj again and look at the device object list.

Turns out it’s “just” kernel address space fragmentation. I can fix it by mapping less than the full I/O space of the device, but I’d have expected something better than first-fit from the Windows page allocator…

Paolo
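
As a rough illustration of why mapping less can help, here is a toy first-fit search over an address space in which a few long-lived allocations sit between otherwise free regions. It is not the Windows allocator. The arena size, the 256 KB unit, and the names `first_fit`, `free_units` and `pin_every` are all assumptions made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one unit stands for 256 KB of kernel virtual address space. */
#define ARENA_UNITS    256u   /* hypothetical 64 MB arena */
#define FULL_BAR_UNITS 64u    /* the whole 16 MB BAR */
#define PARTIAL_UNITS  1u     /* only the first 256 KB */

/* Mark every `stride`-th unit as in use, leaving the rest free. */
static void pin_every(bool used[], size_t n, size_t stride)
{
    assert(stride > 0);
    for (size_t i = 0; i < n; i++)
        used[i] = (i % stride == 0);
}

/* Count free units, whether or not they are contiguous. */
static size_t free_units(const bool used[], size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (!used[i])
            count++;
    return count;
}

/* First-fit: index of the first run of `len` free units, or -1 if none. */
static long first_fit(const bool used[], size_t n, size_t len)
{
    size_t run = 0;
    assert(len > 0);
    for (size_t i = 0; i < n; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == len)
            return (long)(i + 1 - len);
    }
    return -1;
}
```

With `pin_every(used, ARENA_UNITS, 32)`, 248 of the 256 units are free, yet `first_fit(used, ARENA_UNITS, FULL_BAR_UNITS)` returns -1 because no free run is longer than 31 units, while `first_fit(used, ARENA_UNITS, PARTIAL_UNITS)` still succeeds.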

Physically contiguous pages are surely a very limited resource.

Also, I think that if you need a DMA common buffer, there is no need to call this function; just set the device extension size in the init params structure (one of 2). ScsiPortGetPhysicalAddress can then return the bus-side addresses of this devext, which is a per-device common buffer.

SRB extensions are also allocated off the large common buffer, the same as the devext is allocated.

LUN extensions are not allocated off the physically contiguous buffer, and thus are not DMA-able.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

> Turns out it’s “just” kernel address space fragmentation. I can fix it by mapping less than the full I/O space of the device, but I’d have expected something better than first-fit from the Windows page allocator…

What BAR size do you have?

We do thousands of enable/disable cycles, and mostly without problems.

Do a simple test. Print the mapped address in your miniport, then disable the device and check whether the mapping still exists by using !pte (or whatever the command is). If the address is still mapped, there’s some problem.

16 MB. The mapped address changes all the time, but the amount of kernel memory doesn’t increase by 1 MB even after hundreds of cycles with the fixed driver, so I don’t have a leak.

I actually only need the first 256 KB, so the fix is appropriate.
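
A minimal sketch of that fix, assuming the driver only touches the first 256 KB of the BAR. The helper name `map_length` and the constant `NEEDED_BYTES` are hypothetical, and the access range index in the comment is an assumption. The StorPort call itself is shown only in a comment so that the helper compiles outside the WDK.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: the part of the BAR this driver actually touches. */
#define NEEDED_BYTES (256u * 1024u)

/* Length to map: never more than the range reports, never more than needed. */
static uint32_t map_length(uint32_t range_length, uint32_t needed)
{
    assert(needed > 0);
    return range_length < needed ? range_length : needed;
}

/*
 * In FindAdapter, the result would take the place of RangeLength, e.g.
 * (assuming the BAR of interest is access range 0):
 *
 *   ACCESS_RANGE *range = &(*ConfigInfo->AccessRanges)[0];
 *   ULONG length = map_length(range->RangeLength, NEEDED_BYTES);
 *   base = StorPortGetDeviceBase(DeviceExtension,
 *                                ConfigInfo->AdapterInterfaceType,
 *                                ConfigInfo->SystemIoBusNumber,
 *                                range->RangeStart,
 *                                length,
 *                                (BOOLEAN)!range->RangeInMemory);
 *   if (base == NULL)
 *       return SP_RETURN_ERROR;
 */
```

Clamping against the reported range length keeps the request valid if the same driver ever binds to a device with a BAR smaller than 256 KB.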

I don’t think you’ll see kernel memory change with a leak of memory mappings. What can change is the number of free PTEs. Do !vm and check that.
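
To know what to look for in the free PTE count, it helps to work out how many PTEs one mapping uses. The sketch below assumes 4 KB pages (as on 32-bit x86) and a page-aligned mapping; `ptes_for_mapping` and `PAGE_BYTES` are names made up for this example.

```c
#include <stdint.h>

#define PAGE_BYTES 4096u   /* assumption: 4 KB pages, as on 32-bit x86 */

/* Pages (and therefore PTEs) needed to map `bytes` from a page boundary. */
static uint32_t ptes_for_mapping(uint32_t bytes)
{
    /* Round up without risking overflow for very large sizes. */
    return bytes / PAGE_BYTES + (bytes % PAGE_BYTES != 0 ? 1u : 0u);
}
```

Under these assumptions, a leaked 16 MB mapping would show up as roughly 4096 fewer free system PTEs per enable/disable cycle, and a leaked 256 KB mapping as 64.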