storport with usermode backend

There is a distributed network based storage system available under Linux, and the block device access code has both a separate usermode and kernel mode implementation. The kernel mode implementation is fairly Linux specific, while the usermode implementation is mostly OS agnostic but is C++, so neither option lends itself well to a straightforward implementation under Windows. A from-scratch implementation under Windows would be a big job with high ongoing maintenance, and would always lag behind the current Linux implementation.

Assuming I’m not interested in booting from this storage, or putting a pagefile on it etc, there exists the possibility of porting the usermode code to Windows and using a thin driver to provide a ‘block device’ interface (eg storport) to Windows, and route all requests to the usermode implementation.

Is there going to be too much overhead with requests making a round trip to usermode and then back through NDIS in the kernel? And even without a pagefile, is there still a possibility of a deadlock? Linux achieves very good performance and stability with its usermode filesystem implementation (FUSE), but as is regularly pointed out here Windows is not Linux…

Is storport the best way to do this, assuming there is no requirement for pagefile etc?

And is having a userspace process in the paging path completely unachievable (without resorting to fragile “hacks”)?

Thanks

James
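
[Editorial sketch] The split described in the post above (a thin driver that forwards every block request to a usermode backend and completes the original request when the backend answers) could look roughly like this on the usermode side. Everything here is hypothetical: the `BlockRequest` layout, the `MemoryBackend` class and the status codes are made up for illustration and are not part of storport or of the Linux storage system, and an in-memory buffer stands in for the network storage.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical message a thin driver could hand to the usermode backend.
enum class BlockOp : std::uint32_t { Read = 1, Write = 2 };
enum class BlockStatus : std::uint32_t { Ok = 0, OutOfRange = 1, BadBuffer = 2 };

struct BlockRequest {
    std::uint64_t tag;          // lets the driver match the reply to its pending request
    BlockOp       op;
    std::uint64_t lba;          // first logical block
    std::uint32_t block_count;  // number of blocks to transfer
};

// Stand-in for the ported usermode code: serves requests from RAM.
class MemoryBackend {
public:
    MemoryBackend(std::uint64_t blocks, std::uint32_t block_size)
        : blocks_(blocks), block_size_(block_size),
          data_(static_cast<std::size_t>(blocks * block_size), 0) {}

    // For reads, 'buffer' is filled; for writes, 'buffer' holds the data to store.
    BlockStatus handle(const BlockRequest& req, std::vector<std::uint8_t>& buffer) {
        // Written to avoid overflow in lba + block_count.
        if (req.lba > blocks_ || req.block_count > blocks_ - req.lba)
            return BlockStatus::OutOfRange;
        const std::size_t offset = static_cast<std::size_t>(req.lba * block_size_);
        const std::size_t length =
            static_cast<std::size_t>(req.block_count) * block_size_;
        if (req.op == BlockOp::Read) {
            buffer.assign(length, 0);
            if (length != 0) std::memcpy(buffer.data(), data_.data() + offset, length);
            return BlockStatus::Ok;
        }
        if (req.op == BlockOp::Write) {
            if (buffer.size() != length) return BlockStatus::BadBuffer;
            if (length != 0) std::memcpy(data_.data() + offset, buffer.data(), length);
            return BlockStatus::Ok;
        }
        return BlockStatus::BadBuffer;  // unknown operation code
    }

private:
    std::uint64_t blocks_;
    std::uint32_t block_size_;
    std::vector<std::uint8_t> data_;
};
```

The `tag` field is the only part the driver side would strictly need back: it identifies which pending request to complete once the backend has produced a status and, for reads, the data.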

You can still deadlock on paging. If any executing code is paged out of your um process and your um process is the one servicing that page request, you can get deadlocked very quickly

d

Bent from my phone


From: James Harper
Sent: 11/24/2013 4:40 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] storport with usermode backend

NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

> > And is having a userspace process in the paging path completely
> > unachievable (without resorting to fragile “hacks”)?

(I assume this is the specific part you were responding to)

> You can still deadlock on paging. If any executing code is paged out of your
> um process and your um process is the one servicing that page request, you
> can get deadlocked very quickly

Yes I get that, but is that the only obstacle? If all the usermode code was locked into memory would that solve it?

I’m only dwelling on this point as a matter of curiosity now. There is no good reason why I would want to put a pagefile on this block device, but if someone ever asks me why not it would be nice to have an answer. I can’t even see an API function to do it. VirtualLock looks like it might, but it only seems to guarantee that “subsequent access to the region will not incur a page fault”, not that it won’t ever page out the memory while the process is in a wait and then swap it back in when it resumes. But even then I’d have to hunt down all my memory and lock it, and the memory of any DLL I called (even if I only used one function in it), and any memory that DLL had allocated, or might allocate in the future… fragile doesn’t even begin to describe it.

James

VirtualLock has its own costs, more so (if I believe all the posts I’ve
read here) for a VM. Fragility is an arguable issue. For example, you
can use EnumModules (if I’ve remembered the API name correctly) to find
all the modules, which will give you their lengths. I remember using
another API to enumerate pages in the heap, as part of my NumaExplorer. I
no longer remember if it is defined for non-NUMA architectures, but you
can download my source and look at it. The real danger is C++. Implicit
constructors that allocate memory make it very hard to track what is
happening, and you will end up writing your own ‘new’ operator, or
probably better still, take the CRT code for malloc and friends and tweak
it to help track new allocations. I’m not sure this is going to be a
major effort to get robustness; I’ve done more under more trying
conditions and succeeded (hooking an overlay manager in an MS-DOS system,
for example, gave me performance traces AND control of what was
happening). If your driver isn’t in the pagefile.sys path, you shouldn’t
have to worry about this, but you will have to support paging for things
like memory-mapped files on your remote server anyway. But as long as you
don’t end up recursively calling yourself, deadlock should not be an
issue.
joe
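
[Editorial sketch] The "write your own ‘new’ operator" idea from the message above can be illustrated with a replacement global `operator new` that counts allocations. This is only a minimal sketch under stated assumptions: the counter names are hypothetical, the bodies simply forward to `malloc`/`free`, and the comment marks where a real implementation might try to lock or otherwise account for the new memory. It does not cover aligned allocations, or allocations made by DLLs that use their own heap.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bookkeeping: how many blocks are live and how many were ever made.
static std::atomic<std::size_t> g_live_allocations{0};
static std::atomic<std::size_t> g_total_allocations{0};

// Replacement for the global allocation function. Implicit allocations made by
// constructors, containers and so on end up here as well, which is what makes
// them visible. The default operator new[] forwards to this function.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;              // operator new must return a unique pointer
    void* p = std::malloc(size);
    if (p == nullptr) throw std::bad_alloc();
    // Hook point (assumption): lock or record the pages backing [p, p + size).
    ++g_live_allocations;
    ++g_total_allocations;
    return p;
}

void operator delete(void* p) noexcept {
    if (p == nullptr) return;
    --g_live_allocations;
    std::free(p);
}

// Sized deallocation (C++14) forwards to the unsized version.
void operator delete(void* p, std::size_t) noexcept {
    operator delete(p);
}
```

Comparing `g_total_allocations` before and after a call into third-party code gives a quick view of how much hidden allocation that call performs.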

If you wanted to lock all memory for your process, then you need to shim
VirtualAlloc and turn it into AllocateUserPhysicalPages &
MapUserPhysicalPages. Do this as soon as possible and use VirtualQuery to
find out about VADs already allocated. For these you can only call
VirtualLock and hope. It goes without saying that no memory mapped files
could be opened from this disk.

We’ve done user-mode file systems. Multiple times. In shipping products.

Too true.

The Windows performance of a user-mode file system is actually surprisingly good. As in kernel mode, the maximum throughput increases as read/write size increases (duh).

We did NOT support paging files. We certainly DID support memory mapped files (this is really not optional in Windows).

Up at the file-system level, I don’t recall having any serious deadlock issues that we couldn’t handle in the “normal” ways.

Peter
OSR

> There is no good reason why I would want to put a pagefile on this block device, but if someone ever asks me why not it would be nice to have an answer.

There is no way you can guarantee that the whole usermode process and all kernel code it may call is nonpaged.

To prevent using your device for the special files, you should fail IRP_MN_DEVICE_USAGE_NOTIFICATION.

I’m hoping my device can simply be a root enumerated virtual storport, so I won’t see IRP_MN_DEVICE_USAGE_NOTIFICATION. From what I’ve read, if I flag my virtual target device as being removable then that would prevent it being used for paging, but also would probably turn off all forms of write caching and possibly a few other features so I may need to mess with this a bit to get it right.

But maybe a bus driver with storport devices hanging off it would be better…

Thanks

James

> But maybe a bus driver with storport devices hanging off it would be better…

Better how?

> I’m hoping my device can simply be a root enumerated virtual storport, so I won’t see
> IRP_MN_DEVICE_USAGE_NOTIFICATION

Handling it is a must to pass PNPDTEST, IIRC.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

>But maybe a bus driver with storport devices hanging off it would be better…

Having a lower filter would be easier.

> > But maybe a bus driver with storport devices hanging off it would be
> > better…
>
> Better how?

One device per storport instead of one storport running many devices (queue limits are per-LU though so maybe this isn’t a limitation)
Dynamic number of devices - AFAIK storport can’t change maximum target on the fly
Adding or removing a device won’t cause a re-enumeration of the virtual scsi bus
The interface to userspace can be managed with a KMDF bus driver, with the storport component being very thin
Parent bus can disallow paging via USAGE_NOTIFICATION (can’t see how storport can see this)
Parent bus can handle surprise removal of backend connections
Child device could be scsiport (I’ve done virtual scsiports before) if I wanted support for XP/2000 (which I don’t)

The drawbacks are that such a solution now needs multiple drivers, and is potentially more complicated (but maybe outweighed by the fact that I can use KMDF for the complex bits…)

I think some of the buffer management might be easier too but haven’t investigated yet, and looking through the archives, every time someone asks “how can I map kernel memory into a userspace process” they get drowned in canned “don’t do that - map userspace memory into kernel” responses (eg this thread http://www.osronline.com/showthread.cfm?link=151072 where nobody addresses the point that mapping the SRB databuffer into userspace avoids double buffering). So I won’t ask that question just yet.

Thanks

James

We had a similar project and ended up buying a 3rd party toolkit for it. The way the toolkit works is that, instead of using any filter driver, the miniport creates a processor device, and a special SCSI class driver is dynamically loaded for the processor device and does the low level things you mention. In usermode there is a service that starts dynamically when the processor device appears. The service runs CDBs and handles device lifetime. CDBs are passed using IOCTL, not shared memory. If you are contemplating a build or buy decision, we had the toolkit developed from perisoft.net. I know other companies have done similar things before. So the route you are pursuing is technically possible, but you may need to develop a couple of drivers and a service to do it really well.
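
[Editorial sketch] For a usermode service that "runs CDBs", the first step is decoding the command block it receives. The post does not say which commands the toolkit's service handles or how its IOCTL is laid out, so the following is only an assumption-laden illustration: it decodes the standard 10-byte READ(10) (opcode 0x28) and WRITE(10) (opcode 0x2A) commands, whose logical block address sits in bytes 2 to 5 and transfer length in bytes 7 to 8, both big-endian. The `ReadWrite10` struct and `decode_rw10` function are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Decoded form of a READ(10) or WRITE(10) command.
struct ReadWrite10 {
    bool          is_write;
    std::uint32_t lba;     // logical block address, CDB bytes 2..5
    std::uint16_t blocks;  // transfer length in blocks, CDB bytes 7..8
};

// Returns true and fills 'out' if 'cdb' holds a READ(10) or WRITE(10);
// returns false for short buffers and for any other opcode.
bool decode_rw10(const std::uint8_t* cdb, std::size_t len, ReadWrite10& out) {
    if (cdb == nullptr || len < 10) return false;
    const std::uint8_t opcode = cdb[0];
    if (opcode != 0x28 && opcode != 0x2A) return false;
    out.is_write = (opcode == 0x2A);
    out.lba = (static_cast<std::uint32_t>(cdb[2]) << 24) |
              (static_cast<std::uint32_t>(cdb[3]) << 16) |
              (static_cast<std::uint32_t>(cdb[4]) << 8) |
               static_cast<std::uint32_t>(cdb[5]);
    out.blocks = static_cast<std::uint16_t>((cdb[7] << 8) | cdb[8]);
    return true;
}
```

A real service would also have to answer the non-data commands a disk class driver issues (INQUIRY, READ CAPACITY and so on); those are left out here.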