My FSD and Miniport together

So I am trying to address the missing feature of creating a virtual hard disk - and from reading posts around here, I get the impression that the recommended approach is a StorPort miniport driver.
I found the “WDKStorPortVirtualMiniport” sample and I am reading http://www.osronline.com/article.cfm?article=538 - so far so good. Looks quite straightforward.

However, I already have a DriverEntry for my file system. I now define and call StorPortInitialize(), then set up (overwrite?) my DriverEntry callbacks.

But this appears to “steal” the callbacks that StorPortInitialize() installed, and I am unsure how to hand StorPort its IRPs when they arrive in my dispatcher.
Is there a call for me to relay the IRPs to StorPort?

Or perhaps it should be the other way around: let StorPort take all the callbacks in DriverEntry. In that case, how do I tell StorPort to call me for the FSD-related IRPs?

What is the typical setup for people who have an FSD and storage in the same driver?

Is there some reason why they have to be the same driver? Why not have one driver surface your devices and one driver recognise the file system on it?

In ZFS, you put storage (physical disks) into pools, inside which you can create datasets (filesystems). But you can also carve out some of that pool space into a “ZVOL” - a virtual disk, which can be formatted as anything (ext4, FFS, NTFS, etc.). So it is primarily a storage pool driver with “a filesystem”, but it can also export virtual disks. The FSD does not go on the virtual disks (although you can put a pool in a ZVOL if you are so inclined).

disk(s) → zpool(s) → ZFS dataset(s) (fsd)
disk(s) → zpool(s) → ZVOL(s) (virtual disk)

If they were separate drivers, the miniport driver for ZVOLs would have to constantly talk to the “main” ZFS driver, just to do I/O etc. Or compile everything twice, I suppose. How would such I/O passing work in general? Would the main driver create a sub-driver for the miniport, and pass IRPs between them?

Ah, I hadn’t realised that you were layering a block server above the file system. You’ll probably need to wait until those who do real hardware (and also hang out here) come back to get a definitive answer, but it seems to me that your real problem is that each driver only gets one set of driver entry points, which is insoluble.

If you consider a standard file system and device stack, the cost of IoCallDriver cannot be large, almost by definition…
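For reference, the canonical pass-through is tiny - a sketch, with NextLowerDevice standing in for whatever IoAttachDeviceToDeviceStack (or, between unrelated drivers, IoGetDeviceObjectPointer) gave you:

```c
#include <ntddk.h>

/* Sketch: forwarding an IRP to the next device in the stack. */
NTSTATUS
ForwardIrp(PDEVICE_OBJECT NextLowerDevice, PIRP Irp)
{
    /* Let the lower driver reuse our stack location... */
    IoSkipCurrentIrpStackLocation(Irp);
    /* ...and hand the IRP down. This is the entire per-call cost. */
    return IoCallDriver(NextLowerDevice, Irp);
}
```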

OK, so I guess it comes down to:

  1. Do the virtual disk code myself - adds a lot of code and complexity (replicating the miniport).
  2. Do two drivers, and use a miniport - adds a lot of code and complexity (drivers talking to each other).

But as a proof of concept, after I call StorPortInitialize(), I save a copy of DriverObject->MajorFunction, as well as DriverObject->DriverUnload, then set my own. If an IRP comes in that is not “mine”, I call the saved CopyMajorFunction[IrpSp->MajorFunction] (sketched below).

I think I can sense you all collectively disapprove - but it does appear to work. I’m also impressed by the number of IRPs that come in just to get registered, so I am keen to avoid having to write all of that handling myself.
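In outline, the hack looks like this (a sketch of what I described; ZfsDeviceObjectIsOurs and ZfsFsdDispatch are stand-ins for my real FSD code, and gHwInitData is the miniport initialization data):

```c
#include <ntddk.h>
#include <storport.h>

/* Copies of the entry points StorPortInitialize() installed. */
static PDRIVER_DISPATCH SavedMajorFunction[IRP_MJ_MAXIMUM_FUNCTION + 1];
static PDRIVER_UNLOAD   SavedDriverUnload;   /* chained from our unload */

/* Stand-ins for the real FSD code. */
BOOLEAN  ZfsDeviceObjectIsOurs(PDEVICE_OBJECT DeviceObject);
NTSTATUS ZfsFsdDispatch(PDEVICE_OBJECT DeviceObject, PIRP Irp);

/* Filled in the way the WDKStorPortVirtualMiniport sample does. */
extern VIRTUAL_HW_INITIALIZATION_DATA gHwInitData;

static NTSTATUS
CombinedDispatch(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
    if (ZfsDeviceObjectIsOurs(DeviceObject)) {
        /* One of our FSD device objects: handle it ourselves. */
        return ZfsFsdDispatch(DeviceObject, Irp);
    }
    /* Not ours: relay to the routine StorPort originally installed. */
    PIO_STACK_LOCATION sp = IoGetCurrentIrpStackLocation(Irp);
    return SavedMajorFunction[sp->MajorFunction](DeviceObject, Irp);
}

NTSTATUS
DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    ULONG status = StorPortInitialize(DriverObject, RegistryPath,
        (PHW_INITIALIZATION_DATA)&gHwInitData, NULL);
    if (status != STATUS_SUCCESS) {
        return (NTSTATUS)status;
    }

    /* Save what StorPort installed, then take over the whole table. */
    SavedDriverUnload = DriverObject->DriverUnload;
    for (ULONG i = 0; i <= IRP_MJ_MAXIMUM_FUNCTION; i++) {
        SavedMajorFunction[i] = DriverObject->MajorFunction[i];
        DriverObject->MajorFunction[i] = CombinedDispatch;
    }
    return STATUS_SUCCESS;
}
```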

That actually seems like a good hack. But I’ll be interested in what others, with more device experience, think. This is, after all, the reason why you get given the device object. If you look at the OpenAFS sources they do a similar sort of thing - although the two devices are treated as peers rather than master/slave.

Oh, one other thing: you are aware that a driver is actually a DLL, so you can link one driver against another and just call across?

Obviously this is a recipe for an architectural bodge if you are not disciplined, but…
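Concretely, something like this (a sketch; ZfsDmuRead is an invented name - the main driver exports it, the other driver links against the main driver’s import library, and the loader binds the two like any other DLLs, so the importing driver just has to load after the exporting one):

```c
#include <ntddk.h>

/* In the "main" ZFS driver: export a routine for the other driver to
 * call directly (alternatively, list it in a .def file). */
__declspec(dllexport)
NTSTATUS
ZfsDmuRead(PVOID Dataset, ULONGLONG Offset, ULONG Length, PVOID Buffer);

/* In the ZVOL miniport driver: declare the import and call it like a
 * local function; the linker resolves it via the main driver's .lib. */
__declspec(dllimport)
NTSTATUS
ZfsDmuRead(PVOID Dataset, ULONGLONG Offset, ULONG Length, PVOID Buffer);
```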

Doing it in one driver is architecturally aberrant and will definitely, certainly, and positively cause you problems. There are problems waiting for you in terms of IRQLs, locks, and many other things. The biggest issue is that even if you got it to “work”, you’d have an unsupportable mess that nobody in the Windows driver world would expect or understand.

OK… so… build a file system, and also build a virtual miniport. How best to structure this is the big question. Do you really want your file system to be above the Windows Volume abstraction, like an ordinary file system? If not, perhaps you want a bus driver over the disk class driver? Sorry… I don’t understand enough about what you’re trying to accomplish to be able to provide better advice. But I would suggest a solid architectural foundation before “just writing some code” for this project… Peter

rod_widdowson: " you are aware that a driver is actually a DLL so you can link one driver against another and just call across"

Oh, that’s interesting, and wouldn’t be so bad perhaps.

Peter_Viscarola_(OSR): " I don’t understand enough about what you’re trying to accomplish to be able to provide better advice. But I would suggest a solid architectural foundation before ‘just writing some code’"

I am porting ZFS, originally from Sun Microsystems, and from a Unix POV it is very well designed and has a solid architectural foundation. Personally, I think it would be difficult to find a better-designed filesystem, though I am perhaps biased.

It is difficult to describe how it is designed in this forum, and probably beyond your interests as well.
http://wiki.lustre.org/images/4/49/Beijing-2010.2-ZFS_overview_3.1_Dilger.pdf

ZFS takes all the physical disks you have and puts them into storage pools with whatever redundancy you want (mirror/raidz1-3, plus compression and dedup); this is handled as part of the DMU layer (Data Management Unit). Technically there are two consumers of the DMU. The first is the ZPL (ZFS POSIX Layer) - i.e. the filesystems you mount and use as files and directories - which consumes storage in the pool as “ZFS datasets”, each mounted as a file system.
And then there is ZVOL, which consumes pool space and presents it as a raw device (a virtual disk), to do with as you please (NTFS, FAT, iSCSI export, etc.).

When it came to porting, I started at the very bottom (physical disks on Windows, I/O, etc.) so it could have a pool for the DMU. Then, at the other end, I implemented an FSD for the “ZFS datasets” to be mounted as volumes on Windows.

Note that “ZFS dataset” filesystems only ever live on DMU storage space. They do not go on regular disks “on their own” like FAT/NTFS. If you had a disk you wanted ZFS on, you would create a pool (and hence a DMU) on it, which then gets a “ZFS dataset”, in which you can create your files.

So now I’m working on the second DMU consumer, ZVOLs, which are presented to Windows as virtual disks.

I guess technically I should have done one driver for the physical disks, the pool, and everything up to the DMU, and then two adjacent consumer drivers for the ZPL and ZVOL which talk directly to the DMU and present to Windows as either a mounted “ZFS dataset” FSD volume or a miniport virtual disk.

Windows is unfamiliar to me, and hence I am hoping to get advice from glorious Windows devs on how they would do things, and on what Windows can and cannot do. It has been quite fun to work on this port, and quite a few things in Windows were a big surprise.
https://openzfsonosx.org/wiki/Windows_port

Thanks for that.

When I was talking about architectural design, I was speaking of how the ZFS constellation of features fits into Windows - not ZFS itself (which is well-proven, clearly). I’ve often joked about doing ZFS for Windows, in fact. But I always walked away from the task due to the vast complexity involved (and the lack of a paying market; I sadly do not have the leisure to undertake such a project for the sheer joy of accomplishment). So… I wish you luck!

On Windows, NTFS and FAT do not exist above raw disks either. Rather, the file systems mount on volume instances, created by the Volume Manager. This allows you to pool disks and partitions into RAID or stripe sets. Of course, the most common volume is a Basic volume where one partition = one volume.

Starting at the bottom: I’m not sure how you’ve accomplished this, but… the best way to pool disks in Windows is usually with a virtual StorPort driver (that’s not how VolMgr does it, because VolMgr is old and crufty… but it is how the newer Storage Spaces does it). Having thus pooled your storage units and surfaced a single virtual disk PDO, that PDO can be claimed by… whatever you want moving up the stack from there.

I’d like to see a diagram of the PDO/FDO relationships in the storage stack you’re creating. That’s where the architecture comes in. Remember, when we move things from Linux/Unix to Windows we generally don’t “port”… we redesign and then use common code where we can.

Peter

@“Peter_Viscarola_(OSR)” said:
When I was talking about architectural design, I was speaking of how the ZFS constellation of features fits into Windows - not ZFS itself (which is well-proven, clearly). I’ve often joked about doing ZFS for Windows, in fact.

Amusingly, while doing the OSX port of ZFS, I made the same joke. Then one day I just started - I wanted to see how far I could get. Admittedly it has gone beyond “proof of concept” now.

On Windows, NTFS and FAT do not exist above raw disks either. Rather, the file systems mount on volume instances, created by the Volume Manager. This allows you to pool disks and partitions into RAID or stripe sets. Of course, the most common volume is a Basic volume where one partition = one volume.

Ah yes, that is probably why it took me so many weeks to get MountMgr to assign a letter to my new volume - it felt like a lot of undocumented behaviour to me, but I suspect I just never found the proper documentation.
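For anyone who hits the same wall: before MountMgr will assign a letter, the volume device has to register the MOUNTDEV_MOUNTED_DEVICE_GUID interface and answer the mount-device IOCTLs. A sketch of one of them (VolumeName holding our volume’s device name is an assumption of the sketch):

```c
#include <ntddk.h>
#include <mountdev.h>

/* Sketch: answering IOCTL_MOUNTDEV_QUERY_DEVICE_NAME (METHOD_BUFFERED).
 * IOCTL_MOUNTDEV_QUERY_UNIQUE_ID is handled along the same lines. */
NTSTATUS
QueryDeviceName(PUNICODE_STRING VolumeName,  /* e.g. L"\\Device\\ZfsVolume1" */
                PIRP Irp, PIO_STACK_LOCATION IrpSp)
{
    ULONG outLen = IrpSp->Parameters.DeviceIoControl.OutputBufferLength;
    PMOUNTDEV_NAME name = (PMOUNTDEV_NAME)Irp->AssociatedIrp.SystemBuffer;

    if (outLen < sizeof(MOUNTDEV_NAME)) {
        return STATUS_INVALID_PARAMETER;
    }

    name->NameLength = VolumeName->Length;
    if (outLen < FIELD_OFFSET(MOUNTDEV_NAME, Name) + VolumeName->Length) {
        /* Too small: report how much is needed and let MountMgr retry. */
        Irp->IoStatus.Information = sizeof(MOUNTDEV_NAME);
        return STATUS_BUFFER_OVERFLOW;
    }

    RtlCopyMemory(name->Name, VolumeName->Buffer, VolumeName->Length);
    Irp->IoStatus.Information =
        FIELD_OFFSET(MOUNTDEV_NAME, Name) + VolumeName->Length;
    return STATUS_SUCCESS;
}
```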

Starting at the bottom: I’m not sure how you’ve accomplished this, but… the best way to pool disks in Windows is usually with a virtual StorPort driver (that’s not how VolMgr does it, because VolMgr is old and crufty… but it is how the newer Storage Spaces does it). Having thus pooled your storage units and surfaced a single virtual disk PDO, that PDO can be claimed by… whatever you want moving up the stack from there.

Yes, I always suspected there would be a volume manager - but I made sure to ignore it. I think as a Windows developer you would be tempted to use it, but for me, a Windows ZFS port needs to be able to import “foreign” pools - i.e. those used with Linux, illumos, BSD and OSX. So if you have an n-disk pool using raidz, you cannot group the disks with Windows VolMgr (I assume?), but have to let ZFS open them as plain disks, the way it is used to doing.

That part of the porting was then quite easy. Just open the disk, implement the kernel code to read/write at offset+length, and the rest of the ZFS pool code compiles unmodified, so mirroring, raidz, compression and dedup are done and working. I am happy with that part of the code (there are some minor things I’d like to fix, like I/O completion - I tried DPCs, but you cannot wait inside a DPC, so I have a thread waiting on a completion event, which I think could be made more efficient).
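The core of that read path is roughly this (a sketch; VdevDiskRead is an invented name, and DiskDevice is the device object obtained when the disk was opened):

```c
#include <ntddk.h>

/* Sketch: synchronously read Length bytes at byte Offset from a disk,
 * the way a pool vdev backend might. Must be called at PASSIVE_LEVEL. */
NTSTATUS
VdevDiskRead(PDEVICE_OBJECT DiskDevice, PVOID Buffer,
             ULONG Length, LARGE_INTEGER Offset)
{
    KEVENT event;
    IO_STATUS_BLOCK iosb;

    KeInitializeEvent(&event, NotificationEvent, FALSE);

    PIRP irp = IoBuildSynchronousFsdRequest(IRP_MJ_READ, DiskDevice,
                                            Buffer, Length, &Offset,
                                            &event, &iosb);
    if (irp == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    NTSTATUS status = IoCallDriver(DiskDevice, irp);
    if (status == STATUS_PENDING) {
        /* Waiting here is fine at PASSIVE_LEVEL; it is exactly what a
         * DPC cannot do, hence the dedicated completion thread. */
        KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL);
        status = iosb.Status;
    }
    return status;
}
```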

I am not imagining ZFS on Windows being used for production storage servers; there is already a solution for that. My aim is more cross-platform compatibility: something better than FAT that can be accessed on all platforms, for those of us who run many operating systems and would like to share data between them. Having snapshots and clones on a bootable ZFS dataset would be awesome; there is so much flexibility to be had there for OS upgrades, etc.

At the very least, I hope nobody runs large storage farms with it; there is no way I could do support for that.

The WDKStorPortVirtualMiniport sample was quite easy to work with once I understood where it had temporary code, and I have managed to get a test disk up and running. I have yet to decide whether keeping a copy of StorPort’s dispatch function pointers is too hacky or not.
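For reference, the registration side of the sample boils down to roughly this (an abridged sketch; the MpHw* callbacks and the HW_HBA_EXT per-adapter extension are things you supply, as in the sample):

```c
#include <ntddk.h>
#include <storport.h>

/* Prototypes for the callbacks you implement (virtual-miniport style). */
BOOLEAN MpHwInitialize(PVOID DeviceExtension);
BOOLEAN MpHwStartIo(PVOID DeviceExtension, PSCSI_REQUEST_BLOCK Srb);
ULONG   MpHwFindAdapter(PVOID DeviceExtension, PVOID HwContext,
                        PVOID BusInformation, PVOID LowerDevice,
                        PCHAR ArgumentString,
                        PPORT_CONFIGURATION_INFORMATION ConfigInfo,
                        PBOOLEAN Again);
BOOLEAN MpHwResetBus(PVOID DeviceExtension, ULONG PathId);
SCSI_ADAPTER_CONTROL_STATUS
        MpHwAdapterControl(PVOID DeviceExtension,
                           SCSI_ADAPTER_CONTROL_TYPE ControlType,
                           PVOID Parameters);
VOID    MpHwFreeAdapterResources(PVOID DeviceExtension);

/* Per-adapter state; the real layout is up to you. */
typedef struct _HW_HBA_EXT { ULONG Reserved; } HW_HBA_EXT;

NTSTATUS
RegisterVirtualMiniport(PDRIVER_OBJECT DriverObject,
                        PUNICODE_STRING RegistryPath)
{
    VIRTUAL_HW_INITIALIZATION_DATA hwInitData;

    RtlZeroMemory(&hwInitData, sizeof(hwInitData));
    hwInitData.HwInitializationDataSize = sizeof(VIRTUAL_HW_INITIALIZATION_DATA);
    hwInitData.AdapterInterfaceType     = Internal;      /* virtual adapter */
    hwInitData.HwInitialize             = MpHwInitialize;
    hwInitData.HwStartIo                = MpHwStartIo;   /* SRBs arrive here */
    hwInitData.HwFindAdapter            = MpHwFindAdapter;
    hwInitData.HwResetBus               = MpHwResetBus;
    hwInitData.HwAdapterControl         = MpHwAdapterControl;
    hwInitData.HwFreeAdapterResources   = MpHwFreeAdapterResources;
    hwInitData.DeviceExtensionSize      = sizeof(HW_HBA_EXT);

    return (NTSTATUS)StorPortInitialize(DriverObject, RegistryPath,
        (PHW_INITIALIZATION_DATA)&hwInitData, NULL);
}
```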