Proper steps for removing a virtual volume device in a legacy driver.

I have a legacy (non-WDM/PnP) driver that creates virtual volumes. I send
custom IOCTLs to a private control device to add and remove these devices.
I’ve run into a bug with SR when I do this. I’m trying to figure out if I’m
doing something wrong or if SR.SYS has a bug.

I follow these steps when removing/deleting a virtual volume device:

  1. In a user mode control app I open a handle to my device.
  2. From that app I call FSCTL_LOCK_VOLUME on that handle.
  3. The app then calls FSCTL_DISMOUNT_VOLUME on that handle.
  4. I then call a custom IOCTL that sets a flag in my device that prevents
    anyone from opening a handle or sending I/O to the virtual volume device.
  5. I then close my handle to the device.
  6. I then instruct my driver through a custom IOCTL on my control device
    object (not the volume device, because I can’t get a handle to it any more)
    to delete the device.
  7. In my driver I use a remove lock to keep track of IRPs, so I call
    IoReleaseRemoveLockAndWait() so that I know I have completed all IRPs sent
    to me and won’t allow new IRPs on my device.
  8. I then clean up and delete the device with IoDeleteDevice.
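
The bookkeeping behind step 7 can be modeled roughly as follows. This is a simplified, single-threaded user-mode sketch in portable C. It is not kernel code and not the WDK's IO_REMOVE_LOCK implementation. All names (remove_lock_model, model_acquire, and so on) are hypothetical, and the real IoReleaseRemoveLockAndWait() blocks until outstanding requests drain instead of returning a count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of remove-lock bookkeeping (not the WDK structure). */
typedef struct {
    long io_count;  /* outstanding acquisitions, biased by 1 at init */
    bool removed;   /* set once removal has started */
} remove_lock_model;

void model_init(remove_lock_model *lock) {
    assert(lock != NULL);
    lock->io_count = 1;   /* initial bias, dropped when removal starts */
    lock->removed = false;
}

/* Called at the top of each dispatch routine. Returns false once removal
 * has begun; a real driver would then fail the IRP. */
bool model_acquire(remove_lock_model *lock) {
    assert(lock != NULL);
    if (lock->removed) {
        return false;
    }
    lock->io_count++;
    return true;
}

/* Called when an IRP that acquired the lock has been completed. */
void model_release(remove_lock_model *lock) {
    assert(lock != NULL && lock->io_count > 0);
    lock->io_count--;
}

/* Models the start of removal. The caller must hold one acquisition.
 * Returns the number of requests still outstanding; the real routine
 * would wait here until that number reaches zero. */
long model_begin_remove(remove_lock_model *lock) {
    assert(lock != NULL && lock->io_count >= 2);
    lock->removed = true;
    lock->io_count -= 2;  /* caller's acquisition plus the initial bias */
    return lock->io_count;
}
```

In this model, if one IRP is in flight when removal begins, model_begin_remove() returns 1 and every later model_acquire() fails. A real driver would reach IoDeleteDevice only after that count has drained to zero.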

Will these steps work? Is there something more I need to do?

This works great in most cases, and I ran significant testing on the Windows
Server 2003 SP1 full checked build with Driver Verifier without problems. On
XP SP2 we found a scenario that causes a BSOD. If we have initiated a delete
on our device while another thread is trying to mount a file system on it
(the device has the delete pending flag set), SR.SYS blows up. The
code in SR.SYS looks like it may be flawed to me. In the call stack of a
call to IopMountVolume, SR is invoked and it calls ObQueryNameString() to
get the name of the device object. ObQueryNameString() returns
STATUS_SUCCESS but the UNICODE_STRING in the OBJECT_NAME_INFORMATION passed
to ObQueryNameString() has these contents after the call:

kd> dt -b nt!_OBJECT_NAME_INFORMATION e1decc08
   +0x000 Name : _UNICODE_STRING ""
      +0x000 Length : 0
      +0x002 MaximumLength : 0x3ea
      +0x004 Buffer : (null)

SR (in SrGetObjectName()) then accesses the Buffer in this structure and
tries to null-terminate it by writing a zero word at offset 0, which, of
course, blows up. Is this a bug in SR or ObQueryNameString()? Should
ObQueryNameString() ever return such a UNICODE_STRING and success? If not,
should SR be checking this buffer or its length before accessing it? Did I
cause this with my call to IoDeleteDevice? I know that I’ve called
IoDeleteDevice, because the object always has the delete pending flag set,
and I’ve just seen the trace for this remove.
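
The kind of check being asked about can be sketched as follows. This is a minimal illustration in portable C, assuming (as the dump above suggests) that a successful call can leave Buffer NULL. The struct is a hypothetical stand-in for UNICODE_STRING and model_terminate_name() is an invented name; this is not SR's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for UNICODE_STRING: lengths are in bytes and
 * Buffer holds 16-bit characters. */
typedef struct {
    uint16_t Length;
    uint16_t MaximumLength;
    uint16_t *Buffer;
} name_string_model;

/* Null-terminates the name in place only when that is safe. Returns false
 * when there is no buffer or no room for a terminator after Length bytes. */
bool model_terminate_name(name_string_model *name) {
    assert(name != NULL);
    if (name->Buffer == NULL) {
        return false;  /* the state shown in the dump: nothing to write to */
    }
    if ((size_t)name->Length + sizeof(uint16_t) > name->MaximumLength) {
        return false;  /* terminator would land past the end of the buffer */
    }
    name->Buffer[name->Length / sizeof(uint16_t)] = 0;
    return true;
}
```

With the values from the dump (Length 0, MaximumLength 0x3ea, Buffer NULL), this guard returns false instead of writing through the null pointer.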

In case it helps, the PointerCount in the OBJECT_HEADER for the DO has been
either 0 or 4 in the crashes I’ve inspected and the HandleCount is always 0.
The device object’s reference count varies too. In the last crash
PointerCount, HandleCount and ReferenceCount were all 0.

I’ve also played with the DeviceLock event in the DO. It’s always in a
non-signaled state in the crash (meaning a mount is in progress, right?).
I’ve tried acquiring it, but if I set the event right before my call to
IoDeleteDevice(), I see the crash. If I never set it and just delete the
device, the OS never crashes, but the thread that had requested the mount
hangs.

If there is no good way to do this, I’d like to look into a WDM driver, but
I’m not sure how to handle adding and removing devices on a WDM driver and
my project schedule won’t permit that for this release. If I create a WDM
version in the future, where can I find info on how to manually invoke an
AddDevice or an IRP_MN_REMOVE_DEVICE? Do I have to create a bus driver as
well?

Thanks,

Jonathan Ludwig