Strange fltmc load issue

Last week we went to update our driver and service at a client site,
and we are now stuck in a screwed-up state. Briefly, here is how we got
there and what’s happening:

  1. There was a bug in our service that made it spin and use all of the
    CPU cycles under certain circumstances.
  2. It got killed a couple times because of that.
  3. I found the bug, wrote a fix, and tested it.
  4. We went to uninstall the old version. The uninstaller hung from the
    Add/Remove programs control panel.
  5. After about 5 minutes, we verified that a) the filter was unloaded,
    and b) the service had stopped.
  6. We killed msiexec.exe (probably a bad idea, I know).
  7. Eventually, we got msi out of the state where it thought it was
    still trying to uninstall the product. We did this by repeating the
    uninstall via add/remove after closing and reopening it.

All of this is pretty garden variety stuff so far. When we went to
install the new version, things got weird.

  1. We tried to reinstall. The user-mode service was installed
    successfully, the driver inf file installed successfully, but the driver
    wouldn’t load. Trying to load the driver by hand fails with error
    0x80070002, which the error message says means “The system cannot
    find the file specified.”
  2. We checked the registry. The path is right, and the file exists at
    the path. Everything else looks fine.
  3. We installed and loaded our other filter to see if we could load
    that one successfully. We can. The problem is restricted to
    baddriver.
  4. sc qc baddriver returns sensible looking output that matches the
    contents of the registry.
  5. Repeating the uninstall/install doesn’t get things out of the
    messed-up state.
  6. Neither does logging out and back in.
  7. Uninstalling gets rid of the registry entries it should.
    Installing puts them where they should be. Things are going smoothly
    except for the load.
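
For reference, the 0x80070002 from step 1 can be pulled apart by hand. A
minimal sketch, assuming the standard HRESULT layout (failure bit,
facility field, 16-bit code); decode_hresult and describe are
hypothetical helper names, and the small table of Win32 codes is only
there for illustration:

```python
# Hypothetical helpers for illustration; the bit layout is the standard
# HRESULT format (bit 31 = failure, bits 16-26 = facility, bits 0-15 = code).
FACILITY_WIN32 = 7

# A few Win32 error codes that are easy to confuse when a load fails.
WIN32_NAMES = {
    2: "ERROR_FILE_NOT_FOUND",
    3: "ERROR_PATH_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
}


def decode_hresult(hr):
    """Split a 32-bit HRESULT into its failure bit, facility, and code."""
    hr &= 0xFFFFFFFF
    return {
        "failed": bool(hr & 0x80000000),
        "facility": (hr >> 16) & 0x7FF,
        "code": hr & 0xFFFF,
    }


def describe(hr):
    """Give a short human-readable description of an HRESULT."""
    fields = decode_hresult(hr)
    if fields["facility"] == FACILITY_WIN32:
        name = WIN32_NAMES.get(fields["code"], "unknown Win32 error")
        return "Win32 error %d (%s)" % (fields["code"], name)
    return "facility %d, code %d" % (fields["facility"], fields["code"])


print(describe(0x80070002))  # prints "Win32 error 2 (ERROR_FILE_NOT_FOUND)"
```

In other words, the value is just Win32 error 2 wrapped in an HRESULT,
so by itself it does not say which lookup (registry key, path, or the
binary) came up empty.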

Based on the above, it looks like something isn’t picking up the changes
to the registry when the inf file runs. Or something is screwed up with
how the .sys file in the system32\drivers directory is being found (more
on that in a second).
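
One way to double-check the "path is right, file exists" claim is to
expand the ImagePath value into a full path and compare it with what is
on disk. A rough sketch, assuming ImagePath takes one of the three
shapes I have usually seen (relative to the Windows directory, prefixed
with \SystemRoot\, or prefixed with \??\). resolve_image_path is a
hypothetical helper, and the C:\Windows default is an assumption:

```python
import ntpath


def resolve_image_path(image_path, system_root=r"C:\Windows"):
    """Expand a driver ImagePath registry value to a full Win32 path.

    Hypothetical helper: handles only the three shapes named above.
    system_root is an assumed default, not read from the machine.
    """
    path = image_path.strip().strip('"')
    prefix = "\\SystemRoot\\"
    if path.lower().startswith(prefix.lower()):
        # \SystemRoot\system32\drivers\x.sys -> <root>\system32\drivers\x.sys
        path = ntpath.join(system_root, path[len(prefix):])
    elif path.startswith("\\??\\"):
        # \??\C:\dir\x.sys -> C:\dir\x.sys
        path = path[4:]
    elif not ntpath.isabs(path):
        # system32\drivers\x.sys is taken relative to the Windows directory
        path = ntpath.join(system_root, path)
    return ntpath.normpath(path)


print(resolve_image_path(r"system32\DRIVERS\baddriver.sys"))
```

The result can then be handed to os.path.exists on the affected machine,
run from the same kind of process (32-bit or 64-bit) as the tool that
reported the file as present.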

Trying to reproduce this, I’ve been abusing the msi end of things all
morning: building a version of the service that waits far too long to
stop (10 minutes), building a version that loops forever (for (;;);)
when you try to stop it, and killing the service and msiexec at
inopportune moments. I can’t get a machine into a similarly screwed-up
state. The service sometimes does take up to a minute to stop, which has
caused confusion in the msi stuff before, but never to this degree. None
of these have gotten the problem to appear on a system I can actually
debug and poke at will.

The closest I’ve come to reproducing the scenario we’re seeing at the
customer site is by renaming the driver binary, rebooting, and naming it
back to the correct name. This duplicates the “registry correct, file
exists, driver load fails with 0x80070002” aspect of the problem, but an
uninstall/reinstall still fixes it.

Any ideas?

Thanks,

~Eric

Incidentally, can I submit a bug that 0x80070002 is not a particularly
useful or specific error code? Try “fltmc load sarosh” for instance.
Or try removing the binary for an actual driver. It would be nice if
the error code actually helped me figure out where info wasn’t making it
through.