Performance Regression regarding UNC access

Hello everyone,

We develop a product that uses a minifilter to restrict which files may be accessed, in which way, and on which device. We noticed that performance when accessing files on UNC shares has regressed over the past Windows versions - even without our product, but even more so with our product installed.

We measured startup times of Visual Studio Code from a network share:

  • Without our product:
    • Windows 7: 8.89 s
    • Windows 10 1709: 9.99 s
    • Windows 10 1909: 11.32 s
    • Windows 10 2004: 16.68 s
  • With our minifilter:
    • Windows 7: 9.51 s
    • Windows 10 1709: 28.34 s
    • Windows 10 1909: 40.04 s
    • Windows 10 2004: 34.65 s

Does anyone have any pointers for me?

  • How can I profile the minifilter?
  • Was there any change in Windows that would lead to such a regression? If so, is there a known remedy we could implement to decrease the gap between performance with and without our driver?

Thank you!

My preferred way of profiling something like this is Process Monitor: with and without our filter, and below/above our filter (4 cases, giving us a chance to see if the slowdown is in our driver or not).

So much for SMB3 :)

Agree. Well, with ProcMon you’re not really “profiling” as in timing so much as you’re analyzing what’s going on. The operative question is “Why is it so much slower? What’s different from the unfiltered case?” This is rarely fun.

Peter

No idea about the regression without your filter. To get your numbers down, though, I wouldn't start by trying to close that gap.

You can get timings for your minifilter callbacks with WPR/WPA.

That gives you VERY fine-grained data, though, and it's not really helpful if you don't know what you're looking for.

I’d start by thinking about what my driver does and how it might impact an app launch. At a high level, what does your driver do? Approve/deny based on hash? Name? Do you filter every open? Every read?

One random thing that comes to mind: do you query for normalized names? That can be a killer on the network.
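If that turns out to be the culprit, one option is to ask Filter Manager for the opened name instead of the normalized one, since building a normalized name can require extra directory queries that are expensive over SMB. A minimal sketch, assuming your rules can work with the opened name (the routine name is made up):

```cpp
// Sketch: request the opened name rather than the normalized one; name
// normalization can require additional directory queries, which hurts on SMB.
#include <fltKernel.h>

NTSTATUS
QueryOpenedName(
    _In_ PFLT_CALLBACK_DATA Data,
    _Outptr_ PFLT_FILE_NAME_INFORMATION *NameInfo
    )
{
    // FLT_FILE_NAME_QUERY_DEFAULT lets Filter Manager use its name cache
    // when it is safe to do so.
    NTSTATUS status = FltGetFileNameInformation(
        Data,
        FLT_FILE_NAME_OPENED | FLT_FILE_NAME_QUERY_DEFAULT,
        NameInfo);

    if (NT_SUCCESS(status))
    {
        status = FltParseFileNameInformation(*NameInfo);
        if (!NT_SUCCESS(status))
        {
            FltReleaseFileNameInformation(*NameInfo);
            *NameInfo = NULL;
        }
    }

    return status;
}
```

The caller releases the returned name with FltReleaseFileNameInformation once the rule check is done.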

After lots of research using ProcMon as well as WPT, I have come to the same conclusion. We channel each request through a user-mode process that works like this: first we determine the user name and the executable name (via GetModuleFileNameEx), and then we use these names to check against a list of rules (is the user allowed to open file X using program Y for reading/writing/deleting/…?). If I replace the name-resolution code with code that simply returns constant strings, VSCode starts much faster - and I did not even have to touch the filter driver. That part of the problem seems to be of our own making.
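The executable-name lookup is roughly this shape (a simplified sketch with illustrative names, not our actual code; the user-name part is omitted):

```cpp
// Simplified sketch of resolving the requesting process's image path.
// In the uncached design this runs once per filtered request, so the same
// path is resolved over and over for a busy process.
#include <windows.h>
#include <psapi.h>   // link against Psapi.lib
#include <string>

std::wstring ResolveImagePath(DWORD pid)
{
    std::wstring path;
    HANDLE process = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ,
                                 FALSE, pid);
    if (process != nullptr)
    {
        WCHAR buffer[MAX_PATH];
        DWORD length = GetModuleFileNameExW(process, nullptr, buffer, MAX_PATH);
        if (length != 0)
        {
            path.assign(buffer, length);
        }
        CloseHandle(process);
    }
    return path;
}
```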

I believe our rule evaluation is also not as fast as it could be, but the biggest problem seems to be the lack of a cache for the executable names: resolving the same name for every request is just bound to be slower than necessary.
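A minimal per-PID cache on top of the sketch above might look like this (again just illustrative; PIDs are reused, so a real cache would need eviction, for example driven by a PsSetCreateProcessNotifyRoutineEx callback in the driver):

```cpp
// Per-PID cache for the image path; eviction is intentionally omitted here.
#include <mutex>
#include <string>
#include <unordered_map>
#include <windows.h>

std::wstring ResolveImagePath(DWORD pid); // from the previous sketch

static std::mutex g_cacheLock;
static std::unordered_map<DWORD, std::wstring> g_imagePathCache; // PID -> path

std::wstring ResolveImagePathCached(DWORD pid)
{
    {
        std::lock_guard<std::mutex> guard(g_cacheLock);
        auto it = g_imagePathCache.find(pid);
        if (it != g_imagePathCache.end())
        {
            return it->second; // cache hit: no OpenProcess/GetModuleFileNameEx
        }
    }

    // Slow path: resolve once per process instead of once per request.
    std::wstring path = ResolveImagePath(pid);

    std::lock_guard<std::mutex> guard(g_cacheLock);
    g_imagePathCache.emplace(pid, path);
    return path;
}
```

The user name could presumably be cached the same way, per logon session rather than per request.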

As a side question, how badly does performance suffer if our minifilter simply returns FLT_PREOP_DISALLOW_FASTIO for all fast I/O requests? If Windows 10 relies more on fast I/O, that would be another reason for the regression we are noticing when our driver is installed.
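For reference, the alternative I have in mind is to disallow fast I/O only where we actually need to see the operation on the IRP path, roughly like this (illustrative names; the decision helper is just a placeholder):

```cpp
// Sketch of a pre-operation callback (e.g. for IRP_MJ_READ) that only bounces
// fast I/O when the rules actually require seeing the operation as an IRP.
#include <fltKernel.h>

// Placeholder for the real decision, e.g. based on a stream context attached
// at IRP_MJ_CREATE time. Returning TRUE unconditionally here would reproduce
// the "disallow all fast I/O" behaviour asked about above.
static BOOLEAN
NeedsIrpBasedCheck(
    _In_ PFLT_CALLBACK_DATA Data
    )
{
    UNREFERENCED_PARAMETER(Data);
    return FALSE;
}

FLT_PREOP_CALLBACK_STATUS
MyPreRead(
    _Inout_ PFLT_CALLBACK_DATA Data,
    _In_ PCFLT_RELATED_OBJECTS FltObjects,
    _Flt_CompletionContext_Outptr_ PVOID *CompletionContext
    )
{
    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(CompletionContext);

    if (FLT_IS_FASTIO_OPERATION(Data) && NeedsIrpBasedCheck(Data))
    {
        // The I/O manager will typically retry the operation on the IRP path,
        // where the normal pre/post callbacks get to enforce the rules.
        return FLT_PREOP_DISALLOW_FASTIO;
    }

    return FLT_PREOP_SUCCESS_NO_CALLBACK;
}
```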