Debugging broken named pipes

Hello,

I have an application that uses named pipes for parent/child process communication. When a scenario reports back a broken named pipe, I typically see in WPA's minifilter view that the installed antivirus agent is inspecting the named pipe create event and taking longer than average to do so. Adding exclusions seems to remediate these scenarios in most cases. But for the scenarios where it doesn't, is there any special method for debugging named pipe operations?

Maybe a set of ETW providers or the like that I can subscribe to in WPR to see what's happening? Getting it recorded in an ETL file hasn't been that difficult, but actually getting to root cause is not something I've been able to do; it's just more a 'yeah that looks like it took too long, add exclusions' and then the problem is gone in 8/10 cases or so.

Thanks for your time. I did some searching and saw some posts from waaaay back saying named pipes aren't inspected by minifilters, but I am pretty sure that's what I am seeing in WPA's minifilter view.

Thanks,
Jeff Stokes

Named pipe filtering was added to minifilters starting in Windows 8. See FLTFL_REGISTRATION_SUPPORT_NPFS_MSFS.

As for debugging these problems... Do the problems go away if you disable the A/V? If yes, then you're really looking for how to debug the A/V product, which is going to take a kernel debugger and a lot of patience lol. Better off filing a bug report unless this is just for amusement/learning.

Ah thanks Scott for that link!

In most cases, maybe all, yes. It's the 'how do I prove to the vendor that they cause a problem' part that I was hoping to check the box on. But you're right, there's only so much I can do and see with a closed source security product at play.

Thanks for your response here,
Jeff

"...just more a 'yeah that looks like it took too long, add exclusions' and then the problem is gone in 8/10 cases or so"

So even after adding exclusions and rebooting the system, you see the problem in 2 out of 10 cases?

To confirm whether it is indeed timing related, you could artificially induce a delay in that scenario. I am not sure whether your product already has some filter or intercepting module.
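As a rough user-mode illustration of the delay idea, here is a minimal Python sketch. It does not touch named pipes or filter drivers: it uses an ordinary stdout pipe as a stand-in, and the function name `run_with_injected_delay`, the child script, and the "ready" message are all hypothetical. The point is only to show how an injected delay can be checked against the parent's timeout to see whether the failure reproduces.

```python
import subprocess
import sys


def run_with_injected_delay(delay_s: float, timeout_s: float) -> str:
    """Start a child that waits delay_s before answering; report what the parent saw."""
    # Hypothetical child: sleeps to simulate a slow pipe open, then writes one line.
    child_code = (
        "import sys, time\n"
        f"time.sleep({delay_s!r})\n"
        "sys.stdout.write('ready\\n')\n"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        # Wait for the child's output, but only as long as the parent would tolerate.
        out, _ = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()  # reap the child and close the pipe
        return "timeout"
    return "ok" if out.strip() == "ready" else "unexpected"
```

If the real failure shows up once the injected delay crosses the parent's timeout, that would support the timing theory. If it never shows up, the cause is probably something else.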

Alternatively, as already pointed out, you can try inspecting what the AV is doing, if you are handy with the kernel debugger and reading some assembly code.

Hi there!

Yes. In the other cases I've seen this even with the AV's filter driver removed: another filter driver from a specific 3rd party was installed that hooked into us and inspected our named pipe creation.

I think it is also possible that in some cases the child process is being closed by antivirus, so I'm looking at the gflags route from a "who killed my child process" perspective (https://techcommunity.microsoft.com/t5/ask-the-performance-team/what-killed-my-process/ba-p/375329).

After thinking on it some, I realize that monitoring the named pipe itself probably isn't going to tell the whole story of what is happening to our child processes, in some or even most cases. It may just be that the pipe is broken because the child was terminated, crashed, or responded in a way that our tooling didn't like.
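To make that concrete, here is a minimal Python sketch of looking at the child's exit status once the pipe closes, rather than only at the pipe. It is not our actual tooling: `classify_child_exit` is a hypothetical name, and an ordinary stdout pipe stands in for the named pipe.

```python
import subprocess
import sys


def classify_child_exit(cmd: list) -> str:
    """Read the child's pipe to EOF, then report how the child ended."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # read() returns at EOF, which typically means the child's end of the pipe is gone.
    proc.stdout.read()
    proc.stdout.close()
    code = proc.wait()
    if code == 0:
        return "clean exit"
    if code < 0:
        # POSIX only: a negative return code means the child was killed by a signal.
        return f"killed by signal {-code}"
    # On Windows a crash usually surfaces as an NTSTATUS value such as 0xc0000005.
    return f"exited with code {code:#x}"
```

For example, `classify_child_exit([sys.executable, "-c", "import sys; sys.exit(3)"])` returns `"exited with code 0x3"`. An exit code alone won't say who terminated the child, which is where the gflags approach comes in.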

I appreciate your thoughts on this and your comment.