Hello!
For a long time I have been trying to decide which approach is preferable for developing a driver that intercepts certain information in the kernel (network and filesystem) and processes it (allowing or denying access):
1) place all processing logic in a user-mode application (using inverted call and pended IRPs), or
2) place all logic in the kernel driver?
The first approach seems (IMHO) more flexible and scalable, but might the second approach be much faster and more stable?
Which is preferable on a modern system?
There are indeed a few approaches you can take here, but it ultimately depends on what, or how much, you can process in kernel mode.
For example, if you want to decide whether to allow or deny some binary based on whether it is signed by a particular publisher, then you will definitely need a user-mode call as well. That does not mean the security-model logic cannot all reside in KM, with only a call into user mode to ask for the Authenticode information. So it comes down to your security model and how it works.
If your product more closely resembles the old-school AV model, then maybe you should go with how the AVs work: a UM service as the core, with the KM driver just an extension of that service, calling into it to deny or allow actions.
On the other hand, if you plan to make something closer to an application-control/whitelisting/policy driver, then I would argue it makes more sense to design your security model so that the logic lives in the kernel and UM is the extension (you call into UM from time to time for the Authenticode work, say, and then cache the result in KM).
Make sure you have the fundamentals figured out first; based on that, it should be easy to see which of the two is more desirable.
Regards,
Gabriel
www.kasardia.com
Windows Kernel Driver Consulting
Gabriel, thank you very much!
I want to implement (as so many people have done!) a real-time filesystem filtering platform which provides: allowing/denying file operations by some criteria, modifying file data, and backing up file data.
If I put the main logic in UM (via inverted call), it's more flexible and scalable, but what about performance? If I put the main logic in KM, some filtering rules are very hard to implement in the kernel…
In the WDK I found the "AvScan File System Minifilter Driver" sample, which places the main logic in UM, but the Windows Filtering Platform performs all the major actions (allow, block, inspect, in the generic case) in kernel mode…
Which approach do advanced data-filtering systems use?
Lots of people have done what you describe. At OSR we’ve designed many such systems for folks in the past ten years.
There’s no question: You want the logic to be in user mode. Putting the logic in kernel-mode doesn’t make it any faster. The I/O path is well optimized, so inverted call (or the Minifilter equivalent) is reasonably efficient.
Don’t prematurely optimize. If you want a speed-up later in the project, you might be able to do something like cache results for previously validated actions, avoiding any rules interpretation at all.
But don’t waste even a minute considering this implementation 100% in kernel mode. You’ll regret the decision later when it comes to maintenance.
Peter
OSR
@OSRDrivers
Peter, thank you for excellent explanation!
But in this post:
https://www.osronline.com/showthread.cfm?link=266912
Tim Roberts wrote:
"You don't have time for all of the user/kernel transitions that are required to keep up with a full modern network pipe. The Windows Filtering Platform is split between user-mode and kernel-mode so that it can keep up."
This remark confuses me… maybe I will face a critical performance issue.
In addition, Gabriel wrote in this thread that logic in UM is the old-school approach.
I am trying to understand the best modern solutions for data filtering.
I said old-school AV implementation. That does not mean it is not used today in modern security products.
The bottom line is that you must first know what you want from your product; if you are worried about performance, you can benchmark it (you always have the ADK) and see what works better. Just simulate the data, simulate inverted calls, and so on.
Can you make everything work in KM? That is the question. Is it doable? I just have the feeling that you are not 100% sure of how your security model works. How do you gather your information? What do you compare the information against? How do you update the information you compare it against? Is your driver going to enforce this during boot? How does the security get enforced? Take this complex equation with you, create a proof-of-concept driver similar to the AvScan sample, do the "work" both in KM and in UM, and come up with the best design for you. Personally, I have experience with both, and you can make either work, but it really depends on how you plan to enforce security and what your model looks like. You need to have a crystal-clear picture there.
Cheers,
Gabriel
www.kasardia.com
Well, I can’t answer for Mr. Roberts. He’s a smart and very experienced dev who knows what he’s talking about. But I do know that he was answering in the context of filtering network packets. I was answering in the context of FS operations. But even in the context of networking, think through the design. The question rapidly becomes “What, exactly, do you need to examine?”
A 100Gb Ethernet link is “pretty fast.” If you have to hold all that data while you wait for user-mode to examine it, that’s probably not going to work too well… right? But I’d suggest that in most cases, trying to decode every received buffer and examine the data in every packet probably isn’t necessary. Rather, you probably want to look at the protocol level and examine connect requests and such (UDP being a bit more of an issue, but… there, I assume you’re concerned with the ports being used). Hey, there are existing models for this that work well. You mentioned WFP, for example.
It’s similar for file system operations. WHAT are you examining? Open requests? Well, dude… you have all the time in the world, right?
Also, bear in mind that the cost of “ring transitions” isn’t NEARLY what it used to be. Back in the day, this was an EXPENSIVE activity. Lots of devs worried over this cost. Today, it’s neither time nor resource intensive. In fact, it’s pretty damn efficient. The *real* cost (in terms of latency) is the cost of the thread switch. But even there, on commodity processors, if you take some measurements, I think you’d be very surprised to see how little time it takes *on average* to get data back from kernel-mode to user-mode and for the receiving thread to process that data. I’ve measured (yes, actually measured) times in the area of 100usec (from a DPC getting the timestamp to a user-mode service running and getting a timestamp).
And one final caveat: An answer in an Internet forum is not a complete architectural evaluation. It doesn’t involve the kind of time, thought, care, and consideration of various potential trade-offs that a multi-day design session would entail. Rather, it’s an off-hand “here’s my rule of thumb” kind of guidance. Like MSDN, forum answers provide “general guidance” that fits MOST situations. The answers I give here are designed to describe what I have found to be “best practice” in my experience. Could your situation be unique and require something different? Sure.
Hope all that helps,
Peter
OSR
@OSRDrivers
Gabriel, Peter, thank you very much for the discussion and explanations!
I will try placing the logic in UM and measure the performance.
Thank you, guys!