Embedded database

Hi,

Do you know of an embedded database that can be used in the kernel? Sometimes I need to load and store data, such as rules for a firewall, and I would like to know whether I can avoid creating a custom format for it.

Kind regards,
Mauro.

How much data are you planning to store? Configuration is commonly stored in registry keys. The registry is a hierarchical database designed for this purpose.

Hi @MBond2, maybe thousands of rows. Although it would fit in memory, I want to know whether there is some lightweight database that can be used in kernel mode.

For example, SQLite with its VFS routines changed to use the kernel API.

That is one of the stupidest ideas I’ve heard in my almost 50 years of system programming. I thought I would never ask this again, but what firm and product are you working for, so we can be sure to tell everyone to boycott it.

IMHO the more appropriate solution in this case is to use a user-mode helper service.
But “rules for a firewall” may be the problem. How early in the boot process does your “firewall” load, and so need this info from the DB?

That is one of the stupidest ideas I’ve heard in my almost 50 years of system programming.

That, quite literally, made me LOL. Thanks for that, Mr. Burn. I needed a good laugh today.

Of course, Mr. Burn is entirely correct. What you’re asking is beyond silly. And I have to also say that Mr. SweetLow has suggested exactly the right solution here: Send whatever you need done off to a user-mode helper service. That’s the proper way to do this in Windows, full stop.

If you can put up with the overhead, send it to user mode. If you can’t, store it in memory.

Peter

Thousands of rows – at, let’s say, 1 KB per row, that would be something like 1 MB to maybe 10 MB of memory.

To provide context, I picked this one at random from the Intel website

https://www.intel.com/content/www/us/en/products/processors/xeon/scalable/platinum-processors/platinum-8268.html

It was launched about a year ago and supports 1 TB of RAM. On the chip itself, there is about 35 MB of L3 cache.

Do you think that the total data that you need to store will exceed this? Many systems would not be so powerful, but the idea is to provide context about orders of magnitude.

With all respect, it seems that three different things are mixed here:

  1. load/store the data (persistence)
  2. manipulating the data in RAM (search, update…), especially from multiple threads.
  3. synchronizing the data with usermode

Unless (3) is needed, the data can be stored in a binary file in a format convenient for (2).
Reading and writing a single file from kernel mode is not so hard.
If the database is created with non-trivial user-mode tools, it can be converted to a format convenient for (2).
One important point to take care of is the integrity of the data file that goes into the kernel - creative folks can tamper with it… you know.
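As a rough illustration of the file-based approach described above, here is a minimal sketch in portable C (plain user-mode code, not driver code). Everything in it is an assumption made for the example: the `RuleFileHeader` layout, the `RLF1` magic, the choice of a sorted array of IPv4 addresses as the "format convenient for (2)", and the FNV-1a checksum. The file is assumed to be written and read on machines with the same byte order. Note that an unkeyed checksum only catches accidental corruption; protecting against deliberate tampering would need a signature or keyed MAC plus restrictive permissions on the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RULE_MAGIC   0x31464C52u  /* hypothetical tag, "RLF1" when stored little-endian */
#define RULE_VERSION 1u

/* Hypothetical on-disk header; the sorted IPv4 entries follow it directly. */
typedef struct {
    uint32_t magic;
    uint32_t version;
    uint32_t count;     /* number of 32-bit entries */
    uint32_t checksum;  /* FNV-1a over the entry bytes */
} RuleFileHeader;

/* 32-bit FNV-1a hash. */
static uint32_t fnv1a(const uint8_t *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Producer side (the user-mode converter): serialize already-sorted
 * addresses into buf. Returns bytes written, or 0 if buf is too small. */
static size_t rules_build(uint8_t *buf, size_t cap,
                          const uint32_t *ips, uint32_t count)
{
    assert(buf != NULL);
    size_t body = (size_t)count * sizeof(uint32_t);
    if (cap < sizeof(RuleFileHeader) || body > cap - sizeof(RuleFileHeader))
        return 0;
    if (body > 0)
        memcpy(buf + sizeof(RuleFileHeader), ips, body);
    RuleFileHeader h = { RULE_MAGIC, RULE_VERSION, count,
                         fnv1a(buf + sizeof(RuleFileHeader), body) };
    memcpy(buf, &h, sizeof h);
    return sizeof h + body;
}

/* Consumer side: validate the image and copy the entries into out.
 * Returns 1 on success, 0 if the image is rejected (out is then unspecified). */
static int rules_load(const uint8_t *buf, size_t len,
                      uint32_t *out, uint32_t cap, uint32_t *count_out)
{
    RuleFileHeader h;
    if (buf == NULL || len < sizeof h)
        return 0;
    memcpy(&h, buf, sizeof h);                 /* no alignment assumptions */
    if (h.magic != RULE_MAGIC || h.version != RULE_VERSION)
        return 0;
    if (h.count > cap)
        return 0;
    if ((len - sizeof h) % sizeof(uint32_t) != 0 ||
        (len - sizeof h) / sizeof(uint32_t) != h.count)
        return 0;                              /* length must match count exactly */
    size_t body = (size_t)h.count * sizeof(uint32_t);
    if (fnv1a(buf + sizeof h, body) != h.checksum)
        return 0;
    if (body > 0)
        memcpy(out, buf + sizeof h, body);
    for (uint32_t i = 1; i < h.count; i++)
        if (out[i - 1] >= out[i])
            return 0;                          /* must be strictly ascending */
    *count_out = h.count;
    return 1;
}

/* Binary search over the validated, sorted entries. */
static int rules_contains(const uint32_t *rules, uint32_t count, uint32_t ip)
{
    uint32_t lo = 0, hi = count;               /* half-open range [lo, hi) */
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        if (rules[mid] == ip)
            return 1;
        if (rules[mid] < ip)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

The point of the design is that the consumer never trusts the file: length, count, checksum, and ordering are all checked before the first lookup, so a malformed image is rejected instead of driving an out-of-bounds read.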

More than a few KB of binary data in the registry is possible, but it looks like abuse of the registry.

Wikipedia has this on embedded in-memory databases: https://en.wikipedia.org/wiki/List_of_in-memory_databases
All these probably are overkill for the OP’s goal.

Regards,
– pa

@Don_Burn said:
That is one of the stupidest ideas I’ve heard in my almost 50 years of system programming. I thought I would never ask this again, but what firm and product are you working for, so we can be sure to tell everyone to boycott it.

Hey, take it easy. Handling HTTP requests in the kernel might also sound stupid, but now you have http.sys.

I was thinking about how to develop a content filter and block sites. Sometimes you have IP addresses, and other times you have just hostnames whose IPs must be periodically updated.

@“Peter_Viscarola_(OSR)” Of course I can communicate with a user-mode service that does the hard work (actually I do so) but, if the service is not loaded for some reason, or some app makes a request before my service loads, there is a chance of missing the request. This is the trade-off I’m evaluating.

As @Pavel_A said, I may have an in-memory list and persist it in a custom binary format (I do this in other apps), but I don’t know how big the list can become. Currently I have a database with 2 million entries, and although all of them would fit in memory, I’m not sure that is a good idea. I think it is better to have some cached entries.

Regards,
Mauro.

Handling HTTP requests in the kernel might also sound stupid, but now you have http.sys.

I don’t remember anyone here saying that’s a good design. Just cuz MSFT did it, doesn’t mean it’s a good idea.

if the service is not loaded for some reason

What reason would that be? Because it died? Make it auto-restart? Because somebody stopped it? Well, you have to decide if that’s allowed. In general, this is a tractable problem.

some app makes a request before my service loads

Yeah? So, a user-mode app that’s started before your service? That seems “pretty unlikely”. Again, the problem is tractable: perhaps deny everything until your service loads. If this is a problem, you’ll find out pretty quickly whether there is anything you need to add to your “allowed” list.
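The "deny everything until your service loads" idea, together with a small "allowed" list, could look roughly like the sketch below. It is ordinary portable C using C11 atomics, not driver code; a real driver would more likely use the kernel's interlocked primitives. The names `FilterState`, `filter_decide`, and the `boot_allow` list are hypothetical, and the `service_verdict` parameter simply stands in for whatever answer the helper service would return.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { VERDICT_DENY = 0, VERDICT_ALLOW = 1 } Verdict;

/* Hypothetical filter state shared by all lookups. */
typedef struct {
    atomic_bool     service_ready;     /* true once the helper service is connected */
    const uint32_t *boot_allow;        /* small static list used before that */
    uint32_t        boot_allow_count;
} FilterState;

static void filter_init(FilterState *s, const uint32_t *allow, uint32_t n)
{
    assert(s != NULL);
    atomic_init(&s->service_ready, false);
    s->boot_allow = allow;
    s->boot_allow_count = n;
}

/* Called when the helper service connects or goes away. */
static void filter_service_connected(FilterState *s)
{
    atomic_store(&s->service_ready, true);
}

static void filter_service_lost(FilterState *s)
{
    atomic_store(&s->service_ready, false);
}

/* While the service is unavailable, fall back to default deny, except for
 * addresses on the static allow list. Otherwise use the service's answer. */
static Verdict filter_decide(FilterState *s, uint32_t ip, Verdict service_verdict)
{
    if (atomic_load(&s->service_ready))
        return service_verdict;
    for (uint32_t i = 0; i < s->boot_allow_count; i++)
        if (s->boot_allow[i] == ip)
            return VERDICT_ALLOW;
    return VERDICT_DENY;
}
```

With this shape, a crashed or stopped service degrades to "deny unless explicitly allowed" rather than to "miss the request", which is the trade-off under discussion.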

Yeah… You could write, and test, and debug, a very complex mechanism to handle database lookups from kernel mode… or you could, you know, just shuttle it off to user-mode, like Gxd intended.

Your money, your choice…

Peter

AFAIK these days IIS hands most requests off to out of process C# code in UM. http.sys will serve static content, but how many websites or REST APIs have a significant amount of that?

When this design decision was made, Microsoft was looking to break speed records on ~500 MHz processors, and the idea of eliminating the ‘hairpinning’ of the request to ReadFile via UM sort of made sense. Today, the HTTP GET probably results in one or more calls to SQL Server, or computations equivalent to calculating the first 500 digits of pi or something like that, instead of just reading a file and providing the content back.

Sometimes you have IP addresses, and other times you have just hostnames whose IPs must be periodically updated.

Well, this is a different sort of problem - not just retrieving data from a database or disk file.
Are you going to cache the resolved IPs? Update the whole database proactively? What if an application somehow gets the address of a bad site by some means other than a DNS request that you can detect?
And a more general question - why roll your own when a lot of products exist for such filtering?

Currently I have a database with 2 million entries, and although all of them would fit in memory, I’m not sure that is a good idea. I think it is better to have some cached entries.

How about the cache in non-pageable memory and the “cold” data in paged pool?
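One way to picture this hot/cold split is the sketch below. It is plain user-mode C that only models the lookup logic: the `HotCache` struct stands in for the small structure that would live in non-pageable memory, and `ColdStore` for the large sorted table that would live in paged pool. The sizes, names, and the direct-mapped hashing scheme are all assumptions made for the example. The sketch is single-threaded; a driver would need locking, and because paged pool may only be touched at IRQL below DISPATCH_LEVEL, a miss seen at elevated IRQL would have to be deferred rather than resolved inline as it is here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HOT_SLOTS 256u   /* hypothetical size; must be a power of two */

/* Hot tier: small direct-mapped cache (would be non-pageable in a driver). */
typedef struct {
    uint32_t key[HOT_SLOTS];
    uint8_t  blocked[HOT_SLOTS];
    uint8_t  valid[HOT_SLOTS];
} HotCache;

/* Cold tier: full sorted table (would be pageable in a driver). */
typedef struct {
    const uint32_t *sorted;
    uint32_t        count;
} ColdStore;

static void hot_init(HotCache *h)
{
    assert((HOT_SLOTS & (HOT_SLOTS - 1u)) == 0);  /* power-of-two check */
    memset(h, 0, sizeof *h);
}

/* Multiplicative hash reduced to a slot index. */
static uint32_t slot_of(uint32_t ip)
{
    return ((ip * 2654435761u) >> 16) & (HOT_SLOTS - 1u);
}

/* Binary search over the cold table. */
static int cold_contains(const ColdStore *c, uint32_t ip)
{
    uint32_t lo = 0, hi = c->count;               /* half-open range [lo, hi) */
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        if (c->sorted[mid] == ip)
            return 1;
        if (c->sorted[mid] < ip)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}

/* Returns 1 if ip is blocked, 0 otherwise. *hot_hit reports whether the
 * hot tier answered. Misses consult the cold tier and then populate the
 * slot, so both positive and negative answers are cached. */
static int lookup(HotCache *h, const ColdStore *c, uint32_t ip, int *hot_hit)
{
    uint32_t s = slot_of(ip);
    if (h->valid[s] && h->key[s] == ip) {
        *hot_hit = 1;
        return h->blocked[s];
    }
    *hot_hit = 0;
    int blocked = cold_contains(c, ip);
    h->key[s] = ip;
    h->blocked[s] = (uint8_t)blocked;
    h->valid[s] = 1;
    return blocked;
}
```

Caching negative answers matters for a content filter, since most traffic is presumably to addresses that are not on the block list. If a service updates the cold table, the hot tier would also need invalidating, which this sketch leaves out.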

– pa

Yeah? So, a user-mode app that’s started before your service?

Actually, I was thinking of another service.

And a more general question - why roll your own when a lot of products exist for such filtering?

A customer was asking for it and I was doing some research, including the database stuff.

How about the cache in non-pageable memory and the “cold” data in paged pool?

Yes, I’ll end up with this approach, letting a service do the updates, etc.

Honestly speaking, I thought other products would do more operations on the kernel side for several reasons, mainly when we talk about a security product, but it seems that is not the case.

Regards,
Mauro.

I thought other products would do more operations on the kernel side for several reasons

No. One of my “general rules” that Anton dislikes so much is “never do anything in the kernel that can be done just as well in user mode.” Kernel code runs at the exact same speed as user code, but development and debugging are much harder, the risk of doing damage is much higher, and of course the cost of a failure is much higher.

One of my “general rules” that Anton dislikes so much is “never do anything in the kernel that can be done just as well in user mode.”

Hahaha, I tend to be on Anton’s side. Many times I have had to stop myself.

Kernel code runs at the exact same speed as user code.

A bit off-topic: what about doing too much context switching? It may not be a concern nowadays, but I remember that in the past, kernel ↔ user mode transitions were costly.

context switches are costly - do it all in UM to avoid them :wink:

One of ~~my “general rules” that Anton dislikes so much~~ the well-known best practices for Windows driver development is “never do anything in the kernel that can be done just as well in user mode.”

Otherwise known as “do as little in kernel-mode as possible”. This. Absolutely. All day long.

Peter

> @MBond2 said:
> context switches are costly - do it all in UM to avoid them :wink:

Unfortunately, that is not always possible.

> @“Peter_Viscarola_(OSR)” said:
> One of ~~my “general rules” that Anton dislikes so much~~ the well-known best practices for Windows driver development is “never do anything in the kernel that can be done just as well in user mode.”
>
> Otherwise known as “do as little in kernel-mode as possible”. This. Absolutely. All day long.
>
> Peter

Good to know. I always try, but sometimes doubts appear. Thanks.