Is there any problem with interlocked operations between kernel and application?

Hello. This community has been very helpful to me.
May I ask a question?

I have succeeded in sharing memory between the kernel and an application by following the article below. The memory is shared via an MDL: an MDL describing a single physical buffer is created, and virtual addresses mapped through that MDL are handed to both the kernel and the application.

http://www.osronline.com/article.cfm?article=39.htm

The shared memory holds a structure that includes variables used with interlocked operations. A thread runs in the kernel and another in the application, and both read and update those variables via interlocked operations.
Sharing the memory itself works without any problem. My question is whether the interlocked operations themselves also work correctly.
Is there any problem with interlocked operations on memory shared between kernel and application?
That is my question.

You’re already doing interlocking in the kernel. If that’s working, then that answers your question, doesn’t it?

Specifically, how are you doing the interlocking?

@Tim_Roberts
Thanks for the comment.

My setup shares memory between the kernel and the application, and inside that memory there is a volatile __int64 variable that is incremented and decremented with interlocked operations. I was wondering whether interlocked operations behave the same way on a memory region shared between kernel and user mode. I believe they do, but I wanted to know whether there are any known problems.

You still didn’t say how you are doing the interlocking. If you are using InterlockedIncrement and InterlockedDecrement, then the answer should be clear. Those APIs use special CPU instructions that guarantee the operation is done atomically, meaning that it can’t be interrupted, and the instructions it generates are the same in either mode. So yes, it works as you expect.


@Tim_Roberts

Thank you so much for your comment.
In the kernel, InterlockedCompareExchange64 is used to assign a specific value; multiple kernel threads change the value at the same time, and the application reads it through InterlockedCompareExchange64 as well, then performs some action based on it.

Your problem is not in the interlocked operations. It is with your English translation, I’m sorry to say.

Tim already answered your question, but your translator did not translate it properly.

I think once you understand what those operations do, you’ll see why it’s not a problem.

InterlockedCompareExchange64 (usually) compiles inline to a single CPU instruction. The processor itself ensures that no other core can touch that memory location in the middle of that instruction: the compare and the exchange are guaranteed to happen without interference.

So, IF your communication scheme is such that InterlockedCompareExchange64 is enough to ensure that everyone plays nicely, then it doesn’t matter whether the players are user or kernel. It is the PROCESSOR doing the interlocking.

The answer to your specific question is no, there is no difference in the way that interlocked operations behave based on privilege level (ring 3 vs ring 0). The CPU will execute them using the same coherency protocol implied by the LOCK prefix, and that is sufficient to guard access to a value from multiple threads running in both modes.

BUT, it must be said that this is an inherently insecure design. There is no guarantee whatsoever that the UM (less trusted) code will actually access this variable using interlocked access. If that causes the UM code to malfunction and crash, well that’s okay. But if that causes the KM (more trusted) code to malfunction, then that is a serious problem. Without looking at your code at all, or knowing anything about what you are doing, in this context use of InterlockedCompareExchange probably implies that the latter is possible.

If this is a test program, a learning exercise, or for use on a closed system, you can ignore this issue. If this is intended for a general-purpose, mass-produced device, you will need to address this problem, as no one likes software that introduces new security vulnerabilities.


@Tim_Roberts said:
You still didn’t say how you are doing the interlocking. If you are using InterlockedIncrement and InterlockedDecrement, then the answer should be clear. Those APIs use special CPU instructions that guarantee the operation is done atomically, meaning that it can’t be interrupted, and the instructions it generates are the same in either mode. So yes, it works as you expect.

Thanks for your comments. Your comments helped me a lot.

@MBond2 said:
The answer to your specific question is no, there is no difference in the way that interlocked operations behave based on privilege level (ring 3 vs ring 0). The CPU will execute them using the same coherency protocol implied by the LOCK prefix and it is sufficient to guard access to a value from multiple threads running in both.

BUT, it must be said that this is an inherently insecure design. There is no guarantee whatsoever that the UM (less trusted) code will actually access this variable using interlocked access. If that causes the UM code to malfunction and crash, well that’s okay. But if that causes the KM (more trusted) code to malfunction, then that is a serious problem. Without looking at your code at all, or knowing anything about what you are doing, in this context use of InterlockedCompareExchange probably implies that the latter is possible.

If this is a test program, a learning exercise, or for use on a closed system, you can ignore this issue. If this is intended for a general-purpose, mass-produced device, you will need to address this problem, as no one likes software that introduces new security vulnerabilities.

Thanks for your comments. Your comments helped me a lot.

@MBond2 said:
There is no guarantee whatsoever that the UM (less trusted) code will actually access this variable using interlocked access.

Well, the whole thing sounds pretty much like a tightly-coupled app-driver pair, so they are, apparently, following some protocol that is well understood by both sides. OTOH, I think it would probably be better to use separate counters for the writer and the reader, with the writer's counter being atomically modified by the driver and read by the app (I assume the driver and the app are, respectively, the data producer and consumer). Besides this, the app maintains its own counter that it unilaterally modifies, so that it can judge data availability simply by comparing the two counters. To make it even easier, one can size the buffer as a power of 2 - as long as the buffer size is a power of 2, the counters wrap around automatically when they exceed the buffer size.

Anton Bassov

Anton, that is clearly true. It could not possibly work if the application did not modify the memory in exactly the way that the driver expected - which is exactly the security problem I am pointing out.

If this is a closed system, then this could be just fine. If this is for general distribution, then it is clearly a serious design flaw.

Your proposal is probably better, but the same security issue remains.