Is there any problem with interlocked operations between kernel and application?

iamupd Member Posts: 8
edited March 23 in NTDEV

Hello. This community has been very helpful to me.
May I ask a question?

I have succeeded in sharing memory between the kernel and an application by following the article below. The memory is shared via an MDL: an MDL describing a single physical buffer is created, and virtual addresses for that buffer are mapped and handed to both the kernel and the application through the MDL.

http://www.osronline.com/article.cfm?article=39.htm

The shared memory holds a structure, and some of its fields are used with interlocked operations. Threads run in both the kernel and the application, and values are changed and read using interlocked operations.
Sharing the memory between kernel and application works without problems. The question is whether the interlocked operations also work without any problem.
Is there any problem with interlocked operations between kernel and application?
That is my question.
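
For readers following along, a minimal sketch of the kernel side of the approach described in that article might look like this. This is not the article's actual code; the structure, function name, and pool tag are illustrative, and error handling is abbreviated:

    //
    // Sketch of the kernel side: allocate a nonpaged buffer, describe it
    // with an MDL, and map the same pages into the requesting process.
    // Must run in the context of that process (e.g., in an IOCTL handler).
    //
    #include <ntddk.h>

    typedef struct _SHARED_DATA {
        volatile LONG64 Counter;    // updated with Interlocked* from both sides
    } SHARED_DATA, *PSHARED_DATA;

    NTSTATUS
    CreateUserMapping(
        _Out_ PSHARED_DATA *KernelVa,   // kernel-mode view
        _Out_ PVOID *UserVa,            // user-mode view, valid in the caller's process
        _Out_ PMDL *Mdl
        )
    {
        PSHARED_DATA buffer;

        buffer = ExAllocatePoolWithTag(NonPagedPoolNx, sizeof(SHARED_DATA), 'rhSD');
        if (buffer == NULL) {
            return STATUS_INSUFFICIENT_RESOURCES;
        }
        RtlZeroMemory(buffer, sizeof(SHARED_DATA));

        *Mdl = IoAllocateMdl(buffer, sizeof(SHARED_DATA), FALSE, FALSE, NULL);
        if (*Mdl == NULL) {
            ExFreePoolWithTag(buffer, 'rhSD');
            return STATUS_INSUFFICIENT_RESOURCES;
        }
        MmBuildMdlForNonPagedPool(*Mdl);

        //
        // For UserMode access, MmMapLockedPagesSpecifyCache raises an
        // exception if the pages cannot be mapped, so wrap it in try/except.
        //
        __try {
            *UserVa = MmMapLockedPagesSpecifyCache(*Mdl, UserMode, MmCached,
                                                   NULL, FALSE, NormalPagePriority);
        } __except (EXCEPTION_EXECUTE_HANDLER) {
            IoFreeMdl(*Mdl);
            ExFreePoolWithTag(buffer, 'rhSD');
            return GetExceptionCode();
        }

        *KernelVa = buffer;
        return STATUS_SUCCESS;
    }

The user-mode virtual address would then typically be returned to the application through an IOCTL, after which both sides see the same SHARED_DATA.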

Comments

  • Tim_Roberts Member - All Emails Posts: 13,496

    You're already doing interlocking in the kernel. If that's working, then that answers your question, doesn't it?

    Specifically, how are you doing the interlocking?

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • iamupd Member Posts: 8

    @Tim_Roberts
    Thanks for the comment.

    My question is about sharing memory between the kernel and the application; inside that memory there is a volatile __int64 variable that is incremented and decremented with interlocked operations. I was wondering whether interlocked operations work the same way on a memory region shared between kernel and application. I believe they do, but I wanted to know whether there is any known problem.

  • Tim_Roberts Member - All Emails Posts: 13,496

    You still didn't say how you are doing the interlocking. If you are using InterlockedIncrement and InterlockedDecrement, then the answer should be clear. Those APIs use special CPU instructions that guarantee the operation is done atomically, meaning that it can't be interrupted, and the instructions it generates are the same in either mode. So yes, it works as you expect.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • iamupd Member Posts: 8

    @Tim_Roberts

    Thank you so much for your comment.
    In the kernel, InterlockedCompareExchange64 is used to store a specific value, and the application reads and processes that value. Multiple kernel threads change the value concurrently, and the application reads it through InterlockedCompareExchange64 to decide what action to take.

  • Dejan_Maksimovic Member - All Emails Posts: 327
    via Email
    Your problem is not in interlocked operations.
    It (your problem) is with your English translation, I'm sorry to say.

    Tim already answered your question. But your translator did not
    translate it properly.
  • Tim_Roberts Member - All Emails Posts: 13,496
    edited March 23

    I think once you understand what those operations do, you'll see why it's not a problem.

    InterlockedCompareExchange64 (usually) compiles inline to a single CPU instruction. The processor itself ensures that no other instruction can happen in the middle of that instruction: the compare and the exchange are guaranteed to happen without interference.

    So, IF your communication scheme is such that InterlockedCompareExchange64 is enough to ensure that everyone plays nicely, then it doesn't matter whether the players are user or kernel. It is the PROCESSOR doing the interlocking.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • MBond2 Member Posts: 144

    The answer to your specific question is no, there is no difference in the way that interlocked operations behave based on privilege level (ring 3 vs ring 0). The CPU will execute them using the same coherency protocol implied by the LOCK prefix and it is sufficient to guard access to a value from multiple threads running in both.

    BUT, it must be said that this is an inherently insecure design. There is no guarantee whatsoever that the UM (less trusted) code will actually access this variable using interlocked access. If that causes the UM code to malfunction and crash, well that’s okay. But if that causes the KM (more trusted) code to malfunction, then that is a serious problem. Without looking at your code at all, or knowing anything about what you are doing, in this context use of InterlockedCompareExchange probably implies that the latter is possible.

    If this is a test program, a learning exercise, or for use on a closed system, you can ignore this issue. If this is intended for a general-purpose, mass-produced device, you will need to address this problem, as no one likes software that introduces new security vulnerabilities.
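
A sketch of the defensive posture MBond2 describes: capture the user-writable value once, atomically, and validate the captured copy before the kernel acts on it. The names and limit here are hypothetical:

    #define MAX_ENTRIES 64    // hypothetical limit on a kernel-owned table

    NTSTATUS ConsumeUntrustedValue(volatile LONG64 *UserWritableSlot)
    {
        //
        // Capture once, atomically. Never re-read the shared location after
        // validating it - that is the classic double-fetch vulnerability.
        //
        LONG64 value = InterlockedCompareExchange64(UserWritableSlot, 0, 0);

        if (value < 0 || value >= MAX_ENTRIES) {
            return STATUS_INVALID_PARAMETER;
        }

        // ... only now use 'value' against kernel-owned data structures ...
        return STATUS_SUCCESS;
    }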

  • iamupd Member Posts: 8

    @Tim_Roberts said:
    You still didn't say how you are doing the interlocking. If you are using InterlockedIncrement and InterlockedDecrement, then the answer should be clear. Those APIs use special CPU instructions that guarantee the operation is done atomically, meaning that it can't be interrupted, and the instructions it generates are the same in either mode. So yes, it works as you expect.

    Thanks for your comments. Your comments helped me a lot.

  • iamupd Member Posts: 8

    @MBond2 said:
    The answer to your specific question is no, there is no difference in the way that interlocked operations behave based on privilege level (ring 3 vs ring 0). The CPU will execute them using the same coherency protocol implied by the LOCK prefix and it is sufficient to guard access to a value from multiple threads running in both.

    BUT, it must be said that this is an inherently insecure design. There is no guarantee whatsoever that the UM (less trusted) code will actually access this variable using interlocked access. If that causes the UM code to malfunction and crash, well that’s okay. But if that causes the KM (more trusted) code to malfunction, then that is a serious problem. Without looking at your code at all, or knowing anything about what you are doing, in this context use of InterlockedCompareExchange probably implies that the latter is possible.

    If this is a test program, a learning exercise, or for use on a closed system, you can ignore this issue. If this is intended for a general-purpose, mass-produced device, you will need to address this problem, as no one likes software that introduces new security vulnerabilities.

    Thanks for your comments. Your comments helped me a lot.

  • anton_bassov Member Posts: 5,162

    @MBond2 said:
    There is no guarantee whatsoever that the UM (less trusted) code will actually access this variable using interlocked access.

    Well, the whole thing sounds pretty much like a tightly-coupled app-driver pair, so they are, apparently, following some protocol that is well understood by both sides. OTOH, I think it would probably be better to use separate counters for the writer and the reader, with the writer's counter being atomically modified by the driver and read by the app (I assume the driver and the app are, respectively, the data producer and consumer). Besides this, the app would also maintain its own counter that it modifies unilaterally, so that it can judge data availability simply by comparing the two counters. To make it even easier, one can make the buffer size a power of 2 - as long as the buffer size is a power of 2, the counters will automatically wrap around when the buffer size is exceeded.

    Anton Bassov
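
A rough sketch of the counter scheme Anton outlines, with hypothetical names: the driver advances a write counter, the application advances a read counter, and a power-of-two buffer size lets the monotonically increasing counters be masked into buffer indices.

    #define RING_SIZE 256            // must be a power of 2
    #define RING_MASK (RING_SIZE - 1)

    typedef struct _RING {
        volatile LONG64 WriteCount;  // advanced only by the driver
        volatile LONG64 ReadCount;   // advanced only by the application
        UCHAR           Data[RING_SIZE];
    } RING, *PRING;

    //
    // Driver (producer) side: publish one byte if there is room.
    //
    BOOLEAN RingProduce(PRING Ring, UCHAR Byte)
    {
        LONG64 write = Ring->WriteCount;   // only the driver writes this
        LONG64 read  = InterlockedCompareExchange64(&Ring->ReadCount, 0, 0);

        if (write - read >= RING_SIZE) {
            return FALSE;                          // full
        }

        Ring->Data[write & RING_MASK] = Byte;      // fill the slot first...
        InterlockedIncrement64(&Ring->WriteCount); // ...then make it visible
        return TRUE;
    }

    //
    // Application (consumer) side: consume one byte if one is available.
    //
    BOOLEAN RingConsume(PRING Ring, UCHAR *Byte)
    {
        LONG64 read  = Ring->ReadCount;    // only the application writes this
        LONG64 write = InterlockedCompareExchange64(&Ring->WriteCount, 0, 0);

        if (write == read) {
            return FALSE;                          // empty
        }

        *Byte = Ring->Data[read & RING_MASK];
        InterlockedIncrement64(&Ring->ReadCount);
        return TRUE;
    }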

  • MBond2 Member Posts: 144

    Anton, that is clearly true. It could not possibly work if the application did not modify the memory in exactly the way that the driver expected - which is exactly the security problem I am pointing out.

    If this is a closed system, then this could be just fine. If this is for general distribution, then it is clearly a serious design flaw.

    Your proposal is probably better, but the same security issue remains.
