Spin Lock

Sorry, Peter, for such a late reply, and thanks for yours. Yes, I want to implement mutually exclusive access to a structure (pFilter) shared between the driver's dispatch routine and the NdisSendNetBufferLists path. The latter usually runs at DISPATCH_LEVEL; the driver's dispatch routine runs at PASSIVE_LEVEL. Is it necessary to raise the IRQL in the dispatch routine before using the FILTER_ACQUIRE_LOCK / FILTER_RELEASE_LOCK pair?

And a second question, if anyone can help: I want to copy all the data from a NET_BUFFER into Irp->AssociatedIrp.SystemBuffer.
There is the NdisGetDataBuffer function for this, but I can't figure out whether it copies all of the non-contiguous data from the network buffer.

I wrote the piece of code below. It works correctly, but sometimes Windows crashes with IRQL_NOT_LESS_OR_EQUAL. When I comment out the second part of the code (the else branch), the driver does not crash, but it only copies the data from the first MDL (the net buffer's current MDL). Can you help me find what is wrong with the second part of the code?

ULONG indexOfCurMdl = 1 + 1 + countOfNBLs + (countOfNB_MDLs + 1) * sizeof(LONG); // the last slot is the length of the buffer
pNBL = NetBufferLists;
ULONG numberOfNB_MDL = 0;
while (pNBL) {
    NET_BUFFER* pNB = NET_BUFFER_LIST_FIRST_NB(pNBL);
    while (pNB) {
        // Remember where this NET_BUFFER's data starts in the output buffer.
        *(LONG*)(BUFFER + 1 + 1 + countOfNBLs + numberOfNB_MDL * sizeof(LONG)) = indexOfCurMdl;
        PMDL pMdl = NET_BUFFER_CURRENT_MDL(pNB);
        while (pMdl) {
            if (pMdl == NET_BUFFER_CURRENT_MDL(pNB)) {
                // First MDL of the NET_BUFFER: the data starts at NET_BUFFER_CURRENT_MDL_OFFSET.
                if (MmGetMdlByteCount(pMdl) - NET_BUFFER_CURRENT_MDL_OFFSET(pNB) >= NET_BUFFER_DATA_LENGTH(pNB)) {
                    // All of the data fits in this single MDL.
                    if (indexOfCurMdl + NET_BUFFER_DATA_LENGTH(pNB) > BUFFER_SIZE) {
                        DEBUGP(DL_FATAL, "Length > BUFFER_SIZE");
                        goto m1;
                    }
                    NdisMoveMemory(BUFFER + indexOfCurMdl,
                                   (PUCHAR)MmGetMdlVirtualAddress(pMdl) + NET_BUFFER_CURRENT_MDL_OFFSET(pNB),
                                   NET_BUFFER_DATA_LENGTH(pNB));
                    indexOfCurMdl += NET_BUFFER_DATA_LENGTH(pNB);
//                  break;
                } else {
                    // Copy the tail of the first MDL; the rest of the data is in later MDLs.
                    if (indexOfCurMdl + MmGetMdlByteCount(pMdl) - NET_BUFFER_CURRENT_MDL_OFFSET(pNB) > BUFFER_SIZE) {
                        DEBUGP(DL_FATAL, "Length > BUFFER_SIZE");
                        goto m1;
                    }
                    NdisMoveMemory(BUFFER + indexOfCurMdl,
                                   (PUCHAR)MmGetMdlVirtualAddress(pMdl) + NET_BUFFER_CURRENT_MDL_OFFSET(pNB),
                                   MmGetMdlByteCount(pMdl) - NET_BUFFER_CURRENT_MDL_OFFSET(pNB));
                    indexOfCurMdl += MmGetMdlByteCount(pMdl) - NET_BUFFER_CURRENT_MDL_OFFSET(pNB);
                }
            } else {
                // Subsequent MDL: temp is the number of bytes already copied for this NET_BUFFER.
                ULONG temp = indexOfCurMdl - *(LONG*)(BUFFER + 1 + 1 + countOfNBLs + numberOfNB_MDL * sizeof(LONG));
                if (NET_BUFFER_DATA_LENGTH(pNB) - temp >= MmGetMdlByteCount(pMdl) && temp < NET_BUFFER_DATA_LENGTH(pNB)) {
                    // This whole MDL belongs to the data.
                    if (indexOfCurMdl + MmGetMdlByteCount(pMdl) > BUFFER_SIZE) {
                        DEBUGP(DL_FATAL, "Length > BUFFER_SIZE");
                        goto m1;
                    }
                    NdisMoveMemory(BUFFER + indexOfCurMdl,
                                   (PUCHAR)MmGetMdlVirtualAddress(pMdl),
                                   MmGetMdlByteCount(pMdl));
                    indexOfCurMdl += MmGetMdlByteCount(pMdl);
                } else {
                    // Only part of this MDL (possibly zero bytes) is left to copy.
                    if (indexOfCurMdl + NET_BUFFER_DATA_LENGTH(pNB) - temp > BUFFER_SIZE) {
                        DEBUGP(DL_FATAL, "Length > BUFFER_SIZE");
                        goto m1;
                    }
                    NdisMoveMemory(BUFFER + indexOfCurMdl,
                                   (PUCHAR)MmGetMdlVirtualAddress(pMdl),
                                   NET_BUFFER_DATA_LENGTH(pNB) - temp);
                    indexOfCurMdl += NET_BUFFER_DATA_LENGTH(pNB) - temp;
//                  break;
                }
            }
            pMdl = pMdl->Next;
        }
        numberOfNB_MDL++;
        pNB = NET_BUFFER_NEXT_NB(pNB);
    }
    pNBL = NET_BUFFER_LIST_NEXT_NBL(pNBL);
}

Another question.
It is documented that NdisSendNetBufferLists must complete before it is called again, but sometimes it is called twice at the same time: DbgView shows all my DEBUGP(DL_WARN, ...) messages twice, and this causes my driver to crash. Is it correct for NdisSendNetBufferLists to be called before the previous call has finished? This happens very rarely. How can I avoid the problem? The difficulty is that DbgView only shows the DEBUGP(DL_WARN, ...) messages for calls that have already completed, so I never see the messages from the latest NdisSendNetBufferLists call. It took me a month to figure out that NdisSendNetBufferLists is sometimes called before the previous call has ended.

I already know the answer to the second question. The problem is MmGetMdlVirtualAddress: I should use MmGetSystemAddressForMdlSafe instead.
Is there any way to delete one's own replies?

You ought to consider using the NDIS driver library routines from Microsoft, here: https://github.com/microsoft/ndis-driver-library, as they cover your problem and you basically do not have to write any buggy new code to get it right.
Mark Roddy

Thanks

Can anyone say what happens to user-mode memory allocated with malloc and locked with VirtualLock() when the user process is inactive? Does it stay resident, or can it be paged out? Can I access it from a driver routine running in an arbitrary thread context (at DISPATCH_LEVEL)?

All memory allocated from UM can be paged out if the system is under memory pressure. VirtualLock provides a hint that this memory is important and should remain in the working set even if it would otherwise be a candidate for pruning (it hasn't been accessed recently and the system is looking for resources). There are only a few valid uses for this function.

In KM it is never safe to access UM addresses without verifying that the input is valid. Probing and locking allow the use of this memory in other thread contexts. There is lots of information about this topic in the archives of this forum and elsewhere on the Internet.

All memory allocated from UM can be paged out if the system is under memory pressure.

You are almost never wrong, but I have to dispute this assertion. Pages locked with VirtualLock will remain in physical memory until the process exits. Even the documentation says they are “guaranteed not to be written to the page file while they are locked”.

Thanks to all. In a few days I will try this out and report the results of my experiments.

@Tim_Roberts said:

All memory allocated from UM can be paged out if the system is under memory pressure.

You are almost never wrong, but I have to dispute this assertion. Pages locked with VirtualLock will remain in physical memory until the process exits. Even the documentation says they are “guaranteed not to be written to the page file while they are locked”.

Raymond Chen covered the confusing nature of VirtualLock here:

https://devblogs.microsoft.com/oldnewthing/20071106-00/?p=24573

With an update/correction here:

https://devblogs.microsoft.com/oldnewthing/20140207-00/?p=1833

To the OP’s question of:

Can I access [the user buffer locked with VirtualLock] from the driver routine, working in arbitrary thread context (DISPATCH_LEVEL)

That would be a no for at least two reasons:

  1. If you’re in an arbitrary thread context you can’t access a user data pointer tied to a specific context
  2. Nothing says the user data pointer won’t become invalid while you’re using it (e.g. the app could call VirtualFree, VirtualProtect and make it read only, etc.). If you’re running at IRQL < DISPATCH_LEVEL the Mm will raise an exception that you can catch. If you’re at DISPATCH_LEVEL the system crashes.

If you want to access a user data buffer in an arbitrary context at arbitrary IRQL you use the MmProbeAndLockPages/MmGetSystemAddressForMdlSafe pattern.

Of course, thank you. User virtual addresses can be used only in a dispatch routine running in the process context of the initiator, at IRQL 0 (PASSIVE_LEVEL).
But I cannot understand why I should use MmProbeAndLockPages if the memory is already locked (VirtualLock).

The most basic answer to your question of why to probe and lock is one of trust. By definition, KM code does not trust UM code, so everything that UM passes to KM must be checked for validity and protected against interference until it can no longer cause KM to fail.

As to the question of VirtualLock, I have re-read these old posts from Raymond, and the present documentation, and I guess it comes down to how strenuously the system actually enforces the minimum working set for a process. I'm not convinced that this has not changed between Windows versions, but my only evidence is vague memories. The current documentation clearly indicates how it should work. The use cases are still very limited.

Well. Thank You

I read the article "Sharing Memory Between Drivers and Applications". Thank you for that useful article. I have a question: I want to do the same thing as the "CreateAndMapMemory" function, but at device creation time. How can I send the driver the virtual address of the user buffer at creation time, when the function
HANDLE CreateFileW(
[in] LPCWSTR lpFileName,
[in] DWORD dwDesiredAccess,
[in] DWORD dwShareMode,
[in, optional] LPSECURITY_ATTRIBUTES lpSecurityAttributes,
[in] DWORD dwCreationDisposition,
[in] DWORD dwFlagsAndAttributes,
[in, optional] HANDLE hTemplateFile
);
does not have any extra parameter for it?

You don’t do it at “CreateFile” time, of course. You open the driver using “CreateFile”, and then send a custom ioctl to receive the mapped buffer address. The article is showing you the driver code you would use to help handle that ioctl.

Thanks

Sorry, Mr. Tim. You said "receive"; do you mean "send"? Because the user program must open the device, allocate memory, and send the user virtual address of that buffer to the driver. I am implementing bidirectional direct I/O: I want to use the same user buffer for sending to and receiving from the driver, for the whole time the driver is active.

That’s one way to do it, but that’s not what the article you quoted does. That article has the kernel driver allocate the memory, and then map that memory into the user-mode process. Doing it that way is required if you need physically-contiguous memory for DMA, for example.

Yet another way is to have the application send down a METHOD_IN_DIRECT ioctl very early on, with the desired buffer as the second buffer in the ioctl, and then have the driver keep that ioctl pending for a long time. That way, the I/O system handles the mapping of the memory, and keeps it mapped as long as the ioctl is pending.

The very long pending IRP is a much better design choice. The content of whatever memory buffer you share is entirely up to you.

But consider that whatever mechanism you invent, it is unlikely to be more efficient or effective than standard ReadFile/WriteFile or DeviceIoControl calls. The shared-memory design has advantages only in very specific use cases, and you should be sure that yours is one of them before you go to the significant effort of implementing a scheme like this.

If your application can tolerate lost, duplicated, and corrupted data, then shared memory is easy to implement. If not, then you will end up re-implementing the standard calls.