Windows System Software -- Consulting, Training, Development -- Unique Expertise, Guaranteed Results
The free OSR Learning Library has more than 50 articles on a wide variety of topics about writing and debugging device drivers and Minifilters. From introductory level to advanced. All the articles have been recently reviewed and updated, and are written using the clear and definitive style you've come to expect from OSR over the years.
Check out The OSR Learning Library at: https://www.osr.com/osr-learning-library/
I'm glad to start my first discussion on this forum; I hope it will be interesting for everyone.
I'm working on a WDF filter driver that caches read data in NonPagedPool. For that, I have an LRU data structure that stores data in chunks of size 0x8000. When my read callback receives a request whose size is not a multiple of 0x8000, I create a new request aligned to 0x8000 and send it to the lower-level driver with a completion routine. Once that request completes, I complete the original request with the requested data and save the 0x8000 chunk in the LRU data structure. If a write request arrives that overlaps any chunk in the LRU, the overlapping chunk is evicted.
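The alignment step described above can be sketched in plain C. This is a hypothetical helper (not part of the original driver, and not WDF-specific): it expands an arbitrary (offset, length) read into the 0x8000-aligned region that would actually be sent to the lower driver.

```c
#include <stdint.h>

#define CHUNK_SIZE 0x8000u  /* 32 KiB cache granule, as described above */

/* Hypothetical helper: expand an arbitrary (offset, length) read into the
 * chunk-aligned (offset, length) sent to the lower-level driver. */
static void align_read(uint64_t offset, uint32_t length,
                       uint64_t *aligned_offset, uint32_t *aligned_length)
{
    uint64_t start = offset & ~(uint64_t)(CHUNK_SIZE - 1);          /* round down */
    uint64_t end   = (offset + length + CHUNK_SIZE - 1)
                     & ~(uint64_t)(CHUNK_SIZE - 1);                 /* round up  */

    *aligned_offset = start;
    *aligned_length = (uint32_t)(end - start);
}
```

Note that a read which straddles a chunk boundary expands to cover every chunk it touches, so one unaligned request may fetch several 0x8000 chunks.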
The solution worked as expected until I enabled the page file. Once I did, I got a BSOD after some time of work; the BSOD is related to verification of the CRC value of the page file.
From my investigation, I see this happens because a WRITE request can arrive and modify data on the disk while the aligned READ request is still in progress (it is asynchronous, using WdfIoTargetFormatRequestForRead and WdfRequestSend). So by the time the completion routine returns, the actual data on the disk differs from the data the read completion routine returned, and I'm unable to invalidate this data in the WRITE callback because at the moment the WRITE request arrived, the completion routine was still in progress or had not run yet.
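One common way to close this race (not something the original post describes, so treat it as a suggestion) is a per-chunk generation counter: snapshot the generation when the aligned read is issued, bump it on every overlapping write, and in the completion routine only insert the data into the cache if the generation is unchanged. A minimal single-threaded sketch in plain C, with hypothetical names; in a real driver these updates would be protected by a spin lock:

```c
#include <stdbool.h>
#include <stdint.h>

#define NCHUNKS 16

/* Hypothetical cache state: one generation counter per chunk slot. */
static uint32_t chunk_gen[NCHUNKS];
static bool     chunk_valid[NCHUNKS];

/* Called when the aligned read is issued: snapshot the generation. */
static uint32_t read_issued(unsigned chunk)
{
    return chunk_gen[chunk];
}

/* WRITE callback: invalidate the chunk and bump its generation, even if
 * the chunk is not currently cached -- that is what defeats the race. */
static void write_seen(unsigned chunk)
{
    chunk_valid[chunk] = false;
    chunk_gen[chunk]++;
}

/* READ completion routine: cache the data only if no write intervened. */
static bool read_completed(unsigned chunk, uint32_t gen_at_issue)
{
    if (chunk_gen[chunk] != gen_at_issue)
        return false;           /* stale: a write raced with us, drop it */
    chunk_valid[chunk] = true;  /* safe to insert into the LRU */
    return true;
}
```

The key point is that the write path records its effect (the generation bump) even when the chunk is not in the cache yet, so a read that was already in flight when the write arrived gets discarded on completion.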
So I wanted to use WdfIoTargetSendReadSynchronously, but from what I see it can only be called at PASSIVE_LEVEL, and I don't want to move all requests to the disk into a queue processed by a thread at PASSIVE_LEVEL.
Are there other options?
The purpose of this caching is to reduce network load on systems without an HDD or SSD that boot from an image on a central computer over the network.
|Upcoming OSR Seminars|
|OSR has suspended in-person seminars due to the Covid-19 outbreak. But, don't miss your training! Attend via the internet instead!|
|Seminar||Dates||Format|
|Internals & Software Drivers||19-23 June 2023||Live, Online|
|Writing WDF Drivers||10-14 July 2023||Live, Online|
|Kernel Debugging||16-20 October 2023||Live, Online|
|Developing Minifilters||13-17 November 2023||Live, Online|
Yes, because you cannot block above PASSIVE_LEVEL. However, it's very easy to just send the read asynchronously and do the work you WOULD have done after the call in your completion routine instead.
HOWEVER, you need to know that Windows already has very extensive LRU file system caching that does exactly what you're describing. Stuff written to disk is held in the cache in case someone comes along to read it later. If the system is not memory-constrained, I believe it will actually use up to half of your physical RAM as a file system cache. It seems unlikely to me that you'll be able to improve things with your scheme.
Tim Roberts, [email protected]
Providenza & Boekelheide, Inc.
Can you suggest where I can check how to configure the Windows built-in LRU cache?
But really there is nothing much to configure.
In the end, I configured the I/O queue to be processed at PASSIVE_LEVEL and send requests to the underlying device with WdfIoTargetSendReadSynchronously. The solution works fine and I'm getting the expected performance increase.

The one thing a developer needs to take care of is to properly handle the situation where the I/O queue is synchronous and you send a synchronous request to the underlying device inside the read handler: if some driver tries to allocate paged-out memory in the request completion routine, the system can fall into a deadlock that is hard to find. The solution I used is a parallel queue with locks on the LRU.
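The "parallel queue with locks on the LRU" idea can be sketched in user-mode C with a pthread mutex standing in for whatever kernel lock the driver actually uses (the names and structure here are assumptions, not the poster's code): requests run concurrently, and only the short LRU lookup/insert is serialized, so the synchronous read to the underlying device happens outside the lock.

```c
#include <pthread.h>
#include <stdbool.h>

#define NCHUNKS 16

/* Hypothetical LRU state; in a driver the mutex would be a WDF/kernel lock. */
static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;
static bool cached[NCHUNKS];

/* Lock only around the cache lookup; a miss is handled by the caller,
 * which sends the synchronous read WITHOUT holding the lock. */
static bool lru_lookup(unsigned chunk)
{
    pthread_mutex_lock(&lru_lock);
    bool hit = cached[chunk];
    pthread_mutex_unlock(&lru_lock);
    return hit;
}

/* After the synchronous read completes, insert the chunk under the lock. */
static void lru_insert(unsigned chunk)
{
    pthread_mutex_lock(&lru_lock);
    cached[chunk] = true;
    pthread_mutex_unlock(&lru_lock);
}
```

Keeping the blocking I/O outside the lock is what prevents one in-flight synchronous read from stalling every other request that only needs a cache lookup.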