Memory buffers

I need a memory buffer that I need to store in my driver’s context. I was planning to use WDFMEMORY. However, I need a flexible-size buffer whose size I can increase or decrease. This buffer is not passed to the framework or anything; it is used only internally by my driver. Any thoughts?

Allocate it once for the maximum size you need and forget about it until you have to free it.
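A minimal sketch of this approach, assuming a known worst-case size. It is written as user-mode C with malloc standing in for the pool/WDFMEMORY allocation you would actually use in a driver; all names and the `MAX_BUF_SIZE` value are illustrative:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BUF_SIZE (64 * 1024)  /* worst case you ever expect; illustrative */

typedef struct _FIXED_BUFFER {
    unsigned char *Data;   /* allocated once, never resized */
    size_t         Used;   /* valid bytes; only this moves  */
} FIXED_BUFFER;

/* Allocate once at the maximum size.  In a driver this would be
 * ExAllocatePoolWithTag or WdfMemoryCreate at init time. */
int FixedBufferInit(FIXED_BUFFER *b)
{
    b->Data = malloc(MAX_BUF_SIZE);
    b->Used = 0;
    return b->Data != NULL;
}

/* "Resizing" is just moving the Used watermark. */
int FixedBufferAppend(FIXED_BUFFER *b, const void *src, size_t len)
{
    if (b->Used + len > MAX_BUF_SIZE)
        return 0;                 /* would exceed the fixed cap */
    memcpy(b->Data + b->Used, src, len);
    b->Used += len;
    return 1;
}

void FixedBufferFree(FIXED_BUFFER *b)
{
    free(b->Data);
    b->Data = NULL;
    b->Used = 0;
}
```

The point of the scheme is that after init there are no further allocations, so there is nothing to fail or fragment at runtime.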

Gary Little
H (952) 223-1349
C (952) 454-4629
xxxxx@comcast.net



NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

A WDFMEMORY object is no different from allocated pool in this case. Both need to be freed and reallocated when grown (to shrink, you can track the valid subset size without freeing, if you want and you are not wasting tons of memory).
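A sketch of that grow-by-reallocate / shrink-by-tracking pattern, again in user-mode C with malloc/free standing in for the pool (or WDFMEMORY) calls; the struct and function names are made up for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct _FLEX_BUFFER {
    unsigned char *Data;
    size_t         Capacity;  /* bytes actually allocated */
    size_t         Length;    /* valid subset of Capacity */
} FLEX_BUFFER;

/* Grow: allocate a new block, copy the valid bytes, free the old one.
 * In a driver the malloc/free pair would be ExAllocatePoolWithTag /
 * ExFreePoolWithTag, or deleting and recreating a WDFMEMORY object. */
int FlexBufferGrow(FLEX_BUFFER *b, size_t newCapacity)
{
    unsigned char *p;

    if (newCapacity <= b->Capacity)
        return 1;                       /* already big enough */
    p = malloc(newCapacity);
    if (p == NULL)
        return 0;
    if (b->Length)
        memcpy(p, b->Data, b->Length);  /* copy only the valid bytes */
    free(b->Data);
    b->Data = p;
    b->Capacity = newCapacity;
    return 1;
}

/* Shrink: just clamp Length; the allocation itself is untouched. */
void FlexBufferShrink(FLEX_BUFFER *b, size_t newLength)
{
    if (newLength < b->Length)
        b->Length = newLength;
}
```

Note that shrinking costs nothing here; the trade-off is that the allocation stays at its high-water mark until you decide to free it.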

d

debt from my phone





There is no magic solution here. You have several choices:

Gary’s solution: allocate once at the maximum size.

Unfortunately, your description is such a vague handwave (no hints as to minimum, maximum, or expected sizes, frequency of reallocation, or pretty much anything else that might give the slightest hint about the scope of the problem), but here are a few ideas.

Allocate what you need. When you need to grow, allocate a new chunk, copy the contents of the existing buffer into it, then free the old buffer. Shrinking works the same way. Advantage: the buffer is contiguous, and never consumes more storage than it needs. Disadvantage: memory fragmentation, potentially expensive copy time, the need to have two buffers in existence for a period of time, and the buffer is contiguous [no, that’s not an error; it depends on the size, but huge buffers require contiguous chunks of address space, a possible disaster on 32-bit systems booted in /3GB mode].

Allocate quantized by some size, such as N pages, for N >= 1. Now you can
expand into the “slack space” until it is full, then do another
allocation. Advantage: all advantages of previous scheme, plus
far-less-frequent copies. Disadvantage: all disadvantages of previous
scheme, plus need to track “actual used space” and “total available space”
as separate concepts (see UNICODE_STRING).
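A sketch of that quantized scheme, with the used/available split modeled on UNICODE_STRING’s Length/MaximumLength fields. User-mode C with malloc standing in for pool allocation; `PAGE_SIZE` is defined locally for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096  /* x86/x64 page size; defined here for the sketch */

typedef struct _QUANT_BUFFER {
    unsigned char *Data;
    size_t         Length;         /* "actual used space"     */
    size_t         MaximumLength;  /* "total available space" */
} QUANT_BUFFER;

/* Round a request up to a whole number of pages. */
static size_t RoundToPages(size_t n)
{
    return (n + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
}

/* Append, reallocating only when the slack space is exhausted. */
int QuantBufferAppend(QUANT_BUFFER *b, const void *src, size_t len)
{
    if (b->Length + len > b->MaximumLength) {
        size_t newMax = RoundToPages(b->Length + len);
        unsigned char *p = malloc(newMax);
        if (p == NULL)
            return 0;
        if (b->Length)
            memcpy(p, b->Data, b->Length);
        free(b->Data);
        b->Data = p;
        b->MaximumLength = newMax;
    }
    memcpy(b->Data + b->Length, src, len);
    b->Length += len;
    return 1;
}
```

Most appends now expand into the slack between Length and MaximumLength for free; the copy happens only on a page-boundary crossing.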

Note that std::vector uses this scheme, plus it increases the quantum size on each expansion. Advantage: all the above advantages, plus the expected total copy cost is linear (amortized constant per append) rather than quadratic. Disadvantage: the most recent allocation may overcommit, thus consuming more space than is required.
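That geometric-growth policy can be sketched in a few lines. The doubling factor is illustrative (real std::vector implementations use factors between 1.5x and 2x), and the names are made up:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Grow a buffer geometrically: double the capacity (starting from a
 * small minimum) until it covers the request.  `length` is the number
 * of valid bytes that must survive the reallocation. */
int GeometricGrow(unsigned char **data, size_t *capacity,
                  size_t length, size_t needed)
{
    size_t newCap = *capacity ? *capacity : 16;
    unsigned char *p;

    if (needed <= *capacity)
        return 1;                      /* slack space still covers it */
    while (newCap < needed)
        newCap *= 2;
    p = malloc(newCap);
    if (p == NULL)
        return 0;
    if (length)
        memcpy(p, *data, length);
    free(*data);
    *data = p;
    *capacity = newCap;
    return 1;
}
```

Growing to N bytes one append at a time copies O(N) bytes in total under this policy, versus O(N^2 / quantum) with a fixed quantum, which is where the near-linear expected performance comes from.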

Variation on above schemes: expand, but never shrink. Advantage: reduces
gratuitous copies. Disadvantage: storage consumption is peak rather than
average usage. Under conditions of minimum usage, potential great waste
of space.

Bear in mind that all contiguous schemes require a resource often more critical than memory space: contiguous kernel address space.

Allocate quantized by some size. Your “buffer” is a linked list or array
of pointers. Advantage: no copy needed to expand or contract, does not
require large contiguous chunks of address space, making it more friendly
to the kernel ecosystem, can easily free blocks “at front” when they are
consumed by the app. Disadvantage: Copy to/from more complex, especially
when crossing block boundaries, hits allocator hard if expand/shrink are
too frequent.
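A sketch of the linked-list-of-chunks variant: appends never copy existing data, and fully consumed chunks can be freed from the front. User-mode C with calloc/free standing in for pool allocation; `CHUNK_SIZE` and the names are illustrative:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 4096  /* illustrative quantum */

typedef struct _CHUNK {
    struct _CHUNK *Next;
    size_t         Used;              /* valid bytes in Data */
    unsigned char  Data[CHUNK_SIZE];
} CHUNK;

typedef struct _CHUNK_LIST {
    CHUNK *Head;   /* oldest chunk; freed first when consumed */
    CHUNK *Tail;   /* chunk currently being filled            */
} CHUNK_LIST;

/* Append without ever copying existing data: just link new chunks. */
int ChunkListAppend(CHUNK_LIST *l, const void *src, size_t len)
{
    const unsigned char *s = src;

    while (len > 0) {
        if (l->Tail == NULL || l->Tail->Used == CHUNK_SIZE) {
            CHUNK *c = calloc(1, sizeof *c);  /* pool alloc in a driver */
            if (c == NULL)
                return 0;
            if (l->Tail)
                l->Tail->Next = c;
            else
                l->Head = c;
            l->Tail = c;
        }
        size_t n = CHUNK_SIZE - l->Tail->Used;
        if (n > len)
            n = len;
        memcpy(l->Tail->Data + l->Tail->Used, s, n);  /* may span chunks */
        l->Tail->Used += n;
        s += n;
        len -= n;
    }
    return 1;
}

/* Free the front chunk once its contents have been consumed. */
void ChunkListPopFront(CHUNK_LIST *l)
{
    CHUNK *c = l->Head;
    if (c == NULL)
        return;
    l->Head = c->Next;
    if (l->Head == NULL)
        l->Tail = NULL;
    free(c);
}
```

The copy-in loop above is also where the stated disadvantage lives: every read or write that crosses a chunk boundary has to be split.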

Variation of above: use lookaside list (cache of blocks). Free storage to
cache. Advantage: Faster allocation/deallocation than previous scheme
(plus all its advantages). Disadvantage: all disadvantages of previous
scheme, plus, blocks are not deallocated from the cache until some event
like close or remove, thus potentially keeping storage tied up
indefinitely, with maximum usage representing peak consumption.

Variation of above: an NPAGED_LOOKASIDE_LIST has a “depth” parameter, which is essentially the maximum number of blocks kept in the cache of blocks. When freeing a block would result in more than this many free blocks, the block is freed to the general nonpaged pool. Advantage: limits the amount of memory consumed, plus all the above advantages. Disadvantage: choose too small a depth and you will hit the general allocator too often, hurting performance; choose too large a depth and you will tie up too much of the nonpaged pool, also hurting performance; and the effort to find the “ideal” depth can be considerable.
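The depth-limited cache behavior can be modeled in a few lines of user-mode C. This is not the kernel lookaside API, just a sketch of the policy it implements, with malloc/free standing in for the general pool and all names made up:

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCK_SIZE 512   /* lookaside lists hold fixed-size blocks */

typedef struct _FREE_BLOCK {
    struct _FREE_BLOCK *Next;
} FREE_BLOCK;

typedef struct _LOOKASIDE {
    FREE_BLOCK *FreeList;  /* cached free blocks         */
    unsigned    Count;     /* blocks currently cached    */
    unsigned    Depth;     /* cap on Count, the "depth"  */
} LOOKASIDE;

/* Allocate: reuse a cached block if one is available. */
void *LookasideAlloc(LOOKASIDE *l)
{
    if (l->FreeList) {
        FREE_BLOCK *b = l->FreeList;
        l->FreeList = b->Next;
        l->Count--;
        return b;
    }
    return malloc(BLOCK_SIZE);      /* fall through to the general pool */
}

/* Free: cache the block unless the cache is already at Depth. */
void LookasideFree(LOOKASIDE *l, void *p)
{
    if (l->Count < l->Depth) {
        FREE_BLOCK *b = p;
        b->Next = l->FreeList;
        l->FreeList = b;
        l->Count++;
    } else {
        free(p);                    /* over depth: back to the pool */
    }
}
```

The Depth field is exactly the tuning knob described above: too small and most frees fall through to the general allocator; too large and the cache pins memory at peak consumption.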

Now, depending on the context of your driver, the frequency with which you must expand/contract, the storage quantum you choose, the amount of complexity the advanced schemes introduce, the phase of the moon, the angular relationship of Jupiter and Mars on the day the driver is running, and probably a few other parameters, one of these schemes will be the best choice. But with your vague description, it is hard to guess.
joe




> Note that std::vector uses this scheme, plus changing the quantum size of additions.

Yes, it typically reallocates to twice the current size (the exact growth factor is implementation-defined).

This gives an amortized-linear number of copy operations (which are the main bottleneck here) over a long sequence of push_back() calls.

Nevertheless, it fragments the memory a lot, so the performance losses inside the allocator itself can become more serious than the copies.

So, for a huge number of objects, it is better to use std::deque and only push_back() to it; it has a much better memory allocation pattern.

Re-implementing std::deque in the kernel in C is, however, a non-trivial task.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com