Allocation strategy of ExAllocatePool (nonpaged)?

My driver is going to dynamically allocate and release memory from the
non-paged pool as it operates. It will be allocating and releasing
variable-length chunks of memory in the area of about 40 bytes at a time,
holding them for a long or short period of time, then releasing them on
demand. I know that doing this may cause fragmentation of the non-paged
pool for my driver, but I’m unsure of the effect this will have on other
drivers’ use of the non-paged pool.

Which allocation strategy does ExAllocatePool use: does it 1) reserve a page of
memory exclusively for my driver and then allocate my chunks of memory from
that page, or does it 2) reserve a page of memory that will be shared, then
grant chunks of it to any driver that calls?

It seems to me that if the answer is #1 and I can accept the pool
fragmentation in my own driver, then I don’t need to worry too much about
it, but if the answer is #2 I may have to reserve pool memory a whole page
at a time and use custom memory allocation routines with those blocks to
prevent my driver from causing too much non-paged pool fragmentation in
*other* drivers. Anyone know?

Consider using a non-paged lookaside list. See ExInitializeNPagedLookasideList
and friends.

If your variable-length chunks are “about 40 bytes at a time”, create a
lookaside list whose entry size is large enough (say, 64 bytes) for the largest
intended use, and always allocate that fixed-size chunk from the lookaside
list, even if you only need less.

My two cents.

Thomas F. Divine

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:bounce-320983-
xxxxx@lists.osr.com] On Behalf Of Matthew Carter
Sent: Sunday, April 13, 2008 12:04 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Allocation strategy of ExAllocatePool (nonpaged)?



NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

It is shared across the whole system, by drivers and system components alike.
How many of these things do you think you will have? The allocator is pretty
smart and will look for free fragments to reuse when they are available. As
has already been pointed out, consider a lookaside list.

I would warn against custom allocators. Bottom line: they have their place,
but my experience shows that most of them cause significantly more problems
than they fix.


Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
Remove StopSpam to reply


I’m just worried that when my driver frees up all these small chunks of data,
only 20-60 bytes long and scattered all over the non-paged area, it will leave
a whole lot of blocks that are too small for other drivers to use, and
searching through a long list of tiny free blocks will slow down
ExAllocatePool for everyone. I am guessing that as a worst case there will be
up to 200 of these little chunks at a given time, and they could be allocated
and deallocated on an ongoing basis as quickly as one per second, possibly
leaving hundreds of tiny free bits of memory all over the place. Is this
something to be concerned about?

I had considered that if I allocated memory a page at a time with
ExAllocatePool and then tracked my own allocations within those pages, I could
defragment my own non-paged data if it gets too fragmented (something
ExAllocatePool cannot do), and when I release it I wouldn’t leave hundreds of
tiny chunks of memory all over the place, but instead maybe five or six
full-size memory pages. I would certainly rather avoid doing this sort of
memory management myself, but I also don’t want to create a situation where my
driver is responsible for massive fragmentation of the non-paged pool, bogging
down memory allocations across the whole system.

I’ll look into the lookaside lists. Perhaps those are a good middle-ground
solution.


I was tempted to use lookaside lists in several projects, but finally
resorted to a (naïve) custom suballocator plus greedy initial allocation,
or to nothing at all :-(

The documentation still looks too vague…

What is the “system-determined maximum number of entries”?
Can we observe or tweak it?
Can it change in the next OS/SP? (You bet…)
If we free N memory blocks to a lookaside list, will they all be immediately
available?
If we define a custom allocation function, will it be called per item, or for
large blocks? (Again, can this change in the next SP?)

For small amounts of memory perhaps anything goes, but our concern was how it
behaves with larger amounts and over long runs. Since the memory may be
returned to the system, our code must be prepared for random allocation
failures, something that people hate to write, and testing becomes harder.
Also, we usually need variable allocation sizes, so we would need several
lookaside lists, which is either too complex or not efficient.

Regards,
–PA

> What is the “system-determined maximum number of entries”?
> Can we observe or tweak it?

The lookaside list code used to have a maximum depth; it no longer does. The list is free to grow as deep as you want, based on the number of entries you add.

> Can it change in the next OS/SP? (You bet…)

Not really, anymore. If you look at the allocation function, ExAllocateFromNPagedLookasideList, it is FORCEINLINE now, which means its implementation cannot change.

> If we free N memory blocks to a lookaside list, will they all be immediately available?

As long as the OS is not under memory pressure, the lookasides remain untouched. The whole point of lookasides is to cache allocations in an OS-friendly manner: the OS can trim them when it needs to, but will not touch them when there is no pressure.

> If we define a custom allocation function, will it be called per item, or for large blocks? (Again, can this change in the next SP?)

Again, look at the header. It is always called once per item. You are never asked to allocate a large block and suballocate. No, it cannot change.

D


Doron, thank you, this is the answer we needed.

–PA
