No More PTEs

Hello All,

The problem I am having is that I seem to be running out of PTEs on my
system. Basically, I am getting a bunch of user-land buffers that I want to
copy data into from the driver. I use IoAllocateMdl, MmProbeAndLockPages
and MmGetSystemAddressForMdlSafer (posted by Walter Oney, if I am not
mistaken; it basically uses MmGetSystemAddressForMdl but allows it to fail).

What I am seeing is that the driver repeatably fails in the call to
MmGetSystemAddressForMdlSafe, and if I use MmGetSystemAddressForMdl the
PC will BSOD with NO_MORE_SYSTEM_PTES.

I read that the number of available PTEs can be changed by editing the
Registry value at HKEY_LOCAL_MACHINE/System/Current…/SystemPages, which was
set to 0x63000. (Microsoft’s web site says to change it to 40000 for
systems with <=128MB of RAM, or to 110000 for 128-256MB.) Even changing
the value of the Registry key did not change the fact that it was running
out of PTEs.

Now the amount that gets mapped into system space can vary, and can come in
chunks as large as 6291456 bytes. However, it is failing on the 4th attempt
to map in a user-space buffer of 25165824 bytes (the first 3 have already
succeeded), while the largest chunk of 6291456 bytes can be mapped in on its
own (I haven’t tried to see how many times it can be mapped in).

Basically, when I am done with a user buffer, I unmap it and the user app
should then send down another. This works fine in most cases, just not
when I get four 25165824-byte buffers (well, when I try to).
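
For scale, the sizes above can be turned into page counts. This is only a rough sketch that assumes 4KB x86 pages and one system PTE per mapped page; the function name and the `SK_` constant are made up for illustration and are not part of any Windows header.

```c
#include <assert.h>
#include <stddef.h>

#define SK_PAGE_SIZE 4096u   /* assumption: 4KB x86 pages */

/* Number of pages (and so, roughly, system PTEs) spanned by a buffer of
   `length` bytes that starts `offset_in_page` bytes into its first page. */
size_t sk_pages_spanned(size_t offset_in_page, size_t length)
{
    assert(offset_in_page < SK_PAGE_SIZE);
    if (length == 0) {
        return 0;
    }
    return (offset_in_page + length + SK_PAGE_SIZE - 1) / SK_PAGE_SIZE;
}
```

Under those assumptions a page-aligned 25165824-byte buffer spans 6144 pages, so four of them mapped at once need 24576 PTEs (96MB of system address space), while a 6291456-byte chunk needs only 1536.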

Is that size really TOO much data to try to map? I could understand if
I were trying to allocate that much for my own use, but I am not. Does
anyone know what I can do to get this to work, short of keeping track of
how many I can map in and maybe keeping a queue of user buffers that are
waiting to be mapped in? (So we have 4 buffers yet only 2 are actually
mapped in, and buffers from this queue get mapped in as we finish with
the others.)

PS. The driver will have allocated a total of 2764800 bytes as its own
internal buffer from the non-paged pool (although it is done as smaller
segments for other uses.)

Thanks for your help.
Brad.

Brad,

I can’t say for sure, but I suspect that your problem is the size of
the virtually contiguous memory you want to map. Mapping a 25MB piece
of user buffer space fully requires enough contiguous PTEs to map the
whole 25MB. You could have lots of free PTEs and still not have enough
contiguous PTEs to map a buffer that size.
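
The fragmentation argument can be illustrated with a toy model. This is a sketch only: it treats the system PTE range as a simple in-use array, which is an assumption for illustration and not how the memory manager actually tracks PTEs.

```c
#include <assert.h>
#include <stddef.h>

/* in_use[i] != 0 means (hypothetical) system PTE slot i is taken. */

/* Total number of free slots. */
size_t sk_count_free(const unsigned char *in_use, size_t n)
{
    size_t i, free_slots = 0;
    assert(in_use != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!in_use[i]) {
            free_slots++;
        }
    }
    return free_slots;
}

/* Length of the longest run of consecutive free slots. */
size_t sk_longest_free_run(const unsigned char *in_use, size_t n)
{
    size_t i, run = 0, best = 0;
    assert(in_use != NULL || n == 0);
    for (i = 0; i < n; i++) {
        run = in_use[i] ? 0 : run + 1;
        if (run > best) {
            best = run;
        }
    }
    return best;
}
```

With every fourth slot of 16 in use, 12 slots are free but the longest run is only 3, so a request for 4 contiguous slots fails even though three quarters of the range is free.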

Can you break the MDLs up into smaller pieces and have several of them
and move through the data in smaller pieces?
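
A minimal sketch of the bookkeeping for walking a large buffer in smaller pieces follows. It only computes offsets and lengths in plain C; the names are hypothetical. In a driver each piece would get its own mapping, for example through a partial MDL built with IoBuildPartialMdl, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t offset;   /* byte offset of the piece within the user buffer */
    size_t length;   /* length of the piece in bytes */
} sk_chunk;

/* Number of pieces needed to cover `total` bytes. */
size_t sk_chunk_count(size_t total, size_t chunk_size)
{
    assert(chunk_size > 0);
    return (total + chunk_size - 1) / chunk_size;
}

/* The index-th piece; the last piece may be shorter than chunk_size.
   Returns a zero-length chunk if index is out of range. */
sk_chunk sk_chunk_at(size_t total, size_t chunk_size, size_t index)
{
    sk_chunk c = {0, 0};
    assert(chunk_size > 0);
    if (index >= sk_chunk_count(total, chunk_size)) {
        return c;
    }
    c.offset = index * chunk_size;
    c.length = (total - c.offset < chunk_size) ? total - c.offset : chunk_size;
    return c;
}
```

For a 25165824-byte buffer and 6291456-byte pieces this gives 4 pieces, so only 1536 contiguous PTEs are needed at any one time instead of 6144.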

just a thought, Rick…

A page table is a 4KB page of memory holding 1K (1024) entries, and it
covers 4MB of contiguous virtual memory. The page directory has entries that
point to the page tables. So your guess is wrong. A PTE for a process must
be resident in memory if the process is the current process. If one tries to
map many large regions of memory, one may run out of virtual address space
and/or PTEs, depending on the situation.
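
The page-table arithmetic can be written out as follows. This sketch assumes classic 32-bit x86 paging without PAE (4-byte entries, 10/10/12 address split); with PAE the entries are 8 bytes and the numbers differ. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096u   /* bytes per page */
#define SK_PTE_SIZE  4u      /* bytes per entry, non-PAE x86 (assumption) */

/* Entries in one page table: 4096 / 4 = 1024. */
uint32_t sk_entries_per_table(void) { return SK_PAGE_SIZE / SK_PTE_SIZE; }

/* Virtual memory covered by one page table: 1024 * 4KB = 4MB. */
uint32_t sk_bytes_per_table(void) { return sk_entries_per_table() * SK_PAGE_SIZE; }

/* Split a 32-bit virtual address into its three fields. */
uint32_t sk_pde_index(uint32_t va)   { return va >> 22; }            /* top 10 bits */
uint32_t sk_pte_index(uint32_t va)   { return (va >> 12) & 0x3FFu; } /* middle 10 bits */
uint32_t sk_page_offset(uint32_t va) { return va & 0xFFFu; }         /* low 12 bits */

/* Recombine the three fields into a virtual address. */
uint32_t sk_va_from_indices(uint32_t pde, uint32_t pte, uint32_t offset)
{
    assert(pde < 1024u && pte < 1024u && offset < SK_PAGE_SIZE);
    return (pde << 22) | (pte << 12) | offset;
}
```

Under these assumptions the system half of the address space starting at 0x80000000 begins at page directory entry 512.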

I suspect there might be a memory leak or something similar. For example, if
the MDL he allocates is not freed, it will keep the user memory mapped in
kernel mode, thus using up PTEs and the MDL at the same time.

Bi


You are currently subscribed to ntdev as: xxxxx@appstream.com
To unsubscribe send a blank email to %%email.unsub%%

On Wed, Dec 18, 2002 at 03:26:59PM -0800, Bi Chen wrote:

> A page table is a 4KB page of memory holding 1K (1024) entries, and it
> covers 4MB of contiguous virtual memory. The page directory has entries
> that point to the page tables. So your guess is wrong. A PTE for a process
> must be resident in memory if the process is the current process. If one
> tries to map many large regions of memory, one may run out of virtual
> address space and/or PTEs, depending on the situation.
>
> I suspect there might be a memory leak or something similar. For example,
> if the MDL he allocates is not freed, it will keep the user memory mapped
> in kernel mode, thus using up PTEs and the MDL at the same time.

Well, I’m pretty sure that there is not a memory leak, as I would expect
that Windows would BSOD with the LOCKED_PAGES error. Also, using perfmon
I can see that the Level of PTEs goes back to the previous value and the
driver will run without problem for extended periods of time if fewer or
smaller buffers are used.

The actual flow should be:

  1. Map 25MB
  2. Map 25MB
  3. Map 25MB
  4. Map 25MB
  5. Unmap 25MB and map new 25MB (repeat as long as needed)
  6. Unmap 25MB
  7. Unmap 25MB
  8. Unmap 25MB
  9. Unmap 25MB

However, it is failing at #4. I am guessing that mapping up to
100MB of user buffers into the system is not possible under Windows.
At the time of failure I should have about 270-290MB of free memory
(physical RAM), as my machine has 512MB of memory. It’s weirder that
it works on another machine (same OS version, different hardware)
that has 256MB of physical RAM.

> I can’t say for sure, but I suspect that your problem is the size of
> the virtually contiguous memory you want to map. Mapping a 25MB piece
> of user buffer space fully requires enough contiguous PTEs to map the
> whole 25MB. You could have lots of free PTEs and still not have enough
> contiguous PTEs to map a buffer that size.
>
> Can you break the MDLs up into smaller pieces and have several of them
> and move through the data in smaller pieces?

Unfortunately, I don’t have control of all of the user applications that
send down the user buffers, and although changing the amount that
I map in at once in my kernel driver (per MDL) might fix it
on my machine, I would still be mapping a total of 100MB, so it
probably won’t fix it for others.

Does anyone know what the normal upper limit is that you can map (not
allocate) into your driver from user level without getting into problems
on machines that are considered low on memory (maybe 128MB or even
64MB)?

My current plan is to keep a list of user buffers and delay
translating them to Kernel Accessible Buffers till I am done
with a few of the previous ones (that were already translated.)
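
The delayed-mapping plan can be sketched as a small bounded queue. This is a user-mode model only: buffer ids stand in for real user buffers, the cap and queue depth are arbitrary assumptions, and a driver would need a spin lock around the state and the real map/unmap calls where the comments indicate.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_PENDING 16   /* arbitrary queue depth for the sketch */

typedef struct {
    size_t max_mapped;            /* cap on simultaneously mapped buffers */
    size_t mapped;                /* buffers currently mapped */
    int pending[SK_MAX_PENDING];  /* ring buffer of waiting buffer ids */
    size_t head, count;
} sk_map_queue;

void sk_mq_init(sk_map_queue *q, size_t max_mapped)
{
    assert(q != NULL);
    q->max_mapped = max_mapped;
    q->mapped = 0;
    q->head = 0;
    q->count = 0;
}

/* Returns 1 if the buffer is mapped right away, 0 if it was queued,
   -1 if the pending queue is full. */
int sk_mq_submit(sk_map_queue *q, int buffer_id)
{
    assert(q != NULL);
    if (q->mapped < q->max_mapped) {
        q->mapped++;                       /* map the buffer here */
        return 1;
    }
    if (q->count == SK_MAX_PENDING) {
        return -1;
    }
    q->pending[(q->head + q->count) % SK_MAX_PENDING] = buffer_id;
    q->count++;
    return 0;
}

/* Called when the driver is done with one mapped buffer. Returns the id of
   the waiting buffer that takes its place, or -1 if nothing was waiting. */
int sk_mq_complete(sk_map_queue *q)
{
    int next;
    assert(q != NULL);
    if (q->mapped == 0) {
        return -1;
    }
    q->mapped--;                           /* unmap the finished buffer here */
    if (q->count == 0) {
        return -1;
    }
    next = q->pending[q->head];
    q->head = (q->head + 1) % SK_MAX_PENDING;
    q->count--;
    q->mapped++;                           /* map the next waiting buffer here */
    return next;
}
```

With a cap of 2, four submitted buffers leave two mapped and two waiting, and each completion pulls the oldest waiting buffer in.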

Thank you for your help. If anyone has any other ideas or
explanations I’m still listening.

Brad.

Brad:

Mm… I speculate that you may be running out of virtual address space (above
2GB) rather than physical pages for mapping your memory. Depending on the OS
version, the 2GB-4GB address space is divided into multiple regions. There
is a description in the book Inside Windows 2000, 3rd edition. XP is said to
be better. Microsoft claims it is possible to have 1GB of address space for
allocation.

Based on the 25MB size and the number of buffers you allocate, I speculate
that it is for uncompressed SD (Standard Definition) video. If you just
shuttle data in user memory to/from hardware, you don’t need a system address
at all. An MDL and DMA suffice. If you have to do something to the user
memory, you don’t need a system address if you can process the user memory in
the user thread’s context and at IRQL PASSIVE_LEVEL (assuming your driver is
the topmost driver, using METHOD_NEITHER and ProbeForRead/ProbeForWrite). I
would move data processing to user mode as much as possible.
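
For reference, the transfer method is encoded in the low two bits of the IOCTL code. The sketch below reproduces the CTL_CODE bit layout from the DDK headers as a plain function so it can be compiled anywhere; the `sk_` names are local stand-ins, and the function number 0x800 is a hypothetical example, not something taken from Brad's driver.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of well-known DDK values, renamed to avoid clashing
   with the real headers. */
#define SK_FILE_DEVICE_UNKNOWN 0x22u
#define SK_METHOD_BUFFERED     0u
#define SK_METHOD_NEITHER      3u
#define SK_FILE_ANY_ACCESS     0u

/* Same bit layout as the CTL_CODE macro:
   device type << 16 | access << 14 | function << 2 | method. */
uint32_t sk_ctl_code(uint32_t device_type, uint32_t function,
                     uint32_t method, uint32_t access)
{
    assert(method <= 3u && access <= 3u && function < 4096u);
    return (device_type << 16) | (access << 14) | (function << 2) | method;
}

/* Extract the transfer method from an IOCTL code. */
uint32_t sk_method_of(uint32_t code)
{
    return code & 3u;
}
```

With METHOD_NEITHER the I/O manager passes the raw user pointers through, which is why the driver has to probe them itself and touch them only in the caller's context.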

If you want to check this speculation, you could back your user-mode
memory with a named memory-mapped file. Then you can map the memory into the
system process using ZwMapViewOfSection. The system process’s 0-2GB address
space is relatively free.

Hope that helps

Bi
