How to allocate non-paged buffer below 4GB

Hello,

I have a WDM function driver that allocates buffers from non-paged pool
and submits them to the 1394 bus driver. The 1394 OHCI controller
supports 32-bit addressing only, so on an x64 machine with 8 GB of RAM
the DMA abstraction in the HAL will do an intermediate copy for buffer
pages that lie above 4 GB. To avoid that extra copy, I would like to
ensure that all pages of my buffer are below the 4 GB boundary. How can
I achieve this? Is MmAllocatePagesForMdl the way to go?

Note: I’m talking about driver-allocated buffers. I do not use app
buffers directly for DMA because the driver has to process the data
before it’s passed to apps.

Thanks for any comments.

Udo

You could use MmAllocateContiguousMemorySpecifyCache to allocate blocks of
memory below 4gb. Do you actually know that the map register copy operations
already get in your way and prevent you from meeting your performance
requirements?

[In reply to the message from Udo Eberhardt sent Monday, April 23, 2007 4:19 PM; quoted text omitted.]

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Thanks for your reply, Mark.

I considered using MmAllocateContiguousMemorySpecifyCache, but I don’t
need physically contiguous memory. I just need a (virtually contiguous)
non-paged buffer that is composed of pages below 4G. I’m afraid that
using MmAllocateContiguousMemorySpecifyCache would increase the
likelihood of failed allocations.

I notice a (slightly) increased CPU load when I equip the machine with
8G RAM, compared with 4G RAM. Because my users are very sensitive to
CPU utilization, I’m afraid they will notice this as well.

What exactly do you mean by “map register copy operations”? I thought
that the HAL’s DMA emulation for 32-bit devices is achieved by
allocating a buffer below 4G, DMA-ing the data into this buffer
and then copying the data from this intermediate buffer to my original
buffer. This assumes that the underlying bus driver uses the DMA APIs
correctly, which should be a valid assumption.

Because my original buffer is allocated in kernel mode (and not
submitted by an app) my idea was to take into account the DMA
restrictions for its allocation. This should avoid the intermediate copy
operation and will also reduce memory consumption.

Udo

Indeed - so how about MmAllocatePagesForMdl? That would appear to allow
you to specify the physical address range limits and not require
contiguous allocation.

By “map register copy operations” I meant what you described as “the
HAL’s DMA emulation”. The buffer(s) used for this are referred to in the
documentation as “map registers” for historic reasons.
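
Editorial sketch: the "physical address range limits" for a 32-bit device, and a check that a set of pages stays inside them, can be expressed in a few lines of portable C. This is only an illustration under stated assumptions, not code from the thread: the helper names dma_high_address and all_pages_within are hypothetical, 4 KB pages are assumed, and in a real driver the page frame numbers would come from the MDL (for example via MmGetMdlPfnArray) rather than from a plain array.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12u  /* assumption: 4 KB pages, as on x86/x64 */

/* Highest physical byte address reachable by a device that drives
   `address_bits` address lines (32 for the OHCI controller discussed here). */
static uint64_t dma_high_address(unsigned address_bits)
{
    if (address_bits >= 64u)
        return UINT64_MAX;
    return ((uint64_t)1 << address_bits) - 1u;
}

/* Returns 1 if the last byte of every page frame is at or below `high`,
   i.e. no page would need an intermediate copy for that device; else 0. */
static int all_pages_within(const uint64_t *pfns, size_t count, uint64_t high)
{
    for (size_t i = 0; i < count; ++i) {
        uint64_t last_byte = (pfns[i] << PAGE_SHIFT_4K) | 0xFFFu;
        if (last_byte > high)
            return 0;
    }
    return 1;
}
```

A check like this could serve as a debug-time assertion after the allocation; the value returned by dma_high_address(32) is 0xFFFFFFFF, which is the upper limit one would pass when asking for pages below 4 GB.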

[In reply to the message from Udo Eberhardt sent Tuesday, April 24, 2007 3:34 PM; quoted text omitted.]

Yes, MmAllocatePagesForMdl seems to do the right thing. But I also
need a kernel-mode virtual address for the buffer it returns (because
the driver has to process the data). Can I obtain this with a subsequent
call to MmGetSystemAddressForMdlSafe? Is that sufficient?
MmProbeAndLockPages should not be necessary, as the doc for
MmAllocatePagesForMdl states that the buffer is non-paged, and I assume
an explicit call to MmMapLockedPagesSpecifyCache is not needed either.

Udo

> the driver has to process the data). Can I create this by a subsequent
> call to MmGetSystemAddressForMdlSafe, is this sufficient?

Yes you can.


Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

You just need to call MmGetSystemAddressForMdlSafe.
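
Editorial sketch: the sequence the thread converges on (allocate pages below 4 GB with MmAllocatePagesForMdl, then obtain a kernel virtual address with MmGetSystemAddressForMdlSafe) might look roughly like the following. This is kernel-mode C that builds only against the WDK, and it is a sketch rather than anyone's actual driver code: the helper names AllocBufferBelow4GB and FreeBufferBelow4GB are hypothetical, treating a partial allocation as a failure is one possible policy, and the explicit unmap before freeing is an assumption about cleanup order that should be checked against the current WDK documentation.

```c
#include <wdm.h>

/* Hypothetical helper: allocate a non-paged buffer whose pages all lie
   below 4 GB and map it into system address space.
   Intended to be called at IRQL <= APC_LEVEL. */
NTSTATUS AllocBufferBelow4GB(SIZE_T Bytes, PMDL *MdlOut, PVOID *VaOut)
{
    PHYSICAL_ADDRESS low, high, skip;
    PMDL mdl;
    PVOID va;

    low.QuadPart  = 0;
    high.QuadPart = 0xFFFFFFFF;  /* highest acceptable physical address */
    skip.QuadPart = 0;

    /* Pages need not be physically contiguous. */
    mdl = MmAllocatePagesForMdl(low, high, skip, Bytes);
    if (mdl == NULL)
        return STATUS_INSUFFICIENT_RESOURCES;

    /* The MDL may describe less memory than requested; this sketch
       treats a partial allocation as a failure. */
    if (MmGetMdlByteCount(mdl) < Bytes) {
        MmFreePagesFromMdl(mdl);
        ExFreePool(mdl);
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    /* Map the pages so the driver can process the data. */
    va = MmGetSystemAddressForMdlSafe(mdl, NormalPagePriority);
    if (va == NULL) {
        MmFreePagesFromMdl(mdl);
        ExFreePool(mdl);
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    *MdlOut = mdl;
    *VaOut  = va;
    return STATUS_SUCCESS;
}

/* Hypothetical counterpart: release mapping, pages, and the MDL itself. */
VOID FreeBufferBelow4GB(PMDL Mdl, PVOID Va)
{
    MmUnmapLockedPages(Va, Mdl);  /* assumption: unmap before freeing pages */
    MmFreePagesFromMdl(Mdl);
    ExFreePool(Mdl);              /* the MDL structure is freed separately */
}
```

Whether pages below 4 GB actually let the HAL skip its intermediate copy still depends on the bus driver using the DMA APIs as discussed earlier in the thread, so measuring CPU load before and after remains worthwhile.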

[In reply to the message from Udo Eberhardt sent Tuesday, April 24, 2007 5:07 PM; quoted text omitted.]

Thanks Mark and Maxim for your comments.

Udo
