Concurrent memory access to PCIe device

Hi, I have a rather strange problem with memory access to a custom PCIe device.

I have a PCIe device with a BAR that exposes some registers (a 32-bit non-prefetchable memory BAR), basically about 10 kB of memory. When my driver is opened, it starts a thread that periodically reads and writes some of the registers (at an interval of about 100 ms). User-mode software can also write to those registers through a memory mapping. But when I write to the registers from the user-mode software, the PC quite often freezes. When I remove the periodic writes, the problem disappears. When I start several instances of the user-mode software (I’ve tried up to 10) and have them read and write the registers continuously in parallel, still with the periodic writes removed, it works without any problem. The only problem is the periodic reading and writing in kernel mode: when it is turned on, there is a high probability that accessing the memory from user mode will freeze the computer.

Could there be some concurrency problem? It’s normal 32bit access, do I have to “lock” the memory somehow to use it from the kernel mode? Can the simultaneous access of the memory from the user mode and kernel mode lead to PC lockup?

> User-mode software can also write to those registers through a memory mapping.

If you don’t mind, could you please explain the above in a bit more detail? How do you map device BARs to userland? Normally you are supposed to map device registers with MmMapIoSpace(), which maps them into kernel space. The only way a driver can map memory to userland (unless it decides to mess around with PTEs directly, of course) is to call MmMapLockedPagesSpecifyCache(), but this function requires an MDL…

Anton Bassov

I am using ZwMapViewOfSection on PhysicalMemory to accomplish this. It’s just for testing purposes, as user-mode software is easier to modify than the driver. But the user-mode mapping is not the problem: I also have an IOCTL that writes data sent from the application to the registers in kernel mode, and the result is the same. When I access the registers while the periodic reading and writing is running, the PC freezes quite often.

> I am using ZwMapViewOfSection on PhysicalMemory to accomplish this

Then you map the same physical memory as both cached and noncached.
This causes undefined behavior, and the symptoms are like those you described.
Do not use PhysicalMemory. Make an MDL, as Anton suggested.

– pa

Well, but I have the same problem even when I am not accessing the memory from user mode. In that case the mapping is still created (ZwMapViewOfSection is called, but the pointers are not used), and I access the memory only in kernel mode: once from the thread that does the periodic checks and once from the IOCTL. Can the cached/non-cached problem (as you’ve described) still be there?

xxxxx@centrum.cz wrote:

> Hi, I have a rather strange problem with memory access to a custom PCIe device.
>
> I have a PCIe device with a BAR that exposes some registers (a 32-bit non-prefetchable memory BAR), basically about 10 kB of memory. When my driver is opened, it starts a thread that periodically reads and writes some of the registers (at an interval of about 100 ms). User-mode software can also write to those registers through a memory mapping.

That is almost always a mistake, for exactly the reason you describe –
no interlocking. The driver should own the registers exclusively. Your
applications should send an ioctl to read and write the registers. That
way, the driver can serialize register access, if required.

> Could there be some concurrency problem?

Absolutely.

> It’s normal 32bit access, do I have to “lock” the memory somehow to use it from the kernel mode? Can the simultaneous access of the memory from the user mode and kernel mode lead to PC lockup?

That depends in part on your hardware. For example, if you have
indirect registers, where you write an index to one register and then
read or write data from another register, then I hope it is clear that
it is inherently unsafe to allow unsynchronized access from multiple
sources.
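
To make the indirect-register hazard concrete, here is a minimal user-mode sketch, not driver code. Everything in it is hypothetical: `sim_device`, `indirect_write`, `indirect_read`, and `run_stress` are illustration-only names, the "device" is an ordinary struct rather than a mapped BAR, and a pthread mutex stands in for whatever lock a real driver would use. It assumes a POSIX system. Two threads each write and read back their own internal register through a shared index/data pair; because the lock covers the index write and the data access as one unit, neither thread can have its index overwritten halfway through.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical simulated device: an INDEX register selects which internal
   register the next DATA access touches. Not a real hardware layout. */
#define NUM_INTERNAL_REGS 16u

typedef struct {
    uint32_t index;                        /* simulated INDEX register */
    uint32_t internal[NUM_INTERNAL_REGS];  /* registers reached via DATA */
    pthread_mutex_t lock;                  /* serializes index+data pairs */
} sim_device;

static void sim_device_init(sim_device *dev)
{
    dev->index = 0;
    for (uint32_t i = 0; i < NUM_INTERNAL_REGS; i++)
        dev->internal[i] = 0;
    pthread_mutex_init(&dev->lock, NULL);
}

/* Write the index, then the data, as one indivisible operation. */
static void indirect_write(sim_device *dev, uint32_t reg, uint32_t value)
{
    assert(reg < NUM_INTERNAL_REGS);
    pthread_mutex_lock(&dev->lock);
    dev->index = reg;                   /* step 1: select the register */
    dev->internal[dev->index] = value;  /* step 2: access through DATA */
    pthread_mutex_unlock(&dev->lock);
}

/* Write the index, then read the data, under the same lock. */
static uint32_t indirect_read(sim_device *dev, uint32_t reg)
{
    assert(reg < NUM_INTERNAL_REGS);
    pthread_mutex_lock(&dev->lock);
    dev->index = reg;
    uint32_t value = dev->internal[dev->index];
    pthread_mutex_unlock(&dev->lock);
    return value;
}

typedef struct {
    sim_device *dev;
    uint32_t reg;         /* the only register this worker touches */
    uint32_t iterations;  /* must stay below 2^24 for the value encoding */
    uint32_t mismatches;  /* read-backs that did not match the write */
} worker_args;

static void *worker(void *p)
{
    worker_args *w = p;
    for (uint32_t i = 0; i < w->iterations; i++) {
        uint32_t value = (w->reg << 24) | i;  /* unique per worker and step */
        indirect_write(w->dev, w->reg, value);
        if (indirect_read(w->dev, w->reg) != value)
            w->mismatches++;
    }
    return NULL;
}

/* Run two workers concurrently; return the total number of mismatches. */
static uint32_t run_stress(uint32_t iterations)
{
    sim_device dev;
    sim_device_init(&dev);
    worker_args a = { &dev, 1, iterations, 0 };
    worker_args b = { &dev, 2, iterations, 0 };
    pthread_t ta, tb;
    pthread_create(&ta, NULL, worker, &a);
    pthread_create(&tb, NULL, worker, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    pthread_mutex_destroy(&dev.lock);
    return a.mismatches + b.mismatches;
}
```

Calling `run_stress(100000)` should return 0 mismatches. If the lock is removed from the two accessors, one thread can change `index` between the other thread's index write and its data access, so the data lands in the wrong register. That is the kind of corruption that unsynchronized access from a kernel thread, an ioctl path, and a user-mode mapping invites.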


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

What is the latency of the memory controller on your device?

So if I understand you correctly, you are treating the pci device
memory space simply as a buffer that for testing purposes you are
reading/writing concurrently from multiple threads. You are not
attempting unsynchronized control of the device through control
registers. For both kernel mode and user mode concurrent access
read/write testing, you are getting system hangs. In the case for
kernel mode you are using the same mapping of the pci device address
space, but separate and perhaps concurrent threads. The kernel mode
test also hangs, so the potential cache mode conflict between the user
mode mapping and the kernel mode mapping can be ruled out.

Mark Roddy

On Mon, Nov 22, 2010 at 10:57 AM, wrote:
> Well, but I have the same problem even when I am not accessing the memory from user mode. In that case the mapping is still created (ZwMapViewOfSection is called, but the pointers are not used), and I access the memory only in kernel mode: once from the thread that does the periodic checks and once from the IOCTL. Can the cached/non-cached problem (as you’ve described) still be there?
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer
>

I don’t know the exact latency, but it should be around 80 ns.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of George M. Garner Jr.
Sent: Monday, November 22, 2010 7:42 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Concurrent memory access to PCIe device

What is the latency of the memory controller on your device?


wrote in message news:xxxxx@ntdev…
> Well, but I have the same problem even when I am not accessing the memory
> from user mode. In that case the mapping is still created
> (ZwMapViewOfSection is called, but the pointers are not used), and I access
> the memory only in kernel mode: once from the thread that does the periodic
> checks and once from the IOCTL. Can the cached/non-cached problem (as you’ve
> described) still be there?
>

AFAIK yes. Just creating a conflicting mapping (cached and noncached)
is enough to confuse the memory controller.

–pa

Well, I don’t know what you mean by “You are not attempting unsynchronized
control of the device through control registers”.

But generally you are right. I do treat the registers as a buffer. Access to
some registers should be serialized (and will be in the production version of
the driver), but the hardware will not do any damage if the access is not
serialized. The hardware will issue interrupts in the future, but interrupt
generation is currently disabled for testing purposes.

When the driver is opened by the application, the driver starts a thread (if
it is not already running) that reads and writes some registers about every
100 ms. Also, when the driver is opened, the registers are mapped into the
user-mode application’s address space.

Then I can run two tests: one reads and writes the registers from user mode
(using the mapped memory), and the other issues IOCTLs that do the same in
kernel mode. Both tests hang very often. The access from user mode is not
synchronized with the kernel-mode thread; the IOCTLs are synchronized only
with other IOCTLs, not with the thread.

When I do not run any tests and just open the driver, the “periodic” thread
alone runs without any problem. When I disable the thread and run only the
tests, everything runs correctly. Even when I run several applications
simultaneously and run the tests, everything is OK, but only with the thread
disabled. Once the thread is enabled and I start the tests, the system hangs.

The test machine is dual-core, in case that matters.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Mark Roddy
Sent: Monday, November 22, 2010 9:04 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Concurrent memory access to PCIe device

So if I understand you correctly, you are treating the pci device
memory space simply as a buffer that for testing purposes you are
reading/writing concurrently from multiple threads. You are not
attempting unsynchronized control of the device through control
registers. For both kernel mode and user mode concurrent access
read/write testing, you are getting system hangs. In the case for
kernel mode you are using the same mapping of the pci device address
space, but separate and perhaps concurrent threads. The kernel mode
test also hangs, so the potential cache mode conflict between the user
mode mapping and the kernel mode mapping can be ruled out.

Mark Roddy

There was a concern expressed that your tests were attempting
unsynchronized stateful control operations on device registers - that
seems to not be the case.

I still can’t tell if your test is failing with concurrent read/write
access using the same kernel mode mapping, or if the failure only
occurs using two mappings, one the kernel mode mapping obtained from
start device, the other the user mode mapping obtained using
ZwMapViewOfSection. You should make sure that you are using the same
memory caching type for both mappings. As another poster noted, simply
having two conflicting mappings is a problem.

Mark Roddy

2010/11/22 Martin ?i?ka :
> Well, I don’t know what you mean by “You are not attempting unsynchronized
> control of the device through control registers”.
>
> But generally you are right. I do treat the registers as a buffer. Access to
> some registers should be serialized (and will be in the production version of
> the driver), but the hardware will not do any damage if the access is not
> serialized. The hardware will issue interrupts in the future, but interrupt
> generation is currently disabled for testing purposes.
>
> When the driver is opened by the application, the driver starts a thread (if
> it is not already running) that reads and writes some registers about every
> 100 ms. Also, when the driver is opened, the registers are mapped into the
> user-mode application’s address space.
>
> Then I can run two tests: one reads and writes the registers from user mode
> (using the mapped memory), and the other issues IOCTLs that do the same in
> kernel mode. Both tests hang very often. The access from user mode is not
> synchronized with the kernel-mode thread; the IOCTLs are synchronized only
> with other IOCTLs, not with the thread.
>
> When I do not run any tests and just open the driver, the “periodic” thread
> alone runs without any problem. When I disable the thread and run only the
> tests, everything runs correctly. Even when I run several applications
> simultaneously and run the tests, everything is OK, but only with the thread
> disabled. Once the thread is enabled and I start the tests, the system hangs.
>
> The test machine is dual-core, in case that matters.