Dear all,

I heard that write-combining lets the hardware collect the data from several memory accesses (for example, WRITE_REGISTER_ULONG) and then send up to 32 bytes (the cache size) to the device at once.

So I ran a trial with my PCIe driver. When the Plug and Play dispatch routine handles resource allocation, I map the physical address (32-bit memory at BAR0) to a virtual address with MmMapIoSpace(…, …, MmWriteCombined). I expect the hardware (the root complex) to send 32 bytes to the PCIe bus when “WRITE_REGISTER_BUFFER_ULONG” is called.

But when I send data, the PC hangs and I must restart it.
I checked that the driver completes the write function, so the hang may be related to a hardware conflict somewhere.
My motherboard is a “Lenovo G31T-LM”, which uses the Intel “G31T” chipset (XP 32-bit).
I heard that the north bridge must support write-combining before I can map a physical address as “WriteCombined” memory. However, I don’t know how to check this for the chipset, because its documentation has no explanation of write-combining.

Do you have any suggestions? Do you know of any motherboard that supports “WriteCombined”?

Best Regards
HanNguyen
PCI Bus Analyzer.
However, write-combining is generally used only by video devices.
Mark Roddy
On Tue, Jul 5, 2011 at 10:00 AM, Nhat Han wrote:
> [...]
> — NTDEV is sponsored by OSR For our schedule of WDF, WDM, debugging and
> other seminars visit: http://www.osr.com/seminars To unsubscribe, visit
> the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
Nhat Han wrote:
> I heard that write-combining lets the hardware collect the data from
> several memory accesses (for example, WRITE_REGISTER_ULONG) and then
> send up to 32 bytes (the cache size) to the device at once.

The root complex CAN do this. It is a performance optimization. It is
worthwhile, so it is commonly done.

> So I ran a trial with my PCIe driver. When the Plug and Play dispatch
> routine handles resource allocation, I map the physical address
> (32-bit memory at BAR0) to a virtual address with
> MmMapIoSpace(…, …, MmWriteCombined). I expect the hardware (the root
> complex) to send 32 bytes to the PCIe bus when
> “WRITE_REGISTER_BUFFER_ULONG” is called.
>
> But when I send data, the PC hangs and I must restart it.

Does it work if you write one dword at a time? Does your hardware
implement this whole address range? Remember, if you do a
WRITE_REGISTER_BUFFER_ULONG of 0x100 bytes to offset 0x100, it’s going
to get written to consecutive addresses starting at 0x100 ending at
0x1FF. What does your device do with these writes?
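
The address arithmetic in the question above can be checked with a small user-mode sketch. It is illustrative only: `buffered_write_range` and `range_fits_bar` are hypothetical helpers, not WDK functions. The one WDK fact it relies on is that the Count argument of WRITE_REGISTER_BUFFER_ULONG is a number of ULONG values, not a number of bytes, so 0x100 bytes corresponds to a Count of 0x40.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive range of BAR offsets touched by a buffered register write. */
typedef struct {
    uint32_t first; /* offset of the first byte written */
    uint32_t last;  /* offset of the last byte written  */
} offset_range;

/*
 * Hypothetical helper (not a WDK function): computes the offsets covered
 * when `count` 32-bit values are written to consecutive addresses
 * starting at `start_offset`. The count is in ULONGs, not bytes.
 */
static offset_range buffered_write_range(uint32_t start_offset, uint32_t count)
{
    offset_range r;
    assert(count > 0);
    r.first = start_offset;
    r.last = start_offset + count * (uint32_t)sizeof(uint32_t) - 1u;
    return r;
}

/* Nonzero if the whole range lies inside a BAR of `bar_length` bytes. */
static int range_fits_bar(offset_range r, uint32_t bar_length)
{
    return r.first <= r.last && r.last < bar_length;
}
```

With these helpers, `buffered_write_range(0x100, 0x40)` covers offsets 0x100 through 0x1FF, matching the range described above, and a Count of 4 covers 16 bytes. Whether the device actually decodes every offset in that range is a separate hardware question that this sketch cannot answer.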
> I checked that the driver completes the write function, so the hang
> may be related to a hardware conflict somewhere.
>
> Do you have any suggestions? Do you know of any motherboard that
> supports “WriteCombined”?

They ALL support write-combining at the level you are talking about.
They HAVE to. Processors run 5 times faster than memory. Unless they
want the processor to spend all of its time waiting for memory, the
chipset MUST support write-combining. That doesn’t necessarily mean the
PCIe root-complex does it, but even if it didn’t it wouldn’t cause a hang.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
There is a subtle problem with write combining to be aware of, however – register ordering. Suppose I have some control registers mapped into a BAR and the registers are contiguous (0xa, 0xb, 0xc etc). If I have write combining “off” and write the registers in the order 0xa, 0xc, 0xb then that will be the sequence the HW sees: a, c, b in three bus write cycles. If I have write combining “on” and do the same thing then the HW will (likely) see a,b,c as one bus write cycle. This can be a problem if one of those registers is a “start” bit or the ordering of the register writes is important …
Generally I have write combining enabled for BARs used for DMA operations, write combining off for control registers and PIO registers …
Cheers!
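
The ordering effect described in the post above can be illustrated with a toy user-mode model. This is a sketch under stated assumptions, not a description of any real CPU or root complex: the 16-byte window, the `wc_buffer` type and the `wc_*` functions are all invented for illustration, and real hardware decides for itself when and how to flush.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WC_LINE 16u /* toy window size in bytes; real hardware differs */

/* One bus write as the device would see it. */
typedef struct {
    uint32_t addr;          /* address of the first byte */
    uint32_t len;           /* number of bytes in this bus cycle */
    uint8_t  data[WC_LINE];
} bus_write;

/* Toy write-combining buffer covering one aligned WC_LINE-byte window. */
typedef struct {
    uint32_t base;            /* aligned base address of the window */
    uint8_t  data[WC_LINE];
    uint8_t  valid[WC_LINE];  /* 1 where a byte has been stored */
} wc_buffer;

static void wc_init(wc_buffer *wc, uint32_t base)
{
    memset(wc, 0, sizeof *wc);
    wc->base = base;
}

/* CPU-side byte store: lands in the buffer only, no bus traffic yet. */
static void wc_store(wc_buffer *wc, uint32_t addr, uint8_t value)
{
    assert(addr >= wc->base && addr - wc->base < WC_LINE);
    wc->data[addr - wc->base] = value;
    wc->valid[addr - wc->base] = 1;
}

/*
 * Flush: emit each run of contiguous stored bytes as one bus write, in
 * ascending address order. The original program order is lost here.
 * Returns the number of bus writes placed in `out` (room for WC_LINE).
 */
static size_t wc_flush(wc_buffer *wc, bus_write *out)
{
    size_t n = 0;
    uint32_t i = 0;
    while (i < WC_LINE) {
        bus_write *w;
        if (!wc->valid[i]) { i++; continue; }
        w = &out[n++];
        w->addr = wc->base + i;
        w->len = 0;
        while (i < WC_LINE && wc->valid[i]) {
            w->data[w->len++] = wc->data[i];
            wc->valid[i++] = 0;
        }
    }
    return n;
}
```

Storing to 0xa, then 0xc, then 0xb and flushing yields a single bus write of three bytes starting at 0xa, so a device that expects to see 0xb written last never observes that order. Without combining, the same three stores would reach the device as three separate cycles in program order.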
Dear Choward,

Actually, my device doesn’t have DMA bus mastering, so I only care about sending
data from the PC to the device using PIO. Reordering is not a problem, because my device implements only memory.

So, do you normally use “MmWriteCombined” for PIO?

Best Regards
HanNguyen
— On Wed, 7/6/11, choward@ix.netcom.com wrote:
From: choward@ix.netcom.com
Subject: RE:[ntdev] About using “MmWriteCombined” of MmmapIOSpace
To: “Windows System Software Devs Interest List”
Date: Wednesday, July 6, 2011, 3:55 AM
> [...]
Dear Tim Roberts,

Thank you for your attention to my problem!

> The root complex CAN do this. It is a performance optimization. It is
> worthwhile, so it is commonly done.

I hope so too. Actually, I have no evidence that write-combining is
supported by most motherboards. The WDK explains very little, and the information online doesn’t point to a trusted document.
So if you know of a trusted document that explains it, please point me to it.

> Does it work if you write one dword at a time?

No, I tried two different cases:

First case: sending 32B using WRITE_REGISTER_BUFFER_ULONG with size = 4
Second case: sending 8B using WRITE_REGISTER_BUFFER_ULONG with size = 1

In both cases the PC hangs. As you know, the HAL doesn’t return any status; the PC
just hangs, and I have no information other than the KdPrint output displayed on
the debugging host PC.

> Does your hardware implement this whole address range?

Yes. When allocating resources, in the IRP_MN_START_DEVICE state of the PnP dispatch, I check the field (partialResourceListTranslated->PartialDescriptors->Type);
when it is CmResourceTypeMemory, I do the mapping.

As a result, I map the whole memory of BAR0 (1MB) to a virtual address with “MmWriteCombined”.

pMappedVirtualAddress = (PVOID) MmMapIoSpace(resourceTrans->u.Memory.Start,
        resourceTrans->u.Memory.Length,
        MmNonCached );

The length of BAR0 is 1MB and the whole area of BAR0 is mapped.

> Remember, if you do a WRITE_REGISTER_BUFFER_ULONG of 0x100 bytes to offset 0x100, it’s going to get written to consecutive addresses starting at 0x100 ending at
> 0x1FF. What does your device do with these writes?

As above, I only tested sending short data (32B, 8B), and the PC hung.
I didn’t test sending 0x100 bytes.
But do you mean that I must map the whole physical address range with “MmWriteCombined” before sending data? Is that right?

As above, I map the whole of BAR0.

> but even if it didn’t it wouldn’t cause a hang.

Oh, so my problem is a bit strange. Currently, I don’t know where to start
debugging.

Best Regards
HanNguyen

— On Wed, 7/6/11, Tim Roberts wrote:
From: Tim Roberts
Subject: Re: [ntdev] About using “MmWriteCombined” of MmmapIOSpace
To: “Windows System Software Devs Interest List”
Date: Wednesday, July 6, 2011, 1:42 AM
> [...]
>So, do you send PIO with “MmWriteCombined” normally ?
To control registers - never.
To memory - yes, this is how video memory works.
–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com
Nhat Han wrote:
>> The root complex CAN do this. It is a performance optimization. It is
>> worthwhile, so it is commonly done.
>
> I hope so too. Actually, I have no evidence that write-combining is
> supported by most motherboards. The WDK explains very little, and the
> information online doesn’t point to a trusted document.
> So if you know of a trusted document that explains it, please point
> me to it.

It’s not a WDK issue. It’s not an operating system issue. It’s a
hardware issue. You may be able to find information in the PCIExpress
bus specification, or in the data sheets for your PCIExpress bus
controller, but there’s really not much more to tell.
We have tried to tell you over and over that the task you have set for
yourself (exercising all the options of your PCIExpress board) REQUIRES
a PCIe bus exerciser and analyzer. There is no alternative. You MIGHT
get coverage through write-combining, but if you are truly tasked with
testing your PCIe core at a microscopic level, then you need a bus
exerciser. You cannot rely on the potential behavior of an arbitrary
bus controller.

> First case: sending 32B using WRITE_REGISTER_BUFFER_ULONG with size = 4
> Second case: sending 8B using WRITE_REGISTER_BUFFER_ULONG with size = 1
>
> In both cases the PC hangs. As you know, the HAL doesn’t return any
> status; the PC just hangs, and I have no information other than the
> KdPrint output displayed on the debugging host PC.
>
>> but even if it didn’t it wouldn’t cause a hang.
>
> Oh, so my problem is a bit strange. Currently, I don’t know where to
> start debugging.

We have ALL told you where to start. You have a hardware problem. You
need to do hardware debugging. Logic analyzers, oscilloscopes, PCIe bus
analyzers. This is not a software problem. This is not a driver problem.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
Dear all,

First, I’d like to thank Tim Roberts, who persuaded me that
write-combining works on almost any PC. I now have it running on my 32-bit XP PC.

Instead of “MmNonCached”, I used “MmWriteCombined” as the cache type
in the MmMapIoSpace mapping, and the data rate increased by more than a factor of 10 (14 Mb/s -> 178 Mb/s).

The PC can now send up to 16 DW (equal to the cache size) per burst to the PCIe bus, instead of 1 DW with non-cached memory.
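
A rough way to relate the burst size to the measured speed-up is to count the bursts needed for a fixed transfer. The sketch below is a back-of-the-envelope model, not a measurement: `bursts_needed` is a hypothetical helper, and it assumes every burst is completely full, which real traffic does not guarantee.

```c
#include <assert.h>
#include <stddef.h>

#define DWORD_BYTES 4u

/*
 * Number of bus write bursts needed to move `total_bytes` when each burst
 * carries at most `dwords_per_burst` DWORDs (rounded up).
 */
static size_t bursts_needed(size_t total_bytes, size_t dwords_per_burst)
{
    size_t burst_bytes;
    assert(dwords_per_burst > 0);
    burst_bytes = dwords_per_burst * DWORD_BYTES;
    return (total_bytes + burst_bytes - 1) / burst_bytes;
}
```

For a 1 MB transfer, `bursts_needed(1024 * 1024, 1)` is 262144 and `bursts_needed(1024 * 1024, 16)` is 16384, that is, 16 times fewer bursts. The reported gain (178 / 14, about 12.7 times) sits below that ratio, which is plausible if part of the cost does not scale with the number of bursts.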

The problem I described before (the PC hanging when using write-combining) is not
a problem with write-combining but with the VGA card. The VGA card doesn’t allow writes to BAR0; it may be the control area. When I write data to BAR0, the PC restarts or hangs.

However, my driver doesn’t work well with write-combining on 64-bit XP. When I write 8 bytes to BAR3 (32-bit memory) at address (BASE + 0x00), nothing in the memory changes.
Have you experienced this?
Best Regards
HanNguyen
---------------------------------------
Nguyen Nhat Han
MobiPhone : 0906.739.923
Company : Cty TNHH Thiet ke Renesas
---------------------------------------
— On Thu, 7/7/11, Tim Roberts wrote:
From: Tim Roberts
Subject: Re: [ntdev] About using “MmWriteCombined” of MmmapIOSpace
To: “Windows System Software Devs Interest List”
Date: Thursday, July 7, 2011, 1:50 AM
> [...]
Nhat Han wrote:
> Instead of “MmNonCached”, I used “MmWriteCombined” as the cache type
> in the MmMapIoSpace mapping, and the data rate increased by more than
> a factor of 10 (14 Mb/s -> 178 Mb/s).
> The PC can now send up to 16 DW (equal to the cache size) per burst to
> the PCIe bus, instead of 1 DW with non-cached memory.
>
> The problem I described before (the PC hanging when using
> write-combining) is not a problem with write-combining but with the
> VGA card. The VGA card doesn’t allow writes to BAR0; it may be the
> control area. When I write data to BAR0, the PC restarts or hangs.

That’s quite possible. The VGA card never promised that it would handle
write-combining. Plus, that memory area is also being accessed by your
graphics card driver. You might be trying to access the graphics engine
while it is in the middle of some other operation. You can’t do this.
It’s never going to work.

> However, my driver doesn’t work well with write-combining on 64-bit
> XP. When I write 8 bytes to BAR3 (32-bit memory) at address
> (BASE + 0x00), nothing in the memory changes.

BAR3 of what device? Some devices don’t allow their registers to be
read. You can write to them, but can’t read anything.
This whole exercise is just a waste of your time. You can’t simulate
YOUR device by stealing register spaces from OTHER devices.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
Dear Tim Roberts,

>BAR3 of what device? Some devices don’t allow their registers to be
>read. You can write to them, but can’t read anything.
>This whole exercise is just a waste of your time. You can’t simulate
>YOUR device by stealing register spaces from OTHER devices.

I used the VGA card for this check. On 32-bit XP, I can both read and write
the same memory region with write-combining.

But on 64-bit XP (with write-combining), I can’t write correctly.

Of course, on 64-bit XP that memory area can be written normally in non-cached mode.

But following your advice, I’ll check the hardware operation on my real device
to confirm whether it receives the packets correctly.

My trouble is that I don’t have a PCIe bus capture device, so I can’t
see the transactions on the bus. But my customer has one, so I’ll ask him to check.

I’ll come back to this topic later.

Thank you very much!
Best Regards
HanNguyen
Nguyen Nhat Han
MobiPhone : 0906.739.923
Company : Cty TNHH Thiet ke Renesas
— On Fri, 7/29/11, Tim Roberts wrote:
From: Tim Roberts
Subject: Re: [ntdev] About using “MmWriteCombined” of MmmapIOSpace
To: “Windows System Software Devs Interest List”
Date: Friday, July 29, 2011, 1:08 AM
> [...]