I’m working with a NUMA machine (a Tyan box with two CPU sockets) that hosts two CPUs and 8 CUDA-compatible GPUs. The task is to find out which NUMA node a CUDA device with a particular PCI bus ID is closest to.
Of course I can run a number of host->device memory transfer tests, look at the throughput, and work out which device is near which CPU, but this is neither elegant nor quick.
Is it possible to programmatically inspect the machine and determine the {NUMA node <-> set of nearby PCI bus IDs} mapping?
Windows can’t tell you which NUMA node a PCIe root complex is attached to (and thus which NUMA node the devices underneath are attached to) unless the BIOS tells Windows. It’s somewhat unlikely that your Tyan’s BIOS does that.
Jake Oshins
Windows Kernel Team
-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@gmail.com
Sent: Sunday, April 21, 2013 11:37 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] PCI-E device bus id and NUMA node association
If this information is contained in the SMBIOS or ACPI tables, then you have
a chance. Otherwise, you need to profile your system and guess at the
hardware connections.
Do you have control of the BIOS?
The software I’m working on is not a system application or a driver; it is a normal application that relies on the Win32 API.
It seems I have no chance to solve this task heuristically; all that is left is a number of memory throughput tests that show which devices can be accessed faster from which NUMA nodes.
I assume that you mean there is no chance for a definitive solution.
Measuring throughput will give you data and you use heuristics to associate
the processing resources with the best IO resources.
I’m not blaming Microsoft for this, because system vendors are notorious for
the poor quality of this information, but I have been working on and around NUMA
systems for many years, and the OS-provided facilities for determining how an
application can reduce waste of hardware resources are still weak.
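To make the "measure, then apply a heuristic" idea concrete, here is a minimal sketch of the association step only. It assumes the host->device bandwidth has already been measured for every (PCI bus ID, NUMA node) pair, for example by pinning the copying thread to one node at a time and timing transfers, with the bus ID string taken from `cudaDeviceGetPCIBusId`. The function name, the `min_ratio` threshold and the sample numbers are all hypothetical and only illustrate the technique.

```python
from collections import defaultdict


def group_devices_by_nearest_node(bandwidth_gbps, min_ratio=1.1):
    """Map each NUMA node to the set of PCI bus IDs that look local to it.

    bandwidth_gbps: {pci_bus_id: {numa_node: measured host->device GB/s}}
    min_ratio: how much faster the best node must be than the runner-up
               before the result is trusted. Devices that do not clear
               this margin are reported under the key None (ambiguous).
    """
    near = defaultdict(set)
    for bus_id, per_node in bandwidth_gbps.items():
        # Rank nodes from fastest to slowest for this device.
        ranked = sorted(per_node.items(), key=lambda kv: kv[1], reverse=True)
        best_node, best_bw = ranked[0]
        if len(ranked) > 1 and best_bw < min_ratio * ranked[1][1]:
            near[None].add(bus_id)  # difference is within the noise margin
        else:
            near[best_node].add(bus_id)
    return dict(near)


# Made-up numbers for illustration only, not measurements from a real machine.
sample = {
    "0000:02:00.0": {0: 11.8, 1: 7.9},
    "0000:03:00.0": {0: 11.6, 1: 8.1},
    "0000:82:00.0": {0: 8.0, 1: 11.7},
    "0000:83:00.0": {0: 9.9, 1: 10.1},  # too close to call
}

if __name__ == "__main__":
    for node, bus_ids in group_devices_by_nearest_node(sample).items():
        print(node, sorted(bus_ids))
```

The threshold matters because throughput measurements are noisy: a device whose two readings differ by only a few percent is better flagged as ambiguous and re-measured than assigned to a node on the strength of one run.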
wrote in message news:xxxxx@ntdev…
Nope, no control.