
PCI-E device bus id and NUMA node association

OSR_Community_User Member Posts: 110,217
Hi All,

I'm working with a NUMA machine (a Tyan box with two CPU sockets) that hosts two CPUs and 8 CUDA-compatible GPUs. The task is to find out which NUMA node a CUDA device with a particular PCI bus ID is nearest to.

Of course, I can run a number of host->device memory transfer tests to measure the throughput and work out which device is near which CPU. However, this is inelegant and time-consuming.

Is it possible to programmatically analyze the machine and determine the {NUMA node <-> set of nearby PCI bus IDs} mapping?

Thanks in advance.

Comments

  • OSR_Community_User Member Posts: 110,217
    Windows can't tell you which NUMA node a PCIe root complex is attached to (and thus which NUMA node the devices underneath are attached to) unless the BIOS tells Windows. It's somewhat unlikely that your Tyan's BIOS does that.

    - Jake Oshins
    Windows Kernel Team

    -----Original Message-----
    From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
    Sent: Sunday, April 21, 2013 11:37 PM
    To: Windows System Software Devs Interest List
    Subject: [ntdev] PCI-E device bus id and NUMA node association

  • MBond Member - All Emails Posts: 846
    If this information is contained in the SMBIOS or ACPI tables, then you have
    a chance. Otherwise, you need to profile your system and guess at the
    hardware connections.

    Do you have control of the BIOS?

  • OSR_Community_User Member Posts: 110,217
    Nope, no control.

    The software I'm working on is not a system application or a driver; it is a normal application that relies on the Win32 API.

    It seems I have no chance of solving this task heuristically. All that is left is a number of memory throughput tests that would show which devices can be accessed faster from which NUMA nodes.
  • MBond Member - All Emails Posts: 846
    I assume you mean that there is no chance of a definitive solution.
    Measuring throughput will give you data, and you then use heuristics to
    associate the processing resources with the best I/O resources.

    I'm not blaming Microsoft for this, because system vendors are notorious for
    the poor quality of this information. But I have been working on and around
    NUMA systems for many years, and the OS-provided facilities for determining
    how an application can reduce waste of hardware resources are still weak.

  • OSR_Community_User Member Posts: 110,217
    Yeah, that's exactly what I mean. The only solution is to analyze the results of the throughput tests.
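
The thread ends on analyzing throughput test results, so here is a minimal sketch of what that analysis might look like. It is not code from any poster. It assumes you have already measured host->device throughput for each GPU while the measuring thread was pinned to each NUMA node in turn. The function name `group_devices_by_nearest_node`, the `min_ratio` threshold, and all bus IDs and numbers in the example are hypothetical placeholders, not real measurements.

```python
from collections import defaultdict
from typing import Dict, List, Union


def group_devices_by_nearest_node(
    throughput_gbps: Dict[str, Dict[int, float]],
    min_ratio: float = 1.1,
) -> Dict[Union[int, str], List[str]]:
    """Group PCI bus IDs under the NUMA node that measured fastest for them.

    throughput_gbps maps a PCI bus ID to {numa_node: measured GB/s}.
    min_ratio is an assumed heuristic: if the best node is not at least
    min_ratio times faster than the runner-up, the device is filed under
    the "ambiguous" key instead of being assigned to a node.
    """
    groups: Dict[Union[int, str], List[str]] = defaultdict(list)
    for bus_id, per_node in sorted(throughput_gbps.items()):
        # Rank nodes for this device from fastest to slowest.
        ranked = sorted(per_node.items(), key=lambda kv: kv[1], reverse=True)
        if not ranked:
            groups["ambiguous"].append(bus_id)  # no measurements at all
            continue
        best_node, best = ranked[0]
        if len(ranked) > 1 and best < min_ratio * ranked[1][1]:
            groups["ambiguous"].append(bus_id)  # difference too small to trust
        else:
            groups[best_node].append(bus_id)
    return dict(groups)


# Made-up numbers purely for illustration.
measurements = {
    "0000:02:00.0": {0: 11.8, 1: 7.9},
    "0000:03:00.0": {0: 11.6, 1: 8.1},
    "0000:83:00.0": {0: 7.7, 1: 11.9},
    "0000:84:00.0": {0: 9.9, 1: 10.1},
}
print(group_devices_by_nearest_node(measurements))
# {0: ['0000:02:00.0', '0000:03:00.0'], 1: ['0000:83:00.0'], 'ambiguous': ['0000:84:00.0']}
```

The ambiguity threshold is there because, as noted above, this is a heuristic rather than a definitive answer: a device whose numbers are nearly equal from both nodes probably needs more test runs before you trust the grouping.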