Implementing PROCESSOR_GROUPS

ronakdoshi Member Posts: 16

Hello,

I am trying to implement cpu groups feature in windows to miniport driver. I see the Interrupt member of _IO_RESOURCE_DESCRIPTOR defined as

struct {
  ...
#if ...
  IRQ_DEVICE_POLICY AffinityPolicy;
  USHORT            Group;
#else
  IRQ_DEVICE_POLICY AffinityPolicy;
#endif
  IRQ_PRIORITY      PriorityPolicy;
  KAFFINITY         TargetedProcessors;
} Interrupt;

Specifies a processor group number. Group is a valid (but optional) member of u.Interrupt only in Windows 7 and later versions of Windows. This member exists only if NT_PROCESSOR_GROUPS is defined at compile time.

So, I need help defining NT_PROCESSOR_GROUPS at compile time. Which header files do I need to include to get this #define? Any other information on implementing this feature would be helpful.

Thanks

Comments

  • Peter_Viscarola_(OSR) Administrator Posts: 7,630

    You define it yourself, if you need to use the processor group feature.

    You can #define it in the source code, or set it as a preprocessor define using a compiler option.

    It's that easy.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers
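A minimal sketch of what defining the macro yourself does. The structure below is a hypothetical stand-in (DEMO_INTERRUPT and the demo_* helpers are not WDK names) that mirrors the conditional layout quoted in the question, so it compiles outside the WDK. In a real driver the define would have to appear before the WDK headers are included, or be passed as a compiler option.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Option 1: define the macro in source, before the header that declares
 * the structure. Option 2: omit this line and pass the define on the
 * compiler command line instead (for example /DNT_PROCESSOR_GROUPS). */
#define NT_PROCESSOR_GROUPS 1

/* Hypothetical stand-in for the Interrupt member; not the real WDK type. */
typedef struct DEMO_INTERRUPT {
#if defined(NT_PROCESSOR_GROUPS)
    int32_t   AffinityPolicy;
    uint16_t  Group;              /* present only when the macro is defined */
#else
    int32_t   AffinityPolicy;
#endif
    int32_t   PriorityPolicy;
    uintptr_t TargetedProcessors;
} DEMO_INTERRUPT;

/* Returns 1 when this translation unit was compiled with group support. */
static int demo_group_member_available(void)
{
#if defined(NT_PROCESSOR_GROUPS)
    return 1;
#else
    return 0;
#endif
}

/* Stores a group number if the member exists; returns 0 on success,
 * -1 when the code was built without NT_PROCESSOR_GROUPS. */
static int demo_set_group(DEMO_INTERRUPT *irq, uint16_t group)
{
    assert(irq != NULL);
#if defined(NT_PROCESSOR_GROUPS)
    irq->Group = group;
    return 0;
#else
    (void)group;
    return -1;
#endif
}
```

Because the member only exists when the macro is defined, any code that touches Group needs the same guard, or the define has to be applied to the whole project.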

  • ronakdoshi Member Posts: 16
    edited January 29

    Thanks a lot Peter. I tried that and it worked. However, I see that the RSS indirection table does not show group information.

    I have configured groupsize to be 4 and I have 8 logical processors. The output below shows the correct CPU group and processor number for MaxProcessor and RssProcessorArray. However, the RSS indirection table passed down by the stack in OID_GEN_RECEIVE_SCALE_PARAMETERS contains processors from group 0 only. I am not sure why that is the case. Does the RSS indirection table use only processors from group 0 by default? And is there a command or some other way to change which processors appear in the indirection table?

    I have also set the appropriate processor and group info for interrupts in the IO_RESOURCE_LIST members:
    ird->u.Interrupt.Group
    ird->u.Interrupt.TargetedProcessors
    ird->u.Interrupt.AffinityPolicy = IrqPolicySpecifiedProcessors
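A small sketch of the arithmetic behind those fields, assuming the groupsize 4 test setup described above. When processor groups are in use, an affinity mask is relative to its group, so a Group value and a mask together identify a processor. The helper names and DEMO_PROC_NUMBER type are hypothetical, and the index-to-group split is only an illustration of this test configuration; a real driver should use the processor numbers the system reports rather than computing them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a (group, number) processor identifier. */
typedef struct DEMO_PROC_NUMBER {
    uint16_t Group;   /* processor group */
    uint8_t  Number;  /* processor number within the group */
} DEMO_PROC_NUMBER;

/* Splits a flat processor index into group and group-relative number,
 * assuming every group holds exactly group_size processors. */
static DEMO_PROC_NUMBER demo_index_to_proc(uint32_t index, uint32_t group_size)
{
    DEMO_PROC_NUMBER p;
    assert(group_size >= 1 && group_size <= 64);
    p.Group  = (uint16_t)(index / group_size);
    p.Number = (uint8_t)(index % group_size);
    return p;
}

/* Builds a group-relative mask with only the given processor's bit set. */
static uint64_t demo_group_relative_mask(uint8_t number)
{
    assert(number < 64);
    return (uint64_t)1 << number;
}
```

With 8 logical processors and a group size of 4, index 7 maps to group 1, number 3, which matches the MaxProcessor value of 1:3 in the output below.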

    PS C:\Windows\system32> Get-NetAdapterRSS -Name Ethernet2
    Name : Ethernet2
    InterfaceDescription : Ethernet Adapter
    Enabled : True
    NumberOfReceiveQueues : 8
    Profile : NUMAStatic
    BaseProcessor: [Group:Number] : 0:0
    MaxProcessor: [Group:Number] : 1:3
    MaxProcessors : 8
    RssProcessorArray: [Group:Number/NUMA Distance] : 0:0/0 0:1/0 0:2/0 0:3/0 1:0/0 1:1/0 1:2/0 1:3/0
    IndirectionTable: [Group:Number] : 0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3
    0:0 0:1 0:2 0:3 0:0 0:1 0:2 0:3

    Thanks

  • Peter_Viscarola_(OSR) Administrator Posts: 7,630
    edited January 30

    Sorry, you’re well into NDIS now and hence well beyond me.

    Perhaps @Jeffrey_Tippet_[MSFT] will come around and comment.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • msr Member Posts: 344

    If you haven't already, look at RSSProfile at https://docs.microsoft.com/en-us/windows-hardware/drivers/network/standardized-inf-keywords-for-rss
    It seems you can use Set-NetAdapterRss -Profile to change the profile.

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 567

    @msr has the right idea.

    I'll fill in the background a bit. The RSS indirection table is the result of a handshake between several components, so it's not always clear where it comes from.

    First, the driver INF sets up some keywords, like *RSS. The administrator has a chance to edit these keywords.

    Next, the administrator can set some RSS configuration, like MaxProcessors, via Set-NetAdapterRss. These aren't exactly keywords, since they don't come from the driver INF.

    NDIS captures the processor topology at boot, and sifts through it:

    • "Hyperthreaded" processors are ignored
    • Processors are sorted by their NUMA distance to the NIC (which is itself ideally obtained from ACPI SRAT)
    • Processors that are excluded by the administrator's RSS configuration above are ignored, e.g. any processor number less than BaseProcessor.

    NDIS publishes (NdisGetRssProcessorInformation) the resulting RSS candidate processor list to both the NIC driver (so it can allocate interrupt vectors) and protocol drivers (so they can cook up the final indirection table).
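The sifting steps above can be sketched roughly as follows. This is not NDIS's actual algorithm, only an illustration of the three rules as described: drop hyperthreaded siblings, drop anything before BaseProcessor, and order the rest by NUMA distance. All types and function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical description of one logical processor. */
typedef struct DEMO_CANDIDATE {
    uint16_t group;
    uint8_t  number;
    uint16_t numa_distance;   /* distance from the NIC; lower is closer */
    int      is_smt_sibling;  /* nonzero for a "hyperthreaded" sibling */
} DEMO_CANDIDATE;

/* True if the processor sorts before the base processor (group, number). */
static int demo_before_base(const DEMO_CANDIDATE *c,
                            uint16_t base_group, uint8_t base_number)
{
    if (c->group != base_group) {
        return c->group < base_group;
    }
    return c->number < base_number;
}

/* Orders by NUMA distance, then group, then number. */
static int demo_compare(const void *a, const void *b)
{
    const DEMO_CANDIDATE *x = a;
    const DEMO_CANDIDATE *y = b;
    if (x->numa_distance != y->numa_distance) {
        return x->numa_distance < y->numa_distance ? -1 : 1;
    }
    if (x->group != y->group) {
        return x->group < y->group ? -1 : 1;
    }
    if (x->number != y->number) {
        return x->number < y->number ? -1 : 1;
    }
    return 0;
}

/* Filters and sorts procs in place; returns how many candidates remain. */
static size_t demo_build_candidates(DEMO_CANDIDATE *procs, size_t count,
                                    uint16_t base_group, uint8_t base_number)
{
    size_t kept = 0;
    assert(procs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (procs[i].is_smt_sibling) {
            continue;
        }
        if (demo_before_base(&procs[i], base_group, base_number)) {
            continue;
        }
        procs[kept++] = procs[i];
    }
    if (kept > 1) {
        qsort(procs, kept, sizeof(procs[0]), demo_compare);
    }
    return kept;
}
```

The result corresponds to the kind of list shown as RssProcessorArray, where each entry carries a group, a number, and a NUMA distance.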

    Finally, the protocol driver makes the ultimate decision on which processors to put into the indirection table. The protocol is allowed to select any processors it likes from the RSS candidate processor set. NDIS does inform the protocol of the "RSS profile" that the administrator selected, but the protocol is not obligated to honor the profile. In fact, Windows comes with 2 protocol drivers that can use RSS: TCPIP and VMSWITCH, and currently only the former honors the RSS profile.

    So to return to the original question: the precise choice of processor numbers in the indirection table comes from either TCPIP or VMSWITCH. If you don't have an external vSwitch over the NIC, it's TCPIP. By default, TCPIP tries to avoid spanning processor groups (actually, NUMA nodes -- all nodes are groups, but not all groups are nodes). But you can nudge it into doing so, using different Profile hints.

    If, on the other hand, you're using VMSWITCH, then you currently don't get even that amount of control over its indirection table algorithm. (It's possible that VMSWITCH will implement support for some or all RSS profiles in a future release, or add some other mechanism for more administrator control.)
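To make the question concrete, here is a hedged sketch of how a protocol could fill an indirection table from the candidate list. It is not TCPIP's or VMSWITCH's algorithm, which this thread does not describe; it is a plain round-robin with a hypothetical span_groups switch, and it only reproduces the pattern in the Get-NetAdapterRSS output earlier in the thread (128 entries cycling through 0:0 to 0:3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical (group, number) pair for one table entry. */
typedef struct DEMO_PROC {
    uint16_t group;
    uint8_t  number;
} DEMO_PROC;

/* Fills table round-robin from candidates. When span_groups is zero,
 * only processors in the first candidate's group are used. Returns the
 * number of distinct candidates that were eligible. */
static size_t demo_fill_indirection_table(const DEMO_PROC *candidates,
                                          size_t candidate_count,
                                          int span_groups,
                                          DEMO_PROC *table,
                                          size_t table_size)
{
    size_t usable = 0;
    size_t next = 0;
    size_t slot = 0;

    assert(candidates != NULL && table != NULL);
    assert(candidate_count > 0);

    /* Count eligible candidates; the first one is always eligible. */
    for (size_t i = 0; i < candidate_count; i++) {
        if (span_groups || candidates[i].group == candidates[0].group) {
            usable++;
        }
    }

    /* Walk the candidate list cyclically, skipping ineligible entries. */
    while (slot < table_size) {
        const DEMO_PROC *c = &candidates[next];
        next = (next + 1) % candidate_count;
        if (span_groups || c->group == candidates[0].group) {
            table[slot++] = *c;
        }
    }
    return usable;
}
```

With the eight candidates from RssProcessorArray and span_groups set to zero, every entry lands in group 0, as in the reported output. With it nonzero, entries alternate between the two groups.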

  • ronakdoshi Member Posts: 16

    Hello Jeffrey

    Thanks for your response. I agree that RSSProfile can be used to tell the protocol driver to select RSS processors based on their NUMA distance. However, how do we tell it to use processors from different groups? There is no VMSwitch, so TCPIP would be deciding the indirection table.

    My system does not have 64 processors, so to test the processor groups feature, I have configured 8 logical processors into 2 groups using

    bcdedit.exe /set groupsize 4
    bcdedit.exe /set groupaware on

    So, when I run Get-NetAdapterRSS, MaxProcessor and RssProcessorArray show group information, but I still do not see groups used in the indirection table. I tried different RSS profiles as well, with the same result.

    I also printed the output of NdisGetRssProcessorInformation() and it, too, shows proper group information.

    FilterResourceRequirements: NdisGetRssProcessorInformation() provided revision 2
    FilterResourceRequirements: NdisGetRssProcessorInformation() provided max: group 1 num 3
    FilterResourceRequirements: NdisGetRssProcessorInformation() provided profile: 4

    Accordingly, interrupts are assigned to the appropriate processor in the respective group. So, I am not sure what is missing or why the TCPIP protocol driver does not use processors from group 1 for the indirection table.

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 567

    Hmm, my understanding of TCPIP is that it will choose to span processor groups, if you set it to Numa or NumaStatic modes. But I didn't write that algorithm, so maybe I'm wrong. It's possible that TCPIP only uses NUMA nodes that all share the same processor group, but that seems like a weird and unnecessary limitation.

    I'll ask around internally and report back if I find out something interesting.

    Meanwhile, you can also try your luck with the HLK: it has a dedicated test for RSS, which should exercise the feature more exhaustively than TCPIP does.

  • ronakdoshi Member Posts: 16
    edited January 31

    Thanks Jeffrey. Yes, my next step is to run the CPU group test and the RSS tests of the HLK and see the results.

    Let me know if you get any other information regarding this.

    Thanks

  • MBond2 Member Posts: 29

    this statement can't be right

    actually, NUMA nodes -- all nodes are groups, but not all groups are nodes

    https://docs.microsoft.com/en-us/windows/win32/procthread/processor-groups

    Processor groups span NUMA nodes, and special applications that are aware of processor groups can span them when ordinary processes don't. SQL Server is again the canonical example.

  • anton_bassov Member Posts: 5,095

    I am trying to implement cpu groups feature in windows to miniport driver.

    What I would advise you to do is to take a short break from your "endeavours", and to read the article that Marion refers to

    I have configured groupsize to be 4 and I have 8 logical processors.

    https://docs.microsoft.com/en-us/windows/win32/procthread/processor-groups

    [begin quote]

    Support for systems that have more than 64 logical processors is based on the concept of a processor group, which is a static set of up to 64 logical processors that is treated as a single scheduling entity. Processor groups are numbered starting with 0. Systems with fewer than 64 logical processors always have a single group, Group 0.

    Windows Server 2008, Windows Vista, Windows Server 2003 and Windows XP: Processor groups are not supported.

    When the system starts, the operating system creates processor groups and assigns logical processors to the groups. If the system is capable of hot-adding processors, the operating system allows space in groups for processors that might arrive while the system is running. The operating system minimizes the number of groups in a system. For example, a system with 128 logical processors would have two processor groups with 64 processors in each group, not four groups with 32 logical processors in each group.

    [end quote]

    Anton Bassov
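The group-count arithmetic in the quoted passage can be sketched as a ceiling division. This is a simplification under stated assumptions: it ignores NUMA node boundaries and any space reserved for hot-added processors, and the function name and group_size parameter are hypothetical. Passing a smaller group_size models the bcdedit groupsize test setting used earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* A processor group holds at most 64 logical processors. */
#define DEMO_MAX_GROUP_SIZE 64u

/* Smallest number of groups needed when each group holds up to
 * group_size logical processors (ceiling division). */
static uint32_t demo_group_count(uint32_t logical_processors,
                                 uint32_t group_size)
{
    assert(logical_processors > 0);
    assert(group_size >= 1 && group_size <= DEMO_MAX_GROUP_SIZE);
    return (logical_processors + group_size - 1) / group_size;
}
```

For 128 logical processors and the default maximum of 64 per group this gives two groups, as in the quote. For 8 logical processors it gives one group at the default size and two groups when the group size is forced to 4.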

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 567

    this statement can't be right

    actually, NUMA nodes -- all nodes are groups, but not all groups are nodes

    It certainly isn't ;)

    I had forgotten the complicated relationship between nodes and groups, and incorrectly thought I could simplify the relationship down to one parenthetical. I don't think it's possible to describe in one sentence -- interested parties had better just read the pages that you've linked.

  • MBond2 Member Posts: 29

    I'm glad that you agree 'cause I was worried for a moment that what I 'know' might not be right
