Alternate RSS hash function for Windows

ronakdoshi Member Posts: 12

Hello,

Currently, it seems Windows only supports the Toeplitz hash function; NdisHashFunctionToeplitz is the only hash function indicated.

https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss-hashing-functions
The link above says the same, and reserves fields for other hash functions. So, the question is, do we know if Windows plans to support more RSS hash functions for miniport drivers? Or are they already supported (any docs pointing to this)?

Toeplitz, though good enough, is very computationally intensive. So I wanted to see whether we can use a cheaper hash function to perform RSS.
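
For context on where the cost comes from, the Toeplitz hash can be sketched in a few lines of Python. This is a minimal, unoptimized illustration, not driver code: the names `toeplitz_hash` and `tcp_ipv4_input` are made up for this example, and the key and the IPv4/TCP tuple are the sample values from Microsoft's "Verifying the RSS Hash Calculation" documentation. Every set bit of the input XORs a 32-bit window of the key into the result, so a naive software implementation does work proportional to the number of input bits.

```python
import ipaddress
import struct

# Sample 40-byte secret key from Microsoft's RSS hash verification docs.
SAMPLE_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)

def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of `data` under `key`."""
    if len(key) < len(data) + 4:
        raise ValueError("key must be at least 4 bytes longer than the input")
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for bit_index in range(len(data) * 8):
        byte = data[bit_index // 8]
        if byte & (0x80 >> (bit_index % 8)):
            # XOR in the 32-bit window of the key that starts at this bit.
            shift = key_bits - 32 - bit_index
            result ^= (key_int >> shift) & 0xFFFFFFFF
    return result

def tcp_ipv4_input(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Hash input for TCP over IPv4: src addr, dst addr, src port, dst port."""
    return (ipaddress.IPv4Address(src_ip).packed
            + ipaddress.IPv4Address(dst_ip).packed
            + struct.pack("!HH", src_port, dst_port))

data = tcp_ipv4_input("66.9.149.187", "161.142.100.80", 2794, 1766)
print(hex(toeplitz_hash(SAMPLE_KEY, data)))      # 0x51ccc178
print(hex(toeplitz_hash(SAMPLE_KEY, data[:8])))  # 0x323e8fc2 (addresses only)
```

Real implementations usually avoid the per-bit loop with lookup tables or a per-flow cache, which trades memory for cycles.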

Another question: even if another algorithm is used to perform RSS, will Windows complain about it (as Windows expects Toeplitz to be used)?

Thanks

Comments

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 558

    do we know if Windows plans to support more RSS hash functions for miniport drivers?

    I'm not aware of current plans to add other hash functions. But I'm interested in any alternatives you might propose.

    Keep in mind -- even if we hypothetically added support for a new hash function to the next release of Windows Server, you'd still have customers running Windows Server 2019 for, approximately, 15 years. So those customers will all still want to know why RSS behaves oddly with your device.

    Even if another algorithm is used to perform RSS, will Windows complain about it

    You could build a NIC that goes and does the "wrong" hash function on every packet. You'd still get spreading, and thus be able to scale out to more CPU cores to handle the load. That would possibly still be a performance improvement over no RSS at all, depending on workload and system configuration, etc.

    What goes wrong is that, internally, the TCP stack is going to use the Toeplitz hash (with the hash key and indirection table that we share with the NIC) to bucket its internal data structures. So if the NIC and the TCP stack don't agree on which CPU should process each socket's traffic, then a lot of packets will have to do costly cross-processor lookups to grab the socket data structure from the wrong processor.

    Honestly, I don't know how bad that really is. Perhaps it's only a 10% performance penalty, which would be more than offset by the ability to spread across 4 processors instead of just 1 processor. Measure it and see. (If you don't have silicon, you can simulate the effect of using the wrong hash algorithm by doing Toeplitz and using the wrong hash key. If you have a kernel debugger, you can even do that to an off-the-shelf NIC by intercepting and changing the OID on its way down.)
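
The "wrong hash key" experiment can be approximated offline before touching a real NIC. The sketch below is an illustrative simulation only: it assumes the CPU is chosen by using the low-order bits of the hash as an index into the indirection table, and the names (`toeplitz_hash`, `cpu_for_flow`, `agreement`), the random keys, the 128-entry table over 4 CPUs, and the random flow tuples are all made up for this example. It estimates how often a NIC and a TCP stack that use different keys would pick the same CPU for a flow. It says nothing about the resulting performance cost, which still has to be measured.

```python
import random

def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash; key must be at least len(data) + 4 bytes."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result

def cpu_for_flow(key: bytes, table: list, flow: bytes) -> int:
    """Pick a CPU: low-order hash bits index the indirection table."""
    return table[toeplitz_hash(key, flow) & (len(table) - 1)]

rng = random.Random(2019)  # fixed seed so the run is repeatable
stack_key = bytes(rng.getrandbits(8) for _ in range(40))  # key the stack uses
wrong_key = bytes(rng.getrandbits(8) for _ in range(40))  # key the "NIC" uses
table = [i % 4 for i in range(128)]  # hypothetical 128-entry table, 4 CPUs

# 12 random bytes stand in for (src addr, dst addr, src port, dst port).
flows = [bytes(rng.getrandbits(8) for _ in range(12)) for _ in range(2000)]

def agreement(nic_key: bytes) -> float:
    """Fraction of flows where NIC and stack choose the same CPU."""
    same = sum(
        cpu_for_flow(nic_key, table, f) == cpu_for_flow(stack_key, table, f)
        for f in flows
    )
    return same / len(flows)

print(f"matching key: {agreement(stack_key):.0%} of flows on the expected CPU")
print(f"wrong key:    {agreement(wrong_key):.0%} of flows on the expected CPU")
```

With four CPUs and unrelated keys, one would expect agreement close to chance (about one flow in four), so most flows would take the cross-processor path described above.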

    Windows itself won't complain, per se, since Windows always expects some packets to arrive on the wrong CPU, in various edge cases. E.g., there's always a race between an indirection table update OID going down, and the traffic coming up. There will also be some packets that get processed by some RSS-unaware filter driver that breaks the RSS hash or spreading. So Windows will always cope with "wrong" RSS/hashing -- it's just a question of how much performance you lose.

    The HLK reserves the right to complain about incorrectly-performed RSS, though. We reserve the right to prevent your driver from getting certified for Windows Server, or from becoming an inbox driver, if you can't pass a (hypothetical) HLK test that validates your driver performs RSS to specification.

  • ronakdoshi Member Posts: 12

    Thank you Jeffrey for the quick reply.

    I'm not aware of current plans to add other hash functions. But I'm interested in any alternatives you might propose.

    The Toeplitz hash is costly (even if you use a cache) when there are a lot of sessions. Other hashes could be used instead, such as CRC32, the Jenkins hash, inverted XOR, etc. Linux supports CRC32 and XOR-based hashes for RSS (and also uses jhash on tx). These hashes are computationally cheap and give a decent spread across queues/CPUs. Supporting such algorithms (the default can still be Toeplitz) would give additional flexibility in choosing the algorithm and could improve performance.

    I gave CRC32 a try, and it showed a huge performance difference (in CPU cycles spent, without an RSS cache) while performing RSS with the same spread. I also see that physical NICs from Intel, Mellanox, Amazon (ENA), etc. support XOR/CRC32-based hashes.
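
As a rough illustration of the spread claim (not a reproduction of the measurement above), the sketch below buckets random flows into queues using CRC32 from Python's standard `zlib` module. The queue count, the random flow tuples, and the helper name `queue_for_flow` are assumptions made for the example. It checks only how evenly flows land on queues and says nothing about CPU cycles, which would need to be measured in the driver itself.

```python
import random
import zlib
from collections import Counter

NUM_QUEUES = 4  # assumed queue count for the example

def queue_for_flow(flow: bytes) -> int:
    """Map a flow's 4-tuple bytes to a queue using CRC32 (zlib.crc32)."""
    return zlib.crc32(flow) % NUM_QUEUES

rng = random.Random(12)  # fixed seed so the run is repeatable
# 12 random bytes stand in for (src addr, dst addr, src port, dst port).
flows = [bytes(rng.getrandbits(8) for _ in range(12)) for _ in range(4000)]

counts = Counter(queue_for_flow(f) for f in flows)
for queue in range(NUM_QUEUES):
    share = counts[queue] / len(flows)
    print(f"queue {queue}: {counts[queue]} flows ({share:.1%})")
```

Random tuples are friendlier than real traffic, where addresses and ports cluster, so a check like this is only a first sanity test of the spread.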

    Of course, as the default would still be Toeplitz, certification shouldn't be a problem.

    Thanks

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 558

    It's not clear to me -- is your driver calculating the RSS hash in software (on the main CPU)?

  • ronakdoshi Member Posts: 12

    Yes, in software.

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 558

    Hmm. We don't usually recommend doing offloads entirely on the CPU, since the OS could usually do it just as well. Have you benchmarked a performance improvement from doing RSS entirely in software in the miniport driver?

  • ronakdoshi Member Posts: 12

    Jeffrey, this is in a virtual environment (apologies for not mentioning it before). To spread the traffic over SW queues, RSS is performed in software.
