Alternate RSS hash function for Windows

ronakdoshi Member Posts: 18


Currently, it seems Windows only supports the Toeplitz hash function: NdisHashFunctionToeplitz is the only hash function indicated.
The link above says the same, and has fields reserved for other hash functions. So, the question is: do we know if Windows plans to support more RSS hash functions for miniport drivers? Or are they already supported (any docs pointing to this)?

Toeplitz, though good enough, is very computationally intensive, so I wanted to see whether we can use a cheaper hash function to perform RSS.
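For reference, the cost mentioned here comes from the shape of the Toeplitz computation: for every set bit of the input, a 32-bit window of the secret key is XORed into the result. The following is a minimal, unoptimized Python sketch of that idea, not driver code. The key is a sample 40-byte value used only for illustration (an assumption; a real miniport receives its key from NDIS), and `toeplitz_hash` and `flow_bytes` are hypothetical helper names.

```python
import socket
import struct

# Sample 40-byte RSS key, used here only for illustration (assumption).
SAMPLE_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of `data` under `key`."""
    key_bits = len(key) * 8
    data_bits = len(data) * 8
    if key_bits < data_bits + 31:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i in range(data_bits):
        # Test input bit i, counting from the most significant bit.
        if data[i // 8] & (0x80 >> (i % 8)):
            # XOR in the 32-bit key window that starts at key bit i.
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def flow_bytes(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Pack a TCP/IPv4 4-tuple: source address, destination address,
    source port, destination port, all in network byte order."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))


flow = flow_bytes("66.9.149.187", "161.142.100.80", 2794, 1766)
print(hex(toeplitz_hash(SAMPLE_KEY, flow)))
```

The per-bit loop is what makes a naive software implementation expensive: a 12-byte TCP/IPv4 tuple needs up to 96 conditional XORs per packet, which is why implementations tend to use lookup tables or a per-flow cache.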

Another question: even if another algorithm is used to perform RSS, will Windows complain about it (since Windows expects Toeplitz to be used)?



  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 573

    do we know if Windows plans to support more RSS hash functions for miniport drivers?

    I'm not aware of current plans to add other hash functions. But I'm interested in any alternatives you might propose.

    Keep in mind -- even if we hypothetically added support for a new hash function to the next release of Windows Server, you'd still have customers running Windows Server 2019 for, approximately, 15 years. So those customers will all still want to know why RSS behaves oddly with your device.

    Even if another algorithm is used to perform RSS, will Windows complain about it

    You could build a NIC that goes and does the "wrong" hash function on every packet. You'd still get spreading, and thus be able to scale out to more CPU cores to handle the load. That would possibly still be a performance improvement over no RSS at all, depending on workload and system configuration, etc.

    What goes wrong is that, internally, the TCP stack is going to use the Toeplitz hash (with the hash key and indirection table that we share with the NIC) to bucket its internal data structures. So if the NIC and the TCP stack don't agree on which CPU should process each socket's traffic, then a lot of packets will have to do costly cross-processor lookups to grab the socket data structure from the wrong processor.
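To make the disagreement concrete, here is a minimal sketch of how a 32-bit hash might be turned into a target CPU: the low-order bits of the hash index an indirection table whose entries name processors. The table contents, both hash values, and the helper name `cpu_for_hash` are illustrative assumptions, not values from a real system.

```python
from typing import Sequence


def cpu_for_hash(hash_value: int, indirection_table: Sequence[int]) -> int:
    """Pick a CPU by indexing the table with the low-order hash bits."""
    size = len(indirection_table)
    if size == 0 or size & (size - 1):
        raise ValueError("table size must be a power of two")
    return indirection_table[hash_value & (size - 1)]


# Hypothetical 16-entry table spreading traffic over CPUs 0..3.
table = [0, 1, 2, 3] * 4

# Made-up hash values for the same flow: one as the TCP stack computed it,
# one as a NIC using a different hash function computed it.
stack_hash = 0x51CCC178
nic_hash = 0x0BADF00D

stack_cpu = cpu_for_hash(stack_hash, table)
nic_cpu = cpu_for_hash(nic_hash, table)
print(f"stack expects CPU {stack_cpu}, NIC delivered to CPU {nic_cpu}")
```

When the two values differ, the packet is indicated on one processor while the socket state is bucketed on another, which is the cross-processor lookup described above.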

    Honestly, I don't know how bad that really is. Perhaps it's only a 10% performance penalty, which would be more than offset by the ability to spread across 4 processors instead of just 1 processor. Measure it and see. (If you don't have silicon, you can simulate the effect of using the wrong hash algorithm by doing Toeplitz and using the wrong hash key. If you have a kernel debugger, you can even do that to an off-the-shelf NIC by intercepting and changing the OID on its way down.)
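The "wrong hash key" experiment can also be approximated offline before touching a driver. The sketch below hashes a set of random 12-byte flow tuples with two different Toeplitz keys and counts how often they select different CPUs. Everything here is an assumption made for illustration: the keys and flows are pseudo-random, the 128-entry table over 4 CPUs is hypothetical, and `mismatch_fraction` is an invented helper. It estimates only how often placement disagrees, not the performance penalty, which still has to be measured on a real system.

```python
import random


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: XOR a key window for every set input bit."""
    key_bits = len(key) * 8
    key_int = int.from_bytes(key, "big")
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def mismatch_fraction(good_key, wrong_key, table, flows):
    """Fraction of flows that the two keys steer to different CPUs."""
    mask = len(table) - 1  # table size is assumed to be a power of two
    wrong = 0
    for flow in flows:
        cpu_stack = table[toeplitz_hash(good_key, flow) & mask]
        cpu_nic = table[toeplitz_hash(wrong_key, flow) & mask]
        if cpu_stack != cpu_nic:
            wrong += 1
    return wrong / len(flows)


rng = random.Random(2020)  # fixed seed so the run is repeatable
good_key = bytes(rng.getrandbits(8) for _ in range(40))
wrong_key = bytes(rng.getrandbits(8) for _ in range(40))
flows = [bytes(rng.getrandbits(8) for _ in range(12)) for _ in range(2000)]
table = [i % 4 for i in range(128)]  # hypothetical: 128 entries, 4 CPUs

fraction = mismatch_fraction(good_key, wrong_key, table, flows)
print(f"{fraction:.0%} of flows land on a different CPU")
```

With 4 CPUs and two unrelated keys, roughly three out of four flows should end up on a CPU the stack did not expect, since agreement happens only by chance.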

    Windows itself won't complain, per se, since Windows always expects some packets to arrive on the wrong CPU, in various edge cases. E.g., there's always a race between an indirection table update OID going down, and the traffic coming up. There's going to be some packets that get processed by some RSS-unaware filter driver that breaks the RSS hash or spreading. So Windows will always cope with "wrong" RSS/hashing -- it's just a question of how much performance you lose.

    The HLK reserves the right to complain about incorrectly-performed RSS, though. We reserve the right to prevent your driver from getting certified for Windows Server, or for becoming an inbox driver if you can't pass a (hypothetical) HLK test that validates your driver performs RSS to specification.

  • ronakdoshi Member Posts: 18

    Thank you Jeffrey for the quick reply.

    I'm not aware of current plans to add other hash functions. But I'm interested in any alternatives you might propose.

    The Toeplitz hash is costly (even if you use a cache) when there are a lot of sessions. There are other hashes, such as CRC32, the Jenkins hash, inverted XOR, etc., which can be used. Linux supports CRC32 and XOR-based hashes for RSS (and also uses jHash on Tx). These hashes are computationally cheap and give a decent spread across queues/CPUs. Supporting such algorithms (with Toeplitz still the default) would give additional flexibility in choosing the algorithm and could improve performance.
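As a rough illustration of the cheaper alternatives named here, the sketch below applies a CRC32 and a simple XOR fold to the same 12-byte TCP/IPv4 tuple and checks how evenly each spreads pseudo-random flows over 4 queues. The exact CRC32 and XOR variants used by the poster, by Linux, or by any NIC are not stated in the thread, so `crc32_hash`, `xor_hash`, and `spread` are stand-ins chosen for illustration, and the flows are random bytes rather than captured traffic.

```python
import random
import struct
import zlib
from collections import Counter


def crc32_hash(flow: bytes) -> int:
    """CRC32 over the packed tuple (zlib's polynomial, an assumption)."""
    return zlib.crc32(flow) & 0xFFFFFFFF


def xor_hash(flow: bytes) -> int:
    """XOR the 12-byte tuple together as three 32-bit words."""
    src, dst, ports = struct.unpack("!III", flow)
    return src ^ dst ^ ports


def spread(hash_fn, flows, num_queues):
    """Count how many flows each queue receives under hash_fn."""
    counts = Counter(hash_fn(flow) % num_queues for flow in flows)
    return [counts[q] for q in range(num_queues)]


rng = random.Random(7)  # fixed seed so the run is repeatable
flows = [bytes(rng.getrandbits(8) for _ in range(12)) for _ in range(4000)]

crc_counts = spread(crc32_hash, flows, 4)
xor_counts = spread(xor_hash, flows, 4)
print("crc32:", crc_counts)
print("xor:  ", xor_counts)
```

One caveat with the plain XOR fold as written: the low-order bits of the result do not depend on the source port, so traffic that differs only in source port would all map to the same queue. Real traffic is less uniform than random bytes, so the spread should be checked against representative flows.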

    I tried CRC32, and it showed a huge performance difference (in CPU cycles spent, without the RSS cache) while performing RSS with the same spread. I also see that physical NICs such as Intel, Mellanox, ENA (Amazon), etc. support XOR/CRC32-based hashes.

    Of course, since the default would still be Toeplitz, certification shouldn't be a problem.


  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 573

    It's not clear to me -- is your driver calculating the RSS hash in software (on the main CPU)?

  • ronakdoshi Member Posts: 18

    Yes, in software.

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 573

    Hmm. We don't usually recommend doing offloads entirely on the CPU, since the OS could usually do it just as well. Have you benchmarked a performance improvement from doing RSS entirely in software in the miniport driver?

  • ronakdoshi Member Posts: 18

    Jeffrey, this is in a virtual environment (apologies for not mentioning it earlier). To spread the traffic over SW queues, RSS is performed in software.
