Re: [ntdev] Unable to achieve high network bandwidth using Winsock Kernel

As you are clearly using TCP, try a user-mode TCP perf test tool like NTTTCP (free from Microsoft).
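A typical NTttcp baseline run looks something like the lines below; the IP address, session count, and duration are placeholders, and exact flag spellings can vary between NTttcp releases.

    receiver:  ntttcp.exe -r -m 8,*,192.168.1.10 -t 60
    sender:    ntttcp.exe -s -m 8,*,192.168.1.10 -t 60

Whatever throughput that reports is the ceiling for the hardware/driver path, and it gives you a number to compare your WSK results against.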

Sent from Surface Pro

From: Maxim S. Shatskih
Sent: ‎Tuesday‎, ‎December‎ ‎23‎, ‎2014 ‎3‎:‎07‎ ‎PM
To: Windows System Software Devs Interest List

real Broadcom NetXtreme BCM57800 10 gigabit) whereas the physical machines use Intel 10G network cards.

Try on physical machines with the Broadcom NetXtreme BCM57800 10 gigabit.

What will the speed be?

:)


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com



The typical causes of low TCP throughput are:

  1. packet loss
  2. failure to push data into the stream quickly enough on the sending end
  3. failure to pull data out of the stream quickly enough on the receiving end

The TCP window size changes depending on the timing of how the data is pushed into / pulled out of the stream on each end and can give you an indication of a problem, but is unlikely to be a root cause.

Assuming that you do not have packet loss, you need to look at how you are reading / writing to the socket. This is confirmed by the fact that simple test programs can achieve the desired throughput, and that the change in timing / driver topology caused by your hypervisor has a significant impact.

Based on the fact that you cite wait functions, you probably have a serial IO model; your use of TCP no-delay and jumbo frames further suggests this. Ultimately this IO model will be your bottleneck, but it may not be obvious in your VM environment, depending on the implementation of the NIC driver for the virtual NIC and the hypervisor switch.

To check, you should set up a perf counter to track your 'call rate' for sending and your 'pending buffer' size for reading. Simple interlocked counters can track these, and then you can sample the results and track trends.
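If it helps, here is a minimal sketch of that kind of instrumentation; the counter and function names are mine and purely illustrative, and the sampling would normally be driven by a periodic timer or worker thread.

    #include <ntddk.h>

    // Illustrative counters: sends submitted per sampling interval, and bytes
    // received but not yet consumed by the reader.
    static volatile LONG64 g_SendCallCount;
    static volatile LONG64 g_PendingRecvBytes;

    // Call once for every send you submit.
    VOID PerfNoteSendCall(VOID)
    {
        InterlockedIncrement64(&g_SendCallCount);
    }

    // Call with a positive delta when data arrives and a negative delta
    // when the consumer drains it.
    VOID PerfAdjustPendingRecv(LONG64 Delta)
    {
        InterlockedExchangeAdd64(&g_PendingRecvBytes, Delta);
    }

    // Sample periodically (timer DPC, worker thread, trace, etc.) and watch the
    // trend: a sagging send call rate points at the sending side, a growing
    // pending-byte count points at the receiving side.
    VOID PerfSample(VOID)
    {
        LONG64 calls   = InterlockedExchange64(&g_SendCallCount, 0);
        LONG64 pending = g_PendingRecvBytes;   // snapshot is good enough for a trend

        DbgPrint("perf: %I64d sends/interval, %I64d bytes pending\n", calls, pending);
    }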

Assuming that I am even close to right, you will want to change to a completion-based, de-serialized design.
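For concreteness, the send side of such a design looks roughly like the sketch below. This is only a sketch: the connection socket and the MDL describing the payload are assumed to exist already, the function names are placeholders, and error handling plus the cap on outstanding sends are omitted.

    #include <ntddk.h>
    #include <wsk.h>

    static IO_COMPLETION_ROUTINE XmitSendComplete;

    static NTSTATUS XmitSendComplete(
        PDEVICE_OBJECT DeviceObject, PIRP Irp, PVOID Context)
    {
        UNREFERENCED_PARAMETER(DeviceObject);
        UNREFERENCED_PARAMETER(Context);

        // Irp->IoStatus.Information holds the number of bytes the provider accepted.
        // Account for it, post the next queued buffer if any, then release the IRP.
        // No thread ever blocks waiting for the send.
        IoFreeIrp(Irp);
        return STATUS_MORE_PROCESSING_REQUIRED;   // we own this IRP
    }

    NTSTATUS XmitPostSend(PWSK_SOCKET Socket, PMDL Mdl, SIZE_T Length)
    {
        const WSK_PROVIDER_CONNECTION_DISPATCH *dispatch =
            (const WSK_PROVIDER_CONNECTION_DISPATCH *)Socket->Dispatch;
        WSK_BUF wskBuf;
        PIRP irp;

        wskBuf.Mdl = Mdl;
        wskBuf.Offset = 0;
        wskBuf.Length = Length;

        irp = IoAllocateIrp(1, FALSE);
        if (irp == NULL) {
            return STATUS_INSUFFICIENT_RESOURCES;
        }

        IoSetCompletionRoutine(irp, XmitSendComplete, NULL, TRUE, TRUE, TRUE);

        // Normally returns STATUS_PENDING; do not wait on it. Keep several sends
        // outstanding (bounded by your own counter) so the stack always has data
        // to put on the wire.
        return dispatch->WskSend(Socket, &wskBuf, 0, irp);
    }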

Sent from Surface Pro

From: xxxxx@gmail.com
Sent: ‎Wednesday‎, ‎December‎ ‎24‎, ‎2014 ‎8‎:‎49‎ ‎AM
To: Windows System Software Devs Interest List

Thanks Alex and Maxim for replying.
Alex:

  1. A similar program in user mode gives high throughput, provided we keep the data flow uninterrupted; a Sleep or WaitForXXX primitive slows it down significantly. We ran netperf, which consumes about 80-85% of the network bandwidth. A sample kernel-mode program also gives good bandwidth, provided we don't wait on any responses or do any other work. But the user-mode program and the kernel-mode driver are simplistic emulations of what we are trying to do and don't capture the exact code flow.
  2. Network capture shows that the TCP window becomes small on the real hardware, whereas it stays at 64K on the virtual machines. We tried to boost it using the SO_SND_BUF and SO_RECV_BUF settings, but that didn't help either.
  3. Nagle is disabled, as we set WSK_FLAG_NO_DELAY. Wireshark shows the PSH flag set in the TCP header.

Maxim:
Tried on these too. The performance is even worse than with the Intel cards. However, this might be because the Windows drivers for the Broadcom cards don't respect jumbo frame settings higher than 1500: if we set anything higher, the capture shows only TCP packets of 546 bytes flowing through. On the Intel cards we have the jumbo frame size set to ~4K.

Again, thanks for replying, guys. Really appreciate it. :)



As you should have inferred from my post, your conclusions are not correct. On-the-wire timing is important for network protocols; this is self-evident. And identical socket client code can produce radically different on-the-wire timing depending on the underlying driver implementation. Ordinarily we talk about a performance reduction due to virtualization, but there is no reason to reject the possibility that the implementation of the virtual NIC has the side effect of masking poor client design. Based on your descriptions, that is your situation; but more information will help us to help you better.

Sent from Surface Pro

From: xxxxx@gmail.com
Sent: ‎Thursday‎, ‎December‎ ‎25‎, ‎2014 ‎7‎:‎13‎ ‎AM
To: Windows System Software Devs Interest List

Actually, I am using the same code on the physical machines that I am using on the virtual machines. If the code were the problem, I should have seen it on both setups. We have tried using both the WskReceiveEvent callbacks and WskReceive; the former gave slightly better performance.
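For reference, the event-callback receive path looks roughly like the sketch below; the callback and dispatch names are placeholders, and enabling the callback (SO_WSK_EVENT_CALLBACK via WskControlSocket, or inheritance from a listening socket) is not shown.

    #include <ntddk.h>
    #include <wsk.h>

    // Invoked by WSK (at IRQL <= DISPATCH_LEVEL) whenever data arrives.
    static NTSTATUS WSKAPI RxOnReceive(
        PVOID SocketContext,
        ULONG Flags,
        PWSK_DATA_INDICATION DataIndication,
        SIZE_T BytesIndicated,
        SIZE_T *BytesAccepted)
    {
        UNREFERENCED_PARAMETER(SocketContext);
        UNREFERENCED_PARAMETER(Flags);
        UNREFERENCED_PARAMETER(BytesIndicated);
        UNREFERENCED_PARAMETER(BytesAccepted);

        if (DataIndication == NULL) {
            return STATUS_SUCCESS;   // no more data will be indicated on this connection
        }

        // Either copy the indicated MDL chain(s) out here and return STATUS_SUCCESS,
        // or queue the indication to a worker and return STATUS_PENDING; in the
        // latter case the worker must call the provider's WskRelease on the
        // indication once the data has been consumed.
        return STATUS_PENDING;
    }

    // Supplied at socket creation time (WskSocket / WskSocketConnect).
    static const WSK_CLIENT_CONNECTION_DISPATCH RxDispatch = {
        RxOnReceive,     // WskReceiveEvent
        NULL,            // WskDisconnectEvent
        NULL             // WskSendBacklogEvent
    };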

