Hi everyone. I am looking for someone who can help with, or at least advise on, a memory performance problem when running Windows 10 as a KVM guest under QEMU.
I have an AMD Threadripper platform running Linux, hosting both Windows and Linux VMs under QEMU with KVM acceleration. Linux guests achieve near-native memory performance, roughly 36 GB/s at any copy size. Windows guests, however, achieve native performance only until the memcpy block size exceeds exactly 1,054,735 bytes, at which point throughput drops to 12-14 GB/s.
I have tried various memcpy implementations (glib, vcruntime, custom, apex), but no matter what I do I cannot get above 14 GB/s.
I have been poring over the QEMU source for days now, working with a few other people, and we are unable to determine why Windows copy performance is so abysmal.
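
For anyone who wants to try reproducing the pattern in their own guests, here is a rough sketch of a block-size sweep. It is only an illustration and not the benchmark that produced the numbers above: it is written in Python and calls the C runtime's memmove through ctypes, and the function name copy_throughput_gbps, the list of block sizes, and the 256 MiB total per run are all assumptions chosen for the example. Because it copies the same source and destination buffers repeatedly, small blocks will mostly measure cache bandwidth rather than DRAM bandwidth, so treat the absolute figures with caution and compare host, Linux guest, and Windows guest relative to each other.

```python
import ctypes
import time

# Hypothetical block sizes for the sweep: a few below the threshold reported
# above, the threshold itself (1,054,735 bytes), one byte over, and some larger.
BLOCK_SIZES = [64 * 1024, 512 * 1024, 1_054_735, 1_054_736,
               2 * 1024 * 1024, 16 * 1024 * 1024]


def copy_throughput_gbps(block_size, total_bytes=256 * 1024 * 1024):
    """Copy roughly total_bytes in chunks of block_size; return GB/s (1 GB = 1e9 bytes)."""
    src = ctypes.create_string_buffer(block_size)
    dst = ctypes.create_string_buffer(block_size)

    # Touch both buffers first so page faults do not land in the timed region.
    ctypes.memset(src, 1, block_size)
    ctypes.memset(dst, 0, block_size)

    iterations = max(1, total_bytes // block_size)
    start = time.perf_counter()
    for _ in range(iterations):
        # ctypes.memmove forwards to the platform C runtime's memmove.
        ctypes.memmove(dst, src, block_size)
    elapsed = time.perf_counter() - start

    return iterations * block_size / elapsed / 1e9


if __name__ == "__main__":
    for size in BLOCK_SIZES:
        print(f"{size:>12,d} bytes: {copy_throughput_gbps(size):6.2f} GB/s")
```

Running the same script on the host, in a Linux guest, and in a Windows guest should show whether the drop appears at the same block size on your setup.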