On Mon, 5 Jul 2021 14:01:07 +0100, Andrew Cooper wrote:

> > The last one is always way faster because apparently map/unmap is less costly with a stopped guest.
> That's suspicious.  If true, we've got some very wonky behaviour in the
> hypervisor...

At least the transfer rate in this last iteration is consistent. Since the
only difference I can see is the fact that the domU is suspended, I suspect
the mapping. I have not investigated where the time is spent; I should
probably do that one day to better understand this specific difference.

> > Right now the code may reach up to 15Gbit/s. The next step is to map the domU just once to reach wirespeed.
>
> We can in principle do that in 64bit toolstacks, for HVM guests.  But
> not usefully until we've fixed the fact that Xen has no idea what the
> guest physmap is supposed to look like.

Why would Xen care? My attempt last year with a new save/restore code did
just 'map' the memory on both sides. The 'unmap' was done in exit(). With
this approach I got wirespeed in all iterations with a 10G link.

> At the moment, the current scheme is a little more resilient to bugs
> caused by the guest attempting to balloon during the live phase.

I did not specifically test how a domU behaves when it claims and releases
pages while being migrated. I think this series would handle at least parts
of that: if a page appears or disappears, it will be recognized by
getpageframeinfo. If a page disappears between getpageframeinfo and
MMAPBATCH, I expect an error. This error is fatal right now; perhaps the
code could catch it and move on. If a page disappears after MMAPBATCH, it
will be caught by later iterations.

> Another area to improve, which can be started now, is to avoid bounce
> buffering hypercall data.  Now that we have /dev/xen/hypercall which you
> can mmap() regular kernel pages from, what we want is a simple memory
> allocator which we can allocate permanent hypercall buffers from, rather
> than the internals of every xc_*() hypercall wrapper bouncing the data
> in (potentially) both directions.

That sounds like a good idea. I am not sure how costly the current
approach is.

> Oh - so the speedup might not be from reduced data handling?

At least not on the systems I have now. Perhaps I should test what the
numbers look like with the NIC and the toolstack in node#0, and the domU
in node#1.

Olaf
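
A rough sketch of what "catch the error and move on" could look like on top
of libxenforeignmemory: map a batch of frames and treat per-frame failures
(e.g. a pfn the guest ballooned out between getpageframeinfo and MMAPBATCH)
as "retry in a later iteration" instead of aborting. The helper name, the
batch layout and the logging are made up for illustration; the per-frame
err[] behaviour of xenforeignmemory_map() is the real interface.

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>           /* PROT_READ */
    #include <xenforeignmemory.h>

    /*
     * Map 'count' frames of domain 'domid'.  With a non-NULL err[] array,
     * xenforeignmemory_map() reports failures per frame instead of failing
     * the whole batch, so a frame that vanished (ballooned out) between the
     * type query and the mapping is merely skipped; a later iteration of
     * the live loop will pick it up again if it reappears.
     */
    static void *map_batch(xenforeignmemory_handle *fmem, uint32_t domid,
                           const xen_pfn_t *pfns, size_t count, int *err)
    {
        void *mapping = xenforeignmemory_map(fmem, domid, PROT_READ,
                                             count, pfns, err);

        if ( !mapping )
            return NULL;            /* complete failure, caller decides */

        for ( size_t i = 0; i < count; i++ )
            if ( err[i] )
                fprintf(stderr, "pfn %#lx not mapped (err %d), retry later\n",
                        (unsigned long)pfns[i], err[i]);

        return mapping;
    }

The same idea would also fit the "map the domU just once" experiment: keep
the mapping returned here for the whole migration and only unmap it on exit,
as in last year's prototype.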
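
For the bounce-buffer point, a hedged sketch of the one-time setup a
permanent hypercall buffer pool would need. The pool size, the names and the
idea of carving per-call buffers out of the pool are invented here for
illustration; the open()/mmap()/MADV_DONTFORK sequence on /dev/xen/hypercall
is what libxencall's Linux backend already does per allocation today.

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>

    #define PAGE_SIZE   4096
    #define POOL_PAGES  64          /* arbitrary pool size, illustration only */

    static int hcall_buf_fd = -1;   /* kept open for the life of the pool */

    /*
     * One-time setup: mmap() hypercall-safe memory from the privcmd buffer
     * device.  The pages stay mapped for the lifetime of the process, so
     * xc_*() wrappers could hand out buffers from this pool instead of
     * bouncing their data for every single hypercall.
     */
    static void *hcall_pool_init(void)
    {
        void *pool;

        hcall_buf_fd = open("/dev/xen/hypercall", O_RDWR | O_CLOEXEC);
        if ( hcall_buf_fd < 0 )
            return NULL;

        pool = mmap(NULL, POOL_PAGES * PAGE_SIZE, PROT_READ | PROT_WRITE,
                    MAP_SHARED, hcall_buf_fd, 0);
        if ( pool == MAP_FAILED )
            return NULL;

        /* Hypercall buffers must not be shared with a forked child. */
        madvise(pool, POOL_PAGES * PAGE_SIZE, MADV_DONTFORK);

        return pool;
    }

A real allocator would of course need to hand out and recycle sub-buffers
and fall back to bouncing when the pool is exhausted; the point is only that
the mapping can be set up once rather than per call.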