From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40105)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1dQID0-00044d-ML for qemu-devel@nongnu.org;
	Wed, 28 Jun 2017 15:01:03 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1dQICz-0007hs-6c for qemu-devel@nongnu.org;
	Wed, 28 Jun 2017 15:01:02 -0400
Received: from mx1.redhat.com ([209.132.183.28]:42584)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from ) id 1dQICy-0007gu-UG
	for qemu-devel@nongnu.org; Wed, 28 Jun 2017 15:01:01 -0400
From: "Dr. David Alan Gilbert (git)"
Date: Wed, 28 Jun 2017 20:00:18 +0100
Message-Id: <20170628190047.26159-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [RFC 00/29] postcopy+vhost-user/shared ram
To: qemu-devel@nongnu.org, a.perevalov@samsung.com,
	marcandre.lureau@redhat.com, maxime.coquelin@redhat.com,
	mst@redhat.com, quintela@redhat.com, peterx@redhat.com,
	lvivier@redhat.com, aarcange@redhat.com

From: "Dr. David Alan Gilbert"

Hi,
  This is an RFC/WIP series that enables postcopy migration
with shared memory to a vhost-user process.
It's based off current-head + Juan's load_cleanup series, and
Alexey's bitmap series (v4). It's very lightly tested and seems
to work, but it's quite rough.

I've modified the vhost-user-bridge (aka vhub) in qemu's tests/ to
use the new feature, since this is about the simplest
client around.

Structure:

The basic idea is that near the start of postcopy, the client
opens its own userfaultfd fd and sends that back to QEMU over
the socket it's already using for VHOST_USER_* commands. Then when
VHOST_USER_SET_MEM_TABLE arrives it registers the areas with
userfaultfd and sends the mapped addresses back to QEMU.

QEMU then reads the client's UFD in its fault thread and issues
requests back to the source as needed.
QEMU also issues 'WAKE' ioctls on the UFD to let the client know that
the page has arrived and that it can carry on.

A new feature (VHOST_USER_PROTOCOL_F_POSTCOPY) is added so that
QEMU knows the client can talk postcopy.
Three new messages (VHOST_USER_POSTCOPY_{ADVISE/LISTEN/END}) are added
to guide the process along.

Current known issues:
   I've not tested it with hugepages yet; and I suspect the madvises
   will need tweaking for it.

   QEMU gets to see the base addresses that the client has its regions
   mapped at; that's not great for security.

   Take care of deadlocking; any thread in the client that
   accesses a userfault protected page can stall.

   There's a nasty hack of a lock around the set_mem_table message.

   I've not looked at the recent IOMMU code.

   Some cleanup and a lot of corner cases need thinking about.

   There are probably plenty of unknown issues as well.

Test setup:
  I'm running on one host at the moment, with the guest
  scping a large file from the host as it migrates.
  The setup is based on one I found in the vhost-user setups.
  You'll need a recent kernel for the shared memory support
  in userfaultfd, and userfault isn't that happy if a process
  using shared memory dumps core - so make sure you have the
  latest fixes.
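
As background for the Structure section above: handing the userfaultfd
back to QEMU over the existing UNIX socket relies on SCM_RIGHTS
ancillary data. The following is only a minimal sketch of that
mechanism, not code from this series. The names send_fd, recv_fd and
demo_fd_roundtrip are hypothetical, and an ordinary pipe stands in for
the userfaultfd so the example runs without any special kernel support.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one file descriptor over a connected AF_UNIX socket.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int send_fd(int sock, int fd)
{
    char byte = 0; /* one data byte accompanies the ancillary data */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } u;
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor sent with send_fd().
 * Returns the new descriptor, or -1 on error. (Hypothetical helper.) */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Pass the write end of a pipe across a socketpair, then write through
 * the received copy and read the data back from the pipe's read end.
 * Returns 0 on success, -1 on any failure. */
int demo_fd_roundtrip(void)
{
    int sv[2], p[2];
    char buf[3] = { 0 };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0 || pipe(p) != 0) {
        return -1;
    }
    if (send_fd(sv[0], p[1]) != 0) {
        return -1;
    }
    int copy = recv_fd(sv[1]);
    if (copy < 0) {
        return -1;
    }
    /* The receiver gets a new descriptor number for the same open file. */
    assert(copy != p[1]);

    if (write(copy, "ok", 2) != 2 || read(p[0], buf, 2) != 2) {
        return -1;
    }
    close(copy); close(p[0]); close(p[1]); close(sv[0]); close(sv[1]);
    return strcmp(buf, "ok") == 0 ? 0 : -1;
}
```

In the scheme described above the client would play the send_fd role
with its userfaultfd and QEMU the recv_fd role; the real messages carry
a vhost-user header rather than a single dummy byte.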

SESS=vhost
ulimit -c unlimited
tmux -L $SESS new-session -d
tmux -L $SESS set-option -g history-limit 30000
# Start a router using the system qemu
tmux -L $SESS new-window -n router ./x86_64-softmmu/qemu-system-x86_64 -M none -nographic -net socket,vlan=0,udp=localhost:4444,localaddr=localhost:5555 -net socket,vlan=0,udp=localhost:4445,localaddr=localhost:5556 -net user,vlan=0
tmux -L $SESS set-option -g set-remain-on-exit on
# Start source vhost bridge
tmux -L $SESS new-window -n srcvhostbr "./tests/vhost-user-bridge -u /tmp/vubrsrc.sock 2>src-vub-log"
sleep 0.5
tmux -L $SESS new-window -n source "./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 8G -smp 2 -object memory-backend-file,id=mem,size=8G,mem-path=/dev/shm,share=on -numa node,memdev=mem -mem-prealloc -chardev socket,id=char0,path=/tmp/vubrsrc.sock -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce -device virtio-net-pci,netdev=mynet1 my.qcow2 -net none -vnc :0 -monitor stdio -trace events=/root/trace-file 2>src-qemu-log "
# Start dest vhost bridge
tmux -L $SESS new-window -n destvhostbr "./tests/vhost-user-bridge -u /tmp/vubrdst.sock -l 127.0.0.1:4445 -r 127.0.0.1:5556 2>dst-vub-log"
sleep 0.5
tmux -L $SESS new-window -n dest "./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 8G -smp 2 -object memory-backend-file,id=mem,size=8G,mem-path=/dev/shm,share=on -numa node,memdev=mem -mem-prealloc -chardev socket,id=char0,path=/tmp/vubrdst.sock -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce -device virtio-net-pci,netdev=mynet1 my.qcow2 -net none -vnc :1 -monitor stdio -incoming tcp::8888 -trace events=/root/trace-file 2>dst-qemu-log"
tmux -L $SESS send-keys -t source "migrate_set_capability postcopy-ram on^M"
tmux -L $SESS send-keys -t source "migrate_set_speed 20M^M"
tmux -L $SESS send-keys -t dest "migrate_set_capability postcopy-ram on^M"

then once booted:
tmux -L vhost send-keys -t source 'migrate -d tcp:0:8888^M'
tmux -L vhost send-keys -t source 'migrate_start_postcopy^M'

(Note those ^M's are actual ctrl-M's i.e. ctrl-v ctrl-M)

Dave

Dr. David Alan Gilbert (29):
  RAMBlock/migration: Add migration flags
  migrate: Update ram_block_discard_range for shared
  qemu_ram_block_host_offset
  migration/ram: ramblock_recv_bitmap_test_byte_offset
  postcopy: use UFFDIO_ZEROPAGE only when available
  postcopy: Add notifier chain
  postcopy: Add vhost-user flag for postcopy and check it
  vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' message
  vhub: Support sending fds back to qemu
  vhub: Open userfaultfd
  postcopy: Allow registering of fd handler
  vhost+postcopy: Register shared ufd with postcopy
  vhost+postcopy: Transmit 'listen' to client
  vhost+postcopy: Register new regions with the ufd
  vhost+postcopy: Send address back to qemu
  vhost+postcopy: Stash RAMBlock and offset
  vhost+postcopy: Send requests to source for shared pages
  vhost+postcopy: Resolve client address
  postcopy: wake shared
  postcopy: postcopy_notify_shared_wake
  vhost+postcopy: Add vhost waker
  vhost+postcopy: Call wakeups
  vub+postcopy: madvises
  vhost+postcopy: Lock around set_mem_table
  vhu: enable = false on get_vring_base
  vhost: Add VHOST_USER_POSTCOPY_END message
  vhost+postcopy: Wire up POSTCOPY_END notify
  postcopy: Allow shared memory
  vhost-user: Claim support for postcopy

 contrib/libvhost-user/libvhost-user.c | 178 ++++++++++++++++-
 contrib/libvhost-user/libvhost-user.h |   8 +
 exec.c                                |  44 +++--
 hw/virtio/trace-events                |  13 ++
 hw/virtio/vhost-user.c                | 293 +++++++++++++++++++++++++++-
 include/exec/cpu-common.h             |   3 +
 include/exec/ram_addr.h               |   2 +
 migration/migration.c                 |   3 +
 migration/migration.h                 |   8 +
 migration/postcopy-ram.c              | 357 +++++++++++++++++++++++++++-------
 migration/postcopy-ram.h              |  69 +++++++
 migration/ram.c                       |   5 +
 migration/ram.h                       |   1 +
 migration/savevm.c                    |  13 ++
 migration/trace-events                |   6 +
 trace-events                          |   3 +
 vl.c                                  |   4 +-
 17 files changed, 926 insertions(+), 84 deletions(-)

-- 
2.13.0