From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Li, Liang Z"
Date: Thu, 30 Apr 2015 01:09:35 +0000
References: <1429031053-4454-1-git-send-email-dgilbert@redhat.com> <20150429172306.GB2240@work-vm>
In-Reply-To: <20150429172306.GB2240@work-vm>
Subject: Re: [Qemu-devel] [PATCH v6 00/47] Postcopy implementation
To: "Dr. David Alan Gilbert"
Cc: "aarcange@redhat.com", "yamahata@private.email.ne.jp", "quintela@redhat.com", "qemu-devel@nongnu.org", "amit.shah@redhat.com", "pbonzini@redhat.com", "yayanghy@cn.fujitsu.com", "david@gibson.dropbear.id.au"

> * Li, Liang Z (liang.z.li@intel.com) wrote:
> > Hi David,
> >
> > I have tried your v6 postcopy patches and found they don't work. When
> > I tried to start postcopy during live migration, some errors were printed.
> > I just did the following things:
> >
> > On destination side, started the qemu like this:
> >
> > /root/vt-sync/post_copy_v6_qemu.git/x86_64-softmmu/qemu-system-x86_64
> > -enable-kvm -smp 2 -m 1024 -net none /mnt/jinshi_ia32e_rhel6u5.qcow2
> > -monitor stdio -incoming tcp:0:4444
> >
> > On source side, started the qemu like this:
> >
> > /root/vt-sync/post_copy_v6_qemu.git/x86_64-softmmu/qemu-system-x86_64
> > -enable-kvm -smp 2 -m 1024 -net none /mnt/jinshi_ia32e_rhel6u5.qcow2
> > -monitor stdio
> >
> > and then
> > (qemu) migrate_set_capability x-postcopy-ram on
> >
> > When I started the post copy with
> > (qemu) migrate -d tcp:localhost:4444
> >
> > I got the error message on the source side:
> >
> > (qemu) qemu-system-x86_64: socket_writev_buffer: Got err=104 for (131552/-1)
> > qemu-system-x86_64: RP: Received invalid message 0x0000 length 0x0000
> >
> > and the following error on the destination side:
> >
> > (qemu) qemu-system-x86_64: postcopy_ram_supported_by_host: No OS support
> > qemu-system-x86_64: load of migration failed: Operation not permitted
>
> OK, the important error here is:
>   postcopy_ram_supported_by_host: No OS support
>
> That means one of two things on the destination host:
>   1) The kernel isn't the correct kernel with Andrea's userfault code
>      compiled in
>      (check that userfaultfd is configured into the kernel as well)
>   2) When you built QEMU, it didn't find the syscall definition for
>      userfaultfd in the headers it compiled against.
>
> I think from that error it is (2), so make sure that when you build qemu
> you're using the headers from that kernel, or use the extra-cflags hack
> that I mentioned in the cover letter.
>
> Note that you need to use the kernel tree which I point to in the first
> message.
> (The older kernel from v5 won't work).
>

Thanks Dave, I will retry according to your suggestion.

> Dave
> P.S. I'm on holiday this week, so not checking work email much.
>
> >
> >
> > the dmesg printed:
> > [ 233.456545] kvm: zapping shadow pages for mmio generation wraparound
> > [ 239.785916] kvm [11926]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xabcd
> >
> >
> > The v5 patches have no such errors. Do you have any suggestions?
> >
> > Liang
> >
> >
> > > -----Original Message-----
> > > From: qemu-devel-bounces+liang.z.li=intel.com@nongnu.org
> > > [mailto:qemu-devel-bounces+liang.z.li=intel.com@nongnu.org]
> > > On Behalf Of Dr. David Alan Gilbert (git)
> > > Sent: Wednesday, April 15, 2015 1:03 AM
> > > To: qemu-devel@nongnu.org
> > > Cc: aarcange@redhat.com; yamahata@private.email.ne.jp;
> > > quintela@redhat.com; amit.shah@redhat.com; pbonzini@redhat.com;
> > > david@gibson.dropbear.id.au; yayanghy@cn.fujitsu.com
> > > Subject: [Qemu-devel] [PATCH v6 00/47] Postcopy implementation
> > >
> > > From: "Dr. David Alan Gilbert"
> > >
> > > This is the 6th cut of my version of postcopy; it is designed for
> > > use with the Linux kernel additions posted by Andrea Arcangeli here:
> > >
> > > git clone --reference linux -b userfault18 git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
> > >
> > > (Note this is a different API from the last version)
> > >
> > > This qemu series can be found at:
> > >
> > > https://github.com/orbitfp7/qemu.git
> > > on the wp3-postcopy-v6 tag.
> > >
> > > It addresses some but not yet all of the previous review comments;
> > > however there are a couple of large simplifications, so it seems
> > > worth posting to meet the new kernel API and to stop people reviewing
> > > dead code.
> > >
> > > Note: That the userfaultfd.h header is no longer included in this
> > > tree:
> > >    - if you're building with the appropriate kernel headers it should find it
> > >    - if you're building on a host that doesn't have the kernel headers
> > >      installed in the right place then:
> > >         configure with:  --extra-cflags="-D__NR_userfaultfd=323"
> > >         cp include/uapi/linux/userfaultfd.h into somewhere in the include
> > >           path, e.g. /usr/local/include/linux
> > >
> > > v6
> > >   Removed the PMI bitmaps
> > >     - Andrea updated the kernel API so that userspace doesn't
> > >       need to do wakeups, and thus QEMU doesn't need to keep
> > >       track of which pages it's received; there is a price - which
> > >       is we end up sending more dupes to the source, but it simplifies
> > >       stuff a lot and makes the normal paths a lot quicker.
> > >       (10s of line change in kernel, 10%-ish simplification in this code!)
> > >   Changed discard message format to a simpler start/end address scheme
> > >     and rework discard and chunking code to work in long's to match bitmap
> > >   'qemu_get_buffer_less_copy' for postcopy pages
> > >     - avoids a userspace copy since the kernel now does it
> > >     - the new qemufile interface might also be useful for other places that
> > >       don't need a copy (maybe xbzrle?)
> > >   Changed the blockingness of the incoming fd
> > >     it was incorrectly blocking during the precopy phase after a postcopy was
> > >     enabled, causing the HMP to be unavailable. It's now blocking only once
> > >     the postcopy thread starts up, since it's not a coroutine it can't deal
> > >     with the yields in qemu_file.
> > >   An error on the return-path now marks the migration as failed
> > >
> > >   Fixups from Dave Gibson's comments
> > >     Removed can_postcopy, renamed save_complete to save_complete_precopy
> > >       added save_complete_postcopy
> > >     Simplified loadvm loop exits
> > >     discard message format changes above
> > >     and many more smaller changes.
> > >
> > >   small fixups for RCU
> > >
> > >
> > > This work has been partially funded by the EU Orbit project:
> > >   see http://www.orbitproject.eu/about/
> > >
> > > TODO:
> > >   The major work is to rework the page send/receive loops so that supporting
> > >   larger host pages doesn't make it quite as messy.
> > >
> > > Dr. David Alan Gilbert (47):
> > >   Start documenting how postcopy works.
> > >   Split header writing out of qemu_savevm_state_begin
> > >   qemu_ram_foreach_block: pass up error value, and down the ramblock
> > >     name
> > >   Add qemu_get_counted_string to read a string prefixed by a count byte
> > >   Create MigrationIncomingState
> > >   Provide runtime Target page information
> > >   Move copy out of qemu_peek_buffer
> > >   Add qemu_get_buffer_less_copy to avoid copies some of the time
> > >   Add wrapper for setting blocking status on a QEMUFile
> > >   Rename save_live_complete to save_live_complete_precopy
> > >   Return path: Open a return path on QEMUFile for sockets
> > >   Return path: socket_writev_buffer: Block even on non-blocking fd's
> > >   Migration commands
> > >   Return path: Control commands
> > >   Return path: Send responses from destination to source
> > >   Return path: Source handling of return path
> > >   ram_debug_dump_bitmap: Dump a migration bitmap as text
> > >   Move loadvm_handlers into MigrationIncomingState
> > >   Rework loadvm path for subloops
> > >   Add migration-capability boolean for postcopy-ram.
> > >   Add wrappers and handlers for sending/receiving the postcopy-ram
> > >     migration messages.
> > >   MIG_CMD_PACKAGED: Send a packaged chunk of migration stream
> > >   migrate_init: Call from savevm
> > >   Modify save_live_pending for postcopy
> > >   postcopy: OS support test
> > >   migrate_start_postcopy: Command to trigger transition to postcopy
> > >   MIGRATION_STATUS_POSTCOPY_ACTIVE: Add new migration state
> > >   Add qemu_savevm_state_complete_postcopy
> > >   Postcopy: Maintain sentmap and calculate discard
> > >   postcopy: Incoming initialisation
> > >   postcopy: ram_enable_notify to switch on userfault
> > >   Postcopy: Postcopy startup in migration thread
> > >   Postcopy end in migration_thread
> > >   Page request: Add MIG_RP_MSG_REQ_PAGES reverse command
> > >   Page request: Process incoming page request
> > >   Page request: Consume pages off the post-copy queue
> > >   postcopy_ram.c: place_page and helpers
> > >   Postcopy: Use helpers to map pages during migration
> > >   qemu_ram_block_from_host
> > >   Don't sync dirty bitmaps in postcopy
> > >   Host page!=target page: Cleanup bitmaps
> > >   Postcopy; Handle userfault requests
> > >   Start up a postcopy/listener thread ready for incoming page data
> > >   postcopy: Wire up loadvm_postcopy_handle_ commands
> > >   End of migration for postcopy
> > >   Disable mlock around incoming postcopy
> > >   Inhibit ballooning during postcopy
> > >
> > >  arch_init.c                      | 868 ++++++++++++++++++++++++++++++++++++---
> > >  balloon.c                        |  11 +
> > >  docs/migration.txt               | 167 ++++++++
> > >  exec.c                           |  74 +++-
> > >  hmp-commands.hx                  |  15 +
> > >  hmp.c                            |   7 +
> > >  hmp.h                            |   1 +
> > >  hw/ppc/spapr.c                   |   2 +-
> > >  hw/virtio/virtio-balloon.c       |   4 +-
> > >  include/exec/cpu-all.h           |   2 -
> > >  include/exec/cpu-common.h        |   7 +-
> > >  include/migration/migration.h    | 126 +++++-
> > >  include/migration/postcopy-ram.h |  88 ++++
> > >  include/migration/qemu-file.h    |  15 +-
> > >  include/migration/vmstate.h      |  10 +-
> > >  include/qemu/typedefs.h          |   5 +
> > >  include/sysemu/balloon.h         |   2 +
> > >  include/sysemu/sysemu.h          |  45 +-
> > >  migration/Makefile.objs          |   2 +-
> > >  migration/block.c                |   9 +-
> > >  migration/migration.c            | 743 +++++++++++++++++++++++++++++++--
> > >  migration/postcopy-ram.c         | 715 ++++++++++++++++++++++++++++++++
> > >  migration/qemu-file-unix.c       | 106 ++++-
> > >  migration/qemu-file.c            | 100 ++++-
> > >  migration/rdma.c                 |   4 +-
> > >  migration/vmstate.c              |   5 +-
> > >  qapi-schema.json                 |  19 +-
> > >  qmp-commands.hx                  |  19 +
> > >  savevm.c                         | 809 ++++++++++++++++++++++++++++++++----
> > >  trace-events                     |  77 +++-
> > >  30 files changed, 3832 insertions(+), 225 deletions(-)
> > >  create mode 100644 include/migration/postcopy-ram.h
> > >  create mode 100644 migration/postcopy-ram.c
> > >
> > > --
> > > 2.1.0
> > >
> > >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
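
For anyone hitting the same "No OS support" message, the two causes Dave lists can be told apart with a quick check outside QEMU. The following is only a rough sketch in Python, not QEMU's own postcopy_ram_supported_by_host test. The names header_syscall_number, classify, probe and CANDIDATE_HEADERS are hypothetical, the header paths are guesses that vary by distribution, and the syscall number 323 is the x86_64 value quoted in the cover letter's --extra-cflags hint, so the runtime probe is skipped on anything other than Linux x86_64.

```python
import ctypes
import errno
import os
import platform
import re

# Assumption: 323 is the x86_64 number from the cover letter
# (--extra-cflags="-D__NR_userfaultfd=323"); other architectures differ.
NR_USERFAULTFD_X86_64 = 323

# Assumption: typical locations of the installed syscall table header.
CANDIDATE_HEADERS = [
    "/usr/include/asm/unistd_64.h",
    "/usr/include/x86_64-linux-gnu/asm/unistd_64.h",
]

DEFINE_RE = re.compile(r"^\s*#\s*define\s+__NR_userfaultfd\s+(\d+)", re.MULTILINE)


def header_syscall_number(header_text):
    """Return the __NR_userfaultfd value defined in header_text, or None (case 2)."""
    match = DEFINE_RE.search(header_text)
    return int(match.group(1)) if match else None


def classify(fd, err):
    """Map a raw userfaultfd() result (fd, errno) to a short diagnosis string."""
    if fd >= 0:
        return "ok: kernel provides userfaultfd"
    if err == errno.ENOSYS:
        return "missing: kernel has no userfaultfd syscall (case 1)"
    if err in (errno.EPERM, errno.EACCES):
        return "blocked: syscall exists but this process may not use it"
    return "error: " + os.strerror(err)


def probe():
    """Invoke the userfaultfd syscall directly and classify the outcome."""
    if platform.system() != "Linux" or platform.machine() != "x86_64":
        return "skipped: syscall number only known for Linux x86_64"
    libc = ctypes.CDLL(None, use_errno=True)
    fd = libc.syscall(NR_USERFAULTFD_X86_64, os.O_CLOEXEC)
    err = ctypes.get_errno() if fd < 0 else 0
    if fd >= 0:
        os.close(fd)
    return classify(fd, err)


if __name__ == "__main__":
    # Build-time side: do the installed headers define the syscall number?
    for path in CANDIDATE_HEADERS:
        try:
            with open(path) as header:
                print(path, "->", header_syscall_number(header.read()))
        except OSError:
            print(path, "-> not readable")
    # Run-time side: does the running kernel implement the syscall?
    print(probe())
```

If the header check prints None while the probe reports "ok", that would point at cause (2), the build not seeing the syscall definition. A "missing" result from the probe would point at cause (1), the kernel itself.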