From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 00/21][RFC] postcopy live migration
Date: Sun, 1 Jan 2012 16:27:56 +0000
References: <4EFCEC38.3080308@codemonkey.ws> <4F002AC6.7080007@redhat.com>
In-Reply-To: <4F002AC6.7080007@redhat.com>
To: Orit Wasserman
Cc: Anthony Liguori, kvm@vger.kernel.org, satoshi.itoh@aist.go.jp, t.hirofuchi@aist.go.jp, Juan Quintela, Michael Roth, qemu-devel@nongnu.org, Isaku Yamahata

On Sun, Jan 1, 2012 at 9:43 AM, Orit Wasserman wrote:
> On 12/30/2011 12:39 AM, Anthony Liguori wrote:
>> On 12/28/2011 07:25 PM, Isaku Yamahata wrote:
>>> Intro
>>> =====
>>> This patch series implements postcopy live migration.[1]
>>> As discussed at KVM forum 2011, a dedicated character device is used for
>>> distributed shared memory between migration source and destination.
>>> Now we can discuss/benchmark/compare with precopy. I believe there is
>>> much room for improvement.
>>>
>>> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>>>
>>>
>>> Usage
>>> =====
>>> You need to load the umem character device on the host before starting migration.
>>> Postcopy can be used with the tcg and kvm accelerators. The implementation depends
>>> only on the Linux umem character device, but the driver-dependent code is split
>>> into a separate file.
>>> I tested only the host page size == guest page size case, but the implementation
>>> allows the host page size != guest page size case.
>>>
>>> The following options are added with this patch series.
>>> - incoming part
>>>    command line options
>>>    -postcopy [-postcopy-flags]
>>>    where flags is for changing behavior for benchmark/debugging
>>>    Currently the following flags are available
>>>    0: default
>>>    1: enable touching page request
>>>
>>>    example:
>>>    qemu -postcopy -incoming tcp:0:4444 -monitor stdio -machine accel=kvm
>>>
>>> - outgoing part
>>>    options for migrate command
>>>    migrate [-p [-n]] URI
>>>    -p: indicate postcopy migration
>>>    -n: disable background transferring pages: This is for benchmark/debugging
>>>
>>>    example:
>>>    migrate -p -n tcp::4444
>>>
>>>
>>> TODO
>>> ====
>>> - benchmark/evaluation. Especially how async page fault affects the result.
>>
>> I'll review this series next week (Mike/Juan, please also review when you can).
>>
>> But we really need to think hard about whether this is the right thing to take into the tree. I worry a lot about the fact that we don't test pre-copy migration nearly enough and adding a second form just introduces more things to test.
>>
>> It's also not clear to me why post-copy is better. If you were going to sit down and explain to someone building a management tool when they should use pre-copy and when they should use post-copy, what would you tell them?
>
> Start with pre-copy, if it doesn't converge switch to post-copy

Post-copy throttles the guest when page faults are encountered because
the destination machine waits for memory pages from the source machine.
Is there a reason this page fault-based throttling cannot be done on
the source machine with pre-copy migration?

I'm not sure post-copy provides new behavior in terms of convergence;
we could do the same with pre-copy migration.
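
To make the convergence argument concrete, here is a minimal Python
sketch. It is not code from the patch series. It assumes a toy model in
which each pre-copy round resends the pages dirtied during the previous
round. The function precopy_rounds, its parameters, and the
throttle_ratio knob are hypothetical names for illustration, and
source-side throttling is modeled simply as a cap on the guest's dirty
rate.

```python
def precopy_rounds(total_pages, bandwidth, dirty_rate, max_final_pages,
                   throttle_ratio=None, max_rounds=50):
    """Count pre-copy rounds needed before the final stop-and-copy.

    Toy model, all names hypothetical. Rates are in pages per second.
    Returns the number of completed copy rounds once the remaining dirty
    set fits in max_final_pages, or None if that never happens within
    max_rounds (migration does not converge).
    """
    if throttle_ratio is not None:
        # Assumed source-side throttling: slow the guest so that it
        # dirties at most throttle_ratio * bandwidth pages per second.
        dirty_rate = min(dirty_rate, throttle_ratio * bandwidth)

    remaining = float(total_pages)
    for completed in range(max_rounds + 1):
        if remaining <= max_final_pages:
            return completed
        # Time needed to send the current dirty set over the link.
        copy_time = remaining / bandwidth
        # Pages dirtied meanwhile; cannot exceed guest memory size.
        remaining = min(float(total_pages), dirty_rate * copy_time)
    return None


if __name__ == "__main__":
    # Made-up numbers: 1M pages, 100k pages/s link, 120k pages/s dirtied.
    print(precopy_rounds(1_000_000, 100_000, 120_000, 1_000))
    print(precopy_rounds(1_000_000, 100_000, 120_000, 1_000,
                         throttle_ratio=0.5))
```

With these made-up numbers the unthrottled guest dirties pages faster
than the link can send them, so the dirty set never shrinks. Capping
the dirty rate at half the bandwidth halves the dirty set every round.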

Post-copy has other advantages though: it immediately frees logical
CPUs on the source machine (though RAM and network bandwidth are still
required until migration completes).

Stefan