From mboxrd@z Thu Jan 1 00:00:00 1970
From: Haozhong Zhang
Subject: Re: [PATCH v4 2/3] target-i386: add migration support for Intel LMCE
Date: Mon, 20 Jun 2016 10:11:36 +0800
Message-ID: <20160620021136.mepymwylqvzsku55@hz-desktop>
References: <20160616060621.30422-3-haozhong.zhang@intel.com>
 <1d2312d2-4dd3-6a73-d0d7-84b4e8c749e2@redhat.com>
 <20160616102918.7geiaomeitldj7jy@hz-desktop>
 <20160616105529.dpmjjeqsdnf5cdnm@hz-desktop>
 <20160616173624.GO18662@thinpad.lan.raisama.net>
 <7e106359-c7e3-93b5-4cca-d669c26c873e@redhat.com>
 <20160616175822.GP18662@thinpad.lan.raisama.net>
 <20160617020105.mjelzxhdy5wcgqcm@hz-desktop>
 <20160617172016.GK18662@thinpad.lan.raisama.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160617172016.GK18662@thinpad.lan.raisama.net>
To: Eduardo Habkost
Cc: Tony Luck, rkrcmar@redhat.com, Ashok Raj, kvm@vger.kernel.org,
 "Michael S . Tsirkin", Marcelo Tosatti, qemu-devel@nongnu.org,
 Andi Kleen, Paolo Bonzini, Boris Petkov, Richard Henderson
Errors-To: qemu-devel-bounces+gceq-qemu-devel=gmane.org@nongnu.org
Sender: "Qemu-devel"
List-Id: kvm.vger.kernel.org

On 06/17/16 14:20, Eduardo Habkost wrote:
> On Fri, Jun 17, 2016 at 10:01:05AM +0800, Haozhong Zhang wrote:
> > On 06/16/16 14:58, Eduardo Habkost wrote:
> > > On Thu, Jun 16, 2016 at 07:40:20PM +0200, Paolo Bonzini wrote:
> > > >
> > > >
> > > > On 16/06/2016 19:36, Eduardo Habkost wrote:
> > > > >> >
> > > > >> > Eduardo said nice for this part in previous version [1], so we may wait
> > > > >> > for his comments?
> > > > >> >
> > > > >> > [1] http://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg01992.html
> > > > > I agree we don't need this check, but I still believe it is a
> > > > > nice thing to have.
> > > > >
> > > > > In addition to detecting user errors, they don't hurt and are
> > > > > useful for things like "-cpu host", that don't guarantee
> > > > > live-migration compatibility but still allow migration if you
> > > > > ensure host capabilities are the same on both sides.
> > > >
> > > > On the other hand we don't check for this on any other property, either
> > > > CPU or device, do we? Considering "lmce=on" always breaks on an old
> > > > kernel (i.e. there's no need for an explicit ",enforce" on the -cpu
> > > > flag), I think it's unnecessary and makes things inconsistent.
> > >
> > > We don't check that because we normally can't: we usually don't
> > > send any configuration data (or anything that could be used to
> > > detect configuration mismatches) to the destination. When we do,
> > > it's often by accident.
> > >
> > > In this case, it looks like we never needed to send mcg_cap in
> > > the migration stream. But we already send it, so let's use it for
> > > something useful.
> > >
> > > I believe we should have more checks like these, when possible. I
> > > have been planning for a while to send CPUID data in the
> > > migration stream, to detect migration compatibility errors
> > > (either user errors or QEMU bugs).
> > >
> > > In theory, those checks should never be necessary. In practice I
> > > believe they would be very useful.
> > >
> >
> > Hi Eduardo and Paolo,
> >
> > What will be the conclusion? Do we still need this check?
> >
> > I'm fine to remove this check if we normally didn't make such kind of
> > checks and require users to avoid configuration mismatch.
>
> I don't know yet if Paolo is convinced that the check is still
> useful. :)
>
> I suggest doing it as a separate patch, so we can apply the rest
> of the series now and discuss/apply the check later.
>

Yes, I'll move the check to a separate patch so that we can easily
drop it if not necessary. Thanks for the suggestion!

>
> > > > >
> > > > > (I was going to suggest enabling lmce automatically on "-cpu
> > > > > host" as a follow-up patch, BTW.)
> > > >
> > > > Interesting. Technically it comes from the host kernel, not from the
> > > > host CPU. But it does sounds like a good idea; -cpu host pretty much
> > > > implies the same kernel (in addition to the same processor) on both
> > > > sides of the migration.
> > >
> > > "-cpu host" already means "whatever is allowed by the host [CPU
> > > and/or kernel]", not just "host CPU". It enables x2apic on all
> > > hosts, for example.
> > >
> >
> > Does that mean we can automatically enable LMCE for "-cpu host"?
>
> We can automatically enable LMCE for "-cpu host" if and only if
> the host kernel supports LMCE.
>

According to our discussion for KVM Patch 3, we may have to disable it
by default for -cpu host, so that pc-2.7 will not require new kernels
unless users explicitly request LMCE.

Thanks,
Haozhong
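
For readers following the thread, the kind of check being discussed can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the code from the patch: the function names and the sample
mcg_cap values are hypothetical, and the only fact relied on is that LMCE
support is advertised by bit 27 (MCG_LMCE_P) of IA32_MCG_CAP.

```python
# Illustrative sketch only; not QEMU code. Function names are hypothetical.

MCG_LMCE_P = 1 << 27  # IA32_MCG_CAP bit 27: LMCE supported


def lmce_mismatch(src_mcg_cap: int, dst_mcg_cap: int) -> bool:
    """Return True when LMCE is advertised on exactly one side."""
    return bool((src_mcg_cap ^ dst_mcg_cap) & MCG_LMCE_P)


def check_incoming_mcg_cap(src_mcg_cap: int, dst_mcg_cap: int) -> None:
    """Reject an incoming migration whose LMCE capability differs.

    src_mcg_cap stands for the value carried in the migration stream,
    dst_mcg_cap for the value configured on the destination.
    """
    if lmce_mismatch(src_mcg_cap, dst_mcg_cap):
        raise ValueError("LMCE capability differs between source and destination")


if __name__ == "__main__":
    base = 0x100010A  # sample value for illustration only
    check_incoming_mcg_cap(base | MCG_LMCE_P, base | MCG_LMCE_P)  # passes
    try:
        check_incoming_mcg_cap(base | MCG_LMCE_P, base)
    except ValueError as err:
        print("migration rejected:", err)
```

Comparing only the LMCE bit keeps such a check narrow: other mcg_cap fields
are not compared, so it would be easy to drop if it is judged unnecessary.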