From: David Gibson <david@gibson.dropbear.id.au>
To: Greg Kurz <groug@kaod.org>
Cc: clg@kaod.org, aik@ozlabs.ru, mdroth@linux.vnet.ibm.com,
	nikunj@linux.vnet.ibm.com, lvivier@redhat.com, thuth@redhat.com,
	qemu-devel@nongnu.org, abologna@redhat.com, qemu-ppc@nongnu.org
Subject: Re: [Qemu-devel] [Qemu-ppc] [RFCv2 12/12] ppc: Rework CPU compatibility testing across migration
Date: Mon, 5 Dec 2016 15:09:16 +1100
Message-ID: <20161205040916.GB32366@umbus.fritz.box>
In-Reply-To: <20161202154825.613bc4ad@bahia>

On Fri, Dec 02, 2016 at 03:48:25PM +0100, Greg Kurz wrote:
> On Wed, 16 Nov 2016 09:17:55 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > Migrating between different CPU versions is quite complicated for ppc.
> > A long time ago, we ensured identical CPU versions at either end by checking
> > the PVR had the same value.  However, this breaks under KVM HV, because
> > we always have to use the host's PVR - it's not virtualized.  That would
> > mean we couldn't migrate between hosts with different PVRs, even if the
> > CPUs are close enough to compatible in practice (sometimes identical cores
> > with different surrounding logic have different PVRs, so this happens in
> > practice quite often).
> > 
> > So, we removed the PVR check, but instead checked that several flags
> > indicating supported instructions matched.  This turns out to be a bad
> > idea, because those instruction masks are not architected information, but
> > essentially a TCG implementation detail.  So changes to qemu internal CPU
> > modelling can break migration - this happened between qemu-2.6 and
> > qemu-2.7.
> > 
> > Modern server-class CPUs can be placed into compatibility modes.  Now that
> > we're handling those properly, we finally have the information to sanely
> > deal with CPU compatibility across migration.
> > 
> > This patch bumps the migration version number for the ppc CPU removing the
> > instruction mask field (and some other unwise VMSTATE_EQUAL checks), and
> > adding the compatibility PVR to the migration stream.
> > 
> 
> Things have changed since you posted this RFC:
> 
> commit 16a2497bd44cac1856e259654fd304079bd1dcdc
> Author: David Gibson <david@gibson.dropbear.id.au>
> Date:   Mon Nov 21 16:28:12 2016 +1100
> 
>     target-ppc: Fix CPU migration from qemu-2.6 <-> later versions
> 
> and
> 
> commit 146c11f16f12dbfea62cbd7f865614bb6fcbc6b5
> Author: David Gibson <david@gibson.dropbear.id.au>
> Date:   Mon Nov 21 16:29:30 2016 +1100
> 
>     target-ppc: Allow eventual removal of old migration mistakes
> 
> I guess that the version bumping isn't necessary anymore if we keep these.
> 
> I'll assume yes and rebase this patch against current master, simply dropping
> the version bumping and related lines.

Yeah, I realised that breaking backwards migration was a bad idea, and
with some help from Dave Gilbert I worked out how to keep it working.

I'll have to rework my compat series in light of these changes.

> > We consider the CPUs compatible for migration if:
> >     * The source was running in a compatibility mode which the destination
> >       supports
> > OR  * The source has a PVR matching the same qemu CPU class as the
> >       destination, either an exact match or an approximate match determined
> >       by the cpu class's pvr_match hook.
> > 
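
For reference, the rule above can be sketched as a standalone check.
This is only an illustration, not the code in target-ppc/machine.c:
DestCpu, dest_supports_compat(), dest_pvr_match() and
migration_cpu_compatible() are hypothetical names, and the masked
comparison merely stands in for whatever a CPU class's pvr_match hook
really does.  It follows the if/else shape of the quoted cpu_post_load
hunk: a source in compat mode is judged on the compat mode alone,
otherwise on the PVR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the destination CPU model. */
typedef struct DestCpu {
    uint32_t pvr;                /* PVR of the destination CPU class */
    uint32_t pvr_mask;           /* bits compared for an approximate match */
    const uint32_t *compat_pvrs; /* compat modes the destination supports */
    size_t n_compat;
} DestCpu;

/* True if the destination lists the given compatibility PVR. */
static bool dest_supports_compat(const DestCpu *dst, uint32_t compat_pvr)
{
    for (size_t i = 0; i < dst->n_compat; i++) {
        if (dst->compat_pvrs[i] == compat_pvr) {
            return true;
        }
    }
    return false;
}

/* Exact match first, then an approximate match on the masked bits
 * (an assumption standing in for the class's pvr_match hook). */
static bool dest_pvr_match(const DestCpu *dst, uint32_t pvr)
{
    if (pvr == dst->pvr) {
        return true;
    }
    return (pvr & dst->pvr_mask) == (dst->pvr & dst->pvr_mask);
}

/* src_compat_pvr == 0 means the source was not in a compat mode. */
static bool migration_cpu_compatible(const DestCpu *dst, uint32_t src_pvr,
                                     uint32_t src_compat_pvr)
{
    assert(dst != NULL);
    if (src_compat_pvr != 0) {
        return dest_supports_compat(dst, src_compat_pvr);
    }
    return dest_pvr_match(dst, src_pvr);
}
```

With this shape, a source running in a compat mode can land on a host
with a different PVR as long as that host offers the same mode, which
is the KVM HV case the commit message is worried about.
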
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  target-ppc/machine.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 79 insertions(+), 8 deletions(-)
> > 
> > diff --git a/target-ppc/machine.c b/target-ppc/machine.c
> > index e43cb6c..25a30d5 100644
> > --- a/target-ppc/machine.c
> > +++ b/target-ppc/machine.c
> > @@ -8,6 +8,7 @@
> >  #include "helper_regs.h"
> >  #include "mmu-hash64.h"
> >  #include "migration/cpu.h"
> > +#include "qapi/error.h"
> >  
> >  static int cpu_load_old(QEMUFile *f, void *opaque, int version_id)
> >  {
> > @@ -163,6 +164,30 @@ static void cpu_pre_save(void *opaque)
> >      }
> >  }
> >  
> > +/*
> > + * Determine if a given PVR is a "close enough" match to the CPU
> > + * object.  For TCG and KVM PR it would probably be sufficient to
> > + * require an exact PVR match.  However for KVM HV the user is
> > + * restricted to a PVR exactly matching the host CPU.  The correct way
> > + * to handle this is to put the guest into an architected
> > + * compatibility mode.  However, to allow a more forgiving transition
> > + * and migration from before this was widely done, we allow migration
> > + * between sufficiently similar PVRs, as determined by the CPU class's
> > + * pvr_match() hook.
> > + */
> > +static bool pvr_match(PowerPCCPU *cpu, uint32_t pvr)
> > +{
> > +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> > +
> > +    if (pvr == pcc->pvr) {
> > +        return true;
> > +    }
> > +    if (pcc->pvr_match) {
> > +        return pcc->pvr_match(pcc, pvr);
> > +    }
> > +    return false;
> > +}
> > +
> >  static int cpu_post_load(void *opaque, int version_id)
> >  {
> >      PowerPCCPU *cpu = opaque;
> > @@ -171,10 +196,31 @@ static int cpu_post_load(void *opaque, int version_id)
> >      target_ulong msr;
> >  
> >      /*
> > -     * We always ignore the source PVR. The user or management
> > -     * software has to take care of running QEMU in a compatible mode.
> > +     * If we're operating in compat mode, we should be ok as long as
> > +     * the destination supports the same compatibility mode.
> > +     *
> > +     * Otherwise, however, we require that the destination has exactly
> > +     * the same CPU model as the source.
> >       */
> > -    env->spr[SPR_PVR] = env->spr_cb[SPR_PVR].default_value;
> > +
> > +#if defined(TARGET_PPC64)
> > +    if (cpu->compat_pvr) {
> > +        Error *local_err = NULL;
> > +
> > +        ppc_set_compat(cpu, cpu->compat_pvr, &local_err);
> 
> This calls cpu_synchronize_state(CPU(cpu)) and trashes the registers. This
> is the root cause behind the program interrupts I mentioned in another mail.
> 
> Adding a sync_needed boolean argument to ppc_set_compat() seems to be enough
> to get this working. So I'll just do that and rerun the tests.
> 
> Cheers.
> 
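
To make the sync_needed idea concrete, here is a toy model of it.
Nothing below is QEMU code: ToyCpu, toy_synchronize_state() and
toy_set_compat() are made-up names, and the "sync" simply overwrites
one register to mimic the loaded state being trashed.  The real
ppc_set_compat() in the quoted hunk takes (cpu, compat_pvr, &local_err);
the extra boolean is only the change proposed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy CPU: one register stands in for state loaded from the stream. */
typedef struct ToyCpu {
    uint32_t compat_pvr;
    uint64_t gpr0;
    int sync_count;
} ToyCpu;

/* Models a sync that refreshes the local copy from elsewhere and so
 * overwrites whatever the migration stream has just loaded. */
static void toy_synchronize_state(ToyCpu *cpu)
{
    cpu->gpr0 = 0;
    cpu->sync_count++;
}

/* Set the compat mode; the caller decides whether a sync is wanted.
 * A post-load path would pass false to keep the loaded registers. */
static void toy_set_compat(ToyCpu *cpu, uint32_t compat_pvr, bool sync_needed)
{
    assert(cpu != NULL);
    if (sync_needed) {
        toy_synchronize_state(cpu);
    }
    cpu->compat_pvr = compat_pvr;
}
```

A caller on the incoming-migration path would pass sync_needed = false,
while callers that change the mode on a running guest would keep
passing true.
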

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
