From: Greg Kurz <groug@kaod.org>
To: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: aik@ozlabs.ru, qemu-devel@nongnu.org, paulus@ozlabs.org,
	qemu-ppc@nongnu.org, david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [PATCH v13 6/6] migration: Include migration support for machine check handling
Date: Tue, 10 Sep 2019 10:48:14 +0200
Message-ID: <20190910104814.6bd89cec@bahia.lan>
In-Reply-To: <156801390267.24362.17017161761742932333.stgit@aravinda>

Hi Aravinda,

Sorry for not being able to review the whole series in one pass,
and thus forcing you to post more versions... but I have some
more remarks about migration.

On Mon, 09 Sep 2019 12:55:02 +0530
Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:

> This patch includes migration support for machine check
> handling. Especially this patch blocks VM migration
> requests until the machine check error handling is
> complete as (i) these errors are specific to the source
> hardware and is irrelevant on the target hardware,
> (ii) these errors cause data corruption and should
> be handled before migration.
> 
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         |   44 ++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_events.c  |   14 ++++++++++++++
>  hw/ppc/spapr_rtas.c    |    2 ++
>  include/hw/ppc/spapr.h |    2 ++
>  4 files changed, 62 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 1c0908e..f6262f0 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -46,6 +46,7 @@
>  #include "migration/qemu-file-types.h"
>  #include "migration/global_state.h"
>  #include "migration/register.h"
> +#include "migration/blocker.h"
>  #include "mmu-hash64.h"
>  #include "mmu-book3s-v3.h"
>  #include "cpu-models.h"
> @@ -1829,6 +1830,8 @@ static void spapr_machine_reset(MachineState *machine)
>  
>      /* Signal all vCPUs waiting on this condition */
>      qemu_cond_broadcast(&spapr->mc_delivery_cond);
> +
> +    migrate_del_blocker(spapr->fwnmi_migration_blocker);
>  }
>  
>  static void spapr_create_nvram(SpaprMachineState *spapr)
> @@ -2119,6 +2122,42 @@ static const VMStateDescription vmstate_spapr_dtb = {
>      },
>  };
>  
> +static bool spapr_fwnmi_needed(void *opaque)
> +{
> +    SpaprMachineState *spapr = (SpaprMachineState *)opaque;
> +
> +    return spapr->guest_machine_check_addr != -1;
> +}
> +
> +static int spapr_fwnmi_post_load(void *opaque, int version_id)
> +{
> +    SpaprMachineState *spapr = (SpaprMachineState *)opaque;
> +
> +    if (spapr_get_cap(spapr, SPAPR_CAP_FWNMI_MCE) == SPAPR_CAP_ON) {
> +
> +        if (kvmppc_has_cap_ppc_fwnmi()) {
> +            return 0;
> +        }
> +
> +        return kvmppc_set_fwnmi();
> +    }
> +
> +    return 0;
> +}
> +
> +static const VMStateDescription vmstate_spapr_machine_check = {
> +    .name = "spapr_machine_check",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = spapr_fwnmi_needed,
> +    .post_load = spapr_fwnmi_post_load,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT64(guest_machine_check_addr, SpaprMachineState),
> +        VMSTATE_INT32(mc_status, SpaprMachineState),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  static const VMStateDescription vmstate_spapr = {
>      .name = "spapr",
>      .version_id = 3,
> @@ -2152,6 +2191,7 @@ static const VMStateDescription vmstate_spapr = {
>          &vmstate_spapr_dtb,
>          &vmstate_spapr_cap_large_decr,
>          &vmstate_spapr_cap_ccf_assist,
> +        &vmstate_spapr_machine_check,
>          NULL
>      }
>  };
> @@ -2948,6 +2988,10 @@ static void spapr_machine_init(MachineState *machine)
>              exit(1);
>          }
>  
> +        /* Create the error string for live migration blocker */
> +        error_setg(&spapr->fwnmi_migration_blocker,
> +            "Live migration not supported during machine check handling");
> +
>          /* Register ibm,nmi-register and ibm,nmi-interlock RTAS calls */
>          spapr_fwnmi_register();
>      }
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index ecc3d68..83f0a22 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -43,6 +43,7 @@
>  #include "qemu/main-loop.h"
>  #include "hw/ppc/spapr_ovec.h"
>  #include <libfdt.h>
> +#include "migration/blocker.h"
>  
>  #define RTAS_LOG_VERSION_MASK                   0xff000000
>  #define   RTAS_LOG_VERSION_6                    0x06000000
> @@ -844,6 +845,8 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>      CPUState *cs = CPU(cpu);
> +    int ret;
> +    Error *local_err = NULL;
>  
>      if (spapr->guest_machine_check_addr == -1) {
>          /*
> @@ -857,6 +860,17 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>          return;
>      }
>  
> +    ret = migrate_add_blocker(spapr->fwnmi_migration_blocker, &local_err);

If an MCE is already being handled, this adds yet another blocker. IIUC only
the vCPU handling the previous MCE is supposed to call "ibm,nmi-interlock"
and clear the blocker, so a blocker could be leaked. I think
migrate_add_blocker() should only be called once we know that this vCPU
is the one handling the MCE, i.e., after the loop.
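
To illustrate the ordering, here is a stand-alone model rather than real
QEMU code. All names in it (model_mc_status, model_add_blocker(),
model_mce_req_event(), ...) are hypothetical stand-ins, and the wait on
mc_delivery_cond is reduced to "return false". It only shows that adding the
blocker once ownership of the MCE is established keeps add/del balanced:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the machine state; not the QEMU API. */
static int model_blocker_count;     /* number of registered blockers */
static int model_mc_status = -1;    /* -1: no MCE in flight, else vCPU id */

static void model_add_blocker(void)
{
    model_blocker_count++;
}

static void model_del_blocker(void)
{
    assert(model_blocker_count > 0);    /* a delete needs a matching add */
    model_blocker_count--;
}

/*
 * Models the MCE request path. Returns true when this vCPU becomes the
 * one handling the MCE. The real code would wait in a loop while another
 * MCE is in flight; the model simply reports that the caller has to wait.
 */
static bool model_mce_req_event(int cpu_id)
{
    if (model_mc_status != -1) {
        return false;                   /* still "in the loop": no blocker */
    }

    model_mc_status = cpu_id;
    model_add_blocker();                /* added only after the loop */
    return true;
}

/* Models "ibm,nmi-interlock": only the handling vCPU clears the blocker. */
static void model_nmi_interlock(int cpu_id)
{
    if (model_mc_status != cpu_id) {
        return;
    }

    model_mc_status = -1;
    model_del_blocker();
}
```

With this ordering a second vCPU hitting an MCE while the first one is
still being handled never registers a blocker of its own, so the single
delete done at interlock time is enough.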

Also, please note that migrate_add_blocker() can fail for two reasons:
(1) migration is already in progress (-EBUSY)
(2) QEMU was started with -only-migratable (-EACCES)

> +    if (ret < 0) {
> +        /*
> +         * We don't want to abort and let the migration to continue. In a
> +         * rare case, the machine check handler will run on the target
> +         * hardware. Though this is not preferable, it is better than aborting
> +         * the migration or killing the VM.
> +         */

This seems correct for case (1).

> +        warn_report_err(local_err);

The warning would be:

disallowing migration blocker (migration in progress) for:
 Live migration not supported during machine check handling

This looks rather cryptic for the average user. It may be
better to ignore the generic message, i.e., pass NULL to
migrate_add_blocker(), and output a more meaningful warning
with warn_report() directly. Something like:

"A machine check is being handled during migration. This may
cause data corruption or abusive poisoning of some of the
guest memory on the destination"
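
As a rough sketch of that decision, again a stand-alone model and not QEMU
code: model_migrate_add_blocker() and its two boolean parameters are invented
here to reproduce the two failure codes listed above, and
mce_blocker_warning() is a hypothetical helper that picks the text to hand to
a warning function. Only the -EBUSY case gets the message; what to do for the
other return values is left to the caller:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the blocker registration. The parameters are
 * made up for the model; only the two error codes come from the discussion.
 */
static int model_migrate_add_blocker(bool migration_in_progress,
                                     bool only_migratable)
{
    if (only_migratable) {
        return -EACCES;     /* case (2) */
    }
    if (migration_in_progress) {
        return -EBUSY;      /* case (1) */
    }
    return 0;
}

/* Returns the text to warn with, or NULL when this helper stays silent. */
static const char *mce_blocker_warning(int ret)
{
    assert(ret <= 0);       /* 0 or a negative errno value */

    if (ret == -EBUSY) {
        return "A machine check is being handled during migration. This may "
               "cause data corruption or abusive poisoning of some of the "
               "guest memory on the destination";
    }
    return NULL;
}
```

In the real function the returned string would be passed to warn_report()
instead of forwarding the generic error filled in by the blocker code.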

Case (2) is different. There isn't any migration in progress: the idea
behind the -only-migratable QEMU option is to avoid configurations that
can block migration. If migration doesn't happen while the MCE is being
handled, I don't think we should output a warning at all. But a warning
(same as above?) should be printed if migration happens before the vCPU
has called "ibm,nmi-interlock", by checking mc_status in spapr_pre_save()
for example.
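
A minimal model of that check, assuming that mc_status == -1 means "no MCE in
flight" as in the patch. struct model_spapr, model_mce_in_flight() and
model_pre_save() are hypothetical names for this sketch, and fprintf() stands
in for warn_report():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical reduction of the machine state to the one field we need. */
struct model_spapr {
    int mc_status;      /* -1: no MCE in flight, else id of handling vCPU */
};

static bool model_mce_in_flight(const struct model_spapr *s)
{
    assert(s != NULL);
    return s->mc_status != -1;
}

/*
 * Models a pre_save hook: warn if the outgoing state still has an MCE
 * being handled, but never fail the migration because of it.
 */
static int model_pre_save(const struct model_spapr *s)
{
    if (model_mce_in_flight(s)) {
        fprintf(stderr, "warning: a machine check is being handled "
                        "during migration\n");
    }
    return 0;
}
```

This keeps the -only-migratable case silent unless a migration really
starts before the interlock call.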

> +    }
> +
>      while (spapr->mc_status != -1) {
>          /*
>           * Check whether the same CPU got machine check error
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index d892583..b682cc2 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -50,6 +50,7 @@
>  #include "hw/ppc/fdt.h"
>  #include "target/ppc/mmu-hash64.h"
>  #include "target/ppc/mmu-book3s-v3.h"
> +#include "migration/blocker.h"
>  
>  static void rtas_display_character(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                                     uint32_t token, uint32_t nargs,
> @@ -438,6 +439,7 @@ static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
>           */
>          spapr->mc_status = -1;
>          qemu_cond_signal(&spapr->mc_delivery_cond);
> +        migrate_del_blocker(spapr->fwnmi_migration_blocker);
>          rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>      }
>  }
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index dada821..ea7625e 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -217,6 +217,8 @@ struct SpaprMachineState {
>  
>      unsigned gpu_numa_id;
>      SpaprTpmProxy *tpm_proxy;
> +
> +    Error *fwnmi_migration_blocker;
>  };
>  
>  #define H_SUCCESS         0
> 