From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: aik@au1.ibm.com, Greg Kurz <groug@kaod.org>,
	qemu-devel@nongnu.org, paulus@ozlabs.org, qemu-ppc@nongnu.org,
	david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 6/6] migration: Block migration while handling machine check
Date: Thu, 16 May 2019 15:17:47 +0100
Message-ID: <20190516141746.GB3005@work-vm>
In-Reply-To: <d087094a-6459-0eda-0fee-935cd3b5bdbc@linux.vnet.ibm.com>

* Aravinda Prasad (aravinda@linux.vnet.ibm.com) wrote:
> 
> 
> On Thursday 16 May 2019 04:24 PM, Greg Kurz wrote:
> > On Mon, 22 Apr 2019 12:33:45 +0530
> > Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:
> > 
> >> Block VM migration requests until the machine check
> >> error handling is complete as (i) these errors are
> >> specific to the source hardware and are irrelevant on
> >> the target hardware, (ii) these errors cause data
> >> corruption and should be handled before migration.
> >>
> >> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> >> ---
> >>  hw/ppc/spapr_events.c  |   17 +++++++++++++++++
> >>  hw/ppc/spapr_rtas.c    |    4 ++++
> >>  include/hw/ppc/spapr.h |    3 +++
> >>  3 files changed, 24 insertions(+)
> >>
> >> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> >> index 4032db0..45b990c 100644
> >> --- a/hw/ppc/spapr_events.c
> >> +++ b/hw/ppc/spapr_events.c
> >> @@ -41,6 +41,7 @@
> >>  #include "qemu/bcd.h"
> >>  #include "hw/ppc/spapr_ovec.h"
> >>  #include <libfdt.h>
> >> +#include "migration/blocker.h"
> >>  
> >>  #define RTAS_LOG_VERSION_MASK                   0xff000000
> >>  #define   RTAS_LOG_VERSION_6                    0x06000000
> >> @@ -864,6 +865,22 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
> >>  void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
> >>  {
> >>      SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> >> +    int ret;
> >> +    Error *local_err = NULL;
> >> +
> >> +    error_setg(&spapr->migration_blocker,
> >> +            "Live migration not supported during machine check handling");
> >> +    ret = migrate_add_blocker(spapr->migration_blocker, &local_err);
> > 
> > migrate_add_blocker() propagates the reason of the failure in local_err,
> > ie. because a migration is already in progress or --only-migratable was
> > passed on the QEMU command line, along with the error message passed in
> > the first argument. This means that...
> > 
> >> +    if (ret < 0) {
> >> +        /*
> >> +         * We don't want to abort, so let the migration continue. In a
> >> +         * rare case, the machine check handler will run on the target
> >> +         * hardware. Though this is not preferable, it is better than aborting
> >> +         * the migration or killing the VM.
> >> +         */
> >> +        error_free(spapr->migration_blocker);
> >> +        fprintf(stderr, "Warning: Machine check during VM migration\n");
> > 
> > ... you should just do:
> > 
> >         error_report_err(local_err);
> > 
> > This also takes care of freeing local_err which would be leaked otherwise.
> 
> Sure. I am planning to use warn_report_err() as I don't want to abort.
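
For anyone following the ownership rules discussed above, here is a
minimal standalone sketch, not QEMU code. MockError and every mock_*
function are hypothetical stand-ins for Error, migrate_add_blocker(),
warn_report_err() and error_free(); the failure message is made up.
It only illustrates that the reporting helper consumes local_err, so
nothing leaks on the failure path. Returning NULL after freeing the
blocker is a choice made in this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object; not the real definition. */
typedef struct MockError {
    char *msg;
} MockError;

/* Number of MockError objects allocated and not yet freed. */
static int live_errors;

static MockError *mock_error_new(const char *msg)
{
    MockError *err = malloc(sizeof(*err));
    assert(err != NULL);
    err->msg = malloc(strlen(msg) + 1);
    assert(err->msg != NULL);
    strcpy(err->msg, msg);
    live_errors++;
    return err;
}

static void mock_error_free(MockError *err)
{
    if (err) {
        free(err->msg);
        free(err);
        live_errors--;
    }
}

/* Print the error as a warning, then free it: the caller gives up ownership. */
static void mock_warn_report_err(MockError *err)
{
    fprintf(stderr, "warning: %s\n", err->msg);
    mock_error_free(err);
}

/*
 * Stand-in for adding a migration blocker. On failure it stores a newly
 * allocated error describing the reason in *errp and returns -1.
 */
static int mock_add_blocker(MockError *reason, int migration_in_progress,
                            MockError **errp)
{
    if (migration_in_progress) {
        char buf[256];
        snprintf(buf, sizeof(buf),
                 "cannot add blocker, migration in progress: %s", reason->msg);
        *errp = mock_error_new(buf);
        return -1;
    }
    return 0;
}

/* Returns the installed blocker, or NULL if it could not be installed. */
static MockError *mock_block_migration(int migration_in_progress)
{
    MockError *blocker = mock_error_new(
        "Live migration not supported during machine check handling");
    MockError *local_err = NULL;

    if (mock_add_blocker(blocker, migration_in_progress, &local_err) < 0) {
        mock_warn_report_err(local_err); /* reports and frees local_err */
        mock_error_free(blocker);
        return NULL; /* sketch-only choice: no dangling pointer for callers */
    }
    return blocker;
}
```

Calling mock_block_migration(1) prints one warning and leaves
live_errors at zero; mock_block_migration(0) returns a blocker that
the caller frees once handling is complete.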

I worry about the high-level effect of this blocker.
Failing hardware is a common reason for wanting to migrate, so if the
hardware is reporting lots of errors you might not be able to migrate
the VM to more solid hardware because of this blocker.

Dave

> Regards,
> Aravinda
> 
> > 
> >> +    }
> >>  
> >>      while (spapr->mc_status != -1) {
> >>          /*
> >> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> >> index 997cf19..1229a0e 100644
> >> --- a/hw/ppc/spapr_rtas.c
> >> +++ b/hw/ppc/spapr_rtas.c
> >> @@ -50,6 +50,7 @@
> >>  #include "target/ppc/mmu-hash64.h"
> >>  #include "target/ppc/mmu-book3s-v3.h"
> >>  #include "kvm_ppc.h"
> >> +#include "migration/blocker.h"
> >>  
> >>  static void rtas_display_character(PowerPCCPU *cpu, SpaprMachineState *spapr,
> >>                                     uint32_t token, uint32_t nargs,
> >> @@ -396,6 +397,9 @@ static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
> >>          spapr->mc_status = -1;
> >>          qemu_cond_signal(&spapr->mc_delivery_cond);
> >>          rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> >> +        migrate_del_blocker(spapr->migration_blocker);
> >> +        error_free(spapr->migration_blocker);
> >> +        spapr->migration_blocker = NULL;
> >>      }
> >>  }
> >>  
> >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >> index 9d16ad1..dda5fd2 100644
> >> --- a/include/hw/ppc/spapr.h
> >> +++ b/include/hw/ppc/spapr.h
> >> @@ -10,6 +10,7 @@
> >>  #include "hw/ppc/spapr_irq.h"
> >>  #include "hw/ppc/spapr_xive.h"  /* For SpaprXive */
> >>  #include "hw/ppc/xics.h"        /* For ICSState */
> >> +#include "qapi/error.h"
> >>  
> >>  struct SpaprVioBus;
> >>  struct SpaprPhbState;
> >> @@ -213,6 +214,8 @@ struct SpaprMachineState {
> >>      SpaprCapabilities def, eff, mig;
> >>  
> >>      unsigned gpu_numa_id;
> >> +
> >> +    Error *migration_blocker;
> >>  };
> >>  
> >>  #define H_SUCCESS         0
> >>
> >>
> > 
> 
> -- 
> Regards,
> Aravinda
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

