From: Greg Kurz <groug@kaod.org>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: "Aravinda Prasad" <arawinda.p@gmail.com>,
	"Alexey Kardashevskiy" <aik@ozlabs.ru>,
	"Mahesh Salgaonkar" <mahesh@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org, "Cédric Le Goater" <clg@fr.ibm.com>,
	"Ganesh Goudar" <ganeshgr@linux.ibm.com>,
	qemu-ppc@nongnu.org, "David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [PATCH v2 4/4] ppc/spapr: Don't kill the guest if a recovered FWNMI machine check delivery fails
Date: Wed, 25 Mar 2020 19:13:32 +0100
Message-ID: <20200325191332.7da79231@bahia.lan>
In-Reply-To: <20200325142906.221248-5-npiggin@gmail.com>

On Thu, 26 Mar 2020 00:29:06 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> Try to be tolerant of FWNMI delivery errors if the machine check had been
> recovered by the host.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  hw/ppc/spapr_events.c | 27 ++++++++++++++++++++++-----
>  1 file changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index c8964eb25d..b90ecb8afe 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -833,13 +833,25 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
>      /* get rtas addr from fdt */
>      rtas_addr = spapr_get_rtas_addr();
>      if (!rtas_addr) {
> -        error_report(
> +        if (!recovered) {
> +            error_report(
>  "FWNMI: Unable to deliver machine check to guest: rtas_addr not found.");
> -        qemu_system_guest_panicked(NULL);
> +            qemu_system_guest_panicked(NULL);
> +        } else {
> +            warn_report(
> +"FWNMI: Unable to deliver machine check to guest: rtas_addr not found. "
> +"Machine check recovered.");
> +        }
>          g_free(ext_elog);
>          return;
>      }
>  
> +    /*
> +     * Must not set interlock if the MCE does not get delivered to the guest
> +     * in the error case above.
> +     */

IM-non-native-speaker-HO, it is a bit confusing to read "must not set
interlock" and then see the interlock being set on the very next line.
Also, a short explanation of what happens if the interlock is taken
without delivering the MCE could help people who aren't familiar with
FWNMI avoid doing bad things.

What about something like the following?

    /*
     * By taking the interlock, we assume that the MCE will be
     * delivered to the guest. CAUTION: don't add anything after
     * this line that could prevent the MCE from being delivered,
     * otherwise the guest won't be able to release the interlock
     * and will ultimately hang/crash?
     */

> +    spapr->fwnmi_machine_check_interlock = cpu->vcpu_id;
> +

For improved paranoia, this could even be done just before calling
ppc_cpu_do_fwnmi_machine_check().
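
To illustrate the ordering I have in mind, here is a minimal standalone
sketch. It is not QEMU code: struct machine_model, INTERLOCK_FREE,
dispatch_mce() and deliver_mce() are hypothetical stand-ins for the spapr
state, the "interlock not held" value, spapr_mce_dispatch_elog() and
ppc_cpu_do_fwnmi_machine_check() respectively. It only assumes that every
step that can fail happens before the interlock is taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define INTERLOCK_FREE (-1)   /* hypothetical "nobody holds it" value */

/* Hypothetical model of the relevant machine state. */
struct machine_model {
    int interlock_holder;     /* stands in for fwnmi_machine_check_interlock */
    uint64_t rtas_addr;       /* 0 models "rtas_addr not found" */
    int delivered;            /* number of MCEs handed to the guest */
};

/* Stand-in for the actual delivery; it cannot fail in this model. */
static void deliver_mce(struct machine_model *m)
{
    /* Delivery must only ever happen with the interlock held. */
    assert(m->interlock_holder != INTERLOCK_FREE);
    m->delivered++;
}

/* Returns true if the MCE was delivered, false if delivery was abandoned. */
static bool dispatch_mce(struct machine_model *m, int vcpu_id, bool recovered)
{
    if (!m->rtas_addr) {
        /* Failure path: bail out before touching the interlock. */
        fprintf(stderr, "%s: rtas_addr not found\n",
                recovered ? "warning" : "fatal");
        return false;
    }

    /* ... build and write the error log here ... */

    /* Take the interlock as late as possible, right before delivery. */
    m->interlock_holder = vcpu_id;
    deliver_mce(m);
    return true;
}
```

With this ordering, a failed delivery leaves interlock_holder untouched, so
there is nothing for the guest to release.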

Anyway, the change is good enough so:

Reviewed-by: Greg Kurz <groug@kaod.org>

>      stq_be_phys(&address_space_memory, rtas_addr + RTAS_ERROR_LOG_OFFSET,
>                  env->gpr[3]);
>      cpu_physical_memory_write(rtas_addr + RTAS_ERROR_LOG_OFFSET +
> @@ -876,9 +888,15 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>           * that CPU called "ibm,nmi-interlock")
>           */
>          if (spapr->fwnmi_machine_check_interlock == cpu->vcpu_id) {
> -            error_report(
> +            if (!recovered) {
> +                error_report(
>  "FWNMI: Unable to deliver machine check to guest: nested machine check.");
> -            qemu_system_guest_panicked(NULL);
> +                qemu_system_guest_panicked(NULL);
> +            } else {
> +                warn_report(
> +"FWNMI: Unable to deliver machine check to guest: nested machine check. "
> +"Machine check recovered.");
> +            }
>              return;
>          }
>          qemu_cond_wait_iothread(&spapr->fwnmi_machine_check_interlock_cond);
> @@ -906,7 +924,6 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>          warn_report("Received a fwnmi while migration was in progress");
>      }
>  
> -    spapr->fwnmi_machine_check_interlock = cpu->vcpu_id;
>      spapr_mce_dispatch_elog(cpu, recovered);
>  }
>  


