All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: peter.maydell@linaro.org, groug@kaod.org
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	David Gibson <david@gibson.dropbear.id.au>,
	Xujun Ma <xuma@redhat.com>
Subject: [PULL 11/20] spapr_drc.c: add hotunplug timeout for CPUs
Date: Wed, 10 Mar 2021 15:09:53 +1100	[thread overview]
Message-ID: <20210310041002.333813-12-david@gibson.dropbear.id.au> (raw)
In-Reply-To: <20210310041002.333813-1-david@gibson.dropbear.id.au>

From: Daniel Henrique Barboza <danielhb413@gmail.com>

There is a reliable way to make a CPU hotunplug fail in the pseries
machine. Hotplug a CPU A, then offline all other CPUs inside the guest
but A. When trying to hotunplug A the guest kernel will refuse to do it,
because A is now the last online CPU of the guest. PAPR has no 'error
callback' in this situation to report back to the platform, so the guest
kernel will deny the unplug in silent and QEMU will never know what
happened. The unplug pending state of A will remain until the guest is
shutdown or rebooted.

Previous attempts of fixing it (see [1] and [2]) were aimed at trying to
mitigate the effects of the problem. In [1] we were trying to guess
which guest CPUs were online to forbid hotunplug of the last online CPU
in the QEMU layer, avoiding the scenario described above because QEMU is
now failing in behalf of the guest. This is not robust because the last
online CPU of the guest can change while we're in the middle of the
unplug process, and our initial assumptions are now invalid. In [2] we
were accepting that our unplug process is uncertain and the user should
be allowed to spam the IRQ hotunplug queue of the guest in case the CPU
hotunplug fails.

This patch presents another alternative, using the timeout
infrastructure introduced in the previous patch. CPU hotunplugs in the
pSeries machine will now timeout after 15 seconds. This is a long time
for a single CPU unplug to occur, regardless of guest load - although
the user is *strongly* encouraged to *not* hotunplug devices from a
guest under high load - and we can be sure that something went wrong if
it takes longer than that for the guest to release the CPU (the same
can't be said about memory hotunplug - more on that in the next patch).

Timing out the unplug operation will reset the unplug state of the CPU
and allow the user to try it again, regardless of the error situation
that prevented the hotunplug to occur. Of all the not so pretty
fixes/mitigations for CPU hotunplug errors in pSeries, timing out the
operation is an admission that we have no control in the process, and
must assume the worst case if the operation doesn't succeed in a
sensible time frame.

[1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg03353.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg04400.html

Reported-by: Xujun Ma <xuma@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1911414
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210222194531.62717-5-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c             |  4 ++++
 hw/ppc/spapr_drc.c         | 13 +++++++++++++
 include/hw/ppc/spapr_drc.h |  1 +
 3 files changed, 18 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b066df68cb..ecce8abf14 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3724,6 +3724,10 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
     if (!spapr_drc_unplug_requested(drc)) {
         spapr_drc_unplug_request(drc);
         spapr_hotplug_req_remove_by_index(drc);
+    } else {
+        error_setg(errp, "core-id %d unplug is still pending, %d seconds "
+                   "timeout remaining",
+                   cc->core_id, spapr_drc_unplug_timeout_remaining_sec(drc));
     }
 }
 
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 27adbc5c30..fd2e45640f 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -409,6 +409,8 @@ void spapr_drc_unplug_request(SpaprDrc *drc)
 
     drc->unplug_requested = true;
 
+    spapr_drc_start_unplug_timeout_timer(drc);
+
     if (drc->state != drck->empty_state) {
         trace_spapr_drc_awaiting_quiesce(spapr_drc_index(drc));
         return;
@@ -417,6 +419,16 @@ void spapr_drc_unplug_request(SpaprDrc *drc)
     spapr_drc_release(drc);
 }
 
+int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc)
+{
+    if (drc->unplug_requested && timer_pending(drc->unplug_timeout_timer)) {
+        return (qemu_timeout_ns_to_ms(drc->unplug_timeout_timer->expire_time) -
+                qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)) / 1000;
+    }
+
+    return 0;
+}
+
 bool spapr_drc_reset(SpaprDrc *drc)
 {
     SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -710,6 +722,7 @@ static void spapr_drc_cpu_class_init(ObjectClass *k, void *data)
     drck->drc_name_prefix = "CPU ";
     drck->release = spapr_core_release;
     drck->dt_populate = spapr_core_dt_populate;
+    drck->unplug_timeout_seconds = 15;
 }
 
 static void spapr_drc_pci_class_init(ObjectClass *k, void *data)
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 38ec4c8091..26599c385a 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -248,6 +248,7 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask);
  */
 void spapr_drc_attach(SpaprDrc *drc, DeviceState *d);
 void spapr_drc_unplug_request(SpaprDrc *drc);
+int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc);
 
 /*
  * Reset all DRCs, causing pending hot-plug/unplug requests to complete.
-- 
2.29.2



  parent reply	other threads:[~2021-03-10  4:19 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-10  4:09 [PULL 00/20] ppc-for-6.0 queue 20210310 David Gibson
2021-03-10  4:09 ` [PULL 01/20] hw/display/sm501: Remove dead code for non-32-bit RGB surfaces David Gibson
2021-03-10  4:09 ` [PULL 02/20] hw/display/sm501: Expand out macros in template header David Gibson
2021-03-10  4:09 ` [PULL 03/20] hw/display/sm501: Inline template header into C file David Gibson
2021-03-10  4:09 ` [PULL 04/20] spapr_drc.c: do not call spapr_drc_detach() in drc_isolate_logical() David Gibson
2021-03-10  4:09 ` [PULL 05/20] pseries: Update SLOF firmware image David Gibson
2021-03-10  4:09 ` [PULL 06/20] spapr_drc.c: use spapr_drc_release() in isolate_physical/set_unusable David Gibson
2021-03-10  4:09 ` [PULL 07/20] spapr: rename spapr_drc_detach() to spapr_drc_unplug_request() David Gibson
2021-03-10  4:09 ` [PULL 08/20] docs/system: Extend PPC section David Gibson
2021-03-10  4:09 ` [PULL 09/20] target/ppc: Fix bcdsub. emulation when result overflows David Gibson
2021-03-10  4:09 ` [PULL 10/20] spapr_drc.c: introduce unplug_timeout_timer David Gibson
2021-03-10  4:09 ` David Gibson [this message]
2021-03-10  4:09 ` [PULL 12/20] spapr_drc.c: use DRC reconfiguration to cleanup DIMM unplug state David Gibson
2021-03-10  4:09 ` [PULL 13/20] hw/net: fsl_etsec: Fix build error when HEX_DUMP is on David Gibson
2021-03-10  4:09 ` [PULL 14/20] hw/ppc: e500: Add missing <ranges> in the eTSEC node David Gibson
2021-03-10  4:09 ` [PULL 15/20] spapr.c: add 'unplug already in progress' message for PHB unplug David Gibson
2021-03-10  4:09 ` [PULL 16/20] spapr_pci.c: add 'unplug already in progress' message for PCI unplug David Gibson
2021-03-10  4:09 ` [PULL 17/20] qemu_timer.c: add timer_deadline_ms() helper David Gibson
2021-03-10  4:10 ` [PULL 18/20] target/ppc: fix icount support on Book-e vms accessing SPRs David Gibson
2021-03-10  4:10 ` [PULL 19/20] spapr.c: remove duplicated assert in spapr_memory_unplug_request() David Gibson
2021-03-10  4:10 ` [PULL 20/20] spapr.c: send QAPI event when memory hotunplug fails David Gibson
2021-03-10  4:43 ` [PULL 00/20] ppc-for-6.0 queue 20210310 Bin Meng
2021-03-10  6:00   ` David Gibson
2021-03-11  1:26     ` Bin Meng
2021-03-10 14:09 ` Ivan Warren
2021-03-11  1:47   ` David Gibson
2021-03-11  3:22     ` Ivan Warren
2021-03-11  4:56       ` David Gibson
2021-03-11 13:31         ` Richard Henderson
2021-03-11 15:54           ` Greg Kurz
2021-03-11 18:02             ` Greg Kurz
2021-03-12 13:53 ` Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210310041002.333813-12-david@gibson.dropbear.id.au \
    --to=david@gibson.dropbear.id.au \
    --cc=danielhb413@gmail.com \
    --cc=groug@kaod.org \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=xuma@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.