qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] pSeries: revert CPU unplug timeout
@ 2021-04-01  0:04 Daniel Henrique Barboza
  2021-04-01  0:04 ` [PATCH 1/2] spapr: rollback 'unplug timeout' for CPU hotunplugs Daniel Henrique Barboza
  2021-04-01  0:04 ` [PATCH 2/2] spapr.c: always pulse guest IRQ in spapr_core_unplug_request() Daniel Henrique Barboza
  0 siblings, 2 replies; 6+ messages in thread
From: Daniel Henrique Barboza @ 2021-04-01  0:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Daniel Henrique Barboza, qemu-ppc, groug, david

Hi,

This series reverts the CPU hotunplug timeout mechanism that was added
during the 6.0.0 cycle. See patch 1 for the reasoning behind this
decision.

Patch 2 is a re-post of a patch that allows CPU hotunplug events to be
re-send to the guest [1], regardless of any existing hotunplug pending
state present in QEMU. This will give users a way to retry hotunplugging
CPUs without relying on unplug timeouts.

[1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg04399.html


Daniel Henrique Barboza (2):
  spapr: rollback 'unplug timeout' for CPU hotunplugs
  spapr.c: always pulse guest IRQ in spapr_core_unplug_request()

 hw/ppc/spapr.c             | 15 +++++++----
 hw/ppc/spapr_drc.c         | 52 --------------------------------------
 include/hw/ppc/spapr_drc.h |  5 ----
 include/qemu/timer.h       |  8 ------
 util/qemu-timer.c          | 13 ----------
 5 files changed, 10 insertions(+), 83 deletions(-)

-- 
2.30.2



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] spapr: rollback 'unplug timeout' for CPU hotunplugs
  2021-04-01  0:04 [PATCH 0/2] pSeries: revert CPU unplug timeout Daniel Henrique Barboza
@ 2021-04-01  0:04 ` Daniel Henrique Barboza
  2021-04-01  0:04 ` [PATCH 2/2] spapr.c: always pulse guest IRQ in spapr_core_unplug_request() Daniel Henrique Barboza
  1 sibling, 0 replies; 6+ messages in thread
From: Daniel Henrique Barboza @ 2021-04-01  0:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Daniel Henrique Barboza, qemu-ppc, groug, david

The pseries machines introduced the concept of 'unplug timeout' for CPU
hotunplugs. The idea was to circunvent a deficiency in the pSeries
specification (PAPR), that currently does not define a proper way for
the hotunplug to fail. If the guest refuses to release the CPU (see [1]
for an example) there is no way for QEMU to detect the failure.

Further discussions about how to send a QAPI event to inform about the
hotunplug timeout [2] exposed problems that weren't predicted back when
the idea was developed. Other QEMU machines don't have any type of
hotunplug timeout mechanism for any device, e.g. ACPI based machines
have a way to make hotunplug errors visible to the hypervisor. This
would make this timeout mechanism exclusive to pSeries, which is not
ideal.

The real problem is that a QAPI event that reports hotunplug timeouts
puts the management layer (namely Libvirt) in a weird spot. We're not
telling that the hotunplug failed, because we can't be 100% sure of
that, and yet we're resetting the unplug state back, preventing any
DEVICE_DEL events to reach out in case the guest decides to release the
device. Libvirt would need to inspect the guest itself to see if the
device was released or not, otherwise the internal domain states will be
inconsistent.  Moreover, Libvirt already has an 'unplug timeout'
concept, and a QEMU side timeout would need to be juggled together with
the existing Libvirt timeout.

All this considered, this solution ended up creating more trouble than
it solved. This patch reverts the 3 commits that introduced the timeout
mechanism for CPU hotplugs in pSeries machines.

This reverts commit 4515a5f786024fabf0bef4cf3d28adf5647e6e82
"qemu_timer.c: add timer_deadline_ms() helper"

This reverts commit d1c2e3ce3d5a5424651967bce1cf1f4caa0c6d91
"spapr_drc.c: add hotunplug timeout for CPUs"

This reverts commit 51254ffb320183a4636635840c23ee0e3a1efffa
"spapr_drc.c: introduce unplug_timeout_timer"

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1911414
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg04682.html

CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---

Paolo, I'm CCing you in this patch because I reverted a change in
qemu_timer.c that added the timer_deadline_ms() helper that was used
only by pseries code that I'm removing with this patch. Figured that
a helper that is not used anywhere wouldn't do too much good. 


 hw/ppc/spapr.c             |  4 ---
 hw/ppc/spapr_drc.c         | 52 --------------------------------------
 include/hw/ppc/spapr_drc.h |  5 ----
 include/qemu/timer.h       |  8 ------
 util/qemu-timer.c          | 13 ----------
 5 files changed, 82 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 73a06df3b1..05a765fab4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3778,10 +3778,6 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
     if (!spapr_drc_unplug_requested(drc)) {
         spapr_drc_unplug_request(drc);
         spapr_hotplug_req_remove_by_index(drc);
-    } else {
-        error_setg(errp, "core-id %d unplug is still pending, %d seconds "
-                   "timeout remaining",
-                   cc->core_id, spapr_drc_unplug_timeout_remaining_sec(drc));
     }
 }
 
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 8a71b03800..9e16505fa1 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -57,8 +57,6 @@ static void spapr_drc_release(SpaprDrc *drc)
     drck->release(drc->dev);
 
     drc->unplug_requested = false;
-    timer_del(drc->unplug_timeout_timer);
-
     g_free(drc->fdt);
     drc->fdt = NULL;
     drc->fdt_start_offset = 0;
@@ -372,17 +370,6 @@ static void prop_get_fdt(Object *obj, Visitor *v, const char *name,
     } while (fdt_depth != 0);
 }
 
-static void spapr_drc_start_unplug_timeout_timer(SpaprDrc *drc)
-{
-    SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
-
-    if (drck->unplug_timeout_seconds != 0) {
-        timer_mod(drc->unplug_timeout_timer,
-                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
-                  drck->unplug_timeout_seconds * 1000);
-    }
-}
-
 void spapr_drc_attach(SpaprDrc *drc, DeviceState *d)
 {
     trace_spapr_drc_attach(spapr_drc_index(drc));
@@ -409,8 +396,6 @@ void spapr_drc_unplug_request(SpaprDrc *drc)
 
     drc->unplug_requested = true;
 
-    spapr_drc_start_unplug_timeout_timer(drc);
-
     if (drc->state != drck->empty_state) {
         trace_spapr_drc_awaiting_quiesce(spapr_drc_index(drc));
         return;
@@ -419,15 +404,6 @@ void spapr_drc_unplug_request(SpaprDrc *drc)
     spapr_drc_release(drc);
 }
 
-int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc)
-{
-    if (drc->unplug_requested) {
-        return timer_deadline_ms(drc->unplug_timeout_timer) / 1000;
-    }
-
-    return 0;
-}
-
 bool spapr_drc_reset(SpaprDrc *drc)
 {
     SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -499,23 +475,11 @@ static bool spapr_drc_needed(void *opaque)
         spapr_drc_unplug_requested(drc);
 }
 
-static int spapr_drc_post_load(void *opaque, int version_id)
-{
-    SpaprDrc *drc = opaque;
-
-    if (drc->unplug_requested) {
-        spapr_drc_start_unplug_timeout_timer(drc);
-    }
-
-    return 0;
-}
-
 static const VMStateDescription vmstate_spapr_drc = {
     .name = "spapr_drc",
     .version_id = 1,
     .minimum_version_id = 1,
     .needed = spapr_drc_needed,
-    .post_load = spapr_drc_post_load,
     .fields  = (VMStateField []) {
         VMSTATE_UINT32(state, SpaprDrc),
         VMSTATE_END_OF_LIST()
@@ -526,15 +490,6 @@ static const VMStateDescription vmstate_spapr_drc = {
     }
 };
 
-static void drc_unplug_timeout_cb(void *opaque)
-{
-    SpaprDrc *drc = opaque;
-
-    if (drc->unplug_requested) {
-        drc->unplug_requested = false;
-    }
-}
-
 static void drc_realize(DeviceState *d, Error **errp)
 {
     SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
@@ -557,11 +512,6 @@ static void drc_realize(DeviceState *d, Error **errp)
     object_property_add_alias(root_container, link_name,
                               drc->owner, child_name);
     g_free(link_name);
-
-    drc->unplug_timeout_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
-                                             drc_unplug_timeout_cb,
-                                             drc);
-
     vmstate_register(VMSTATE_IF(drc), spapr_drc_index(drc), &vmstate_spapr_drc,
                      drc);
     trace_spapr_drc_realize_complete(spapr_drc_index(drc));
@@ -579,7 +529,6 @@ static void drc_unrealize(DeviceState *d)
     name = g_strdup_printf("%x", spapr_drc_index(drc));
     object_property_del(root_container, name);
     g_free(name);
-    timer_free(drc->unplug_timeout_timer);
 }
 
 SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
@@ -721,7 +670,6 @@ static void spapr_drc_cpu_class_init(ObjectClass *k, void *data)
     drck->drc_name_prefix = "CPU ";
     drck->release = spapr_core_release;
     drck->dt_populate = spapr_core_dt_populate;
-    drck->unplug_timeout_seconds = 15;
 }
 
 static void spapr_drc_pci_class_init(ObjectClass *k, void *data)
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 26599c385a..02a63b3666 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -187,8 +187,6 @@ typedef struct SpaprDrc {
     bool unplug_requested;
     void *fdt;
     int fdt_start_offset;
-
-    QEMUTimer *unplug_timeout_timer;
 } SpaprDrc;
 
 struct SpaprMachineState;
@@ -211,8 +209,6 @@ typedef struct SpaprDrcClass {
 
     int (*dt_populate)(SpaprDrc *drc, struct SpaprMachineState *spapr,
                        void *fdt, int *fdt_start_offset, Error **errp);
-
-    int unplug_timeout_seconds;
 } SpaprDrcClass;
 
 typedef struct SpaprDrcPhysical {
@@ -248,7 +244,6 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask);
  */
 void spapr_drc_attach(SpaprDrc *drc, DeviceState *d);
 void spapr_drc_unplug_request(SpaprDrc *drc);
-int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc);
 
 /*
  * Reset all DRCs, causing pending hot-plug/unplug requests to complete.
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 301fa47b42..88ef114689 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -797,14 +797,6 @@ static inline int64_t get_max_clock_jump(void)
     return 60 * NANOSECONDS_PER_SECOND;
 }
 
-/**
- * timer_deadline_ms:
- *
- * Returns the remaining miliseconds for @timer to expire, or zero
- * if the timer is no longer pending.
- */
-int64_t timer_deadline_ms(QEMUTimer *timer);
-
 /*
  * Low level clock functions
  */
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index be529c1f65..f36c75e594 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -242,19 +242,6 @@ int64_t timerlist_deadline_ns(QEMUTimerList *timer_list)
     return delta;
 }
 
-/*
- * Returns the time remaining for the deadline, in ms.
- */
-int64_t timer_deadline_ms(QEMUTimer *timer)
-{
-    if (timer_pending(timer)) {
-        return qemu_timeout_ns_to_ms(timer->expire_time) -
-               qemu_clock_get_ms(timer->timer_list->clock->type);
-    }
-
-    return 0;
-}
-
 /* Calculate the soonest deadline across all timerlists attached
  * to the clock. This is used for the icount timeout so we
  * ignore whether or not the clock should be used in deadline
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] spapr.c: always pulse guest IRQ in spapr_core_unplug_request()
  2021-04-01  0:04 [PATCH 0/2] pSeries: revert CPU unplug timeout Daniel Henrique Barboza
  2021-04-01  0:04 ` [PATCH 1/2] spapr: rollback 'unplug timeout' for CPU hotunplugs Daniel Henrique Barboza
@ 2021-04-01  0:04 ` Daniel Henrique Barboza
  2021-04-01  2:37   ` David Gibson
  1 sibling, 1 reply; 6+ messages in thread
From: Daniel Henrique Barboza @ 2021-04-01  0:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Daniel Henrique Barboza, qemu-ppc, groug, david

Commit 47c8c915b162 fixed a problem where multiple spapr_drc_detach()
requests were breaking QEMU. The solution was to just spapr_drc_detach()
once, and use spapr_drc_unplug_requested() to filter whether we already
detached it or not. The commit also tied the hotplug request to the
guest in the same condition.

Turns out that there is a reliable way for a CPU hotunplug to fail. If a
guest with one CPU hotplugs a CPU1, then offline CPU0s via 'echo 0 >
/sys/devices/system/cpu/cpu0/online', then attempts to hotunplug CPU1,
the kernel will refuse it because it's the last online CPU of the
system. Given that we're pulsing the IRQ only in the first try, in a
failed attempt, all other CPU1 hotunplug attempts will fail, regardless
of the online state of CPU1 in the kernel, because we're simply not
letting the guest know that we want to hotunplug the device.

Let's move spapr_hotplug_req_remove_by_index() back out of the "if
(!spapr_drc_unplug_requested(drc))" conditional, allowing for multiple
'device_del' requests to the same CPU core to reach the guest, in case
the CPU core didn't fully hotunplugged previously.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 hw/ppc/spapr.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 05a765fab4..e4be00b732 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3777,8 +3777,17 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
 
     if (!spapr_drc_unplug_requested(drc)) {
         spapr_drc_unplug_request(drc);
-        spapr_hotplug_req_remove_by_index(drc);
     }
+
+    /*
+     * spapr_hotplug_req_remove_by_index is left unguarded, out of the
+     * "!spapr_drc_unplug_requested" check, to allow for multiple IRQ
+     * pulses removing the same CPU. Otherwise, in an failed hotunplug
+     * attempt (e.g. the kernel will refuse to remove the last online
+     * CPU), we will never attempt it again because unplug_requested
+     * will still be 'true' in that case.
+     */
+    spapr_hotplug_req_remove_by_index(drc);
 }
 
 int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] spapr.c: always pulse guest IRQ in spapr_core_unplug_request()
  2021-04-01  0:04 ` [PATCH 2/2] spapr.c: always pulse guest IRQ in spapr_core_unplug_request() Daniel Henrique Barboza
@ 2021-04-01  2:37   ` David Gibson
  2021-04-12 19:27     ` Daniel Henrique Barboza
  0 siblings, 1 reply; 6+ messages in thread
From: David Gibson @ 2021-04-01  2:37 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: qemu-ppc, qemu-devel, groug

[-- Attachment #1: Type: text/plain, Size: 2845 bytes --]

On Wed, Mar 31, 2021 at 09:04:37PM -0300, Daniel Henrique Barboza wrote:
> Commit 47c8c915b162 fixed a problem where multiple spapr_drc_detach()
> requests were breaking QEMU. The solution was to just spapr_drc_detach()
> once, and use spapr_drc_unplug_requested() to filter whether we already
> detached it or not. The commit also tied the hotplug request to the
> guest in the same condition.
> 
> Turns out that there is a reliable way for a CPU hotunplug to fail. If a
> guest with one CPU hotplugs a CPU1, then offline CPU0s via 'echo 0 >
> /sys/devices/system/cpu/cpu0/online', then attempts to hotunplug CPU1,
> the kernel will refuse it because it's the last online CPU of the
> system. Given that we're pulsing the IRQ only in the first try, in a
> failed attempt, all other CPU1 hotunplug attempts will fail, regardless
> of the online state of CPU1 in the kernel, because we're simply not
> letting the guest know that we want to hotunplug the device.
> 
> Let's move spapr_hotplug_req_remove_by_index() back out of the "if
> (!spapr_drc_unplug_requested(drc))" conditional, allowing for multiple
> 'device_del' requests to the same CPU core to reach the guest, in case
> the CPU core didn't fully hotunplugged previously.
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

I've applied these to ppc-for-6.0, but..

> ---
>  hw/ppc/spapr.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 05a765fab4..e4be00b732 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3777,8 +3777,17 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>  
>      if (!spapr_drc_unplug_requested(drc)) {
>          spapr_drc_unplug_request(drc);
> -        spapr_hotplug_req_remove_by_index(drc);
>      }
> +
> +    /*
> +     * spapr_hotplug_req_remove_by_index is left unguarded, out of the
> +     * "!spapr_drc_unplug_requested" check, to allow for multiple IRQ
> +     * pulses removing the same CPU. Otherwise, in an failed hotunplug
> +     * attempt (e.g. the kernel will refuse to remove the last online
> +     * CPU), we will never attempt it again because unplug_requested
> +     * will still be 'true' in that case.
> +     */
> +    spapr_hotplug_req_remove_by_index(drc);

I think we need similar changes for all the other unplug types (LMB,
PCI, PHB) - basically retries should always be allowed, and at worst
be a no-op, rather than generating an error like they do now.

>  }
>  
>  int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] spapr.c: always pulse guest IRQ in spapr_core_unplug_request()
  2021-04-01  2:37   ` David Gibson
@ 2021-04-12 19:27     ` Daniel Henrique Barboza
  2021-04-20  1:24       ` David Gibson
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Henrique Barboza @ 2021-04-12 19:27 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, groug



On 3/31/21 11:37 PM, David Gibson wrote:
> On Wed, Mar 31, 2021 at 09:04:37PM -0300, Daniel Henrique Barboza wrote:
>> Commit 47c8c915b162 fixed a problem where multiple spapr_drc_detach()
>> requests were breaking QEMU. The solution was to just spapr_drc_detach()
>> once, and use spapr_drc_unplug_requested() to filter whether we already
>> detached it or not. The commit also tied the hotplug request to the
>> guest in the same condition.
>>
>> Turns out that there is a reliable way for a CPU hotunplug to fail. If a
>> guest with one CPU hotplugs a CPU1, then offline CPU0s via 'echo 0 >
>> /sys/devices/system/cpu/cpu0/online', then attempts to hotunplug CPU1,
>> the kernel will refuse it because it's the last online CPU of the
>> system. Given that we're pulsing the IRQ only in the first try, in a
>> failed attempt, all other CPU1 hotunplug attempts will fail, regardless
>> of the online state of CPU1 in the kernel, because we're simply not
>> letting the guest know that we want to hotunplug the device.
>>
>> Let's move spapr_hotplug_req_remove_by_index() back out of the "if
>> (!spapr_drc_unplug_requested(drc))" conditional, allowing for multiple
>> 'device_del' requests to the same CPU core to reach the guest, in case
>> the CPU core didn't fully hotunplugged previously.
>>
>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> 
> I've applied these to ppc-for-6.0, but..
> 
>> ---
>>   hw/ppc/spapr.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 05a765fab4..e4be00b732 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -3777,8 +3777,17 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>>   
>>       if (!spapr_drc_unplug_requested(drc)) {
>>           spapr_drc_unplug_request(drc);
>> -        spapr_hotplug_req_remove_by_index(drc);
>>       }
>> +
>> +    /*
>> +     * spapr_hotplug_req_remove_by_index is left unguarded, out of the
>> +     * "!spapr_drc_unplug_requested" check, to allow for multiple IRQ
>> +     * pulses removing the same CPU. Otherwise, in an failed hotunplug
>> +     * attempt (e.g. the kernel will refuse to remove the last online
>> +     * CPU), we will never attempt it again because unplug_requested
>> +     * will still be 'true' in that case.
>> +     */
>> +    spapr_hotplug_req_remove_by_index(drc);
> 
> I think we need similar changes for all the other unplug types (LMB,
> PCI, PHB) - basically retries should always be allowed, and at worst
> be a no-op, rather than generating an error like they do now.


For PHBs should be straightforward. Not so sure about PCI because there is
all the PCI function logic around the hotunplug of function 0.

As for LMBs, we block further attempts because there is no way we can tell
if the hotunplug is being executed but it is taking some time (it is not
uncommon for a DIMM unplug to take 20-30 seconds to complete), versus
an error scenario. What we do ATM is check is the pending DIMM unplug
state exists, and if it does, assume that a hotunplug is pending. I have
no idea what would happen if an unplug request for a LMB DRC reaches the
kernel in the middle of an error rollback (when the kernel reconnects all
the LMBs again) and the same DRC that was rolled back is disconnected
again.

We would need to check not only if the pending dimm unplug state exists, but
also if partially exists. In other words, if there are DRCs of that DIMM that
were unplugged already. That way we can prevent to issue a removal while
the unplug is still running.


Thanks,


DHB

> 
>>   }
>>   
>>   int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] spapr.c: always pulse guest IRQ in spapr_core_unplug_request()
  2021-04-12 19:27     ` Daniel Henrique Barboza
@ 2021-04-20  1:24       ` David Gibson
  0 siblings, 0 replies; 6+ messages in thread
From: David Gibson @ 2021-04-20  1:24 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: qemu-ppc, qemu-devel, groug

[-- Attachment #1: Type: text/plain, Size: 4517 bytes --]

On Mon, Apr 12, 2021 at 04:27:43PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 3/31/21 11:37 PM, David Gibson wrote:
> > On Wed, Mar 31, 2021 at 09:04:37PM -0300, Daniel Henrique Barboza wrote:
> > > Commit 47c8c915b162 fixed a problem where multiple spapr_drc_detach()
> > > requests were breaking QEMU. The solution was to just spapr_drc_detach()
> > > once, and use spapr_drc_unplug_requested() to filter whether we already
> > > detached it or not. The commit also tied the hotplug request to the
> > > guest in the same condition.
> > > 
> > > Turns out that there is a reliable way for a CPU hotunplug to fail. If a
> > > guest with one CPU hotplugs a CPU1, then offline CPU0s via 'echo 0 >
> > > /sys/devices/system/cpu/cpu0/online', then attempts to hotunplug CPU1,
> > > the kernel will refuse it because it's the last online CPU of the
> > > system. Given that we're pulsing the IRQ only in the first try, in a
> > > failed attempt, all other CPU1 hotunplug attempts will fail, regardless
> > > of the online state of CPU1 in the kernel, because we're simply not
> > > letting the guest know that we want to hotunplug the device.
> > > 
> > > Let's move spapr_hotplug_req_remove_by_index() back out of the "if
> > > (!spapr_drc_unplug_requested(drc))" conditional, allowing for multiple
> > > 'device_del' requests to the same CPU core to reach the guest, in case
> > > the CPU core didn't fully hotunplugged previously.
> > > 
> > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> > 
> > I've applied these to ppc-for-6.0, but..
> > 
> > > ---
> > >   hw/ppc/spapr.c | 11 ++++++++++-
> > >   1 file changed, 10 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 05a765fab4..e4be00b732 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -3777,8 +3777,17 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
> > >       if (!spapr_drc_unplug_requested(drc)) {
> > >           spapr_drc_unplug_request(drc);
> > > -        spapr_hotplug_req_remove_by_index(drc);
> > >       }
> > > +
> > > +    /*
> > > +     * spapr_hotplug_req_remove_by_index is left unguarded, out of the
> > > +     * "!spapr_drc_unplug_requested" check, to allow for multiple IRQ
> > > +     * pulses removing the same CPU. Otherwise, in an failed hotunplug
> > > +     * attempt (e.g. the kernel will refuse to remove the last online
> > > +     * CPU), we will never attempt it again because unplug_requested
> > > +     * will still be 'true' in that case.
> > > +     */
> > > +    spapr_hotplug_req_remove_by_index(drc);
> > 
> > I think we need similar changes for all the other unplug types (LMB,
> > PCI, PHB) - basically retries should always be allowed, and at worst
> > be a no-op, rather than generating an error like they do now.
> 
> 
> For PHBs should be straightforward. Not so sure about PCI because there is
> all the PCI function logic around the hotunplug of function 0.
> 
> As for LMBs, we block further attempts because there is no way we can tell
> if the hotunplug is being executed but it is taking some time (it is not
> uncommon for a DIMM unplug to take 20-30 seconds to complete), versus
> an error scenario.

I don't see why that prevents retries.  Can't you reissue the
index+size unplug request anyway?  The code you already have to fail
unplugs on a reconfigure should work for both the original request and
the retry, shouldn't it?


> What we do ATM is check is the pending DIMM unplug
> state exists, and if it does, assume that a hotunplug is pending. I have
> no idea what would happen if an unplug request for a LMB DRC reaches the
> kernel in the middle of an error rollback (when the kernel reconnects all
> the LMBs again) and the same DRC that was rolled back is disconnected
> again.
> 
> We would need to check not only if the pending dimm unplug state exists, but
> also if partially exists. In other words, if there are DRCs of that DIMM that
> were unplugged already. That way we can prevent to issue a removal while
> the unplug is still running.
> 
> 
> Thanks,
> 
> 
> DHB
> 
> > 
> > >   }
> > >   int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-04-20  1:34 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-01  0:04 [PATCH 0/2] pSeries: revert CPU unplug timeout Daniel Henrique Barboza
2021-04-01  0:04 ` [PATCH 1/2] spapr: rollback 'unplug timeout' for CPU hotunplugs Daniel Henrique Barboza
2021-04-01  0:04 ` [PATCH 2/2] spapr.c: always pulse guest IRQ in spapr_core_unplug_request() Daniel Henrique Barboza
2021-04-01  2:37   ` David Gibson
2021-04-12 19:27     ` Daniel Henrique Barboza
2021-04-20  1:24       ` David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).