[Qemu-devel] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug
@ 2017-08-25 21:11 Daniel Henrique Barboza
  2017-08-25 21:11 ` Daniel Henrique Barboza
  2017-08-29  7:23 ` David Gibson
  0 siblings, 2 replies; 6+ messages in thread
From: Daniel Henrique Barboza @ 2017-08-25 21:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david, mdroth

v2:
- rebased with ppc-for-2.11
- function 'spapr_cas_completed' dropped
- function 'spapr_drc_needed' made public and it's now used inside
  'spapr_hotplugged_dev_before_cas'
- 'spapr_drc_needed' was changed to support the migration of logical
  DRCs with devs attached in UNUSED state
- new function: 'spapr_clear_pending_events'. This function is used
  inside ppc_spapr_reset to reset the pending_events QTAILQ

Daniel Henrique Barboza (1):
  hw/ppc: CAS reset on early device hotplug

 hw/ppc/spapr.c             | 34 ++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_drc.c         |  5 ++---
 include/hw/ppc/spapr_drc.h |  1 +
 3 files changed, 37 insertions(+), 3 deletions(-)

-- 
2.9.4

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug
  2017-08-25 21:11 [Qemu-devel] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug Daniel Henrique Barboza
@ 2017-08-25 21:11 ` Daniel Henrique Barboza
  2017-08-29  6:45   ` David Gibson
  2017-08-29  7:23 ` David Gibson
  1 sibling, 1 reply; 6+ messages in thread
From: Daniel Henrique Barboza @ 2017-08-25 21:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david, mdroth

This patch is a follow up on the discussions made in patch
"hw/ppc: disable hotplug before CAS is completed" that can be
found at [1].

At this moment, we do not support CPU/memory hotplug in early
boot stages, before CAS. When a hotplug occurs, the event is logged
in an internal RTAS event log queue and an IRQ pulse is fired. In
regular conditions, the guest handles the interrupt by executing
check_exception, fetching the generated hotplug event and enabling
the device for use.

In early boot, this IRQ isn't caught (SLOF does not handle hotplug
events), leaving the event in the rtas event log queue. If the guest
executes check_exception due to another hotplug event, the re-assertion
of the IRQ ends up de-queuing the first hotplug event as well. In short,
a device hotplugged before CAS is considered coldplugged by SLOF.
This leads to device misbehavior and, in some cases, guest kernel
Ooops when trying to unplug the device.

A proper fix would be to turn every device hotplugged before CAS
as a colplugged device. This is not trivial to do with the current
code base though - the FDT is written in the guest memory at
ppc_spapr_reset and can't be retrieved without adding extra state
(fdt_size for example) that will need to managed and migrated. Adding
the hotplugged DT in the middle of CAS negotiation via the updated DT
tree works with CPU devs, but panics the guest kernel at boot. Additional
analysis would be necessary for LMBs and PCI devices. There are
questions to be made in QEMU/SLOF/kernel level about how we can make
this change in a sustainable way.

Until we go all the way with the proper fix, this patch works around
the situation by issuing a CAS reset if a hotplugged device is detected
during CAS:

- the DRC conditions that warrant a CAS reset is the same as those that
triggers a DRC migration - the DRC must have a device attached and
the DRC state is not equal to its ready_state. With that in mind, this
patch makes use of 'spapr_drc_needed' to determine if a CAS reset
is needed.

- In the middle of CAS negotiations, the function
'spapr_hotplugged_dev_before_cas' goes through all the DRCs to see
if there are any DRC that requires a reset, using spapr_drc_needed. If
that happens, returns '1' in 'spapr_h_cas_compose_response' which will set
spapr->cas_reboot to true, causing the machine to reboot.

- a small fix was made in 'spapr_drc_needed' to change how we detect
a DRC device. Using dr_entity_sense worked for physical DRCs but,
for logical DRCs, it didn't cover the case where a logical DRC has
a drc->dev but the state is LOGICAL_UNUSABLE (e.g. a hotplugged CPU before
CAS). In this case, the dr_entity_sense of this DRC returns UNUSABLE and
spapr_drc_needed was return 'false' for a scenario what we would like
to migrate the DRC (or issue a CAS reset). Changing it to check for
drc->dev instead works for all DRC types.

- a new function called 'spapr_clear_pending_events' was created
and is being called inside ppc_spapr_reset. This function clears
the pending_events QTAILQ that holds the RTAS event logs. This prevents
old/deprecated events from persisting after a reset.

No changes are made for coldplug devices.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg02855.html

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c             | 34 ++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_drc.c         |  5 ++---
 include/hw/ppc/spapr_drc.h |  1 +
 3 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index fb1e5e0..4b23ad3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -790,6 +790,26 @@ out:
     return ret;
 }
 
+static bool spapr_hotplugged_dev_before_cas(void)
+{
+    Object *drc_container, *obj;
+    ObjectProperty *prop;
+    ObjectPropertyIterator iter;
+
+    drc_container = container_get(object_get_root(), "/dr-connector");
+    object_property_iter_init(&iter, drc_container);
+    while ((prop = object_property_iter_next(&iter))) {
+        if (!strstart(prop->type, "link<", NULL)) {
+            continue;
+        }
+        obj = object_property_get_link(drc_container, prop->name, NULL);
+        if (spapr_drc_needed(obj)) {
+            return true;
+        }
+    }
+    return false;
+}
+
 int spapr_h_cas_compose_response(sPAPRMachineState *spapr,
                                  target_ulong addr, target_ulong size,
                                  sPAPROptionVector *ov5_updates)
@@ -797,6 +817,10 @@ int spapr_h_cas_compose_response(sPAPRMachineState *spapr,
     void *fdt, *fdt_skel;
     sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
 
+    if (spapr_hotplugged_dev_before_cas()) {
+        return 1;
+    }
+
     size -= sizeof(hdr);
 
     /* Create sceleton */
@@ -1369,6 +1393,15 @@ static void find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
     }
 }
 
+static void spapr_clear_pending_events(sPAPRMachineState *spapr)
+{
+    sPAPREventLogEntry *entry = NULL;
+
+    QTAILQ_FOREACH(entry, &spapr->pending_events, next) {
+        QTAILQ_REMOVE(&spapr->pending_events, entry, next);
+    }
+}
+
 static void ppc_spapr_reset(void)
 {
     MachineState *machine = MACHINE(qdev_get_machine());
@@ -1392,6 +1425,7 @@ static void ppc_spapr_reset(void)
     }
 
     qemu_devices_reset();
+    spapr_clear_pending_events(spapr);
 
     /*
      * We place the device tree and RTAS just below either the top of the RMA,
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 6541a52..915e9b5 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -460,14 +460,13 @@ static void drc_reset(void *opaque)
     spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque));
 }
 
-static bool spapr_drc_needed(void *opaque)
+bool spapr_drc_needed(void *opaque)
 {
     sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
     sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
-    sPAPRDREntitySense value = drck->dr_entity_sense(drc);
 
     /* If no dev is plugged in there is no need to migrate the DRC state */
-    if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
+    if (!drc->dev) {
         return false;
     }
 
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index a7958d0..f8d9f5b 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -257,6 +257,7 @@ int spapr_drc_populate_dt(void *fdt, int fdt_offset, Object *owner,
 void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
                       int fdt_start_offset, Error **errp);
 void spapr_drc_detach(sPAPRDRConnector *drc);
+bool spapr_drc_needed(void *opaque);
 
 static inline bool spapr_drc_unplug_requested(sPAPRDRConnector *drc)
 {
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug
  2017-08-25 21:11 ` Daniel Henrique Barboza
@ 2017-08-29  6:45   ` David Gibson
  0 siblings, 0 replies; 6+ messages in thread
From: David Gibson @ 2017-08-29  6:45 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: qemu-devel, qemu-ppc, mdroth

[-- Attachment #1: Type: text/plain, Size: 8067 bytes --]

On Fri, Aug 25, 2017 at 06:11:19PM -0300, Daniel Henrique Barboza wrote:
> This patch is a follow up on the discussions made in patch
> "hw/ppc: disable hotplug before CAS is completed" that can be
> found at [1].
> 
> At this moment, we do not support CPU/memory hotplug in early
> boot stages, before CAS. When a hotplug occurs, the event is logged
> in an internal RTAS event log queue and an IRQ pulse is fired. In
> regular conditions, the guest handles the interrupt by executing
> check_exception, fetching the generated hotplug event and enabling
> the device for use.
> 
> In early boot, this IRQ isn't caught (SLOF does not handle hotplug
> events), leaving the event in the rtas event log queue. If the guest
> executes check_exception due to another hotplug event, the re-assertion
> of the IRQ ends up de-queuing the first hotplug event as well. In short,
> a device hotplugged before CAS is considered coldplugged by SLOF.
> This leads to device misbehavior and, in some cases, guest kernel
> Ooops when trying to unplug the device.
> 
> A proper fix would be to turn every device hotplugged before CAS
> as a colplugged device. This is not trivial to do with the current
> code base though - the FDT is written in the guest memory at
> ppc_spapr_reset and can't be retrieved without adding extra state
> (fdt_size for example) that will need to managed and migrated. Adding
> the hotplugged DT in the middle of CAS negotiation via the updated DT
> tree works with CPU devs, but panics the guest kernel at boot. Additional
> analysis would be necessary for LMBs and PCI devices. There are
> questions to be made in QEMU/SLOF/kernel level about how we can make
> this change in a sustainable way.
> 
> Until we go all the way with the proper fix, this patch works around
> the situation by issuing a CAS reset if a hotplugged device is detected
> during CAS:
> 
> - the DRC conditions that warrant a CAS reset is the same as those that
> triggers a DRC migration - the DRC must have a device attached and
> the DRC state is not equal to its ready_state. With that in mind, this
> patch makes use of 'spapr_drc_needed' to determine if a CAS reset
> is needed.
> 
> - In the middle of CAS negotiations, the function
> 'spapr_hotplugged_dev_before_cas' goes through all the DRCs to see
> if there are any DRC that requires a reset, using spapr_drc_needed. If
> that happens, returns '1' in 'spapr_h_cas_compose_response' which will set
> spapr->cas_reboot to true, causing the machine to reboot.
> 
> - a small fix was made in 'spapr_drc_needed' to change how we detect
> a DRC device. Using dr_entity_sense worked for physical DRCs but,
> for logical DRCs, it didn't cover the case where a logical DRC has
> a drc->dev but the state is LOGICAL_UNUSABLE (e.g. a hotplugged CPU before
> CAS). In this case, the dr_entity_sense of this DRC returns UNUSABLE and
> spapr_drc_needed was return 'false' for a scenario what we would like
> to migrate the DRC (or issue a CAS reset). Changing it to check for
> drc->dev instead works for all DRC types.
> 
> - a new function called 'spapr_clear_pending_events' was created
> and is being called inside ppc_spapr_reset. This function clears
> the pending_events QTAILQ that holds the RTAS event logs. This prevents
> old/deprecated events from persisting after a reset.
> 
> No changes are made for coldplug devices.
> 
> [1] http://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg02855.html
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>

Sorry I've taken a while to review.

> ---
>  hw/ppc/spapr.c             | 34 ++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_drc.c         |  5 ++---
>  include/hw/ppc/spapr_drc.h |  1 +
>  3 files changed, 37 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index fb1e5e0..4b23ad3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -790,6 +790,26 @@ out:
>      return ret;
>  }
>  
> +static bool spapr_hotplugged_dev_before_cas(void)
> +{
> +    Object *drc_container, *obj;
> +    ObjectProperty *prop;
> +    ObjectPropertyIterator iter;
> +
> +    drc_container = container_get(object_get_root(), "/dr-connector");
> +    object_property_iter_init(&iter, drc_container);
> +    while ((prop = object_property_iter_next(&iter))) {
> +        if (!strstart(prop->type, "link<", NULL)) {
> +            continue;
> +        }
> +        obj = object_property_get_link(drc_container, prop->name, NULL);
> +        if (spapr_drc_needed(obj)) {
> +            return true;
> +        }
> +    }
> +    return false;
> +}
> +
>  int spapr_h_cas_compose_response(sPAPRMachineState *spapr,
>                                   target_ulong addr, target_ulong size,
>                                   sPAPROptionVector *ov5_updates)
> @@ -797,6 +817,10 @@ int spapr_h_cas_compose_response(sPAPRMachineState *spapr,
>      void *fdt, *fdt_skel;
>      sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
>  
> +    if (spapr_hotplugged_dev_before_cas()) {
> +        return 1;
> +    }
> +
>      size -= sizeof(hdr);
>  
>      /* Create sceleton */
> @@ -1369,6 +1393,15 @@ static void find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
>      }
>  }
>  
> +static void spapr_clear_pending_events(sPAPRMachineState *spapr)
> +{
> +    sPAPREventLogEntry *entry = NULL;
> +
> +    QTAILQ_FOREACH(entry, &spapr->pending_events, next) {
> +        QTAILQ_REMOVE(&spapr->pending_events, entry, next);
> +    }
> +}

Ah.. unfortunately this isn't right.  You need to free each of the
events removed, as well as removing them from the list.  Since that
will require a respin anyway, can you make the following more minor
changes:
  * Move this helper function to spapr_events.c
  * Make this a preliminary patch - we should be clearing the queue at
    system reset even without the complicating case of eearly hotplug

>  static void ppc_spapr_reset(void)
>  {
>      MachineState *machine = MACHINE(qdev_get_machine());
> @@ -1392,6 +1425,7 @@ static void ppc_spapr_reset(void)
>      }
>  
>      qemu_devices_reset();
> +    spapr_clear_pending_events(spapr);
>  
>      /*
>       * We place the device tree and RTAS just below either the top of the RMA,
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 6541a52..915e9b5 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -460,14 +460,13 @@ static void drc_reset(void *opaque)
>      spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque));
>  }
>  
> -static bool spapr_drc_needed(void *opaque)
> +bool spapr_drc_needed(void *opaque)
>  {
>      sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
>      sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> -    sPAPRDREntitySense value = drck->dr_entity_sense(drc);
>  
>      /* If no dev is plugged in there is no need to migrate the DRC state */
> -    if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
> +    if (!drc->dev) {

Nice.  I'm not sure why on earth I did that weird indirect via
entity_sense in the first place.  Just using drc->dev is a much better
idea.
>          return false;
>      }
>  
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index a7958d0..f8d9f5b 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -257,6 +257,7 @@ int spapr_drc_populate_dt(void *fdt, int fdt_offset, Object *owner,
>  void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
>                        int fdt_start_offset, Error **errp);
>  void spapr_drc_detach(sPAPRDRConnector *drc);
> +bool spapr_drc_needed(void *opaque);
>  
>  static inline bool spapr_drc_unplug_requested(sPAPRDRConnector *drc)
>  {

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug
  2017-08-25 21:11 [Qemu-devel] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug Daniel Henrique Barboza
  2017-08-25 21:11 ` Daniel Henrique Barboza
@ 2017-08-29  7:23 ` David Gibson
  2017-08-29 20:54   ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
  1 sibling, 1 reply; 6+ messages in thread
From: David Gibson @ 2017-08-29  7:23 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: qemu-devel, qemu-ppc, mdroth

[-- Attachment #1: Type: text/plain, Size: 1476 bytes --]

On Fri, Aug 25, 2017 at 06:11:18PM -0300, Daniel Henrique Barboza wrote:
> v2:
> - rebased with ppc-for-2.11
> - function 'spapr_cas_completed' dropped
> - function 'spapr_drc_needed' made public and it's now used inside
>   'spapr_hotplugged_dev_before_cas'
> - 'spapr_drc_needed' was changed to support the migration of logical
>   DRCs with devs attached in UNUSED state
> - new function: 'spapr_clear_pending_events'. This function is used
>   inside ppc_spapr_reset to reset the pending_events QTAILQ

Thanks for the followup, unfortunately there is still an important bug
left, see comments on the patch itself.

At a higher level, though, looking at the event reset code made me
think of a possible even simpler solution to this problem.

The queue of events (both hotplug and epow) is already in a simple
internal form that's independent of the two delivery mechanisms.  The
only difference is what event source triggers the interrupt.  This
explains why an extra hotplug event after the CAS "unstuck" the queue.

AFAICT, a spurious interrupts here should be harmless - the kernel
will just check the queue and find nothing there.

So, it should be sufficient to, after CAS, pulse the hotplug queue
interrupt if the hotplug queue is negotiated.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug
  2017-08-29  7:23 ` David Gibson
@ 2017-08-29 20:54   ` Daniel Henrique Barboza
  2017-08-30  2:07     ` David Gibson
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Henrique Barboza @ 2017-08-29 20:54 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, mdroth



On 08/29/2017 04:23 AM, David Gibson wrote:
> On Fri, Aug 25, 2017 at 06:11:18PM -0300, Daniel Henrique Barboza wrote:
>> v2:
>> - rebased with ppc-for-2.11
>> - function 'spapr_cas_completed' dropped
>> - function 'spapr_drc_needed' made public and it's now used inside
>>    'spapr_hotplugged_dev_before_cas'
>> - 'spapr_drc_needed' was changed to support the migration of logical
>>    DRCs with devs attached in UNUSED state
>> - new function: 'spapr_clear_pending_events'. This function is used
>>    inside ppc_spapr_reset to reset the pending_events QTAILQ
> Thanks for the followup, unfortunately there is still an important bug
> left, see comments on the patch itself.
>
> At a higher level, though, looking at the event reset code made me
> think of a possible even simpler solution to this problem.
>
> The queue of events (both hotplug and epow) is already in a simple
> internal form that's independent of the two delivery mechanisms.  The
> only difference is what event source triggers the interrupt.  This
> explains why an extra hotplug event after the CAS "unstuck" the queue.
>
> AFAICT, a spurious interrupts here should be harmless - the kernel
> will just check the queue and find nothing there.
>
> So, it should be sufficient to, after CAS, pulse the hotplug queue
> interrupt if the hotplug queue is negotiated.
>
This is something I've tried in my first attempts at this problem, before
sending the first patch in which I blocked hotplug before CAS. Back then,
the problem was that the kernel panics with sig 11 (acess of bad area) when
receiving the pulse after CAS.

I've investigated it a bit today and it seems that it still the case. 
Firing an IRQ right
after CAS breaks the kernel. In fact, if you time a regular CPU hotplug 
right after
CAS you'll get the same sig 11 kernel ooops. It looks like there is a 
time window after
CAS that the kernel can't handle the hotplug process and pulsing the hotplug
queue in this window breaks the guest. I've tried some hacks such as 
pulsing the queue
in the first 'event_scan' call made by the guest, but apparently it is 
still too early.

I've sent an email to the linuxppc-dev mailing list talking about this 
behavior
and asking if there is a reliable way to know when  we can safely pulse 
the hotplug
queue. Meanwhile, I'll keep working in the v3 respin of this patch in 
case this
solution of pulsing the hotplug queue ends up being not feasible.


Thanks,


Daniel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug
  2017-08-29 20:54   ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
@ 2017-08-30  2:07     ` David Gibson
  0 siblings, 0 replies; 6+ messages in thread
From: David Gibson @ 2017-08-30  2:07 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: qemu-ppc, qemu-devel, mdroth

[-- Attachment #1: Type: text/plain, Size: 3271 bytes --]

On Tue, Aug 29, 2017 at 05:54:28PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 08/29/2017 04:23 AM, David Gibson wrote:
> > On Fri, Aug 25, 2017 at 06:11:18PM -0300, Daniel Henrique Barboza wrote:
> > > v2:
> > > - rebased with ppc-for-2.11
> > > - function 'spapr_cas_completed' dropped
> > > - function 'spapr_drc_needed' made public and it's now used inside
> > >    'spapr_hotplugged_dev_before_cas'
> > > - 'spapr_drc_needed' was changed to support the migration of logical
> > >    DRCs with devs attached in UNUSED state
> > > - new function: 'spapr_clear_pending_events'. This function is used
> > >    inside ppc_spapr_reset to reset the pending_events QTAILQ
> > Thanks for the followup, unfortunately there is still an important bug
> > left, see comments on the patch itself.
> > 
> > At a higher level, though, looking at the event reset code made me
> > think of a possible even simpler solution to this problem.
> > 
> > The queue of events (both hotplug and epow) is already in a simple
> > internal form that's independent of the two delivery mechanisms.  The
> > only difference is what event source triggers the interrupt.  This
> > explains why an extra hotplug event after the CAS "unstuck" the queue.
> > 
> > AFAICT, a spurious interrupts here should be harmless - the kernel
> > will just check the queue and find nothing there.
> > 
> > So, it should be sufficient to, after CAS, pulse the hotplug queue
> > interrupt if the hotplug queue is negotiated.
> > 
> This is something I've tried in my first attempts at this problem, before
> sending the first patch in which I blocked hotplug before CAS. Back then,
> the problem was that the kernel panics with sig 11 (acess of bad area) when
> receiving the pulse after CAS.

Huh.

> I've investigated it a bit today and it seems that it still the case. Firing
> an IRQ right
> after CAS breaks the kernel. In fact, if you time a regular CPU hotplug
> right after
> CAS you'll get the same sig 11 kernel ooops. It looks like there is a time
> window after
> CAS that the kernel can't handle the hotplug process and pulsing the hotplug
> queue in this window breaks the guest. I've tried some hacks such as pulsing
> the queue
> in the first 'event_scan' call made by the guest, but apparently it is still
> too early.
> 
> I've sent an email to the linuxppc-dev mailing list talking about this
> behavior
> and asking if there is a reliable way to know when  we can safely pulse the
> hotplug
> queue. Meanwhile, I'll keep working in the v3 respin of this patch in case
> this
> solution of pulsing the hotplug queue ends up being not feasible.

Right.  As Ben's reply says that definitely looks like a guest kernel
bug.  But, it's in enough kernels in the wild that we really need to
work around it anyway.  I think the reset-at-CAS approach is our best
bet to accomplish that at this stage.

Note that the clear-queue-at-reset preliminary cleanup will be
valuable even if we end up not needing the rest of the reset at CAS
stuff.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-08-30  2:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-25 21:11 [Qemu-devel] [PATCH for-2.11 v2] hw/ppc: CAS reset on early device hotplug Daniel Henrique Barboza
2017-08-25 21:11 ` Daniel Henrique Barboza
2017-08-29  6:45   ` David Gibson
2017-08-29  7:23 ` David Gibson
2017-08-29 20:54   ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
2017-08-30  2:07     ` David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.