* [PATCH v2 for-5.2 0/5] papr: Cleanups for XIVE and PHB
@ 2020-08-06 16:55 Greg Kurz
  2020-08-06 16:56 ` [PATCH v2 for-5.2 1/5] spapr/xive: Fix xive->fd if kvm_create_device() fails Greg Kurz
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Greg Kurz @ 2020-08-06 16:55 UTC (permalink / raw)
  To: David Gibson
  Cc: Daniel Henrique Barboza, qemu-ppc, Cédric Le Goater, qemu-devel

Recent cleanup patch "spapr: Simplify error handling in spapr_phb_realize"
had to be dropped from ppc-for-5.2 because it would cause QEMU to crash
at init time on some POWER9 setups (eg. Boston systems), as reported by
Daniel.

The crash was happening because the kvmppc_xive_source_reset_one() function
would get called at some point (eg. initializing the LSI table of PHB0) and
fail (because XIVE KVM isn't supported on Bostons) without calling
error_setg(), which the caller doesn't expect when the patch above is applied.

The real question isn't the missing call to error_setg() but why we end
up trying to claim an IRQ number in a XIVE KVM device that doesn't
exist. The root cause is that we guard calls to the XIVE KVM code with
kvm_irqchip_in_kernel(), which might return true when the XICS KVM
device is active, even though the XIVE one is not. This series upgrades
the guarding code to also check if the device is actually open.
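
To make the root cause concrete, here is a minimal standalone sketch of
the two guarding styles. All names (DemoDevice, demo_irqchip_in_kernel,
demo_guard_global, demo_guard_device) are hypothetical stand-ins and not
QEMU symbols; the only assumption carried over from the series is that a
device fd of -1 means "not open".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global "an in-kernel irqchip exists" flag. */
static bool demo_irqchip_in_kernel;

/* Hypothetical device: fd is an open KVM device fd, or -1 when not open. */
typedef struct DemoDevice {
    int fd;
} DemoDevice;

/* Old-style guard: consults the global flag only, whatever the device. */
static bool demo_guard_global(const DemoDevice *dev)
{
    (void)dev;
    return demo_irqchip_in_kernel;
}

/* New-style guard: global flag AND this particular device is open. */
static bool demo_guard_device(const DemoDevice *dev)
{
    return demo_irqchip_in_kernel && dev->fd != -1;
}
```

With one device open and the other not, the global-only guard answers
true for both, while the per-device guard answers true only for the one
that actually has an fd.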

A similar cleanup could be performed on XICS.

v2: - patch 1 and 2 already applied but not yet visible on github
    - new approach with abstract methods in the base XIVE classes

---

Greg Kurz (5):
      spapr/xive: Fix xive->fd if kvm_create_device() fails
      spapr/xive: Simplify kvmppc_xive_disconnect()
      ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers
      spapr/xive: Convert KVM device fd checks to assert()
      spapr: Simplify error handling in spapr_phb_realize()


 hw/intc/spapr_xive.c     |   53 ++++++++++++++++++++++++++++++++++------------
 hw/intc/spapr_xive_kvm.c |   49 ++++++++++++-------------------------------
 hw/intc/xive.c           |   28 ++++++++++++++++++------
 hw/ppc/spapr_pci.c       |   16 ++++++--------
 include/hw/ppc/xive.h    |    2 ++
 5 files changed, 83 insertions(+), 65 deletions(-)

--
Greg




* [PATCH v2 for-5.2 1/5] spapr/xive: Fix xive->fd if kvm_create_device() fails
  2020-08-06 16:55 [PATCH v2 for-5.2 0/5] papr: Cleanups for XIVE and PHB Greg Kurz
@ 2020-08-06 16:56 ` Greg Kurz
  2020-08-06 16:56 ` [PATCH v2 for-5.2 2/5] spapr/xive: Simplify kvmppc_xive_disconnect() Greg Kurz
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Greg Kurz @ 2020-08-06 16:56 UTC (permalink / raw)
  To: David Gibson
  Cc: Daniel Henrique Barboza, qemu-ppc, Cédric Le Goater, qemu-devel

If the creation of the KVM XIVE device fails for some reason, the
negative errno ends up in xive->fd, but the rest of the code assumes
that xive->fd either contains an open fd, ie. positive value, or -1.

This doesn't cause any misbehavior, except that kvmppc_xive_disconnect()
will try to close(xive->fd) during rollback and likely be rewarded
with an EBADF.

Only set xive->fd with an open fd.
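
The pattern can be sketched in isolation as follows. This is only an
illustration: DemoXive, demo_create_device() and demo_connect() are
hypothetical names, and the creator is assumed to return either a usable
fd or a negative errno, as the message above describes.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device state: fd is an open fd, or the -1 sentinel. */
typedef struct DemoXive {
    int fd;
} DemoXive;

/* Hypothetical creator: returns an fd on success, negative errno on failure. */
static int demo_create_device(int should_fail)
{
    return should_fail ? -ENODEV : 42;
}

/* Keep the result in a local and commit it to xive->fd only on success. */
static int demo_connect(DemoXive *xive, int should_fail)
{
    int fd = demo_create_device(should_fail);

    if (fd < 0) {
        return -1;          /* xive->fd still holds the -1 sentinel */
    }
    xive->fd = fd;
    return 0;
}
```

After a failed demo_connect(), the state still satisfies the "open fd or
-1" invariant, so a later rollback has nothing bogus to close.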

Signed-off-by: Greg Kurz <groug@kaod.org>
---
v2: Already applied to ppc-for-5.2 but not yet visible on github
---
 hw/intc/spapr_xive_kvm.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index edb7ee0e74f1..d55ea4670e0e 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -745,6 +745,7 @@ int kvmppc_xive_connect(SpaprInterruptController *intc, uint32_t nr_servers,
     size_t esb_len = (1ull << xsrc->esb_shift) * xsrc->nr_irqs;
     size_t tima_len = 4ull << TM_SHIFT;
     CPUState *cs;
+    int fd;
 
     /*
      * The KVM XIVE device already in use. This is the case when
@@ -760,11 +761,12 @@ int kvmppc_xive_connect(SpaprInterruptController *intc, uint32_t nr_servers,
     }
 
     /* First, create the KVM XIVE device */
-    xive->fd = kvm_create_device(kvm_state, KVM_DEV_TYPE_XIVE, false);
-    if (xive->fd < 0) {
-        error_setg_errno(errp, -xive->fd, "XIVE: error creating KVM device");
+    fd = kvm_create_device(kvm_state, KVM_DEV_TYPE_XIVE, false);
+    if (fd < 0) {
+        error_setg_errno(errp, -fd, "XIVE: error creating KVM device");
         return -1;
     }
+    xive->fd = fd;
 
     /* Tell KVM about the # of VCPUs we may have */
     if (kvm_device_check_attr(xive->fd, KVM_DEV_XIVE_GRP_CTRL,





* [PATCH v2 for-5.2 2/5] spapr/xive: Simplify kvmppc_xive_disconnect()
  2020-08-06 16:55 [PATCH v2 for-5.2 0/5] papr: Cleanups for XIVE and PHB Greg Kurz
  2020-08-06 16:56 ` [PATCH v2 for-5.2 1/5] spapr/xive: Fix xive->fd if kvm_create_device() fails Greg Kurz
@ 2020-08-06 16:56 ` Greg Kurz
  2020-08-06 16:56 ` [PATCH v2 for-5.2 3/5] ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers Greg Kurz
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Greg Kurz @ 2020-08-06 16:56 UTC (permalink / raw)
  To: David Gibson
  Cc: Daniel Henrique Barboza, qemu-ppc, Cédric Le Goater, qemu-devel

Since this function begins with:

    /* The KVM XIVE device is not in use */
    if (!xive || xive->fd == -1) {
        return;
    }

we obviously don't need to check xive->fd again.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
v2: Already applied to ppc-for-5.2 but not yet visible on github
---
 hw/intc/spapr_xive_kvm.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index d55ea4670e0e..893a1ee77e70 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -873,10 +873,8 @@ void kvmppc_xive_disconnect(SpaprInterruptController *intc)
      * and removed from the list of devices of the VM. The VCPU
      * presenters are also detached from the device.
      */
-    if (xive->fd != -1) {
-        close(xive->fd);
-        xive->fd = -1;
-    }
+    close(xive->fd);
+    xive->fd = -1;
 
     kvm_kernel_irqchip = false;
     kvm_msi_via_irqfd_allowed = false;





* [PATCH v2 for-5.2 3/5] ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers
  2020-08-06 16:55 [PATCH v2 for-5.2 0/5] papr: Cleanups for XIVE and PHB Greg Kurz
  2020-08-06 16:56 ` [PATCH v2 for-5.2 1/5] spapr/xive: Fix xive->fd if kvm_create_device() fails Greg Kurz
  2020-08-06 16:56 ` [PATCH v2 for-5.2 2/5] spapr/xive: Simplify kvmppc_xive_disconnect() Greg Kurz
@ 2020-08-06 16:56 ` Greg Kurz
  2020-08-06 17:55   ` Cédric Le Goater
  2020-08-06 16:56 ` [PATCH v2 for-5.2 4/5] spapr/xive: Convert KVM device fd checks to assert() Greg Kurz
  2020-08-06 16:56 ` [PATCH v2 for-5.2 5/5] spapr: Simplify error handling in spapr_phb_realize() Greg Kurz
  4 siblings, 1 reply; 9+ messages in thread
From: Greg Kurz @ 2020-08-06 16:56 UTC (permalink / raw)
  To: David Gibson
  Cc: Daniel Henrique Barboza, qemu-ppc, Cédric Le Goater, qemu-devel

Calls to the KVM XIVE device are guarded by kvm_irqchip_in_kernel(). This
ensures that QEMU won't try to use the device if KVM is disabled or if
an in-kernel irqchip isn't required.

When using ic-mode=dual with the pseries machine, we have two possible
interrupt controllers: XIVE and XICS. The kvm_irqchip_in_kernel() helper
will return true as soon as any of the KVM devices is created. It might
lure QEMU into thinking that the other one is also around, while it is
not. This is exactly what happens with ic-mode=dual at machine init when
claiming IRQ numbers, which must be done on all possible IRQ backends,
eg. RTAS event sources or the PHB0 LSI table: only the KVM XICS device
is active but we end up calling kvmppc_xive_source_reset_one() anyway,
which fails. This doesn't cause any trouble because of another bug:
kvmppc_xive_source_reset_one() lacks an error_setg() and callers don't
see the failure.

Most of the other kvmppc_xive_* functions have similar xive->fd
checks to filter out the case when KVM XIVE isn't active. It
might look safer to have idempotent functions but it doesn't
really help to understand what's going on when debugging.

Since we already have all the kvm_irqchip_in_kernel() calls in place,
have the callers check xive->fd as well before calling KVM XIVE
specific code. This is straightforward for the spapr specific XIVE
code. Some more care is needed for the platform agnostic XIVE code
since it cannot access xive->fd directly. Introduce new in_kernel()
methods in some base XIVE classes for this purpose and implement
them only in spapr.

In all cases, we still need to call kvm_irqchip_in_kernel() so that
compilers can optimize the kvmppc_xive_* calls away when CONFIG_KVM
isn't defined, thus avoiding the need for stubs.
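
The optional-method idea can be sketched outside of QOM as below. This
is a simplified illustration, not the patch's implementation: the
DemoPresenter types, demo_global_irqchip and the helper names are all
hypothetical, and a plain struct of function pointers stands in for a
QOM class. The assumption is only that a class which leaves in_kernel
unset must be treated as "not in kernel".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoPresenter DemoPresenter;

/* Hypothetical class: in_kernel is optional and may be left NULL. */
typedef struct DemoPresenterClass {
    bool (*in_kernel)(const DemoPresenter *p);
} DemoPresenterClass;

struct DemoPresenter {
    const DemoPresenterClass *klass;
    int fd;                     /* only meaningful to the spapr-like class */
};

/* Hypothetical stand-in for the global in-kernel irqchip flag. */
static bool demo_global_irqchip = true;

/* Implementation provided by the one class that owns a KVM device fd. */
static bool demo_spapr_in_kernel(const DemoPresenter *p)
{
    return p->fd != -1;
}

/* Generic code: global flag first, then the method if the class has one. */
static bool demo_in_kernel(const DemoPresenter *p)
{
    return demo_global_irqchip &&
           p->klass->in_kernel != NULL &&
           p->klass->in_kernel(p);
}
```

Checking the global flag first keeps the short-circuit behavior the
message relies on: when that flag is a compile-time false, the rest of
the expression is dead code.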

Signed-off-by: Greg Kurz <groug@kaod.org>
---
v2: Introduce in_kernel() abstract methods in the base XIVE classes
---
 hw/intc/spapr_xive.c  |   53 ++++++++++++++++++++++++++++++++++++-------------
 hw/intc/xive.c        |   28 +++++++++++++++++++-------
 include/hw/ppc/xive.h |    2 ++
 3 files changed, 62 insertions(+), 21 deletions(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 89c8cd96670b..cd001c580e89 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -148,12 +148,19 @@ static void spapr_xive_end_pic_print_info(SpaprXive *xive, XiveEND *end,
     xive_end_queue_pic_print_info(end, 6, mon);
 }
 
+/*
+ * kvm_irqchip_in_kernel() will cause the compiler to turn this
+ * into a nop if CONFIG_KVM isn't defined.
+ */
+#define spapr_xive_in_kernel(xive) \
+    (kvm_irqchip_in_kernel() && (xive)->fd != -1)
+
 void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon)
 {
     XiveSource *xsrc = &xive->source;
     int i;
 
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         Error *local_err = NULL;
 
         kvmppc_xive_synchronize_state(xive, &local_err);
@@ -507,8 +514,10 @@ static const VMStateDescription vmstate_spapr_xive_eas = {
 
 static int vmstate_spapr_xive_pre_save(void *opaque)
 {
-    if (kvm_irqchip_in_kernel()) {
-        return kvmppc_xive_pre_save(SPAPR_XIVE(opaque));
+    SpaprXive *xive = SPAPR_XIVE(opaque);
+
+    if (spapr_xive_in_kernel(xive)) {
+        return kvmppc_xive_pre_save(xive);
     }
 
     return 0;
@@ -520,8 +529,10 @@ static int vmstate_spapr_xive_pre_save(void *opaque)
  */
 static int spapr_xive_post_load(SpaprInterruptController *intc, int version_id)
 {
-    if (kvm_irqchip_in_kernel()) {
-        return kvmppc_xive_post_load(SPAPR_XIVE(intc), version_id);
+    SpaprXive *xive = SPAPR_XIVE(intc);
+
+    if (spapr_xive_in_kernel(xive)) {
+        return kvmppc_xive_post_load(xive, version_id);
     }
 
     return 0;
@@ -564,7 +575,7 @@ static int spapr_xive_claim_irq(SpaprInterruptController *intc, int lisn,
         xive_source_irq_set_lsi(xsrc, lisn);
     }
 
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         return kvmppc_xive_source_reset_one(xsrc, lisn, errp);
     }
 
@@ -641,7 +652,7 @@ static void spapr_xive_set_irq(SpaprInterruptController *intc, int irq, int val)
 {
     SpaprXive *xive = SPAPR_XIVE(intc);
 
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         kvmppc_xive_source_set_irq(&xive->source, irq, val);
     } else {
         xive_source_set_irq(&xive->source, irq, val);
@@ -749,11 +760,21 @@ static void spapr_xive_deactivate(SpaprInterruptController *intc)
 
     spapr_xive_mmio_set_enabled(xive, false);
 
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         kvmppc_xive_disconnect(intc);
     }
 }
 
+static bool spapr_xive_in_kernel_xptr(const XivePresenter *xptr)
+{
+    return spapr_xive_in_kernel(SPAPR_XIVE(xptr));
+}
+
+static bool spapr_xive_in_kernel_xn(const XiveNotifier *xn)
+{
+    return spapr_xive_in_kernel(SPAPR_XIVE(xn));
+}
+
 static void spapr_xive_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
@@ -761,6 +782,7 @@ static void spapr_xive_class_init(ObjectClass *klass, void *data)
     SpaprInterruptControllerClass *sicc = SPAPR_INTC_CLASS(klass);
     XivePresenterClass *xpc = XIVE_PRESENTER_CLASS(klass);
     SpaprXiveClass *sxc = SPAPR_XIVE_CLASS(klass);
+    XiveNotifierClass *xnc = XIVE_NOTIFIER_CLASS(klass);
 
     dc->desc    = "sPAPR XIVE Interrupt Controller";
     device_class_set_props(dc, spapr_xive_properties);
@@ -788,6 +810,9 @@ static void spapr_xive_class_init(ObjectClass *klass, void *data)
     sicc->post_load = spapr_xive_post_load;
 
     xpc->match_nvt  = spapr_xive_match_nvt;
+    xpc->in_kernel  = spapr_xive_in_kernel_xptr;
+
+    xnc->in_kernel  = spapr_xive_in_kernel_xn;
 }
 
 static const TypeInfo spapr_xive_info = {
@@ -1058,7 +1083,7 @@ static target_ulong h_int_set_source_config(PowerPCCPU *cpu,
         new_eas.w = xive_set_field64(EAS_END_DATA, new_eas.w, eisn);
     }
 
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         Error *local_err = NULL;
 
         kvmppc_xive_set_source_config(xive, lisn, &new_eas, &local_err);
@@ -1379,7 +1404,7 @@ static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
      */
 
 out:
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         Error *local_err = NULL;
 
         kvmppc_xive_set_queue_config(xive, end_blk, end_idx, &end, &local_err);
@@ -1480,7 +1505,7 @@ static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
         args[2] = 0;
     }
 
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         Error *local_err = NULL;
 
         kvmppc_xive_get_queue_config(xive, end_blk, end_idx, end, &local_err);
@@ -1642,7 +1667,7 @@ static target_ulong h_int_esb(PowerPCCPU *cpu,
         return H_P3;
     }
 
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         args[0] = kvmppc_xive_esb_rw(xsrc, lisn, offset, data,
                                      flags & SPAPR_XIVE_ESB_STORE);
     } else {
@@ -1717,7 +1742,7 @@ static target_ulong h_int_sync(PowerPCCPU *cpu,
      * under KVM
      */
 
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         Error *local_err = NULL;
 
         kvmppc_xive_sync_source(xive, lisn, &local_err);
@@ -1761,7 +1786,7 @@ static target_ulong h_int_reset(PowerPCCPU *cpu,
 
     device_legacy_reset(DEVICE(xive));
 
-    if (kvm_irqchip_in_kernel()) {
+    if (spapr_xive_in_kernel(xive)) {
         Error *local_err = NULL;
 
         kvmppc_xive_reset(xive, &local_err);
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index 9b55e0356c62..27d27fdc9ee4 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -592,6 +592,17 @@ static const char * const xive_tctx_ring_names[] = {
     "USER", "OS", "POOL", "PHYS",
 };
 
+/*
+ * kvm_irqchip_in_kernel() will cause the compiler to turn this
+ * into a nop if CONFIG_KVM isn't defined.
+ */
+#define xive_in_kernel(xptr)                                  \
+    (kvm_irqchip_in_kernel() &&                                         \
+     ({                                                                 \
+         XivePresenterClass *xpc = XIVE_PRESENTER_GET_CLASS(xptr);      \
+         xpc->in_kernel ? xpc->in_kernel(xptr) : false;                 \
+     }))
+
 void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon)
 {
     int cpu_index;
@@ -606,7 +617,7 @@ void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon)
 
     cpu_index = tctx->cs ? tctx->cs->cpu_index : -1;
 
-    if (kvm_irqchip_in_kernel()) {
+    if (xive_in_kernel(tctx->xptr)) {
         Error *local_err = NULL;
 
         kvmppc_xive_cpu_synchronize_state(tctx, &local_err);
@@ -671,7 +682,7 @@ static void xive_tctx_realize(DeviceState *dev, Error **errp)
     }
 
     /* Connect the presenter to the VCPU (required for CPU hotplug) */
-    if (kvm_irqchip_in_kernel()) {
+    if (xive_in_kernel(tctx->xptr)) {
         kvmppc_xive_cpu_connect(tctx, &local_err);
         if (local_err) {
             error_propagate(errp, local_err);
@@ -682,10 +693,11 @@ static void xive_tctx_realize(DeviceState *dev, Error **errp)
 
 static int vmstate_xive_tctx_pre_save(void *opaque)
 {
+    XiveTCTX *tctx = XIVE_TCTX(opaque);
     Error *local_err = NULL;
 
-    if (kvm_irqchip_in_kernel()) {
-        kvmppc_xive_cpu_get_state(XIVE_TCTX(opaque), &local_err);
+    if (xive_in_kernel(tctx->xptr)) {
+        kvmppc_xive_cpu_get_state(tctx, &local_err);
         if (local_err) {
             error_report_err(local_err);
             return -1;
@@ -697,14 +709,15 @@ static int vmstate_xive_tctx_pre_save(void *opaque)
 
 static int vmstate_xive_tctx_post_load(void *opaque, int version_id)
 {
+    XiveTCTX *tctx = XIVE_TCTX(opaque);
     Error *local_err = NULL;
 
-    if (kvm_irqchip_in_kernel()) {
+    if (xive_in_kernel(tctx->xptr)) {
         /*
          * Required for hotplugged CPU, for which the state comes
          * after all states of the machine.
          */
-        kvmppc_xive_cpu_set_state(XIVE_TCTX(opaque), &local_err);
+        kvmppc_xive_cpu_set_state(tctx, &local_err);
         if (local_err) {
             error_report_err(local_err);
             return -1;
@@ -1128,6 +1141,7 @@ static void xive_source_reset(void *dev)
 static void xive_source_realize(DeviceState *dev, Error **errp)
 {
     XiveSource *xsrc = XIVE_SOURCE(dev);
+    XiveNotifierClass *xnc = XIVE_NOTIFIER_GET_CLASS(xsrc->xive);
 
     assert(xsrc->xive);
 
@@ -1147,7 +1161,7 @@ static void xive_source_realize(DeviceState *dev, Error **errp)
     xsrc->status = g_malloc0(xsrc->nr_irqs);
     xsrc->lsi_map = bitmap_new(xsrc->nr_irqs);
 
-    if (!kvm_irqchip_in_kernel()) {
+    if (!xnc->in_kernel || !xnc->in_kernel(xsrc->xive)) {
         memory_region_init_io(&xsrc->esb_mmio, OBJECT(xsrc),
                               &xive_source_esb_ops, xsrc, "xive.esb",
                               (1ull << xsrc->esb_shift) * xsrc->nr_irqs);
diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 705cf48176fc..aa46e3fcf512 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -161,6 +161,7 @@ typedef struct XiveNotifier XiveNotifier;
 typedef struct XiveNotifierClass {
     InterfaceClass parent;
     void (*notify)(XiveNotifier *xn, uint32_t lisn);
+    bool (*in_kernel)(const XiveNotifier *xn);
 } XiveNotifierClass;
 
 /*
@@ -396,6 +397,7 @@ typedef struct XivePresenterClass {
                      uint8_t nvt_blk, uint32_t nvt_idx,
                      bool cam_ignore, uint8_t priority,
                      uint32_t logic_serv, XiveTCTXMatch *match);
+    bool (*in_kernel)(const XivePresenter *xptr);
 } XivePresenterClass;
 
 int xive_presenter_tctx_match(XivePresenter *xptr, XiveTCTX *tctx,





* [PATCH v2 for-5.2 4/5] spapr/xive: Convert KVM device fd checks to assert()
  2020-08-06 16:55 [PATCH v2 for-5.2 0/5] papr: Cleanups for XIVE and PHB Greg Kurz
                   ` (2 preceding siblings ...)
  2020-08-06 16:56 ` [PATCH v2 for-5.2 3/5] ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers Greg Kurz
@ 2020-08-06 16:56 ` Greg Kurz
  2020-08-06 16:56 ` [PATCH v2 for-5.2 5/5] spapr: Simplify error handling in spapr_phb_realize() Greg Kurz
  4 siblings, 0 replies; 9+ messages in thread
From: Greg Kurz @ 2020-08-06 16:56 UTC (permalink / raw)
  To: David Gibson
  Cc: Daniel Henrique Barboza, qemu-ppc, Cédric Le Goater, qemu-devel

All callers guard these functions with an xive_in_kernel() helper. Make
it clear that they are only to be called when the KVM XIVE device exists.

Note that the check on xive is dropped in kvmppc_xive_disconnect(). It
really cannot be NULL since it comes from set_active_intc() which only
passes pointers to allocated objects.
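
As a rough standalone sketch of the resulting contract (hypothetical
names throughout, not the QEMU functions): the backend asserts its
precondition instead of silently returning, and every caller is
responsible for the guard.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state: fd is -1 when the KVM device doesn't exist. */
typedef struct DemoKvmXive {
    int fd;
    int resets;                 /* counts backend calls, for illustration */
} DemoKvmXive;

static bool demo_xive_in_kernel(const DemoKvmXive *x)
{
    return x->fd != -1;
}

/* Backend: calling it without an open device is a programming error. */
static void demo_kvm_reset(DemoKvmXive *x)
{
    assert(x->fd != -1);
    x->resets++;
}

/* Caller: guards the backend call, so the assert never fires here. */
static void demo_reset(DemoKvmXive *x)
{
    if (demo_xive_in_kernel(x)) {
        demo_kvm_reset(x);
    }
}
```

A missing guard now aborts at the exact call site rather than being
papered over by an early return.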

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
v2: Take the helper name change into account in the changelog
---
 hw/intc/spapr_xive_kvm.c |   35 +++++++----------------------------
 1 file changed, 7 insertions(+), 28 deletions(-)

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index 893a1ee77e70..1908afb14b9f 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -79,10 +79,7 @@ void kvmppc_xive_cpu_set_state(XiveTCTX *tctx, Error **errp)
     uint64_t state[2];
     int ret;
 
-    /* The KVM XIVE device is not in use yet */
-    if (xive->fd == -1) {
-        return;
-    }
+    assert(xive->fd != -1);
 
     /* word0 and word1 of the OS ring. */
     state[0] = *((uint64_t *) &tctx->regs[TM_QW1_OS]);
@@ -101,10 +98,7 @@ void kvmppc_xive_cpu_get_state(XiveTCTX *tctx, Error **errp)
     uint64_t state[2] = { 0 };
     int ret;
 
-    /* The KVM XIVE device is not in use */
-    if (xive->fd == -1) {
-        return;
-    }
+    assert(xive->fd != -1);
 
     ret = kvm_get_one_reg(tctx->cs, KVM_REG_PPC_VP_STATE, state);
     if (ret != 0) {
@@ -156,10 +150,7 @@ void kvmppc_xive_cpu_connect(XiveTCTX *tctx, Error **errp)
     unsigned long vcpu_id;
     int ret;
 
-    /* The KVM XIVE device is not in use */
-    if (xive->fd == -1) {
-        return;
-    }
+    assert(xive->fd != -1);
 
     /* Check if CPU was hot unplugged and replugged. */
     if (kvm_cpu_is_enabled(tctx->cs)) {
@@ -245,10 +236,7 @@ int kvmppc_xive_source_reset_one(XiveSource *xsrc, int srcno, Error **errp)
     SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
     uint64_t state = 0;
 
-    /* The KVM XIVE device is not in use */
-    if (xive->fd == -1) {
-        return -ENODEV;
-    }
+    assert(xive->fd != -1);
 
     if (xive_source_irq_is_lsi(xsrc, srcno)) {
         state |= KVM_XIVE_LEVEL_SENSITIVE;
@@ -592,10 +580,7 @@ static void kvmppc_xive_change_state_handler(void *opaque, int running,
 
 void kvmppc_xive_synchronize_state(SpaprXive *xive, Error **errp)
 {
-    /* The KVM XIVE device is not in use */
-    if (xive->fd == -1) {
-        return;
-    }
+    assert(xive->fd != -1);
 
     /*
      * When the VM is stopped, the sources are masked and the previous
@@ -622,10 +607,7 @@ int kvmppc_xive_pre_save(SpaprXive *xive)
 {
     Error *local_err = NULL;
 
-    /* The KVM XIVE device is not in use */
-    if (xive->fd == -1) {
-        return 0;
-    }
+    assert(xive->fd != -1);
 
     /* EAT: there is no extra state to query from KVM */
 
@@ -845,10 +827,7 @@ void kvmppc_xive_disconnect(SpaprInterruptController *intc)
     XiveSource *xsrc;
     size_t esb_len;
 
-    /* The KVM XIVE device is not in use */
-    if (!xive || xive->fd == -1) {
-        return;
-    }
+    assert(xive->fd != -1);
 
     /* Clear the KVM mapping */
     xsrc = &xive->source;





* [PATCH v2 for-5.2 5/5] spapr: Simplify error handling in spapr_phb_realize()
  2020-08-06 16:55 [PATCH v2 for-5.2 0/5] papr: Cleanups for XIVE and PHB Greg Kurz
                   ` (3 preceding siblings ...)
  2020-08-06 16:56 ` [PATCH v2 for-5.2 4/5] spapr/xive: Convert KVM device fd checks to assert() Greg Kurz
@ 2020-08-06 16:56 ` Greg Kurz
  4 siblings, 0 replies; 9+ messages in thread
From: Greg Kurz @ 2020-08-06 16:56 UTC (permalink / raw)
  To: David Gibson
  Cc: Daniel Henrique Barboza, qemu-ppc, Cédric Le Goater, qemu-devel

The spapr_phb_realize() function has a local_err variable which
is used to:

1) check failures of spapr_irq_findone() and spapr_irq_claim()

2) prepend extra information to the error message

Recent work from Markus Armbruster highlighted we get better
code when testing the return value of a function, rather than
setting up all the local_err boilerplate. For similar reasons,
it is now preferred to use ERRP_GUARD() and error_prepend()
rather than error_propagate_prepend().

Since spapr_irq_findone() and spapr_irq_claim() return negative
values in case of failure, make both changes.
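
The shape of the change can be sketched without QEMU's Error API. In the
sketch below, DemoError and all demo_* functions are hypothetical
stand-ins; it assumes the caller always passes a valid error object,
whereas real callers may pass NULL or &error_fatal, which is the case
ERRP_GUARD() exists to handle.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, minimal stand-in for an error object. */
typedef struct DemoError {
    char msg[128];
} DemoError;

static void demo_error_set(DemoError *err, const char *msg)
{
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
}

/* Prepend a prefix to an error message that has already been set. */
static void demo_error_prepend(DemoError *err, const char *prefix)
{
    char tmp[sizeof(err->msg)];

    snprintf(tmp, sizeof(tmp), "%s%s", prefix, err->msg);
    memcpy(err->msg, tmp, sizeof(tmp));
}

/* Returns 0 on success, a negative value on failure (and fills *err). */
static int demo_irq_claim(int irq, DemoError *err)
{
    if (irq < 0) {
        demo_error_set(err, "invalid IRQ number");
        return -1;
    }
    return 0;
}

/* The caller tests the return value; no local error variable is needed. */
static int demo_realize(int irq, DemoError *err)
{
    if (demo_irq_claim(irq, err) < 0) {
        demo_error_prepend(err, "can't allocate LSIs: ");
        return -1;
    }
    return 0;
}
```

The error object is passed straight through, and the failure path only
adds context before returning.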

This is just cleanup, no functional impact.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_pci.c |   16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 363cdb3f7b8d..0a418f1e6711 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1796,6 +1796,7 @@ static void spapr_phb_destroy_msi(gpointer opaque)
 
 static void spapr_phb_realize(DeviceState *dev, Error **errp)
 {
+    ERRP_GUARD();
     /* We don't use SPAPR_MACHINE() in order to exit gracefully if the user
      * tries to add a sPAPR PHB to a non-pseries machine.
      */
@@ -1813,7 +1814,6 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
     uint64_t msi_window_size = 4096;
     SpaprTceTable *tcet;
     const unsigned windows_supported = spapr_phb_windows_supported(sphb);
-    Error *local_err = NULL;
 
     if (!spapr) {
         error_setg(errp, TYPE_SPAPR_PCI_HOST_BRIDGE " needs a pseries machine");
@@ -1964,13 +1964,12 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
 
     /* Initialize the LSI table */
     for (i = 0; i < PCI_NUM_PINS; i++) {
-        uint32_t irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i;
+        int irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i;
 
         if (smc->legacy_irq_allocation) {
-            irq = spapr_irq_findone(spapr, &local_err);
-            if (local_err) {
-                error_propagate_prepend(errp, local_err,
-                                        "can't allocate LSIs: ");
+            irq = spapr_irq_findone(spapr, errp);
+            if (irq < 0) {
+                error_prepend(errp, "can't allocate LSIs: ");
                 /*
                  * Older machines will never support PHB hotplug, ie, this is an
                  * init only path and QEMU will terminate. No need to rollback.
@@ -1979,9 +1978,8 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
             }
         }
 
-        spapr_irq_claim(spapr, irq, true, &local_err);
-        if (local_err) {
-            error_propagate_prepend(errp, local_err, "can't allocate LSIs: ");
+        if (spapr_irq_claim(spapr, irq, true, errp) < 0) {
+            error_prepend(errp, "can't allocate LSIs: ");
             goto unrealize;
         }
 





* Re: [PATCH v2 for-5.2 3/5] ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers
  2020-08-06 16:56 ` [PATCH v2 for-5.2 3/5] ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers Greg Kurz
@ 2020-08-06 17:55   ` Cédric Le Goater
  2020-08-07  7:15     ` Greg Kurz
  0 siblings, 1 reply; 9+ messages in thread
From: Cédric Le Goater @ 2020-08-06 17:55 UTC (permalink / raw)
  To: Greg Kurz, David Gibson; +Cc: Daniel Henrique Barboza, qemu-ppc, qemu-devel

On 8/6/20 6:56 PM, Greg Kurz wrote:
> Calls to the KVM XIVE device are guarded by kvm_irqchip_in_kernel(). This
> ensures that QEMU won't try to use the device if KVM is disabled or if
> an in-kernel irqchip isn't required.
> 
> When using ic-mode=dual with the pseries machine, we have two possible
> interrupt controllers: XIVE and XICS. The kvm_irqchip_in_kernel() helper
> will return true as soon as any of the KVM devices is created. It might
> lure QEMU into thinking that the other one is also around, while it is
> not. This is exactly what happens with ic-mode=dual at machine init when
> claiming IRQ numbers, which must be done on all possible IRQ backends,
> eg. RTAS event sources or the PHB0 LSI table: only the KVM XICS device
> is active but we end up calling kvmppc_xive_source_reset_one() anyway,
> which fails. This doesn't cause any trouble because of another bug:
> kvmppc_xive_source_reset_one() lacks an error_setg() and callers don't
> see the failure.
> 
> Most of the other kvmppc_xive_* functions have similar xive->fd
> checks to filter out the case when KVM XIVE isn't active. It
> might look safer to have idempotent functions but it doesn't
> really help to understand what's going on when debugging.
> 
> Since we already have all the kvm_irqchip_in_kernel() calls in place,
> have the callers check xive->fd as well before calling KVM XIVE
> specific code. This is straightforward for the spapr specific XIVE
> code. Some more care is needed for the platform agnostic XIVE code
> since it cannot access xive->fd directly. Introduce new in_kernel()
> methods in some base XIVE classes for this purpose and implement
> them only in spapr.
> 
> In all cases, we still need to call kvm_irqchip_in_kernel() so that
> compilers can optimize the kvmppc_xive_* calls away when CONFIG_KVM
> isn't defined, thus avoiding the need for stubs.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
> v2: Introduce in_kernel() abstract methods in the base XIVE classes
> ---
>  hw/intc/spapr_xive.c  |   53 ++++++++++++++++++++++++++++++++++++-------------
>  hw/intc/xive.c        |   28 +++++++++++++++++++-------
>  include/hw/ppc/xive.h |    2 ++
>  3 files changed, 62 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 89c8cd96670b..cd001c580e89 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -148,12 +148,19 @@ static void spapr_xive_end_pic_print_info(SpaprXive *xive, XiveEND *end,
>      xive_end_queue_pic_print_info(end, 6, mon);
>  }
>  
> +/*
> + * kvm_irqchip_in_kernel() will cause the compiler to turn this
> + * into a nop if CONFIG_KVM isn't defined.
> + */
> +#define spapr_xive_in_kernel(xive) \
> +    (kvm_irqchip_in_kernel() && (xive)->fd != -1)
> +

This looks ok. SpaprXive is the userspace frontend device of 
the KVM XIVE native device in the hypervisor.

>  void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon)
>  {
>      XiveSource *xsrc = &xive->source;
>      int i;
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          Error *local_err = NULL;
>  
>          kvmppc_xive_synchronize_state(xive, &local_err);
> @@ -507,8 +514,10 @@ static const VMStateDescription vmstate_spapr_xive_eas = {
>  
>  static int vmstate_spapr_xive_pre_save(void *opaque)
>  {
> -    if (kvm_irqchip_in_kernel()) {
> -        return kvmppc_xive_pre_save(SPAPR_XIVE(opaque));
> +    SpaprXive *xive = SPAPR_XIVE(opaque);
> +
> +    if (spapr_xive_in_kernel(xive)) {
> +        return kvmppc_xive_pre_save(xive);
>      }
>  
>      return 0;
> @@ -520,8 +529,10 @@ static int vmstate_spapr_xive_pre_save(void *opaque)
>   */
>  static int spapr_xive_post_load(SpaprInterruptController *intc, int version_id)
>  {
> -    if (kvm_irqchip_in_kernel()) {
> -        return kvmppc_xive_post_load(SPAPR_XIVE(intc), version_id);
> +    SpaprXive *xive = SPAPR_XIVE(intc);
> +
> +    if (spapr_xive_in_kernel(xive)) {
> +        return kvmppc_xive_post_load(xive, version_id);
>      }
>  
>      return 0;
> @@ -564,7 +575,7 @@ static int spapr_xive_claim_irq(SpaprInterruptController *intc, int lisn,
>          xive_source_irq_set_lsi(xsrc, lisn);
>      }
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          return kvmppc_xive_source_reset_one(xsrc, lisn, errp);
>      }
>  
> @@ -641,7 +652,7 @@ static void spapr_xive_set_irq(SpaprInterruptController *intc, int irq, int val)
>  {
>      SpaprXive *xive = SPAPR_XIVE(intc);
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          kvmppc_xive_source_set_irq(&xive->source, irq, val);
>      } else {
>          xive_source_set_irq(&xive->source, irq, val);
> @@ -749,11 +760,21 @@ static void spapr_xive_deactivate(SpaprInterruptController *intc)
>  
>      spapr_xive_mmio_set_enabled(xive, false);
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          kvmppc_xive_disconnect(intc);
>      }
>  }
>  
> +static bool spapr_xive_in_kernel_xptr(const XivePresenter *xptr)
> +{
> +    return spapr_xive_in_kernel(SPAPR_XIVE(xptr));
> +}

This is mostly OK: a XivePresenter is part of the XiveRouter.

> +static bool spapr_xive_in_kernel_xn(const XiveNotifier *xn)
> +{
> +    return spapr_xive_in_kernel(SPAPR_XIVE(xn));
> +}


This is weird. We have other XiveNotifiers which have no relation
to a kernel backend.

>  static void spapr_xive_class_init(ObjectClass *klass, void *data)
>  {
>      DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -761,6 +782,7 @@ static void spapr_xive_class_init(ObjectClass *klass, void *data)
>      SpaprInterruptControllerClass *sicc = SPAPR_INTC_CLASS(klass);
>      XivePresenterClass *xpc = XIVE_PRESENTER_CLASS(klass);
>      SpaprXiveClass *sxc = SPAPR_XIVE_CLASS(klass);
> +    XiveNotifierClass *xnc = XIVE_NOTIFIER_CLASS(klass);
>  
>      dc->desc    = "sPAPR XIVE Interrupt Controller";
>      device_class_set_props(dc, spapr_xive_properties);
> @@ -788,6 +810,9 @@ static void spapr_xive_class_init(ObjectClass *klass, void *data)
>      sicc->post_load = spapr_xive_post_load;
>  
>      xpc->match_nvt  = spapr_xive_match_nvt;
> +    xpc->in_kernel  = spapr_xive_in_kernel_xptr;
> +
> +    xnc->in_kernel  = spapr_xive_in_kernel_xn;
>  }
>  
>  static const TypeInfo spapr_xive_info = {
> @@ -1058,7 +1083,7 @@ static target_ulong h_int_set_source_config(PowerPCCPU *cpu,
>          new_eas.w = xive_set_field64(EAS_END_DATA, new_eas.w, eisn);
>      }
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          Error *local_err = NULL;
>  
>          kvmppc_xive_set_source_config(xive, lisn, &new_eas, &local_err);
> @@ -1379,7 +1404,7 @@ static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
>       */
>  
>  out:
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          Error *local_err = NULL;
>  
>          kvmppc_xive_set_queue_config(xive, end_blk, end_idx, &end, &local_err);
> @@ -1480,7 +1505,7 @@ static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
>          args[2] = 0;
>      }
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          Error *local_err = NULL;
>  
>          kvmppc_xive_get_queue_config(xive, end_blk, end_idx, end, &local_err);
> @@ -1642,7 +1667,7 @@ static target_ulong h_int_esb(PowerPCCPU *cpu,
>          return H_P3;
>      }
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          args[0] = kvmppc_xive_esb_rw(xsrc, lisn, offset, data,
>                                       flags & SPAPR_XIVE_ESB_STORE);
>      } else {
> @@ -1717,7 +1742,7 @@ static target_ulong h_int_sync(PowerPCCPU *cpu,
>       * under KVM
>       */
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          Error *local_err = NULL;
>  
>          kvmppc_xive_sync_source(xive, lisn, &local_err);
> @@ -1761,7 +1786,7 @@ static target_ulong h_int_reset(PowerPCCPU *cpu,
>  
>      device_legacy_reset(DEVICE(xive));
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (spapr_xive_in_kernel(xive)) {
>          Error *local_err = NULL;
>  
>          kvmppc_xive_reset(xive, &local_err);
> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> index 9b55e0356c62..27d27fdc9ee4 100644
> --- a/hw/intc/xive.c
> +++ b/hw/intc/xive.c
> @@ -592,6 +592,17 @@ static const char * const xive_tctx_ring_names[] = {
>      "USER", "OS", "POOL", "PHYS",
>  };
>  
> +/*
> + * kvm_irqchip_in_kernel() will cause the compiler to turn this
> + * into a nop if CONFIG_KVM isn't defined.
> + */
> +#define xive_in_kernel(xptr)                                  \
> +    (kvm_irqchip_in_kernel() &&                                         \
> +     ({                                                                 \
> +         XivePresenterClass *xpc = XIVE_PRESENTER_GET_CLASS(xptr);      \
> +         xpc->in_kernel ? xpc->in_kernel(xptr) : false;                 \
> +     }))
> +
>
>  void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon)
>  {
>      int cpu_index;
> @@ -606,7 +617,7 @@ void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon)
>  
>      cpu_index = tctx->cs ? tctx->cs->cpu_index : -1;
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (xive_in_kernel(tctx->xptr)) {
>          Error *local_err = NULL;
>  
>          kvmppc_xive_cpu_synchronize_state(tctx, &local_err);
> @@ -671,7 +682,7 @@ static void xive_tctx_realize(DeviceState *dev, Error **errp)
>      }
>  
>      /* Connect the presenter to the VCPU (required for CPU hotplug) */
> -    if (kvm_irqchip_in_kernel()) {
> +    if (xive_in_kernel(tctx->xptr)) {
>          kvmppc_xive_cpu_connect(tctx, &local_err);
>          if (local_err) {
>              error_propagate(errp, local_err);
> @@ -682,10 +693,11 @@ static void xive_tctx_realize(DeviceState *dev, Error **errp)
>  
>  static int vmstate_xive_tctx_pre_save(void *opaque)
>  {
> +    XiveTCTX *tctx = XIVE_TCTX(opaque);
>      Error *local_err = NULL;
>  
> -    if (kvm_irqchip_in_kernel()) {
> -        kvmppc_xive_cpu_get_state(XIVE_TCTX(opaque), &local_err);
> +    if (xive_in_kernel(tctx->xptr)) {
> +        kvmppc_xive_cpu_get_state(tctx, &local_err);
>          if (local_err) {
>              error_report_err(local_err);
>              return -1;
> @@ -697,14 +709,15 @@ static int vmstate_xive_tctx_pre_save(void *opaque)
>  
>  static int vmstate_xive_tctx_post_load(void *opaque, int version_id)
>  {
> +    XiveTCTX *tctx = XIVE_TCTX(opaque);
>      Error *local_err = NULL;
>  
> -    if (kvm_irqchip_in_kernel()) {
> +    if (xive_in_kernel(tctx->xptr)) {
>          /*
>           * Required for hotplugged CPU, for which the state comes
>           * after all states of the machine.
>           */
> -        kvmppc_xive_cpu_set_state(XIVE_TCTX(opaque), &local_err);
> +        kvmppc_xive_cpu_set_state(tctx, &local_err);
>          if (local_err) {
>              error_report_err(local_err);
>              return -1;
> @@ -1128,6 +1141,7 @@ static void xive_source_reset(void *dev)
>  static void xive_source_realize(DeviceState *dev, Error **errp)
>  {
>      XiveSource *xsrc = XIVE_SOURCE(dev);
> +    XiveNotifierClass *xnc = XIVE_NOTIFIER_GET_CLASS(xsrc->xive);
>  
>      assert(xsrc->xive);
>  
> @@ -1147,7 +1161,7 @@ static void xive_source_realize(DeviceState *dev, Error **errp)
>      xsrc->status = g_malloc0(xsrc->nr_irqs);
>      xsrc->lsi_map = bitmap_new(xsrc->nr_irqs);
>  
> -    if (!kvm_irqchip_in_kernel()) {
> +    if (!xnc->in_kernel || !xnc->in_kernel(xsrc->xive)) {
>          memory_region_init_io(&xsrc->esb_mmio, OBJECT(xsrc),
>                                &xive_source_esb_ops, xsrc, "xive.esb",
>                                (1ull << xsrc->esb_shift) * xsrc->nr_irqs);
> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> index 705cf48176fc..aa46e3fcf512 100644
> --- a/include/hw/ppc/xive.h
> +++ b/include/hw/ppc/xive.h
> @@ -161,6 +161,7 @@ typedef struct XiveNotifier XiveNotifier;
>  typedef struct XiveNotifierClass {
>      InterfaceClass parent;
>      void (*notify)(XiveNotifier *xn, uint32_t lisn);
> +    bool (*in_kernel)(const XiveNotifier *xn);
>  } XiveNotifierClass;
>  
>  /*
> @@ -396,6 +397,7 @@ typedef struct XivePresenterClass {
>                       uint8_t nvt_blk, uint32_t nvt_idx,
>                       bool cam_ignore, uint8_t priority,
>                       uint32_t logic_serv, XiveTCTXMatch *match);
> +    bool (*in_kernel)(const XivePresenter *xptr);
>  } XivePresenterClass;
>  
>  int xive_presenter_tctx_match(XivePresenter *xptr, XiveTCTX *tctx,
> 
> 

This seems redundant. Can we introduce a new XiveBackend QOM interface
that declares an in_kernel() handler, and have XiveRouter inherit
from it?

C.








* Re: [PATCH v2 for-5.2 3/5] ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers
  2020-08-06 17:55   ` Cédric Le Goater
@ 2020-08-07  7:15     ` Greg Kurz
  2020-08-07  9:29       ` Greg Kurz
  0 siblings, 1 reply; 9+ messages in thread
From: Greg Kurz @ 2020-08-07  7:15 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Daniel Henrique Barboza, qemu-ppc, qemu-devel, David Gibson

On Thu, 6 Aug 2020 19:55:29 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 8/6/20 6:56 PM, Greg Kurz wrote:
> > Calls to the KVM XIVE device are guarded by kvm_irqchip_in_kernel(). This
> > ensures that QEMU won't try to use the device if KVM is disabled or if
> > an in-kernel irqchip isn't required.
> > 
> > When using ic-mode=dual with the pseries machine, we have two possible
> > interrupt controllers: XIVE and XICS. The kvm_irqchip_in_kernel() helper
> > will return true as soon as any of the KVM device is created. It might
> > lure QEMU to think that the other one is also around, while it is not.
> > This is exactly what happens with ic-mode=dual at machine init when
> > claiming IRQ numbers, which must be done on all possible IRQ backends,
> > eg. RTAS event sources or the PHB0 LSI table : only the KVM XICS device
> > is active but we end up calling kvmppc_xive_source_reset_one() anyway,
> > which fails. This doesn't cause any trouble because of another bug :
> > kvmppc_xive_source_reset_one() lacks an error_setg() and callers don't
> > see the failure.
> > 
> > Most of the other kvmppc_xive_* functions have similar xive->fd
> > checks to filter out the case when KVM XIVE isn't active. It
> > might look safer to have idempotent functions but it doesn't
> > really help to understand what's going on when debugging.
> > 
> > Since we already have all the kvm_irqchip_in_kernel() in place,
> > also have the callers to check xive->fd as well before calling
> > KVM XIVE specific code. This is straight-forward for the spapr
> > specific XIVE code. Some more care is needed for the platform
> > agnostic XIVE code since it cannot access xive->fd directly.
> > Introduce new in_kernel() methods in some base XIVE classes
> > for this purpose and implement them only in spapr.
> > 
> > In all cases, we still need to call kvm_irqchip_in_kernel() so that
> > compilers can optimize the kvmppc_xive_* calls away when CONFIG_KVM
> > isn't defined, thus avoiding the need for stubs.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> > v2: Introduce in_kernel() abstract methods in the base XIVE classes
> > ---
> >  hw/intc/spapr_xive.c  |   53 ++++++++++++++++++++++++++++++++++++-------------
> >  hw/intc/xive.c        |   28 +++++++++++++++++++-------
> >  include/hw/ppc/xive.h |    2 ++
> >  3 files changed, 62 insertions(+), 21 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index 89c8cd96670b..cd001c580e89 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -148,12 +148,19 @@ static void spapr_xive_end_pic_print_info(SpaprXive *xive, XiveEND *end,
> >      xive_end_queue_pic_print_info(end, 6, mon);
> >  }
> >  
> > +/*
> > + * kvm_irqchip_in_kernel() will cause the compiler to turn this
> > > + * into a nop if CONFIG_KVM isn't defined.
> > + */
> > +#define spapr_xive_in_kernel(xive) \
> > +    (kvm_irqchip_in_kernel() && (xive)->fd != -1)
> > +
> 
> This looks ok. SpaprXive is the userspace frontend device of 
> the KVM XIVE native device in the hypervisor.
> 
> >  void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon)
> >  {
> >      XiveSource *xsrc = &xive->source;
> >      int i;
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          Error *local_err = NULL;
> >  
> >          kvmppc_xive_synchronize_state(xive, &local_err);
> > @@ -507,8 +514,10 @@ static const VMStateDescription vmstate_spapr_xive_eas = {
> >  
> >  static int vmstate_spapr_xive_pre_save(void *opaque)
> >  {
> > -    if (kvm_irqchip_in_kernel()) {
> > -        return kvmppc_xive_pre_save(SPAPR_XIVE(opaque));
> > +    SpaprXive *xive = SPAPR_XIVE(opaque);
> > +
> > +    if (spapr_xive_in_kernel(xive)) {
> > +        return kvmppc_xive_pre_save(xive);
> >      }
> >  
> >      return 0;
> > @@ -520,8 +529,10 @@ static int vmstate_spapr_xive_pre_save(void *opaque)
> >   */
> >  static int spapr_xive_post_load(SpaprInterruptController *intc, int version_id)
> >  {
> > -    if (kvm_irqchip_in_kernel()) {
> > -        return kvmppc_xive_post_load(SPAPR_XIVE(intc), version_id);
> > +    SpaprXive *xive = SPAPR_XIVE(intc);
> > +
> > +    if (spapr_xive_in_kernel(xive)) {
> > +        return kvmppc_xive_post_load(xive, version_id);
> >      }
> >  
> >      return 0;
> > @@ -564,7 +575,7 @@ static int spapr_xive_claim_irq(SpaprInterruptController *intc, int lisn,
> >          xive_source_irq_set_lsi(xsrc, lisn);
> >      }
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          return kvmppc_xive_source_reset_one(xsrc, lisn, errp);
> >      }
> >  
> > @@ -641,7 +652,7 @@ static void spapr_xive_set_irq(SpaprInterruptController *intc, int irq, int val)
> >  {
> >      SpaprXive *xive = SPAPR_XIVE(intc);
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          kvmppc_xive_source_set_irq(&xive->source, irq, val);
> >      } else {
> >          xive_source_set_irq(&xive->source, irq, val);
> > @@ -749,11 +760,21 @@ static void spapr_xive_deactivate(SpaprInterruptController *intc)
> >  
> >      spapr_xive_mmio_set_enabled(xive, false);
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          kvmppc_xive_disconnect(intc);
> >      }
> >  }
> >  
> > +static bool spapr_xive_in_kernel_xptr(const XivePresenter *xptr)
> > +{
> > +    return spapr_xive_in_kernel(SPAPR_XIVE(xptr));
> > +}
> 
> This is mostly OK: a XivePresenter is part of the XiveRouter.
> 
> > +static bool spapr_xive_in_kernel_xn(const XiveNotifier *xn)
> > +{
> > +    return spapr_xive_in_kernel(SPAPR_XIVE(xn));
> > +}
> 
> 
> This is weird. We have other XiveNotifiers which have no relation
> to a kernel backend.
> 

These other XiveNotifiers don't implement the in_kernel() method.

What's the problem?
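
For illustration, the optional-method convention discussed here can be modelled in a few lines of standalone C. This is only a sketch, not QEMU code: the Notifier, NotifierClass, spapr_like_class and plain_class names are hypothetical stand-ins for XiveNotifier, XiveNotifierClass and their implementations, and the fd field merely mimics the xive->fd != -1 test from the patch. A class that leaves in_kernel NULL is reported as not being backed by the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for XiveNotifier / XiveNotifierClass. */
typedef struct Notifier Notifier;

typedef struct NotifierClass {
    /* Optional hook: left NULL by notifiers without a kernel backend. */
    bool (*in_kernel)(const Notifier *n);
} NotifierClass;

struct Notifier {
    const NotifierClass *klass;
    int fd;                     /* -1 means no KVM device is open */
};

/* Implementation for the one class that can be backed by a KVM device. */
static bool spapr_like_in_kernel(const Notifier *n)
{
    return n->fd != -1;
}

static const NotifierClass spapr_like_class = {
    .in_kernel = spapr_like_in_kernel,
};

/* A notifier class that never implements the hook. */
static const NotifierClass plain_class = {
    .in_kernel = NULL,
};

/* Callers go through this helper, so a missing hook reads as "false". */
static bool notifier_in_kernel(const Notifier *n)
{
    assert(n != NULL && n->klass != NULL);
    return n->klass->in_kernel != NULL && n->klass->in_kernel(n);
}
```

With this shape, an instance of plain_class is never considered in-kernel whatever its fd holds, while an instance of spapr_like_class is in-kernel only when its fd is not -1.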

> >  static void spapr_xive_class_init(ObjectClass *klass, void *data)
> >  {
> >      DeviceClass *dc = DEVICE_CLASS(klass);
> > @@ -761,6 +782,7 @@ static void spapr_xive_class_init(ObjectClass *klass, void *data)
> >      SpaprInterruptControllerClass *sicc = SPAPR_INTC_CLASS(klass);
> >      XivePresenterClass *xpc = XIVE_PRESENTER_CLASS(klass);
> >      SpaprXiveClass *sxc = SPAPR_XIVE_CLASS(klass);
> > +    XiveNotifierClass *xnc = XIVE_NOTIFIER_CLASS(klass);
> >  
> >      dc->desc    = "sPAPR XIVE Interrupt Controller";
> >      device_class_set_props(dc, spapr_xive_properties);
> > @@ -788,6 +810,9 @@ static void spapr_xive_class_init(ObjectClass *klass, void *data)
> >      sicc->post_load = spapr_xive_post_load;
> >  
> >      xpc->match_nvt  = spapr_xive_match_nvt;
> > +    xpc->in_kernel  = spapr_xive_in_kernel_xptr;
> > +
> > +    xnc->in_kernel  = spapr_xive_in_kernel_xn;
> >  }
> >  
> >  static const TypeInfo spapr_xive_info = {
> > @@ -1058,7 +1083,7 @@ static target_ulong h_int_set_source_config(PowerPCCPU *cpu,
> >          new_eas.w = xive_set_field64(EAS_END_DATA, new_eas.w, eisn);
> >      }
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          Error *local_err = NULL;
> >  
> >          kvmppc_xive_set_source_config(xive, lisn, &new_eas, &local_err);
> > @@ -1379,7 +1404,7 @@ static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
> >       */
> >  
> >  out:
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          Error *local_err = NULL;
> >  
> >          kvmppc_xive_set_queue_config(xive, end_blk, end_idx, &end, &local_err);
> > @@ -1480,7 +1505,7 @@ static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
> >          args[2] = 0;
> >      }
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          Error *local_err = NULL;
> >  
> >          kvmppc_xive_get_queue_config(xive, end_blk, end_idx, end, &local_err);
> > @@ -1642,7 +1667,7 @@ static target_ulong h_int_esb(PowerPCCPU *cpu,
> >          return H_P3;
> >      }
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          args[0] = kvmppc_xive_esb_rw(xsrc, lisn, offset, data,
> >                                       flags & SPAPR_XIVE_ESB_STORE);
> >      } else {
> > @@ -1717,7 +1742,7 @@ static target_ulong h_int_sync(PowerPCCPU *cpu,
> >       * under KVM
> >       */
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          Error *local_err = NULL;
> >  
> >          kvmppc_xive_sync_source(xive, lisn, &local_err);
> > @@ -1761,7 +1786,7 @@ static target_ulong h_int_reset(PowerPCCPU *cpu,
> >  
> >      device_legacy_reset(DEVICE(xive));
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (spapr_xive_in_kernel(xive)) {
> >          Error *local_err = NULL;
> >  
> >          kvmppc_xive_reset(xive, &local_err);
> > diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> > index 9b55e0356c62..27d27fdc9ee4 100644
> > --- a/hw/intc/xive.c
> > +++ b/hw/intc/xive.c
> > @@ -592,6 +592,17 @@ static const char * const xive_tctx_ring_names[] = {
> >      "USER", "OS", "POOL", "PHYS",
> >  };
> >  
> > +/*
> > + * kvm_irqchip_in_kernel() will cause the compiler to turn this
> > > + * into a nop if CONFIG_KVM isn't defined.
> > + */
> > +#define xive_in_kernel(xptr)                                  \
> > +    (kvm_irqchip_in_kernel() &&                                         \
> > +     ({                                                                 \
> > +         XivePresenterClass *xpc = XIVE_PRESENTER_GET_CLASS(xptr);      \
> > +         xpc->in_kernel ? xpc->in_kernel(xptr) : false;                 \
> > +     }))
> > +
> >
> >  void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon)
> >  {
> >      int cpu_index;
> > @@ -606,7 +617,7 @@ void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon)
> >  
> >      cpu_index = tctx->cs ? tctx->cs->cpu_index : -1;
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (xive_in_kernel(tctx->xptr)) {
> >          Error *local_err = NULL;
> >  
> >          kvmppc_xive_cpu_synchronize_state(tctx, &local_err);
> > @@ -671,7 +682,7 @@ static void xive_tctx_realize(DeviceState *dev, Error **errp)
> >      }
> >  
> >      /* Connect the presenter to the VCPU (required for CPU hotplug) */
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (xive_in_kernel(tctx->xptr)) {
> >          kvmppc_xive_cpu_connect(tctx, &local_err);
> >          if (local_err) {
> >              error_propagate(errp, local_err);
> > @@ -682,10 +693,11 @@ static void xive_tctx_realize(DeviceState *dev, Error **errp)
> >  
> >  static int vmstate_xive_tctx_pre_save(void *opaque)
> >  {
> > +    XiveTCTX *tctx = XIVE_TCTX(opaque);
> >      Error *local_err = NULL;
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > -        kvmppc_xive_cpu_get_state(XIVE_TCTX(opaque), &local_err);
> > +    if (xive_in_kernel(tctx->xptr)) {
> > +        kvmppc_xive_cpu_get_state(tctx, &local_err);
> >          if (local_err) {
> >              error_report_err(local_err);
> >              return -1;
> > @@ -697,14 +709,15 @@ static int vmstate_xive_tctx_pre_save(void *opaque)
> >  
> >  static int vmstate_xive_tctx_post_load(void *opaque, int version_id)
> >  {
> > +    XiveTCTX *tctx = XIVE_TCTX(opaque);
> >      Error *local_err = NULL;
> >  
> > -    if (kvm_irqchip_in_kernel()) {
> > +    if (xive_in_kernel(tctx->xptr)) {
> >          /*
> >           * Required for hotplugged CPU, for which the state comes
> >           * after all states of the machine.
> >           */
> > -        kvmppc_xive_cpu_set_state(XIVE_TCTX(opaque), &local_err);
> > +        kvmppc_xive_cpu_set_state(tctx, &local_err);
> >          if (local_err) {
> >              error_report_err(local_err);
> >              return -1;
> > @@ -1128,6 +1141,7 @@ static void xive_source_reset(void *dev)
> >  static void xive_source_realize(DeviceState *dev, Error **errp)
> >  {
> >      XiveSource *xsrc = XIVE_SOURCE(dev);
> > +    XiveNotifierClass *xnc = XIVE_NOTIFIER_GET_CLASS(xsrc->xive);
> >  
> >      assert(xsrc->xive);
> >  
> > @@ -1147,7 +1161,7 @@ static void xive_source_realize(DeviceState *dev, Error **errp)
> >      xsrc->status = g_malloc0(xsrc->nr_irqs);
> >      xsrc->lsi_map = bitmap_new(xsrc->nr_irqs);
> >  
> > -    if (!kvm_irqchip_in_kernel()) {
> > +    if (!xnc->in_kernel || !xnc->in_kernel(xsrc->xive)) {
> >          memory_region_init_io(&xsrc->esb_mmio, OBJECT(xsrc),
> >                                &xive_source_esb_ops, xsrc, "xive.esb",
> >                                (1ull << xsrc->esb_shift) * xsrc->nr_irqs);
> > diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> > index 705cf48176fc..aa46e3fcf512 100644
> > --- a/include/hw/ppc/xive.h
> > +++ b/include/hw/ppc/xive.h
> > @@ -161,6 +161,7 @@ typedef struct XiveNotifier XiveNotifier;
> >  typedef struct XiveNotifierClass {
> >      InterfaceClass parent;
> >      void (*notify)(XiveNotifier *xn, uint32_t lisn);
> > +    bool (*in_kernel)(const XiveNotifier *xn);
> >  } XiveNotifierClass;
> >  
> >  /*
> > @@ -396,6 +397,7 @@ typedef struct XivePresenterClass {
> >                       uint8_t nvt_blk, uint32_t nvt_idx,
> >                       bool cam_ignore, uint8_t priority,
> >                       uint32_t logic_serv, XiveTCTXMatch *match);
> > +    bool (*in_kernel)(const XivePresenter *xptr);
> >  } XivePresenterClass;
> >  
> >  int xive_presenter_tctx_match(XivePresenter *xptr, XiveTCTX *tctx,
> > 
> > 
> 
> This seems redundant. Can we introduce a new XiveBackend QOM interface
> that declares an in_kernel() handler, and have XiveRouter inherit
> from it?
> 

I'm not sure how that would help: the XiveRouter type isn't used at
the locations where we call kvm_irqchip_in_kernel(), only
XivePresenter and XiveNotifier are.

> C.
> 
> 
> 
> 
> 




* Re: [PATCH v2 for-5.2 3/5] ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers
  2020-08-07  7:15     ` Greg Kurz
@ 2020-08-07  9:29       ` Greg Kurz
  0 siblings, 0 replies; 9+ messages in thread
From: Greg Kurz @ 2020-08-07  9:29 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Daniel Henrique Barboza, qemu-ppc, qemu-devel, David Gibson

On Fri, 7 Aug 2020 09:15:54 +0200
Greg Kurz <groug@kaod.org> wrote:

> On Thu, 6 Aug 2020 19:55:29 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 

[...]

> > > diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> > > index 705cf48176fc..aa46e3fcf512 100644
> > > --- a/include/hw/ppc/xive.h
> > > +++ b/include/hw/ppc/xive.h
> > > @@ -161,6 +161,7 @@ typedef struct XiveNotifier XiveNotifier;
> > >  typedef struct XiveNotifierClass {
> > >      InterfaceClass parent;
> > >      void (*notify)(XiveNotifier *xn, uint32_t lisn);
> > > +    bool (*in_kernel)(const XiveNotifier *xn);
> > >  } XiveNotifierClass;
> > >  
> > >  /*
> > > @@ -396,6 +397,7 @@ typedef struct XivePresenterClass {
> > >                       uint8_t nvt_blk, uint32_t nvt_idx,
> > >                       bool cam_ignore, uint8_t priority,
> > >                       uint32_t logic_serv, XiveTCTXMatch *match);
> > > +    bool (*in_kernel)(const XivePresenter *xptr);
> > >  } XivePresenterClass;
> > >  
> > >  int xive_presenter_tctx_match(XivePresenter *xptr, XiveTCTX *tctx,
> > > 
> > > 
> > 
> > > This seems redundant. Can we introduce a new XiveBackend QOM interface
> > > that declares an in_kernel() handler, and have XiveRouter inherit
> > > from it?
> > 
> 
> I'm not sure how that would help: the XiveRouter type isn't used at
> the locations where we call kvm_irqchip_in_kernel(), only
> XivePresenter and XiveNotifier are.
> 

Looking again at xive_source_realize(), I now realize (forgive the pun ;) that
the negative check on kvm_irqchip_in_kernel() is a bit awkward. We usually
do more stuff when we have a KVM backend, not less. The intent seems to be
that the ESB MMIO should point either to an I/O sub-region when XIVE is
emulated or to an mmapped sub-region when XIVE is backed by a KVM device.
This can be achieved with a container and overlapping sub-regions (prio 0
for emulated, prio 1 for KVM), and then we no longer need to hijack
XiveNotifier.

So in the end, we only need the method for XivePresenter.

I'll cook a v3.
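
The container idea above can be sketched as a toy model in standalone C. It is not QEMU's memory API and not the code of any later revision: ToyRegion, ToyContainer and the toy_* helpers are hypothetical names, and the sketch assumes the KVM sub-region only exists (is enabled) once the KVM device is connected. In QEMU proper, the analogous building block would presumably be memory_region_add_subregion_overlap(), which takes a priority argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types are hypothetical, not QEMU's MemoryRegion. */
typedef struct ToyRegion {
    const char *name;
    int priority;               /* higher wins where regions overlap */
    bool enabled;
} ToyRegion;

#define TOY_MAX_SUBREGIONS 4

typedef struct ToyContainer {
    ToyRegion *sub[TOY_MAX_SUBREGIONS];
    size_t nr;
} ToyContainer;

/* Register a sub-region that covers the whole container. */
static void toy_container_add(ToyContainer *c, ToyRegion *r)
{
    assert(c->nr < TOY_MAX_SUBREGIONS);
    c->sub[c->nr++] = r;
}

/* Return the enabled sub-region with the highest priority, or NULL. */
static ToyRegion *toy_container_resolve(const ToyContainer *c)
{
    ToyRegion *best = NULL;

    for (size_t i = 0; i < c->nr; i++) {
        ToyRegion *r = c->sub[i];

        if (r->enabled && (best == NULL || r->priority > best->priority)) {
            best = r;
        }
    }
    return best;
}
```

In this model the emulated region (priority 0) is always present, and the KVM region (priority 1) hides it only while enabled, so the realize path never needs a negative check on the backend.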

> > C.
> > 
> > 
> > 
> > 
> > 
> 




Thread overview: 9+ messages
2020-08-06 16:55 [PATCH v2 for-5.2 0/5] papr: Cleanups for XIVE and PHB Greg Kurz
2020-08-06 16:56 ` [PATCH v2 for-5.2 1/5] spapr/xive: Fix xive->fd if kvm_create_device() fails Greg Kurz
2020-08-06 16:56 ` [PATCH v2 for-5.2 2/5] spapr/xive: Simplify kvmppc_xive_disconnect() Greg Kurz
2020-08-06 16:56 ` [PATCH v2 for-5.2 3/5] ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers Greg Kurz
2020-08-06 17:55   ` Cédric Le Goater
2020-08-07  7:15     ` Greg Kurz
2020-08-07  9:29       ` Greg Kurz
2020-08-06 16:56 ` [PATCH v2 for-5.2 4/5] spapr/xive: Convert KVM device fd checks to assert() Greg Kurz
2020-08-06 16:56 ` [PATCH v2 for-5.2 5/5] spapr: Simplify error handling in spapr_phb_realize() Greg Kurz
