All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids
@ 2020-11-20 17:46 Greg Kurz
  2020-11-20 17:46 ` [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions Greg Kurz
                   ` (8 more replies)
  0 siblings, 9 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-20 17:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, qemu-ppc, Cédric Le Goater, David Gibson

A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
ids, which happen to be numerically equal in general, but are really
different entities that can diverge in some setups. When this happens,
we end up misconfiguring XIVE in a way that is almost fatal for the
guest.

The confusion comes from XICS which has historically assumed equality
between interrupt server numbers and vCPU ids, as mentionned in a
comment back from 2011 in the linux kernel icp_native_init_one_node()
function:

    /* This code does the theorically broken assumption that the interrupt
     * server numbers are the same as the hard CPU numbers.
     * This happens to be the case so far but we are playing with fire...
     * should be fixed one of these days. -BenH.
     */

This assumption crept into QEMU through the "ibm,interrupt-server-ranges"
property of the "interrupt-controller" node in the DT. This property
contains ranges of consecutive vCPU ids defined as (first id, # of ids).
In the case of QEMU, we define a single range starting from 0 up to the
highest vCPU id, as returned by spapr_max_server_number(). This has
always been associated to the "nr_servers" wording when naming variables
or function arguments. When XIVE got added, we introduced an sPAPR IRQ
abstraction to be able to control several interrupt controller backends.
The sPAPR IRQ base class provides a dt() handler used to populate the
"interrupt-controller" node in the DT. This handler takes an "nr_server"
argument inherited from XICS and we ended up using it to populate the
"ibm,xive-lisn-ranges" property used with XIVE, which has completely
different semantics. It contain ranges of interrupt numbers that the
guest can use for any purpose. Since one obvious purpose is IPI, its
first range should just be able to accomodate all possible vCPUs.
In the case of QEMU, we define a single range starting from 0 up
to "nr_server" but we should rather size it to the number of vCPUs
actually (ie. smp.max_cpus).

This series aims at getting rid of the "nr_server" argument in the
sPAPR IC handlers. Since both the highest possible vCPU id and the
maximum number of vCPUs are invariants for XICS and XIVE respectively,
let's make them device properties to be configured by the machine when
it creates the interrupt controllers and use them where needed.

This doesn't cause any visible change to setups using the default
VSMT machine settings. This changes "ibm,xive-lisn-ranges" for
setups that mess with VSMT, but this is acceptable since linux
only allocates one interrupt per vCPU and the higher part of the
range was never used.

[1] https://git.qemu.org/?p=qemu.git;a=commit;h=6d24795ee7e3199d199d3c415312c93382ad1807

Greg Kurz (8):
  spapr/xive: Turn some sanity checks into assertions
  spapr/xive: Introduce spapr_xive_nr_ends()
  spapr/xive: Add "nr-servers" property
  spapr/xive: Add "nr-ipis" property
  spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()
  spapr/xics: Add "nr-servers" property
  spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation
  spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation

 include/hw/ppc/spapr.h      |  4 +--
 include/hw/ppc/spapr_irq.h  |  9 ++---
 include/hw/ppc/spapr_xive.h | 25 ++++++++++++-
 include/hw/ppc/xics_spapr.h | 23 +++++++++---
 hw/intc/spapr_xive.c        | 72 ++++++++++++++++++++++---------------
 hw/intc/spapr_xive_kvm.c    |  4 +--
 hw/intc/xics_kvm.c          |  4 +--
 hw/intc/xics_spapr.c        | 45 ++++++++++++++---------
 hw/ppc/spapr.c              |  7 ++--
 hw/ppc/spapr_irq.c          | 27 +++++++-------
 10 files changed, 141 insertions(+), 79 deletions(-)

-- 
2.26.2




^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions
  2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
@ 2020-11-20 17:46 ` Greg Kurz
  2020-11-23  3:33   ` David Gibson
  2020-11-23  8:09   ` Cédric Le Goater
  2020-11-20 17:46 ` [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends() Greg Kurz
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-20 17:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, qemu-ppc, Cédric Le Goater, David Gibson

The sPAPR XIVE device is created by the machine in spapr_irq_init().
The latter overrides any value provided by the user with -global for
the "nr-irqs" and "nr-ends" properties with strictly positive values.

It seems reasonable to assume these properties should never be 0,
which wouldn't make much sense by the way.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/spapr_xive.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 1fa09f287ac0..60e0d5769dcc 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -296,22 +296,16 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
     XiveENDSource *end_xsrc = &xive->end_source;
     Error *local_err = NULL;
 
+    /* Set by spapr_irq_init() */
+    g_assert(xive->nr_irqs);
+    g_assert(xive->nr_ends);
+
     sxc->parent_realize(dev, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
     }
 
-    if (!xive->nr_irqs) {
-        error_setg(errp, "Number of interrupt needs to be greater 0");
-        return;
-    }
-
-    if (!xive->nr_ends) {
-        error_setg(errp, "Number of interrupt needs to be greater 0");
-        return;
-    }
-
     /*
      * Initialize the internal sources, for IPIs and virtual devices.
      */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
  2020-11-20 17:46 ` [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions Greg Kurz
@ 2020-11-20 17:46 ` Greg Kurz
  2020-11-23  3:33   ` David Gibson
  2020-11-23  9:46   ` Cédric Le Goater
  2020-11-20 17:46 ` [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property Greg Kurz
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-20 17:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, qemu-ppc, Cédric Le Goater, David Gibson

We're going to kill the "nr_ends" field in a subsequent patch.
Prepare ground by using an helper instead of peeking into
the sPAPR XIVE structure directly.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 include/hw/ppc/spapr_xive.h |  1 +
 hw/intc/spapr_xive.c        | 23 ++++++++++++++---------
 hw/intc/spapr_xive_kvm.c    |  4 ++--
 3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 26c8d90d7196..4b967f13c030 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -75,6 +75,7 @@ void spapr_xive_map_mmio(SpaprXive *xive);
 
 int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
                              uint32_t *out_server, uint8_t *out_prio);
+uint32_t spapr_xive_nr_ends(const SpaprXive *xive);
 
 /*
  * KVM XIVE device helpers
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 60e0d5769dcc..f473ad9cba47 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -192,7 +192,7 @@ void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon)
             uint32_t end_idx = xive_get_field64(EAS_END_INDEX, eas->w);
             XiveEND *end;
 
-            assert(end_idx < xive->nr_ends);
+            assert(end_idx < spapr_xive_nr_ends(xive));
             end = &xive->endt[end_idx];
 
             if (xive_end_is_valid(end)) {
@@ -270,7 +270,7 @@ static void spapr_xive_reset(void *dev)
     }
 
     /* Clear all ENDs */
-    for (i = 0; i < xive->nr_ends; i++) {
+    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
         spapr_xive_end_reset(&xive->endt[i]);
     }
 }
@@ -288,6 +288,11 @@ static void spapr_xive_instance_init(Object *obj)
     xive->fd = -1;
 }
 
+uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
+{
+    return xive->nr_ends;
+}
+
 static void spapr_xive_realize(DeviceState *dev, Error **errp)
 {
     SpaprXive *xive = SPAPR_XIVE(dev);
@@ -336,7 +341,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
      * Allocate the routing tables
      */
     xive->eat = g_new0(XiveEAS, xive->nr_irqs);
-    xive->endt = g_new0(XiveEND, xive->nr_ends);
+    xive->endt = g_new0(XiveEND, spapr_xive_nr_ends(xive));
 
     xive->nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
                            xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
@@ -375,7 +380,7 @@ static int spapr_xive_get_end(XiveRouter *xrtr,
 {
     SpaprXive *xive = SPAPR_XIVE(xrtr);
 
-    if (end_idx >= xive->nr_ends) {
+    if (end_idx >= spapr_xive_nr_ends(xive)) {
         return -1;
     }
 
@@ -389,7 +394,7 @@ static int spapr_xive_write_end(XiveRouter *xrtr, uint8_t end_blk,
 {
     SpaprXive *xive = SPAPR_XIVE(xrtr);
 
-    if (end_idx >= xive->nr_ends) {
+    if (end_idx >= spapr_xive_nr_ends(xive)) {
         return -1;
     }
 
@@ -1138,7 +1143,7 @@ static target_ulong h_int_get_source_config(PowerPCCPU *cpu,
     /* EAS_END_BLOCK is unused on sPAPR */
     end_idx = xive_get_field64(EAS_END_INDEX, eas.w);
 
-    assert(end_idx < xive->nr_ends);
+    assert(end_idx < spapr_xive_nr_ends(xive));
     end = &xive->endt[end_idx];
 
     nvt_blk = xive_get_field32(END_W6_NVT_BLOCK, end->w6);
@@ -1216,7 +1221,7 @@ static target_ulong h_int_get_queue_info(PowerPCCPU *cpu,
         return H_P2;
     }
 
-    assert(end_idx < xive->nr_ends);
+    assert(end_idx < spapr_xive_nr_ends(xive));
     end = &xive->endt[end_idx];
 
     args[0] = xive->end_base + (1ull << (end_xsrc->esb_shift + 1)) * end_idx;
@@ -1304,7 +1309,7 @@ static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
         return H_P2;
     }
 
-    assert(end_idx < xive->nr_ends);
+    assert(end_idx < spapr_xive_nr_ends(xive));
     memcpy(&end, &xive->endt[end_idx], sizeof(XiveEND));
 
     switch (qsize) {
@@ -1470,7 +1475,7 @@ static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
         return H_P2;
     }
 
-    assert(end_idx < xive->nr_ends);
+    assert(end_idx < spapr_xive_nr_ends(xive));
     end = &xive->endt[end_idx];
 
     args[0] = 0;
diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index 66bf4c06fe55..1566016f0e28 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -531,7 +531,7 @@ static int kvmppc_xive_get_queues(SpaprXive *xive, Error **errp)
     int i;
     int ret;
 
-    for (i = 0; i < xive->nr_ends; i++) {
+    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
         if (!xive_end_is_valid(&xive->endt[i])) {
             continue;
         }
@@ -701,7 +701,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
     assert(xive->fd != -1);
 
     /* Restore the ENDT first. The targetting depends on it. */
-    for (i = 0; i < xive->nr_ends; i++) {
+    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
         if (!xive_end_is_valid(&xive->endt[i])) {
             continue;
         }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property
  2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
  2020-11-20 17:46 ` [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions Greg Kurz
  2020-11-20 17:46 ` [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends() Greg Kurz
@ 2020-11-20 17:46 ` Greg Kurz
  2020-11-23  3:52   ` David Gibson
  2020-11-23  9:56   ` Cédric Le Goater
  2020-11-20 17:46 ` [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property Greg Kurz
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-20 17:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, qemu-ppc, Cédric Le Goater, David Gibson

The sPAPR XIVE object has an "nr-ends" property that is used
to size the END table. This property is set by the machine
code to a value derived from spapr_max_server_number().

spapr_max_server_number() is also used to inform the KVM XIVE
device about the range of vCPU ids it might be exposed to,
in order to optimize resource allocation in the HW.

This is enough motivation to introduce an "nr-servers" property
and to use it for both purposes. The existing "nr-ends" property
is now longer used. It is kept around though because it is exposed
to -global. It will continue to be ignored as before without
causing QEMU to exit.

The associated nr_ends field cannot be dropped from SpaprXive
because it is explicitly used by vmstate_spapr_xive(). It is
thus renamed to nr_ends_vmstate.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 include/hw/ppc/spapr_xive.h | 16 +++++++++++++++-
 hw/intc/spapr_xive.c        | 28 ++++++++++++++++++++++------
 hw/ppc/spapr_irq.c          |  6 +-----
 3 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 4b967f13c030..7123ea07ed78 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -23,6 +23,16 @@
 typedef struct SpaprXive {
     XiveRouter    parent;
 
+    /*
+     * The XIVE device needs to know the highest vCPU id it might
+     * be exposed to in order to size the END table. It may also
+     * propagate the value to the KVM XIVE device in order to
+     * optimize resource allocation in the HW.
+     * This must be set to a non-null value using the "nr-servers"
+     * property, before realizing the device.
+     */
+    uint32_t      nr_servers;
+
     /* Internal interrupt source for IPIs and virtual devices */
     XiveSource    source;
     hwaddr        vc_base;
@@ -38,7 +48,11 @@ typedef struct SpaprXive {
     XiveEAS       *eat;
     uint32_t      nr_irqs;
     XiveEND       *endt;
-    uint32_t      nr_ends;
+    /*
+     * This is derived from nr_servers but it must be kept around because
+     * vmstate_spapr_xive uses it.
+     */
+    uint32_t      nr_ends_vmstate;
 
     /* TIMA mapping address */
     hwaddr        tm_base;
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index f473ad9cba47..e4dbf9c2c409 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -99,6 +99,12 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
     return 0;
 }
 
+/*
+ * 8 XIVE END structures per CPU. One for each available
+ * priority
+ */
+#define spapr_xive_cpu_end_idx(vcpu, prio) (((vcpu) << 3) + prio)
+
 static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
                                   uint8_t *out_end_blk, uint32_t *out_end_idx)
 {
@@ -109,7 +115,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
     }
 
     if (out_end_idx) {
-        *out_end_idx = (cpu->vcpu_id << 3) + prio;
+        *out_end_idx = spapr_xive_cpu_end_idx(cpu->vcpu_id, prio);
     }
 }
 
@@ -290,7 +296,8 @@ static void spapr_xive_instance_init(Object *obj)
 
 uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
 {
-    return xive->nr_ends;
+    g_assert(xive->nr_servers);
+    return spapr_xive_cpu_end_idx(xive->nr_servers, 0);
 }
 
 static void spapr_xive_realize(DeviceState *dev, Error **errp)
@@ -303,7 +310,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
 
     /* Set by spapr_irq_init() */
     g_assert(xive->nr_irqs);
-    g_assert(xive->nr_ends);
+    g_assert(xive->nr_servers);
 
     sxc->parent_realize(dev, &local_err);
     if (local_err) {
@@ -360,6 +367,8 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
     sysbus_mmio_map(SYS_BUS_DEVICE(xive), 0, xive->vc_base);
     sysbus_mmio_map(SYS_BUS_DEVICE(xive), 1, xive->end_base);
     sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
+
+    xive->nr_ends_vmstate = spapr_xive_nr_ends(xive);
 }
 
 static int spapr_xive_get_eas(XiveRouter *xrtr, uint8_t eas_blk,
@@ -547,7 +556,7 @@ static const VMStateDescription vmstate_spapr_xive = {
         VMSTATE_UINT32_EQUAL(nr_irqs, SpaprXive, NULL),
         VMSTATE_STRUCT_VARRAY_POINTER_UINT32(eat, SpaprXive, nr_irqs,
                                      vmstate_spapr_xive_eas, XiveEAS),
-        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends,
+        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends_vmstate,
                                              vmstate_spapr_xive_end, XiveEND),
         VMSTATE_END_OF_LIST()
     },
@@ -591,7 +600,14 @@ static void spapr_xive_free_irq(SpaprInterruptController *intc, int lisn)
 
 static Property spapr_xive_properties[] = {
     DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
-    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
+    /*
+     * "nr-ends" is deprecated by "nr-servers" since QEMU 6.0.
+     * It is just kept around because it is exposed to the user
+     * through -global and we don't want it to fail, even if
+     * the value is actually overridden internally.
+     */
+    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends_vmstate, 0),
+    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
     DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
     DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
     DEFINE_PROP_UINT8("hv-prio", SpaprXive, hv_prio, 7),
@@ -742,7 +758,7 @@ static int spapr_xive_activate(SpaprInterruptController *intc,
     SpaprXive *xive = SPAPR_XIVE(intc);
 
     if (kvm_enabled()) {
-        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, nr_servers,
+        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, xive->nr_servers,
                                     errp);
         if (rc < 0) {
             return rc;
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index f59960339ec3..8c5627225636 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -330,11 +330,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
 
         dev = qdev_new(TYPE_SPAPR_XIVE);
         qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
-        /*
-         * 8 XIVE END structures per CPU. One for each available
-         * priority
-         */
-        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
+        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
         object_property_set_link(OBJECT(dev), "xive-fabric", OBJECT(spapr),
                                  &error_abort);
         sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property
  2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
                   ` (2 preceding siblings ...)
  2020-11-20 17:46 ` [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property Greg Kurz
@ 2020-11-20 17:46 ` Greg Kurz
  2020-11-23  4:10   ` David Gibson
  2020-11-23 10:13   ` Cédric Le Goater
  2020-11-20 17:46 ` [PATCH for-6.0 5/8] spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect() Greg Kurz
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-20 17:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, qemu-ppc, Cédric Le Goater, David Gibson

The sPAPR XIVE device exposes a range of LISNs that the guest uses
for IPIs. This range is currently sized according to the highest
vCPU id, ie. spapr_max_server_number(), as obtained from the machine
through the "nr_servers" argument of the generic spapr_irq_dt() call.

This makes sense for the "ibm,interrupt-server-ranges" property of
sPAPR XICS, but certainly not for "ibm,xive-lisn-ranges". The range
should be sized to the maximum number of possible vCPUs. It happens
that spapr_max_server_number() == smp.max_cpus in practice with the
machine default settings. This might not be the case though when
VSMT is in use : we can end up with a much larger range (up to 8
times bigger) than needed and waste LISNs. But most importantly, this
lures people into thinking that IPI numbers are always equal to
vCPU ids, which is wrong and bit us recently:

https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg01476.html

Introduce an "nr-ipis" property that the machine sets to smp.max_cpus
before realizing the deice. Use that instead of "nr_servers" in
spapr_xive_dt(). Have the machine to claim the same number of IPIs
in spapr_irq_init().

This doesn't cause any guest visible change when using the machine
default settings (ie. VSMT == smp.threads).

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 include/hw/ppc/spapr_xive.h | 8 ++++++++
 hw/intc/spapr_xive.c        | 4 +++-
 hw/ppc/spapr_irq.c          | 4 +++-
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 7123ea07ed78..69b9fbc1b9a5 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -43,6 +43,14 @@ typedef struct SpaprXive {
 
     /* DT */
     gchar *nodename;
+    /*
+     * The sPAPR XIVE device needs to know how many vCPUs it
+     * might be exposed to in order to size the IPI range in
+     * "ibm,interrupt-server-ranges". It is the purpose of the
+     * "nr-ipis" property which *must* be set to a non-null
+     * value before realizing the sPAPR XIVE device.
+     */
+    uint32_t nr_ipis;
 
     /* Routing table */
     XiveEAS       *eat;
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index e4dbf9c2c409..d13a2955ce9b 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -311,6 +311,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
     /* Set by spapr_irq_init() */
     g_assert(xive->nr_irqs);
     g_assert(xive->nr_servers);
+    g_assert(xive->nr_ipis);
 
     sxc->parent_realize(dev, &local_err);
     if (local_err) {
@@ -608,6 +609,7 @@ static Property spapr_xive_properties[] = {
      */
     DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends_vmstate, 0),
     DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
+    DEFINE_PROP_UINT32("nr-ipis", SpaprXive, nr_ipis, 0),
     DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
     DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
     DEFINE_PROP_UINT8("hv-prio", SpaprXive, hv_prio, 7),
@@ -698,7 +700,7 @@ static void spapr_xive_dt(SpaprInterruptController *intc, uint32_t nr_servers,
     /* Interrupt number ranges for the IPIs */
     uint32_t lisn_ranges[] = {
         cpu_to_be32(SPAPR_IRQ_IPI),
-        cpu_to_be32(SPAPR_IRQ_IPI + nr_servers),
+        cpu_to_be32(SPAPR_IRQ_IPI + xive->nr_ipis),
     };
     /*
      * EQ size - the sizes of pages supported by the system 4K, 64K,
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 8c5627225636..a0fc474ecb06 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -325,12 +325,14 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
 
     if (spapr->irq->xive) {
         uint32_t nr_servers = spapr_max_server_number(spapr);
+        uint32_t max_cpus = MACHINE(spapr)->smp.max_cpus;
         DeviceState *dev;
         int i;
 
         dev = qdev_new(TYPE_SPAPR_XIVE);
         qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
         qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
+        qdev_prop_set_uint32(dev, "nr-ipis", max_cpus);
         object_property_set_link(OBJECT(dev), "xive-fabric", OBJECT(spapr),
                                  &error_abort);
         sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
@@ -338,7 +340,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
         spapr->xive = SPAPR_XIVE(dev);
 
         /* Enable the CPU IPIs */
-        for (i = 0; i < nr_servers; ++i) {
+        for (i = 0; i < max_cpus; ++i) {
             SpaprInterruptControllerClass *sicc
                 = SPAPR_INTC_GET_CLASS(spapr->xive);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH for-6.0 5/8] spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()
  2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
                   ` (3 preceding siblings ...)
  2020-11-20 17:46 ` [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property Greg Kurz
@ 2020-11-20 17:46 ` Greg Kurz
  2020-11-23  4:10   ` David Gibson
  2020-11-23 10:15   ` Cédric Le Goater
  2020-11-20 17:46 ` [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property Greg Kurz
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-20 17:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, qemu-ppc, Cédric Le Goater, David Gibson

Never used from the start.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 include/hw/ppc/xics_spapr.h | 2 +-
 hw/intc/xics_kvm.c          | 2 +-
 hw/ppc/spapr_irq.c          | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
index 0b8182e40b33..de752c0d2c7e 100644
--- a/include/hw/ppc/xics_spapr.h
+++ b/include/hw/ppc/xics_spapr.h
@@ -38,6 +38,6 @@ DECLARE_INSTANCE_CHECKER(ICSState, ICS_SPAPR,
 int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
                      Error **errp);
 void xics_kvm_disconnect(SpaprInterruptController *intc);
-bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr);
+bool xics_kvm_has_broken_disconnect(void);
 
 #endif /* XICS_SPAPR_H */
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 68bb1914b9bb..570d635bcc08 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -484,7 +484,7 @@ void xics_kvm_disconnect(SpaprInterruptController *intc)
  * support destruction of a KVM XICS device while the VM is running.
  * Required to start a spapr machine with ic-mode=dual,kernel-irqchip=on.
  */
-bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr)
+bool xics_kvm_has_broken_disconnect(void)
 {
     int rc;
 
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index a0fc474ecb06..2dacbecfd5fd 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -186,7 +186,7 @@ static int spapr_irq_check(SpaprMachineState *spapr, Error **errp)
     if (kvm_enabled() &&
         spapr->irq == &spapr_irq_dual &&
         kvm_kernel_irqchip_required() &&
-        xics_kvm_has_broken_disconnect(spapr)) {
+        xics_kvm_has_broken_disconnect()) {
         error_setg(errp,
             "KVM is incompatible with ic-mode=dual,kernel-irqchip=on");
         error_append_hint(errp,
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property
  2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
                   ` (4 preceding siblings ...)
  2020-11-20 17:46 ` [PATCH for-6.0 5/8] spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect() Greg Kurz
@ 2020-11-20 17:46 ` Greg Kurz
  2020-11-23  4:18   ` David Gibson
  2020-11-23 10:24   ` Cédric Le Goater
  2020-11-20 17:46 ` [PATCH for-6.0 7/8] spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation Greg Kurz
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-20 17:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, qemu-ppc, Cédric Le Goater, David Gibson

The sPAPR ICS device exposes the range of vCPU ids it can handle in
the "ibm,interrupt-server-ranges" FDT property. The highest vCPU
id, ie. spapr_max_server_number(), is obtained from the machine
through the "nr_servers" argument of the generic spapr_irq_dt() call.

We want to drop the "nr_servers" argument from the API because it
doesn't make sense for the sPAPR XIVE device actually.

On POWER9, we also pass the highest vCPU id to the KVM XICS-on-XIVE
device, in order to optimize resource allocation in the HW.

This is enough motivation to introduce an "nr-servers" property
and to use it for both purposes.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 include/hw/ppc/spapr.h      |  4 ++--
 include/hw/ppc/xics_spapr.h | 21 +++++++++++++++++---
 hw/intc/xics_kvm.c          |  2 +-
 hw/intc/xics_spapr.c        | 38 ++++++++++++++++++++++++-------------
 hw/ppc/spapr.c              |  5 +++--
 hw/ppc/spapr_irq.c          |  7 +++++--
 6 files changed, 54 insertions(+), 23 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 2e89e36cfbdc..b76c84a2f884 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -10,7 +10,7 @@
 #include "hw/ppc/spapr_irq.h"
 #include "qom/object.h"
 #include "hw/ppc/spapr_xive.h"  /* For SpaprXive */
-#include "hw/ppc/xics.h"        /* For ICSState */
+#include "hw/ppc/xics_spapr.h"  /* For IcsSpaprState */
 #include "hw/ppc/spapr_tpm_proxy.h"
 
 struct SpaprVioBus;
@@ -230,7 +230,7 @@ struct SpaprMachineState {
     SpaprIrq *irq;
     qemu_irq *qirqs;
     SpaprInterruptController *active_intc;
-    ICSState *ics;
+    IcsSpaprState *ics;
     SpaprXive *xive;
 
     bool cmd_line_caps[SPAPR_CAP_NUM];
diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
index de752c0d2c7e..1a483a873b62 100644
--- a/include/hw/ppc/xics_spapr.h
+++ b/include/hw/ppc/xics_spapr.h
@@ -28,12 +28,27 @@
 #define XICS_SPAPR_H
 
 #include "hw/ppc/spapr.h"
+#include "hw/ppc/xics.h"
 #include "qom/object.h"
 
+typedef struct IcsSpaprState {
+    /*< private >*/
+    ICPState parent_obj;
+
+    /*
+     * The ICS needs to know the upper limit to vCPU ids it
+     * might be exposed to in order to size the vCPU id range
+     * in "ibm,interrupt-server-ranges" and to optimize HW
+     * resource allocation when using the XICS-on-XIVE KVM
+     * device. It is the purpose of the "nr-servers" property
+     * which *must* be set to a non-null value before realizing
+     * the ICS.
+     */
+    uint32_t nr_servers;
+} IcsSpaprState;
+
 #define TYPE_ICS_SPAPR "ics-spapr"
-/* This is reusing the ICSState typedef from TYPE_ICS */
-DECLARE_INSTANCE_CHECKER(ICSState, ICS_SPAPR,
-                         TYPE_ICS_SPAPR)
+DECLARE_INSTANCE_CHECKER(IcsSpaprState, ICS_SPAPR, TYPE_ICS_SPAPR)
 
 int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
                      Error **errp);
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 570d635bcc08..ecbbda0e249b 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -350,7 +350,7 @@ void ics_kvm_set_irq(ICSState *ics, int srcno, int val)
 int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
                      Error **errp)
 {
-    ICSState *ics = ICS_SPAPR(intc);
+    ICSState *ics = ICS(intc);
     int rc;
     CPUState *cs;
     Error *local_err = NULL;
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 8ae4f41459c3..ce147e8980ed 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -34,6 +34,7 @@
 #include "hw/ppc/xics.h"
 #include "hw/ppc/xics_spapr.h"
 #include "hw/ppc/fdt.h"
+#include "hw/qdev-properties.h"
 #include "qapi/visitor.h"
 
 /*
@@ -154,7 +155,7 @@ static void rtas_set_xive(PowerPCCPU *cpu, SpaprMachineState *spapr,
                           uint32_t nargs, target_ulong args,
                           uint32_t nret, target_ulong rets)
 {
-    ICSState *ics = spapr->ics;
+    ICSState *ics = ICS(spapr->ics);
     uint32_t nr, srcno, server, priority;
 
     CHECK_EMULATED_XICS_RTAS(spapr, rets);
@@ -189,7 +190,7 @@ static void rtas_get_xive(PowerPCCPU *cpu, SpaprMachineState *spapr,
                           uint32_t nargs, target_ulong args,
                           uint32_t nret, target_ulong rets)
 {
-    ICSState *ics = spapr->ics;
+    ICSState *ics = ICS(spapr->ics);
     uint32_t nr, srcno;
 
     CHECK_EMULATED_XICS_RTAS(spapr, rets);
@@ -221,7 +222,7 @@ static void rtas_int_off(PowerPCCPU *cpu, SpaprMachineState *spapr,
                          uint32_t nargs, target_ulong args,
                          uint32_t nret, target_ulong rets)
 {
-    ICSState *ics = spapr->ics;
+    ICSState *ics = ICS(spapr->ics);
     uint32_t nr, srcno;
 
     CHECK_EMULATED_XICS_RTAS(spapr, rets);
@@ -254,7 +255,7 @@ static void rtas_int_on(PowerPCCPU *cpu, SpaprMachineState *spapr,
                         uint32_t nargs, target_ulong args,
                         uint32_t nret, target_ulong rets)
 {
-    ICSState *ics = spapr->ics;
+    ICSState *ics = ICS(spapr->ics);
     uint32_t nr, srcno;
 
     CHECK_EMULATED_XICS_RTAS(spapr, rets);
@@ -285,10 +286,13 @@ static void rtas_int_on(PowerPCCPU *cpu, SpaprMachineState *spapr,
 
 static void ics_spapr_realize(DeviceState *dev, Error **errp)
 {
-    ICSState *ics = ICS_SPAPR(dev);
-    ICSStateClass *icsc = ICS_GET_CLASS(ics);
+    IcsSpaprState *sics = ICS_SPAPR(dev);
+    ICSStateClass *icsc = ICS_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    /* Set by spapr_irq_init() */
+    g_assert(sics->nr_servers);
+
     icsc->parent_realize(dev, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
@@ -312,7 +316,7 @@ static void xics_spapr_dt(SpaprInterruptController *intc, uint32_t nr_servers,
                           void *fdt, uint32_t phandle)
 {
     uint32_t interrupt_server_ranges_prop[] = {
-        0, cpu_to_be32(nr_servers),
+        0, cpu_to_be32(ICS_SPAPR(intc)->nr_servers),
     };
     int node;
 
@@ -333,7 +337,7 @@ static void xics_spapr_dt(SpaprInterruptController *intc, uint32_t nr_servers,
 static int xics_spapr_cpu_intc_create(SpaprInterruptController *intc,
                                        PowerPCCPU *cpu, Error **errp)
 {
-    ICSState *ics = ICS_SPAPR(intc);
+    ICSState *ics = ICS(intc);
     Object *obj;
     SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
 
@@ -364,7 +368,7 @@ static void xics_spapr_cpu_intc_destroy(SpaprInterruptController *intc,
 static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
                                 bool lsi, Error **errp)
 {
-    ICSState *ics = ICS_SPAPR(intc);
+    ICSState *ics = ICS(intc);
 
     assert(ics);
     assert(ics_valid_irq(ics, irq));
@@ -380,7 +384,7 @@ static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
 
 static void xics_spapr_free_irq(SpaprInterruptController *intc, int irq)
 {
-    ICSState *ics = ICS_SPAPR(intc);
+    ICSState *ics = ICS(intc);
     uint32_t srcno = irq - ics->offset;
 
     assert(ics_valid_irq(ics, irq));
@@ -390,7 +394,7 @@ static void xics_spapr_free_irq(SpaprInterruptController *intc, int irq)
 
 static void xics_spapr_set_irq(SpaprInterruptController *intc, int irq, int val)
 {
-    ICSState *ics = ICS_SPAPR(intc);
+    ICSState *ics = ICS(intc);
     uint32_t srcno = irq - ics->offset;
 
     ics_set_irq(ics, srcno, val);
@@ -398,7 +402,7 @@ static void xics_spapr_set_irq(SpaprInterruptController *intc, int irq, int val)
 
 static void xics_spapr_print_info(SpaprInterruptController *intc, Monitor *mon)
 {
-    ICSState *ics = ICS_SPAPR(intc);
+    ICSState *ics = ICS(intc);
     CPUState *cs;
 
     CPU_FOREACH(cs) {
@@ -426,7 +430,8 @@ static int xics_spapr_activate(SpaprInterruptController *intc,
                                uint32_t nr_servers, Error **errp)
 {
     if (kvm_enabled()) {
-        return spapr_irq_init_kvm(xics_kvm_connect, intc, nr_servers, errp);
+        return spapr_irq_init_kvm(xics_kvm_connect, intc,
+                                  ICS_SPAPR(intc)->nr_servers, errp);
     }
     return 0;
 }
@@ -438,6 +443,11 @@ static void xics_spapr_deactivate(SpaprInterruptController *intc)
     }
 }
 
+static Property ics_spapr_properties[] = {
+    DEFINE_PROP_UINT32("nr-servers", IcsSpaprState, nr_servers, 0),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
 static void ics_spapr_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
@@ -446,6 +456,7 @@ static void ics_spapr_class_init(ObjectClass *klass, void *data)
 
     device_class_set_parent_realize(dc, ics_spapr_realize,
                                     &isc->parent_realize);
+    device_class_set_props(dc, ics_spapr_properties);
     sicc->activate = xics_spapr_activate;
     sicc->deactivate = xics_spapr_deactivate;
     sicc->cpu_intc_create = xics_spapr_cpu_intc_create;
@@ -462,6 +473,7 @@ static void ics_spapr_class_init(ObjectClass *klass, void *data)
 static const TypeInfo ics_spapr_info = {
     .name = TYPE_ICS_SPAPR,
     .parent = TYPE_ICS,
+    .instance_size = sizeof(IcsSpaprState),
     .class_init = ics_spapr_class_init,
     .interfaces = (InterfaceInfo[]) {
         { TYPE_SPAPR_INTC },
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 12a012d9dd09..21de0456446b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4218,15 +4218,16 @@ static void spapr_phb_placement(SpaprMachineState *spapr, uint32_t index,
 static ICSState *spapr_ics_get(XICSFabric *dev, int irq)
 {
     SpaprMachineState *spapr = SPAPR_MACHINE(dev);
+    ICSState *ics = ICS(spapr->ics);
 
-    return ics_valid_irq(spapr->ics, irq) ? spapr->ics : NULL;
+    return ics_valid_irq(ics, irq) ? ics : NULL;
 }
 
 static void spapr_ics_resend(XICSFabric *dev)
 {
     SpaprMachineState *spapr = SPAPR_MACHINE(dev);
 
-    ics_resend(spapr->ics);
+    ics_resend(ICS(spapr->ics));
 }
 
 static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 2dacbecfd5fd..be6f4041e433 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -316,6 +316,9 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
         object_property_set_link(obj, ICS_PROP_XICS, OBJECT(spapr),
                                  &error_abort);
         object_property_set_int(obj, "nr-irqs", smc->nr_xirqs, &error_abort);
+        object_property_set_uint(obj, "nr-servers",
+                                 spapr_max_server_number(spapr),
+                                 &error_abort);
         if (!qdev_realize(DEVICE(obj), NULL, errp)) {
             return;
         }
@@ -426,7 +429,7 @@ qemu_irq spapr_qirq(SpaprMachineState *spapr, int irq)
     assert(irq < (smc->nr_xirqs + SPAPR_XIRQ_BASE));
 
     if (spapr->ics) {
-        assert(ics_valid_irq(spapr->ics, irq));
+        assert(ics_valid_irq(ICS(spapr->ics), irq));
     }
     if (spapr->xive) {
         assert(irq < spapr->xive->nr_irqs);
@@ -556,7 +559,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
 
 int spapr_irq_find(SpaprMachineState *spapr, int num, bool align, Error **errp)
 {
-    ICSState *ics = spapr->ics;
+    ICSState *ics = ICS(spapr->ics);
     int first = -1;
 
     assert(ics);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH for-6.0 7/8] spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation
  2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
                   ` (5 preceding siblings ...)
  2020-11-20 17:46 ` [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property Greg Kurz
@ 2020-11-20 17:46 ` Greg Kurz
  2020-11-23  4:38   ` David Gibson
  2020-11-20 17:46 ` [PATCH for-6.0 8/8] spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation Greg Kurz
  2020-11-23  8:04 ` [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Cédric Le Goater
  8 siblings, 1 reply; 40+ messages in thread
From: Greg Kurz @ 2020-11-20 17:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, qemu-ppc, Cédric Le Goater, David Gibson

This argument isn't used by the backends anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 include/hw/ppc/spapr_irq.h | 3 +--
 hw/intc/spapr_xive.c       | 3 +--
 hw/intc/xics_spapr.c       | 3 +--
 hw/ppc/spapr_irq.c         | 3 +--
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index c22a72c9e270..3e1c619d4c06 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -43,8 +43,7 @@ DECLARE_CLASS_CHECKERS(SpaprInterruptControllerClass, SPAPR_INTC,
 struct SpaprInterruptControllerClass {
     InterfaceClass parent;
 
-    int (*activate)(SpaprInterruptController *intc, uint32_t nr_servers,
-                    Error **errp);
+    int (*activate)(SpaprInterruptController *intc, Error **errp);
     void (*deactivate)(SpaprInterruptController *intc);
 
     /*
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index d13a2955ce9b..8331026fdb12 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -754,8 +754,7 @@ static void spapr_xive_dt(SpaprInterruptController *intc, uint32_t nr_servers,
                      plat_res_int_priorities, sizeof(plat_res_int_priorities)));
 }
 
-static int spapr_xive_activate(SpaprInterruptController *intc,
-                               uint32_t nr_servers, Error **errp)
+static int spapr_xive_activate(SpaprInterruptController *intc, Error **errp)
 {
     SpaprXive *xive = SPAPR_XIVE(intc);
 
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index ce147e8980ed..8810bd93c856 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -426,8 +426,7 @@ static int xics_spapr_post_load(SpaprInterruptController *intc, int version_id)
     return 0;
 }
 
-static int xics_spapr_activate(SpaprInterruptController *intc,
-                               uint32_t nr_servers, Error **errp)
+static int xics_spapr_activate(SpaprInterruptController *intc, Error **errp)
 {
     if (kvm_enabled()) {
         return spapr_irq_init_kvm(xics_kvm_connect, intc,
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index be6f4041e433..f2897fbc942a 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -480,7 +480,6 @@ static void set_active_intc(SpaprMachineState *spapr,
                             SpaprInterruptController *new_intc)
 {
     SpaprInterruptControllerClass *sicc;
-    uint32_t nr_servers = spapr_max_server_number(spapr);
 
     assert(new_intc);
 
@@ -498,7 +497,7 @@ static void set_active_intc(SpaprMachineState *spapr,
 
     sicc = SPAPR_INTC_GET_CLASS(new_intc);
     if (sicc->activate) {
-        sicc->activate(new_intc, nr_servers, &error_fatal);
+        sicc->activate(new_intc, &error_fatal);
     }
 
     spapr->active_intc = new_intc;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH for-6.0 8/8] spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation
  2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
                   ` (6 preceding siblings ...)
  2020-11-20 17:46 ` [PATCH for-6.0 7/8] spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation Greg Kurz
@ 2020-11-20 17:46 ` Greg Kurz
  2020-11-23  8:04 ` [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Cédric Le Goater
  8 siblings, 0 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-20 17:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Greg Kurz, qemu-ppc, Cédric Le Goater, David Gibson

This argument isn't used by the backends anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 include/hw/ppc/spapr_irq.h | 6 ++----
 hw/intc/spapr_xive.c       | 4 ++--
 hw/intc/xics_spapr.c       | 4 ++--
 hw/ppc/spapr.c             | 2 +-
 hw/ppc/spapr_irq.c         | 5 ++---
 5 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 3e1c619d4c06..e2519e4bc596 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -61,8 +61,7 @@ struct SpaprInterruptControllerClass {
     /* These methods should only be called on the active intc */
     void (*set_irq)(SpaprInterruptController *intc, int irq, int val);
     void (*print_info)(SpaprInterruptController *intc, Monitor *mon);
-    void (*dt)(SpaprInterruptController *intc, uint32_t nr_servers,
-               void *fdt, uint32_t phandle);
+    void (*dt)(SpaprInterruptController *intc, void *fdt, uint32_t phandle);
     int (*post_load)(SpaprInterruptController *intc, int version_id);
 };
 
@@ -73,8 +72,7 @@ int spapr_irq_cpu_intc_create(struct SpaprMachineState *spapr,
 void spapr_irq_cpu_intc_reset(struct SpaprMachineState *spapr, PowerPCCPU *cpu);
 void spapr_irq_cpu_intc_destroy(struct SpaprMachineState *spapr, PowerPCCPU *cpu);
 void spapr_irq_print_info(struct SpaprMachineState *spapr, Monitor *mon);
-void spapr_irq_dt(struct SpaprMachineState *spapr, uint32_t nr_servers,
-                  void *fdt, uint32_t phandle);
+void spapr_irq_dt(struct SpaprMachineState *spapr, void *fdt, uint32_t phandle);
 
 uint32_t spapr_irq_nr_msis(struct SpaprMachineState *spapr);
 int spapr_irq_msi_alloc(struct SpaprMachineState *spapr, uint32_t num, bool align,
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 8331026fdb12..749cff9bf2b9 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -691,8 +691,8 @@ static void spapr_xive_print_info(SpaprInterruptController *intc, Monitor *mon)
     spapr_xive_pic_print_info(xive, mon);
 }
 
-static void spapr_xive_dt(SpaprInterruptController *intc, uint32_t nr_servers,
-                          void *fdt, uint32_t phandle)
+static void spapr_xive_dt(SpaprInterruptController *intc, void *fdt,
+                          uint32_t phandle)
 {
     SpaprXive *xive = SPAPR_XIVE(intc);
     int node;
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 8810bd93c856..a790b59f1bbc 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -312,8 +312,8 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
     spapr_register_hypercall(H_IPOLL, h_ipoll);
 }
 
-static void xics_spapr_dt(SpaprInterruptController *intc, uint32_t nr_servers,
-                          void *fdt, uint32_t phandle)
+static void xics_spapr_dt(SpaprInterruptController *intc, void *fdt,
+                          uint32_t phandle)
 {
     uint32_t interrupt_server_ranges_prop[] = {
         0, cpu_to_be32(ICS_SPAPR(intc)->nr_servers),
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 21de0456446b..595dd1b81ce1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1164,7 +1164,7 @@ void *spapr_build_fdt(SpaprMachineState *spapr, bool reset, size_t space)
     _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
     /* /interrupt controller */
-    spapr_irq_dt(spapr, spapr_max_server_number(spapr), fdt, PHANDLE_INTC);
+    spapr_irq_dt(spapr, fdt, PHANDLE_INTC);
 
     ret = spapr_dt_memory(spapr, fdt);
     if (ret < 0) {
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index f2897fbc942a..f93476d00a59 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -271,13 +271,12 @@ void spapr_irq_print_info(SpaprMachineState *spapr, Monitor *mon)
     sicc->print_info(spapr->active_intc, mon);
 }
 
-void spapr_irq_dt(SpaprMachineState *spapr, uint32_t nr_servers,
-                  void *fdt, uint32_t phandle)
+void spapr_irq_dt(SpaprMachineState *spapr, void *fdt, uint32_t phandle)
 {
     SpaprInterruptControllerClass *sicc
         = SPAPR_INTC_GET_CLASS(spapr->active_intc);
 
-    sicc->dt(spapr->active_intc, nr_servers, fdt, phandle);
+    sicc->dt(spapr->active_intc, fdt, phandle);
 }
 
 uint32_t spapr_irq_nr_msis(SpaprMachineState *spapr)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions
  2020-11-20 17:46 ` [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions Greg Kurz
@ 2020-11-23  3:33   ` David Gibson
  2020-11-23  8:09   ` Cédric Le Goater
  1 sibling, 0 replies; 40+ messages in thread
From: David Gibson @ 2020-11-23  3:33 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 1759 bytes --]

On Fri, Nov 20, 2020 at 06:46:39PM +0100, Greg Kurz wrote:
> The sPAPR XIVE device is created by the machine in spapr_irq_init().
> The latter overrides any value provided by the user with -global for
> the "nr-irqs" and "nr-ends" properties with strictly positive values.
> 
> It seems reasonable to assume these properties should never be 0,
> which wouldn't make much sense by the way.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Applied to ppc-for-6.0.

> ---
>  hw/intc/spapr_xive.c | 14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 1fa09f287ac0..60e0d5769dcc 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -296,22 +296,16 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>      XiveENDSource *end_xsrc = &xive->end_source;
>      Error *local_err = NULL;
>  
> +    /* Set by spapr_irq_init() */
> +    g_assert(xive->nr_irqs);
> +    g_assert(xive->nr_ends);
> +
>      sxc->parent_realize(dev, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
>      }
>  
> -    if (!xive->nr_irqs) {
> -        error_setg(errp, "Number of interrupt needs to be greater 0");
> -        return;
> -    }
> -
> -    if (!xive->nr_ends) {
> -        error_setg(errp, "Number of interrupt needs to be greater 0");
> -        return;
> -    }
> -
>      /*
>       * Initialize the internal sources, for IPIs and virtual devices.
>       */

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-20 17:46 ` [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends() Greg Kurz
@ 2020-11-23  3:33   ` David Gibson
  2020-11-25 22:43     ` Greg Kurz
  2020-11-23  9:46   ` Cédric Le Goater
  1 sibling, 1 reply; 40+ messages in thread
From: David Gibson @ 2020-11-23  3:33 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 5589 bytes --]

On Fri, Nov 20, 2020 at 06:46:40PM +0100, Greg Kurz wrote:
> We're going to kill the "nr_ends" field in a subsequent patch.
> Prepare ground by using an helper instead of peeking into
> the sPAPR XIVE structure directly.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Applied to ppc-for-6.0, thanks.


> ---
>  include/hw/ppc/spapr_xive.h |  1 +
>  hw/intc/spapr_xive.c        | 23 ++++++++++++++---------
>  hw/intc/spapr_xive_kvm.c    |  4 ++--
>  3 files changed, 17 insertions(+), 11 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 26c8d90d7196..4b967f13c030 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -75,6 +75,7 @@ void spapr_xive_map_mmio(SpaprXive *xive);
>  
>  int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
>                               uint32_t *out_server, uint8_t *out_prio);
> +uint32_t spapr_xive_nr_ends(const SpaprXive *xive);
>  
>  /*
>   * KVM XIVE device helpers
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 60e0d5769dcc..f473ad9cba47 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -192,7 +192,7 @@ void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon)
>              uint32_t end_idx = xive_get_field64(EAS_END_INDEX, eas->w);
>              XiveEND *end;
>  
> -            assert(end_idx < xive->nr_ends);
> +            assert(end_idx < spapr_xive_nr_ends(xive));
>              end = &xive->endt[end_idx];
>  
>              if (xive_end_is_valid(end)) {
> @@ -270,7 +270,7 @@ static void spapr_xive_reset(void *dev)
>      }
>  
>      /* Clear all ENDs */
> -    for (i = 0; i < xive->nr_ends; i++) {
> +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
>          spapr_xive_end_reset(&xive->endt[i]);
>      }
>  }
> @@ -288,6 +288,11 @@ static void spapr_xive_instance_init(Object *obj)
>      xive->fd = -1;
>  }
>  
> +uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
> +{
> +    return xive->nr_ends;
> +}
> +
>  static void spapr_xive_realize(DeviceState *dev, Error **errp)
>  {
>      SpaprXive *xive = SPAPR_XIVE(dev);
> @@ -336,7 +341,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>       * Allocate the routing tables
>       */
>      xive->eat = g_new0(XiveEAS, xive->nr_irqs);
> -    xive->endt = g_new0(XiveEND, xive->nr_ends);
> +    xive->endt = g_new0(XiveEND, spapr_xive_nr_ends(xive));
>  
>      xive->nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
>                             xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
> @@ -375,7 +380,7 @@ static int spapr_xive_get_end(XiveRouter *xrtr,
>  {
>      SpaprXive *xive = SPAPR_XIVE(xrtr);
>  
> -    if (end_idx >= xive->nr_ends) {
> +    if (end_idx >= spapr_xive_nr_ends(xive)) {
>          return -1;
>      }
>  
> @@ -389,7 +394,7 @@ static int spapr_xive_write_end(XiveRouter *xrtr, uint8_t end_blk,
>  {
>      SpaprXive *xive = SPAPR_XIVE(xrtr);
>  
> -    if (end_idx >= xive->nr_ends) {
> +    if (end_idx >= spapr_xive_nr_ends(xive)) {
>          return -1;
>      }
>  
> @@ -1138,7 +1143,7 @@ static target_ulong h_int_get_source_config(PowerPCCPU *cpu,
>      /* EAS_END_BLOCK is unused on sPAPR */
>      end_idx = xive_get_field64(EAS_END_INDEX, eas.w);
>  
> -    assert(end_idx < xive->nr_ends);
> +    assert(end_idx < spapr_xive_nr_ends(xive));
>      end = &xive->endt[end_idx];
>  
>      nvt_blk = xive_get_field32(END_W6_NVT_BLOCK, end->w6);
> @@ -1216,7 +1221,7 @@ static target_ulong h_int_get_queue_info(PowerPCCPU *cpu,
>          return H_P2;
>      }
>  
> -    assert(end_idx < xive->nr_ends);
> +    assert(end_idx < spapr_xive_nr_ends(xive));
>      end = &xive->endt[end_idx];
>  
>      args[0] = xive->end_base + (1ull << (end_xsrc->esb_shift + 1)) * end_idx;
> @@ -1304,7 +1309,7 @@ static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
>          return H_P2;
>      }
>  
> -    assert(end_idx < xive->nr_ends);
> +    assert(end_idx < spapr_xive_nr_ends(xive));
>      memcpy(&end, &xive->endt[end_idx], sizeof(XiveEND));
>  
>      switch (qsize) {
> @@ -1470,7 +1475,7 @@ static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
>          return H_P2;
>      }
>  
> -    assert(end_idx < xive->nr_ends);
> +    assert(end_idx < spapr_xive_nr_ends(xive));
>      end = &xive->endt[end_idx];
>  
>      args[0] = 0;
> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> index 66bf4c06fe55..1566016f0e28 100644
> --- a/hw/intc/spapr_xive_kvm.c
> +++ b/hw/intc/spapr_xive_kvm.c
> @@ -531,7 +531,7 @@ static int kvmppc_xive_get_queues(SpaprXive *xive, Error **errp)
>      int i;
>      int ret;
>  
> -    for (i = 0; i < xive->nr_ends; i++) {
> +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
>          if (!xive_end_is_valid(&xive->endt[i])) {
>              continue;
>          }
> @@ -701,7 +701,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
>      assert(xive->fd != -1);
>  
>      /* Restore the ENDT first. The targetting depends on it. */
> -    for (i = 0; i < xive->nr_ends; i++) {
> +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
>          if (!xive_end_is_valid(&xive->endt[i])) {
>              continue;
>          }

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property
  2020-11-20 17:46 ` [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property Greg Kurz
@ 2020-11-23  3:52   ` David Gibson
  2020-11-23  9:20     ` Greg Kurz
  2020-11-23  9:56   ` Cédric Le Goater
  1 sibling, 1 reply; 40+ messages in thread
From: David Gibson @ 2020-11-23  3:52 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 7779 bytes --]

On Fri, Nov 20, 2020 at 06:46:41PM +0100, Greg Kurz wrote:
> The sPAPR XIVE object has an "nr-ends" property that is used
> to size the END table. This property is set by the machine
> code to a value derived from spapr_max_server_number().
> 
> spapr_max_server_number() is also used to inform the KVM XIVE
> device about the range of vCPU ids it might be exposed to,
> in order to optimize resource allocation in the HW.
> 
> This is enough motivation to introduce an "nr-servers" property
> and to use it for both purposes. The existing "nr-ends" property
> is now longer used. It is kept around though because it is exposed
> to -global. It will continue to be ignored as before without
> causing QEMU to exit.

I'm a little dubious about keeping the property around.  It's
technically a breaking change to remove it, but since IIUC, it's
*never* had any effect, it seems implausible anyone out there's using
it.

Can we at least put it straight into the deprecation document?

> The associated nr_ends field cannot be dropped from SpaprXive
> because it is explicitly used by vmstate_spapr_xive(). It is
> thus renamed to nr_ends_vmstate.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  include/hw/ppc/spapr_xive.h | 16 +++++++++++++++-
>  hw/intc/spapr_xive.c        | 28 ++++++++++++++++++++++------
>  hw/ppc/spapr_irq.c          |  6 +-----
>  3 files changed, 38 insertions(+), 12 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 4b967f13c030..7123ea07ed78 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -23,6 +23,16 @@
>  typedef struct SpaprXive {
>      XiveRouter    parent;
>  
> +    /*
> +     * The XIVE device needs to know the highest vCPU id it might
> +     * be exposed to in order to size the END table. It may also
> +     * propagate the value to the KVM XIVE device in order to
> +     * optimize resource allocation in the HW.
> +     * This must be set to a non-null value using the "nr-servers"
> +     * property, before realizing the device.
> +     */
> +    uint32_t      nr_servers;
> +
>      /* Internal interrupt source for IPIs and virtual devices */
>      XiveSource    source;
>      hwaddr        vc_base;
> @@ -38,7 +48,11 @@ typedef struct SpaprXive {
>      XiveEAS       *eat;
>      uint32_t      nr_irqs;
>      XiveEND       *endt;
> -    uint32_t      nr_ends;
> +    /*
> +     * This is derived from nr_servers but it must be kept around because
> +     * vmstate_spapr_xive uses it.
> +     */
> +    uint32_t      nr_ends_vmstate;
>  
>      /* TIMA mapping address */
>      hwaddr        tm_base;
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index f473ad9cba47..e4dbf9c2c409 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -99,6 +99,12 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
>      return 0;
>  }
>  
> +/*
> + * 8 XIVE END structures per CPU. One for each available
> + * priority
> + */
> +#define spapr_xive_cpu_end_idx(vcpu, prio) (((vcpu) << 3) + prio)
> +
>  static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
>                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
>  {
> @@ -109,7 +115,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
>      }
>  
>      if (out_end_idx) {
> -        *out_end_idx = (cpu->vcpu_id << 3) + prio;
> +        *out_end_idx = spapr_xive_cpu_end_idx(cpu->vcpu_id, prio);
>      }
>  }
>  
> @@ -290,7 +296,8 @@ static void spapr_xive_instance_init(Object *obj)
>  
>  uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
>  {
> -    return xive->nr_ends;
> +    g_assert(xive->nr_servers);
> +    return spapr_xive_cpu_end_idx(xive->nr_servers, 0);
>  }
>  
>  static void spapr_xive_realize(DeviceState *dev, Error **errp)
> @@ -303,7 +310,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>  
>      /* Set by spapr_irq_init() */
>      g_assert(xive->nr_irqs);
> -    g_assert(xive->nr_ends);
> +    g_assert(xive->nr_servers);
>  
>      sxc->parent_realize(dev, &local_err);
>      if (local_err) {
> @@ -360,6 +367,8 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 0, xive->vc_base);
>      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 1, xive->end_base);
>      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
> +
> +    xive->nr_ends_vmstate = spapr_xive_nr_ends(xive);
>  }
>  
>  static int spapr_xive_get_eas(XiveRouter *xrtr, uint8_t eas_blk,
> @@ -547,7 +556,7 @@ static const VMStateDescription vmstate_spapr_xive = {
>          VMSTATE_UINT32_EQUAL(nr_irqs, SpaprXive, NULL),
>          VMSTATE_STRUCT_VARRAY_POINTER_UINT32(eat, SpaprXive, nr_irqs,
>                                       vmstate_spapr_xive_eas, XiveEAS),
> -        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends,
> +        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends_vmstate,
>                                               vmstate_spapr_xive_end, XiveEND),
>          VMSTATE_END_OF_LIST()
>      },
> @@ -591,7 +600,14 @@ static void spapr_xive_free_irq(SpaprInterruptController *intc, int lisn)
>  
>  static Property spapr_xive_properties[] = {
>      DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
> -    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
> +    /*
> +     * "nr-ends" is deprecated by "nr-servers" since QEMU 6.0.
> +     * It is just kept around because it is exposed to the user
> +     * through -global and we don't want it to fail, even if
> +     * the value is actually overridden internally.
> +     */
> +    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends_vmstate, 0),
> +    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
>      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
>      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
>      DEFINE_PROP_UINT8("hv-prio", SpaprXive, hv_prio, 7),
> @@ -742,7 +758,7 @@ static int spapr_xive_activate(SpaprInterruptController *intc,
>      SpaprXive *xive = SPAPR_XIVE(intc);
>  
>      if (kvm_enabled()) {
> -        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, nr_servers,
> +        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, xive->nr_servers,

Hmm.  So we're now ignoring the 'nr_servers' parameter to this
function, which doesn't seem right.  Should we be assert()ing that
it's equal to xive->nr_servers?

>                                      errp);
>          if (rc < 0) {
>              return rc;
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index f59960339ec3..8c5627225636 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -330,11 +330,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>  
>          dev = qdev_new(TYPE_SPAPR_XIVE);
>          qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
> -        /*
> -         * 8 XIVE END structures per CPU. One for each available
> -         * priority
> -         */
> -        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> +        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
>          object_property_set_link(OBJECT(dev), "xive-fabric", OBJECT(spapr),
>                                   &error_abort);
>          sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property
  2020-11-20 17:46 ` [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property Greg Kurz
@ 2020-11-23  4:10   ` David Gibson
  2020-11-23 10:13   ` Cédric Le Goater
  1 sibling, 0 replies; 40+ messages in thread
From: David Gibson @ 2020-11-23  4:10 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 5289 bytes --]

On Fri, Nov 20, 2020 at 06:46:42PM +0100, Greg Kurz wrote:
> The sPAPR XIVE device exposes a range of LISNs that the guest uses
> for IPIs. This range is currently sized according to the highest
> vCPU id, ie. spapr_max_server_number(), as obtained from the machine
> through the "nr_servers" argument of the generic spapr_irq_dt() call.
> 
> This makes sense for the "ibm,interrupt-server-ranges" property of
> sPAPR XICS, but certainly not for "ibm,xive-lisn-ranges". The range
> should be sized to the maximum number of possible vCPUs. It happens
> that spapr_max_server_number() == smp.max_cpus in practice with the
> machine default settings. This might not be the case though when
> VSMT is in use : we can end up with a much larger range (up to 8
> times bigger) than needed and waste LISNs. But most importantly, this
> lures people into thinking that IPI numbers are always equal to
> vCPU ids, which is wrong and bit us recently:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg01476.html
> 
> Introduce an "nr-ipis" property that the machine sets to smp.max_cpus
> before realizing the deice. Use that instead of "nr_servers" in
> spapr_xive_dt(). Have the machine to claim the same number of IPIs
> in spapr_irq_init().
> 
> This doesn't cause any guest visible change when using the machine
> default settings (ie. VSMT == smp.threads).
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  include/hw/ppc/spapr_xive.h | 8 ++++++++
>  hw/intc/spapr_xive.c        | 4 +++-
>  hw/ppc/spapr_irq.c          | 4 +++-
>  3 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 7123ea07ed78..69b9fbc1b9a5 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -43,6 +43,14 @@ typedef struct SpaprXive {
>  
>      /* DT */
>      gchar *nodename;
> +    /*
> +     * The sPAPR XIVE device needs to know how many vCPUs it
> +     * might be exposed to in order to size the IPI range in
> +     * "ibm,interrupt-server-ranges". It is the purpose of the
> +     * "nr-ipis" property which *must* be set to a non-null
> +     * value before realizing the sPAPR XIVE device.
> +     */
> +    uint32_t nr_ipis;
>  
>      /* Routing table */
>      XiveEAS       *eat;
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index e4dbf9c2c409..d13a2955ce9b 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -311,6 +311,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>      /* Set by spapr_irq_init() */
>      g_assert(xive->nr_irqs);
>      g_assert(xive->nr_servers);
> +    g_assert(xive->nr_ipis);
>  
>      sxc->parent_realize(dev, &local_err);
>      if (local_err) {
> @@ -608,6 +609,7 @@ static Property spapr_xive_properties[] = {
>       */
>      DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends_vmstate, 0),
>      DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
> +    DEFINE_PROP_UINT32("nr-ipis", SpaprXive, nr_ipis, 0),
>      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
>      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
>      DEFINE_PROP_UINT8("hv-prio", SpaprXive, hv_prio, 7),
> @@ -698,7 +700,7 @@ static void spapr_xive_dt(SpaprInterruptController *intc, uint32_t nr_servers,
>      /* Interrupt number ranges for the IPIs */
>      uint32_t lisn_ranges[] = {
>          cpu_to_be32(SPAPR_IRQ_IPI),
> -        cpu_to_be32(SPAPR_IRQ_IPI + nr_servers),
> +        cpu_to_be32(SPAPR_IRQ_IPI + xive->nr_ipis),
>      };
>      /*
>       * EQ size - the sizes of pages supported by the system 4K, 64K,
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 8c5627225636..a0fc474ecb06 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -325,12 +325,14 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>  
>      if (spapr->irq->xive) {
>          uint32_t nr_servers = spapr_max_server_number(spapr);
> +        uint32_t max_cpus = MACHINE(spapr)->smp.max_cpus;
>          DeviceState *dev;
>          int i;
>  
>          dev = qdev_new(TYPE_SPAPR_XIVE);
>          qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
>          qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
> +        qdev_prop_set_uint32(dev, "nr-ipis", max_cpus);
>          object_property_set_link(OBJECT(dev), "xive-fabric", OBJECT(spapr),
>                                   &error_abort);
>          sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> @@ -338,7 +340,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>          spapr->xive = SPAPR_XIVE(dev);
>  
>          /* Enable the CPU IPIs */
> -        for (i = 0; i < nr_servers; ++i) {
> +        for (i = 0; i < max_cpus; ++i) {
>              SpaprInterruptControllerClass *sicc
>                  = SPAPR_INTC_GET_CLASS(spapr->xive);
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 5/8] spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()
  2020-11-20 17:46 ` [PATCH for-6.0 5/8] spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect() Greg Kurz
@ 2020-11-23  4:10   ` David Gibson
  2020-11-23 10:15   ` Cédric Le Goater
  1 sibling, 0 replies; 40+ messages in thread
From: David Gibson @ 2020-11-23  4:10 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 2279 bytes --]

On Fri, Nov 20, 2020 at 06:46:43PM +0100, Greg Kurz wrote:
> Never used from the start.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Applied to ppc-for-6.0, thanks.

> ---
>  include/hw/ppc/xics_spapr.h | 2 +-
>  hw/intc/xics_kvm.c          | 2 +-
>  hw/ppc/spapr_irq.c          | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
> index 0b8182e40b33..de752c0d2c7e 100644
> --- a/include/hw/ppc/xics_spapr.h
> +++ b/include/hw/ppc/xics_spapr.h
> @@ -38,6 +38,6 @@ DECLARE_INSTANCE_CHECKER(ICSState, ICS_SPAPR,
>  int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
>                       Error **errp);
>  void xics_kvm_disconnect(SpaprInterruptController *intc);
> -bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr);
> +bool xics_kvm_has_broken_disconnect(void);
>  
>  #endif /* XICS_SPAPR_H */
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index 68bb1914b9bb..570d635bcc08 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -484,7 +484,7 @@ void xics_kvm_disconnect(SpaprInterruptController *intc)
>   * support destruction of a KVM XICS device while the VM is running.
>   * Required to start a spapr machine with ic-mode=dual,kernel-irqchip=on.
>   */
> -bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr)
> +bool xics_kvm_has_broken_disconnect(void)
>  {
>      int rc;
>  
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index a0fc474ecb06..2dacbecfd5fd 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -186,7 +186,7 @@ static int spapr_irq_check(SpaprMachineState *spapr, Error **errp)
>      if (kvm_enabled() &&
>          spapr->irq == &spapr_irq_dual &&
>          kvm_kernel_irqchip_required() &&
> -        xics_kvm_has_broken_disconnect(spapr)) {
> +        xics_kvm_has_broken_disconnect()) {
>          error_setg(errp,
>              "KVM is incompatible with ic-mode=dual,kernel-irqchip=on");
>          error_append_hint(errp,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property
  2020-11-20 17:46 ` [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property Greg Kurz
@ 2020-11-23  4:18   ` David Gibson
  2020-11-23  9:39     ` Greg Kurz
  2020-11-23 10:24   ` Cédric Le Goater
  1 sibling, 1 reply; 40+ messages in thread
From: David Gibson @ 2020-11-23  4:18 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 12848 bytes --]

On Fri, Nov 20, 2020 at 06:46:44PM +0100, Greg Kurz wrote:
> The sPAPR ICS device exposes the range of vCPU ids it can handle in
> the "ibm,interrupt-server-ranges" FDT property. The highest vCPU
> id, ie. spapr_max_server_number(), is obtained from the machine
> through the "nr_servers" argument of the generic spapr_irq_dt() call.
> 
> We want to drop the "nr_servers" argument from the API because it
> doesn't make sense for the sPAPR XIVE device actually.
> 
> On POWER9, we also pass the highest vCPU id to the KVM XICS-on-XIVE
> device, in order to optimize resource allocation in the HW.
> 
> This is enough motivation to introduce an "nr-servers" property
> and to use it for both purposes.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  include/hw/ppc/spapr.h      |  4 ++--
>  include/hw/ppc/xics_spapr.h | 21 +++++++++++++++++---
>  hw/intc/xics_kvm.c          |  2 +-
>  hw/intc/xics_spapr.c        | 38 ++++++++++++++++++++++++-------------
>  hw/ppc/spapr.c              |  5 +++--
>  hw/ppc/spapr_irq.c          |  7 +++++--
>  6 files changed, 54 insertions(+), 23 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 2e89e36cfbdc..b76c84a2f884 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -10,7 +10,7 @@
>  #include "hw/ppc/spapr_irq.h"
>  #include "qom/object.h"
>  #include "hw/ppc/spapr_xive.h"  /* For SpaprXive */
> -#include "hw/ppc/xics.h"        /* For ICSState */
> +#include "hw/ppc/xics_spapr.h"  /* For IcsSpaprState */
>  #include "hw/ppc/spapr_tpm_proxy.h"
>  
>  struct SpaprVioBus;
> @@ -230,7 +230,7 @@ struct SpaprMachineState {
>      SpaprIrq *irq;
>      qemu_irq *qirqs;
>      SpaprInterruptController *active_intc;
> -    ICSState *ics;
> +    IcsSpaprState *ics;
>      SpaprXive *xive;
>  
>      bool cmd_line_caps[SPAPR_CAP_NUM];
> diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
> index de752c0d2c7e..1a483a873b62 100644
> --- a/include/hw/ppc/xics_spapr.h
> +++ b/include/hw/ppc/xics_spapr.h
> @@ -28,12 +28,27 @@
>  #define XICS_SPAPR_H
>  
>  #include "hw/ppc/spapr.h"
> +#include "hw/ppc/xics.h"
>  #include "qom/object.h"
>  
> +typedef struct IcsSpaprState {
> +    /*< private >*/
> +    ICPState parent_obj;

It's called "*Ics*SpaprState" and it's replacing an ICSState, but it's
parent object is an *ICP*State - that doesn't seem right.

> +
> +    /*
> +     * The ICS needs to know the upper limit to vCPU ids it
> +     * might be exposed to in order to size the vCPU id range
> +     * in "ibm,interrupt-server-ranges" and to optimize HW
> +     * resource allocation when using the XICS-on-XIVE KVM
> +     * device. It is the purpose of the "nr-servers" property
> +     * which *must* be set to a non-null value before realizing
> +     * the ICS.
> +     */
> +    uint32_t nr_servers;

That said, I'm a but dubious about attaching the number of servers to
the ICS side, rather than the ICP side, since server numbers is
basically an ICP concept.  In fact... things about the overall
topology are usually done via the XicsFabric, so I'm wondering if we
should add a callback there for nr_servers.

> +} IcsSpaprState;
> +
>  #define TYPE_ICS_SPAPR "ics-spapr"
> -/* This is reusing the ICSState typedef from TYPE_ICS */
> -DECLARE_INSTANCE_CHECKER(ICSState, ICS_SPAPR,
> -                         TYPE_ICS_SPAPR)
> +DECLARE_INSTANCE_CHECKER(IcsSpaprState, ICS_SPAPR, TYPE_ICS_SPAPR)
>  
>  int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
>                       Error **errp);
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index 570d635bcc08..ecbbda0e249b 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -350,7 +350,7 @@ void ics_kvm_set_irq(ICSState *ics, int srcno, int val)
>  int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
>                       Error **errp)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      int rc;
>      CPUState *cs;
>      Error *local_err = NULL;
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index 8ae4f41459c3..ce147e8980ed 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -34,6 +34,7 @@
>  #include "hw/ppc/xics.h"
>  #include "hw/ppc/xics_spapr.h"
>  #include "hw/ppc/fdt.h"
> +#include "hw/qdev-properties.h"
>  #include "qapi/visitor.h"
>  
>  /*
> @@ -154,7 +155,7 @@ static void rtas_set_xive(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                            uint32_t nargs, target_ulong args,
>                            uint32_t nret, target_ulong rets)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      uint32_t nr, srcno, server, priority;
>  
>      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> @@ -189,7 +190,7 @@ static void rtas_get_xive(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                            uint32_t nargs, target_ulong args,
>                            uint32_t nret, target_ulong rets)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      uint32_t nr, srcno;
>  
>      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> @@ -221,7 +222,7 @@ static void rtas_int_off(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                           uint32_t nargs, target_ulong args,
>                           uint32_t nret, target_ulong rets)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      uint32_t nr, srcno;
>  
>      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> @@ -254,7 +255,7 @@ static void rtas_int_on(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                          uint32_t nargs, target_ulong args,
>                          uint32_t nret, target_ulong rets)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      uint32_t nr, srcno;
>  
>      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> @@ -285,10 +286,13 @@ static void rtas_int_on(PowerPCCPU *cpu, SpaprMachineState *spapr,
>  
>  static void ics_spapr_realize(DeviceState *dev, Error **errp)
>  {
> -    ICSState *ics = ICS_SPAPR(dev);
> -    ICSStateClass *icsc = ICS_GET_CLASS(ics);
> +    IcsSpaprState *sics = ICS_SPAPR(dev);
> +    ICSStateClass *icsc = ICS_GET_CLASS(dev);
>      Error *local_err = NULL;
>  
> +    /* Set by spapr_irq_init() */
> +    g_assert(sics->nr_servers);
> +
>      icsc->parent_realize(dev, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
> @@ -312,7 +316,7 @@ static void xics_spapr_dt(SpaprInterruptController *intc, uint32_t nr_servers,
>                            void *fdt, uint32_t phandle)
>  {
>      uint32_t interrupt_server_ranges_prop[] = {
> -        0, cpu_to_be32(nr_servers),
> +        0, cpu_to_be32(ICS_SPAPR(intc)->nr_servers),
>      };
>      int node;
>  
> @@ -333,7 +337,7 @@ static void xics_spapr_dt(SpaprInterruptController *intc, uint32_t nr_servers,
>  static int xics_spapr_cpu_intc_create(SpaprInterruptController *intc,
>                                         PowerPCCPU *cpu, Error **errp)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      Object *obj;
>      SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
>  
> @@ -364,7 +368,7 @@ static void xics_spapr_cpu_intc_destroy(SpaprInterruptController *intc,
>  static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
>                                  bool lsi, Error **errp)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>  
>      assert(ics);
>      assert(ics_valid_irq(ics, irq));
> @@ -380,7 +384,7 @@ static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
>  
>  static void xics_spapr_free_irq(SpaprInterruptController *intc, int irq)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      uint32_t srcno = irq - ics->offset;
>  
>      assert(ics_valid_irq(ics, irq));
> @@ -390,7 +394,7 @@ static void xics_spapr_free_irq(SpaprInterruptController *intc, int irq)
>  
>  static void xics_spapr_set_irq(SpaprInterruptController *intc, int irq, int val)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      uint32_t srcno = irq - ics->offset;
>  
>      ics_set_irq(ics, srcno, val);
> @@ -398,7 +402,7 @@ static void xics_spapr_set_irq(SpaprInterruptController *intc, int irq, int val)
>  
>  static void xics_spapr_print_info(SpaprInterruptController *intc, Monitor *mon)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      CPUState *cs;
>  
>      CPU_FOREACH(cs) {
> @@ -426,7 +430,8 @@ static int xics_spapr_activate(SpaprInterruptController *intc,
>                                 uint32_t nr_servers, Error **errp)
>  {
>      if (kvm_enabled()) {
> -        return spapr_irq_init_kvm(xics_kvm_connect, intc, nr_servers, errp);
> +        return spapr_irq_init_kvm(xics_kvm_connect, intc,
> +                                  ICS_SPAPR(intc)->nr_servers, errp);
>      }
>      return 0;
>  }
> @@ -438,6 +443,11 @@ static void xics_spapr_deactivate(SpaprInterruptController *intc)
>      }
>  }
>  
> +static Property ics_spapr_properties[] = {
> +    DEFINE_PROP_UINT32("nr-servers", IcsSpaprState, nr_servers, 0),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
>  static void ics_spapr_class_init(ObjectClass *klass, void *data)
>  {
>      DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -446,6 +456,7 @@ static void ics_spapr_class_init(ObjectClass *klass, void *data)
>  
>      device_class_set_parent_realize(dc, ics_spapr_realize,
>                                      &isc->parent_realize);
> +    device_class_set_props(dc, ics_spapr_properties);
>      sicc->activate = xics_spapr_activate;
>      sicc->deactivate = xics_spapr_deactivate;
>      sicc->cpu_intc_create = xics_spapr_cpu_intc_create;
> @@ -462,6 +473,7 @@ static void ics_spapr_class_init(ObjectClass *klass, void *data)
>  static const TypeInfo ics_spapr_info = {
>      .name = TYPE_ICS_SPAPR,
>      .parent = TYPE_ICS,
> +    .instance_size = sizeof(IcsSpaprState),
>      .class_init = ics_spapr_class_init,
>      .interfaces = (InterfaceInfo[]) {
>          { TYPE_SPAPR_INTC },
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 12a012d9dd09..21de0456446b 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -4218,15 +4218,16 @@ static void spapr_phb_placement(SpaprMachineState *spapr, uint32_t index,
>  static ICSState *spapr_ics_get(XICSFabric *dev, int irq)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(dev);
> +    ICSState *ics = ICS(spapr->ics);
>  
> -    return ics_valid_irq(spapr->ics, irq) ? spapr->ics : NULL;
> +    return ics_valid_irq(ics, irq) ? ics : NULL;
>  }
>  
>  static void spapr_ics_resend(XICSFabric *dev)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(dev);
>  
> -    ics_resend(spapr->ics);
> +    ics_resend(ICS(spapr->ics));
>  }
>  
>  static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 2dacbecfd5fd..be6f4041e433 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -316,6 +316,9 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>          object_property_set_link(obj, ICS_PROP_XICS, OBJECT(spapr),
>                                   &error_abort);
>          object_property_set_int(obj, "nr-irqs", smc->nr_xirqs, &error_abort);
> +        object_property_set_uint(obj, "nr-servers",
> +                                 spapr_max_server_number(spapr),
> +                                 &error_abort);
>          if (!qdev_realize(DEVICE(obj), NULL, errp)) {
>              return;
>          }
> @@ -426,7 +429,7 @@ qemu_irq spapr_qirq(SpaprMachineState *spapr, int irq)
>      assert(irq < (smc->nr_xirqs + SPAPR_XIRQ_BASE));
>  
>      if (spapr->ics) {
> -        assert(ics_valid_irq(spapr->ics, irq));
> +        assert(ics_valid_irq(ICS(spapr->ics), irq));
>      }
>      if (spapr->xive) {
>          assert(irq < spapr->xive->nr_irqs);
> @@ -556,7 +559,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>  
>  int spapr_irq_find(SpaprMachineState *spapr, int num, bool align, Error **errp)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      int first = -1;
>  
>      assert(ics);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 7/8] spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation
  2020-11-20 17:46 ` [PATCH for-6.0 7/8] spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation Greg Kurz
@ 2020-11-23  4:38   ` David Gibson
  2020-11-23  9:47     ` Greg Kurz
  0 siblings, 1 reply; 40+ messages in thread
From: David Gibson @ 2020-11-23  4:38 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 3553 bytes --]

On Fri, Nov 20, 2020 at 06:46:45PM +0100, Greg Kurz wrote:
> This argument isn't used by the backends anymore.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  include/hw/ppc/spapr_irq.h | 3 +--
>  hw/intc/spapr_xive.c       | 3 +--
>  hw/intc/xics_spapr.c       | 3 +--
>  hw/ppc/spapr_irq.c         | 3 +--
>  4 files changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index c22a72c9e270..3e1c619d4c06 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -43,8 +43,7 @@ DECLARE_CLASS_CHECKERS(SpaprInterruptControllerClass, SPAPR_INTC,
>  struct SpaprInterruptControllerClass {
>      InterfaceClass parent;
>  
> -    int (*activate)(SpaprInterruptController *intc, uint32_t nr_servers,
> -                    Error **errp);

Hm.  Thinking back on this, is the problem just that it's not clear
here if the 'nr_servers' parameter here indicates the number of CPU
targets, or the maximum index of CPU targets?

If so, I wonder if it might be better to pass both numbers as
parameters here, rather than shuffling the properties of the devices
themselves.

> +    int (*activate)(SpaprInterruptController *intc, Error **errp);
>      void (*deactivate)(SpaprInterruptController *intc);
>  
>      /*
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index d13a2955ce9b..8331026fdb12 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -754,8 +754,7 @@ static void spapr_xive_dt(SpaprInterruptController *intc, uint32_t nr_servers,
>                       plat_res_int_priorities, sizeof(plat_res_int_priorities)));
>  }
>  
> -static int spapr_xive_activate(SpaprInterruptController *intc,
> -                               uint32_t nr_servers, Error **errp)
> +static int spapr_xive_activate(SpaprInterruptController *intc, Error **errp)
>  {
>      SpaprXive *xive = SPAPR_XIVE(intc);
>  
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index ce147e8980ed..8810bd93c856 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -426,8 +426,7 @@ static int xics_spapr_post_load(SpaprInterruptController *intc, int version_id)
>      return 0;
>  }
>  
> -static int xics_spapr_activate(SpaprInterruptController *intc,
> -                               uint32_t nr_servers, Error **errp)
> +static int xics_spapr_activate(SpaprInterruptController *intc, Error **errp)
>  {
>      if (kvm_enabled()) {
>          return spapr_irq_init_kvm(xics_kvm_connect, intc,
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index be6f4041e433..f2897fbc942a 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -480,7 +480,6 @@ static void set_active_intc(SpaprMachineState *spapr,
>                              SpaprInterruptController *new_intc)
>  {
>      SpaprInterruptControllerClass *sicc;
> -    uint32_t nr_servers = spapr_max_server_number(spapr);
>  
>      assert(new_intc);
>  
> @@ -498,7 +497,7 @@ static void set_active_intc(SpaprMachineState *spapr,
>  
>      sicc = SPAPR_INTC_GET_CLASS(new_intc);
>      if (sicc->activate) {
> -        sicc->activate(new_intc, nr_servers, &error_fatal);
> +        sicc->activate(new_intc, &error_fatal);
>      }
>  
>      spapr->active_intc = new_intc;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids
  2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
                   ` (7 preceding siblings ...)
  2020-11-20 17:46 ` [PATCH for-6.0 8/8] spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation Greg Kurz
@ 2020-11-23  8:04 ` Cédric Le Goater
  2020-11-23 10:07   ` Greg Kurz
  8 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-23  8:04 UTC (permalink / raw)
  To: Greg Kurz, qemu-devel; +Cc: qemu-ppc, David Gibson

On 11/20/20 6:46 PM, Greg Kurz wrote:
> A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
> RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
> ids, which happen to be numerically equal in general, but are really
> different entities that can diverge in some setups. When this happens,
> we end up misconfiguring XIVE in a way that is almost fatal for the
> guest.
> 
> The confusion comes from XICS which has historically assumed equality
> between interrupt server numbers and vCPU ids, as mentionned in a
> comment back from 2011 in the linux kernel icp_native_init_one_node()
> function:
> 
>     /* This code does the theorically broken assumption that the interrupt
>      * server numbers are the same as the hard CPU numbers.
>      * This happens to be the case so far but we are playing with fire...
>      * should be fixed one of these days. -BenH.
>      */
> 
> This assumption crept into QEMU through the "ibm,interrupt-server-ranges"
> property of the "interrupt-controller" node in the DT. This property
> contains ranges of consecutive vCPU ids defined as (first id, # of ids).
> In the case of QEMU, we define a single range starting from 0 up to the
> highest vCPU id, as returned by spapr_max_server_number(). This has
> always been associated to the "nr_servers" wording when naming variables
> or function arguments. When XIVE got added, we introduced an sPAPR IRQ
> abstraction to be able to control several interrupt controller backends.
> The sPAPR IRQ base class provides a dt() handler used to populate the
> "interrupt-controller" node in the DT. This handler takes an "nr_server"
> argument inherited from XICS and we ended up using it to populate the
> "ibm,xive-lisn-ranges" property used with XIVE, which has completely
> different semantics. It contain ranges of interrupt numbers that the
> guest can use for any purpose. Since one obvious purpose is IPI, its
> first range should just be able to accomodate all possible vCPUs.

To clarify, PAPR says it is a requirement :

  "The first range will contain at least one per possible thread in the 
   partition."

The regression showed that we were not initializing correctly this range 
in QEMU/KVM. I an not even sure it had the correct size either since
we were anyhow initializing 4096 IPIs.

> In the case of QEMU, we define a single range starting from 0 up
> to "nr_server" but we should rather size it to the number of vCPUs
> actually (ie. smp.max_cpus).

ah. And so spapr_max_server_number(spapr) is crap ? This is starting
to be complex to follow :/
 
> This series aims at getting rid of the "nr_server" argument in the
> sPAPR IC handlers. Since both the highest possible vCPU id and the
> maximum number of vCPUs are invariants for XICS and XIVE respectively,

What XIVE cares about is the number of possible IPIs and the number
of vCPUs since we deduced from that the number of event queue 
descriptors, which is another XIVE structure.

> let's make them device properties to be configured by the machine when
> it creates the interrupt controllers and use them where needed.
> 
> This doesn't cause any visible change to setups using the default
> VSMT machine settings. This changes "ibm,xive-lisn-ranges" for
> setups that mess with VSMT, but this is acceptable since linux
> only allocates one interrupt per vCPU and the higher part of the
> range was never used.

This range is only used for vCPUs. 

C.

> [1] https://git.qemu.org/?p=qemu.git;a=commit;h=6d24795ee7e3199d199d3c415312c93382ad1807
> 
> Greg Kurz (8):
>   spapr/xive: Turn some sanity checks into assertions
>   spapr/xive: Introduce spapr_xive_nr_ends()
>   spapr/xive: Add "nr-servers" property
>   spapr/xive: Add "nr-ipis" property
>   spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()
>   spapr/xics: Add "nr-servers" property
>   spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation
>   spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation
> 
>  include/hw/ppc/spapr.h      |  4 +--
>  include/hw/ppc/spapr_irq.h  |  9 ++---
>  include/hw/ppc/spapr_xive.h | 25 ++++++++++++-
>  include/hw/ppc/xics_spapr.h | 23 +++++++++---
>  hw/intc/spapr_xive.c        | 72 ++++++++++++++++++++++---------------
>  hw/intc/spapr_xive_kvm.c    |  4 +--
>  hw/intc/xics_kvm.c          |  4 +--
>  hw/intc/xics_spapr.c        | 45 ++++++++++++++---------
>  hw/ppc/spapr.c              |  7 ++--
>  hw/ppc/spapr_irq.c          | 27 +++++++-------
>  10 files changed, 141 insertions(+), 79 deletions(-)
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions
  2020-11-20 17:46 ` [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions Greg Kurz
  2020-11-23  3:33   ` David Gibson
@ 2020-11-23  8:09   ` Cédric Le Goater
  1 sibling, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-23  8:09 UTC (permalink / raw)
  To: Greg Kurz, qemu-devel; +Cc: qemu-ppc, David Gibson

On 11/20/20 6:46 PM, Greg Kurz wrote:
> The sPAPR XIVE device is created by the machine in spapr_irq_init().
> The latter overrides any value provided by the user with -global for
> the "nr-irqs" and "nr-ends" properties with strictly positive values.
> 
> It seems reasonable to assume these properties should never be 0,
> which wouldn't make much sense by the way.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

> ---
>  hw/intc/spapr_xive.c | 14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 1fa09f287ac0..60e0d5769dcc 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -296,22 +296,16 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>      XiveENDSource *end_xsrc = &xive->end_source;
>      Error *local_err = NULL;
>  
> +    /* Set by spapr_irq_init() */
> +    g_assert(xive->nr_irqs);
> +    g_assert(xive->nr_ends);
> +
>      sxc->parent_realize(dev, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
>      }
>  
> -    if (!xive->nr_irqs) {
> -        error_setg(errp, "Number of interrupt needs to be greater 0");
> -        return;
> -    }
> -
> -    if (!xive->nr_ends) {
> -        error_setg(errp, "Number of interrupt needs to be greater 0");
> -        return;
> -    }
> -
>      /*
>       * Initialize the internal sources, for IPIs and virtual devices.
>       */
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property
  2020-11-23  3:52   ` David Gibson
@ 2020-11-23  9:20     ` Greg Kurz
  0 siblings, 0 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-23  9:20 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 8322 bytes --]

On Mon, 23 Nov 2020 14:52:14 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Nov 20, 2020 at 06:46:41PM +0100, Greg Kurz wrote:
> > The sPAPR XIVE object has an "nr-ends" property that is used
> > to size the END table. This property is set by the machine
> > code to a value derived from spapr_max_server_number().
> > 
> > spapr_max_server_number() is also used to inform the KVM XIVE
> > device about the range of vCPU ids it might be exposed to,
> > in order to optimize resource allocation in the HW.
> > 
> > This is enough motivation to introduce an "nr-servers" property
> > and to use it for both purposes. The existing "nr-ends" property
> > is now longer used. It is kept around though because it is exposed
> > to -global. It will continue to be ignored as before without
> > causing QEMU to exit.
> 
> I'm a little dubious about keeping the property around.  It's
> technically a breaking change to remove it, but since IIUC, it's
> *never* had any effect, it seems implausible anyone out there's using
> it.
> 

Likely. BTW, I find a bit odd that these properties are silently
ignored.

> Can we at least put it straight into the deprecation document?
> 

Yes we can initiate a formal deprecation. This would allow us
to drop the property in QEMU 6.2. We usually also print out
a warning so that users are aware of the upcoming removal.

I'll look into it.

> > The associated nr_ends field cannot be dropped from SpaprXive
> > because it is explicitly used by vmstate_spapr_xive(). It is
> > thus renamed to nr_ends_vmstate.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  include/hw/ppc/spapr_xive.h | 16 +++++++++++++++-
> >  hw/intc/spapr_xive.c        | 28 ++++++++++++++++++++++------
> >  hw/ppc/spapr_irq.c          |  6 +-----
> >  3 files changed, 38 insertions(+), 12 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index 4b967f13c030..7123ea07ed78 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -23,6 +23,16 @@
> >  typedef struct SpaprXive {
> >      XiveRouter    parent;
> >  
> > +    /*
> > +     * The XIVE device needs to know the highest vCPU id it might
> > +     * be exposed to in order to size the END table. It may also
> > +     * propagate the value to the KVM XIVE device in order to
> > +     * optimize resource allocation in the HW.
> > +     * This must be set to a non-null value using the "nr-servers"
> > +     * property, before realizing the device.
> > +     */
> > +    uint32_t      nr_servers;
> > +
> >      /* Internal interrupt source for IPIs and virtual devices */
> >      XiveSource    source;
> >      hwaddr        vc_base;
> > @@ -38,7 +48,11 @@ typedef struct SpaprXive {
> >      XiveEAS       *eat;
> >      uint32_t      nr_irqs;
> >      XiveEND       *endt;
> > -    uint32_t      nr_ends;
> > +    /*
> > +     * This is derived from nr_servers but it must be kept around because
> > +     * vmstate_spapr_xive uses it.
> > +     */
> > +    uint32_t      nr_ends_vmstate;
> >  
> >      /* TIMA mapping address */
> >      hwaddr        tm_base;
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index f473ad9cba47..e4dbf9c2c409 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -99,6 +99,12 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> >      return 0;
> >  }
> >  
> > +/*
> > + * 8 XIVE END structures per CPU. One for each available
> > + * priority
> > + */
> > +#define spapr_xive_cpu_end_idx(vcpu, prio) (((vcpu) << 3) + prio)
> > +
> >  static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> >                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
> >  {
> > @@ -109,7 +115,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> >      }
> >  
> >      if (out_end_idx) {
> > -        *out_end_idx = (cpu->vcpu_id << 3) + prio;
> > +        *out_end_idx = spapr_xive_cpu_end_idx(cpu->vcpu_id, prio);
> >      }
> >  }
> >  
> > @@ -290,7 +296,8 @@ static void spapr_xive_instance_init(Object *obj)
> >  
> >  uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
> >  {
> > -    return xive->nr_ends;
> > +    g_assert(xive->nr_servers);
> > +    return spapr_xive_cpu_end_idx(xive->nr_servers, 0);
> >  }
> >  
> >  static void spapr_xive_realize(DeviceState *dev, Error **errp)
> > @@ -303,7 +310,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >  
> >      /* Set by spapr_irq_init() */
> >      g_assert(xive->nr_irqs);
> > -    g_assert(xive->nr_ends);
> > +    g_assert(xive->nr_servers);
> >  
> >      sxc->parent_realize(dev, &local_err);
> >      if (local_err) {
> > @@ -360,6 +367,8 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 0, xive->vc_base);
> >      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 1, xive->end_base);
> >      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
> > +
> > +    xive->nr_ends_vmstate = spapr_xive_nr_ends(xive);
> >  }
> >  
> >  static int spapr_xive_get_eas(XiveRouter *xrtr, uint8_t eas_blk,
> > @@ -547,7 +556,7 @@ static const VMStateDescription vmstate_spapr_xive = {
> >          VMSTATE_UINT32_EQUAL(nr_irqs, SpaprXive, NULL),
> >          VMSTATE_STRUCT_VARRAY_POINTER_UINT32(eat, SpaprXive, nr_irqs,
> >                                       vmstate_spapr_xive_eas, XiveEAS),
> > -        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends,
> > +        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends_vmstate,
> >                                               vmstate_spapr_xive_end, XiveEND),
> >          VMSTATE_END_OF_LIST()
> >      },
> > @@ -591,7 +600,14 @@ static void spapr_xive_free_irq(SpaprInterruptController *intc, int lisn)
> >  
> >  static Property spapr_xive_properties[] = {
> >      DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
> > -    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
> > +    /*
> > +     * "nr-ends" is deprecated by "nr-servers" since QEMU 6.0.
> > +     * It is just kept around because it is exposed to the user
> > +     * through -global and we don't want it to fail, even if
> > +     * the value is actually overridden internally.
> > +     */
> > +    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends_vmstate, 0),
> > +    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
> >      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
> >      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
> >      DEFINE_PROP_UINT8("hv-prio", SpaprXive, hv_prio, 7),
> > @@ -742,7 +758,7 @@ static int spapr_xive_activate(SpaprInterruptController *intc,
> >      SpaprXive *xive = SPAPR_XIVE(intc);
> >  
> >      if (kvm_enabled()) {
> > -        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, nr_servers,
> > +        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, xive->nr_servers,
> 
> Hmm.  So we're now ignoring the 'nr_servers' parameter to this
> function, which doesn't seem right.  Should we be assert()ing that
> it's equal to xive->nr_servers?
> 
> >                                      errp);
> >          if (rc < 0) {
> >              return rc;
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index f59960339ec3..8c5627225636 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -330,11 +330,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
> >  
> >          dev = qdev_new(TYPE_SPAPR_XIVE);
> >          qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
> > -        /*
> > -         * 8 XIVE END structures per CPU. One for each available
> > -         * priority
> > -         */
> > -        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> > +        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
> >          object_property_set_link(OBJECT(dev), "xive-fabric", OBJECT(spapr),
> >                                   &error_abort);
> >          sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property
  2020-11-23  4:18   ` David Gibson
@ 2020-11-23  9:39     ` Greg Kurz
  0 siblings, 0 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-23  9:39 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 13665 bytes --]

On Mon, 23 Nov 2020 15:18:09 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Nov 20, 2020 at 06:46:44PM +0100, Greg Kurz wrote:
> > The sPAPR ICS device exposes the range of vCPU ids it can handle in
> > the "ibm,interrupt-server-ranges" FDT property. The highest vCPU
> > id, ie. spapr_max_server_number(), is obtained from the machine
> > through the "nr_servers" argument of the generic spapr_irq_dt() call.
> > 
> > We want to drop the "nr_servers" argument from the API because it
> > doesn't make sense for the sPAPR XIVE device actually.
> > 
> > On POWER9, we also pass the highest vCPU id to the KVM XICS-on-XIVE
> > device, in order to optimize resource allocation in the HW.
> > 
> > This is enough motivation to introduce an "nr-servers" property
> > and to use it for both purposes.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  include/hw/ppc/spapr.h      |  4 ++--
> >  include/hw/ppc/xics_spapr.h | 21 +++++++++++++++++---
> >  hw/intc/xics_kvm.c          |  2 +-
> >  hw/intc/xics_spapr.c        | 38 ++++++++++++++++++++++++-------------
> >  hw/ppc/spapr.c              |  5 +++--
> >  hw/ppc/spapr_irq.c          |  7 +++++--
> >  6 files changed, 54 insertions(+), 23 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 2e89e36cfbdc..b76c84a2f884 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -10,7 +10,7 @@
> >  #include "hw/ppc/spapr_irq.h"
> >  #include "qom/object.h"
> >  #include "hw/ppc/spapr_xive.h"  /* For SpaprXive */
> > -#include "hw/ppc/xics.h"        /* For ICSState */
> > +#include "hw/ppc/xics_spapr.h"  /* For IcsSpaprState */
> >  #include "hw/ppc/spapr_tpm_proxy.h"
> >  
> >  struct SpaprVioBus;
> > @@ -230,7 +230,7 @@ struct SpaprMachineState {
> >      SpaprIrq *irq;
> >      qemu_irq *qirqs;
> >      SpaprInterruptController *active_intc;
> > -    ICSState *ics;
> > +    IcsSpaprState *ics;
> >      SpaprXive *xive;
> >  
> >      bool cmd_line_caps[SPAPR_CAP_NUM];
> > diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
> > index de752c0d2c7e..1a483a873b62 100644
> > --- a/include/hw/ppc/xics_spapr.h
> > +++ b/include/hw/ppc/xics_spapr.h
> > @@ -28,12 +28,27 @@
> >  #define XICS_SPAPR_H
> >  
> >  #include "hw/ppc/spapr.h"
> > +#include "hw/ppc/xics.h"
> >  #include "qom/object.h"
> >  
> > +typedef struct IcsSpaprState {
> > +    /*< private >*/
> > +    ICPState parent_obj;
> 
> It's called "*Ics*SpaprState" and it's replacing an ICSState, but it's
> parent object is an *ICP*State - that doesn't seem right.
> 

Oops, typo :-\

> > +
> > +    /*
> > +     * The ICS needs to know the upper limit to vCPU ids it
> > +     * might be exposed to in order to size the vCPU id range
> > +     * in "ibm,interrupt-server-ranges" and to optimize HW
> > +     * resource allocation when using the XICS-on-XIVE KVM
> > +     * device. It is the purpose of the "nr-servers" property
> > +     * which *must* be set to a non-null value before realizing
> > +     * the ICS.
> > +     */
> > +    uint32_t nr_servers;
> 
> That said, I'm a but dubious about attaching the number of servers to
> the ICS side, rather than the ICP side, since server numbers is
> basically an ICP concept.  In fact... things about the overall
> topology are usually done via the XicsFabric, so I'm wondering if we
> should add a callback there for nr_servers.
> 

I agree this would be the way to go. I sent a patch to do so a year ago
but it didn't reach consensus at the time:

http://patchwork.ozlabs.org/project/qemu-devel/patch/157010405465.246126.7760334967989385566.stgit@bahia.lan/

I'll revisit the approach.

> > +} IcsSpaprState;
> > +
> >  #define TYPE_ICS_SPAPR "ics-spapr"
> > -/* This is reusing the ICSState typedef from TYPE_ICS */
> > -DECLARE_INSTANCE_CHECKER(ICSState, ICS_SPAPR,
> > -                         TYPE_ICS_SPAPR)
> > +DECLARE_INSTANCE_CHECKER(IcsSpaprState, ICS_SPAPR, TYPE_ICS_SPAPR)
> >  
> >  int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
> >                       Error **errp);
> > diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> > index 570d635bcc08..ecbbda0e249b 100644
> > --- a/hw/intc/xics_kvm.c
> > +++ b/hw/intc/xics_kvm.c
> > @@ -350,7 +350,7 @@ void ics_kvm_set_irq(ICSState *ics, int srcno, int val)
> >  int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
> >                       Error **errp)
> >  {
> > -    ICSState *ics = ICS_SPAPR(intc);
> > +    ICSState *ics = ICS(intc);
> >      int rc;
> >      CPUState *cs;
> >      Error *local_err = NULL;
> > diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> > index 8ae4f41459c3..ce147e8980ed 100644
> > --- a/hw/intc/xics_spapr.c
> > +++ b/hw/intc/xics_spapr.c
> > @@ -34,6 +34,7 @@
> >  #include "hw/ppc/xics.h"
> >  #include "hw/ppc/xics_spapr.h"
> >  #include "hw/ppc/fdt.h"
> > +#include "hw/qdev-properties.h"
> >  #include "qapi/visitor.h"
> >  
> >  /*
> > @@ -154,7 +155,7 @@ static void rtas_set_xive(PowerPCCPU *cpu, SpaprMachineState *spapr,
> >                            uint32_t nargs, target_ulong args,
> >                            uint32_t nret, target_ulong rets)
> >  {
> > -    ICSState *ics = spapr->ics;
> > +    ICSState *ics = ICS(spapr->ics);
> >      uint32_t nr, srcno, server, priority;
> >  
> >      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> > @@ -189,7 +190,7 @@ static void rtas_get_xive(PowerPCCPU *cpu, SpaprMachineState *spapr,
> >                            uint32_t nargs, target_ulong args,
> >                            uint32_t nret, target_ulong rets)
> >  {
> > -    ICSState *ics = spapr->ics;
> > +    ICSState *ics = ICS(spapr->ics);
> >      uint32_t nr, srcno;
> >  
> >      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> > @@ -221,7 +222,7 @@ static void rtas_int_off(PowerPCCPU *cpu, SpaprMachineState *spapr,
> >                           uint32_t nargs, target_ulong args,
> >                           uint32_t nret, target_ulong rets)
> >  {
> > -    ICSState *ics = spapr->ics;
> > +    ICSState *ics = ICS(spapr->ics);
> >      uint32_t nr, srcno;
> >  
> >      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> > @@ -254,7 +255,7 @@ static void rtas_int_on(PowerPCCPU *cpu, SpaprMachineState *spapr,
> >                          uint32_t nargs, target_ulong args,
> >                          uint32_t nret, target_ulong rets)
> >  {
> > -    ICSState *ics = spapr->ics;
> > +    ICSState *ics = ICS(spapr->ics);
> >      uint32_t nr, srcno;
> >  
> >      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> > @@ -285,10 +286,13 @@ static void rtas_int_on(PowerPCCPU *cpu, SpaprMachineState *spapr,
> >  
> >  static void ics_spapr_realize(DeviceState *dev, Error **errp)
> >  {
> > -    ICSState *ics = ICS_SPAPR(dev);
> > -    ICSStateClass *icsc = ICS_GET_CLASS(ics);
> > +    IcsSpaprState *sics = ICS_SPAPR(dev);
> > +    ICSStateClass *icsc = ICS_GET_CLASS(dev);
> >      Error *local_err = NULL;
> >  
> > +    /* Set by spapr_irq_init() */
> > +    g_assert(sics->nr_servers);
> > +
> >      icsc->parent_realize(dev, &local_err);
> >      if (local_err) {
> >          error_propagate(errp, local_err);
> > @@ -312,7 +316,7 @@ static void xics_spapr_dt(SpaprInterruptController *intc, uint32_t nr_servers,
> >                            void *fdt, uint32_t phandle)
> >  {
> >      uint32_t interrupt_server_ranges_prop[] = {
> > -        0, cpu_to_be32(nr_servers),
> > +        0, cpu_to_be32(ICS_SPAPR(intc)->nr_servers),
> >      };
> >      int node;
> >  
> > @@ -333,7 +337,7 @@ static void xics_spapr_dt(SpaprInterruptController *intc, uint32_t nr_servers,
> >  static int xics_spapr_cpu_intc_create(SpaprInterruptController *intc,
> >                                         PowerPCCPU *cpu, Error **errp)
> >  {
> > -    ICSState *ics = ICS_SPAPR(intc);
> > +    ICSState *ics = ICS(intc);
> >      Object *obj;
> >      SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
> >  
> > @@ -364,7 +368,7 @@ static void xics_spapr_cpu_intc_destroy(SpaprInterruptController *intc,
> >  static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
> >                                  bool lsi, Error **errp)
> >  {
> > -    ICSState *ics = ICS_SPAPR(intc);
> > +    ICSState *ics = ICS(intc);
> >  
> >      assert(ics);
> >      assert(ics_valid_irq(ics, irq));
> > @@ -380,7 +384,7 @@ static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
> >  
> >  static void xics_spapr_free_irq(SpaprInterruptController *intc, int irq)
> >  {
> > -    ICSState *ics = ICS_SPAPR(intc);
> > +    ICSState *ics = ICS(intc);
> >      uint32_t srcno = irq - ics->offset;
> >  
> >      assert(ics_valid_irq(ics, irq));
> > @@ -390,7 +394,7 @@ static void xics_spapr_free_irq(SpaprInterruptController *intc, int irq)
> >  
> >  static void xics_spapr_set_irq(SpaprInterruptController *intc, int irq, int val)
> >  {
> > -    ICSState *ics = ICS_SPAPR(intc);
> > +    ICSState *ics = ICS(intc);
> >      uint32_t srcno = irq - ics->offset;
> >  
> >      ics_set_irq(ics, srcno, val);
> > @@ -398,7 +402,7 @@ static void xics_spapr_set_irq(SpaprInterruptController *intc, int irq, int val)
> >  
> >  static void xics_spapr_print_info(SpaprInterruptController *intc, Monitor *mon)
> >  {
> > -    ICSState *ics = ICS_SPAPR(intc);
> > +    ICSState *ics = ICS(intc);
> >      CPUState *cs;
> >  
> >      CPU_FOREACH(cs) {
> > @@ -426,7 +430,8 @@ static int xics_spapr_activate(SpaprInterruptController *intc,
> >                                 uint32_t nr_servers, Error **errp)
> >  {
> >      if (kvm_enabled()) {
> > -        return spapr_irq_init_kvm(xics_kvm_connect, intc, nr_servers, errp);
> > +        return spapr_irq_init_kvm(xics_kvm_connect, intc,
> > +                                  ICS_SPAPR(intc)->nr_servers, errp);
> >      }
> >      return 0;
> >  }
> > @@ -438,6 +443,11 @@ static void xics_spapr_deactivate(SpaprInterruptController *intc)
> >      }
> >  }
> >  
> > +static Property ics_spapr_properties[] = {
> > +    DEFINE_PROP_UINT32("nr-servers", IcsSpaprState, nr_servers, 0),
> > +    DEFINE_PROP_END_OF_LIST(),
> > +};
> > +
> >  static void ics_spapr_class_init(ObjectClass *klass, void *data)
> >  {
> >      DeviceClass *dc = DEVICE_CLASS(klass);
> > @@ -446,6 +456,7 @@ static void ics_spapr_class_init(ObjectClass *klass, void *data)
> >  
> >      device_class_set_parent_realize(dc, ics_spapr_realize,
> >                                      &isc->parent_realize);
> > +    device_class_set_props(dc, ics_spapr_properties);
> >      sicc->activate = xics_spapr_activate;
> >      sicc->deactivate = xics_spapr_deactivate;
> >      sicc->cpu_intc_create = xics_spapr_cpu_intc_create;
> > @@ -462,6 +473,7 @@ static void ics_spapr_class_init(ObjectClass *klass, void *data)
> >  static const TypeInfo ics_spapr_info = {
> >      .name = TYPE_ICS_SPAPR,
> >      .parent = TYPE_ICS,
> > +    .instance_size = sizeof(IcsSpaprState),
> >      .class_init = ics_spapr_class_init,
> >      .interfaces = (InterfaceInfo[]) {
> >          { TYPE_SPAPR_INTC },
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 12a012d9dd09..21de0456446b 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -4218,15 +4218,16 @@ static void spapr_phb_placement(SpaprMachineState *spapr, uint32_t index,
> >  static ICSState *spapr_ics_get(XICSFabric *dev, int irq)
> >  {
> >      SpaprMachineState *spapr = SPAPR_MACHINE(dev);
> > +    ICSState *ics = ICS(spapr->ics);
> >  
> > -    return ics_valid_irq(spapr->ics, irq) ? spapr->ics : NULL;
> > +    return ics_valid_irq(ics, irq) ? ics : NULL;
> >  }
> >  
> >  static void spapr_ics_resend(XICSFabric *dev)
> >  {
> >      SpaprMachineState *spapr = SPAPR_MACHINE(dev);
> >  
> > -    ics_resend(spapr->ics);
> > +    ics_resend(ICS(spapr->ics));
> >  }
> >  
> >  static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 2dacbecfd5fd..be6f4041e433 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -316,6 +316,9 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
> >          object_property_set_link(obj, ICS_PROP_XICS, OBJECT(spapr),
> >                                   &error_abort);
> >          object_property_set_int(obj, "nr-irqs", smc->nr_xirqs, &error_abort);
> > +        object_property_set_uint(obj, "nr-servers",
> > +                                 spapr_max_server_number(spapr),
> > +                                 &error_abort);
> >          if (!qdev_realize(DEVICE(obj), NULL, errp)) {
> >              return;
> >          }
> > @@ -426,7 +429,7 @@ qemu_irq spapr_qirq(SpaprMachineState *spapr, int irq)
> >      assert(irq < (smc->nr_xirqs + SPAPR_XIRQ_BASE));
> >  
> >      if (spapr->ics) {
> > -        assert(ics_valid_irq(spapr->ics, irq));
> > +        assert(ics_valid_irq(ICS(spapr->ics), irq));
> >      }
> >      if (spapr->xive) {
> >          assert(irq < spapr->xive->nr_irqs);
> > @@ -556,7 +559,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
> >  
> >  int spapr_irq_find(SpaprMachineState *spapr, int num, bool align, Error **errp)
> >  {
> > -    ICSState *ics = spapr->ics;
> > +    ICSState *ics = ICS(spapr->ics);
> >      int first = -1;
> >  
> >      assert(ics);
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-20 17:46 ` [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends() Greg Kurz
  2020-11-23  3:33   ` David Gibson
@ 2020-11-23  9:46   ` Cédric Le Goater
  2020-11-23 11:16     ` Greg Kurz
  1 sibling, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-23  9:46 UTC (permalink / raw)
  To: Greg Kurz, qemu-devel; +Cc: qemu-ppc, David Gibson

On 11/20/20 6:46 PM, Greg Kurz wrote:
> We're going to kill the "nr_ends" field in a subsequent patch.

why ? it is one of the tables of the controller and its part of 
the main XIVE concepts. Conceptually, we could let the machine 
dimension it with an arbitrary value as OPAL does. The controller
would fail when the table is fully used. 

 
> Prepare ground by using an helper instead of peeking into
> the sPAPR XIVE structure directly.


I am not against the helper though but we should introduce a 
prio_shift value which would let us define the number of 
available priorities. To be linked with "hv-prio"

C.


> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  include/hw/ppc/spapr_xive.h |  1 +
>  hw/intc/spapr_xive.c        | 23 ++++++++++++++---------
>  hw/intc/spapr_xive_kvm.c    |  4 ++--
>  3 files changed, 17 insertions(+), 11 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 26c8d90d7196..4b967f13c030 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -75,6 +75,7 @@ void spapr_xive_map_mmio(SpaprXive *xive);
>  
>  int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
>                               uint32_t *out_server, uint8_t *out_prio);
> +uint32_t spapr_xive_nr_ends(const SpaprXive *xive);
>  
>  /*
>   * KVM XIVE device helpers
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 60e0d5769dcc..f473ad9cba47 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -192,7 +192,7 @@ void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon)
>              uint32_t end_idx = xive_get_field64(EAS_END_INDEX, eas->w);
>              XiveEND *end;
>  
> -            assert(end_idx < xive->nr_ends);
> +            assert(end_idx < spapr_xive_nr_ends(xive));
>              end = &xive->endt[end_idx];
>  
>              if (xive_end_is_valid(end)) {
> @@ -270,7 +270,7 @@ static void spapr_xive_reset(void *dev)
>      }
>  
>      /* Clear all ENDs */
> -    for (i = 0; i < xive->nr_ends; i++) {
> +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
>          spapr_xive_end_reset(&xive->endt[i]);
>      }
>  }
> @@ -288,6 +288,11 @@ static void spapr_xive_instance_init(Object *obj)
>      xive->fd = -1;
>  }
>  
> +uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
> +{
> +    return xive->nr_ends;
> +}
> +
>  static void spapr_xive_realize(DeviceState *dev, Error **errp)
>  {
>      SpaprXive *xive = SPAPR_XIVE(dev);
> @@ -336,7 +341,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>       * Allocate the routing tables
>       */
>      xive->eat = g_new0(XiveEAS, xive->nr_irqs);
> -    xive->endt = g_new0(XiveEND, xive->nr_ends);
> +    xive->endt = g_new0(XiveEND, spapr_xive_nr_ends(xive));
>  
>      xive->nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
>                             xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
> @@ -375,7 +380,7 @@ static int spapr_xive_get_end(XiveRouter *xrtr,
>  {
>      SpaprXive *xive = SPAPR_XIVE(xrtr);
>  
> -    if (end_idx >= xive->nr_ends) {
> +    if (end_idx >= spapr_xive_nr_ends(xive)) {
>          return -1;
>      }
>  
> @@ -389,7 +394,7 @@ static int spapr_xive_write_end(XiveRouter *xrtr, uint8_t end_blk,
>  {
>      SpaprXive *xive = SPAPR_XIVE(xrtr);
>  
> -    if (end_idx >= xive->nr_ends) {
> +    if (end_idx >= spapr_xive_nr_ends(xive)) {
>          return -1;
>      }
>  
> @@ -1138,7 +1143,7 @@ static target_ulong h_int_get_source_config(PowerPCCPU *cpu,
>      /* EAS_END_BLOCK is unused on sPAPR */
>      end_idx = xive_get_field64(EAS_END_INDEX, eas.w);
>  
> -    assert(end_idx < xive->nr_ends);
> +    assert(end_idx < spapr_xive_nr_ends(xive));
>      end = &xive->endt[end_idx];
>  
>      nvt_blk = xive_get_field32(END_W6_NVT_BLOCK, end->w6);
> @@ -1216,7 +1221,7 @@ static target_ulong h_int_get_queue_info(PowerPCCPU *cpu,
>          return H_P2;
>      }
>  
> -    assert(end_idx < xive->nr_ends);
> +    assert(end_idx < spapr_xive_nr_ends(xive));
>      end = &xive->endt[end_idx];
>  
>      args[0] = xive->end_base + (1ull << (end_xsrc->esb_shift + 1)) * end_idx;
> @@ -1304,7 +1309,7 @@ static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
>          return H_P2;
>      }
>  
> -    assert(end_idx < xive->nr_ends);
> +    assert(end_idx < spapr_xive_nr_ends(xive));
>      memcpy(&end, &xive->endt[end_idx], sizeof(XiveEND));
>  
>      switch (qsize) {
> @@ -1470,7 +1475,7 @@ static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
>          return H_P2;
>      }
>  
> -    assert(end_idx < xive->nr_ends);
> +    assert(end_idx < spapr_xive_nr_ends(xive));
>      end = &xive->endt[end_idx];
>  
>      args[0] = 0;
> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> index 66bf4c06fe55..1566016f0e28 100644
> --- a/hw/intc/spapr_xive_kvm.c
> +++ b/hw/intc/spapr_xive_kvm.c
> @@ -531,7 +531,7 @@ static int kvmppc_xive_get_queues(SpaprXive *xive, Error **errp)
>      int i;
>      int ret;
>  
> -    for (i = 0; i < xive->nr_ends; i++) {
> +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
>          if (!xive_end_is_valid(&xive->endt[i])) {
>              continue;
>          }
> @@ -701,7 +701,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
>      assert(xive->fd != -1);
>  
>      /* Restore the ENDT first. The targetting depends on it. */
> -    for (i = 0; i < xive->nr_ends; i++) {
> +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
>          if (!xive_end_is_valid(&xive->endt[i])) {
>              continue;
>          }
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 7/8] spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation
  2020-11-23  4:38   ` David Gibson
@ 2020-11-23  9:47     ` Greg Kurz
  0 siblings, 0 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-23  9:47 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 3806 bytes --]

On Mon, 23 Nov 2020 15:38:32 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Nov 20, 2020 at 06:46:45PM +0100, Greg Kurz wrote:
> > This argument isn't used by the backends anymore.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  include/hw/ppc/spapr_irq.h | 3 +--
> >  hw/intc/spapr_xive.c       | 3 +--
> >  hw/intc/xics_spapr.c       | 3 +--
> >  hw/ppc/spapr_irq.c         | 3 +--
> >  4 files changed, 4 insertions(+), 8 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> > index c22a72c9e270..3e1c619d4c06 100644
> > --- a/include/hw/ppc/spapr_irq.h
> > +++ b/include/hw/ppc/spapr_irq.h
> > @@ -43,8 +43,7 @@ DECLARE_CLASS_CHECKERS(SpaprInterruptControllerClass, SPAPR_INTC,
> >  struct SpaprInterruptControllerClass {
> >      InterfaceClass parent;
> >  
> > -    int (*activate)(SpaprInterruptController *intc, uint32_t nr_servers,
> > -                    Error **errp);
> 
> Hm.  Thinking back on this, is the problem just that it's not clear
> here if the 'nr_servers' parameter here indicates the number of CPU
> targets, or the maximum index of CPU targets?
> 

AIUI 'nr_servers' has always been referring to the number of consecutive
server ids that we put in the "ibm,interrupt-server-ranges" property for
XICS.

> If so, I wonder if it might be better to pass both numbers as
> parameters here, rather than shuffling the properties of the devices
> themselves.
> 

Maybe. I'll give a try.

> > +    int (*activate)(SpaprInterruptController *intc, Error **errp);
> >      void (*deactivate)(SpaprInterruptController *intc);
> >  
> >      /*
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index d13a2955ce9b..8331026fdb12 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -754,8 +754,7 @@ static void spapr_xive_dt(SpaprInterruptController *intc, uint32_t nr_servers,
> >                       plat_res_int_priorities, sizeof(plat_res_int_priorities)));
> >  }
> >  
> > -static int spapr_xive_activate(SpaprInterruptController *intc,
> > -                               uint32_t nr_servers, Error **errp)
> > +static int spapr_xive_activate(SpaprInterruptController *intc, Error **errp)
> >  {
> >      SpaprXive *xive = SPAPR_XIVE(intc);
> >  
> > diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> > index ce147e8980ed..8810bd93c856 100644
> > --- a/hw/intc/xics_spapr.c
> > +++ b/hw/intc/xics_spapr.c
> > @@ -426,8 +426,7 @@ static int xics_spapr_post_load(SpaprInterruptController *intc, int version_id)
> >      return 0;
> >  }
> >  
> > -static int xics_spapr_activate(SpaprInterruptController *intc,
> > -                               uint32_t nr_servers, Error **errp)
> > +static int xics_spapr_activate(SpaprInterruptController *intc, Error **errp)
> >  {
> >      if (kvm_enabled()) {
> >          return spapr_irq_init_kvm(xics_kvm_connect, intc,
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index be6f4041e433..f2897fbc942a 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -480,7 +480,6 @@ static void set_active_intc(SpaprMachineState *spapr,
> >                              SpaprInterruptController *new_intc)
> >  {
> >      SpaprInterruptControllerClass *sicc;
> > -    uint32_t nr_servers = spapr_max_server_number(spapr);
> >  
> >      assert(new_intc);
> >  
> > @@ -498,7 +497,7 @@ static void set_active_intc(SpaprMachineState *spapr,
> >  
> >      sicc = SPAPR_INTC_GET_CLASS(new_intc);
> >      if (sicc->activate) {
> > -        sicc->activate(new_intc, nr_servers, &error_fatal);
> > +        sicc->activate(new_intc, &error_fatal);
> >      }
> >  
> >      spapr->active_intc = new_intc;
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property
  2020-11-20 17:46 ` [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property Greg Kurz
  2020-11-23  3:52   ` David Gibson
@ 2020-11-23  9:56   ` Cédric Le Goater
  2020-11-23 11:25     ` Greg Kurz
  1 sibling, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-23  9:56 UTC (permalink / raw)
  To: Greg Kurz, qemu-devel; +Cc: qemu-ppc, David Gibson

On 11/20/20 6:46 PM, Greg Kurz wrote:
> The sPAPR XIVE object has an "nr-ends" property that is used
> to size the END table. This property is set by the machine
> code to a value derived from spapr_max_server_number().

Yes. This is done at the machine level.

> spapr_max_server_number() is also used to inform the KVM XIVE
> device about the range of vCPU ids it might be exposed to,
> in order to optimize resource allocation in the HW.

Yes. It's deeply buried in the spapr/irq/xive/kvm layers but it is
called by set_active_intc() which is, for me, a machine level 
operation.

The only other place where the number of vCPUs is used is in 
spapr_xive_dt() to define the vCPU IPI range, which done at
the machine level again.

So I don't agree with this patch.

C.


> This is enough motivation to introduce an "nr-servers" property
> and to use it for both purposes. The existing "nr-ends" property
> is now longer used. It is kept around though because it is exposed
> to -global. It will continue to be ignored as before without
> causing QEMU to exit.
> 
> The associated nr_ends field cannot be dropped from SpaprXive
> because it is explicitly used by vmstate_spapr_xive(). It is
> thus renamed to nr_ends_vmstate.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  include/hw/ppc/spapr_xive.h | 16 +++++++++++++++-
>  hw/intc/spapr_xive.c        | 28 ++++++++++++++++++++++------
>  hw/ppc/spapr_irq.c          |  6 +-----
>  3 files changed, 38 insertions(+), 12 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 4b967f13c030..7123ea07ed78 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -23,6 +23,16 @@
>  typedef struct SpaprXive {
>      XiveRouter    parent;
>  
> +    /*
> +     * The XIVE device needs to know the highest vCPU id it might
> +     * be exposed to in order to size the END table. It may also
> +     * propagate the value to the KVM XIVE device in order to
> +     * optimize resource allocation in the HW.
> +     * This must be set to a non-null value using the "nr-servers"
> +     * property, before realizing the device.
> +     */
> +    uint32_t      nr_servers;
> +
>      /* Internal interrupt source for IPIs and virtual devices */
>      XiveSource    source;
>      hwaddr        vc_base;
> @@ -38,7 +48,11 @@ typedef struct SpaprXive {
>      XiveEAS       *eat;
>      uint32_t      nr_irqs;
>      XiveEND       *endt;
> -    uint32_t      nr_ends;
> +    /*
> +     * This is derived from nr_servers but it must be kept around because
> +     * vmstate_spapr_xive uses it.
> +     */
> +    uint32_t      nr_ends_vmstate;
>  
>      /* TIMA mapping address */
>      hwaddr        tm_base;
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index f473ad9cba47..e4dbf9c2c409 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -99,6 +99,12 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
>      return 0;
>  }
>  
> +/*
> + * 8 XIVE END structures per CPU. One for each available
> + * priority
> + */
> +#define spapr_xive_cpu_end_idx(vcpu, prio) (((vcpu) << 3) + prio)
> +
>  static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
>                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
>  {
> @@ -109,7 +115,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
>      }
>  
>      if (out_end_idx) {
> -        *out_end_idx = (cpu->vcpu_id << 3) + prio;
> +        *out_end_idx = spapr_xive_cpu_end_idx(cpu->vcpu_id, prio);
>      }
>  }
>  
> @@ -290,7 +296,8 @@ static void spapr_xive_instance_init(Object *obj)
>  
>  uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
>  {
> -    return xive->nr_ends;
> +    g_assert(xive->nr_servers);
> +    return spapr_xive_cpu_end_idx(xive->nr_servers, 0);
>  }
>  
>  static void spapr_xive_realize(DeviceState *dev, Error **errp)
> @@ -303,7 +310,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>  
>      /* Set by spapr_irq_init() */
>      g_assert(xive->nr_irqs);
> -    g_assert(xive->nr_ends);
> +    g_assert(xive->nr_servers);
>  
>      sxc->parent_realize(dev, &local_err);
>      if (local_err) {
> @@ -360,6 +367,8 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 0, xive->vc_base);
>      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 1, xive->end_base);
>      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
> +
> +    xive->nr_ends_vmstate = spapr_xive_nr_ends(xive);
>  }
>  
>  static int spapr_xive_get_eas(XiveRouter *xrtr, uint8_t eas_blk,
> @@ -547,7 +556,7 @@ static const VMStateDescription vmstate_spapr_xive = {
>          VMSTATE_UINT32_EQUAL(nr_irqs, SpaprXive, NULL),
>          VMSTATE_STRUCT_VARRAY_POINTER_UINT32(eat, SpaprXive, nr_irqs,
>                                       vmstate_spapr_xive_eas, XiveEAS),
> -        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends,
> +        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends_vmstate,
>                                               vmstate_spapr_xive_end, XiveEND),
>          VMSTATE_END_OF_LIST()
>      },
> @@ -591,7 +600,14 @@ static void spapr_xive_free_irq(SpaprInterruptController *intc, int lisn)
>  
>  static Property spapr_xive_properties[] = {
>      DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
> -    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
> +    /*
> +     * "nr-ends" is deprecated by "nr-servers" since QEMU 6.0.
> +     * It is just kept around because it is exposed to the user
> +     * through -global and we don't want it to fail, even if
> +     * the value is actually overridden internally.
> +     */
> +    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends_vmstate, 0),
> +    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
>      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
>      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
>      DEFINE_PROP_UINT8("hv-prio", SpaprXive, hv_prio, 7),
> @@ -742,7 +758,7 @@ static int spapr_xive_activate(SpaprInterruptController *intc,
>      SpaprXive *xive = SPAPR_XIVE(intc);
>  
>      if (kvm_enabled()) {
> -        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, nr_servers,
> +        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, xive->nr_servers,
>                                      errp);
>          if (rc < 0) {
>              return rc;
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index f59960339ec3..8c5627225636 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -330,11 +330,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>  
>          dev = qdev_new(TYPE_SPAPR_XIVE);
>          qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
> -        /*
> -         * 8 XIVE END structures per CPU. One for each available
> -         * priority
> -         */
> -        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> +        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
>          object_property_set_link(OBJECT(dev), "xive-fabric", OBJECT(spapr),
>                                   &error_abort);
>          sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids
  2020-11-23  8:04 ` [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Cédric Le Goater
@ 2020-11-23 10:07   ` Greg Kurz
  0 siblings, 0 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-23 10:07 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Mon, 23 Nov 2020 09:04:42 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/20/20 6:46 PM, Greg Kurz wrote:
> > A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
> > RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
> > ids, which happen to be numerically equal in general, but are really
> > different entities that can diverge in some setups. When this happens,
> > we end up misconfiguring XIVE in a way that is almost fatal for the
> > guest.
> > 
> > The confusion comes from XICS which has historically assumed equality
> > between interrupt server numbers and vCPU ids, as mentionned in a
> > comment back from 2011 in the linux kernel icp_native_init_one_node()
> > function:
> > 
> >     /* This code does the theorically broken assumption that the interrupt
> >      * server numbers are the same as the hard CPU numbers.
> >      * This happens to be the case so far but we are playing with fire...
> >      * should be fixed one of these days. -BenH.
> >      */
> > 
> > This assumption crept into QEMU through the "ibm,interrupt-server-ranges"
> > property of the "interrupt-controller" node in the DT. This property
> > contains ranges of consecutive vCPU ids defined as (first id, # of ids).
> > In the case of QEMU, we define a single range starting from 0 up to the
> > highest vCPU id, as returned by spapr_max_server_number(). This has
> > always been associated to the "nr_servers" wording when naming variables
> > or function arguments. When XIVE got added, we introduced an sPAPR IRQ
> > abstraction to be able to control several interrupt controller backends.
> > The sPAPR IRQ base class provides a dt() handler used to populate the
> > "interrupt-controller" node in the DT. This handler takes an "nr_server"
> > argument inherited from XICS and we ended up using it to populate the
> > "ibm,xive-lisn-ranges" property used with XIVE, which has completely
> > different semantics. It contain ranges of interrupt numbers that the
> > guest can use for any purpose. Since one obvious purpose is IPI, its
> > first range should just be able to accomodate all possible vCPUs.
> 
> To clarify, PAPR says it is a requirement :
> 
>   "The first range will contain at least one per possible thread in the 
>    partition."
> 
> The regression showed that we were not initializing correctly this range 
> in QEMU/KVM. I an not even sure it had the correct size either since
> we were anyhow initializing 4096 IPIs.
> 

The bad thing was that each online vCPU would reset it's IPI in
KVM using a bogus IPI number (the vCPU id), and thus doesn't reset
the interrupt the guest is really going to use for the IPI.

> > In the case of QEMU, we define a single range starting from 0 up
> > to "nr_server" but we should rather size it to the number of vCPUs
> > actually (ie. smp.max_cpus).
> 
> ah. And so spapr_max_server_number(spapr) is crap ? This is starting
> to be complex to follow :/
>  

No. spapr_max_server_number(spapr) gives the highest vCPU id that
we end over to KVM in order to optimize VP id allocation in the HW.
But it definitely has nothing to do with "ibm,xive-lisn-ranges".

David suggested in some other mail that we could maybe pass
both spapr_max_server_number(spapr) and smp.max_cpus to the
activate() handler.

> > This series aims at getting rid of the "nr_server" argument in the
> > sPAPR IC handlers. Since both the highest possible vCPU id and the
> > maximum number of vCPUs are invariants for XICS and XIVE respectively,
> 
> What XIVE cares about is the number of possible IPIs and the number
> of vCPUs since we deduced from that the number of event queue 
> descriptors, which is another XIVE structure.
> 
> > let's make them device properties to be configured by the machine when
> > it creates the interrupt controllers and use them where needed.
> > 
> > This doesn't cause any visible change to setups using the default
> > VSMT machine settings. This changes "ibm,xive-lisn-ranges" for
> > setups that mess with VSMT, but this is acceptable since linux
> > only allocates one interrupt per vCPU and the higher part of the
> > range was never used.
> 
> This range is only used for vCPUs. 
> 
> C.
> 
> > [1] https://git.qemu.org/?p=qemu.git;a=commit;h=6d24795ee7e3199d199d3c415312c93382ad1807
> > 
> > Greg Kurz (8):
> >   spapr/xive: Turn some sanity checks into assertions
> >   spapr/xive: Introduce spapr_xive_nr_ends()
> >   spapr/xive: Add "nr-servers" property
> >   spapr/xive: Add "nr-ipis" property
> >   spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()
> >   spapr/xics: Add "nr-servers" property
> >   spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation
> >   spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation
> > 
> >  include/hw/ppc/spapr.h      |  4 +--
> >  include/hw/ppc/spapr_irq.h  |  9 ++---
> >  include/hw/ppc/spapr_xive.h | 25 ++++++++++++-
> >  include/hw/ppc/xics_spapr.h | 23 +++++++++---
> >  hw/intc/spapr_xive.c        | 72 ++++++++++++++++++++++---------------
> >  hw/intc/spapr_xive_kvm.c    |  4 +--
> >  hw/intc/xics_kvm.c          |  4 +--
> >  hw/intc/xics_spapr.c        | 45 ++++++++++++++---------
> >  hw/ppc/spapr.c              |  7 ++--
> >  hw/ppc/spapr_irq.c          | 27 +++++++-------
> >  10 files changed, 141 insertions(+), 79 deletions(-)
> > 
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property
  2020-11-20 17:46 ` [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property Greg Kurz
  2020-11-23  4:10   ` David Gibson
@ 2020-11-23 10:13   ` Cédric Le Goater
  2020-11-24 17:18     ` Greg Kurz
  1 sibling, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-23 10:13 UTC (permalink / raw)
  To: Greg Kurz, qemu-devel; +Cc: qemu-ppc, David Gibson

On 11/20/20 6:46 PM, Greg Kurz wrote:
> The sPAPR XIVE device exposes a range of LISNs that the guest uses
> for IPIs. This range is currently sized according to the highest
> vCPU id, ie. spapr_max_server_number(), as obtained from the machine
> through the "nr_servers" argument of the generic spapr_irq_dt() call.
> 
> This makes sense for the "ibm,interrupt-server-ranges" property of
> sPAPR XICS, but certainly not for "ibm,xive-lisn-ranges". The range
> should be sized to the maximum number of possible vCPUs. It happens
> that spapr_max_server_number() == smp.max_cpus in practice with the
> machine default settings. This might not be the case though when
> VSMT is in use : we can end up with a much larger range (up to 8
> times bigger) than needed and waste LISNs. 

will it exceed 4K ? if not, we are fine.

The fact that it is so complex to get the number of vCPUs is a 
problem by it self to me. Can we simplify that ? or introduce a 
spapr_max_server_number_id() ? 

> But most importantly, this
> lures people into thinking that IPI numbers are always equal to
> vCPU ids, which is wrong and bit us recently:

do we have a converting routing vcpu_id_to_ipi ? I think we have
that in KVM.

> https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg01476.html
> 
> Introduce an "nr-ipis" property that the machine sets to smp.max_cpus
> before realizing the deice. Use that instead of "nr_servers" in
> spapr_xive_dt(). Have the machine to claim the same number of IPIs
> in spapr_irq_init().
> 
> This doesn't cause any guest visible change when using the machine
> default settings (ie. VSMT == smp.threads).>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  include/hw/ppc/spapr_xive.h | 8 ++++++++
>  hw/intc/spapr_xive.c        | 4 +++-
>  hw/ppc/spapr_irq.c          | 4 +++-
>  3 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 7123ea07ed78..69b9fbc1b9a5 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -43,6 +43,14 @@ typedef struct SpaprXive {
>  
>      /* DT */
>      gchar *nodename;
> +    /*
> +     * The sPAPR XIVE device needs to know how many vCPUs it
> +     * might be exposed to in order to size the IPI range in
> +     * "ibm,interrupt-server-ranges". It is the purpose of the

There is no "ibm,interrupt-server-ranges"  property in XIVE

> +     * "nr-ipis" property which *must* be set to a non-null
> +     * value before realizing the sPAPR XIVE device.
> +     */
> +    uint32_t nr_ipis;
>  
>      /* Routing table */
>      XiveEAS       *eat;
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index e4dbf9c2c409..d13a2955ce9b 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -311,6 +311,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>      /* Set by spapr_irq_init() */
>      g_assert(xive->nr_irqs);
>      g_assert(xive->nr_servers);
> +    g_assert(xive->nr_ipis);
>  
>      sxc->parent_realize(dev, &local_err);
>      if (local_err) {
> @@ -608,6 +609,7 @@ static Property spapr_xive_properties[] = {
>       */
>      DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends_vmstate, 0),
>      DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
> +    DEFINE_PROP_UINT32("nr-ipis", SpaprXive, nr_ipis, 0),
>      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
>      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
>      DEFINE_PROP_UINT8("hv-prio", SpaprXive, hv_prio, 7),
> @@ -698,7 +700,7 @@ static void spapr_xive_dt(SpaprInterruptController *intc, uint32_t nr_servers,
>      /* Interrupt number ranges for the IPIs */
>      uint32_t lisn_ranges[] = {
>          cpu_to_be32(SPAPR_IRQ_IPI),
> -        cpu_to_be32(SPAPR_IRQ_IPI + nr_servers),
> +        cpu_to_be32(SPAPR_IRQ_IPI + xive->nr_ipis),

I don't understand why we need nr_ipis. Can't we pass the same value in 
nr_servers from the machine ?

( Conceptually, the first 4K are all IPIs. The first range is for 
  HW threads ) 


>      };
>      /*
>       * EQ size - the sizes of pages supported by the system 4K, 64K,
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 8c5627225636..a0fc474ecb06 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -325,12 +325,14 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>  
>      if (spapr->irq->xive) {
>          uint32_t nr_servers = spapr_max_server_number(spapr);
> +        uint32_t max_cpus = MACHINE(spapr)->smp.max_cpus;

So that's the value we should be using in case of VSMT and not 
spapr_max_server_number(). If I am correct, this is a max_cpu_id ?


>          DeviceState *dev;
>          int i;
>  
>          dev = qdev_new(TYPE_SPAPR_XIVE);
>          qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
>          qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
> +        qdev_prop_set_uint32(dev, "nr-ipis", max_cpus);
>          object_property_set_link(OBJECT(dev), "xive-fabric", OBJECT(spapr),
>                                   &error_abort);
>          sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> @@ -338,7 +340,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>          spapr->xive = SPAPR_XIVE(dev);
>  
>          /* Enable the CPU IPIs */
> -        for (i = 0; i < nr_servers; ++i) {
> +        for (i = 0; i < max_cpus; ++i) {
>              SpaprInterruptControllerClass *sicc
>                  = SPAPR_INTC_GET_CLASS(spapr->xive);






^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 5/8] spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()
  2020-11-20 17:46 ` [PATCH for-6.0 5/8] spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect() Greg Kurz
  2020-11-23  4:10   ` David Gibson
@ 2020-11-23 10:15   ` Cédric Le Goater
  1 sibling, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-23 10:15 UTC (permalink / raw)
  To: Greg Kurz, qemu-devel; +Cc: qemu-ppc, David Gibson

On 11/20/20 6:46 PM, Greg Kurz wrote:
> Never used from the start.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: Cédric Le Goater <clg@kaod.org>



> ---
>  include/hw/ppc/xics_spapr.h | 2 +-
>  hw/intc/xics_kvm.c          | 2 +-
>  hw/ppc/spapr_irq.c          | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
> index 0b8182e40b33..de752c0d2c7e 100644
> --- a/include/hw/ppc/xics_spapr.h
> +++ b/include/hw/ppc/xics_spapr.h
> @@ -38,6 +38,6 @@ DECLARE_INSTANCE_CHECKER(ICSState, ICS_SPAPR,
>  int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
>                       Error **errp);
>  void xics_kvm_disconnect(SpaprInterruptController *intc);
> -bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr);
> +bool xics_kvm_has_broken_disconnect(void);
>  
>  #endif /* XICS_SPAPR_H */
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index 68bb1914b9bb..570d635bcc08 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -484,7 +484,7 @@ void xics_kvm_disconnect(SpaprInterruptController *intc)
>   * support destruction of a KVM XICS device while the VM is running.
>   * Required to start a spapr machine with ic-mode=dual,kernel-irqchip=on.
>   */
> -bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr)
> +bool xics_kvm_has_broken_disconnect(void)
>  {
>      int rc;
>  
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index a0fc474ecb06..2dacbecfd5fd 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -186,7 +186,7 @@ static int spapr_irq_check(SpaprMachineState *spapr, Error **errp)
>      if (kvm_enabled() &&
>          spapr->irq == &spapr_irq_dual &&
>          kvm_kernel_irqchip_required() &&
> -        xics_kvm_has_broken_disconnect(spapr)) {
> +        xics_kvm_has_broken_disconnect()) {
>          error_setg(errp,
>              "KVM is incompatible with ic-mode=dual,kernel-irqchip=on");
>          error_append_hint(errp,
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property
  2020-11-20 17:46 ` [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property Greg Kurz
  2020-11-23  4:18   ` David Gibson
@ 2020-11-23 10:24   ` Cédric Le Goater
  1 sibling, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-23 10:24 UTC (permalink / raw)
  To: Greg Kurz, qemu-devel; +Cc: qemu-ppc, David Gibson

On 11/20/20 6:46 PM, Greg Kurz wrote:
> The sPAPR ICS device exposes the range of vCPU ids it can handle in
> the "ibm,interrupt-server-ranges" FDT property. The highest vCPU
> id, ie. spapr_max_server_number(), is obtained from the machine
> through the "nr_servers" argument of the generic spapr_irq_dt() call.
> 
> We want to drop the "nr_servers" argument from the API because it
> doesn't make sense for the sPAPR XIVE device actually.
> 
> On POWER9, we also pass the highest vCPU id to the KVM XICS-on-XIVE
> device, in order to optimize resource allocation in the HW.
> 
> This is enough motivation to introduce an "nr-servers" property
> and to use it for both purposes.

I don't see a strong justification for storing this information at
the interrupt controller level. If it can be kept at the machine 
level, like it is today, I think it's better.

C. 

 
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  include/hw/ppc/spapr.h      |  4 ++--
>  include/hw/ppc/xics_spapr.h | 21 +++++++++++++++++---
>  hw/intc/xics_kvm.c          |  2 +-
>  hw/intc/xics_spapr.c        | 38 ++++++++++++++++++++++++-------------
>  hw/ppc/spapr.c              |  5 +++--
>  hw/ppc/spapr_irq.c          |  7 +++++--
>  6 files changed, 54 insertions(+), 23 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 2e89e36cfbdc..b76c84a2f884 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -10,7 +10,7 @@
>  #include "hw/ppc/spapr_irq.h"
>  #include "qom/object.h"
>  #include "hw/ppc/spapr_xive.h"  /* For SpaprXive */
> -#include "hw/ppc/xics.h"        /* For ICSState */
> +#include "hw/ppc/xics_spapr.h"  /* For IcsSpaprState */
>  #include "hw/ppc/spapr_tpm_proxy.h"
>  
>  struct SpaprVioBus;
> @@ -230,7 +230,7 @@ struct SpaprMachineState {
>      SpaprIrq *irq;
>      qemu_irq *qirqs;
>      SpaprInterruptController *active_intc;
> -    ICSState *ics;
> +    IcsSpaprState *ics;
>      SpaprXive *xive;
>  
>      bool cmd_line_caps[SPAPR_CAP_NUM];
> diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
> index de752c0d2c7e..1a483a873b62 100644
> --- a/include/hw/ppc/xics_spapr.h
> +++ b/include/hw/ppc/xics_spapr.h
> @@ -28,12 +28,27 @@
>  #define XICS_SPAPR_H
>  
>  #include "hw/ppc/spapr.h"
> +#include "hw/ppc/xics.h"
>  #include "qom/object.h"
>  
> +typedef struct IcsSpaprState {
> +    /*< private >*/
> +    ICPState parent_obj;
> +
> +    /*
> +     * The ICS needs to know the upper limit to vCPU ids it
> +     * might be exposed to in order to size the vCPU id range
> +     * in "ibm,interrupt-server-ranges" and to optimize HW
> +     * resource allocation when using the XICS-on-XIVE KVM
> +     * device. It is the purpose of the "nr-servers" property
> +     * which *must* be set to a non-null value before realizing
> +     * the ICS.
> +     */
> +    uint32_t nr_servers;
> +} IcsSpaprState;
> +
>  #define TYPE_ICS_SPAPR "ics-spapr"
> -/* This is reusing the ICSState typedef from TYPE_ICS */
> -DECLARE_INSTANCE_CHECKER(ICSState, ICS_SPAPR,
> -                         TYPE_ICS_SPAPR)
> +DECLARE_INSTANCE_CHECKER(IcsSpaprState, ICS_SPAPR, TYPE_ICS_SPAPR)
>  
>  int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
>                       Error **errp);
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index 570d635bcc08..ecbbda0e249b 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -350,7 +350,7 @@ void ics_kvm_set_irq(ICSState *ics, int srcno, int val)
>  int xics_kvm_connect(SpaprInterruptController *intc, uint32_t nr_servers,
>                       Error **errp)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      int rc;
>      CPUState *cs;
>      Error *local_err = NULL;
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index 8ae4f41459c3..ce147e8980ed 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -34,6 +34,7 @@
>  #include "hw/ppc/xics.h"
>  #include "hw/ppc/xics_spapr.h"
>  #include "hw/ppc/fdt.h"
> +#include "hw/qdev-properties.h"
>  #include "qapi/visitor.h"
>  
>  /*
> @@ -154,7 +155,7 @@ static void rtas_set_xive(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                            uint32_t nargs, target_ulong args,
>                            uint32_t nret, target_ulong rets)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      uint32_t nr, srcno, server, priority;
>  
>      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> @@ -189,7 +190,7 @@ static void rtas_get_xive(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                            uint32_t nargs, target_ulong args,
>                            uint32_t nret, target_ulong rets)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      uint32_t nr, srcno;
>  
>      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> @@ -221,7 +222,7 @@ static void rtas_int_off(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                           uint32_t nargs, target_ulong args,
>                           uint32_t nret, target_ulong rets)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      uint32_t nr, srcno;
>  
>      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> @@ -254,7 +255,7 @@ static void rtas_int_on(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                          uint32_t nargs, target_ulong args,
>                          uint32_t nret, target_ulong rets)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      uint32_t nr, srcno;
>  
>      CHECK_EMULATED_XICS_RTAS(spapr, rets);
> @@ -285,10 +286,13 @@ static void rtas_int_on(PowerPCCPU *cpu, SpaprMachineState *spapr,
>  
>  static void ics_spapr_realize(DeviceState *dev, Error **errp)
>  {
> -    ICSState *ics = ICS_SPAPR(dev);
> -    ICSStateClass *icsc = ICS_GET_CLASS(ics);
> +    IcsSpaprState *sics = ICS_SPAPR(dev);
> +    ICSStateClass *icsc = ICS_GET_CLASS(dev);
>      Error *local_err = NULL;
>  
> +    /* Set by spapr_irq_init() */
> +    g_assert(sics->nr_servers);
> +
>      icsc->parent_realize(dev, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
> @@ -312,7 +316,7 @@ static void xics_spapr_dt(SpaprInterruptController *intc, uint32_t nr_servers,
>                            void *fdt, uint32_t phandle)
>  {
>      uint32_t interrupt_server_ranges_prop[] = {
> -        0, cpu_to_be32(nr_servers),
> +        0, cpu_to_be32(ICS_SPAPR(intc)->nr_servers),
>      };
>      int node;
>  
> @@ -333,7 +337,7 @@ static void xics_spapr_dt(SpaprInterruptController *intc, uint32_t nr_servers,
>  static int xics_spapr_cpu_intc_create(SpaprInterruptController *intc,
>                                         PowerPCCPU *cpu, Error **errp)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      Object *obj;
>      SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
>  
> @@ -364,7 +368,7 @@ static void xics_spapr_cpu_intc_destroy(SpaprInterruptController *intc,
>  static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
>                                  bool lsi, Error **errp)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>  
>      assert(ics);
>      assert(ics_valid_irq(ics, irq));
> @@ -380,7 +384,7 @@ static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
>  
>  static void xics_spapr_free_irq(SpaprInterruptController *intc, int irq)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      uint32_t srcno = irq - ics->offset;
>  
>      assert(ics_valid_irq(ics, irq));
> @@ -390,7 +394,7 @@ static void xics_spapr_free_irq(SpaprInterruptController *intc, int irq)
>  
>  static void xics_spapr_set_irq(SpaprInterruptController *intc, int irq, int val)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      uint32_t srcno = irq - ics->offset;
>  
>      ics_set_irq(ics, srcno, val);
> @@ -398,7 +402,7 @@ static void xics_spapr_set_irq(SpaprInterruptController *intc, int irq, int val)
>  
>  static void xics_spapr_print_info(SpaprInterruptController *intc, Monitor *mon)
>  {
> -    ICSState *ics = ICS_SPAPR(intc);
> +    ICSState *ics = ICS(intc);
>      CPUState *cs;
>  
>      CPU_FOREACH(cs) {
> @@ -426,7 +430,8 @@ static int xics_spapr_activate(SpaprInterruptController *intc,
>                                 uint32_t nr_servers, Error **errp)
>  {
>      if (kvm_enabled()) {
> -        return spapr_irq_init_kvm(xics_kvm_connect, intc, nr_servers, errp);
> +        return spapr_irq_init_kvm(xics_kvm_connect, intc,
> +                                  ICS_SPAPR(intc)->nr_servers, errp);
>      }
>      return 0;
>  }
> @@ -438,6 +443,11 @@ static void xics_spapr_deactivate(SpaprInterruptController *intc)
>      }
>  }
>  
> +static Property ics_spapr_properties[] = {
> +    DEFINE_PROP_UINT32("nr-servers", IcsSpaprState, nr_servers, 0),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
>  static void ics_spapr_class_init(ObjectClass *klass, void *data)
>  {
>      DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -446,6 +456,7 @@ static void ics_spapr_class_init(ObjectClass *klass, void *data)
>  
>      device_class_set_parent_realize(dc, ics_spapr_realize,
>                                      &isc->parent_realize);
> +    device_class_set_props(dc, ics_spapr_properties);
>      sicc->activate = xics_spapr_activate;
>      sicc->deactivate = xics_spapr_deactivate;
>      sicc->cpu_intc_create = xics_spapr_cpu_intc_create;
> @@ -462,6 +473,7 @@ static void ics_spapr_class_init(ObjectClass *klass, void *data)
>  static const TypeInfo ics_spapr_info = {
>      .name = TYPE_ICS_SPAPR,
>      .parent = TYPE_ICS,
> +    .instance_size = sizeof(IcsSpaprState),
>      .class_init = ics_spapr_class_init,
>      .interfaces = (InterfaceInfo[]) {
>          { TYPE_SPAPR_INTC },
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 12a012d9dd09..21de0456446b 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -4218,15 +4218,16 @@ static void spapr_phb_placement(SpaprMachineState *spapr, uint32_t index,
>  static ICSState *spapr_ics_get(XICSFabric *dev, int irq)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(dev);
> +    ICSState *ics = ICS(spapr->ics);
>  
> -    return ics_valid_irq(spapr->ics, irq) ? spapr->ics : NULL;
> +    return ics_valid_irq(ics, irq) ? ics : NULL;
>  }
>  
>  static void spapr_ics_resend(XICSFabric *dev)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(dev);
>  
> -    ics_resend(spapr->ics);
> +    ics_resend(ICS(spapr->ics));
>  }
>  
>  static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 2dacbecfd5fd..be6f4041e433 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -316,6 +316,9 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>          object_property_set_link(obj, ICS_PROP_XICS, OBJECT(spapr),
>                                   &error_abort);
>          object_property_set_int(obj, "nr-irqs", smc->nr_xirqs, &error_abort);
> +        object_property_set_uint(obj, "nr-servers",
> +                                 spapr_max_server_number(spapr),
> +                                 &error_abort);
>          if (!qdev_realize(DEVICE(obj), NULL, errp)) {
>              return;
>          }
> @@ -426,7 +429,7 @@ qemu_irq spapr_qirq(SpaprMachineState *spapr, int irq)
>      assert(irq < (smc->nr_xirqs + SPAPR_XIRQ_BASE));
>  
>      if (spapr->ics) {
> -        assert(ics_valid_irq(spapr->ics, irq));
> +        assert(ics_valid_irq(ICS(spapr->ics), irq));
>      }
>      if (spapr->xive) {
>          assert(irq < spapr->xive->nr_irqs);
> @@ -556,7 +559,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>  
>  int spapr_irq_find(SpaprMachineState *spapr, int num, bool align, Error **errp)
>  {
> -    ICSState *ics = spapr->ics;
> +    ICSState *ics = ICS(spapr->ics);
>      int first = -1;
>  
>      assert(ics);
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-23  9:46   ` Cédric Le Goater
@ 2020-11-23 11:16     ` Greg Kurz
  2020-11-24 13:54       ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Greg Kurz @ 2020-11-23 11:16 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Mon, 23 Nov 2020 10:46:38 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/20/20 6:46 PM, Greg Kurz wrote:
> > We're going to kill the "nr_ends" field in a subsequent patch.
> 
> why ? it is one of the tables of the controller and its part of 
> the main XIVE concepts. Conceptually, we could let the machine 
> dimension it with an arbitrary value as OPAL does. The controller
> would fail when the table is fully used. 
> 

The idea is that the sPAPR machine only true need is to create a
controller that can accommodate up to a certain number of vCPU ids.
It doesn't really to know about the END itself IMHO.

This being said, if we decide to pass both spapr_max_server_number()
and smp.max_cpus down to the backends as function arguments, we won't
have to change "nr_ends" at all.

>  
> > Prepare ground by using an helper instead of peeking into
> > the sPAPR XIVE structure directly.
> 
> 
> I am not against the helper though but we should introduce a 
> prio_shift value which would let us define the number of 
> available priorities. To be linked with "hv-prio"
> 
> C.
> 
> 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  include/hw/ppc/spapr_xive.h |  1 +
> >  hw/intc/spapr_xive.c        | 23 ++++++++++++++---------
> >  hw/intc/spapr_xive_kvm.c    |  4 ++--
> >  3 files changed, 17 insertions(+), 11 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index 26c8d90d7196..4b967f13c030 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -75,6 +75,7 @@ void spapr_xive_map_mmio(SpaprXive *xive);
> >  
> >  int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> >                               uint32_t *out_server, uint8_t *out_prio);
> > +uint32_t spapr_xive_nr_ends(const SpaprXive *xive);
> >  
> >  /*
> >   * KVM XIVE device helpers
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index 60e0d5769dcc..f473ad9cba47 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -192,7 +192,7 @@ void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon)
> >              uint32_t end_idx = xive_get_field64(EAS_END_INDEX, eas->w);
> >              XiveEND *end;
> >  
> > -            assert(end_idx < xive->nr_ends);
> > +            assert(end_idx < spapr_xive_nr_ends(xive));
> >              end = &xive->endt[end_idx];
> >  
> >              if (xive_end_is_valid(end)) {
> > @@ -270,7 +270,7 @@ static void spapr_xive_reset(void *dev)
> >      }
> >  
> >      /* Clear all ENDs */
> > -    for (i = 0; i < xive->nr_ends; i++) {
> > +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
> >          spapr_xive_end_reset(&xive->endt[i]);
> >      }
> >  }
> > @@ -288,6 +288,11 @@ static void spapr_xive_instance_init(Object *obj)
> >      xive->fd = -1;
> >  }
> >  
> > +uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
> > +{
> > +    return xive->nr_ends;
> > +}
> > +
> >  static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >  {
> >      SpaprXive *xive = SPAPR_XIVE(dev);
> > @@ -336,7 +341,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >       * Allocate the routing tables
> >       */
> >      xive->eat = g_new0(XiveEAS, xive->nr_irqs);
> > -    xive->endt = g_new0(XiveEND, xive->nr_ends);
> > +    xive->endt = g_new0(XiveEND, spapr_xive_nr_ends(xive));
> >  
> >      xive->nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
> >                             xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
> > @@ -375,7 +380,7 @@ static int spapr_xive_get_end(XiveRouter *xrtr,
> >  {
> >      SpaprXive *xive = SPAPR_XIVE(xrtr);
> >  
> > -    if (end_idx >= xive->nr_ends) {
> > +    if (end_idx >= spapr_xive_nr_ends(xive)) {
> >          return -1;
> >      }
> >  
> > @@ -389,7 +394,7 @@ static int spapr_xive_write_end(XiveRouter *xrtr, uint8_t end_blk,
> >  {
> >      SpaprXive *xive = SPAPR_XIVE(xrtr);
> >  
> > -    if (end_idx >= xive->nr_ends) {
> > +    if (end_idx >= spapr_xive_nr_ends(xive)) {
> >          return -1;
> >      }
> >  
> > @@ -1138,7 +1143,7 @@ static target_ulong h_int_get_source_config(PowerPCCPU *cpu,
> >      /* EAS_END_BLOCK is unused on sPAPR */
> >      end_idx = xive_get_field64(EAS_END_INDEX, eas.w);
> >  
> > -    assert(end_idx < xive->nr_ends);
> > +    assert(end_idx < spapr_xive_nr_ends(xive));
> >      end = &xive->endt[end_idx];
> >  
> >      nvt_blk = xive_get_field32(END_W6_NVT_BLOCK, end->w6);
> > @@ -1216,7 +1221,7 @@ static target_ulong h_int_get_queue_info(PowerPCCPU *cpu,
> >          return H_P2;
> >      }
> >  
> > -    assert(end_idx < xive->nr_ends);
> > +    assert(end_idx < spapr_xive_nr_ends(xive));
> >      end = &xive->endt[end_idx];
> >  
> >      args[0] = xive->end_base + (1ull << (end_xsrc->esb_shift + 1)) * end_idx;
> > @@ -1304,7 +1309,7 @@ static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
> >          return H_P2;
> >      }
> >  
> > -    assert(end_idx < xive->nr_ends);
> > +    assert(end_idx < spapr_xive_nr_ends(xive));
> >      memcpy(&end, &xive->endt[end_idx], sizeof(XiveEND));
> >  
> >      switch (qsize) {
> > @@ -1470,7 +1475,7 @@ static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
> >          return H_P2;
> >      }
> >  
> > -    assert(end_idx < xive->nr_ends);
> > +    assert(end_idx < spapr_xive_nr_ends(xive));
> >      end = &xive->endt[end_idx];
> >  
> >      args[0] = 0;
> > diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> > index 66bf4c06fe55..1566016f0e28 100644
> > --- a/hw/intc/spapr_xive_kvm.c
> > +++ b/hw/intc/spapr_xive_kvm.c
> > @@ -531,7 +531,7 @@ static int kvmppc_xive_get_queues(SpaprXive *xive, Error **errp)
> >      int i;
> >      int ret;
> >  
> > -    for (i = 0; i < xive->nr_ends; i++) {
> > +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
> >          if (!xive_end_is_valid(&xive->endt[i])) {
> >              continue;
> >          }
> > @@ -701,7 +701,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
> >      assert(xive->fd != -1);
> >  
> >      /* Restore the ENDT first. The targetting depends on it. */
> > -    for (i = 0; i < xive->nr_ends; i++) {
> > +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
> >          if (!xive_end_is_valid(&xive->endt[i])) {
> >              continue;
> >          }
> > 
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property
  2020-11-23  9:56   ` Cédric Le Goater
@ 2020-11-23 11:25     ` Greg Kurz
  2020-11-24 14:18       ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Greg Kurz @ 2020-11-23 11:25 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Mon, 23 Nov 2020 10:56:13 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/20/20 6:46 PM, Greg Kurz wrote:
> > The sPAPR XIVE object has an "nr-ends" property that is used
> > to size the END table. This property is set by the machine
> > code to a value derived from spapr_max_server_number().
> 
> Yes. This is done at the machine level.
> 
> > spapr_max_server_number() is also used to inform the KVM XIVE
> > device about the range of vCPU ids it might be exposed to,
> > in order to optimize resource allocation in the HW.
> 
> Yes. It's deeply buried in the spapr/irq/xive/kvm layers but it is
> called by set_active_intc() which is, for me, a machine level 
> operation.
> 
> The only other place where the number of vCPUs is used is in 
> spapr_xive_dt() to define the vCPU IPI range, which done at
> the machine level again.
> 

spapr_max_server_number() isn't the number of vCPUs. It is the
number of consecutive vCPU ids the IC might be exposed to,
starting from 0.

> So I don't agree with this patch.
> 

As said in another mail, I'll try a different approach.

> C.
> 
> 
> > This is enough motivation to introduce an "nr-servers" property
> > and to use it for both purposes. The existing "nr-ends" property
> > is now longer used. It is kept around though because it is exposed
> > to -global. It will continue to be ignored as before without
> > causing QEMU to exit.
> > 
> > The associated nr_ends field cannot be dropped from SpaprXive
> > because it is explicitly used by vmstate_spapr_xive(). It is
> > thus renamed to nr_ends_vmstate.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  include/hw/ppc/spapr_xive.h | 16 +++++++++++++++-
> >  hw/intc/spapr_xive.c        | 28 ++++++++++++++++++++++------
> >  hw/ppc/spapr_irq.c          |  6 +-----
> >  3 files changed, 38 insertions(+), 12 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index 4b967f13c030..7123ea07ed78 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -23,6 +23,16 @@
> >  typedef struct SpaprXive {
> >      XiveRouter    parent;
> >  
> > +    /*
> > +     * The XIVE device needs to know the highest vCPU id it might
> > +     * be exposed to in order to size the END table. It may also
> > +     * propagate the value to the KVM XIVE device in order to
> > +     * optimize resource allocation in the HW.
> > +     * This must be set to a non-null value using the "nr-servers"
> > +     * property, before realizing the device.
> > +     */
> > +    uint32_t      nr_servers;
> > +
> >      /* Internal interrupt source for IPIs and virtual devices */
> >      XiveSource    source;
> >      hwaddr        vc_base;
> > @@ -38,7 +48,11 @@ typedef struct SpaprXive {
> >      XiveEAS       *eat;
> >      uint32_t      nr_irqs;
> >      XiveEND       *endt;
> > -    uint32_t      nr_ends;
> > +    /*
> > +     * This is derived from nr_servers but it must be kept around because
> > +     * vmstate_spapr_xive uses it.
> > +     */
> > +    uint32_t      nr_ends_vmstate;
> >  
> >      /* TIMA mapping address */
> >      hwaddr        tm_base;
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index f473ad9cba47..e4dbf9c2c409 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -99,6 +99,12 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> >      return 0;
> >  }
> >  
> > +/*
> > + * 8 XIVE END structures per CPU. One for each available
> > + * priority
> > + */
> > +#define spapr_xive_cpu_end_idx(vcpu, prio) (((vcpu) << 3) + prio)
> > +
> >  static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> >                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
> >  {
> > @@ -109,7 +115,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> >      }
> >  
> >      if (out_end_idx) {
> > -        *out_end_idx = (cpu->vcpu_id << 3) + prio;
> > +        *out_end_idx = spapr_xive_cpu_end_idx(cpu->vcpu_id, prio);
> >      }
> >  }
> >  
> > @@ -290,7 +296,8 @@ static void spapr_xive_instance_init(Object *obj)
> >  
> >  uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
> >  {
> > -    return xive->nr_ends;
> > +    g_assert(xive->nr_servers);
> > +    return spapr_xive_cpu_end_idx(xive->nr_servers, 0);
> >  }
> >  
> >  static void spapr_xive_realize(DeviceState *dev, Error **errp)
> > @@ -303,7 +310,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >  
> >      /* Set by spapr_irq_init() */
> >      g_assert(xive->nr_irqs);
> > -    g_assert(xive->nr_ends);
> > +    g_assert(xive->nr_servers);
> >  
> >      sxc->parent_realize(dev, &local_err);
> >      if (local_err) {
> > @@ -360,6 +367,8 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 0, xive->vc_base);
> >      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 1, xive->end_base);
> >      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
> > +
> > +    xive->nr_ends_vmstate = spapr_xive_nr_ends(xive);
> >  }
> >  
> >  static int spapr_xive_get_eas(XiveRouter *xrtr, uint8_t eas_blk,
> > @@ -547,7 +556,7 @@ static const VMStateDescription vmstate_spapr_xive = {
> >          VMSTATE_UINT32_EQUAL(nr_irqs, SpaprXive, NULL),
> >          VMSTATE_STRUCT_VARRAY_POINTER_UINT32(eat, SpaprXive, nr_irqs,
> >                                       vmstate_spapr_xive_eas, XiveEAS),
> > -        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends,
> > +        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends_vmstate,
> >                                               vmstate_spapr_xive_end, XiveEND),
> >          VMSTATE_END_OF_LIST()
> >      },
> > @@ -591,7 +600,14 @@ static void spapr_xive_free_irq(SpaprInterruptController *intc, int lisn)
> >  
> >  static Property spapr_xive_properties[] = {
> >      DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
> > -    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
> > +    /*
> > +     * "nr-ends" is deprecated by "nr-servers" since QEMU 6.0.
> > +     * It is just kept around because it is exposed to the user
> > +     * through -global and we don't want it to fail, even if
> > +     * the value is actually overridden internally.
> > +     */
> > +    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends_vmstate, 0),
> > +    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
> >      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
> >      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
> >      DEFINE_PROP_UINT8("hv-prio", SpaprXive, hv_prio, 7),
> > @@ -742,7 +758,7 @@ static int spapr_xive_activate(SpaprInterruptController *intc,
> >      SpaprXive *xive = SPAPR_XIVE(intc);
> >  
> >      if (kvm_enabled()) {
> > -        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, nr_servers,
> > +        int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, xive->nr_servers,
> >                                      errp);
> >          if (rc < 0) {
> >              return rc;
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index f59960339ec3..8c5627225636 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -330,11 +330,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
> >  
> >          dev = qdev_new(TYPE_SPAPR_XIVE);
> >          qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
> > -        /*
> > -         * 8 XIVE END structures per CPU. One for each available
> > -         * priority
> > -         */
> > -        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> > +        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
> >          object_property_set_link(OBJECT(dev), "xive-fabric", OBJECT(spapr),
> >                                   &error_abort);
> >          sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> > 
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-23 11:16     ` Greg Kurz
@ 2020-11-24 13:54       ` Cédric Le Goater
  2020-11-24 17:01         ` Greg Kurz
  0 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-24 13:54 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson

On 11/23/20 12:16 PM, Greg Kurz wrote:
> On Mon, 23 Nov 2020 10:46:38 +0100
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> On 11/20/20 6:46 PM, Greg Kurz wrote:
>>> We're going to kill the "nr_ends" field in a subsequent patch.
>>
>> why ? it is one of the tables of the controller and its part of 
>> the main XIVE concepts. Conceptually, we could let the machine 
>> dimension it with an arbitrary value as OPAL does. The controller
>> would fail when the table is fully used. 
>>
> 
> The idea is that the sPAPR machine only true need is to create a
> controller that can accommodate up to a certain number of vCPU ids.
> It doesn't really to know about the END itself IMHO.> 
> This being said, if we decide to pass both spapr_max_server_number()
> and smp.max_cpus down to the backends as function arguments, we won't
> have to change "nr_ends" at all.

I would prefer that but I am still not sure what they represent. 

Looking at the sPAPR XIVE code, we deal with numbers/ranges in the 
following places today.

 * spapr_xive_dt() 

   It defines a range of interrupt numbers to be used by the guest 
   for the threads/vCPUs IPIs. It's a subset of interrupt numbers 
   in :

		[ SPAPR_IRQ_IPI - SPAPR_IRQ_EPOW [

   These are not vCPU ids.

   Since these interrupt numbers will be considered as free to use
   by the OS, it makes sense to pre-claim them. But claiming an 
   interrupt number in the guest can potentially set up, through 
   the KVM device, a mapping on the host and in HW. See below why
   this can be a problem.

 * kvmppc_xive_cpu_connect()   

   This sizes the NVT tables in OPAL for the guest. This is the  
   max number of vCPUs of the guest (not vCPU ids)

 * spapr_irq_init()

   This is where the IPI interrupt numbers are claimed today. 
   Directly in QEMU and KVM, if the machine is running XIVE only, 
   indirectly if it's dual, first in QEMU and then in KVM when 
   the machine switches of interrupt mode in CAS. 

   The problem is that the underlying XIVE resources in HW are 
   allocated where the QEMU process is running. Which is not the
   best option when the vCPUs are pinned on different chips.

   My patchset was trying to improve that by claiming the IPI on 
   demand when the vCPU is connected to the KVM device. But it 
   was using the vCPU id as the IPI interrupt number which is 
   utterly wrong, the guest OS could use any number in the range 
   exposed in the DT.
   
   The last patch you sent was going in the right direction I think.
   That is to claim the IPI when the guest OS is requesting for it. 

   http://patchwork.ozlabs.org/project/qemu-devel/patch/160528045027.804522.6161091782230763832.stgit@bahia.lan/
   
   But I don't understand why it was so complex. It should be like
   the MSIs claimed by PCI devices.


All this to say, that we need to size better the range in the 
"ibm,xive-lisn-ranges" property if that's broken for vSMT. 

Then, I think the IPIs can be treated just like the PCI MSIs
but they need to be claimed first. That's the ugly part. 

Should we add a special check in h_int_set_source_config to
deal with unclaimed IPIs that are being configured ?


C.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property
  2020-11-23 11:25     ` Greg Kurz
@ 2020-11-24 14:18       ` Cédric Le Goater
  0 siblings, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-24 14:18 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson

>> The only other place where the number of vCPUs is used is in 
>> spapr_xive_dt() to define the vCPU IPI range, which done at
>> the machine level again.
> 
> spapr_max_server_number() isn't the number of vCPUs. It is the
> number of consecutive vCPU ids the IC might be exposed to,
> starting from 0.

We need to clarify the naming. this is confusing. 

What we care about is the number of vCPUs to compute the width 
of the IPI range. 

C.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-24 13:54       ` Cédric Le Goater
@ 2020-11-24 17:01         ` Greg Kurz
  2020-11-24 17:56           ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Greg Kurz @ 2020-11-24 17:01 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Tue, 24 Nov 2020 14:54:38 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/23/20 12:16 PM, Greg Kurz wrote:
> > On Mon, 23 Nov 2020 10:46:38 +0100
> > Cédric Le Goater <clg@kaod.org> wrote:
> > 
> >> On 11/20/20 6:46 PM, Greg Kurz wrote:
> >>> We're going to kill the "nr_ends" field in a subsequent patch.
> >>
> >> why ? it is one of the tables of the controller and its part of 
> >> the main XIVE concepts. Conceptually, we could let the machine 
> >> dimension it with an arbitrary value as OPAL does. The controller
> >> would fail when the table is fully used. 
> >>
> > 
> > The idea is that the sPAPR machine only true need is to create a
> > controller that can accommodate up to a certain number of vCPU ids.
> > It doesn't really to know about the END itself IMHO.> 
> > This being said, if we decide to pass both spapr_max_server_number()
> > and smp.max_cpus down to the backends as function arguments, we won't
> > have to change "nr_ends" at all.
> 
> I would prefer that but I am still not sure what they represent. 
> 
> Looking at the sPAPR XIVE code, we deal with numbers/ranges in the 
> following places today.
> 
>  * spapr_xive_dt() 
> 
>    It defines a range of interrupt numbers to be used by the guest 
>    for the threads/vCPUs IPIs. It's a subset of interrupt numbers 
>    in :
> 
> 		[ SPAPR_IRQ_IPI - SPAPR_IRQ_EPOW [
> 
>    These are not vCPU ids.
> 
>    Since these interrupt numbers will be considered as free to use
>    by the OS, it makes sense to pre-claim them. But claiming an 
>    interrupt number in the guest can potentially set up, through 
>    the KVM device, a mapping on the host and in HW. See below why
>    this can be a problem.
> 
>  * kvmppc_xive_cpu_connect()   
> 
>    This sizes the NVT tables in OPAL for the guest. This is the  
>    max number of vCPUs of the guest (not vCPU ids)
> 

I guess you're talking about KVM_DEV_XIVE_NR_SERVERS in
kvmppc_xive_connect() actually. We're currently passing
spapr_max_server_number() (vCPU id) but you might be
right.

I need to re-read the story around VSMT and XIVE.

commit 1e175d2e07c71d9574f5b1c74523abca54e2654f
Author: Sam Bobroff <sam.bobroff@au1.ibm.com>
Date:   Wed Jul 25 16:12:02 2018 +1000

    KVM: PPC: Book3S HV: Pack VCORE IDs to access full VCPU ID space

>  * spapr_irq_init()
> 
>    This is where the IPI interrupt numbers are claimed today. 
>    Directly in QEMU and KVM, if the machine is running XIVE only, 
>    indirectly if it's dual, first in QEMU and then in KVM when 
>    the machine switches of interrupt mode in CAS. 
> 
>    The problem is that the underlying XIVE resources in HW are 
>    allocated where the QEMU process is running. Which is not the
>    best option when the vCPUs are pinned on different chips.
> 
>    My patchset was trying to improve that by claiming the IPI on 
>    demand when the vCPU is connected to the KVM device. But it 
>    was using the vCPU id as the IPI interrupt number which is 
>    utterly wrong, the guest OS could use any number in the range 
>    exposed in the DT.
>    
>    The last patch you sent was going in the right direction I think.
>    That is to claim the IPI when the guest OS is requesting for it. 
> 
>    http://patchwork.ozlabs.org/project/qemu-devel/patch/160528045027.804522.6161091782230763832.stgit@bahia.lan/
>    
>    But I don't understand why it was so complex. It should be like
>    the MSIs claimed by PCI devices.
> 

The difference here is that the guest doesn't claim IPIs. They are
supposedly pre-claimed in "ibm,xive-lisn-ranges". And this is actually
the case in QEMU.

The IPI setup sequence in the guest is basically:
1) grab a free irq from the bitmap, ie. "ibm,xive-lisn-ranges"
2) calls H_INT_GET_SOURCE_INFO, ie. populate_irq_data()
3) calls H_INT_SET_SOURCE_CONFIG, ie, configure_irq())

If we want an IPI to be claimed by the appropriate vCPU, we
can only do this from under H_INT_SET_SOURCE_CONFIG. And
until the guest eventually configures the IPI, KVM and QEMU
are out of sync.

This complexifies migration because we have to guess at
post load if we should claim the IPI in KVM or not. The
simple presence of the vCPU isn't enough : we need to
guess if the guest actually configured the IPI or not.

> 
> All this to say, that we need to size better the range in the 
> "ibm,xive-lisn-ranges" property if that's broken for vSMT. 
> 

Sizing the range to smp.max_cpus as proposed in this series
is fine, no matter what the VSMT is.

> Then, I think the IPIs can be treated just like the PCI MSIs
> but they need to be claimed first. That's the ugly part. 
> 

Yeah that's the big difference. For PCI MSIs, QEMU owns the
bitmap and the guest can claim (or release) a number of
MSIs the "ibm,change-msi" RTAS interface. There's no
such thing for IPIs : they are supposedly already claimed.

> Should we add a special check in h_int_set_source_config to
> deal with unclaimed IPIs that are being configured ?
> 

This is what my tentative fix does.

> 
> C.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property
  2020-11-23 10:13   ` Cédric Le Goater
@ 2020-11-24 17:18     ` Greg Kurz
  0 siblings, 0 replies; 40+ messages in thread
From: Greg Kurz @ 2020-11-24 17:18 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Mon, 23 Nov 2020 11:13:21 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/20/20 6:46 PM, Greg Kurz wrote:
> > The sPAPR XIVE device exposes a range of LISNs that the guest uses
> > for IPIs. This range is currently sized according to the highest
> > vCPU id, ie. spapr_max_server_number(), as obtained from the machine
> > through the "nr_servers" argument of the generic spapr_irq_dt() call.
> > 
> > This makes sense for the "ibm,interrupt-server-ranges" property of
> > sPAPR XICS, but certainly not for "ibm,xive-lisn-ranges". The range
> > should be sized to the maximum number of possible vCPUs. It happens
> > that spapr_max_server_number() == smp.max_cpus in practice with the
> > machine default settings. This might not be the case though when
> > VSMT is in use : we can end up with a much larger range (up to 8
> > times bigger) than needed and waste LISNs. 
> 
> will it exceed 4K ? if not, we are fine.
> 
> The fact that it is so complex to get the number of vCPUs is a 
> problem by it self to me. Can we simplify that ? or introduce a 
> spapr_max_server_number_id() ? 
> 

"server number" is the XICS wording for vCPU id. The name of the
existing spapr_max_server_number() is perfectly fine from this
perspective: this sizes "ibm,interrupt-server-ranges".

> > But most importantly, this
> > lures people into thinking that IPI numbers are always equal to
> > vCPU ids, which is wrong and bit us recently:
> 
> do we have a converting routing vcpu_id_to_ipi ? I think we have
> that in KVM.
> 
> > https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg01476.html
> > 
> > Introduce an "nr-ipis" property that the machine sets to smp.max_cpus
> > before realizing the deice. Use that instead of "nr_servers" in
> > spapr_xive_dt(). Have the machine to claim the same number of IPIs
> > in spapr_irq_init().
> > 
> > This doesn't cause any guest visible change when using the machine
> > default settings (ie. VSMT == smp.threads).>
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  include/hw/ppc/spapr_xive.h | 8 ++++++++
> >  hw/intc/spapr_xive.c        | 4 +++-
> >  hw/ppc/spapr_irq.c          | 4 +++-
> >  3 files changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index 7123ea07ed78..69b9fbc1b9a5 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -43,6 +43,14 @@ typedef struct SpaprXive {
> >  
> >      /* DT */
> >      gchar *nodename;
> > +    /*
> > +     * The sPAPR XIVE device needs to know how many vCPUs it
> > +     * might be exposed to in order to size the IPI range in
> > +     * "ibm,interrupt-server-ranges". It is the purpose of the
> 
> There is no "ibm,interrupt-server-ranges"  property in XIVE
> 

Yeah copy/paste error. Read "ibm,xive-lisn-ranges" of course :)

> > +     * "nr-ipis" property which *must* be set to a non-null
> > +     * value before realizing the sPAPR XIVE device.
> > +     */
> > +    uint32_t nr_ipis;
> >  
> >      /* Routing table */
> >      XiveEAS       *eat;
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index e4dbf9c2c409..d13a2955ce9b 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -311,6 +311,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >      /* Set by spapr_irq_init() */
> >      g_assert(xive->nr_irqs);
> >      g_assert(xive->nr_servers);
> > +    g_assert(xive->nr_ipis);
> >  
> >      sxc->parent_realize(dev, &local_err);
> >      if (local_err) {
> > @@ -608,6 +609,7 @@ static Property spapr_xive_properties[] = {
> >       */
> >      DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends_vmstate, 0),
> >      DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
> > +    DEFINE_PROP_UINT32("nr-ipis", SpaprXive, nr_ipis, 0),
> >      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
> >      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
> >      DEFINE_PROP_UINT8("hv-prio", SpaprXive, hv_prio, 7),
> > @@ -698,7 +700,7 @@ static void spapr_xive_dt(SpaprInterruptController *intc, uint32_t nr_servers,
> >      /* Interrupt number ranges for the IPIs */
> >      uint32_t lisn_ranges[] = {
> >          cpu_to_be32(SPAPR_IRQ_IPI),
> > -        cpu_to_be32(SPAPR_IRQ_IPI + nr_servers),
> > +        cpu_to_be32(SPAPR_IRQ_IPI + xive->nr_ipis),
> 
> I don't understand why we need nr_ipis. Can't we pass the same value in 
> nr_servers from the machine ?
> 

This is the point of this patch. nr_servers is vCPU id and shouldn't be
used to size the LISN range.

> ( Conceptually, the first 4K are all IPIs. The first range is for 
>   HW threads ) 
> 
> 
> >      };
> >      /*
> >       * EQ size - the sizes of pages supported by the system 4K, 64K,
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 8c5627225636..a0fc474ecb06 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -325,12 +325,14 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
> >  
> >      if (spapr->irq->xive) {
> >          uint32_t nr_servers = spapr_max_server_number(spapr);
> > +        uint32_t max_cpus = MACHINE(spapr)->smp.max_cpus;
> 
> So that's the value we should be using in case of VSMT and not 
> spapr_max_server_number(). If I am correct, this is a max_cpu_id ?
> 

This isn't an id, it is what you pass to -smp maxcpus, ie. the total
number of vCPUs that the may run inside the guest. So this is what
we should use everywhere we care for a number of vCPUs.

> 
> >          DeviceState *dev;
> >          int i;
> >  
> >          dev = qdev_new(TYPE_SPAPR_XIVE);
> >          qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
> >          qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
> > +        qdev_prop_set_uint32(dev, "nr-ipis", max_cpus);
> >          object_property_set_link(OBJECT(dev), "xive-fabric", OBJECT(spapr),
> >                                   &error_abort);
> >          sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> > @@ -338,7 +340,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
> >          spapr->xive = SPAPR_XIVE(dev);
> >  
> >          /* Enable the CPU IPIs */
> > -        for (i = 0; i < nr_servers; ++i) {
> > +        for (i = 0; i < max_cpus; ++i) {
> >              SpaprInterruptControllerClass *sicc
> >                  = SPAPR_INTC_GET_CLASS(spapr->xive);
> 
> 
> 
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-24 17:01         ` Greg Kurz
@ 2020-11-24 17:56           ` Cédric Le Goater
  2020-11-25  9:33             ` Greg Kurz
  0 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-24 17:56 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson

On 11/24/20 6:01 PM, Greg Kurz wrote:
> On Tue, 24 Nov 2020 14:54:38 +0100
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> On 11/23/20 12:16 PM, Greg Kurz wrote:
>>> On Mon, 23 Nov 2020 10:46:38 +0100
>>> Cédric Le Goater <clg@kaod.org> wrote:
>>>
>>>> On 11/20/20 6:46 PM, Greg Kurz wrote:
>>>>> We're going to kill the "nr_ends" field in a subsequent patch.
>>>>
>>>> why ? it is one of the tables of the controller and its part of 
>>>> the main XIVE concepts. Conceptually, we could let the machine 
>>>> dimension it with an arbitrary value as OPAL does. The controller
>>>> would fail when the table is fully used. 
>>>>
>>>
>>> The idea is that the sPAPR machine only true need is to create a
>>> controller that can accommodate up to a certain number of vCPU ids.
>>> It doesn't really to know about the END itself IMHO.> 
>>> This being said, if we decide to pass both spapr_max_server_number()
>>> and smp.max_cpus down to the backends as function arguments, we won't
>>> have to change "nr_ends" at all.
>>
>> I would prefer that but I am still not sure what they represent. 
>>
>> Looking at the sPAPR XIVE code, we deal with numbers/ranges in the 
>> following places today.
>>
>>  * spapr_xive_dt() 
>>
>>    It defines a range of interrupt numbers to be used by the guest 
>>    for the threads/vCPUs IPIs. It's a subset of interrupt numbers 
>>    in :
>>
>> 		[ SPAPR_IRQ_IPI - SPAPR_IRQ_EPOW [
>>
>>    These are not vCPU ids.
>>
>>    Since these interrupt numbers will be considered as free to use
>>    by the OS, it makes sense to pre-claim them. But claiming an 
>>    interrupt number in the guest can potentially set up, through 
>>    the KVM device, a mapping on the host and in HW. See below why
>>    this can be a problem.
>>
>>  * kvmppc_xive_cpu_connect()   
>>
>>    This sizes the NVT tables in OPAL for the guest. This is the  
>>    max number of vCPUs of the guest (not vCPU ids)
>>
> 
> I guess you're talking about KVM_DEV_XIVE_NR_SERVERS in
> kvmppc_xive_connect() actually. We're currently passing
> spapr_max_server_number() (vCPU id) but you might be
> right.
> 
> I need to re-read the story around VSMT and XIVE.

ok. What we care about here, is a size to allocate the NVT block
representing the vCPUs in HW. NVT ids are pushed in the thread 
contexts when the vCPUs are scheduled to run and looked for by 
the presenter when an interrupt is to be delivered.

 
> commit 1e175d2e07c71d9574f5b1c74523abca54e2654f
> Author: Sam Bobroff <sam.bobroff@au1.ibm.com>
> Date:   Wed Jul 25 16:12:02 2018 +1000
> 
>     KVM: PPC: Book3S HV: Pack VCORE IDs to access full VCPU ID space
> 
>>  * spapr_irq_init()
>>
>>    This is where the IPI interrupt numbers are claimed today. 
>>    Directly in QEMU and KVM, if the machine is running XIVE only, 
>>    indirectly if it's dual, first in QEMU and then in KVM when 
>>    the machine switches of interrupt mode in CAS. 
>>
>>    The problem is that the underlying XIVE resources in HW are 
>>    allocated where the QEMU process is running. Which is not the
>>    best option when the vCPUs are pinned on different chips.
>>
>>    My patchset was trying to improve that by claiming the IPI on 
>>    demand when the vCPU is connected to the KVM device. But it 
>>    was using the vCPU id as the IPI interrupt number which is 
>>    utterly wrong, the guest OS could use any number in the range 
>>    exposed in the DT.
>>    
>>    The last patch you sent was going in the right direction I think.
>>    That is to claim the IPI when the guest OS is requesting for it. 
>>
>>    http://patchwork.ozlabs.org/project/qemu-devel/patch/160528045027.804522.6161091782230763832.stgit@bahia.lan/
>>    
>>    But I don't understand why it was so complex. It should be like
>>    the MSIs claimed by PCI devices.
>>
> 
> The difference here is that the guest doesn't claim IPIs. They are
> supposedly pre-claimed in "ibm,xive-lisn-ranges". And this is actually
> the case in QEMU.

yes. That's what I want to change (for performance)

> The IPI setup sequence in the guest is basically:
> 1) grab a free irq from the bitmap, ie. "ibm,xive-lisn-ranges"
> 2) calls H_INT_GET_SOURCE_INFO, ie. populate_irq_data()
> 3) calls H_INT_SET_SOURCE_CONFIG, ie, configure_irq())
> 
> If we want an IPI to be claimed by the appropriate vCPU, we
> can only do this from under H_INT_SET_SOURCE_CONFIG. And
> until the guest eventually configures the IPI, KVM and QEMU
> are out of sync.

Well, KVM doesn't know either how much PCI MSIs will be claimed.
It all depends on the guest OS. 

I don't think this is a problem to expose unclaimed interrupt
numbers to the guest if they are IPIs. We can detect that
easily with the range and claim the source at KVM level when 
it's configured or in h_int_get_source_info(). Talking of which, 
it might be good to have a KVM command to query the source 
characteristics on the host. I sent a patchset a while ago in 
that sense.

> This complexifies migration because we have to guess at
> post load if we should claim the IPI in KVM or not. The
> simple presence of the vCPU isn't enough : we need to
> guess if the guest actually configured the IPI or not.

The EAT will be transferred from the source and the call to 
kvmppc_xive_source_reset_one() should initialize the KVM 
device correctly on the target for all interrupts.

>> All this to say, that we need to size better the range in the 
>> "ibm,xive-lisn-ranges" property if that's broken for vSMT. 
>>
> 
> Sizing the range to smp.max_cpus as proposed in this series
> is fine, no matter what the VSMT is.

ok. That's a fix for spapr_irq_dt() then. And possibly, there 
is a similar one for KVM_DEV_XIVE_NR_SERVERS.
 
>> Then, I think the IPIs can be treated just like the PCI MSIs
>> but they need to be claimed first. That's the ugly part. 
>>
> 
> Yeah that's the big difference. For PCI MSIs, QEMU owns the
> bitmap and the guest can claim (or release) a number of
> MSIs the "ibm,change-msi" RTAS interface. There's no
> such thing for IPIs : they are supposedly already claimed.

IPIs are a bit special because there are no I/O devices to
claim them. We could consider the vCPU has being the device. 
That was my first attempt but it was wrong since the OS is 
in charge of choosing an interrupt number for the IPI. 

>> Should we add a special check in h_int_set_source_config to
>> deal with unclaimed IPIs that are being configured ?
>>
> 
> This is what my tentative fix does.

I didn't understand the complexity of it, may be due to my
patchset.

You should try again :)

C.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-24 17:56           ` Cédric Le Goater
@ 2020-11-25  9:33             ` Greg Kurz
  2020-11-25 11:34               ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Greg Kurz @ 2020-11-25  9:33 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Tue, 24 Nov 2020 18:56:02 +0100
Cédric Le Goater <clg@kaod.org> wrote:

[...]

> > 
> > I guess you're talking about KVM_DEV_XIVE_NR_SERVERS in
> > kvmppc_xive_connect() actually. We're currently passing
> > spapr_max_server_number() (vCPU id) but you might be
> > right.
> > 
> > I need to re-read the story around VSMT and XIVE.
> 
> ok. What we care about here, is a size to allocate the NVT block
> representing the vCPUs in HW. NVT ids are pushed in the thread 
> contexts when the vCPUs are scheduled to run and looked for by 
> the presenter when an interrupt is to be delivered.
> 

Yeah, looking at the code again, I realize there was a confusion
when we added the possibility to size the NVT. This should not
depend on the vCPU id limnit as it is done today, it should just
be the maximum number of possible vCPUs (ie. smp.max_cpus).

[...]
> > 
> > The difference here is that the guest doesn't claim IPIs. They are
> > supposedly pre-claimed in "ibm,xive-lisn-ranges". And this is actually
> > the case in QEMU.
> 
> yes. That's what I want to change (for performance)
> 

I understand the purpose of claiming the IPI from its associated
vCPU context. But this can only be done on a path where we have
both the vCPU id and the IPI ; kvmppc_xive_set_source_config()
looks like a good candidate to handle this for runtime and
post-load.

> > The IPI setup sequence in the guest is basically:
> > 1) grab a free irq from the bitmap, ie. "ibm,xive-lisn-ranges"
> > 2) calls H_INT_GET_SOURCE_INFO, ie. populate_irq_data()
> > 3) calls H_INT_SET_SOURCE_CONFIG, ie, configure_irq())
> > 
> > If we want an IPI to be claimed by the appropriate vCPU, we
> > can only do this from under H_INT_SET_SOURCE_CONFIG. And
> > until the guest eventually configures the IPI, KVM and QEMU
> > are out of sync.
> 
> Well, KVM doesn't know either how much PCI MSIs will be claimed.
> It all depends on the guest OS. 
> 

Yes but QEMU and KVM are always in sync for them. When the
guest calls the "ibm,change-msi" RTAS interface to get some
MSIs for a device, they are immediately claimed both in
QEMU and KVM.

> I don't think this is a problem to expose unclaimed interrupt
> numbers to the guest if they are IPIs. We can detect that
> easily with the range and claim the source at KVM level when 
> it's configured or in h_int_get_source_info(). Talking of which, 

We cannot claim the source at the KVM level from
H_INT_GET_SOURCE_INFO because we don't know about
the vCPU id here => we can't do the run_on_cpu()
optimization.

> it might be good to have a KVM command to query the source 
> characteristics on the host. I sent a patchset a while ago in 
> that sense.
> 
> > This complexifies migration because we have to guess at
> > post load if we should claim the IPI in KVM or not. The
> > simple presence of the vCPU isn't enough : we need to
> > guess if the guest actually configured the IPI or not.
> 
> The EAT will be transferred from the source and the call to 
> kvmppc_xive_source_reset_one() should initialize the KVM 
> device correctly on the target for all interrupts.
> 

Except that the EAS appears as valid for all IPIs, even
though the source didn't claim them at the KVM level. It
looks wrong to blindly restore all of them in post-load.

> >> All this to say, that we need to size better the range in the 
> >> "ibm,xive-lisn-ranges" property if that's broken for vSMT. 
> >>
> > 
> > Sizing the range to smp.max_cpus as proposed in this series
> > is fine, no matter what the VSMT is.
> 
> ok. That's a fix for spapr_irq_dt() then. And possibly, there 
> is a similar one for KVM_DEV_XIVE_NR_SERVERS.
>  

Yup.

> >> Then, I think the IPIs can be treated just like the PCI MSIs
> >> but they need to be claimed first. That's the ugly part. 
> >>
> > 
> > Yeah that's the big difference. For PCI MSIs, QEMU owns the
> > bitmap and the guest can claim (or release) a number of
> > MSIs the "ibm,change-msi" RTAS interface. There's no
> > such thing for IPIs : they are supposedly already claimed.
> 
> IPIs are a bit special because there are no I/O devices to
> claim them. We could consider the vCPU has being the device. 
> That was my first attempt but it was wrong since the OS is 
> in charge of choosing an interrupt number for the IPI. 
> 
> >> Should we add a special check in h_int_set_source_config to
> >> deal with unclaimed IPIs that are being configured ?
> >>
> > 
> > This is what my tentative fix does.
> 
> I didn't understand the complexity of it, may be due to my
> patchset.
> 
> You should try again :)
> 

Will do when I've fixed the misuses of spapr_max_server_number().

> C.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-25  9:33             ` Greg Kurz
@ 2020-11-25 11:34               ` Cédric Le Goater
  2020-11-25 12:26                 ` Greg Kurz
  0 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-25 11:34 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson

 
>>> This complexifies migration because we have to guess at
>>> post load if we should claim the IPI in KVM or not. The
>>> simple presence of the vCPU isn't enough : we need to
>>> guess if the guest actually configured the IPI or not.
>>
>> The EAT will be transferred from the source and the call to 
>> kvmppc_xive_source_reset_one() should initialize the KVM 
>> device correctly on the target for all interrupts.
>>
> 
> Except that the EAS appears as valid for all IPIs, even
> though the source didn't claim them at the KVM level. 

why ? we would stop claiming IPIs in spapr_irq_init() and so
they won't appear as being valid anymore, at boot time or
restore time.


C.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-25 11:34               ` Cédric Le Goater
@ 2020-11-25 12:26                 ` Greg Kurz
  2020-11-26  7:06                   ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Greg Kurz @ 2020-11-25 12:26 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Wed, 25 Nov 2020 12:34:25 +0100
Cédric Le Goater <clg@kaod.org> wrote:

>  
> >>> This complexifies migration because we have to guess at
> >>> post load if we should claim the IPI in KVM or not. The
> >>> simple presence of the vCPU isn't enough : we need to
> >>> guess if the guest actually configured the IPI or not.
> >>
> >> The EAT will be transferred from the source and the call to 
> >> kvmppc_xive_source_reset_one() should initialize the KVM 
> >> device correctly on the target for all interrupts.
> >>
> > 
> > Except that the EAS appears as valid for all IPIs, even
> > though the source didn't claim them at the KVM level. 
> 
> why ? we would stop claiming IPIs in spapr_irq_init() and so
> they won't appear as being valid anymore, at boot time or
> restore time.
> 

If we don't claim the IPIs in spapr_irq_init() anymore then
we must at least claim them on the path of H_INT_GET_SOURCE_INFO
otherwise it will fail with H_P2 and the guest won't even
try to setup the IPI. Even if we do that, we still have a
window where the source is valid in QEMU but not yet at
the KVM level.

> 
> C.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-23  3:33   ` David Gibson
@ 2020-11-25 22:43     ` Greg Kurz
  2020-11-26  0:06       ` David Gibson
  0 siblings, 1 reply; 40+ messages in thread
From: Greg Kurz @ 2020-11-25 22:43 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 6056 bytes --]

On Mon, 23 Nov 2020 14:33:55 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Nov 20, 2020 at 06:46:40PM +0100, Greg Kurz wrote:
> > We're going to kill the "nr_ends" field in a subsequent patch.
> > Prepare ground by using an helper instead of peeking into
> > the sPAPR XIVE structure directly.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> 
> Applied to ppc-for-6.0, thanks.
> 

I'm working on a new approach that doesn't need this change. Especially the
new approach doesn't kill the "nr_ends" fields, which makes the changelog of
this patch slightly wrong. Since it doesn't bring much in the end, maybe you
can just drop it from ppc-for-6.0 ?

> 
> > ---
> >  include/hw/ppc/spapr_xive.h |  1 +
> >  hw/intc/spapr_xive.c        | 23 ++++++++++++++---------
> >  hw/intc/spapr_xive_kvm.c    |  4 ++--
> >  3 files changed, 17 insertions(+), 11 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index 26c8d90d7196..4b967f13c030 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -75,6 +75,7 @@ void spapr_xive_map_mmio(SpaprXive *xive);
> >  
> >  int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> >                               uint32_t *out_server, uint8_t *out_prio);
> > +uint32_t spapr_xive_nr_ends(const SpaprXive *xive);
> >  
> >  /*
> >   * KVM XIVE device helpers
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index 60e0d5769dcc..f473ad9cba47 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -192,7 +192,7 @@ void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon)
> >              uint32_t end_idx = xive_get_field64(EAS_END_INDEX, eas->w);
> >              XiveEND *end;
> >  
> > -            assert(end_idx < xive->nr_ends);
> > +            assert(end_idx < spapr_xive_nr_ends(xive));
> >              end = &xive->endt[end_idx];
> >  
> >              if (xive_end_is_valid(end)) {
> > @@ -270,7 +270,7 @@ static void spapr_xive_reset(void *dev)
> >      }
> >  
> >      /* Clear all ENDs */
> > -    for (i = 0; i < xive->nr_ends; i++) {
> > +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
> >          spapr_xive_end_reset(&xive->endt[i]);
> >      }
> >  }
> > @@ -288,6 +288,11 @@ static void spapr_xive_instance_init(Object *obj)
> >      xive->fd = -1;
> >  }
> >  
> > +uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
> > +{
> > +    return xive->nr_ends;
> > +}
> > +
> >  static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >  {
> >      SpaprXive *xive = SPAPR_XIVE(dev);
> > @@ -336,7 +341,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >       * Allocate the routing tables
> >       */
> >      xive->eat = g_new0(XiveEAS, xive->nr_irqs);
> > -    xive->endt = g_new0(XiveEND, xive->nr_ends);
> > +    xive->endt = g_new0(XiveEND, spapr_xive_nr_ends(xive));
> >  
> >      xive->nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
> >                             xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
> > @@ -375,7 +380,7 @@ static int spapr_xive_get_end(XiveRouter *xrtr,
> >  {
> >      SpaprXive *xive = SPAPR_XIVE(xrtr);
> >  
> > -    if (end_idx >= xive->nr_ends) {
> > +    if (end_idx >= spapr_xive_nr_ends(xive)) {
> >          return -1;
> >      }
> >  
> > @@ -389,7 +394,7 @@ static int spapr_xive_write_end(XiveRouter *xrtr, uint8_t end_blk,
> >  {
> >      SpaprXive *xive = SPAPR_XIVE(xrtr);
> >  
> > -    if (end_idx >= xive->nr_ends) {
> > +    if (end_idx >= spapr_xive_nr_ends(xive)) {
> >          return -1;
> >      }
> >  
> > @@ -1138,7 +1143,7 @@ static target_ulong h_int_get_source_config(PowerPCCPU *cpu,
> >      /* EAS_END_BLOCK is unused on sPAPR */
> >      end_idx = xive_get_field64(EAS_END_INDEX, eas.w);
> >  
> > -    assert(end_idx < xive->nr_ends);
> > +    assert(end_idx < spapr_xive_nr_ends(xive));
> >      end = &xive->endt[end_idx];
> >  
> >      nvt_blk = xive_get_field32(END_W6_NVT_BLOCK, end->w6);
> > @@ -1216,7 +1221,7 @@ static target_ulong h_int_get_queue_info(PowerPCCPU *cpu,
> >          return H_P2;
> >      }
> >  
> > -    assert(end_idx < xive->nr_ends);
> > +    assert(end_idx < spapr_xive_nr_ends(xive));
> >      end = &xive->endt[end_idx];
> >  
> >      args[0] = xive->end_base + (1ull << (end_xsrc->esb_shift + 1)) * end_idx;
> > @@ -1304,7 +1309,7 @@ static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
> >          return H_P2;
> >      }
> >  
> > -    assert(end_idx < xive->nr_ends);
> > +    assert(end_idx < spapr_xive_nr_ends(xive));
> >      memcpy(&end, &xive->endt[end_idx], sizeof(XiveEND));
> >  
> >      switch (qsize) {
> > @@ -1470,7 +1475,7 @@ static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
> >          return H_P2;
> >      }
> >  
> > -    assert(end_idx < xive->nr_ends);
> > +    assert(end_idx < spapr_xive_nr_ends(xive));
> >      end = &xive->endt[end_idx];
> >  
> >      args[0] = 0;
> > diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> > index 66bf4c06fe55..1566016f0e28 100644
> > --- a/hw/intc/spapr_xive_kvm.c
> > +++ b/hw/intc/spapr_xive_kvm.c
> > @@ -531,7 +531,7 @@ static int kvmppc_xive_get_queues(SpaprXive *xive, Error **errp)
> >      int i;
> >      int ret;
> >  
> > -    for (i = 0; i < xive->nr_ends; i++) {
> > +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
> >          if (!xive_end_is_valid(&xive->endt[i])) {
> >              continue;
> >          }
> > @@ -701,7 +701,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
> >      assert(xive->fd != -1);
> >  
> >      /* Restore the ENDT first. The targetting depends on it. */
> > -    for (i = 0; i < xive->nr_ends; i++) {
> > +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
> >          if (!xive_end_is_valid(&xive->endt[i])) {
> >              continue;
> >          }
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-25 22:43     ` Greg Kurz
@ 2020-11-26  0:06       ` David Gibson
  0 siblings, 0 replies; 40+ messages in thread
From: David Gibson @ 2020-11-26  0:06 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 6643 bytes --]

On Wed, Nov 25, 2020 at 11:43:26PM +0100, Greg Kurz wrote:
> On Mon, 23 Nov 2020 14:33:55 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Fri, Nov 20, 2020 at 06:46:40PM +0100, Greg Kurz wrote:
> > > We're going to kill the "nr_ends" field in a subsequent patch.
> > > Prepare ground by using an helper instead of peeking into
> > > the sPAPR XIVE structure directly.
> > > 
> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > 
> > Applied to ppc-for-6.0, thanks.
> > 
> 
> I'm working on a new approach that doesn't need this change. Especially the
> new approach doesn't kill the "nr_ends" fields, which makes the changelog of
> this patch slightly wrong. Since it doesn't bring much in the end, maybe you
> can just drop it from ppc-for-6.0 ?

Done.

> 
> > 
> > > ---
> > >  include/hw/ppc/spapr_xive.h |  1 +
> > >  hw/intc/spapr_xive.c        | 23 ++++++++++++++---------
> > >  hw/intc/spapr_xive_kvm.c    |  4 ++--
> > >  3 files changed, 17 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > > index 26c8d90d7196..4b967f13c030 100644
> > > --- a/include/hw/ppc/spapr_xive.h
> > > +++ b/include/hw/ppc/spapr_xive.h
> > > @@ -75,6 +75,7 @@ void spapr_xive_map_mmio(SpaprXive *xive);
> > >  
> > >  int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> > >                               uint32_t *out_server, uint8_t *out_prio);
> > > +uint32_t spapr_xive_nr_ends(const SpaprXive *xive);
> > >  
> > >  /*
> > >   * KVM XIVE device helpers
> > > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > > index 60e0d5769dcc..f473ad9cba47 100644
> > > --- a/hw/intc/spapr_xive.c
> > > +++ b/hw/intc/spapr_xive.c
> > > @@ -192,7 +192,7 @@ void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon)
> > >              uint32_t end_idx = xive_get_field64(EAS_END_INDEX, eas->w);
> > >              XiveEND *end;
> > >  
> > > -            assert(end_idx < xive->nr_ends);
> > > +            assert(end_idx < spapr_xive_nr_ends(xive));
> > >              end = &xive->endt[end_idx];
> > >  
> > >              if (xive_end_is_valid(end)) {
> > > @@ -270,7 +270,7 @@ static void spapr_xive_reset(void *dev)
> > >      }
> > >  
> > >      /* Clear all ENDs */
> > > -    for (i = 0; i < xive->nr_ends; i++) {
> > > +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
> > >          spapr_xive_end_reset(&xive->endt[i]);
> > >      }
> > >  }
> > > @@ -288,6 +288,11 @@ static void spapr_xive_instance_init(Object *obj)
> > >      xive->fd = -1;
> > >  }
> > >  
> > > +uint32_t spapr_xive_nr_ends(const SpaprXive *xive)
> > > +{
> > > +    return xive->nr_ends;
> > > +}
> > > +
> > >  static void spapr_xive_realize(DeviceState *dev, Error **errp)
> > >  {
> > >      SpaprXive *xive = SPAPR_XIVE(dev);
> > > @@ -336,7 +341,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> > >       * Allocate the routing tables
> > >       */
> > >      xive->eat = g_new0(XiveEAS, xive->nr_irqs);
> > > -    xive->endt = g_new0(XiveEND, xive->nr_ends);
> > > +    xive->endt = g_new0(XiveEND, spapr_xive_nr_ends(xive));
> > >  
> > >      xive->nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
> > >                             xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
> > > @@ -375,7 +380,7 @@ static int spapr_xive_get_end(XiveRouter *xrtr,
> > >  {
> > >      SpaprXive *xive = SPAPR_XIVE(xrtr);
> > >  
> > > -    if (end_idx >= xive->nr_ends) {
> > > +    if (end_idx >= spapr_xive_nr_ends(xive)) {
> > >          return -1;
> > >      }
> > >  
> > > @@ -389,7 +394,7 @@ static int spapr_xive_write_end(XiveRouter *xrtr, uint8_t end_blk,
> > >  {
> > >      SpaprXive *xive = SPAPR_XIVE(xrtr);
> > >  
> > > -    if (end_idx >= xive->nr_ends) {
> > > +    if (end_idx >= spapr_xive_nr_ends(xive)) {
> > >          return -1;
> > >      }
> > >  
> > > @@ -1138,7 +1143,7 @@ static target_ulong h_int_get_source_config(PowerPCCPU *cpu,
> > >      /* EAS_END_BLOCK is unused on sPAPR */
> > >      end_idx = xive_get_field64(EAS_END_INDEX, eas.w);
> > >  
> > > -    assert(end_idx < xive->nr_ends);
> > > +    assert(end_idx < spapr_xive_nr_ends(xive));
> > >      end = &xive->endt[end_idx];
> > >  
> > >      nvt_blk = xive_get_field32(END_W6_NVT_BLOCK, end->w6);
> > > @@ -1216,7 +1221,7 @@ static target_ulong h_int_get_queue_info(PowerPCCPU *cpu,
> > >          return H_P2;
> > >      }
> > >  
> > > -    assert(end_idx < xive->nr_ends);
> > > +    assert(end_idx < spapr_xive_nr_ends(xive));
> > >      end = &xive->endt[end_idx];
> > >  
> > >      args[0] = xive->end_base + (1ull << (end_xsrc->esb_shift + 1)) * end_idx;
> > > @@ -1304,7 +1309,7 @@ static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
> > >          return H_P2;
> > >      }
> > >  
> > > -    assert(end_idx < xive->nr_ends);
> > > +    assert(end_idx < spapr_xive_nr_ends(xive));
> > >      memcpy(&end, &xive->endt[end_idx], sizeof(XiveEND));
> > >  
> > >      switch (qsize) {
> > > @@ -1470,7 +1475,7 @@ static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
> > >          return H_P2;
> > >      }
> > >  
> > > -    assert(end_idx < xive->nr_ends);
> > > +    assert(end_idx < spapr_xive_nr_ends(xive));
> > >      end = &xive->endt[end_idx];
> > >  
> > >      args[0] = 0;
> > > diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> > > index 66bf4c06fe55..1566016f0e28 100644
> > > --- a/hw/intc/spapr_xive_kvm.c
> > > +++ b/hw/intc/spapr_xive_kvm.c
> > > @@ -531,7 +531,7 @@ static int kvmppc_xive_get_queues(SpaprXive *xive, Error **errp)
> > >      int i;
> > >      int ret;
> > >  
> > > -    for (i = 0; i < xive->nr_ends; i++) {
> > > +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
> > >          if (!xive_end_is_valid(&xive->endt[i])) {
> > >              continue;
> > >          }
> > > @@ -701,7 +701,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
> > >      assert(xive->fd != -1);
> > >  
> > >      /* Restore the ENDT first. The targetting depends on it. */
> > > -    for (i = 0; i < xive->nr_ends; i++) {
> > > +    for (i = 0; i < spapr_xive_nr_ends(xive); i++) {
> > >          if (!xive_end_is_valid(&xive->endt[i])) {
> > >              continue;
> > >          }
> > 
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
  2020-11-25 12:26                 ` Greg Kurz
@ 2020-11-26  7:06                   ` Cédric Le Goater
  0 siblings, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2020-11-26  7:06 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson

>> why ? we would stop claiming IPIs in spapr_irq_init() and so
>> they won't appear as being valid anymore, at boot time or
>> restore time.
>>
> 
> If we don't claim the IPIs in spapr_irq_init() anymore then
> we must at least claim them on the path of H_INT_GET_SOURCE_INFO
> otherwise it will fail with H_P2 and the guest won't even
> try to setup the IPI. 

IPIs could be handled a special case as we know in which range 
they are.

> Even if we do that, we still have a
> window where the source is valid in QEMU but not yet at
> the KVM level.

May be some kind of lazy binding with KVM would help us.

The problem is that the guest ESB pages won't be backed by HW 
pages which could lead to a crash. We could overlap the emulated 
one until we are bound to KVM. Just a thought.

C.




^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2020-11-26  7:08 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
2020-11-20 17:46 ` [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions Greg Kurz
2020-11-23  3:33   ` David Gibson
2020-11-23  8:09   ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends() Greg Kurz
2020-11-23  3:33   ` David Gibson
2020-11-25 22:43     ` Greg Kurz
2020-11-26  0:06       ` David Gibson
2020-11-23  9:46   ` Cédric Le Goater
2020-11-23 11:16     ` Greg Kurz
2020-11-24 13:54       ` Cédric Le Goater
2020-11-24 17:01         ` Greg Kurz
2020-11-24 17:56           ` Cédric Le Goater
2020-11-25  9:33             ` Greg Kurz
2020-11-25 11:34               ` Cédric Le Goater
2020-11-25 12:26                 ` Greg Kurz
2020-11-26  7:06                   ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property Greg Kurz
2020-11-23  3:52   ` David Gibson
2020-11-23  9:20     ` Greg Kurz
2020-11-23  9:56   ` Cédric Le Goater
2020-11-23 11:25     ` Greg Kurz
2020-11-24 14:18       ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property Greg Kurz
2020-11-23  4:10   ` David Gibson
2020-11-23 10:13   ` Cédric Le Goater
2020-11-24 17:18     ` Greg Kurz
2020-11-20 17:46 ` [PATCH for-6.0 5/8] spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect() Greg Kurz
2020-11-23  4:10   ` David Gibson
2020-11-23 10:15   ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property Greg Kurz
2020-11-23  4:18   ` David Gibson
2020-11-23  9:39     ` Greg Kurz
2020-11-23 10:24   ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 7/8] spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation Greg Kurz
2020-11-23  4:38   ` David Gibson
2020-11-23  9:47     ` Greg Kurz
2020-11-20 17:46 ` [PATCH for-6.0 8/8] spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation Greg Kurz
2020-11-23  8:04 ` [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Cédric Le Goater
2020-11-23 10:07   ` Greg Kurz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.