All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] spapr: Use less XIVE HW resources in KVM
@ 2019-10-03 12:00 Greg Kurz
  2019-10-03 12:00 ` [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method Greg Kurz
                   ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:00 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

On POWER9 systems, the XICS-on-XIVE and XIVE KVM devices currently
allocate a bunch of VPs in the XIVE HW to accomodate the highest
VCPU id that may be possibly used in a VM. This limits the number
of VMs that can run with an in-kernel interrupt controller to 63
per POWER9 chip, irrespectively of its number of HW threads, eg.
up to 96 on a POWER9 Nimbus socket. This is an unfortunate waste
of scarce HW resources since a typical VM doesn't need that much
VPs to run.

This series exploits new attributes of the XICS-on-XIVE and XIVE
KVM devices that allow userspace to tune the numbers of VPs it
really needs.

Patches 1 to 3 are preliminary work to teach the XICS and XIVE
backends about the range of needed VCPU ids, according to the
maximum number of VCPUs specified in the QEMU command line.

Patch 5 and 6 do the actual work of configuring the KVM devices,
based on new defines brought by a patch 4. RFC since the patches
for KVM are still being discussed on the kvm-ppc list:

https://patchwork.ozlabs.org/project/kvm-ppc/list/?series=132910

As a bonus, patch 7 allows the latest machine type to automatically
set int KVM the guest core stride (VSMT) to be equal to the number
of threads per core (-smp threads=N). This makes VCPU ids contiguous
and allows to reduce the VP consumption even more.

Both KVM and QEMU changes are available here:
https://github.com/gkurz/linux/commits/xive-nr-servers-5.3
https://github.com/gkurz/qemu/commits/xive-nr-servers-for-4.2

--
Greg

---

Greg Kurz (7):
      spapr, xics: Get number of servers with a XICSFabricClass method
      spapr, xive: Turn "nr-ends" property into "nr-servers" property
      spapr, xics, xive: Drop nr_servers argument in DT-related functions
      RFC linux-headers: Update against 5.3-rc2
      spapr/xics: Configure number of servers in KVM
      spapr/xive: Configure number of servers in KVM
      spapr: Set VSMT to smp_threads by default


 hw/intc/spapr_xive.c                         |   24 ++++++++++++++++------
 hw/intc/spapr_xive_kvm.c                     |   22 +++++++++++++++++++-
 hw/intc/xics.c                               |    7 +++++++
 hw/intc/xics_kvm.c                           |   24 ++++++++++++++++++++--
 hw/intc/xics_spapr.c                         |    6 +++---
 hw/ppc/spapr.c                               |   18 ++++++++++++++---
 hw/ppc/spapr_irq.c                           |    7 +++----
 include/hw/ppc/spapr.h                       |    1 +
 include/hw/ppc/spapr_irq.h                   |    3 +--
 include/hw/ppc/spapr_xive.h                  |    4 ++--
 include/hw/ppc/xics.h                        |    2 ++
 include/hw/ppc/xics_spapr.h                  |    3 +--
 include/standard-headers/asm-x86/bootparam.h |    2 ++
 include/standard-headers/asm-x86/kvm_para.h  |    1 +
 include/standard-headers/linux/ethtool.h     |    2 ++
 include/standard-headers/linux/pci_regs.h    |    4 ++++
 include/standard-headers/linux/virtio_ids.h  |    1 +
 include/standard-headers/linux/virtio_pmem.h |    6 +++---
 linux-headers/asm-arm/kvm.h                  |   12 +++++++++++
 linux-headers/asm-arm/unistd-common.h        |    2 ++
 linux-headers/asm-arm64/kvm.h                |   17 ++++++++++++++++
 linux-headers/asm-generic/mman-common.h      |   15 ++++++++------
 linux-headers/asm-generic/mman.h             |   10 ++++-----
 linux-headers/asm-generic/unistd.h           |    8 +++++++
 linux-headers/asm-mips/unistd_n32.h          |    1 +
 linux-headers/asm-mips/unistd_n64.h          |    1 +
 linux-headers/asm-mips/unistd_o32.h          |    1 +
 linux-headers/asm-powerpc/kvm.h              |    3 +++
 linux-headers/asm-powerpc/mman.h             |    6 +-----
 linux-headers/asm-powerpc/unistd_32.h        |    2 ++
 linux-headers/asm-powerpc/unistd_64.h        |    2 ++
 linux-headers/asm-s390/unistd_32.h           |    2 ++
 linux-headers/asm-s390/unistd_64.h           |    2 ++
 linux-headers/asm-x86/kvm.h                  |   28 ++++++++++++++++++++------
 linux-headers/asm-x86/unistd_32.h            |    2 ++
 linux-headers/asm-x86/unistd_64.h            |    2 ++
 linux-headers/asm-x86/unistd_x32.h           |    2 ++
 linux-headers/linux/kvm.h                    |    7 +++++--
 linux-headers/linux/psp-sev.h                |    5 +----
 39 files changed, 206 insertions(+), 61 deletions(-)



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method
  2019-10-03 12:00 [PATCH 0/7] spapr: Use less XIVE HW resources in KVM Greg Kurz
@ 2019-10-03 12:00 ` Greg Kurz
  2019-10-03 12:24   ` Cédric Le Goater
  2019-10-03 12:01 ` [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property Greg Kurz
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:00 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

The number of servers, ie. upper bound of the highest VCPU id, is
currently only needed to generate the "interrupt-controller" node
in the DT. Soon it will be needed to inform the XICS-on-XIVE KVM
device that it can allocates less resources in the XIVE HW.

Add a method to XICSFabricClass for this purpose. Implement it
for sPAPR and use it to generate the "interrupt-controller" node.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/xics.c        |    7 +++++++
 hw/intc/xics_spapr.c  |    3 ++-
 hw/ppc/spapr.c        |    8 ++++++++
 include/hw/ppc/xics.h |    2 ++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index dfe7dbd254ab..f82072935266 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -716,6 +716,13 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
     return xic->icp_get(xi, server);
 }
 
+uint32_t xics_nr_servers(XICSFabric *xi)
+{
+    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
+
+    return xic->nr_servers(xi);
+}
+
 void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
 {
     assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 6e5eb24b3cca..aa568ed0dc0d 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -311,8 +311,9 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
 void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
                    uint32_t phandle)
 {
+    ICSState *ics = spapr->ics;
     uint32_t interrupt_server_ranges_prop[] = {
-        0, cpu_to_be32(nr_servers),
+        0, cpu_to_be32(xics_nr_servers(ics->xics)),
     };
     int node;
 
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 514a17ae74d6..b8b9796c88e4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4266,6 +4266,13 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
     return cpu ? spapr_cpu_state(cpu)->icp : NULL;
 }
 
+static uint32_t spapr_nr_servers(XICSFabric *xi)
+{
+    SpaprMachineState *spapr = SPAPR_MACHINE(xi);
+
+    return spapr_max_server_number(spapr);
+}
+
 static void spapr_pic_print_info(InterruptStatsProvider *obj,
                                  Monitor *mon)
 {
@@ -4423,6 +4430,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     xic->ics_get = spapr_ics_get;
     xic->ics_resend = spapr_ics_resend;
     xic->icp_get = spapr_icp_get;
+    xic->nr_servers = spapr_nr_servers;
     ispc->print_info = spapr_pic_print_info;
     /* Force NUMA node memory size to be a multiple of
      * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 1e6a9300eb2b..e6bb1239e8f8 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -151,9 +151,11 @@ typedef struct XICSFabricClass {
     ICSState *(*ics_get)(XICSFabric *xi, int irq);
     void (*ics_resend)(XICSFabric *xi);
     ICPState *(*icp_get)(XICSFabric *xi, int server);
+    uint32_t (*nr_servers)(XICSFabric *xi);
 } XICSFabricClass;
 
 ICPState *xics_icp_get(XICSFabric *xi, int server);
+uint32_t xics_nr_servers(XICSFabric *xi);
 
 /* Internal XICS interfaces */
 void icp_set_cppr(ICPState *icp, uint8_t cppr);



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property
  2019-10-03 12:00 [PATCH 0/7] spapr: Use less XIVE HW resources in KVM Greg Kurz
  2019-10-03 12:00 ` [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method Greg Kurz
@ 2019-10-03 12:01 ` Greg Kurz
  2019-10-03 12:21   ` Cédric Le Goater
  2019-10-04  4:07   ` David Gibson
  2019-10-03 12:01 ` [PATCH 3/7] spapr, xics, xive: Drop nr_servers argument in DT-related functions Greg Kurz
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:01 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

The sPAPR XIVE object has an nr_ends field which happens to be a
multiple of spapr_max_server_number(). It is currently set with
the help of "nr-ends" property. This is a bit unfortunate since
it exposes to the sPAPR irq frontend what should remain an
implemantation detail within the XIVE backend.

It will be possible soon to inform the XIVE KVM device about the
range of VCPU ids that may be used in the VM, as returned by the
spapr_max_server_number() function. This will allow the device
to substantially reduce the consumption of scarce resources
in the XIVE HW.

For both reasons, replace the "nr-ends" property with an "nr-servers"
one. The existing nr_ends field must be kept though since it tells how
many ENDs are migrated, it is derived from "nr-servers" at realize time
for simplicity. Convert spapr_dt_xive() to use it as well.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/spapr_xive.c        |   21 ++++++++++++++++-----
 hw/ppc/spapr_irq.c          |    2 +-
 include/hw/ppc/spapr_xive.h |    1 +
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 04879abf2e7a..62888ddc68db 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -99,6 +99,15 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
     return 0;
 }
 
+static uint32_t spapr_xive_vcpu_id_to_end_idx(uint32_t vcpu_id)
+{
+    /*
+     * 8 XIVE END structures per CPU. One for each available
+     * priority
+     */
+    return vcpu_id << 3;
+}
+
 static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
                                   uint8_t *out_end_blk, uint32_t *out_end_idx)
 {
@@ -109,7 +118,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
     }
 
     if (out_end_idx) {
-        *out_end_idx = (cpu->vcpu_id << 3) + prio;
+        *out_end_idx = spapr_xive_vcpu_id_to_end_idx(cpu->vcpu_id) + prio;
     }
 }
 
@@ -283,11 +292,13 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    if (!xive->nr_ends) {
-        error_setg(errp, "Number of interrupt needs to be greater 0");
+    if (!xive->nr_servers) {
+        error_setg(errp, "Number of interrupt servers must be greater than 0");
         return;
     }
 
+    xive->nr_ends = spapr_xive_vcpu_id_to_end_idx(xive->nr_servers);
+
     /*
      * Initialize the internal sources, for IPIs and virtual devices.
      */
@@ -489,7 +500,7 @@ static const VMStateDescription vmstate_spapr_xive = {
 
 static Property spapr_xive_properties[] = {
     DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
-    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
+    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
     DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
     DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
     DEFINE_PROP_END_OF_LIST(),
@@ -1550,7 +1561,7 @@ void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
     /* Interrupt number ranges for the IPIs */
     uint32_t lisn_ranges[] = {
         cpu_to_be32(0),
-        cpu_to_be32(nr_servers),
+        cpu_to_be32(xive->nr_servers),
     };
     /*
      * EQ size - the sizes of pages supported by the system 4K, 64K,
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 457eabe24cda..025fd00143a2 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -591,7 +591,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
          * 8 XIVE END structures per CPU. One for each available
          * priority
          */
-        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
+        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
         qdev_init_nofail(dev);
 
         spapr->xive = SPAPR_XIVE(dev);
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 0df20a6590a5..4a4a6fc6be7f 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -22,6 +22,7 @@ typedef struct SpaprXive {
     /* Internal interrupt source for IPIs and virtual devices */
     XiveSource    source;
     hwaddr        vc_base;
+    uint32_t      nr_servers;
 
     /* END ESB MMIOs */
     XiveENDSource end_source;



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/7] spapr, xics, xive: Drop nr_servers argument in DT-related functions
  2019-10-03 12:00 [PATCH 0/7] spapr: Use less XIVE HW resources in KVM Greg Kurz
  2019-10-03 12:00 ` [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method Greg Kurz
  2019-10-03 12:01 ` [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property Greg Kurz
@ 2019-10-03 12:01 ` Greg Kurz
  2019-10-03 12:25   ` Cédric Le Goater
  2019-10-03 12:01 ` [PATCH RFC 4/7] linux-headers: Update against 5.3-rc2 Greg Kurz
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:01 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

Both XICS and XIVE backends can access nr_servers by other means. No
need to pass it around anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/spapr_xive.c        |    3 +--
 hw/intc/xics_spapr.c        |    3 +--
 hw/ppc/spapr.c              |    3 +--
 hw/ppc/spapr_irq.c          |    5 ++---
 include/hw/ppc/spapr_irq.h  |    3 +--
 include/hw/ppc/spapr_xive.h |    3 +--
 include/hw/ppc/xics_spapr.h |    3 +--
 7 files changed, 8 insertions(+), 15 deletions(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 62888ddc68db..56d851169cf6 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -1552,8 +1552,7 @@ void spapr_xive_hcall_init(SpaprMachineState *spapr)
     spapr_register_hypercall(H_INT_RESET, h_int_reset);
 }
 
-void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
-                   uint32_t phandle)
+void spapr_dt_xive(SpaprMachineState *spapr, void *fdt, uint32_t phandle)
 {
     SpaprXive *xive = spapr->xive;
     int node;
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index aa568ed0dc0d..015753c19c5d 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -308,8 +308,7 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
     spapr_register_hypercall(H_IPOLL, h_ipoll);
 }
 
-void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
-                   uint32_t phandle)
+void spapr_dt_xics(SpaprMachineState *spapr, void *fdt, uint32_t phandle)
 {
     ICSState *ics = spapr->ics;
     uint32_t interrupt_server_ranges_prop[] = {
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b8b9796c88e4..8f59f08c102e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1255,8 +1255,7 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
     _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
     /* /interrupt controller */
-    spapr->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
-                          PHANDLE_INTC);
+    spapr->irq->dt_populate(spapr, fdt, PHANDLE_INTC);
 
     ret = spapr_populate_memory(spapr, fdt);
     if (ret < 0) {
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 025fd00143a2..02e1b5503b65 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -368,11 +368,10 @@ static void spapr_irq_print_info_dual(SpaprMachineState *spapr, Monitor *mon)
     spapr_irq_current(spapr)->print_info(spapr, mon);
 }
 
-static void spapr_irq_dt_populate_dual(SpaprMachineState *spapr,
-                                       uint32_t nr_servers, void *fdt,
+static void spapr_irq_dt_populate_dual(SpaprMachineState *spapr, void *fdt,
                                        uint32_t phandle)
 {
-    spapr_irq_current(spapr)->dt_populate(spapr, nr_servers, fdt, phandle);
+    spapr_irq_current(spapr)->dt_populate(spapr, fdt, phandle);
 }
 
 static void spapr_irq_cpu_intc_create_dual(SpaprMachineState *spapr,
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 69a37f608e01..1736e503a8e9 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -45,8 +45,7 @@ typedef struct SpaprIrq {
     int (*claim)(SpaprMachineState *spapr, int irq, bool lsi, Error **errp);
     void (*free)(SpaprMachineState *spapr, int irq);
     void (*print_info)(SpaprMachineState *spapr, Monitor *mon);
-    void (*dt_populate)(SpaprMachineState *spapr, uint32_t nr_servers,
-                        void *fdt, uint32_t phandle);
+    void (*dt_populate)(SpaprMachineState *spapr, void *fdt, uint32_t phandle);
     void (*cpu_intc_create)(SpaprMachineState *spapr, PowerPCCPU *cpu,
                             Error **errp);
     int (*post_load)(SpaprMachineState *spapr, int version_id);
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 4a4a6fc6be7f..fae075d51815 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -61,8 +61,7 @@ void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon);
 int spapr_xive_post_load(SpaprXive *xive, int version_id);
 
 void spapr_xive_hcall_init(SpaprMachineState *spapr);
-void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
-                   uint32_t phandle);
+void spapr_dt_xive(SpaprMachineState *spapr, void *fdt, uint32_t phandle);
 void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx);
 void spapr_xive_mmio_set_enabled(SpaprXive *xive, bool enable);
 void spapr_xive_map_mmio(SpaprXive *xive);
diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
index 0b35e85c266a..ecb67c6c340a 100644
--- a/include/hw/ppc/xics_spapr.h
+++ b/include/hw/ppc/xics_spapr.h
@@ -32,8 +32,7 @@
 #define TYPE_ICS_SPAPR "ics-spapr"
 #define ICS_SPAPR(obj) OBJECT_CHECK(ICSState, (obj), TYPE_ICS_SPAPR)
 
-void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
-                   uint32_t phandle);
+void spapr_dt_xics(SpaprMachineState *spapr, void *fdt, uint32_t phandle);
 int xics_kvm_connect(SpaprMachineState *spapr, Error **errp);
 void xics_kvm_disconnect(SpaprMachineState *spapr, Error **errp);
 bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr);



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC 4/7] linux-headers: Update against 5.3-rc2
  2019-10-03 12:00 [PATCH 0/7] spapr: Use less XIVE HW resources in KVM Greg Kurz
                   ` (2 preceding siblings ...)
  2019-10-03 12:01 ` [PATCH 3/7] spapr, xics, xive: Drop nr_servers argument in DT-related functions Greg Kurz
@ 2019-10-03 12:01 ` Greg Kurz
  2019-10-03 12:01 ` [PATCH 5/7] spapr/xics: Configure number of servers in KVM Greg Kurz
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:01 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

In order to get some new KVM definitions:

#define KVM_DEV_XICS_GRP_CTRL           2
#define   KVM_DEV_XICS_NR_SERVERS       1

#define   KVM_DEV_XIVE_NR_SERVERS       3

Signed-off-by: Greg Kurz <groug@kaod.org>
---
This is tagged as RFC since these KVM definitions aren't upstream yet.
---
 include/standard-headers/asm-x86/bootparam.h |    2 ++
 include/standard-headers/asm-x86/kvm_para.h  |    1 +
 include/standard-headers/linux/ethtool.h     |    2 ++
 include/standard-headers/linux/pci_regs.h    |    4 ++++
 include/standard-headers/linux/virtio_ids.h  |    1 +
 include/standard-headers/linux/virtio_pmem.h |    6 +++---
 linux-headers/asm-arm/kvm.h                  |   12 +++++++++++
 linux-headers/asm-arm/unistd-common.h        |    2 ++
 linux-headers/asm-arm64/kvm.h                |   17 ++++++++++++++++
 linux-headers/asm-generic/mman-common.h      |   15 ++++++++------
 linux-headers/asm-generic/mman.h             |   10 ++++-----
 linux-headers/asm-generic/unistd.h           |    8 +++++++
 linux-headers/asm-mips/unistd_n32.h          |    1 +
 linux-headers/asm-mips/unistd_n64.h          |    1 +
 linux-headers/asm-mips/unistd_o32.h          |    1 +
 linux-headers/asm-powerpc/kvm.h              |    3 +++
 linux-headers/asm-powerpc/mman.h             |    6 +-----
 linux-headers/asm-powerpc/unistd_32.h        |    2 ++
 linux-headers/asm-powerpc/unistd_64.h        |    2 ++
 linux-headers/asm-s390/unistd_32.h           |    2 ++
 linux-headers/asm-s390/unistd_64.h           |    2 ++
 linux-headers/asm-x86/kvm.h                  |   28 ++++++++++++++++++++------
 linux-headers/asm-x86/unistd_32.h            |    2 ++
 linux-headers/asm-x86/unistd_64.h            |    2 ++
 linux-headers/asm-x86/unistd_x32.h           |    2 ++
 linux-headers/linux/kvm.h                    |    7 +++++--
 linux-headers/linux/psp-sev.h                |    5 +----
 27 files changed, 112 insertions(+), 34 deletions(-)

diff --git a/include/standard-headers/asm-x86/bootparam.h b/include/standard-headers/asm-x86/bootparam.h
index 67d4f0119f45..a6f7cf535e1e 100644
--- a/include/standard-headers/asm-x86/bootparam.h
+++ b/include/standard-headers/asm-x86/bootparam.h
@@ -29,6 +29,8 @@
 #define XLF_EFI_HANDOVER_32		(1<<2)
 #define XLF_EFI_HANDOVER_64		(1<<3)
 #define XLF_EFI_KEXEC			(1<<4)
+#define XLF_5LEVEL			(1<<5)
+#define XLF_5LEVEL_ENABLED		(1<<6)
 
 
 #endif /* _ASM_X86_BOOTPARAM_H */
diff --git a/include/standard-headers/asm-x86/kvm_para.h b/include/standard-headers/asm-x86/kvm_para.h
index e1715143fdda..90604a8fb77b 100644
--- a/include/standard-headers/asm-x86/kvm_para.h
+++ b/include/standard-headers/asm-x86/kvm_para.h
@@ -30,6 +30,7 @@
 #define KVM_FEATURE_ASYNC_PF_VMEXIT	10
 #define KVM_FEATURE_PV_SEND_IPI	11
 #define KVM_FEATURE_POLL_CONTROL	12
+#define KVM_FEATURE_PV_SCHED_YIELD	13
 
 #define KVM_HINTS_REALTIME      0
 
diff --git a/include/standard-headers/linux/ethtool.h b/include/standard-headers/linux/ethtool.h
index 9b9919a8f621..16d0eeea868d 100644
--- a/include/standard-headers/linux/ethtool.h
+++ b/include/standard-headers/linux/ethtool.h
@@ -1483,6 +1483,8 @@ enum ethtool_link_mode_bit_indices {
 	ETHTOOL_LINK_MODE_200000baseLR4_ER4_FR4_Full_BIT = 64,
 	ETHTOOL_LINK_MODE_200000baseDR4_Full_BIT	 = 65,
 	ETHTOOL_LINK_MODE_200000baseCR4_Full_BIT	 = 66,
+	ETHTOOL_LINK_MODE_100baseT1_Full_BIT		 = 67,
+	ETHTOOL_LINK_MODE_1000baseT1_Full_BIT		 = 68,
 
 	/* must be last entry */
 	__ETHTOOL_LINK_MODE_MASK_NBITS
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index 27164769d184..f28e562d7ca8 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -528,6 +528,7 @@
 #define  PCI_EXP_LNKCAP_SLS_5_0GB 0x00000002 /* LNKCAP2 SLS Vector bit 1 */
 #define  PCI_EXP_LNKCAP_SLS_8_0GB 0x00000003 /* LNKCAP2 SLS Vector bit 2 */
 #define  PCI_EXP_LNKCAP_SLS_16_0GB 0x00000004 /* LNKCAP2 SLS Vector bit 3 */
+#define  PCI_EXP_LNKCAP_SLS_32_0GB 0x00000005 /* LNKCAP2 SLS Vector bit 4 */
 #define  PCI_EXP_LNKCAP_MLW	0x000003f0 /* Maximum Link Width */
 #define  PCI_EXP_LNKCAP_ASPMS	0x00000c00 /* ASPM Support */
 #define  PCI_EXP_LNKCAP_L0SEL	0x00007000 /* L0s Exit Latency */
@@ -556,6 +557,7 @@
 #define  PCI_EXP_LNKSTA_CLS_5_0GB 0x0002 /* Current Link Speed 5.0GT/s */
 #define  PCI_EXP_LNKSTA_CLS_8_0GB 0x0003 /* Current Link Speed 8.0GT/s */
 #define  PCI_EXP_LNKSTA_CLS_16_0GB 0x0004 /* Current Link Speed 16.0GT/s */
+#define  PCI_EXP_LNKSTA_CLS_32_0GB 0x0005 /* Current Link Speed 32.0GT/s */
 #define  PCI_EXP_LNKSTA_NLW	0x03f0	/* Negotiated Link Width */
 #define  PCI_EXP_LNKSTA_NLW_X1	0x0010	/* Current Link Width x1 */
 #define  PCI_EXP_LNKSTA_NLW_X2	0x0020	/* Current Link Width x2 */
@@ -661,6 +663,7 @@
 #define  PCI_EXP_LNKCAP2_SLS_5_0GB	0x00000004 /* Supported Speed 5GT/s */
 #define  PCI_EXP_LNKCAP2_SLS_8_0GB	0x00000008 /* Supported Speed 8GT/s */
 #define  PCI_EXP_LNKCAP2_SLS_16_0GB	0x00000010 /* Supported Speed 16GT/s */
+#define  PCI_EXP_LNKCAP2_SLS_32_0GB	0x00000020 /* Supported Speed 32GT/s */
 #define  PCI_EXP_LNKCAP2_CROSSLINK	0x00000100 /* Crosslink supported */
 #define PCI_EXP_LNKCTL2		48	/* Link Control 2 */
 #define  PCI_EXP_LNKCTL2_TLS		0x000f
@@ -668,6 +671,7 @@
 #define  PCI_EXP_LNKCTL2_TLS_5_0GT	0x0002 /* Supported Speed 5GT/s */
 #define  PCI_EXP_LNKCTL2_TLS_8_0GT	0x0003 /* Supported Speed 8GT/s */
 #define  PCI_EXP_LNKCTL2_TLS_16_0GT	0x0004 /* Supported Speed 16GT/s */
+#define  PCI_EXP_LNKCTL2_TLS_32_0GT	0x0005 /* Supported Speed 32GT/s */
 #define PCI_EXP_LNKSTA2		50	/* Link Status 2 */
 #define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2	52	/* v2 endpoints with link end here */
 #define PCI_EXP_SLTCAP2		52	/* Slot Capabilities 2 */
diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index 32b2f94d1f58..348fd0176f75 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -43,6 +43,7 @@
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 #define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
+#define VIRTIO_ID_IOMMU        23 /* virtio IOMMU */
 #define VIRTIO_ID_PMEM         27 /* virtio pmem */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/include/standard-headers/linux/virtio_pmem.h b/include/standard-headers/linux/virtio_pmem.h
index 7e3d43b12136..fc029de7988e 100644
--- a/include/standard-headers/linux/virtio_pmem.h
+++ b/include/standard-headers/linux/virtio_pmem.h
@@ -1,4 +1,4 @@
-/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
+/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause */
 /*
  * Definitions for virtio-pmem devices.
  *
@@ -7,8 +7,8 @@
  * Author(s): Pankaj Gupta <pagupta@redhat.com>
  */
 
-#ifndef _UAPI_LINUX_VIRTIO_PMEM_H
-#define _UAPI_LINUX_VIRTIO_PMEM_H
+#ifndef _LINUX_VIRTIO_PMEM_H
+#define _LINUX_VIRTIO_PMEM_H
 
 #include "standard-headers/linux/types.h"
 #include "standard-headers/linux/virtio_ids.h"
diff --git a/linux-headers/asm-arm/kvm.h b/linux-headers/asm-arm/kvm.h
index e1f8b745582f..dfccc47092c0 100644
--- a/linux-headers/asm-arm/kvm.h
+++ b/linux-headers/asm-arm/kvm.h
@@ -214,6 +214,18 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM | KVM_REG_SIZE_U64 | \
 					 KVM_REG_ARM_FW | ((r) & 0xffff))
 #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
+	/* Higher values mean better protection. */
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL		0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL		1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED	2
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
+	/* Higher values mean better protection. */
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL		0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN		1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL		2
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED	3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	(1U << 4)
 
 /* Device Control API: ARM VGIC */
 #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
diff --git a/linux-headers/asm-arm/unistd-common.h b/linux-headers/asm-arm/unistd-common.h
index 27a9b6da27a1..eb5d361b117b 100644
--- a/linux-headers/asm-arm/unistd-common.h
+++ b/linux-headers/asm-arm/unistd-common.h
@@ -388,5 +388,7 @@
 #define __NR_fsconfig (__NR_SYSCALL_BASE + 431)
 #define __NR_fsmount (__NR_SYSCALL_BASE + 432)
 #define __NR_fspick (__NR_SYSCALL_BASE + 433)
+#define __NR_pidfd_open (__NR_SYSCALL_BASE + 434)
+#define __NR_clone3 (__NR_SYSCALL_BASE + 435)
 
 #endif /* _ASM_ARM_UNISTD_COMMON_H */
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
index 2431ec35a958..a95d3a420389 100644
--- a/linux-headers/asm-arm64/kvm.h
+++ b/linux-headers/asm-arm64/kvm.h
@@ -229,6 +229,16 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
 					 KVM_REG_ARM_FW | ((r) & 0xffff))
 #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL		0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL		1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED	2
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL		0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN		1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL		2
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED	3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED     	(1U << 4)
 
 /* SVE registers */
 #define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
@@ -260,6 +270,13 @@ struct kvm_vcpu_events {
 	 KVM_REG_SIZE_U256 |						\
 	 ((i) & (KVM_ARM64_SVE_MAX_SLICES - 1)))
 
+/*
+ * Register values for KVM_REG_ARM64_SVE_ZREG(), KVM_REG_ARM64_SVE_PREG() and
+ * KVM_REG_ARM64_SVE_FFR() are represented in memory in an endianness-
+ * invariant layout which differs from the layout used for the FPSIMD
+ * V-registers on big-endian systems: see sigcontext.h for more explanation.
+ */
+
 #define KVM_ARM64_SVE_VQ_MIN __SVE_VQ_MIN
 #define KVM_ARM64_SVE_VQ_MAX __SVE_VQ_MAX
 
diff --git a/linux-headers/asm-generic/mman-common.h b/linux-headers/asm-generic/mman-common.h
index abd238d0f7a4..63b1f506ea67 100644
--- a/linux-headers/asm-generic/mman-common.h
+++ b/linux-headers/asm-generic/mman-common.h
@@ -19,15 +19,18 @@
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x20		/* don't use a file */
-#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
-# define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be uninitialized */
-#else
-# define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
-#endif
 
-/* 0x0100 - 0x80000 flags are defined in asm-generic/mman.h */
+/* 0x0100 - 0x4000 flags are defined in asm-generic/mman.h */
+#define MAP_POPULATE		0x008000	/* populate (prefault) pagetables */
+#define MAP_NONBLOCK		0x010000	/* do not block on IO */
+#define MAP_STACK		0x020000	/* give out an address that is best suited for process/thread stacks */
+#define MAP_HUGETLB		0x040000	/* create a huge page mapping */
+#define MAP_SYNC		0x080000 /* perform synchronous page faults for the mapping */
 #define MAP_FIXED_NOREPLACE	0x100000	/* MAP_FIXED which doesn't unmap underlying mapping */
 
+#define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be
+					 * uninitialized */
+
 /*
  * Flags for mlock
  */
diff --git a/linux-headers/asm-generic/mman.h b/linux-headers/asm-generic/mman.h
index 653687d9771b..57e8195d0b53 100644
--- a/linux-headers/asm-generic/mman.h
+++ b/linux-headers/asm-generic/mman.h
@@ -9,13 +9,11 @@
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
 #define MAP_LOCKED	0x2000		/* pages are locked */
 #define MAP_NORESERVE	0x4000		/* don't check for reservations */
-#define MAP_POPULATE	0x8000		/* populate (prefault) pagetables */
-#define MAP_NONBLOCK	0x10000		/* do not block on IO */
-#define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
-#define MAP_HUGETLB	0x40000		/* create a huge page mapping */
-#define MAP_SYNC	0x80000		/* perform synchronous page faults for the mapping */
 
-/* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
+/*
+ * Bits [26:31] are reserved, see asm-generic/hugetlb_encode.h
+ * for MAP_HUGETLB usage
+ */
 
 #define MCL_CURRENT	1		/* lock all current mappings */
 #define MCL_FUTURE	2		/* lock all future mappings */
diff --git a/linux-headers/asm-generic/unistd.h b/linux-headers/asm-generic/unistd.h
index a87904daf103..1be0e798e362 100644
--- a/linux-headers/asm-generic/unistd.h
+++ b/linux-headers/asm-generic/unistd.h
@@ -844,9 +844,15 @@ __SYSCALL(__NR_fsconfig, sys_fsconfig)
 __SYSCALL(__NR_fsmount, sys_fsmount)
 #define __NR_fspick 433
 __SYSCALL(__NR_fspick, sys_fspick)
+#define __NR_pidfd_open 434
+__SYSCALL(__NR_pidfd_open, sys_pidfd_open)
+#ifdef __ARCH_WANT_SYS_CLONE3
+#define __NR_clone3 435
+__SYSCALL(__NR_clone3, sys_clone3)
+#endif
 
 #undef __NR_syscalls
-#define __NR_syscalls 434
+#define __NR_syscalls 436
 
 /*
  * 32 bit systems traditionally used different
diff --git a/linux-headers/asm-mips/unistd_n32.h b/linux-headers/asm-mips/unistd_n32.h
index fb988de9001e..7dffe8e34e63 100644
--- a/linux-headers/asm-mips/unistd_n32.h
+++ b/linux-headers/asm-mips/unistd_n32.h
@@ -363,6 +363,7 @@
 #define __NR_fsconfig	(__NR_Linux + 431)
 #define __NR_fsmount	(__NR_Linux + 432)
 #define __NR_fspick	(__NR_Linux + 433)
+#define __NR_pidfd_open	(__NR_Linux + 434)
 
 
 #endif /* _ASM_MIPS_UNISTD_N32_H */
diff --git a/linux-headers/asm-mips/unistd_n64.h b/linux-headers/asm-mips/unistd_n64.h
index 17359163c9af..f4592d6fc50c 100644
--- a/linux-headers/asm-mips/unistd_n64.h
+++ b/linux-headers/asm-mips/unistd_n64.h
@@ -339,6 +339,7 @@
 #define __NR_fsconfig	(__NR_Linux + 431)
 #define __NR_fsmount	(__NR_Linux + 432)
 #define __NR_fspick	(__NR_Linux + 433)
+#define __NR_pidfd_open	(__NR_Linux + 434)
 
 
 #endif /* _ASM_MIPS_UNISTD_N64_H */
diff --git a/linux-headers/asm-mips/unistd_o32.h b/linux-headers/asm-mips/unistd_o32.h
index 83c8d8fb83ad..04c6728352a5 100644
--- a/linux-headers/asm-mips/unistd_o32.h
+++ b/linux-headers/asm-mips/unistd_o32.h
@@ -409,6 +409,7 @@
 #define __NR_fsconfig	(__NR_Linux + 431)
 #define __NR_fsmount	(__NR_Linux + 432)
 #define __NR_fspick	(__NR_Linux + 433)
+#define __NR_pidfd_open	(__NR_Linux + 434)
 
 
 #endif /* _ASM_MIPS_UNISTD_O32_H */
diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index b0f72dea8b11..264e266a85bf 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -667,6 +667,8 @@ struct kvm_ppc_cpu_char {
 
 /* PPC64 eXternal Interrupt Controller Specification */
 #define KVM_DEV_XICS_GRP_SOURCES	1	/* 64-bit source attributes */
+#define KVM_DEV_XICS_GRP_CTRL		2
+#define   KVM_DEV_XICS_NR_SERVERS	1
 
 /* Layout of 64-bit source attribute values */
 #define  KVM_XICS_DESTINATION_SHIFT	0
@@ -683,6 +685,7 @@ struct kvm_ppc_cpu_char {
 #define KVM_DEV_XIVE_GRP_CTRL		1
 #define   KVM_DEV_XIVE_RESET		1
 #define   KVM_DEV_XIVE_EQ_SYNC		2
+#define   KVM_DEV_XIVE_NR_SERVERS	3
 #define KVM_DEV_XIVE_GRP_SOURCE		2	/* 64-bit source identifier */
 #define KVM_DEV_XIVE_GRP_SOURCE_CONFIG	3	/* 64-bit source identifier */
 #define KVM_DEV_XIVE_GRP_EQ_CONFIG	4	/* 64-bit EQ identifier */
diff --git a/linux-headers/asm-powerpc/mman.h b/linux-headers/asm-powerpc/mman.h
index 1c2b3fca05a8..8db7c2a3be30 100644
--- a/linux-headers/asm-powerpc/mman.h
+++ b/linux-headers/asm-powerpc/mman.h
@@ -21,15 +21,11 @@
 #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
 
+
 #define MCL_CURRENT     0x2000          /* lock all currently mapped pages */
 #define MCL_FUTURE      0x4000          /* lock all additions to address space */
 #define MCL_ONFAULT	0x8000		/* lock all pages that are faulted in */
 
-#define MAP_POPULATE	0x8000		/* populate (prefault) pagetables */
-#define MAP_NONBLOCK	0x10000		/* do not block on IO */
-#define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
-#define MAP_HUGETLB	0x40000		/* create a huge page mapping */
-
 /* Override any generic PKEY permission defines */
 #define PKEY_DISABLE_EXECUTE   0x4
 #undef PKEY_ACCESS_MASK
diff --git a/linux-headers/asm-powerpc/unistd_32.h b/linux-headers/asm-powerpc/unistd_32.h
index 04cb2d3e619e..5584cc1b4fc1 100644
--- a/linux-headers/asm-powerpc/unistd_32.h
+++ b/linux-headers/asm-powerpc/unistd_32.h
@@ -416,6 +416,8 @@
 #define __NR_fsconfig	431
 #define __NR_fsmount	432
 #define __NR_fspick	433
+#define __NR_pidfd_open	434
+#define __NR_clone3	435
 
 
 #endif /* _ASM_POWERPC_UNISTD_32_H */
diff --git a/linux-headers/asm-powerpc/unistd_64.h b/linux-headers/asm-powerpc/unistd_64.h
index b1e69214903b..251bcff77ea4 100644
--- a/linux-headers/asm-powerpc/unistd_64.h
+++ b/linux-headers/asm-powerpc/unistd_64.h
@@ -388,6 +388,8 @@
 #define __NR_fsconfig	431
 #define __NR_fsmount	432
 #define __NR_fspick	433
+#define __NR_pidfd_open	434
+#define __NR_clone3	435
 
 
 #endif /* _ASM_POWERPC_UNISTD_64_H */
diff --git a/linux-headers/asm-s390/unistd_32.h b/linux-headers/asm-s390/unistd_32.h
index 941853f3e954..7cce3ee29609 100644
--- a/linux-headers/asm-s390/unistd_32.h
+++ b/linux-headers/asm-s390/unistd_32.h
@@ -406,5 +406,7 @@
 #define __NR_fsconfig 431
 #define __NR_fsmount 432
 #define __NR_fspick 433
+#define __NR_pidfd_open 434
+#define __NR_clone3 435
 
 #endif /* _ASM_S390_UNISTD_32_H */
diff --git a/linux-headers/asm-s390/unistd_64.h b/linux-headers/asm-s390/unistd_64.h
index 90271d7f8255..2371ff1e7a79 100644
--- a/linux-headers/asm-s390/unistd_64.h
+++ b/linux-headers/asm-s390/unistd_64.h
@@ -354,5 +354,7 @@
 #define __NR_fsconfig 431
 #define __NR_fsmount 432
 #define __NR_fspick 433
+#define __NR_pidfd_open 434
+#define __NR_clone3 435
 
 #endif /* _ASM_S390_UNISTD_64_H */
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index 6e7dd792e448..503d3f42da16 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -378,23 +378,24 @@ struct kvm_sync_regs {
 	struct kvm_vcpu_events events;
 };
 
-#define KVM_X86_QUIRK_LINT0_REENABLED	(1 << 0)
-#define KVM_X86_QUIRK_CD_NW_CLEARED	(1 << 1)
-#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE	(1 << 2)
-#define KVM_X86_QUIRK_OUT_7E_INC_RIP	(1 << 3)
+#define KVM_X86_QUIRK_LINT0_REENABLED	   (1 << 0)
+#define KVM_X86_QUIRK_CD_NW_CLEARED	   (1 << 1)
+#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE	   (1 << 2)
+#define KVM_X86_QUIRK_OUT_7E_INC_RIP	   (1 << 3)
+#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4)
 
 #define KVM_STATE_NESTED_FORMAT_VMX	0
-#define KVM_STATE_NESTED_FORMAT_SVM	1
+#define KVM_STATE_NESTED_FORMAT_SVM	1	/* unused */
 
 #define KVM_STATE_NESTED_GUEST_MODE	0x00000001
 #define KVM_STATE_NESTED_RUN_PENDING	0x00000002
 #define KVM_STATE_NESTED_EVMCS		0x00000004
 
-#define KVM_STATE_NESTED_VMX_VMCS_SIZE	0x1000
-
 #define KVM_STATE_NESTED_SMM_GUEST_MODE	0x00000001
 #define KVM_STATE_NESTED_SMM_VMXON	0x00000002
 
+#define KVM_STATE_NESTED_VMX_VMCS_SIZE	0x1000
+
 struct kvm_vmx_nested_state_data {
 	__u8 vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
 	__u8 shadow_vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
@@ -432,4 +433,17 @@ struct kvm_nested_state {
 	} data;
 };
 
+/* for KVM_CAP_PMU_EVENT_FILTER */
+struct kvm_pmu_event_filter {
+	__u32 action;
+	__u32 nevents;
+	__u32 fixed_counter_bitmap;
+	__u32 flags;
+	__u32 pad[4];
+	__u64 events[0];
+};
+
+#define KVM_PMU_EVENT_ALLOW 0
+#define KVM_PMU_EVENT_DENY 1
+
 #endif /* _ASM_X86_KVM_H */
diff --git a/linux-headers/asm-x86/unistd_32.h b/linux-headers/asm-x86/unistd_32.h
index 57bb48854c9a..e8ebec1cdc99 100644
--- a/linux-headers/asm-x86/unistd_32.h
+++ b/linux-headers/asm-x86/unistd_32.h
@@ -424,5 +424,7 @@
 #define __NR_fsconfig 431
 #define __NR_fsmount 432
 #define __NR_fspick 433
+#define __NR_pidfd_open 434
+#define __NR_clone3 435
 
 #endif /* _ASM_X86_UNISTD_32_H */
diff --git a/linux-headers/asm-x86/unistd_64.h b/linux-headers/asm-x86/unistd_64.h
index fe6aa0688a18..a2f863d5493f 100644
--- a/linux-headers/asm-x86/unistd_64.h
+++ b/linux-headers/asm-x86/unistd_64.h
@@ -346,5 +346,7 @@
 #define __NR_fsconfig 431
 #define __NR_fsmount 432
 #define __NR_fspick 433
+#define __NR_pidfd_open 434
+#define __NR_clone3 435
 
 #endif /* _ASM_X86_UNISTD_64_H */
diff --git a/linux-headers/asm-x86/unistd_x32.h b/linux-headers/asm-x86/unistd_x32.h
index 09cca49ba7ba..4cdc67d84810 100644
--- a/linux-headers/asm-x86/unistd_x32.h
+++ b/linux-headers/asm-x86/unistd_x32.h
@@ -299,6 +299,8 @@
 #define __NR_fsconfig (__X32_SYSCALL_BIT + 431)
 #define __NR_fsmount (__X32_SYSCALL_BIT + 432)
 #define __NR_fspick (__X32_SYSCALL_BIT + 433)
+#define __NR_pidfd_open (__X32_SYSCALL_BIT + 434)
+#define __NR_clone3 (__X32_SYSCALL_BIT + 435)
 #define __NR_rt_sigaction (__X32_SYSCALL_BIT + 512)
 #define __NR_rt_sigreturn (__X32_SYSCALL_BIT + 513)
 #define __NR_ioctl (__X32_SYSCALL_BIT + 514)
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 18892d65414a..9cf351919c88 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -116,7 +116,7 @@ struct kvm_irq_level {
 	 * ACPI gsi notion of irq.
 	 * For IA-64 (APIC model) IOAPIC0: irq 0-23; IOAPIC1: irq 24-47..
 	 * For X86 (standard AT mode) PIC0/1: irq 0-15. IOAPIC0: 0-23..
-	 * For ARM: See Documentation/virtual/kvm/api.txt
+	 * For ARM: See Documentation/virt/kvm/api.txt
 	 */
 	union {
 		__u32 irq;
@@ -995,6 +995,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ARM_SVE 170
 #define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
 #define KVM_CAP_ARM_PTRAUTH_GENERIC 172
+#define KVM_CAP_PMU_EVENT_FILTER 173
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1085,7 +1086,7 @@ struct kvm_xen_hvm_config {
  *
  * KVM_IRQFD_FLAG_RESAMPLE indicates resamplefd is valid and specifies
  * the irqfd to operate in resampling mode for level triggered interrupt
- * emulation.  See Documentation/virtual/kvm/api.txt.
+ * emulation.  See Documentation/virt/kvm/api.txt.
  */
 #define KVM_IRQFD_FLAG_RESAMPLE (1 << 1)
 
@@ -1329,6 +1330,8 @@ struct kvm_s390_ucas_mapping {
 #define KVM_PPC_GET_RMMU_INFO	  _IOW(KVMIO,  0xb0, struct kvm_ppc_rmmu_info)
 /* Available with KVM_CAP_PPC_GET_CPU_CHAR */
 #define KVM_PPC_GET_CPU_CHAR	  _IOR(KVMIO,  0xb1, struct kvm_ppc_cpu_char)
+/* Available with KVM_CAP_PMU_EVENT_FILTER */
+#define KVM_SET_PMU_EVENT_FILTER  _IOW(KVMIO,  0xb2, struct kvm_pmu_event_filter)
 
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
diff --git a/linux-headers/linux/psp-sev.h b/linux-headers/linux/psp-sev.h
index 36bbe17d8fa7..34c39690c09d 100644
--- a/linux-headers/linux/psp-sev.h
+++ b/linux-headers/linux/psp-sev.h
@@ -1,3 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
 /*
  * Userspace interface for AMD Secure Encrypted Virtualization (SEV)
  * platform management commands.
@@ -7,10 +8,6 @@
  * Author: Brijesh Singh <brijesh.singh@amd.com>
  *
  * SEV API specification is available at: https://developer.amd.com/sev/
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
  */
 
 #ifndef __PSP_SEV_USER_H__



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/7] spapr/xics: Configure number of servers in KVM
  2019-10-03 12:00 [PATCH 0/7] spapr: Use less XIVE HW resources in KVM Greg Kurz
                   ` (3 preceding siblings ...)
  2019-10-03 12:01 ` [PATCH RFC 4/7] linux-headers: Update against 5.3-rc2 Greg Kurz
@ 2019-10-03 12:01 ` Greg Kurz
  2019-10-03 12:29   ` Cédric Le Goater
  2019-10-03 12:01 ` [PATCH 6/7] spapr/xive: " Greg Kurz
  2019-10-03 12:02 ` [PATCH 7/7] spapr: Set VSMT to smp_threads by default Greg Kurz
  6 siblings, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:01 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

The XICS-on-XIVE KVM devices now has an attribute to configure the number
of interrupt servers. This allows to greatly optimize the usage of the VP
space in the XIVE HW, and thus to start a lot more VMs.

Only set this attribute if available in order to support older POWER9 KVM
and pre-POWER9 XICS KVM devices.

The XICS-on-XIVE KVM device now reports the exhaustion of VPs upon the
connection of the first VCPU. Check that in order to have a chance to
provide an hint to the user.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/xics_kvm.c |   24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index ba90d6dc966c..12d9524cc432 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -165,8 +165,15 @@ void icp_kvm_realize(DeviceState *dev, Error **errp)
 
     ret = kvm_vcpu_enable_cap(cs, KVM_CAP_IRQ_XICS, 0, kernel_xics_fd, vcpu_id);
     if (ret < 0) {
-        error_setg(errp, "Unable to connect CPU%ld to kernel XICS: %s", vcpu_id,
-                   strerror(errno));
+        Error *local_err = NULL;
+
+        error_setg(&local_err, "Unable to connect CPU%ld to kernel XICS: %s",
+                   vcpu_id, strerror(errno));
+        if (errno == ENOSPC) {
+            error_append_hint(&local_err, "Try -smp maxcpus=N with N < %u\n",
+                              MACHINE(qdev_get_machine())->smp.max_cpus);
+        }
+        error_propagate(errp, local_err);
         return;
     }
     enabled_icp = g_malloc(sizeof(*enabled_icp));
@@ -344,6 +351,7 @@ void ics_kvm_set_irq(ICSState *ics, int srcno, int val)
 
 int xics_kvm_connect(SpaprMachineState *spapr, Error **errp)
 {
+    ICSState *ics = spapr->ics;
     int rc;
     CPUState *cs;
     Error *local_err = NULL;
@@ -397,6 +405,18 @@ int xics_kvm_connect(SpaprMachineState *spapr, Error **errp)
         goto fail;
     }
 
+    /* Tell KVM about the # of VCPUs we may have (POWER9 and newer only) */
+    if (kvm_device_check_attr(rc, KVM_DEV_XICS_GRP_CTRL,
+                              KVM_DEV_XICS_NR_SERVERS)) {
+        uint32_t nr_servers = xics_nr_servers(ics->xics);
+
+        if (kvm_device_access(rc, KVM_DEV_XICS_GRP_CTRL,
+                              KVM_DEV_XICS_NR_SERVERS, &nr_servers, true,
+                              &local_err)) {
+            goto fail;
+        }
+    }
+
     kernel_xics_fd = rc;
     kvm_kernel_irqchip = true;
     kvm_msi_via_irqfd_allowed = true;



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 6/7] spapr/xive: Configure number of servers in KVM
  2019-10-03 12:00 [PATCH 0/7] spapr: Use less XIVE HW resources in KVM Greg Kurz
                   ` (4 preceding siblings ...)
  2019-10-03 12:01 ` [PATCH 5/7] spapr/xics: Configure number of servers in KVM Greg Kurz
@ 2019-10-03 12:01 ` Greg Kurz
  2019-10-03 12:30   ` Cédric Le Goater
  2019-10-03 12:02 ` [PATCH 7/7] spapr: Set VSMT to smp_threads by default Greg Kurz
  6 siblings, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:01 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

The XIVE KVM devices now has an attribute to configure the number of
interrupt servers. This allows to greatly optimize the usage of the VP
space in the XIVE HW, and thus to start a lot more VMs.

Only set this attribute if available in order to support older POWER9
KVM.

The XIVE KVM device now reports the exhaustion of VPs upon the
connection of the first VCPU. Check that in order to have a chance
to provide an hint to the user.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/spapr_xive_kvm.c |   22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index 51b334b676a1..2a3a9ef22b6f 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -152,7 +152,8 @@ void kvmppc_xive_cpu_synchronize_state(XiveTCTX *tctx, Error **errp)
 
 void kvmppc_xive_cpu_connect(XiveTCTX *tctx, Error **errp)
 {
-    SpaprXive *xive = SPAPR_MACHINE(qdev_get_machine())->xive;
+    MachineState *ms = MACHINE(qdev_get_machine());
+    SpaprXive *xive = SPAPR_MACHINE(ms)->xive;
     unsigned long vcpu_id;
     int ret;
 
@@ -171,8 +172,15 @@ void kvmppc_xive_cpu_connect(XiveTCTX *tctx, Error **errp)
     ret = kvm_vcpu_enable_cap(tctx->cs, KVM_CAP_PPC_IRQ_XIVE, 0, xive->fd,
                               vcpu_id, 0);
     if (ret < 0) {
-        error_setg(errp, "XIVE: unable to connect CPU%ld to KVM device: %s",
+        Error *err = NULL;
+
+        error_setg(&err, "XIVE: unable to connect CPU%ld to KVM device: %s",
                    vcpu_id, strerror(errno));
+        if (errno == ENOSPC) {
+            error_append_hint(&local_err, "Try -smp maxcpus=N with N < %u\n",
+                              ms->smp.max_cpus);
+        }
+        error_propagate(errp, err);
         return;
     }
 
@@ -768,6 +776,16 @@ void kvmppc_xive_connect(SpaprXive *xive, Error **errp)
         return;
     }
 
+    /* Tell KVM about the # of VCPUs we may have */
+    if (kvm_device_check_attr(xive->fd, KVM_DEV_XIVE_GRP_CTRL,
+                              KVM_DEV_XIVE_NR_SERVERS)) {
+        if (kvm_device_access(xive->fd, KVM_DEV_XIVE_GRP_CTRL,
+                              KVM_DEV_XIVE_NR_SERVERS, &xive->nr_servers, true,
+                              &local_err)) {
+            goto fail;
+        }
+    }
+
     /*
      * 1. Source ESB pages - KVM mapping
      */



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 7/7] spapr: Set VSMT to smp_threads by default
  2019-10-03 12:00 [PATCH 0/7] spapr: Use less XIVE HW resources in KVM Greg Kurz
                   ` (5 preceding siblings ...)
  2019-10-03 12:01 ` [PATCH 6/7] spapr/xive: " Greg Kurz
@ 2019-10-03 12:02 ` Greg Kurz
  2019-10-14  6:12   ` David Gibson
  6 siblings, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:02 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

Support for setting VSMT is available in KVM since linux-4.13. Most distros
that support KVM on POWER already have it. It thus seem reasonable enough
to have the default machine to set VSMT to smp_threads.

This brings contiguous VCPU ids and thus brings their upper bound down to
the machine's max_cpus. This is especially useful for XIVE KVM devices,
which may thus allocate only one VP descriptor per VCPU.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c         |    7 ++++++-
 include/hw/ppc/spapr.h |    1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8f59f08c102e..473ce1d04775 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2503,6 +2503,7 @@ static CPUArchId *spapr_find_cpu_slot(MachineState *ms, uint32_t id, int *idx)
 static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
 {
     MachineState *ms = MACHINE(spapr);
+    SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
     Error *local_err = NULL;
     bool vsmt_user = !!spapr->vsmt;
     int kvm_smt = kvmppc_smt_threads();
@@ -2529,7 +2530,7 @@ static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
             goto out;
         }
         /* In this case, spapr->vsmt has been set by the command line */
-    } else {
+    } else if (!smc->smp_threads_vsmt) {
         /*
          * Default VSMT value is tricky, because we need it to be as
          * consistent as possible (for migration), but this requires
@@ -2538,6 +2539,8 @@ static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
          * overwhelmingly common case in production systems.
          */
         spapr->vsmt = MAX(8, smp_threads);
+    } else {
+        spapr->vsmt = smp_threads;
     }
 
     /* KVM: If necessary, set the SMT mode: */
@@ -4452,6 +4455,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     smc->irq = &spapr_irq_dual;
     smc->dr_phb_enabled = true;
     smc->linux_pci_probe = true;
+    smc->smp_threads_vsmt = true;
 }
 
 static const TypeInfo spapr_machine_info = {
@@ -4519,6 +4523,7 @@ static void spapr_machine_4_1_class_options(MachineClass *mc)
 
     spapr_machine_4_2_class_options(mc);
     smc->linux_pci_probe = false;
+    smc->smp_threads_vsmt = false;
     compat_props_add(mc->compat_props, hw_compat_4_1, hw_compat_4_1_len);
     compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
 }
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index cbd1a4c9f390..2009eb64f9cb 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -122,6 +122,7 @@ struct SpaprMachineClass {
     bool broken_host_serial_model; /* present real host info to the guest */
     bool pre_4_1_migration; /* don't migrate hpt-max-page-size */
     bool linux_pci_probe;
+    bool smp_threads_vsmt; /* set VSMT to smp_threads by default */
 
     void (*phb_placement)(SpaprMachineState *spapr, uint32_t index,
                           uint64_t *buid, hwaddr *pio, 



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property
  2019-10-03 12:01 ` [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property Greg Kurz
@ 2019-10-03 12:21   ` Cédric Le Goater
  2019-10-03 12:44     ` Greg Kurz
  2019-10-04  4:07   ` David Gibson
  1 sibling, 1 reply; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-03 12:21 UTC (permalink / raw)
  To: Greg Kurz, David Gibson; +Cc: qemu-ppc, qemu-devel

On 03/10/2019 14:01, Greg Kurz wrote:
> The sPAPR XIVE object has an nr_ends field which happens to be a
> multiple of spapr_max_server_number(). It is currently set with
> the help of "nr-ends" property. This is a bit unfortunate since
> it exposes to the sPAPR irq frontend what should remain an
> implemantation detail within the XIVE backend.

implementation

> It will be possible soon to inform the XIVE KVM device about the
> range of VCPU ids that may be used in the VM, as returned by the
> spapr_max_server_number() function. This will allow the device
> to substantially reduce the consumption of scarce resources
> in the XIVE HW.
> 
> For both reasons, replace the "nr-ends" property with an "nr-servers"
> one. The existing nr_ends field must be kept though since it tells how
> many ENDs are migrated, it is derived from "nr-servers" at realize time
> for simplicity. Convert spapr_dt_xive() to use it as well.

Looks good. one question below.

> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/intc/spapr_xive.c        |   21 ++++++++++++++++-----
>  hw/ppc/spapr_irq.c          |    2 +-
>  include/hw/ppc/spapr_xive.h |    1 +
>  3 files changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 04879abf2e7a..62888ddc68db 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -99,6 +99,15 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
>      return 0;
>  }
>  
> +static uint32_t spapr_xive_vcpu_id_to_end_idx(uint32_t vcpu_id)


may be use a simpler macro :

#define spapr_xive_cpu_end_idx(vcpu, prio) (((vcpu) << 3) + prio) 

> +{
> +    /*
> +     * 8 XIVE END structures per CPU. One for each available
> +     * priority
> +     */
> +    return vcpu_id << 3;
> +}
> +
>  static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
>                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
>  {
> @@ -109,7 +118,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
>      }
>  
>      if (out_end_idx) {
> -        *out_end_idx = (cpu->vcpu_id << 3) + prio;
> +        *out_end_idx = spapr_xive_vcpu_id_to_end_idx(cpu->vcpu_id) + prio;
>      }
>  }
>  
> @@ -283,11 +292,13 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>          return;
>      }
>  
> -    if (!xive->nr_ends) {
> -        error_setg(errp, "Number of interrupt needs to be greater 0");
> +    if (!xive->nr_servers) {
> +        error_setg(errp, "Number of interrupt servers must be greater than 0");
>          return;
>      }
>  
> +    xive->nr_ends = spapr_xive_vcpu_id_to_end_idx(xive->nr_servers);
> +
>      /*
>       * Initialize the internal sources, for IPIs and virtual devices.
>       */
> @@ -489,7 +500,7 @@ static const VMStateDescription vmstate_spapr_xive = {
>  
>  static Property spapr_xive_properties[] = {
>      DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
> -    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
> +    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
>      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
>      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
>      DEFINE_PROP_END_OF_LIST(),
> @@ -1550,7 +1561,7 @@ void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,

we should remove the 'uint32_t nr_servers' parameter from spapr_dt_xive() 
then ?

>      /* Interrupt number ranges for the IPIs */
>      uint32_t lisn_ranges[] = {
>          cpu_to_be32(0),
> -        cpu_to_be32(nr_servers),
> +        cpu_to_be32(xive->nr_servers),
>      };
>      /*
>       * EQ size - the sizes of pages supported by the system 4K, 64K,
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 457eabe24cda..025fd00143a2 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -591,7 +591,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>           * 8 XIVE END structures per CPU. One for each available
>           * priority
>           */
> -        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> +        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
>          qdev_init_nofail(dev);
>  
>          spapr->xive = SPAPR_XIVE(dev);
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 0df20a6590a5..4a4a6fc6be7f 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -22,6 +22,7 @@ typedef struct SpaprXive {
>      /* Internal interrupt source for IPIs and virtual devices */
>      XiveSource    source;
>      hwaddr        vc_base;
> +    uint32_t      nr_servers;
>  
>      /* END ESB MMIOs */
>      XiveENDSource end_source;
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method
  2019-10-03 12:00 ` [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method Greg Kurz
@ 2019-10-03 12:24   ` Cédric Le Goater
  2019-10-03 12:49     ` Greg Kurz
  0 siblings, 1 reply; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-03 12:24 UTC (permalink / raw)
  To: Greg Kurz, David Gibson; +Cc: qemu-ppc, qemu-devel

On 03/10/2019 14:00, Greg Kurz wrote:
> The number of servers, ie. upper bound of the highest VCPU id, is
> currently only needed to generate the "interrupt-controller" node
> in the DT. Soon it will be needed to inform the XICS-on-XIVE KVM
> device that it can allocates less resources in the XIVE HW.
> 
> Add a method to XICSFabricClass for this purpose. 

This is sPAPR code and PowerNV does not care.

why can not we simply call spapr_max_server_number(spapr) ?


> Implement it
> for sPAPR and use it to generate the "interrupt-controller" node.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/intc/xics.c        |    7 +++++++
>  hw/intc/xics_spapr.c  |    3 ++-
>  hw/ppc/spapr.c        |    8 ++++++++
>  include/hw/ppc/xics.h |    2 ++
>  4 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index dfe7dbd254ab..f82072935266 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -716,6 +716,13 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
>      return xic->icp_get(xi, server);
>  }
>  
> +uint32_t xics_nr_servers(XICSFabric *xi)
> +{
> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
> +
> +    return xic->nr_servers(xi);
> +}
> +
>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
>  {
>      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index 6e5eb24b3cca..aa568ed0dc0d 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -311,8 +311,9 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
>  void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
>                     uint32_t phandle)
>  {
> +    ICSState *ics = spapr->ics;
>      uint32_t interrupt_server_ranges_prop[] = {
> -        0, cpu_to_be32(nr_servers),
> +        0, cpu_to_be32(xics_nr_servers(ics->xics)),
>      };
>      int node;
>  
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 514a17ae74d6..b8b9796c88e4 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -4266,6 +4266,13 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>      return cpu ? spapr_cpu_state(cpu)->icp : NULL;
>  }
>  
> +static uint32_t spapr_nr_servers(XICSFabric *xi)
> +{
> +    SpaprMachineState *spapr = SPAPR_MACHINE(xi);
> +
> +    return spapr_max_server_number(spapr);
> +}
> +
>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>                                   Monitor *mon)
>  {
> @@ -4423,6 +4430,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      xic->ics_get = spapr_ics_get;
>      xic->ics_resend = spapr_ics_resend;
>      xic->icp_get = spapr_icp_get;
> +    xic->nr_servers = spapr_nr_servers;
>      ispc->print_info = spapr_pic_print_info;
>      /* Force NUMA node memory size to be a multiple of
>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 1e6a9300eb2b..e6bb1239e8f8 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -151,9 +151,11 @@ typedef struct XICSFabricClass {
>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
>      void (*ics_resend)(XICSFabric *xi);
>      ICPState *(*icp_get)(XICSFabric *xi, int server);
> +    uint32_t (*nr_servers)(XICSFabric *xi);
>  } XICSFabricClass;
>  
>  ICPState *xics_icp_get(XICSFabric *xi, int server);
> +uint32_t xics_nr_servers(XICSFabric *xi);
>  
>  /* Internal XICS interfaces */
>  void icp_set_cppr(ICPState *icp, uint8_t cppr);
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/7] spapr, xics, xive: Drop nr_servers argument in DT-related functions
  2019-10-03 12:01 ` [PATCH 3/7] spapr, xics, xive: Drop nr_servers argument in DT-related functions Greg Kurz
@ 2019-10-03 12:25   ` Cédric Le Goater
  2019-10-03 12:52     ` Greg Kurz
  0 siblings, 1 reply; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-03 12:25 UTC (permalink / raw)
  To: Greg Kurz, David Gibson; +Cc: qemu-ppc, qemu-devel

On 03/10/2019 14:01, Greg Kurz wrote:
> Both XICS and XIVE backends can access nr_servers by other means. No
> need to pass it around anymore.

OK. You are doing the clean up in this patch.

> Signed-off-by: Greg Kurz <groug@kaod.org>

even if spapr_irq removal is programmed, 

Reviewed-by: Cédric Le Goater <clg@kaod.org>

> ---
>  hw/intc/spapr_xive.c        |    3 +--
>  hw/intc/xics_spapr.c        |    3 +--
>  hw/ppc/spapr.c              |    3 +--
>  hw/ppc/spapr_irq.c          |    5 ++---
>  include/hw/ppc/spapr_irq.h  |    3 +--
>  include/hw/ppc/spapr_xive.h |    3 +--
>  include/hw/ppc/xics_spapr.h |    3 +--
>  7 files changed, 8 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 62888ddc68db..56d851169cf6 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -1552,8 +1552,7 @@ void spapr_xive_hcall_init(SpaprMachineState *spapr)
>      spapr_register_hypercall(H_INT_RESET, h_int_reset);
>  }
>  
> -void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> -                   uint32_t phandle)
> +void spapr_dt_xive(SpaprMachineState *spapr, void *fdt, uint32_t phandle)
>  {
>      SpaprXive *xive = spapr->xive;
>      int node;
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index aa568ed0dc0d..015753c19c5d 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -308,8 +308,7 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
>      spapr_register_hypercall(H_IPOLL, h_ipoll);
>  }
>  
> -void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> -                   uint32_t phandle)
> +void spapr_dt_xics(SpaprMachineState *spapr, void *fdt, uint32_t phandle)
>  {
>      ICSState *ics = spapr->ics;
>      uint32_t interrupt_server_ranges_prop[] = {
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b8b9796c88e4..8f59f08c102e 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1255,8 +1255,7 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
>      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
>  
>      /* /interrupt controller */
> -    spapr->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
> -                          PHANDLE_INTC);
> +    spapr->irq->dt_populate(spapr, fdt, PHANDLE_INTC);
>  
>      ret = spapr_populate_memory(spapr, fdt);
>      if (ret < 0) {
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 025fd00143a2..02e1b5503b65 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -368,11 +368,10 @@ static void spapr_irq_print_info_dual(SpaprMachineState *spapr, Monitor *mon)
>      spapr_irq_current(spapr)->print_info(spapr, mon);
>  }
>  
> -static void spapr_irq_dt_populate_dual(SpaprMachineState *spapr,
> -                                       uint32_t nr_servers, void *fdt,
> +static void spapr_irq_dt_populate_dual(SpaprMachineState *spapr, void *fdt,
>                                         uint32_t phandle)
>  {
> -    spapr_irq_current(spapr)->dt_populate(spapr, nr_servers, fdt, phandle);
> +    spapr_irq_current(spapr)->dt_populate(spapr, fdt, phandle);
>  }
>  
>  static void spapr_irq_cpu_intc_create_dual(SpaprMachineState *spapr,
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 69a37f608e01..1736e503a8e9 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -45,8 +45,7 @@ typedef struct SpaprIrq {
>      int (*claim)(SpaprMachineState *spapr, int irq, bool lsi, Error **errp);
>      void (*free)(SpaprMachineState *spapr, int irq);
>      void (*print_info)(SpaprMachineState *spapr, Monitor *mon);
> -    void (*dt_populate)(SpaprMachineState *spapr, uint32_t nr_servers,
> -                        void *fdt, uint32_t phandle);
> +    void (*dt_populate)(SpaprMachineState *spapr, void *fdt, uint32_t phandle);
>      void (*cpu_intc_create)(SpaprMachineState *spapr, PowerPCCPU *cpu,
>                              Error **errp);
>      int (*post_load)(SpaprMachineState *spapr, int version_id);
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 4a4a6fc6be7f..fae075d51815 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -61,8 +61,7 @@ void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon);
>  int spapr_xive_post_load(SpaprXive *xive, int version_id);
>  
>  void spapr_xive_hcall_init(SpaprMachineState *spapr);
> -void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> -                   uint32_t phandle);
> +void spapr_dt_xive(SpaprMachineState *spapr, void *fdt, uint32_t phandle);
>  void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx);
>  void spapr_xive_mmio_set_enabled(SpaprXive *xive, bool enable);
>  void spapr_xive_map_mmio(SpaprXive *xive);
> diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
> index 0b35e85c266a..ecb67c6c340a 100644
> --- a/include/hw/ppc/xics_spapr.h
> +++ b/include/hw/ppc/xics_spapr.h
> @@ -32,8 +32,7 @@
>  #define TYPE_ICS_SPAPR "ics-spapr"
>  #define ICS_SPAPR(obj) OBJECT_CHECK(ICSState, (obj), TYPE_ICS_SPAPR)
>  
> -void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> -                   uint32_t phandle);
> +void spapr_dt_xics(SpaprMachineState *spapr, void *fdt, uint32_t phandle);
>  int xics_kvm_connect(SpaprMachineState *spapr, Error **errp);
>  void xics_kvm_disconnect(SpaprMachineState *spapr, Error **errp);
>  bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr);
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] spapr/xics: Configure number of servers in KVM
  2019-10-03 12:01 ` [PATCH 5/7] spapr/xics: Configure number of servers in KVM Greg Kurz
@ 2019-10-03 12:29   ` Cédric Le Goater
  2019-10-03 12:55     ` Greg Kurz
  0 siblings, 1 reply; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-03 12:29 UTC (permalink / raw)
  To: Greg Kurz, David Gibson; +Cc: qemu-ppc, qemu-devel

On 03/10/2019 14:01, Greg Kurz wrote:
> The XICS-on-XIVE KVM devices now has an attribute to configure the number
> of interrupt servers. This allows to greatly optimize the usage of the VP
> space in the XIVE HW, and thus to start a lot more VMs.
> 
> Only set this attribute if available in order to support older POWER9 KVM
> and pre-POWER9 XICS KVM devices.
> 
> The XICS-on-XIVE KVM device now reports the exhaustion of VPs upon the
> connection of the first VCPU. Check that in order to have a chance to
> provide an hint to the user.

That part would have been better in its own patch. Simpler to review.
 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Anyhow, if you split or not :

Reviewed-by: Cédric Le Goater <clg@kaod.org>

> ---
>  hw/intc/xics_kvm.c |   24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index ba90d6dc966c..12d9524cc432 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -165,8 +165,15 @@ void icp_kvm_realize(DeviceState *dev, Error **errp)
>  
>      ret = kvm_vcpu_enable_cap(cs, KVM_CAP_IRQ_XICS, 0, kernel_xics_fd, vcpu_id);
>      if (ret < 0) {
> -        error_setg(errp, "Unable to connect CPU%ld to kernel XICS: %s", vcpu_id,
> -                   strerror(errno));
> +        Error *local_err = NULL;
> +
> +        error_setg(&local_err, "Unable to connect CPU%ld to kernel XICS: %s",
> +                   vcpu_id, strerror(errno));
> +        if (errno == ENOSPC) {
> +            error_append_hint(&local_err, "Try -smp maxcpus=N with N < %u\n",
> +                              MACHINE(qdev_get_machine())->smp.max_cpus);
> +        }
> +        error_propagate(errp, local_err);
>          return;
>      }
>      enabled_icp = g_malloc(sizeof(*enabled_icp));
> @@ -344,6 +351,7 @@ void ics_kvm_set_irq(ICSState *ics, int srcno, int val)
>  
>  int xics_kvm_connect(SpaprMachineState *spapr, Error **errp)
>  {
> +    ICSState *ics = spapr->ics;
>      int rc;
>      CPUState *cs;
>      Error *local_err = NULL;
> @@ -397,6 +405,18 @@ int xics_kvm_connect(SpaprMachineState *spapr, Error **errp)
>          goto fail;
>      }
>  
> +    /* Tell KVM about the # of VCPUs we may have (POWER9 and newer only) */
> +    if (kvm_device_check_attr(rc, KVM_DEV_XICS_GRP_CTRL,
> +                              KVM_DEV_XICS_NR_SERVERS)) {
> +        uint32_t nr_servers = xics_nr_servers(ics->xics);
> +
> +        if (kvm_device_access(rc, KVM_DEV_XICS_GRP_CTRL,
> +                              KVM_DEV_XICS_NR_SERVERS, &nr_servers, true,
> +                              &local_err)) {
> +            goto fail;
> +        }
> +    }
> +
>      kernel_xics_fd = rc;
>      kvm_kernel_irqchip = true;
>      kvm_msi_via_irqfd_allowed = true;
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/7] spapr/xive: Configure number of servers in KVM
  2019-10-03 12:01 ` [PATCH 6/7] spapr/xive: " Greg Kurz
@ 2019-10-03 12:30   ` Cédric Le Goater
  0 siblings, 0 replies; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-03 12:30 UTC (permalink / raw)
  To: Greg Kurz, David Gibson; +Cc: qemu-ppc, qemu-devel

On 03/10/2019 14:01, Greg Kurz wrote:
> The XIVE KVM devices now has an attribute to configure the number of
> interrupt servers. This allows to greatly optimize the usage of the VP
> space in the XIVE HW, and thus to start a lot more VMs.
> 
> Only set this attribute if available in order to support older POWER9
> KVM.
> 
> The XIVE KVM device now reports the exhaustion of VPs upon the
> connection of the first VCPU. Check that in order to have a chance
> to provide an hint to the user.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

> ---
>  hw/intc/spapr_xive_kvm.c |   22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> index 51b334b676a1..2a3a9ef22b6f 100644
> --- a/hw/intc/spapr_xive_kvm.c
> +++ b/hw/intc/spapr_xive_kvm.c
> @@ -152,7 +152,8 @@ void kvmppc_xive_cpu_synchronize_state(XiveTCTX *tctx, Error **errp)
>  
>  void kvmppc_xive_cpu_connect(XiveTCTX *tctx, Error **errp)
>  {
> -    SpaprXive *xive = SPAPR_MACHINE(qdev_get_machine())->xive;
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    SpaprXive *xive = SPAPR_MACHINE(ms)->xive;
>      unsigned long vcpu_id;
>      int ret;
>  
> @@ -171,8 +172,15 @@ void kvmppc_xive_cpu_connect(XiveTCTX *tctx, Error **errp)
>      ret = kvm_vcpu_enable_cap(tctx->cs, KVM_CAP_PPC_IRQ_XIVE, 0, xive->fd,
>                                vcpu_id, 0);
>      if (ret < 0) {
> -        error_setg(errp, "XIVE: unable to connect CPU%ld to KVM device: %s",
> +        Error *err = NULL;
> +
> +        error_setg(&err, "XIVE: unable to connect CPU%ld to KVM device: %s",
>                     vcpu_id, strerror(errno));
> +        if (errno == ENOSPC) {
> +            error_append_hint(&local_err, "Try -smp maxcpus=N with N < %u\n",
> +                              ms->smp.max_cpus);
> +        }
> +        error_propagate(errp, err);
>          return;
>      }
>  
> @@ -768,6 +776,16 @@ void kvmppc_xive_connect(SpaprXive *xive, Error **errp)
>          return;
>      }
>  
> +    /* Tell KVM about the # of VCPUs we may have */
> +    if (kvm_device_check_attr(xive->fd, KVM_DEV_XIVE_GRP_CTRL,
> +                              KVM_DEV_XIVE_NR_SERVERS)) {
> +        if (kvm_device_access(xive->fd, KVM_DEV_XIVE_GRP_CTRL,
> +                              KVM_DEV_XIVE_NR_SERVERS, &xive->nr_servers, true,
> +                              &local_err)) {
> +            goto fail;
> +        }
> +    }
> +
>      /*
>       * 1. Source ESB pages - KVM mapping
>       */
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property
  2019-10-03 12:21   ` Cédric Le Goater
@ 2019-10-03 12:44     ` Greg Kurz
  0 siblings, 0 replies; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:44 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Thu, 3 Oct 2019 14:21:59 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 03/10/2019 14:01, Greg Kurz wrote:
> > The sPAPR XIVE object has an nr_ends field which happens to be a
> > multiple of spapr_max_server_number(). It is currently set with
> > the help of "nr-ends" property. This is a bit unfortunate since
> > it exposes to the sPAPR irq frontend what should remain an
> > implemantation detail within the XIVE backend.
> 
> implementation
> 

oops

> > It will be possible soon to inform the XIVE KVM device about the
> > range of VCPU ids that may be used in the VM, as returned by the
> > spapr_max_server_number() function. This will allow the device
> > to substantially reduce the consumption of scarce resources
> > in the XIVE HW.
> > 
> > For both reasons, replace the "nr-ends" property with an "nr-servers"
> > one. The existing nr_ends field must be kept though since it tells how
> > many ENDs are migrated, it is derived from "nr-servers" at realize time
> > for simplicity. Convert spapr_dt_xive() to use it as well.
> 
> Looks good. one question below.
> 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/intc/spapr_xive.c        |   21 ++++++++++++++++-----
> >  hw/ppc/spapr_irq.c          |    2 +-
> >  include/hw/ppc/spapr_xive.h |    1 +
> >  3 files changed, 18 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index 04879abf2e7a..62888ddc68db 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -99,6 +99,15 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> >      return 0;
> >  }
> >  
> > +static uint32_t spapr_xive_vcpu_id_to_end_idx(uint32_t vcpu_id)
> 
> 
> may be use a simpler macro :
> 
> #define spapr_xive_cpu_end_idx(vcpu, prio) (((vcpu) << 3) + prio) 
> 

Will do but I'll keep the comment.

> > +{
> > +    /*
> > +     * 8 XIVE END structures per CPU. One for each available
> > +     * priority
> > +     */
> > +    return vcpu_id << 3;
> > +}
> > +
> >  static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> >                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
> >  {
> > @@ -109,7 +118,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> >      }
> >  
> >      if (out_end_idx) {
> > -        *out_end_idx = (cpu->vcpu_id << 3) + prio;
> > +        *out_end_idx = spapr_xive_vcpu_id_to_end_idx(cpu->vcpu_id) + prio;
> >      }
> >  }
> >  
> > @@ -283,11 +292,13 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >          return;
> >      }
> >  
> > -    if (!xive->nr_ends) {
> > -        error_setg(errp, "Number of interrupt needs to be greater 0");
> > +    if (!xive->nr_servers) {
> > +        error_setg(errp, "Number of interrupt servers must be greater than 0");
> >          return;
> >      }
> >  
> > +    xive->nr_ends = spapr_xive_vcpu_id_to_end_idx(xive->nr_servers);
> > +
> >      /*
> >       * Initialize the internal sources, for IPIs and virtual devices.
> >       */
> > @@ -489,7 +500,7 @@ static const VMStateDescription vmstate_spapr_xive = {
> >  
> >  static Property spapr_xive_properties[] = {
> >      DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
> > -    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
> > +    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
> >      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
> >      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
> >      DEFINE_PROP_END_OF_LIST(),
> > @@ -1550,7 +1561,7 @@ void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> 
> we should remove the 'uint32_t nr_servers' parameter from spapr_dt_xive() 
> then ?
> 

Yes but this would also affect XICS since this function is a backend
method. It is hence done in the next patch.

> >      /* Interrupt number ranges for the IPIs */
> >      uint32_t lisn_ranges[] = {
> >          cpu_to_be32(0),
> > -        cpu_to_be32(nr_servers),
> > +        cpu_to_be32(xive->nr_servers),
> >      };
> >      /*
> >       * EQ size - the sizes of pages supported by the system 4K, 64K,
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 457eabe24cda..025fd00143a2 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -591,7 +591,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
> >           * 8 XIVE END structures per CPU. One for each available
> >           * priority
> >           */
> > -        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> > +        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
> >          qdev_init_nofail(dev);
> >  
> >          spapr->xive = SPAPR_XIVE(dev);
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index 0df20a6590a5..4a4a6fc6be7f 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -22,6 +22,7 @@ typedef struct SpaprXive {
> >      /* Internal interrupt source for IPIs and virtual devices */
> >      XiveSource    source;
> >      hwaddr        vc_base;
> > +    uint32_t      nr_servers;
> >  
> >      /* END ESB MMIOs */
> >      XiveENDSource end_source;
> > 
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method
  2019-10-03 12:24   ` Cédric Le Goater
@ 2019-10-03 12:49     ` Greg Kurz
  2019-10-03 12:58       ` Cédric Le Goater
  0 siblings, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:49 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Thu, 3 Oct 2019 14:24:06 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 03/10/2019 14:00, Greg Kurz wrote:
> > The number of servers, ie. upper bound of the highest VCPU id, is
> > currently only needed to generate the "interrupt-controller" node
> > in the DT. Soon it will be needed to inform the XICS-on-XIVE KVM
> > device that it can allocates less resources in the XIVE HW.
> > 
> > Add a method to XICSFabricClass for this purpose. 
> 
> This is sPAPR code and PowerNV does not care.
> 

Then PowerNV doesn't need to implement the method.

> why can not we simply call spapr_max_server_number(spapr) ?
> 

Because the backend shouldn't reach out to sPAPR machine
internals. XICSFabric is the natural interface for ICS/ICP
if they need something from the machine.

> 
> > Implement it
> > for sPAPR and use it to generate the "interrupt-controller" node.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/intc/xics.c        |    7 +++++++
> >  hw/intc/xics_spapr.c  |    3 ++-
> >  hw/ppc/spapr.c        |    8 ++++++++
> >  include/hw/ppc/xics.h |    2 ++
> >  4 files changed, 19 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> > index dfe7dbd254ab..f82072935266 100644
> > --- a/hw/intc/xics.c
> > +++ b/hw/intc/xics.c
> > @@ -716,6 +716,13 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
> >      return xic->icp_get(xi, server);
> >  }
> >  
> > +uint32_t xics_nr_servers(XICSFabric *xi)
> > +{
> > +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
> > +
> > +    return xic->nr_servers(xi);
> > +}
> > +
> >  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
> >  {
> >      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
> > diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> > index 6e5eb24b3cca..aa568ed0dc0d 100644
> > --- a/hw/intc/xics_spapr.c
> > +++ b/hw/intc/xics_spapr.c
> > @@ -311,8 +311,9 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
> >  void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> >                     uint32_t phandle)
> >  {
> > +    ICSState *ics = spapr->ics;
> >      uint32_t interrupt_server_ranges_prop[] = {
> > -        0, cpu_to_be32(nr_servers),
> > +        0, cpu_to_be32(xics_nr_servers(ics->xics)),
> >      };
> >      int node;
> >  
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 514a17ae74d6..b8b9796c88e4 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -4266,6 +4266,13 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
> >      return cpu ? spapr_cpu_state(cpu)->icp : NULL;
> >  }
> >  
> > +static uint32_t spapr_nr_servers(XICSFabric *xi)
> > +{
> > +    SpaprMachineState *spapr = SPAPR_MACHINE(xi);
> > +
> > +    return spapr_max_server_number(spapr);
> > +}
> > +
> >  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> >                                   Monitor *mon)
> >  {
> > @@ -4423,6 +4430,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >      xic->ics_get = spapr_ics_get;
> >      xic->ics_resend = spapr_ics_resend;
> >      xic->icp_get = spapr_icp_get;
> > +    xic->nr_servers = spapr_nr_servers;
> >      ispc->print_info = spapr_pic_print_info;
> >      /* Force NUMA node memory size to be a multiple of
> >       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
> > diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> > index 1e6a9300eb2b..e6bb1239e8f8 100644
> > --- a/include/hw/ppc/xics.h
> > +++ b/include/hw/ppc/xics.h
> > @@ -151,9 +151,11 @@ typedef struct XICSFabricClass {
> >      ICSState *(*ics_get)(XICSFabric *xi, int irq);
> >      void (*ics_resend)(XICSFabric *xi);
> >      ICPState *(*icp_get)(XICSFabric *xi, int server);
> > +    uint32_t (*nr_servers)(XICSFabric *xi);
> >  } XICSFabricClass;
> >  
> >  ICPState *xics_icp_get(XICSFabric *xi, int server);
> > +uint32_t xics_nr_servers(XICSFabric *xi);
> >  
> >  /* Internal XICS interfaces */
> >  void icp_set_cppr(ICPState *icp, uint8_t cppr);
> > 
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/7] spapr, xics, xive: Drop nr_servers argument in DT-related functions
  2019-10-03 12:25   ` Cédric Le Goater
@ 2019-10-03 12:52     ` Greg Kurz
  0 siblings, 0 replies; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:52 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Thu, 3 Oct 2019 14:25:58 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 03/10/2019 14:01, Greg Kurz wrote:
> > Both XICS and XIVE backends can access nr_servers by other means. No
> > need to pass it around anymore.
> 
> OK. You are doing the clean up in this patch.
> 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> 
> even if spapr_irq removal is programmed, 
> 

I have another version of this patchset based on David's full cleanup
series :)

> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> > ---
> >  hw/intc/spapr_xive.c        |    3 +--
> >  hw/intc/xics_spapr.c        |    3 +--
> >  hw/ppc/spapr.c              |    3 +--
> >  hw/ppc/spapr_irq.c          |    5 ++---
> >  include/hw/ppc/spapr_irq.h  |    3 +--
> >  include/hw/ppc/spapr_xive.h |    3 +--
> >  include/hw/ppc/xics_spapr.h |    3 +--
> >  7 files changed, 8 insertions(+), 15 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index 62888ddc68db..56d851169cf6 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -1552,8 +1552,7 @@ void spapr_xive_hcall_init(SpaprMachineState *spapr)
> >      spapr_register_hypercall(H_INT_RESET, h_int_reset);
> >  }
> >  
> > -void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> > -                   uint32_t phandle)
> > +void spapr_dt_xive(SpaprMachineState *spapr, void *fdt, uint32_t phandle)
> >  {
> >      SpaprXive *xive = spapr->xive;
> >      int node;
> > diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> > index aa568ed0dc0d..015753c19c5d 100644
> > --- a/hw/intc/xics_spapr.c
> > +++ b/hw/intc/xics_spapr.c
> > @@ -308,8 +308,7 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
> >      spapr_register_hypercall(H_IPOLL, h_ipoll);
> >  }
> >  
> > -void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> > -                   uint32_t phandle)
> > +void spapr_dt_xics(SpaprMachineState *spapr, void *fdt, uint32_t phandle)
> >  {
> >      ICSState *ics = spapr->ics;
> >      uint32_t interrupt_server_ranges_prop[] = {
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index b8b9796c88e4..8f59f08c102e 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1255,8 +1255,7 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
> >      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> >  
> >      /* /interrupt controller */
> > -    spapr->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
> > -                          PHANDLE_INTC);
> > +    spapr->irq->dt_populate(spapr, fdt, PHANDLE_INTC);
> >  
> >      ret = spapr_populate_memory(spapr, fdt);
> >      if (ret < 0) {
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 025fd00143a2..02e1b5503b65 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -368,11 +368,10 @@ static void spapr_irq_print_info_dual(SpaprMachineState *spapr, Monitor *mon)
> >      spapr_irq_current(spapr)->print_info(spapr, mon);
> >  }
> >  
> > -static void spapr_irq_dt_populate_dual(SpaprMachineState *spapr,
> > -                                       uint32_t nr_servers, void *fdt,
> > +static void spapr_irq_dt_populate_dual(SpaprMachineState *spapr, void *fdt,
> >                                         uint32_t phandle)
> >  {
> > -    spapr_irq_current(spapr)->dt_populate(spapr, nr_servers, fdt, phandle);
> > +    spapr_irq_current(spapr)->dt_populate(spapr, fdt, phandle);
> >  }
> >  
> >  static void spapr_irq_cpu_intc_create_dual(SpaprMachineState *spapr,
> > diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> > index 69a37f608e01..1736e503a8e9 100644
> > --- a/include/hw/ppc/spapr_irq.h
> > +++ b/include/hw/ppc/spapr_irq.h
> > @@ -45,8 +45,7 @@ typedef struct SpaprIrq {
> >      int (*claim)(SpaprMachineState *spapr, int irq, bool lsi, Error **errp);
> >      void (*free)(SpaprMachineState *spapr, int irq);
> >      void (*print_info)(SpaprMachineState *spapr, Monitor *mon);
> > -    void (*dt_populate)(SpaprMachineState *spapr, uint32_t nr_servers,
> > -                        void *fdt, uint32_t phandle);
> > +    void (*dt_populate)(SpaprMachineState *spapr, void *fdt, uint32_t phandle);
> >      void (*cpu_intc_create)(SpaprMachineState *spapr, PowerPCCPU *cpu,
> >                              Error **errp);
> >      int (*post_load)(SpaprMachineState *spapr, int version_id);
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index 4a4a6fc6be7f..fae075d51815 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -61,8 +61,7 @@ void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon);
> >  int spapr_xive_post_load(SpaprXive *xive, int version_id);
> >  
> >  void spapr_xive_hcall_init(SpaprMachineState *spapr);
> > -void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> > -                   uint32_t phandle);
> > +void spapr_dt_xive(SpaprMachineState *spapr, void *fdt, uint32_t phandle);
> >  void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx);
> >  void spapr_xive_mmio_set_enabled(SpaprXive *xive, bool enable);
> >  void spapr_xive_map_mmio(SpaprXive *xive);
> > diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
> > index 0b35e85c266a..ecb67c6c340a 100644
> > --- a/include/hw/ppc/xics_spapr.h
> > +++ b/include/hw/ppc/xics_spapr.h
> > @@ -32,8 +32,7 @@
> >  #define TYPE_ICS_SPAPR "ics-spapr"
> >  #define ICS_SPAPR(obj) OBJECT_CHECK(ICSState, (obj), TYPE_ICS_SPAPR)
> >  
> > -void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> > -                   uint32_t phandle);
> > +void spapr_dt_xics(SpaprMachineState *spapr, void *fdt, uint32_t phandle);
> >  int xics_kvm_connect(SpaprMachineState *spapr, Error **errp);
> >  void xics_kvm_disconnect(SpaprMachineState *spapr, Error **errp);
> >  bool xics_kvm_has_broken_disconnect(SpaprMachineState *spapr);
> > 
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] spapr/xics: Configure number of servers in KVM
  2019-10-03 12:29   ` Cédric Le Goater
@ 2019-10-03 12:55     ` Greg Kurz
  0 siblings, 0 replies; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 12:55 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Thu, 3 Oct 2019 14:29:46 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 03/10/2019 14:01, Greg Kurz wrote:
> > The XICS-on-XIVE KVM devices now has an attribute to configure the number
> > of interrupt servers. This allows to greatly optimize the usage of the VP
> > space in the XIVE HW, and thus to start a lot more VMs.
> > 
> > Only set this attribute if available in order to support older POWER9 KVM
> > and pre-POWER9 XICS KVM devices.
> > 
> > The XICS-on-XIVE KVM device now reports the exhaustion of VPs upon the
> > connection of the first VCPU. Check that in order to have a chance to
> > provide an hint to the user.
> 
> That part would have been better in its own patch. Simpler to review.
>  

Yeah, possibly... on the other hand, it's only two fairly simple
hunks. :)

> > Signed-off-by: Greg Kurz <groug@kaod.org>
> 
> Anyhow, if you split or not :
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> > ---
> >  hw/intc/xics_kvm.c |   24 ++++++++++++++++++++++--
> >  1 file changed, 22 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> > index ba90d6dc966c..12d9524cc432 100644
> > --- a/hw/intc/xics_kvm.c
> > +++ b/hw/intc/xics_kvm.c
> > @@ -165,8 +165,15 @@ void icp_kvm_realize(DeviceState *dev, Error **errp)
> >  
> >      ret = kvm_vcpu_enable_cap(cs, KVM_CAP_IRQ_XICS, 0, kernel_xics_fd, vcpu_id);
> >      if (ret < 0) {
> > -        error_setg(errp, "Unable to connect CPU%ld to kernel XICS: %s", vcpu_id,
> > -                   strerror(errno));
> > +        Error *local_err = NULL;
> > +
> > +        error_setg(&local_err, "Unable to connect CPU%ld to kernel XICS: %s",
> > +                   vcpu_id, strerror(errno));
> > +        if (errno == ENOSPC) {
> > +            error_append_hint(&local_err, "Try -smp maxcpus=N with N < %u\n",
> > +                              MACHINE(qdev_get_machine())->smp.max_cpus);
> > +        }
> > +        error_propagate(errp, local_err);
> >          return;
> >      }
> >      enabled_icp = g_malloc(sizeof(*enabled_icp));
> > @@ -344,6 +351,7 @@ void ics_kvm_set_irq(ICSState *ics, int srcno, int val)
> >  
> >  int xics_kvm_connect(SpaprMachineState *spapr, Error **errp)
> >  {
> > +    ICSState *ics = spapr->ics;
> >      int rc;
> >      CPUState *cs;
> >      Error *local_err = NULL;
> > @@ -397,6 +405,18 @@ int xics_kvm_connect(SpaprMachineState *spapr, Error **errp)
> >          goto fail;
> >      }
> >  
> > +    /* Tell KVM about the # of VCPUs we may have (POWER9 and newer only) */
> > +    if (kvm_device_check_attr(rc, KVM_DEV_XICS_GRP_CTRL,
> > +                              KVM_DEV_XICS_NR_SERVERS)) {
> > +        uint32_t nr_servers = xics_nr_servers(ics->xics);
> > +
> > +        if (kvm_device_access(rc, KVM_DEV_XICS_GRP_CTRL,
> > +                              KVM_DEV_XICS_NR_SERVERS, &nr_servers, true,
> > +                              &local_err)) {
> > +            goto fail;
> > +        }
> > +    }
> > +
> >      kernel_xics_fd = rc;
> >      kvm_kernel_irqchip = true;
> >      kvm_msi_via_irqfd_allowed = true;
> > 
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method
  2019-10-03 12:49     ` Greg Kurz
@ 2019-10-03 12:58       ` Cédric Le Goater
  2019-10-03 13:02         ` Greg Kurz
  0 siblings, 1 reply; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-03 12:58 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson

On 03/10/2019 14:49, Greg Kurz wrote:
> On Thu, 3 Oct 2019 14:24:06 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> On 03/10/2019 14:00, Greg Kurz wrote:
>>> The number of servers, ie. upper bound of the highest VCPU id, is
>>> currently only needed to generate the "interrupt-controller" node
>>> in the DT. Soon it will be needed to inform the XICS-on-XIVE KVM
>>> device that it can allocates less resources in the XIVE HW.
>>>
>>> Add a method to XICSFabricClass for this purpose. 
>>
>> This is sPAPR code and PowerNV does not care.
>>
> 
> Then PowerNV doesn't need to implement the method.
> 
>> why can not we simply call spapr_max_server_number(spapr) ?
>>
> 
> Because the backend shouldn't reach out to sPAPR machine
> internals. XICSFabric is the natural interface for ICS/ICP
> if they need something from the machine.

From what I can see, xics_nr_servers() is only called by : 

  spapr_dt_xics(SpaprMachineState *spapr ...)
  xics_kvm_connect(SpaprMachineState *spapr ...)

C. 

>>
>>> Implement it
>>> for sPAPR and use it to generate the "interrupt-controller" node.
>>>
>>> Signed-off-by: Greg Kurz <groug@kaod.org>
>>> ---
>>>  hw/intc/xics.c        |    7 +++++++
>>>  hw/intc/xics_spapr.c  |    3 ++-
>>>  hw/ppc/spapr.c        |    8 ++++++++
>>>  include/hw/ppc/xics.h |    2 ++
>>>  4 files changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
>>> index dfe7dbd254ab..f82072935266 100644
>>> --- a/hw/intc/xics.c
>>> +++ b/hw/intc/xics.c
>>> @@ -716,6 +716,13 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
>>>      return xic->icp_get(xi, server);
>>>  }
>>>  
>>> +uint32_t xics_nr_servers(XICSFabric *xi)
>>> +{
>>> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
>>> +
>>> +    return xic->nr_servers(xi);
>>> +}
>>> +
>>>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
>>>  {
>>>      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
>>> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
>>> index 6e5eb24b3cca..aa568ed0dc0d 100644
>>> --- a/hw/intc/xics_spapr.c
>>> +++ b/hw/intc/xics_spapr.c
>>> @@ -311,8 +311,9 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
>>>  void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
>>>                     uint32_t phandle)
>>>  {
>>> +    ICSState *ics = spapr->ics;
>>>      uint32_t interrupt_server_ranges_prop[] = {
>>> -        0, cpu_to_be32(nr_servers),
>>> +        0, cpu_to_be32(xics_nr_servers(ics->xics)),
>>>      };
>>>      int node;
>>>  
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 514a17ae74d6..b8b9796c88e4 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -4266,6 +4266,13 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>>>      return cpu ? spapr_cpu_state(cpu)->icp : NULL;
>>>  }
>>>  
>>> +static uint32_t spapr_nr_servers(XICSFabric *xi)
>>> +{
>>> +    SpaprMachineState *spapr = SPAPR_MACHINE(xi);
>>> +
>>> +    return spapr_max_server_number(spapr);
>>> +}
>>> +
>>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>>>                                   Monitor *mon)
>>>  {
>>> @@ -4423,6 +4430,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>>      xic->ics_get = spapr_ics_get;
>>>      xic->ics_resend = spapr_ics_resend;
>>>      xic->icp_get = spapr_icp_get;
>>> +    xic->nr_servers = spapr_nr_servers;
>>>      ispc->print_info = spapr_pic_print_info;
>>>      /* Force NUMA node memory size to be a multiple of
>>>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
>>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>>> index 1e6a9300eb2b..e6bb1239e8f8 100644
>>> --- a/include/hw/ppc/xics.h
>>> +++ b/include/hw/ppc/xics.h
>>> @@ -151,9 +151,11 @@ typedef struct XICSFabricClass {
>>>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
>>>      void (*ics_resend)(XICSFabric *xi);
>>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
>>> +    uint32_t (*nr_servers)(XICSFabric *xi);
>>>  } XICSFabricClass;
>>>  
>>>  ICPState *xics_icp_get(XICSFabric *xi, int server);
>>> +uint32_t xics_nr_servers(XICSFabric *xi);
>>>  
>>>  /* Internal XICS interfaces */
>>>  void icp_set_cppr(ICPState *icp, uint8_t cppr);
>>>
>>
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method
  2019-10-03 12:58       ` Cédric Le Goater
@ 2019-10-03 13:02         ` Greg Kurz
  2019-10-03 13:19           ` Cédric Le Goater
  0 siblings, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 13:02 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Thu, 3 Oct 2019 14:58:45 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 03/10/2019 14:49, Greg Kurz wrote:
> > On Thu, 3 Oct 2019 14:24:06 +0200
> > Cédric Le Goater <clg@kaod.org> wrote:
> > 
> >> On 03/10/2019 14:00, Greg Kurz wrote:
> >>> The number of servers, ie. upper bound of the highest VCPU id, is
> >>> currently only needed to generate the "interrupt-controller" node
> >>> in the DT. Soon it will be needed to inform the XICS-on-XIVE KVM
> >>> device that it can allocates less resources in the XIVE HW.
> >>>
> >>> Add a method to XICSFabricClass for this purpose. 
> >>
> >> This is sPAPR code and PowerNV does not care.
> >>
> > 
> > Then PowerNV doesn't need to implement the method.
> > 
> >> why can not we simply call spapr_max_server_number(spapr) ?
> >>
> > 
> > Because the backend shouldn't reach out to sPAPR machine
> > internals. XICSFabric is the natural interface for ICS/ICP
> > if they need something from the machine.
> 
> From what I can see, xics_nr_servers() is only called by : 
> 
>   spapr_dt_xics(SpaprMachineState *spapr ...)
>   xics_kvm_connect(SpaprMachineState *spapr ...)
> 

Yes... and ?

> C. 
> 
> >>
> >>> Implement it
> >>> for sPAPR and use it to generate the "interrupt-controller" node.
> >>>
> >>> Signed-off-by: Greg Kurz <groug@kaod.org>
> >>> ---
> >>>  hw/intc/xics.c        |    7 +++++++
> >>>  hw/intc/xics_spapr.c  |    3 ++-
> >>>  hw/ppc/spapr.c        |    8 ++++++++
> >>>  include/hw/ppc/xics.h |    2 ++
> >>>  4 files changed, 19 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> >>> index dfe7dbd254ab..f82072935266 100644
> >>> --- a/hw/intc/xics.c
> >>> +++ b/hw/intc/xics.c
> >>> @@ -716,6 +716,13 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
> >>>      return xic->icp_get(xi, server);
> >>>  }
> >>>  
> >>> +uint32_t xics_nr_servers(XICSFabric *xi)
> >>> +{
> >>> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
> >>> +
> >>> +    return xic->nr_servers(xi);
> >>> +}
> >>> +
> >>>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
> >>>  {
> >>>      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
> >>> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> >>> index 6e5eb24b3cca..aa568ed0dc0d 100644
> >>> --- a/hw/intc/xics_spapr.c
> >>> +++ b/hw/intc/xics_spapr.c
> >>> @@ -311,8 +311,9 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
> >>>  void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> >>>                     uint32_t phandle)
> >>>  {
> >>> +    ICSState *ics = spapr->ics;
> >>>      uint32_t interrupt_server_ranges_prop[] = {
> >>> -        0, cpu_to_be32(nr_servers),
> >>> +        0, cpu_to_be32(xics_nr_servers(ics->xics)),
> >>>      };
> >>>      int node;
> >>>  
> >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>> index 514a17ae74d6..b8b9796c88e4 100644
> >>> --- a/hw/ppc/spapr.c
> >>> +++ b/hw/ppc/spapr.c
> >>> @@ -4266,6 +4266,13 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
> >>>      return cpu ? spapr_cpu_state(cpu)->icp : NULL;
> >>>  }
> >>>  
> >>> +static uint32_t spapr_nr_servers(XICSFabric *xi)
> >>> +{
> >>> +    SpaprMachineState *spapr = SPAPR_MACHINE(xi);
> >>> +
> >>> +    return spapr_max_server_number(spapr);
> >>> +}
> >>> +
> >>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> >>>                                   Monitor *mon)
> >>>  {
> >>> @@ -4423,6 +4430,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >>>      xic->ics_get = spapr_ics_get;
> >>>      xic->ics_resend = spapr_ics_resend;
> >>>      xic->icp_get = spapr_icp_get;
> >>> +    xic->nr_servers = spapr_nr_servers;
> >>>      ispc->print_info = spapr_pic_print_info;
> >>>      /* Force NUMA node memory size to be a multiple of
> >>>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
> >>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> >>> index 1e6a9300eb2b..e6bb1239e8f8 100644
> >>> --- a/include/hw/ppc/xics.h
> >>> +++ b/include/hw/ppc/xics.h
> >>> @@ -151,9 +151,11 @@ typedef struct XICSFabricClass {
> >>>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
> >>>      void (*ics_resend)(XICSFabric *xi);
> >>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
> >>> +    uint32_t (*nr_servers)(XICSFabric *xi);
> >>>  } XICSFabricClass;
> >>>  
> >>>  ICPState *xics_icp_get(XICSFabric *xi, int server);
> >>> +uint32_t xics_nr_servers(XICSFabric *xi);
> >>>  
> >>>  /* Internal XICS interfaces */
> >>>  void icp_set_cppr(ICPState *icp, uint8_t cppr);
> >>>
> >>
> > 
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method
  2019-10-03 13:02         ` Greg Kurz
@ 2019-10-03 13:19           ` Cédric Le Goater
  2019-10-03 13:41             ` Greg Kurz
  0 siblings, 1 reply; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-03 13:19 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson

On 03/10/2019 15:02, Greg Kurz wrote:
> On Thu, 3 Oct 2019 14:58:45 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> On 03/10/2019 14:49, Greg Kurz wrote:
>>> On Thu, 3 Oct 2019 14:24:06 +0200
>>> Cédric Le Goater <clg@kaod.org> wrote:
>>>
>>>> On 03/10/2019 14:00, Greg Kurz wrote:
>>>>> The number of servers, ie. upper bound of the highest VCPU id, is
>>>>> currently only needed to generate the "interrupt-controller" node
>>>>> in the DT. Soon it will be needed to inform the XICS-on-XIVE KVM
>>>>> device that it can allocates less resources in the XIVE HW.
>>>>>
>>>>> Add a method to XICSFabricClass for this purpose. 
>>>>
>>>> This is sPAPR code and PowerNV does not care.
>>>>
>>>
>>> Then PowerNV doesn't need to implement the method.
>>>
>>>> why can not we simply call spapr_max_server_number(spapr) ?
>>>>
>>>
>>> Because the backend shouldn't reach out to sPAPR machine
>>> internals. XICSFabric is the natural interface for ICS/ICP
>>> if they need something from the machine.
>>
>> From what I can see, xics_nr_servers() is only called by : 
>>
>>   spapr_dt_xics(SpaprMachineState *spapr ...)
>>   xics_kvm_connect(SpaprMachineState *spapr ...)
>>
> 
> Yes... and ?

so it is spapr only and we can use spapr_max_server_number(spapr)
without the need of a new XICSFabric handler.

C. 

>> C. 
>>
>>>>
>>>>> Implement it
>>>>> for sPAPR and use it to generate the "interrupt-controller" node.
>>>>>
>>>>> Signed-off-by: Greg Kurz <groug@kaod.org>
>>>>> ---
>>>>>  hw/intc/xics.c        |    7 +++++++
>>>>>  hw/intc/xics_spapr.c  |    3 ++-
>>>>>  hw/ppc/spapr.c        |    8 ++++++++
>>>>>  include/hw/ppc/xics.h |    2 ++
>>>>>  4 files changed, 19 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
>>>>> index dfe7dbd254ab..f82072935266 100644
>>>>> --- a/hw/intc/xics.c
>>>>> +++ b/hw/intc/xics.c
>>>>> @@ -716,6 +716,13 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
>>>>>      return xic->icp_get(xi, server);
>>>>>  }
>>>>>  
>>>>> +uint32_t xics_nr_servers(XICSFabric *xi)
>>>>> +{
>>>>> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
>>>>> +
>>>>> +    return xic->nr_servers(xi);
>>>>> +}
>>>>> +
>>>>>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
>>>>>  {
>>>>>      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
>>>>> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
>>>>> index 6e5eb24b3cca..aa568ed0dc0d 100644
>>>>> --- a/hw/intc/xics_spapr.c
>>>>> +++ b/hw/intc/xics_spapr.c
>>>>> @@ -311,8 +311,9 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
>>>>>  void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
>>>>>                     uint32_t phandle)
>>>>>  {
>>>>> +    ICSState *ics = spapr->ics;
>>>>>      uint32_t interrupt_server_ranges_prop[] = {
>>>>> -        0, cpu_to_be32(nr_servers),
>>>>> +        0, cpu_to_be32(xics_nr_servers(ics->xics)),
>>>>>      };
>>>>>      int node;
>>>>>  
>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>> index 514a17ae74d6..b8b9796c88e4 100644
>>>>> --- a/hw/ppc/spapr.c
>>>>> +++ b/hw/ppc/spapr.c
>>>>> @@ -4266,6 +4266,13 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>>>>>      return cpu ? spapr_cpu_state(cpu)->icp : NULL;
>>>>>  }
>>>>>  
>>>>> +static uint32_t spapr_nr_servers(XICSFabric *xi)
>>>>> +{
>>>>> +    SpaprMachineState *spapr = SPAPR_MACHINE(xi);
>>>>> +
>>>>> +    return spapr_max_server_number(spapr);
>>>>> +}
>>>>> +
>>>>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>>>>>                                   Monitor *mon)
>>>>>  {
>>>>> @@ -4423,6 +4430,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>>>>      xic->ics_get = spapr_ics_get;
>>>>>      xic->ics_resend = spapr_ics_resend;
>>>>>      xic->icp_get = spapr_icp_get;
>>>>> +    xic->nr_servers = spapr_nr_servers;
>>>>>      ispc->print_info = spapr_pic_print_info;
>>>>>      /* Force NUMA node memory size to be a multiple of
>>>>>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
>>>>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>>>>> index 1e6a9300eb2b..e6bb1239e8f8 100644
>>>>> --- a/include/hw/ppc/xics.h
>>>>> +++ b/include/hw/ppc/xics.h
>>>>> @@ -151,9 +151,11 @@ typedef struct XICSFabricClass {
>>>>>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
>>>>>      void (*ics_resend)(XICSFabric *xi);
>>>>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
>>>>> +    uint32_t (*nr_servers)(XICSFabric *xi);
>>>>>  } XICSFabricClass;
>>>>>  
>>>>>  ICPState *xics_icp_get(XICSFabric *xi, int server);
>>>>> +uint32_t xics_nr_servers(XICSFabric *xi);
>>>>>  
>>>>>  /* Internal XICS interfaces */
>>>>>  void icp_set_cppr(ICPState *icp, uint8_t cppr);
>>>>>
>>>>
>>>
>>
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method
  2019-10-03 13:19           ` Cédric Le Goater
@ 2019-10-03 13:41             ` Greg Kurz
  2019-10-03 13:59               ` Cédric Le Goater
  0 siblings, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 13:41 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Thu, 3 Oct 2019 15:19:22 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 03/10/2019 15:02, Greg Kurz wrote:
> > On Thu, 3 Oct 2019 14:58:45 +0200
> > Cédric Le Goater <clg@kaod.org> wrote:
> > 
> >> On 03/10/2019 14:49, Greg Kurz wrote:
> >>> On Thu, 3 Oct 2019 14:24:06 +0200
> >>> Cédric Le Goater <clg@kaod.org> wrote:
> >>>
> >>>> On 03/10/2019 14:00, Greg Kurz wrote:
> >>>>> The number of servers, ie. upper bound of the highest VCPU id, is
> >>>>> currently only needed to generate the "interrupt-controller" node
> >>>>> in the DT. Soon it will be needed to inform the XICS-on-XIVE KVM
> >>>>> device that it can allocates less resources in the XIVE HW.
> >>>>>
> >>>>> Add a method to XICSFabricClass for this purpose. 
> >>>>
> >>>> This is sPAPR code and PowerNV does not care.
> >>>>
> >>>
> >>> Then PowerNV doesn't need to implement the method.
> >>>
> >>>> why can not we simply call spapr_max_server_number(spapr) ?
> >>>>
> >>>
> >>> Because the backend shouldn't reach out to sPAPR machine
> >>> internals. XICSFabric is the natural interface for ICS/ICP
> >>> if they need something from the machine.
> >>
> >> From what I can see, xics_nr_servers() is only called by : 
> >>
> >>   spapr_dt_xics(SpaprMachineState *spapr ...)
> >>   xics_kvm_connect(SpaprMachineState *spapr ...)
> >>
> > 
> > Yes... and ?
> 
> so it is spapr only and we can use spapr_max_server_number(spapr)
> without the need of a new XICSFabric handler.
> 

Yet we did care to have an nr_server argument to spapr_dt_xics(), even
though we could have called spapr_max_server_number() since the
beginning. And we also care not to call spapr_max_server_number()
to setup nr_ends in XIVE. Right ?

Ideally spapr_max_server_number() should even be local to spapr.c
IMHO. It happens to be extern because spapr_irq_init() needs it for
sPAPR XIVE, but I think it should rather be passed as an argument.

Anyway, if this patch doesn't reach consensus, I'll switch to using
spapr_max_server_number()... and do the same in sPAPR XIVE for
consistency ;-)

> C. 
> 
> >> C. 
> >>
> >>>>
> >>>>> Implement it
> >>>>> for sPAPR and use it to generate the "interrupt-controller" node.
> >>>>>
> >>>>> Signed-off-by: Greg Kurz <groug@kaod.org>
> >>>>> ---
> >>>>>  hw/intc/xics.c        |    7 +++++++
> >>>>>  hw/intc/xics_spapr.c  |    3 ++-
> >>>>>  hw/ppc/spapr.c        |    8 ++++++++
> >>>>>  include/hw/ppc/xics.h |    2 ++
> >>>>>  4 files changed, 19 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> >>>>> index dfe7dbd254ab..f82072935266 100644
> >>>>> --- a/hw/intc/xics.c
> >>>>> +++ b/hw/intc/xics.c
> >>>>> @@ -716,6 +716,13 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
> >>>>>      return xic->icp_get(xi, server);
> >>>>>  }
> >>>>>  
> >>>>> +uint32_t xics_nr_servers(XICSFabric *xi)
> >>>>> +{
> >>>>> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
> >>>>> +
> >>>>> +    return xic->nr_servers(xi);
> >>>>> +}
> >>>>> +
> >>>>>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
> >>>>>  {
> >>>>>      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
> >>>>> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> >>>>> index 6e5eb24b3cca..aa568ed0dc0d 100644
> >>>>> --- a/hw/intc/xics_spapr.c
> >>>>> +++ b/hw/intc/xics_spapr.c
> >>>>> @@ -311,8 +311,9 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
> >>>>>  void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> >>>>>                     uint32_t phandle)
> >>>>>  {
> >>>>> +    ICSState *ics = spapr->ics;
> >>>>>      uint32_t interrupt_server_ranges_prop[] = {
> >>>>> -        0, cpu_to_be32(nr_servers),
> >>>>> +        0, cpu_to_be32(xics_nr_servers(ics->xics)),
> >>>>>      };
> >>>>>      int node;
> >>>>>  
> >>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>>>> index 514a17ae74d6..b8b9796c88e4 100644
> >>>>> --- a/hw/ppc/spapr.c
> >>>>> +++ b/hw/ppc/spapr.c
> >>>>> @@ -4266,6 +4266,13 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
> >>>>>      return cpu ? spapr_cpu_state(cpu)->icp : NULL;
> >>>>>  }
> >>>>>  
> >>>>> +static uint32_t spapr_nr_servers(XICSFabric *xi)
> >>>>> +{
> >>>>> +    SpaprMachineState *spapr = SPAPR_MACHINE(xi);
> >>>>> +
> >>>>> +    return spapr_max_server_number(spapr);
> >>>>> +}
> >>>>> +
> >>>>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> >>>>>                                   Monitor *mon)
> >>>>>  {
> >>>>> @@ -4423,6 +4430,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >>>>>      xic->ics_get = spapr_ics_get;
> >>>>>      xic->ics_resend = spapr_ics_resend;
> >>>>>      xic->icp_get = spapr_icp_get;
> >>>>> +    xic->nr_servers = spapr_nr_servers;
> >>>>>      ispc->print_info = spapr_pic_print_info;
> >>>>>      /* Force NUMA node memory size to be a multiple of
> >>>>>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
> >>>>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> >>>>> index 1e6a9300eb2b..e6bb1239e8f8 100644
> >>>>> --- a/include/hw/ppc/xics.h
> >>>>> +++ b/include/hw/ppc/xics.h
> >>>>> @@ -151,9 +151,11 @@ typedef struct XICSFabricClass {
> >>>>>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
> >>>>>      void (*ics_resend)(XICSFabric *xi);
> >>>>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
> >>>>> +    uint32_t (*nr_servers)(XICSFabric *xi);
> >>>>>  } XICSFabricClass;
> >>>>>  
> >>>>>  ICPState *xics_icp_get(XICSFabric *xi, int server);
> >>>>> +uint32_t xics_nr_servers(XICSFabric *xi);
> >>>>>  
> >>>>>  /* Internal XICS interfaces */
> >>>>>  void icp_set_cppr(ICPState *icp, uint8_t cppr);
> >>>>>
> >>>>
> >>>
> >>
> > 
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method
  2019-10-03 13:41             ` Greg Kurz
@ 2019-10-03 13:59               ` Cédric Le Goater
  2019-10-03 14:58                 ` Greg Kurz
  0 siblings, 1 reply; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-03 13:59 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson

On 03/10/2019 15:41, Greg Kurz wrote:
> On Thu, 3 Oct 2019 15:19:22 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> On 03/10/2019 15:02, Greg Kurz wrote:
>>> On Thu, 3 Oct 2019 14:58:45 +0200
>>> Cédric Le Goater <clg@kaod.org> wrote:
>>>
>>>> On 03/10/2019 14:49, Greg Kurz wrote:
>>>>> On Thu, 3 Oct 2019 14:24:06 +0200
>>>>> Cédric Le Goater <clg@kaod.org> wrote:
>>>>>
>>>>>> On 03/10/2019 14:00, Greg Kurz wrote:
>>>>>>> The number of servers, ie. upper bound of the highest VCPU id, is
>>>>>>> currently only needed to generate the "interrupt-controller" node
>>>>>>> in the DT. Soon it will be needed to inform the XICS-on-XIVE KVM
>>>>>>> device that it can allocates less resources in the XIVE HW.
>>>>>>>
>>>>>>> Add a method to XICSFabricClass for this purpose. 
>>>>>>
>>>>>> This is sPAPR code and PowerNV does not care.
>>>>>>
>>>>>
>>>>> Then PowerNV doesn't need to implement the method.
>>>>>
>>>>>> why can not we simply call spapr_max_server_number(spapr) ?
>>>>>>
>>>>>
>>>>> Because the backend shouldn't reach out to sPAPR machine
>>>>> internals. XICSFabric is the natural interface for ICS/ICP
>>>>> if they need something from the machine.
>>>>
>>>> From what I can see, xics_nr_servers() is only called by : 
>>>>
>>>>   spapr_dt_xics(SpaprMachineState *spapr ...)
>>>>   xics_kvm_connect(SpaprMachineState *spapr ...)
>>>>
>>>
>>> Yes... and ?
>>
>> so it is spapr only and we can use spapr_max_server_number(spapr)
>> without the need of a new XICSFabric handler.
>>
> 
> Yet we did care to have an nr_server argument to spapr_dt_xics(), even
> though we could have called spapr_max_server_number() since the
> beginning. 

yes it is sometime good practice to pass only what a routine needs 
and the whole state.

> And we also care not to call spapr_max_server_number()
> to setup nr_ends in XIVE. Right ?

yes. That's handled with a property.
 
> Ideally spapr_max_server_number() should even be local to spapr.c
> IMHO. It happens to be extern because spapr_irq_init() needs it for
> sPAPR XIVE, but I think it should rather be passed as an argument.

That will be the case again  when David has finished cleaning it up.

> Anyway, if this patch doesn't reach consensus, I'll switch to using
> spapr_max_server_number()... and do the same in sPAPR XIVE for
> consistency ;-)

I don't see the problem sPAPR XIVE. There is a property "nr-server".

The XICS Fabric was introduced to handle differences between the
PowerNV machine and spapr machine mostly. We can extend its use
to abstract the interface between the machine and its device models
but, in that case, if we want consistency, we should then remove 
the use of SpaprMachinestate from xics_kvm to begin with and use 
only XICSFabric handlers. This is not the case today.

Because this is a spapr model only. Why bother ? This would add 
just extra and useless ops. 

You should wait for David to finish the cleanup. I think that 
at end we will only do pass a nr_servers to a couple of routine.
kvmppc_xive_connect() might need an extra argument.


C.


> 
>> C. 
>>
>>>> C. 
>>>>
>>>>>>
>>>>>>> Implement it
>>>>>>> for sPAPR and use it to generate the "interrupt-controller" node.
>>>>>>>
>>>>>>> Signed-off-by: Greg Kurz <groug@kaod.org>
>>>>>>> ---
>>>>>>>  hw/intc/xics.c        |    7 +++++++
>>>>>>>  hw/intc/xics_spapr.c  |    3 ++-
>>>>>>>  hw/ppc/spapr.c        |    8 ++++++++
>>>>>>>  include/hw/ppc/xics.h |    2 ++
>>>>>>>  4 files changed, 19 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
>>>>>>> index dfe7dbd254ab..f82072935266 100644
>>>>>>> --- a/hw/intc/xics.c
>>>>>>> +++ b/hw/intc/xics.c
>>>>>>> @@ -716,6 +716,13 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
>>>>>>>      return xic->icp_get(xi, server);
>>>>>>>  }
>>>>>>>  
>>>>>>> +uint32_t xics_nr_servers(XICSFabric *xi)
>>>>>>> +{
>>>>>>> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
>>>>>>> +
>>>>>>> +    return xic->nr_servers(xi);
>>>>>>> +}
>>>>>>> +
>>>>>>>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
>>>>>>>  {
>>>>>>>      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
>>>>>>> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
>>>>>>> index 6e5eb24b3cca..aa568ed0dc0d 100644
>>>>>>> --- a/hw/intc/xics_spapr.c
>>>>>>> +++ b/hw/intc/xics_spapr.c
>>>>>>> @@ -311,8 +311,9 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
>>>>>>>  void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
>>>>>>>                     uint32_t phandle)
>>>>>>>  {
>>>>>>> +    ICSState *ics = spapr->ics;
>>>>>>>      uint32_t interrupt_server_ranges_prop[] = {
>>>>>>> -        0, cpu_to_be32(nr_servers),
>>>>>>> +        0, cpu_to_be32(xics_nr_servers(ics->xics)),
>>>>>>>      };
>>>>>>>      int node;
>>>>>>>  
>>>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>>>> index 514a17ae74d6..b8b9796c88e4 100644
>>>>>>> --- a/hw/ppc/spapr.c
>>>>>>> +++ b/hw/ppc/spapr.c
>>>>>>> @@ -4266,6 +4266,13 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>>>>>>>      return cpu ? spapr_cpu_state(cpu)->icp : NULL;
>>>>>>>  }
>>>>>>>  
>>>>>>> +static uint32_t spapr_nr_servers(XICSFabric *xi)
>>>>>>> +{
>>>>>>> +    SpaprMachineState *spapr = SPAPR_MACHINE(xi);
>>>>>>> +
>>>>>>> +    return spapr_max_server_number(spapr);
>>>>>>> +}
>>>>>>> +
>>>>>>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>>>>>>>                                   Monitor *mon)
>>>>>>>  {
>>>>>>> @@ -4423,6 +4430,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>>>>>>      xic->ics_get = spapr_ics_get;
>>>>>>>      xic->ics_resend = spapr_ics_resend;
>>>>>>>      xic->icp_get = spapr_icp_get;
>>>>>>> +    xic->nr_servers = spapr_nr_servers;
>>>>>>>      ispc->print_info = spapr_pic_print_info;
>>>>>>>      /* Force NUMA node memory size to be a multiple of
>>>>>>>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
>>>>>>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>>>>>>> index 1e6a9300eb2b..e6bb1239e8f8 100644
>>>>>>> --- a/include/hw/ppc/xics.h
>>>>>>> +++ b/include/hw/ppc/xics.h
>>>>>>> @@ -151,9 +151,11 @@ typedef struct XICSFabricClass {
>>>>>>>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
>>>>>>>      void (*ics_resend)(XICSFabric *xi);
>>>>>>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
>>>>>>> +    uint32_t (*nr_servers)(XICSFabric *xi);
>>>>>>>  } XICSFabricClass;
>>>>>>>  
>>>>>>>  ICPState *xics_icp_get(XICSFabric *xi, int server);
>>>>>>> +uint32_t xics_nr_servers(XICSFabric *xi);
>>>>>>>  
>>>>>>>  /* Internal XICS interfaces */
>>>>>>>  void icp_set_cppr(ICPState *icp, uint8_t cppr);
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method
  2019-10-03 13:59               ` Cédric Le Goater
@ 2019-10-03 14:58                 ` Greg Kurz
  0 siblings, 0 replies; 31+ messages in thread
From: Greg Kurz @ 2019-10-03 14:58 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Thu, 3 Oct 2019 15:59:29 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 03/10/2019 15:41, Greg Kurz wrote:
> > On Thu, 3 Oct 2019 15:19:22 +0200
> > Cédric Le Goater <clg@kaod.org> wrote:
> > 
> >> On 03/10/2019 15:02, Greg Kurz wrote:
> >>> On Thu, 3 Oct 2019 14:58:45 +0200
> >>> Cédric Le Goater <clg@kaod.org> wrote:
> >>>
> >>>> On 03/10/2019 14:49, Greg Kurz wrote:
> >>>>> On Thu, 3 Oct 2019 14:24:06 +0200
> >>>>> Cédric Le Goater <clg@kaod.org> wrote:
> >>>>>
> >>>>>> On 03/10/2019 14:00, Greg Kurz wrote:
> >>>>>>> The number of servers, ie. upper bound of the highest VCPU id, is
> >>>>>>> currently only needed to generate the "interrupt-controller" node
> >>>>>>> in the DT. Soon it will be needed to inform the XICS-on-XIVE KVM
> >>>>>>> device that it can allocates less resources in the XIVE HW.
> >>>>>>>
> >>>>>>> Add a method to XICSFabricClass for this purpose. 
> >>>>>>
> >>>>>> This is sPAPR code and PowerNV does not care.
> >>>>>>
> >>>>>
> >>>>> Then PowerNV doesn't need to implement the method.
> >>>>>
> >>>>>> why can not we simply call spapr_max_server_number(spapr) ?
> >>>>>>
> >>>>>
> >>>>> Because the backend shouldn't reach out to sPAPR machine
> >>>>> internals. XICSFabric is the natural interface for ICS/ICP
> >>>>> if they need something from the machine.
> >>>>
> >>>> From what I can see, xics_nr_servers() is only called by : 
> >>>>
> >>>>   spapr_dt_xics(SpaprMachineState *spapr ...)
> >>>>   xics_kvm_connect(SpaprMachineState *spapr ...)
> >>>>
> >>>
> >>> Yes... and ?
> >>
> >> so it is spapr only and we can use spapr_max_server_number(spapr)
> >> without the need of a new XICSFabric handler.
> >>
> > 
> > Yet we did care to have an nr_server argument to spapr_dt_xics(), even
> > though we could have called spapr_max_server_number() since the
> > beginning. 
> 
> yes it is sometime good practice to pass only what a routine needs 
> and the whole state.
> 

Sure.

> > And we also care not to call spapr_max_server_number()
> > to setup nr_ends in XIVE. Right ?
> 
> yes. That's handled with a property.
>  

Yes and the same could be done if we had a derived type
for sPAPR XICS.

> > Ideally spapr_max_server_number() should even be local to spapr.c
> > IMHO. It happens to be extern because spapr_irq_init() needs it for
> > sPAPR XIVE, but I think it should rather be passed as an argument.
> 
> That will be the case again  when David has finished cleaning it up.
> 

I fully agree. I posted this series based on the current state of
ppc-for-4.2 in order to start the discussion, but I'm looking
forward to base it on the proper backend models that David is
cooking up for us :)

> > Anyway, if this patch doesn't reach consensus, I'll switch to using
> > spapr_max_server_number()... and do the same in sPAPR XIVE for
> > consistency ;-)
> 
> I don't see the problem sPAPR XIVE. There is a property "nr-server".
> 

Not before patch 2 in this series :) and it could be dropped and
replaced by a call to spapr_max_server_number() during realize.

> The XICS Fabric was introduced to handle differences between the
> PowerNV machine and spapr machine mostly. We can extend its use
> to abstract the interface between the machine and its device models

Well... this was suggested by David in some other mail...

> but, in that case, if we want consistency, we should then remove 
> the use of SpaprMachinestate from xics_kvm to begin with and use 
> only XICSFabric handlers. This is not the case today.
> 

I must admit I'm not a big fan of seeing SpaprMachinestate being
used in a lot of places like a jack-of-all-trade structure that
allows to access anything you want (even if you should not).

> Because this is a spapr model only. Why bother ? This would add 
> just extra and useless ops. 
> 
> You should wait for David to finish the cleanup. I think that 
> at end we will only do pass a nr_servers to a couple of routine.
> kvmppc_xive_connect() might need an extra argument.
> 

This is how I was doing when I started to work on this :)

> 
> C.
> 
> 
> > 
> >> C. 
> >>
> >>>> C. 
> >>>>
> >>>>>>
> >>>>>>> Implement it
> >>>>>>> for sPAPR and use it to generate the "interrupt-controller" node.
> >>>>>>>
> >>>>>>> Signed-off-by: Greg Kurz <groug@kaod.org>
> >>>>>>> ---
> >>>>>>>  hw/intc/xics.c        |    7 +++++++
> >>>>>>>  hw/intc/xics_spapr.c  |    3 ++-
> >>>>>>>  hw/ppc/spapr.c        |    8 ++++++++
> >>>>>>>  include/hw/ppc/xics.h |    2 ++
> >>>>>>>  4 files changed, 19 insertions(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> >>>>>>> index dfe7dbd254ab..f82072935266 100644
> >>>>>>> --- a/hw/intc/xics.c
> >>>>>>> +++ b/hw/intc/xics.c
> >>>>>>> @@ -716,6 +716,13 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
> >>>>>>>      return xic->icp_get(xi, server);
> >>>>>>>  }
> >>>>>>>  
> >>>>>>> +uint32_t xics_nr_servers(XICSFabric *xi)
> >>>>>>> +{
> >>>>>>> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
> >>>>>>> +
> >>>>>>> +    return xic->nr_servers(xi);
> >>>>>>> +}
> >>>>>>> +
> >>>>>>>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
> >>>>>>>  {
> >>>>>>>      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
> >>>>>>> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> >>>>>>> index 6e5eb24b3cca..aa568ed0dc0d 100644
> >>>>>>> --- a/hw/intc/xics_spapr.c
> >>>>>>> +++ b/hw/intc/xics_spapr.c
> >>>>>>> @@ -311,8 +311,9 @@ static void ics_spapr_realize(DeviceState *dev, Error **errp)
> >>>>>>>  void spapr_dt_xics(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> >>>>>>>                     uint32_t phandle)
> >>>>>>>  {
> >>>>>>> +    ICSState *ics = spapr->ics;
> >>>>>>>      uint32_t interrupt_server_ranges_prop[] = {
> >>>>>>> -        0, cpu_to_be32(nr_servers),
> >>>>>>> +        0, cpu_to_be32(xics_nr_servers(ics->xics)),
> >>>>>>>      };
> >>>>>>>      int node;
> >>>>>>>  
> >>>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>>>>>> index 514a17ae74d6..b8b9796c88e4 100644
> >>>>>>> --- a/hw/ppc/spapr.c
> >>>>>>> +++ b/hw/ppc/spapr.c
> >>>>>>> @@ -4266,6 +4266,13 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
> >>>>>>>      return cpu ? spapr_cpu_state(cpu)->icp : NULL;
> >>>>>>>  }
> >>>>>>>  
> >>>>>>> +static uint32_t spapr_nr_servers(XICSFabric *xi)
> >>>>>>> +{
> >>>>>>> +    SpaprMachineState *spapr = SPAPR_MACHINE(xi);
> >>>>>>> +
> >>>>>>> +    return spapr_max_server_number(spapr);
> >>>>>>> +}
> >>>>>>> +
> >>>>>>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> >>>>>>>                                   Monitor *mon)
> >>>>>>>  {
> >>>>>>> @@ -4423,6 +4430,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >>>>>>>      xic->ics_get = spapr_ics_get;
> >>>>>>>      xic->ics_resend = spapr_ics_resend;
> >>>>>>>      xic->icp_get = spapr_icp_get;
> >>>>>>> +    xic->nr_servers = spapr_nr_servers;
> >>>>>>>      ispc->print_info = spapr_pic_print_info;
> >>>>>>>      /* Force NUMA node memory size to be a multiple of
> >>>>>>>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
> >>>>>>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> >>>>>>> index 1e6a9300eb2b..e6bb1239e8f8 100644
> >>>>>>> --- a/include/hw/ppc/xics.h
> >>>>>>> +++ b/include/hw/ppc/xics.h
> >>>>>>> @@ -151,9 +151,11 @@ typedef struct XICSFabricClass {
> >>>>>>>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
> >>>>>>>      void (*ics_resend)(XICSFabric *xi);
> >>>>>>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
> >>>>>>> +    uint32_t (*nr_servers)(XICSFabric *xi);
> >>>>>>>  } XICSFabricClass;
> >>>>>>>  
> >>>>>>>  ICPState *xics_icp_get(XICSFabric *xi, int server);
> >>>>>>> +uint32_t xics_nr_servers(XICSFabric *xi);
> >>>>>>>  
> >>>>>>>  /* Internal XICS interfaces */
> >>>>>>>  void icp_set_cppr(ICPState *icp, uint8_t cppr);
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> > 
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property
  2019-10-03 12:01 ` [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property Greg Kurz
  2019-10-03 12:21   ` Cédric Le Goater
@ 2019-10-04  4:07   ` David Gibson
  2019-10-04  5:53     ` Cédric Le Goater
  2019-10-04  6:51     ` Greg Kurz
  1 sibling, 2 replies; 31+ messages in thread
From: David Gibson @ 2019-10-04  4:07 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, Cédric Le Goater, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 5568 bytes --]

On Thu, Oct 03, 2019 at 02:01:13PM +0200, Greg Kurz wrote:
> The sPAPR XIVE object has an nr_ends field which happens to be a
> multiple of spapr_max_server_number(). It is currently set with
> the help of "nr-ends" property. This is a bit unfortunate since
> it exposes to the sPAPR irq frontend what should remain an
> implemantation detail within the XIVE backend.
> 
> It will be possible soon to inform the XIVE KVM device about the
> range of VCPU ids that may be used in the VM, as returned by the
> spapr_max_server_number() function. This will allow the device
> to substantially reduce the consumption of scarce resources
> in the XIVE HW.
> 
> For both reasons, replace the "nr-ends" property with an "nr-servers"
> one. The existing nr_ends field must be kept though since it tells how
> many ENDs are migrated, it is derived from "nr-servers" at realize time
> for simplicity. Convert spapr_dt_xive() to use it as well.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/intc/spapr_xive.c        |   21 ++++++++++++++++-----
>  hw/ppc/spapr_irq.c          |    2 +-
>  include/hw/ppc/spapr_xive.h |    1 +
>  3 files changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 04879abf2e7a..62888ddc68db 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -99,6 +99,15 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
>      return 0;
>  }
>  
> +static uint32_t spapr_xive_vcpu_id_to_end_idx(uint32_t vcpu_id)
> +{
> +    /*
> +     * 8 XIVE END structures per CPU. One for each available
> +     * priority
> +     */
> +    return vcpu_id << 3;
> +}
> +
>  static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
>                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
>  {
> @@ -109,7 +118,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
>      }
>  
>      if (out_end_idx) {
> -        *out_end_idx = (cpu->vcpu_id << 3) + prio;
> +        *out_end_idx = spapr_xive_vcpu_id_to_end_idx(cpu->vcpu_id) + prio;
>      }
>  }
>  
> @@ -283,11 +292,13 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>          return;
>      }
>  
> -    if (!xive->nr_ends) {
> -        error_setg(errp, "Number of interrupt needs to be greater 0");
> +    if (!xive->nr_servers) {
> +        error_setg(errp, "Number of interrupt servers must be greater than 0");
>          return;
>      }
>  
> +    xive->nr_ends = spapr_xive_vcpu_id_to_end_idx(xive->nr_servers);

I'd prefer not to store both nr_servers and nr_servers * 8 in the
structure.  I think you just want xive->nr_servers, then derive it any
any places that current look at xive->nr_ends.

>      /*
>       * Initialize the internal sources, for IPIs and virtual devices.
>       */
> @@ -489,7 +500,7 @@ static const VMStateDescription vmstate_spapr_xive = {
>  
>  static Property spapr_xive_properties[] = {
>      DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
> -    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
> +    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),

Technically speaking the user can reach in using -global and modify
QOM properties like this, so this is arguably an interface breakage.
That said, I've always thought it was kind of a problem that the way
QOM is used internally thereby exposes as interface a bunch of things
that are really intended to be internal.

So... I'm inclined to go ahead with this anyway.  I won't tell if you
don't.

>      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
>      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
>      DEFINE_PROP_END_OF_LIST(),
> @@ -1550,7 +1561,7 @@ void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
>      /* Interrupt number ranges for the IPIs */
>      uint32_t lisn_ranges[] = {
>          cpu_to_be32(0),
> -        cpu_to_be32(nr_servers),
> +        cpu_to_be32(xive->nr_servers),
>      };
>      /*
>       * EQ size - the sizes of pages supported by the system 4K, 64K,
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 457eabe24cda..025fd00143a2 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -591,7 +591,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>           * 8 XIVE END structures per CPU. One for each available
>           * priority
>           */
> -        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> +        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
>          qdev_init_nofail(dev);
>  
>          spapr->xive = SPAPR_XIVE(dev);
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 0df20a6590a5..4a4a6fc6be7f 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -22,6 +22,7 @@ typedef struct SpaprXive {
>      /* Internal interrupt source for IPIs and virtual devices */
>      XiveSource    source;
>      hwaddr        vc_base;
> +    uint32_t      nr_servers;

This is a basic paraneter, not really related to the internal source
structure, so I'd move it up above that comment there.

>  
>      /* END ESB MMIOs */
>      XiveENDSource end_source;
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property
  2019-10-04  4:07   ` David Gibson
@ 2019-10-04  5:53     ` Cédric Le Goater
  2019-10-04  6:52       ` Greg Kurz
  2019-10-04  6:51     ` Greg Kurz
  1 sibling, 1 reply; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-04  5:53 UTC (permalink / raw)
  To: David Gibson, Greg Kurz; +Cc: qemu-ppc, qemu-devel

>> @@ -283,11 +292,13 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>>          return;
>>      }
>>  
>> -    if (!xive->nr_ends) {
>> -        error_setg(errp, "Number of interrupt needs to be greater 0");
>> +    if (!xive->nr_servers) {
>> +        error_setg(errp, "Number of interrupt servers must be greater than 0");
>>          return;
>>      }
>>  
>> +    xive->nr_ends = spapr_xive_vcpu_id_to_end_idx(xive->nr_servers);
> 
> I'd prefer not to store both nr_servers and nr_servers * 8 in the
> structure.  I think you just want xive->nr_servers, then derive it any
> any places that current look at xive->nr_ends.

Yes I agree. This is a small change.

C. 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property
  2019-10-04  4:07   ` David Gibson
  2019-10-04  5:53     ` Cédric Le Goater
@ 2019-10-04  6:51     ` Greg Kurz
  2019-10-05 10:23       ` David Gibson
  1 sibling, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-04  6:51 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, Cédric Le Goater, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 7111 bytes --]

On Fri, 4 Oct 2019 14:07:25 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Oct 03, 2019 at 02:01:13PM +0200, Greg Kurz wrote:
> > The sPAPR XIVE object has an nr_ends field which happens to be a
> > multiple of spapr_max_server_number(). It is currently set with
> > the help of "nr-ends" property. This is a bit unfortunate since
> > it exposes to the sPAPR irq frontend what should remain an
> > implemantation detail within the XIVE backend.
> > 
> > It will be possible soon to inform the XIVE KVM device about the
> > range of VCPU ids that may be used in the VM, as returned by the
> > spapr_max_server_number() function. This will allow the device
> > to substantially reduce the consumption of scarce resources
> > in the XIVE HW.
> > 
> > For both reasons, replace the "nr-ends" property with an "nr-servers"
> > one. The existing nr_ends field must be kept though since it tells how
> > many ENDs are migrated, it is derived from "nr-servers" at realize time
> > for simplicity. Convert spapr_dt_xive() to use it as well.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/intc/spapr_xive.c        |   21 ++++++++++++++++-----
> >  hw/ppc/spapr_irq.c          |    2 +-
> >  include/hw/ppc/spapr_xive.h |    1 +
> >  3 files changed, 18 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index 04879abf2e7a..62888ddc68db 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -99,6 +99,15 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> >      return 0;
> >  }
> >  
> > +static uint32_t spapr_xive_vcpu_id_to_end_idx(uint32_t vcpu_id)
> > +{
> > +    /*
> > +     * 8 XIVE END structures per CPU. One for each available
> > +     * priority
> > +     */
> > +    return vcpu_id << 3;
> > +}
> > +
> >  static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> >                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
> >  {
> > @@ -109,7 +118,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> >      }
> >  
> >      if (out_end_idx) {
> > -        *out_end_idx = (cpu->vcpu_id << 3) + prio;
> > +        *out_end_idx = spapr_xive_vcpu_id_to_end_idx(cpu->vcpu_id) + prio;
> >      }
> >  }
> >  
> > @@ -283,11 +292,13 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >          return;
> >      }
> >  
> > -    if (!xive->nr_ends) {
> > -        error_setg(errp, "Number of interrupt needs to be greater 0");
> > +    if (!xive->nr_servers) {
> > +        error_setg(errp, "Number of interrupt servers must be greater than 0");
> >          return;
> >      }
> >  
> > +    xive->nr_ends = spapr_xive_vcpu_id_to_end_idx(xive->nr_servers);
> 
> I'd prefer not to store both nr_servers and nr_servers * 8 in the
> structure.  I think you just want xive->nr_servers, then derive it any
> any places that current look at xive->nr_ends.
> 

Of course but unfortunately nr_ends is involved in migration:

static const VMStateDescription vmstate_spapr_xive = {
    .name = TYPE_SPAPR_XIVE,
    .version_id = 1,
    .minimum_version_id = 1,
    .pre_save = vmstate_spapr_xive_pre_save,
    .post_load = NULL, /* handled at the machine level */
    .fields = (VMStateField[]) {
        VMSTATE_UINT32_EQUAL(nr_irqs, SpaprXive, NULL),
        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(eat, SpaprXive, nr_irqs,
                                     vmstate_spapr_xive_eas, XiveEAS),
        VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends,
                                                             ^^^^^^^^^^
                                             vmstate_spapr_xive_end, XiveEND),
        VMSTATE_END_OF_LIST()
    },
};

and we certainly cannot put nr_servers << 3 here. I suppose I should
emphasize that even more in the changelog.

We could possibly rename nr_ends to mig_nr_ends, and possibly only
compute at migration time, but it doesn't buy much IMHO.

> >      /*
> >       * Initialize the internal sources, for IPIs and virtual devices.
> >       */
> > @@ -489,7 +500,7 @@ static const VMStateDescription vmstate_spapr_xive = {
> >  
> >  static Property spapr_xive_properties[] = {
> >      DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
> > -    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
> > +    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
> 
> Technically speaking the user can reach in using -global and modify
> QOM properties like this, so this is arguably an interface breakage.

Drat, I didn't think about this one... :-\ but it seems that "spapr-xive"
isn't user creatable and doesn't appear in the output of '-device help'.
Not sure how -global behaves in this case...

> That said, I've always thought it was kind of a problem that the way
> QOM is used internally thereby exposes as interface a bunch of things
> that are really intended to be internal.
> 

I tend to agree, nr-ends is really an internal detail that a typical
sPAPR user should probably never see.

> So... I'm inclined to go ahead with this anyway.  I won't tell if you
> don't.
> 
> >      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
> >      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
> >      DEFINE_PROP_END_OF_LIST(),
> > @@ -1550,7 +1561,7 @@ void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> >      /* Interrupt number ranges for the IPIs */
> >      uint32_t lisn_ranges[] = {
> >          cpu_to_be32(0),
> > -        cpu_to_be32(nr_servers),
> > +        cpu_to_be32(xive->nr_servers),
> >      };
> >      /*
> >       * EQ size - the sizes of pages supported by the system 4K, 64K,
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 457eabe24cda..025fd00143a2 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -591,7 +591,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
> >           * 8 XIVE END structures per CPU. One for each available
> >           * priority
> >           */
> > -        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> > +        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
> >          qdev_init_nofail(dev);
> >  
> >          spapr->xive = SPAPR_XIVE(dev);
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index 0df20a6590a5..4a4a6fc6be7f 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -22,6 +22,7 @@ typedef struct SpaprXive {
> >      /* Internal interrupt source for IPIs and virtual devices */
> >      XiveSource    source;
> >      hwaddr        vc_base;
> > +    uint32_t      nr_servers;
> 
> This is a basic paraneter, not really related to the internal source
> structure, so I'd move it up above that comment there.
> 

Sure.

> >  
> >      /* END ESB MMIOs */
> >      XiveENDSource end_source;
> > 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property
  2019-10-04  5:53     ` Cédric Le Goater
@ 2019-10-04  6:52       ` Greg Kurz
  2019-10-04  7:27         ` Cédric Le Goater
  0 siblings, 1 reply; 31+ messages in thread
From: Greg Kurz @ 2019-10-04  6:52 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, David Gibson

On Fri, 4 Oct 2019 07:53:13 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> >> @@ -283,11 +292,13 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >>          return;
> >>      }
> >>  
> >> -    if (!xive->nr_ends) {
> >> -        error_setg(errp, "Number of interrupt needs to be greater 0");
> >> +    if (!xive->nr_servers) {
> >> +        error_setg(errp, "Number of interrupt servers must be greater than 0");
> >>          return;
> >>      }
> >>  
> >> +    xive->nr_ends = spapr_xive_vcpu_id_to_end_idx(xive->nr_servers);
> > 
> > I'd prefer not to store both nr_servers and nr_servers * 8 in the
> > structure.  I think you just want xive->nr_servers, then derive it any
> > any places that current look at xive->nr_ends.
> 
> Yes I agree. This is a small change.
> 

I'm afraid it isn't that simple (see my other mail).

> C. 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property
  2019-10-04  6:52       ` Greg Kurz
@ 2019-10-04  7:27         ` Cédric Le Goater
  0 siblings, 0 replies; 31+ messages in thread
From: Cédric Le Goater @ 2019-10-04  7:27 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson

On 04/10/2019 08:52, Greg Kurz wrote:
> On Fri, 4 Oct 2019 07:53:13 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>>>> @@ -283,11 +292,13 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>>>>          return;
>>>>      }
>>>>  
>>>> -    if (!xive->nr_ends) {
>>>> -        error_setg(errp, "Number of interrupt needs to be greater 0");
>>>> +    if (!xive->nr_servers) {
>>>> +        error_setg(errp, "Number of interrupt servers must be greater than 0");
>>>>          return;
>>>>      }
>>>>  
>>>> +    xive->nr_ends = spapr_xive_vcpu_id_to_end_idx(xive->nr_servers);
>>>
>>> I'd prefer not to store both nr_servers and nr_servers * 8 in the
>>> structure.  I think you just want xive->nr_servers, then derive it any
>>> any places that current look at xive->nr_ends.
>>
>> Yes I agree. This is a small change.
>>
> 
> I'm afraid it isn't that simple (see my other mail).

yes. I had forgotten about the vmstate ...




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property
  2019-10-04  6:51     ` Greg Kurz
@ 2019-10-05 10:23       ` David Gibson
  0 siblings, 0 replies; 31+ messages in thread
From: David Gibson @ 2019-10-05 10:23 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, Cédric Le Goater, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 7929 bytes --]

On Fri, Oct 04, 2019 at 08:51:19AM +0200, Greg Kurz wrote:
> On Fri, 4 Oct 2019 14:07:25 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Thu, Oct 03, 2019 at 02:01:13PM +0200, Greg Kurz wrote:
> > > The sPAPR XIVE object has an nr_ends field which happens to be a
> > > multiple of spapr_max_server_number(). It is currently set with
> > > the help of "nr-ends" property. This is a bit unfortunate since
> > > it exposes to the sPAPR irq frontend what should remain an
> > > implemantation detail within the XIVE backend.
> > > 
> > > It will be possible soon to inform the XIVE KVM device about the
> > > range of VCPU ids that may be used in the VM, as returned by the
> > > spapr_max_server_number() function. This will allow the device
> > > to substantially reduce the consumption of scarce resources
> > > in the XIVE HW.
> > > 
> > > For both reasons, replace the "nr-ends" property with an "nr-servers"
> > > one. The existing nr_ends field must be kept though since it tells how
> > > many ENDs are migrated, it is derived from "nr-servers" at realize time
> > > for simplicity. Convert spapr_dt_xive() to use it as well.
> > > 
> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > ---
> > >  hw/intc/spapr_xive.c        |   21 ++++++++++++++++-----
> > >  hw/ppc/spapr_irq.c          |    2 +-
> > >  include/hw/ppc/spapr_xive.h |    1 +
> > >  3 files changed, 18 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > > index 04879abf2e7a..62888ddc68db 100644
> > > --- a/hw/intc/spapr_xive.c
> > > +++ b/hw/intc/spapr_xive.c
> > > @@ -99,6 +99,15 @@ int spapr_xive_end_to_target(uint8_t end_blk, uint32_t end_idx,
> > >      return 0;
> > >  }
> > >  
> > > +static uint32_t spapr_xive_vcpu_id_to_end_idx(uint32_t vcpu_id)
> > > +{
> > > +    /*
> > > +     * 8 XIVE END structures per CPU. One for each available
> > > +     * priority
> > > +     */
> > > +    return vcpu_id << 3;
> > > +}
> > > +
> > >  static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> > >                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
> > >  {
> > > @@ -109,7 +118,7 @@ static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
> > >      }
> > >  
> > >      if (out_end_idx) {
> > > -        *out_end_idx = (cpu->vcpu_id << 3) + prio;
> > > +        *out_end_idx = spapr_xive_vcpu_id_to_end_idx(cpu->vcpu_id) + prio;
> > >      }
> > >  }
> > >  
> > > @@ -283,11 +292,13 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> > >          return;
> > >      }
> > >  
> > > -    if (!xive->nr_ends) {
> > > -        error_setg(errp, "Number of interrupt needs to be greater 0");
> > > +    if (!xive->nr_servers) {
> > > +        error_setg(errp, "Number of interrupt servers must be greater than 0");
> > >          return;
> > >      }
> > >  
> > > +    xive->nr_ends = spapr_xive_vcpu_id_to_end_idx(xive->nr_servers);
> > 
> > I'd prefer not to store both nr_servers and nr_servers * 8 in the
> > structure.  I think you just want xive->nr_servers, then derive it any
> > any places that current look at xive->nr_ends.
> > 
> 
> Of course but unfortunately nr_ends is involved in migration:
> 
> static const VMStateDescription vmstate_spapr_xive = {
>     .name = TYPE_SPAPR_XIVE,
>     .version_id = 1,
>     .minimum_version_id = 1,
>     .pre_save = vmstate_spapr_xive_pre_save,
>     .post_load = NULL, /* handled at the machine level */
>     .fields = (VMStateField[]) {
>         VMSTATE_UINT32_EQUAL(nr_irqs, SpaprXive, NULL),
>         VMSTATE_STRUCT_VARRAY_POINTER_UINT32(eat, SpaprXive, nr_irqs,
>                                      vmstate_spapr_xive_eas, XiveEAS),
>         VMSTATE_STRUCT_VARRAY_POINTER_UINT32(endt, SpaprXive, nr_ends,
>                                                              ^^^^^^^^^^
>                                              vmstate_spapr_xive_end, XiveEND),
>         VMSTATE_END_OF_LIST()
>     },
> };
> 
> and we certainly cannot put nr_servers << 3 here. I suppose I should
> emphasize that even more in the changelog.

Ah, dangit, I forgot about that, though I've hit the same problem
before.  I'm actually less concerned about a comment in the changelog,
than in on the definition in the structure itself.

> We could possibly rename nr_ends to mig_nr_ends, and possibly only
> compute at migration time, but it doesn't buy much IMHO.

No, indeed not.

> > >      /*
> > >       * Initialize the internal sources, for IPIs and virtual devices.
> > >       */
> > > @@ -489,7 +500,7 @@ static const VMStateDescription vmstate_spapr_xive = {
> > >  
> > >  static Property spapr_xive_properties[] = {
> > >      DEFINE_PROP_UINT32("nr-irqs", SpaprXive, nr_irqs, 0),
> > > -    DEFINE_PROP_UINT32("nr-ends", SpaprXive, nr_ends, 0),
> > > +    DEFINE_PROP_UINT32("nr-servers", SpaprXive, nr_servers, 0),
> > 
> > Technically speaking the user can reach in using -global and modify
> > QOM properties like this, so this is arguably an interface breakage.
> 
> Drat, I didn't think about this one... :-\ but it seems that "spapr-xive"
> isn't user creatable and doesn't appear in the output of '-device help'.
> Not sure how -global behaves in this case...
> 
> > That said, I've always thought it was kind of a problem that the way
> > QOM is used internally thereby exposes as interface a bunch of things
> > that are really intended to be internal.
> > 
> 
> I tend to agree, nr-ends is really an internal detail that a typical
> sPAPR user should probably never see.
> 
> > So... I'm inclined to go ahead with this anyway.  I won't tell if you
> > don't.
> > 
> > >      DEFINE_PROP_UINT64("vc-base", SpaprXive, vc_base, SPAPR_XIVE_VC_BASE),
> > >      DEFINE_PROP_UINT64("tm-base", SpaprXive, tm_base, SPAPR_XIVE_TM_BASE),
> > >      DEFINE_PROP_END_OF_LIST(),
> > > @@ -1550,7 +1561,7 @@ void spapr_dt_xive(SpaprMachineState *spapr, uint32_t nr_servers, void *fdt,
> > >      /* Interrupt number ranges for the IPIs */
> > >      uint32_t lisn_ranges[] = {
> > >          cpu_to_be32(0),
> > > -        cpu_to_be32(nr_servers),
> > > +        cpu_to_be32(xive->nr_servers),
> > >      };
> > >      /*
> > >       * EQ size - the sizes of pages supported by the system 4K, 64K,
> > > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > > index 457eabe24cda..025fd00143a2 100644
> > > --- a/hw/ppc/spapr_irq.c
> > > +++ b/hw/ppc/spapr_irq.c
> > > @@ -591,7 +591,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
> > >           * 8 XIVE END structures per CPU. One for each available
> > >           * priority
> > >           */
> > > -        qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
> > > +        qdev_prop_set_uint32(dev, "nr-servers", nr_servers);
> > >          qdev_init_nofail(dev);
> > >  
> > >          spapr->xive = SPAPR_XIVE(dev);
> > > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > > index 0df20a6590a5..4a4a6fc6be7f 100644
> > > --- a/include/hw/ppc/spapr_xive.h
> > > +++ b/include/hw/ppc/spapr_xive.h
> > > @@ -22,6 +22,7 @@ typedef struct SpaprXive {
> > >      /* Internal interrupt source for IPIs and virtual devices */
> > >      XiveSource    source;
> > >      hwaddr        vc_base;
> > > +    uint32_t      nr_servers;
> > 
> > This is a basic paraneter, not really related to the internal source
> > structure, so I'd move it up above that comment there.
> > 
> 
> Sure.
> 
> > >  
> > >      /* END ESB MMIOs */
> > >      XiveENDSource end_source;
> > > 
> > 
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] spapr: Set VSMT to smp_threads by default
  2019-10-03 12:02 ` [PATCH 7/7] spapr: Set VSMT to smp_threads by default Greg Kurz
@ 2019-10-14  6:12   ` David Gibson
  2019-10-14 11:31     ` Greg Kurz
  0 siblings, 1 reply; 31+ messages in thread
From: David Gibson @ 2019-10-14  6:12 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, Cédric Le Goater, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3533 bytes --]

On Thu, Oct 03, 2019 at 02:02:00PM +0200, Greg Kurz wrote:
> Support for setting VSMT is available in KVM since linux-4.13. Most distros
> that support KVM on POWER already have it. It thus seem reasonable enough
> to have the default machine to set VSMT to smp_threads.
> 
> This brings contiguous VCPU ids and thus brings their upper bound down to
> the machine's max_cpus. This is especially useful for XIVE KVM devices,
> which may thus allocate only one VP descriptor per VCPU.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Applied to ppc-for-4.2, thanks.

> ---
>  hw/ppc/spapr.c         |    7 ++++++-
>  include/hw/ppc/spapr.h |    1 +
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 8f59f08c102e..473ce1d04775 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2503,6 +2503,7 @@ static CPUArchId *spapr_find_cpu_slot(MachineState *ms, uint32_t id, int *idx)
>  static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
>  {
>      MachineState *ms = MACHINE(spapr);
> +    SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>      Error *local_err = NULL;
>      bool vsmt_user = !!spapr->vsmt;
>      int kvm_smt = kvmppc_smt_threads();
> @@ -2529,7 +2530,7 @@ static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
>              goto out;
>          }
>          /* In this case, spapr->vsmt has been set by the command line */
> -    } else {
> +    } else if (!smc->smp_threads_vsmt) {
>          /*
>           * Default VSMT value is tricky, because we need it to be as
>           * consistent as possible (for migration), but this requires
> @@ -2538,6 +2539,8 @@ static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
>           * overwhelmingly common case in production systems.
>           */
>          spapr->vsmt = MAX(8, smp_threads);
> +    } else {
> +        spapr->vsmt = smp_threads;
>      }
>  
>      /* KVM: If necessary, set the SMT mode: */
> @@ -4452,6 +4455,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      smc->irq = &spapr_irq_dual;
>      smc->dr_phb_enabled = true;
>      smc->linux_pci_probe = true;
> +    smc->smp_threads_vsmt = true;
>  }
>  
>  static const TypeInfo spapr_machine_info = {
> @@ -4519,6 +4523,7 @@ static void spapr_machine_4_1_class_options(MachineClass *mc)
>  
>      spapr_machine_4_2_class_options(mc);
>      smc->linux_pci_probe = false;
> +    smc->smp_threads_vsmt = false;
>      compat_props_add(mc->compat_props, hw_compat_4_1, hw_compat_4_1_len);
>      compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
>  }
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index cbd1a4c9f390..2009eb64f9cb 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -122,6 +122,7 @@ struct SpaprMachineClass {
>      bool broken_host_serial_model; /* present real host info to the guest */
>      bool pre_4_1_migration; /* don't migrate hpt-max-page-size */
>      bool linux_pci_probe;
> +    bool smp_threads_vsmt; /* set VSMT to smp_threads by default */
>  
>      void (*phb_placement)(SpaprMachineState *spapr, uint32_t index,
>                            uint64_t *buid, hwaddr *pio, 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] spapr: Set VSMT to smp_threads by default
  2019-10-14  6:12   ` David Gibson
@ 2019-10-14 11:31     ` Greg Kurz
  0 siblings, 0 replies; 31+ messages in thread
From: Greg Kurz @ 2019-10-14 11:31 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, Cédric Le Goater, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3798 bytes --]

On Mon, 14 Oct 2019 17:12:47 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Oct 03, 2019 at 02:02:00PM +0200, Greg Kurz wrote:
> > Support for setting VSMT is available in KVM since linux-4.13. Most distros
> > that support KVM on POWER already have it. It thus seem reasonable enough
> > to have the default machine to set VSMT to smp_threads.
> > 
> > This brings contiguous VCPU ids and thus brings their upper bound down to
> > the machine's max_cpus. This is especially useful for XIVE KVM devices,
> > which may thus allocate only one VP descriptor per VCPU.
> > 

Just to clarify that without the other patches in this series, XIVE
KVM devices still allocate a fixed block of 2048 VPs, no matter what.
ie, the last sentence in the changelog may be slightly misleading.

> > Signed-off-by: Greg Kurz <groug@kaod.org>
> 
> Applied to ppc-for-4.2, thanks.
> 
> > ---
> >  hw/ppc/spapr.c         |    7 ++++++-
> >  include/hw/ppc/spapr.h |    1 +
> >  2 files changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 8f59f08c102e..473ce1d04775 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2503,6 +2503,7 @@ static CPUArchId *spapr_find_cpu_slot(MachineState *ms, uint32_t id, int *idx)
> >  static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
> >  {
> >      MachineState *ms = MACHINE(spapr);
> > +    SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> >      Error *local_err = NULL;
> >      bool vsmt_user = !!spapr->vsmt;
> >      int kvm_smt = kvmppc_smt_threads();
> > @@ -2529,7 +2530,7 @@ static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
> >              goto out;
> >          }
> >          /* In this case, spapr->vsmt has been set by the command line */
> > -    } else {
> > +    } else if (!smc->smp_threads_vsmt) {
> >          /*
> >           * Default VSMT value is tricky, because we need it to be as
> >           * consistent as possible (for migration), but this requires
> > @@ -2538,6 +2539,8 @@ static void spapr_set_vsmt_mode(SpaprMachineState *spapr, Error **errp)
> >           * overwhelmingly common case in production systems.
> >           */
> >          spapr->vsmt = MAX(8, smp_threads);
> > +    } else {
> > +        spapr->vsmt = smp_threads;
> >      }
> >  
> >      /* KVM: If necessary, set the SMT mode: */
> > @@ -4452,6 +4455,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >      smc->irq = &spapr_irq_dual;
> >      smc->dr_phb_enabled = true;
> >      smc->linux_pci_probe = true;
> > +    smc->smp_threads_vsmt = true;
> >  }
> >  
> >  static const TypeInfo spapr_machine_info = {
> > @@ -4519,6 +4523,7 @@ static void spapr_machine_4_1_class_options(MachineClass *mc)
> >  
> >      spapr_machine_4_2_class_options(mc);
> >      smc->linux_pci_probe = false;
> > +    smc->smp_threads_vsmt = false;
> >      compat_props_add(mc->compat_props, hw_compat_4_1, hw_compat_4_1_len);
> >      compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
> >  }
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index cbd1a4c9f390..2009eb64f9cb 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -122,6 +122,7 @@ struct SpaprMachineClass {
> >      bool broken_host_serial_model; /* present real host info to the guest */
> >      bool pre_4_1_migration; /* don't migrate hpt-max-page-size */
> >      bool linux_pci_probe;
> > +    bool smp_threads_vsmt; /* set VSMT to smp_threads by default */
> >  
> >      void (*phb_placement)(SpaprMachineState *spapr, uint32_t index,
> >                            uint64_t *buid, hwaddr *pio, 
> > 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2019-10-14 11:43 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-03 12:00 [PATCH 0/7] spapr: Use less XIVE HW resources in KVM Greg Kurz
2019-10-03 12:00 ` [PATCH 1/7] spapr, xics: Get number of servers with a XICSFabricClass method Greg Kurz
2019-10-03 12:24   ` Cédric Le Goater
2019-10-03 12:49     ` Greg Kurz
2019-10-03 12:58       ` Cédric Le Goater
2019-10-03 13:02         ` Greg Kurz
2019-10-03 13:19           ` Cédric Le Goater
2019-10-03 13:41             ` Greg Kurz
2019-10-03 13:59               ` Cédric Le Goater
2019-10-03 14:58                 ` Greg Kurz
2019-10-03 12:01 ` [PATCH 2/7] spapr, xive: Turn "nr-ends" property into "nr-servers" property Greg Kurz
2019-10-03 12:21   ` Cédric Le Goater
2019-10-03 12:44     ` Greg Kurz
2019-10-04  4:07   ` David Gibson
2019-10-04  5:53     ` Cédric Le Goater
2019-10-04  6:52       ` Greg Kurz
2019-10-04  7:27         ` Cédric Le Goater
2019-10-04  6:51     ` Greg Kurz
2019-10-05 10:23       ` David Gibson
2019-10-03 12:01 ` [PATCH 3/7] spapr, xics, xive: Drop nr_servers argument in DT-related functions Greg Kurz
2019-10-03 12:25   ` Cédric Le Goater
2019-10-03 12:52     ` Greg Kurz
2019-10-03 12:01 ` [PATCH RFC 4/7] linux-headers: Update against 5.3-rc2 Greg Kurz
2019-10-03 12:01 ` [PATCH 5/7] spapr/xics: Configure number of servers in KVM Greg Kurz
2019-10-03 12:29   ` Cédric Le Goater
2019-10-03 12:55     ` Greg Kurz
2019-10-03 12:01 ` [PATCH 6/7] spapr/xive: " Greg Kurz
2019-10-03 12:30   ` Cédric Le Goater
2019-10-03 12:02 ` [PATCH 7/7] spapr: Set VSMT to smp_threads by default Greg Kurz
2019-10-14  6:12   ` David Gibson
2019-10-14 11:31     ` Greg Kurz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.