* [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level
@ 2017-11-10 15:20 Cédric Le Goater
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type Cédric Le Goater
                   ` (10 more replies)
  0 siblings, 11 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

Hello,

Currently, the ICSState 'ics' object of the sPAPR machine acts both as
the global interrupt source handler and as the IRQ number allocator
for the machine. Some IRQ numbers are allocated very early in the
machine initialization sequence to populate the device tree. This is
a problem for introducing the new POWER XIVE interrupt model, which
needs to share the IRQ numbers with the older model.

To prepare the ground for XIVE, this series proposes a set of new
XICSFabric operations that let the machine handle the IRQ number
allocation directly and decouple the allocation from the interrupt
source object:

    bool (*irq_test)(XICSFabric *xi, int irq);
    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
    void (*irq_free_block)(XICSFabric *xi, int irq, int num);
    bool (*irq_is_lsi)(XICSFabric *xi, int irq);

In these prototypes, the 'irq' parameter refers to a number in the
global IRQ number space.
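
As a minimal sketch of the difference between a global IRQ number and
a zero-based source index (named 'srcno' in the patches), the helpers
below translate between the two. The EX_ names and the base value are
assumptions made for illustration; they are not code from this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real constants live in the QEMU headers. */
#define EX_IRQ_BASE 0x1000
#define EX_NR_IRQS  1024

/* True when 'irq' falls inside the machine's global IRQ number range. */
static bool ex_irq_valid(int irq)
{
    return irq >= EX_IRQ_BASE && irq < EX_IRQ_BASE + EX_NR_IRQS;
}

/* Global IRQ number -> zero-based index into per-source state arrays. */
static int ex_irq_to_srcno(int irq)
{
    assert(ex_irq_valid(irq));
    return irq - EX_IRQ_BASE;
}

/* Zero-based source index -> global IRQ number. */
static int ex_srcno_to_irq(int srcno)
{
    assert(srcno >= 0 && srcno < EX_NR_IRQS);
    return srcno + EX_IRQ_BASE;
}
```

Callers of the XICSFabric operations would only ever see the global
number; the translation stays private to the allocator.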

On the latest pseries machines, these operations are simply backed by
a bitmap. To preserve migration compatibility, older machine types
keep the previous set of operations, which use the ICSIRQState array.
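
The bitmap-backed variant can be sketched in standalone C as follows.
This is a simplified illustration, not the code of the series: it uses
a hand-rolled bitmap instead of QEMU's bitmap helpers, works on source
indexes rather than global IRQ numbers, and all EX_/ex_ names are
hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define EX_NR_IRQS       64
#define EX_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))
#define EX_NR_LONGS      ((EX_NR_IRQS + EX_BITS_PER_LONG - 1) / EX_BITS_PER_LONG)

/* One bit per source index; a set bit means "allocated". */
static unsigned long ex_irq_map[EX_NR_LONGS];

static bool ex_irq_test(int srcno)
{
    return (ex_irq_map[srcno / EX_BITS_PER_LONG] >>
            (srcno % EX_BITS_PER_LONG)) & 1UL;
}

static void ex_irq_assign(int srcno, bool used)
{
    unsigned long mask = 1UL << (srcno % EX_BITS_PER_LONG);

    if (used) {
        ex_irq_map[srcno / EX_BITS_PER_LONG] |= mask;
    } else {
        ex_irq_map[srcno / EX_BITS_PER_LONG] &= ~mask;
    }
}

/*
 * Find 'count' consecutive free bits starting at a multiple of 'align'
 * (a power of two, 1 meaning no alignment), mark them used and return
 * the first index, or -1 when no such block exists.
 */
static int ex_irq_alloc_block(int count, int align)
{
    int first, i;

    assert(count > 0 && align > 0 && (align & (align - 1)) == 0);
    for (first = 0; first + count <= EX_NR_IRQS; first += align) {
        for (i = first; i < first + count; i++) {
            if (ex_irq_test(i)) {
                break;
            }
        }
        if (i == first + count) {
            for (i = first; i < first + count; i++) {
                ex_irq_assign(i, true);
            }
            return first;
        }
    }
    return -1;
}

static void ex_irq_free_block(int first, int count)
{
    int i;

    for (i = first; i < first + count; i++) {
        ex_irq_assign(i, false);
    }
}
```

The point of the design is that test, allocation and release need
nothing but the bitmap, so no interrupt source object is involved.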


To completely remove the use of the ICSState object (required to
introduce XIVE), we also need to change how the nature of an
interrupt, MSI or LSI, is stored. Today, this is done using the flags
attribute of the ICSIRQState array. We change that by splitting the
IRQ number space of the machine in two: first the LSIs, then the
MSIs. This has the benefit of keeping the LSI IRQ numbers in a
well-known range, which is useful for PHB hotplug.
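
With such a split, the nature of an interrupt can be derived from its
number alone. A minimal sketch, assuming a hypothetical LSI range size
(EX_NR_LSIS) and base value; the actual sizes are defined by the
patches, not here:

```c
#include <assert.h>
#include <stdbool.h>

#define EX_IRQ_BASE 0x1000  /* illustrative base of the IRQ number space */
#define EX_NR_LSIS  16      /* hypothetical size of the leading LSI range */
#define EX_NR_IRQS  1024    /* total number of IRQs, LSIs plus MSIs */

/* LSIs occupy the first EX_NR_LSIS numbers, MSIs everything after. */
static bool ex_irq_is_lsi(int irq)
{
    int srcno = irq - EX_IRQ_BASE;

    assert(srcno >= 0 && srcno < EX_NR_IRQS);
    return srcno < EX_NR_LSIS;
}
```

No per-interrupt flag has to be stored or migrated for this test.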

The git repo for this patchset can be found here, along with the
latest XIVE model:

    https://github.com/legoater/qemu/commits/xive

Thanks,

C.

Tests :

 - make check on each patch
 - migration :
     qemu-2.12 (pseries-2.12) <->  qemu-2.12 (pseries-2.12)
     qemu-2.12 (pseries-2.10) <->  qemu-2.12 (pseries-2.10)
     qemu-2.10 (pseries-2.10) <->  qemu-2.12 (pseries-2.10)

Changes since v2 :

 - introduced a second set of XICSFabric IRQ operations for older
   pseries machines

Changes since v1 :

 - reorganised patchset to introduce the XICSFabric operations before
   the major changes: bitmap and IRQ number space split   
 - introduced a reference bitmap to save some state in migration

Cédric Le Goater (11):
  spapr: add pseries 2.12 machine type
  ppc/xics: remove useless if condition
  spapr: introduce new XICSFabric operations for an IRQ allocator
  spapr: move current IRQ allocation under the machine
  spapr: introduce an IRQ allocator using a bitmap
  spapr: store a reference IRQ bitmap
  spapr: introduce an 'irq_base' number
  spapr: introduce a XICSFabric irq_is_lsi() operation
  spapr: split the IRQ number space for LSI interrupts
  spapr: merge ics_set_irq_type() in irq_alloc_block() operation
  spapr: use sPAPRMachineState in spapr_ics_ prototypes

 hw/intc/trace-events   |   2 -
 hw/intc/xics.c         |  37 ++++-----
 hw/intc/xics_kvm.c     |   4 +-
 hw/intc/xics_spapr.c   |  76 +++---------------
 hw/ppc/pnv.c           |  34 ++++++++
 hw/ppc/pnv_psi.c       |   4 -
 hw/ppc/spapr.c         | 209 ++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_events.c  |   4 +-
 hw/ppc/spapr_pci.c     |   8 +-
 hw/ppc/spapr_vio.c     |   2 +-
 hw/ppc/trace-events    |   2 +
 include/hw/ppc/spapr.h |   5 ++
 include/hw/ppc/xics.h  |  20 +++--
 13 files changed, 301 insertions(+), 106 deletions(-)

-- 
2.13.6


* [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-11 15:15   ` Greg Kurz
                     ` (2 more replies)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 02/11] ppc/xics: remove useless if condition Cédric Le Goater
                   ` (9 subsequent siblings)
  10 siblings, 3 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 hw/ppc/spapr.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d682f013d422..a2dcbee07214 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3687,6 +3687,20 @@ static const TypeInfo spapr_machine_info = {
     type_init(spapr_machine_register_##suffix)
 
 /*
+ * pseries-2.12
+ */
+static void spapr_machine_2_12_instance_options(MachineState *machine)
+{
+}
+
+static void spapr_machine_2_12_class_options(MachineClass *mc)
+{
+    /* Defaults for the latest behaviour inherited from the base class */
+}
+
+DEFINE_SPAPR_MACHINE(2_12, "2.12", true);
+
+/*
  * pseries-2.11
  */
 static void spapr_machine_2_11_instance_options(MachineState *machine)
@@ -3698,7 +3712,7 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
     /* Defaults for the latest behaviour inherited from the base class */
 }
 
-DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
+DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
 
 /*
  * pseries-2.10
-- 
2.13.6


* [Qemu-devel] [PATCH for-2.12 v3 02/11] ppc/xics: remove useless if condition
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-11 14:50   ` Greg Kurz
  2017-11-13  5:28   ` David Gibson
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator Cédric Le Goater
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

The previous code section uses a 'first < 0' test and returns. Therefore,
there is no need to test the 'first' variable against '>= 0' afterwards.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 hw/intc/xics_spapr.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index d98ea8b13068..e8c0a1b3e903 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -329,10 +329,8 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
         return -1;
     }
 
-    if (first >= 0) {
-        for (i = first; i < first + num; ++i) {
-            ics_set_irq_type(ics, i, lsi);
-        }
+    for (i = first; i < first + num; ++i) {
+        ics_set_irq_type(ics, i, lsi);
     }
     first += ics->offset;
 
-- 
2.13.6


* [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type Cédric Le Goater
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 02/11] ppc/xics: remove useless if condition Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-14  8:52   ` Greg Kurz
  2017-11-17  4:48   ` David Gibson
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 04/11] spapr: move current IRQ allocation under the machine Cédric Le Goater
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

Currently, the ICSState 'ics' object of the sPAPR machine acts both as
the global interrupt source handler and as the IRQ number allocator
for the machine. Some IRQ numbers are allocated very early in the
machine initialization sequence to populate the device tree. This is
a problem for introducing the new POWER XIVE interrupt model, which
needs to share the IRQ numbers with the older model.

To prepare the ground for XIVE, here is a set of new XICSFabric
operations that let the machine handle the IRQ number allocation
directly and decouple the allocation from the interrupt source object:

    bool (*irq_test)(XICSFabric *xi, int irq);
    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
    void (*irq_free_block)(XICSFabric *xi, int irq, int num);

In these prototypes, the 'irq' parameter refers to a number in the
global IRQ number space. Indexes into arrays storing per-interrupt
state, such as the ICSIRQState array, are usually named 'srcno'.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 hw/ppc/spapr.c        | 19 +++++++++++++++++++
 include/hw/ppc/xics.h |  4 ++++
 2 files changed, 23 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a2dcbee07214..84d68f2fdbae 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3536,6 +3536,21 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
     return cpu ? ICP(cpu->intc) : NULL;
 }
 
+static bool spapr_irq_test(XICSFabric *xi, int irq)
+{
+    return false;
+}
+
+static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
+{
+    return -1;
+}
+
+static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
+{
+    ;
+}
+
 static void spapr_pic_print_info(InterruptStatsProvider *obj,
                                  Monitor *mon)
 {
@@ -3630,6 +3645,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     xic->ics_get = spapr_ics_get;
     xic->ics_resend = spapr_ics_resend;
     xic->icp_get = spapr_icp_get;
+    xic->irq_test = spapr_irq_test;
+    xic->irq_alloc_block = spapr_irq_alloc_block;
+    xic->irq_free_block = spapr_irq_free_block;
+
     ispc->print_info = spapr_pic_print_info;
     /* Force NUMA node memory size to be a multiple of
      * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 28d248abad61..30e7f2e0a7dd 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -175,6 +175,10 @@ typedef struct XICSFabricClass {
     ICSState *(*ics_get)(XICSFabric *xi, int irq);
     void (*ics_resend)(XICSFabric *xi);
     ICPState *(*icp_get)(XICSFabric *xi, int server);
+    /* IRQ allocator helpers */
+    bool (*irq_test)(XICSFabric *xi, int irq);
+    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
+    void (*irq_free_block)(XICSFabric *xi, int irq, int num);
 } XICSFabricClass;
 
 #define XICS_IRQS_SPAPR               1024
-- 
2.13.6


* [Qemu-devel] [PATCH for-2.12 v3 04/11] spapr: move current IRQ allocation under the machine
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
                   ` (2 preceding siblings ...)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-14  8:56   ` Greg Kurz
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap Cédric Le Goater
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

Use the new XICSFabric operations to handle the IRQ number allocation
directly under the machine. These changes only move code and adapt it
to take into account the new API which uses IRQ numbers.

On PowerNV, only provide a basic irq_test() operation. For the moment,
there is no need for more.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 hw/intc/trace-events |  2 --
 hw/intc/xics.c       |  3 ++-
 hw/intc/xics_spapr.c | 57 +++++++++-------------------------------------------
 hw/ppc/pnv.c         | 18 +++++++++++++++++
 hw/ppc/spapr.c       | 56 ++++++++++++++++++++++++++++++++++++++++++++++++---
 hw/ppc/trace-events  |  2 ++
 6 files changed, 85 insertions(+), 53 deletions(-)

diff --git a/hw/intc/trace-events b/hw/intc/trace-events
index b86f242b0fcf..e34ecf7a16e5 100644
--- a/hw/intc/trace-events
+++ b/hw/intc/trace-events
@@ -65,8 +65,6 @@ xics_ics_simple_reject(int nr, int srcno) "reject irq 0x%x [src %d]"
 xics_ics_simple_eoi(int nr) "ics_eoi: irq 0x%x"
 xics_alloc(int irq) "irq %d"
 xics_alloc_block(int first, int num, bool lsi, int align) "first irq %d, %d irqs, lsi=%d, alignnum %d"
-xics_ics_free(int src, int irq, int num) "Source#%d, first irq %d, %d irqs"
-xics_ics_free_warn(int src, int irq) "Source#%d, irq %d is already free"
 
 # hw/intc/s390_flic_kvm.c
 flic_create_device(int err) "flic: create device failed %d"
diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index cc9816e7f204..2c4899f278e2 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -53,6 +53,7 @@ void icp_pic_print_info(ICPState *icp, Monitor *mon)
 void ics_pic_print_info(ICSState *ics, Monitor *mon)
 {
     uint32_t i;
+    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
 
     monitor_printf(mon, "ICS %4x..%4x %p\n",
                    ics->offset, ics->offset + ics->nr_irqs - 1, ics);
@@ -64,7 +65,7 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon)
     for (i = 0; i < ics->nr_irqs; i++) {
         ICSIRQState *irq = ics->irqs + i;
 
-        if (!(irq->flags & XICS_FLAGS_IRQ_MASK)) {
+        if (!xic->irq_test(ics->xics, i + ics->offset)) {
             continue;
         }
         monitor_printf(mon, "  %4x %s %02x %02x\n",
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index e8c0a1b3e903..de9e65d35247 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -245,50 +245,26 @@ void xics_spapr_init(sPAPRMachineState *spapr)
     spapr_register_hypercall(H_IPOLL, h_ipoll);
 }
 
-#define ICS_IRQ_FREE(ics, srcno)   \
-    (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
-
-static int ics_find_free_block(ICSState *ics, int num, int alignnum)
-{
-    int first, i;
-
-    for (first = 0; first < ics->nr_irqs; first += alignnum) {
-        if (num > (ics->nr_irqs - first)) {
-            return -1;
-        }
-        for (i = first; i < first + num; ++i) {
-            if (!ICS_IRQ_FREE(ics, i)) {
-                break;
-            }
-        }
-        if (i == (first + num)) {
-            return first;
-        }
-    }
-
-    return -1;
-}
-
 int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
 {
     int irq;
+    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
 
     if (!ics) {
         return -1;
     }
     if (irq_hint) {
-        if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) {
+        if (xic->irq_test(ics->xics, irq_hint)) {
             error_setg(errp, "can't allocate IRQ %d: already in use", irq_hint);
             return -1;
         }
         irq = irq_hint;
     } else {
-        irq = ics_find_free_block(ics, 1, 1);
+        irq = xic->irq_alloc_block(ics->xics, 1, 1);
         if (irq < 0) {
             error_setg(errp, "can't allocate IRQ: no IRQ left");
             return -1;
         }
-        irq += ics->offset;
     }
 
     ics_set_irq_type(ics, irq - ics->offset, lsi);
@@ -305,6 +281,7 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
                           bool align, Error **errp)
 {
     int i, first = -1;
+    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
 
     if (!ics) {
         return -1;
@@ -320,9 +297,9 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
     if (align) {
         assert((num == 1) || (num == 2) || (num == 4) ||
                (num == 8) || (num == 16) || (num == 32));
-        first = ics_find_free_block(ics, num, num);
+        first = xic->irq_alloc_block(ics->xics, num, num);
     } else {
-        first = ics_find_free_block(ics, num, 1);
+        first = xic->irq_alloc_block(ics->xics, num, 1);
     }
     if (first < 0) {
         error_setg(errp, "can't find a free %d-IRQ block", num);
@@ -330,33 +307,19 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
     }
 
     for (i = first; i < first + num; ++i) {
-        ics_set_irq_type(ics, i, lsi);
+        ics_set_irq_type(ics, i - ics->offset, lsi);
     }
-    first += ics->offset;
 
     trace_xics_alloc_block(first, num, lsi, align);
 
     return first;
 }
 
-static void ics_free(ICSState *ics, int srcno, int num)
-{
-    int i;
-
-    for (i = srcno; i < srcno + num; ++i) {
-        if (ICS_IRQ_FREE(ics, i)) {
-            trace_xics_ics_free_warn(0, i + ics->offset);
-        }
-        memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
-    }
-}
-
 void spapr_ics_free(ICSState *ics, int irq, int num)
 {
-    if (ics_valid_irq(ics, irq)) {
-        trace_xics_ics_free(0, irq, num);
-        ics_free(ics, irq - ics->offset, num);
-    }
+    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
+
+    xic->irq_free_block(ics->xics, irq, num);
 }
 
 void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle)
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index c35c439d816b..8288940ef9d7 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1018,6 +1018,23 @@ static ICPState *pnv_icp_get(XICSFabric *xi, int pir)
     return cpu ? ICP(cpu->intc) : NULL;
 }
 
+static bool pnv_irq_test(XICSFabric *xi, int irq)
+{
+    PnvMachineState *pnv = POWERNV_MACHINE(xi);
+    int i;
+
+    /* We don't have a IRQ allocator for the PowerNV machine yet, so
+     * just check that the IRQ number is valid for the PSI source
+     */
+    for (i = 0; i < pnv->num_chips; i++) {
+        ICSState *ics = &pnv->chips[i]->psi.ics;
+        if (ics_valid_irq(ics, irq)) {
+            return true;
+        }
+    }
+    return false;
+}
+
 static void pnv_pic_print_info(InterruptStatsProvider *obj,
                                Monitor *mon)
 {
@@ -1102,6 +1119,7 @@ static void powernv_machine_class_init(ObjectClass *oc, void *data)
     xic->icp_get = pnv_icp_get;
     xic->ics_get = pnv_ics_get;
     xic->ics_resend = pnv_ics_resend;
+    xic->irq_test = pnv_irq_test;
     ispc->print_info = pnv_pic_print_info;
 
     powernv_machine_class_props_init(oc);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 84d68f2fdbae..4bdceb45a14f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3536,19 +3536,69 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
     return cpu ? ICP(cpu->intc) : NULL;
 }
 
+#define ICS_IRQ_FREE(ics, srcno)   \
+    (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
+
+static int ics_find_free_block(ICSState *ics, int num, int alignnum)
+{
+    int first, i;
+
+    for (first = 0; first < ics->nr_irqs; first += alignnum) {
+        if (num > (ics->nr_irqs - first)) {
+            return -1;
+        }
+        for (i = first; i < first + num; ++i) {
+            if (!ICS_IRQ_FREE(ics, i)) {
+                break;
+            }
+        }
+        if (i == (first + num)) {
+            return first;
+        }
+    }
+
+    return -1;
+}
+
 static bool spapr_irq_test(XICSFabric *xi, int irq)
 {
-    return false;
+    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
+    ICSState *ics = spapr->ics;
+    int srcno = irq - ics->offset;
+
+    return !ICS_IRQ_FREE(ics, srcno);
 }
 
 static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
 {
-    return -1;
+    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
+    ICSState *ics = spapr->ics;
+    int srcno;
+
+    srcno = ics_find_free_block(ics, count, align);
+    if (srcno == -1) {
+        return -1;
+    }
+
+    return srcno + ics->offset;
 }
 
 static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
 {
-    ;
+    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
+    ICSState *ics = spapr->ics;
+    int srcno = irq - ics->offset;
+    int i;
+
+    if (ics_valid_irq(ics, irq)) {
+        trace_spapr_irq_free(0, irq, num);
+        for (i = srcno; i < srcno + num; ++i) {
+            if (ICS_IRQ_FREE(ics, i)) {
+                trace_spapr_irq_free_warn(0, i + ics->offset);
+            }
+            memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
+        }
+    }
 }
 
 static void spapr_pic_print_info(InterruptStatsProvider *obj,
diff --git a/hw/ppc/trace-events b/hw/ppc/trace-events
index 4a6a6490fa78..dc9ab4c4deb3 100644
--- a/hw/ppc/trace-events
+++ b/hw/ppc/trace-events
@@ -12,6 +12,8 @@ spapr_pci_msi_retry(unsigned config_addr, unsigned req_num, unsigned max_irqs) "
 # hw/ppc/spapr.c
 spapr_cas_failed(unsigned long n) "DT diff buffer is too small: %ld bytes"
 spapr_cas_continue(unsigned long n) "Copy changes to the guest: %ld bytes"
+spapr_irq_free(int src, int irq, int num) "Source#%d, first irq %d, %d irqs"
+spapr_irq_free_warn(int src, int irq) "Source#%d, irq %d is already free"
 
 # hw/ppc/spapr_hcall.c
 spapr_cas_pvr_try(uint32_t pvr) "0x%x"
-- 
2.13.6


* [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
                   ` (3 preceding siblings ...)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 04/11] spapr: move current IRQ allocation under the machine Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-14  9:42   ` Greg Kurz
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 06/11] spapr: store a reference IRQ bitmap Cédric Le Goater
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

Let's define a new set of XICSFabric IRQ operations for the latest
pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
allocator.

The previous pseries machines keep the old set of IRQ operations using
the ICSIRQState array.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 Changes since v2 :

 - introduced a second set of XICSFabric IRQ operations for older
   pseries machines
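
 A note on the 'align -= 1' adjustment in spapr_irq_alloc_block()
 below: the former allocator took a power-of-two alignment, whereas
 bitmap_find_next_zero_area() takes a mask that is one less than a
 power of two. The sketch below shows the usual rounding idiom for
 such a mask; that the QEMU helper applies its mask in exactly this
 way is an assumption, and ex_align_up() is a hypothetical name.

```c
#include <assert.h>

/* Round 'index' up to the next multiple of 'align' (a power of two). */
static unsigned long ex_align_up(unsigned long index, unsigned long align)
{
    unsigned long align_mask;

    assert(align != 0 && (align & (align - 1)) == 0);
    /* align 4 -> mask 0x3, align 1 -> mask 0 (no alignment) */
    align_mask = align - 1;
    return (index + align_mask) & ~align_mask;
}
```

 With an alignment of 1 the mask is 0 and every index is accepted,
 which matches the "0 means no alignment" rule quoted in the comment.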

 hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
 include/hw/ppc/spapr.h |  3 ++
 2 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4bdceb45a14f..4ef0b73559ca 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
     },
 };
 
+static bool spapr_irq_map_needed(void *opaque)
+{
+    return true;
+}
+
+static const VMStateDescription vmstate_spapr_irq_map = {
+    .name = "spapr_irq_map",
+    .version_id = 0,
+    .minimum_version_id = 0,
+    .needed = spapr_irq_map_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, nr_irqs),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static const VMStateDescription vmstate_spapr = {
     .name = "spapr",
     .version_id = 3,
@@ -1700,6 +1716,7 @@ static const VMStateDescription vmstate_spapr = {
         &vmstate_spapr_ov5_cas,
         &vmstate_spapr_patb_entry,
         &vmstate_spapr_pending_events,
+        &vmstate_spapr_irq_map,
         NULL
     }
 };
@@ -2337,8 +2354,12 @@ static void ppc_spapr_init(MachineState *machine)
     /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
     load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
 
+    /* Initialize the IRQ allocator */
+    spapr->nr_irqs  = XICS_IRQS_SPAPR;
+    spapr->irq_map  = bitmap_new(spapr->nr_irqs);
+
     /* Set up Interrupt Controller before we create the VCPUs */
-    xics_system_init(machine, XICS_IRQS_SPAPR, &error_fatal);
+    xics_system_init(machine, spapr->nr_irqs, &error_fatal);
 
     /* Set up containers for ibm,client-architecture-support negotiated options
      */
@@ -3560,7 +3581,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
     return -1;
 }
 
-static bool spapr_irq_test(XICSFabric *xi, int irq)
+static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
     ICSState *ics = spapr->ics;
@@ -3569,7 +3590,7 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
     return !ICS_IRQ_FREE(ics, srcno);
 }
 
-static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
+static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
     ICSState *ics = spapr->ics;
@@ -3583,7 +3604,7 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
     return srcno + ics->offset;
 }
 
-static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
+static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
     ICSState *ics = spapr->ics;
@@ -3601,6 +3622,46 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
     }
 }
 
+static bool spapr_irq_test(XICSFabric *xi, int irq)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
+    int srcno = irq - spapr->ics->offset;
+
+    return test_bit(srcno, spapr->irq_map);
+}
+
+static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
+    int start = 0;
+    int srcno;
+
+    /*
+     * The 'align_mask' parameter of bitmap_find_next_zero_area()
+     * should be one less than a power of 2; 0 means no
+     * alignment. Adapt the 'align' value of the former allocator to
+     * fit the requirements of bitmap_find_next_zero_area()
+     */
+    align -= 1;
+
+    srcno = bitmap_find_next_zero_area(spapr->irq_map, spapr->nr_irqs, start,
+                                       count, align);
+    if (srcno == spapr->nr_irqs) {
+        return -1;
+    }
+
+    bitmap_set(spapr->irq_map, srcno, count);
+    return srcno + spapr->ics->offset;
+}
+
+static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
+    int srcno = irq - spapr->ics->offset;
+
+    bitmap_clear(spapr->irq_map, srcno, num);
+}
+
 static void spapr_pic_print_info(InterruptStatsProvider *obj,
                                  Monitor *mon)
 {
@@ -3778,7 +3839,12 @@ static void spapr_machine_2_11_instance_options(MachineState *machine)
 
 static void spapr_machine_2_11_class_options(MachineClass *mc)
 {
-    /* Defaults for the latest behaviour inherited from the base class */
+    XICSFabricClass *xic = XICS_FABRIC_CLASS(mc);
+
+    spapr_machine_2_12_class_options(mc);
+    xic->irq_test = spapr_irq_test_2_11;
+    xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
+    xic->irq_free_block = spapr_irq_free_block_2_11;
 }
 
 DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 9d21ca9bde3a..5835c694caff 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -7,6 +7,7 @@
 #include "hw/ppc/spapr_drc.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/ppc/spapr_ovec.h"
+#include "qemu/bitmap.h"
 
 struct VIOsPAPRBus;
 struct sPAPRPHBState;
@@ -78,6 +79,8 @@ struct sPAPRMachineState {
     struct VIOsPAPRBus *vio_bus;
     QLIST_HEAD(, sPAPRPHBState) phbs;
     struct sPAPRNVRAM *nvram;
+    int32_t nr_irqs;
+    unsigned long *irq_map;
     ICSState *ics;
     sPAPRRTCState rtc;
 
-- 
2.13.6


* [Qemu-devel] [PATCH for-2.12 v3 06/11] spapr: store a reference IRQ bitmap
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
                   ` (4 preceding siblings ...)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-14 15:12   ` Greg Kurz
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 07/11] spapr: introduce an 'irq_base' number Cédric Le Goater
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

To save some state when the guest is migrated, we capture the IRQ
bitmap after all devices have been reset and store it as a reference
for the machine. The bitmap is then only migrated when it differs
from this reference.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 We should probably merge this patch with the previous one in the next
 version of the patchset. For the moment, I thought it would be
 interesting to isolate the topic for discussion.
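
 The decision made by the '.needed' callback can be sketched outside
 of QEMU as a plain comparison of two bitmaps. This is only an
 illustration: ex_irq_map_needed() is a hypothetical name, and the
 sketch compares whole words with memcmp() rather than calling
 bitmap_equal().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true when the live bitmap differs from the reference captured
 * at reset, i.e. when the subsection would have to be migrated.
 */
static bool ex_irq_map_needed(const unsigned long *map,
                              const unsigned long *ref,
                              size_t nr_longs)
{
    assert(map != NULL && ref != NULL);
    return memcmp(map, ref, nr_longs * sizeof(*map)) != 0;
}
```

 A guest that has not allocated or freed any IRQ since reset would
 therefore send nothing extra in its migration stream.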

 hw/ppc/spapr.c         | 7 ++++++-
 include/hw/ppc/spapr.h | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4ef0b73559ca..bf0e5b4f815b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1437,6 +1437,9 @@ static void ppc_spapr_reset(void)
     qemu_devices_reset();
     spapr_clear_pending_events(spapr);
 
+    spapr->irq_map_ref = bitmap_new(spapr->nr_irqs);
+    bitmap_copy(spapr->irq_map_ref, spapr->irq_map, spapr->nr_irqs);
+
     /*
      * We place the device tree and RTAS just below either the top of the RMA,
      * or just below 2GB, whichever is lowere, so that it can be
@@ -1683,7 +1686,9 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
 
 static bool spapr_irq_map_needed(void *opaque)
 {
-    return true;
+    sPAPRMachineState *spapr = opaque;
+
+    return !bitmap_equal(spapr->irq_map, spapr->irq_map_ref, spapr->nr_irqs);
 }
 
 static const VMStateDescription vmstate_spapr_irq_map = {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5835c694caff..023436c32b2a 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -81,6 +81,7 @@ struct sPAPRMachineState {
     struct sPAPRNVRAM *nvram;
     int32_t nr_irqs;
     unsigned long *irq_map;
+    unsigned long *irq_map_ref;
     ICSState *ics;
     sPAPRRTCState rtc;
 
-- 
2.13.6


* [Qemu-devel] [PATCH for-2.12 v3 07/11] spapr: introduce an 'irq_base' number
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
                   ` (5 preceding siblings ...)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 06/11] spapr: store a reference IRQ bitmap Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-14 15:45   ` Greg Kurz
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation Cédric Le Goater
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

'irq_base' is a base IRQ number which lets us allocate only the subset
of the IRQ numbers used on the sPAPR platform. It is kept in sync with
the ICSState 'offset' attribute, which makes it slightly redundant. We
could also choose to waste some extra bytes (512) and allocate the
whole number space. To be discussed.

More importantly, it removes a dependency on the ICSState object of
the sPAPR machine, a step required to introduce XIVE.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
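
 Editorial sketch, not part of the patch: the translation between a
 global IRQ number and a bitmap index that 'irq_base' makes possible
 can be pictured as below. All Demo* names are hypothetical, and the
 base value 0x1000 is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_IRQS   1024     /* same size as XICS_IRQS_SPAPR */
#define DEMO_IRQ_BASE  0x1000   /* assumed base, for illustration only */

typedef struct DemoMachine {
    uint32_t irq_base;                    /* first global IRQ number */
    uint64_t irq_map[DEMO_NR_IRQS / 64];  /* one bit per source number */
} DemoMachine;

/* Global IRQ number -> index into the bitmap. */
static int demo_irq_to_srcno(const DemoMachine *m, int irq)
{
    return irq - (int)m->irq_base;
}

/* Index into the bitmap -> global IRQ number. */
static int demo_srcno_to_irq(const DemoMachine *m, int srcno)
{
    return srcno + (int)m->irq_base;
}

/* True if the global IRQ number is inside the map and already in use. */
static bool demo_irq_test(const DemoMachine *m, int irq)
{
    int srcno = demo_irq_to_srcno(m, irq);

    if (srcno < 0 || srcno >= DEMO_NR_IRQS) {
        return false;
    }
    return (m->irq_map[srcno / 64] >> (srcno % 64)) & 1u;
}

/* Mark a global IRQ number as used; fails if out of range or taken. */
static bool demo_irq_claim(DemoMachine *m, int irq)
{
    int srcno = demo_irq_to_srcno(m, irq);

    if (srcno < 0 || srcno >= DEMO_NR_IRQS || demo_irq_test(m, irq)) {
        return false;
    }
    m->irq_map[srcno / 64] |= UINT64_C(1) << (srcno % 64);
    return true;
}
```

 If the base were indeed 0x1000, covering the numbers below it would
 take 4096 extra bits, i.e. 512 bytes, which would match the figure
 quoted in the commit message.
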
 hw/ppc/spapr.c         | 7 ++++---
 include/hw/ppc/spapr.h | 1 +
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bf0e5b4f815b..1cbbd7715a85 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2362,6 +2362,7 @@ static void ppc_spapr_init(MachineState *machine)
     /* Initialize the IRQ allocator */
     spapr->nr_irqs  = XICS_IRQS_SPAPR;
     spapr->irq_map  = bitmap_new(spapr->nr_irqs);
+    spapr->irq_base = XICS_IRQ_BASE;
 
     /* Set up Interrupt Controller before we create the VCPUs */
     xics_system_init(machine, spapr->nr_irqs, &error_fatal);
@@ -3630,7 +3631,7 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
 static bool spapr_irq_test(XICSFabric *xi, int irq)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
-    int srcno = irq - spapr->ics->offset;
+    int srcno = irq - spapr->irq_base;
 
     return test_bit(srcno, spapr->irq_map);
 }
@@ -3656,13 +3657,13 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
     }
 
     bitmap_set(spapr->irq_map, srcno, count);
-    return srcno + spapr->ics->offset;
+    return srcno + spapr->irq_base;
 }
 
 static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
-    int srcno = irq - spapr->ics->offset;
+    int srcno = irq - spapr->irq_base;
 
     bitmap_clear(spapr->irq_map, srcno, num);
 }
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 023436c32b2a..200667dcff9d 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -82,6 +82,7 @@ struct sPAPRMachineState {
     int32_t nr_irqs;
     unsigned long *irq_map;
     unsigned long *irq_map_ref;
+    uint32_t irq_base;
     ICSState *ics;
     sPAPRRTCState rtc;
 
-- 
2.13.6

* [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
                   ` (6 preceding siblings ...)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 07/11] spapr: introduce an 'irq_base' number Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-14 16:21   ` Greg Kurz
  2017-11-17  4:54   ` David Gibson
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 09/11] spapr: split the IRQ number space for LSI interrupts Cédric Le Goater
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

It will be used later on to distinguish the allocation of an LSI
interrupt from an MSI and also to reduce the use of the ICSIRQState
array of the ICSState object, which stands in the way of introducing
XIVE.

The 'irq' parameter continues to refer to the global IRQ number space.

On PowerNV, only the PSI controller interrupts are handled and they
are all LSIs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
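
 Editorial sketch, not part of the patch: the idea is that the
 interrupt source no longer looks at its own per-IRQ flags but asks
 the fabric, passing a number in the global IRQ space. The Demo*
 types and the range-based backend below are hypothetical stand-ins
 for the real QOM classes.

```c
#include <stdbool.h>

typedef struct DemoFabric DemoFabric;

/* Stand-in for the class holding the irq_is_lsi() operation. */
typedef struct DemoFabricOps {
    bool (*irq_is_lsi)(DemoFabric *xi, int irq);
} DemoFabricOps;

struct DemoFabric {
    const DemoFabricOps *ops;
    int lsi_first;              /* first global IRQ number that is an LSI */
    int lsi_last;               /* last global IRQ number that is an LSI */
};

/* Stand-in for an interrupt source attached to a fabric. */
typedef struct DemoSource {
    DemoFabric *xics;
    int offset;                 /* global number of source 0 */
} DemoSource;

/* One possible backend: LSIs live in a fixed range of global numbers. */
static bool demo_range_is_lsi(DemoFabric *xi, int irq)
{
    return irq >= xi->lsi_first && irq <= xi->lsi_last;
}

static const DemoFabricOps demo_range_ops = {
    .irq_is_lsi = demo_range_is_lsi,
};

/* The source converts its local number to a global one, then delegates. */
static bool demo_ics_is_lsi(const DemoSource *ics, int srcno)
{
    DemoFabric *xi = ics->xics;

    return xi->ops->irq_is_lsi(xi, srcno + ics->offset);
}

/* Typical call site: pick a behaviour or a label from the answer. */
static const char *demo_irq_type_name(const DemoSource *ics, int srcno)
{
    return demo_ics_is_lsi(ics, srcno) ? "LSI" : "MSI";
}
```

 A machine can then swap in a different backend without the source
 code changing.
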
 hw/intc/xics.c        | 26 +++++++++++++++++---------
 hw/intc/xics_kvm.c    |  4 ++--
 hw/ppc/pnv.c          | 16 ++++++++++++++++
 hw/ppc/spapr.c        |  9 +++++++++
 include/hw/ppc/xics.h |  2 ++
 5 files changed, 46 insertions(+), 11 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 2c4899f278e2..42880e736697 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -33,6 +33,7 @@
 #include "trace.h"
 #include "qemu/timer.h"
 #include "hw/ppc/xics.h"
+#include "hw/ppc/spapr.h"
 #include "qemu/error-report.h"
 #include "qapi/visitor.h"
 #include "monitor/monitor.h"
@@ -70,8 +71,7 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon)
         }
         monitor_printf(mon, "  %4x %s %02x %02x\n",
                        ics->offset + i,
-                       (irq->flags & XICS_FLAGS_IRQ_LSI) ?
-                       "LSI" : "MSI",
+                       ics_is_lsi(ics, i) ? "LSI" : "MSI",
                        irq->priority, irq->status);
     }
 }
@@ -377,6 +377,14 @@ static const TypeInfo icp_info = {
 /*
  * ICS: Source layer
  */
+bool ics_is_lsi(ICSState *ics, int srcno)
+{
+    XICSFabric *xi = ics->xics;
+    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
+
+    return xic->irq_is_lsi(xi, srcno + ics->offset);
+}
+
 static void ics_simple_resend_msi(ICSState *ics, int srcno)
 {
     ICSIRQState *irq = ics->irqs + srcno;
@@ -435,7 +443,7 @@ static void ics_simple_set_irq(void *opaque, int srcno, int val)
 {
     ICSState *ics = (ICSState *)opaque;
 
-    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
+    if (ics_is_lsi(ics, srcno)) {
         ics_simple_set_irq_lsi(ics, srcno, val);
     } else {
         ics_simple_set_irq_msi(ics, srcno, val);
@@ -472,7 +480,7 @@ void ics_simple_write_xive(ICSState *ics, int srcno, int server,
     trace_xics_ics_simple_write_xive(ics->offset + srcno, srcno, server,
                                      priority);
 
-    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
+    if (ics_is_lsi(ics, srcno)) {
         ics_simple_write_xive_lsi(ics, srcno);
     } else {
         ics_simple_write_xive_msi(ics, srcno);
@@ -484,10 +492,10 @@ static void ics_simple_reject(ICSState *ics, uint32_t nr)
     ICSIRQState *irq = ics->irqs + nr - ics->offset;
 
     trace_xics_ics_simple_reject(nr, nr - ics->offset);
-    if (irq->flags & XICS_FLAGS_IRQ_MSI) {
-        irq->status |= XICS_STATUS_REJECTED;
-    } else if (irq->flags & XICS_FLAGS_IRQ_LSI) {
+    if (ics_is_lsi(ics, nr - ics->offset)) {
         irq->status &= ~XICS_STATUS_SENT;
+    } else {
+        irq->status |= XICS_STATUS_REJECTED;
     }
 }
 
@@ -497,7 +505,7 @@ static void ics_simple_resend(ICSState *ics)
 
     for (i = 0; i < ics->nr_irqs; i++) {
         /* FIXME: filter by server#? */
-        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
+        if (ics_is_lsi(ics, i)) {
             ics_simple_resend_lsi(ics, i);
         } else {
             ics_simple_resend_msi(ics, i);
@@ -512,7 +520,7 @@ static void ics_simple_eoi(ICSState *ics, uint32_t nr)
 
     trace_xics_ics_simple_eoi(nr);
 
-    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
+    if (ics_is_lsi(ics, srcno)) {
         irq->status &= ~XICS_STATUS_SENT;
     }
 }
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 3091ad3ac2c8..2f10637c9f7c 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -258,7 +258,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
             state |= KVM_XICS_MASKED;
         }
 
-        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
+        if (ics_is_lsi(ics, i)) {
             state |= KVM_XICS_LEVEL_SENSITIVE;
             if (irq->status & XICS_STATUS_ASSERTED) {
                 state |= KVM_XICS_PENDING;
@@ -293,7 +293,7 @@ static void ics_kvm_set_irq(void *opaque, int srcno, int val)
     int rc;
 
     args.irq = srcno + ics->offset;
-    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MSI) {
+    if (!ics_is_lsi(ics, srcno)) {
         if (!val) {
             return;
         }
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 8288940ef9d7..958223376b4c 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1035,6 +1035,21 @@ static bool pnv_irq_test(XICSFabric *xi, int irq)
     return false;
 }
 
+static bool pnv_irq_is_lsi(XICSFabric *xi, int irq)
+{
+    PnvMachineState *pnv = POWERNV_MACHINE(xi);
+    int i;
+
+    /* PowerNV machine only has PSI interrupts which are all LSIs */
+    for (i = 0; i < pnv->num_chips; i++) {
+        ICSState *ics = &pnv->chips[i]->psi.ics;
+        if (ics_valid_irq(ics, irq)) {
+            return true;
+        }
+    }
+    return false;
+}
+
 static void pnv_pic_print_info(InterruptStatsProvider *obj,
                                Monitor *mon)
 {
@@ -1120,6 +1135,7 @@ static void powernv_machine_class_init(ObjectClass *oc, void *data)
     xic->ics_get = pnv_ics_get;
     xic->ics_resend = pnv_ics_resend;
     xic->irq_test = pnv_irq_test;
+    xic->irq_is_lsi = pnv_irq_is_lsi;
     ispc->print_info = pnv_pic_print_info;
 
     powernv_machine_class_props_init(oc);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1cbbd7715a85..ce314fcf38db 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3628,6 +3628,14 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
     }
 }
 
+static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
+    int srcno = irq - spapr->ics->offset;
+
+    return spapr->ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI;
+}
+
 static bool spapr_irq_test(XICSFabric *xi, int irq)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
@@ -3765,6 +3773,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     xic->irq_test = spapr_irq_test;
     xic->irq_alloc_block = spapr_irq_alloc_block;
     xic->irq_free_block = spapr_irq_free_block;
+    xic->irq_is_lsi = spapr_irq_is_lsi;
 
     ispc->print_info = spapr_pic_print_info;
     /* Force NUMA node memory size to be a multiple of
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 30e7f2e0a7dd..478f8e510179 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -179,6 +179,7 @@ typedef struct XICSFabricClass {
     bool (*irq_test)(XICSFabric *xi, int irq);
     int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
     void (*irq_free_block)(XICSFabric *xi, int irq, int num);
+    bool (*irq_is_lsi)(XICSFabric *xi, int irq);
 } XICSFabricClass;
 
 #define XICS_IRQS_SPAPR               1024
@@ -205,6 +206,7 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
 void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
 void icp_pic_print_info(ICPState *icp, Monitor *mon);
 void ics_pic_print_info(ICSState *ics, Monitor *mon);
+bool ics_is_lsi(ICSState *ics, int srno);
 
 void ics_resend(ICSState *ics);
 void icp_resend(ICPState *ss);
-- 
2.13.6

* [Qemu-devel] [PATCH for-2.12 v3 09/11] spapr: split the IRQ number space for LSI interrupts
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
                   ` (7 preceding siblings ...)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-15 15:52   ` Greg Kurz
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 10/11] sparp: merge ics_set_irq_type() in irq_alloc_block() operation Cédric Le Goater
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 11/11] spapr: use sPAPRMachineState in spapr_ics_ prototypes Cédric Le Goater
  10 siblings, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

The type of an interrupt, MSI or LSI, is stored in the 'flags'
attribute of the ICSIRQState array. To reduce the use of this array,
and consequently of the ICSState object (a reduction needed to
introduce the new XIVE model), we choose to split the IRQ number space
of the machine in two: first the LSIs and then the MSIs.

This also has the benefit of keeping the LSI IRQ numbers in a
well-known range, which will be useful for PHB hotplug.

This change only applies to the latest pseries machines. Older
machines still use the ICSIRQState array to define the IRQ type.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
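
 Editorial sketch, not part of the patch: a split number space means
 the allocator searches a different range depending on the interrupt
 type, and the type can later be recovered from the number alone. The
 Demo* names and the small sizes are hypothetical. Unlike a check on
 the first number only, this sketch keeps the whole block inside its
 range, which is a choice made here for illustration.

```c
#include <stdbool.h>

#define DEMO_NR_IRQS   64
#define DEMO_MAX_LSI   16       /* the patch uses SPAPR_MAX_PHBS * 4 */
#define DEMO_IRQ_BASE  0x1000   /* assumed base, for illustration only */

typedef struct DemoAllocator {
    bool used[DEMO_NR_IRQS];    /* one entry per source number */
} DemoAllocator;

/*
 * Find 'count' consecutive free entries in [start, limit), with the
 * first entry aligned on 'align' (which must be >= 1). Returns the
 * first source number, or -1 if no such block exists.
 */
static int demo_find_free(const DemoAllocator *a, int start, int limit,
                          int count, int align)
{
    int first = (start + align - 1) / align * align;

    for (; first + count <= limit; first += align) {
        int i;

        for (i = 0; i < count && !a->used[first + i]; i++) {
            /* keep scanning while entries are free */
        }
        if (i == count) {
            return first;
        }
    }
    return -1;
}

/* LSIs come from [0, DEMO_MAX_LSI), MSIs from the rest of the map. */
static int demo_irq_alloc_block(DemoAllocator *a, int count, int align,
                                bool lsi)
{
    int start = lsi ? 0 : DEMO_MAX_LSI;
    int limit = lsi ? DEMO_MAX_LSI : DEMO_NR_IRQS;
    int srcno = demo_find_free(a, start, limit, count, align);

    if (srcno < 0) {
        return -1;
    }
    for (int i = 0; i < count; i++) {
        a->used[srcno + i] = true;
    }
    return srcno + DEMO_IRQ_BASE;
}

/* The type follows from the position in the number space. */
static bool demo_irq_is_lsi(int irq)
{
    int srcno = irq - DEMO_IRQ_BASE;

    return srcno >= 0 && srcno < DEMO_MAX_LSI;
}
```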

 Changes since v2 :

 - introduced a second set of XICSFabric IRQ operations for older
   pseries machines

 hw/intc/xics_spapr.c  |  6 +++---
 hw/ppc/spapr.c        | 33 +++++++++++++++++++++++++++++----
 include/hw/ppc/xics.h |  2 +-
 3 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index de9e65d35247..b8e91aaf52bd 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -260,7 +260,7 @@ int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
         }
         irq = irq_hint;
     } else {
-        irq = xic->irq_alloc_block(ics->xics, 1, 1);
+        irq = xic->irq_alloc_block(ics->xics, 1, 1, lsi);
         if (irq < 0) {
             error_setg(errp, "can't allocate IRQ: no IRQ left");
             return -1;
@@ -297,9 +297,9 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
     if (align) {
         assert((num == 1) || (num == 2) || (num == 4) ||
                (num == 8) || (num == 16) || (num == 32));
-        first = xic->irq_alloc_block(ics->xics, num, num);
+        first = xic->irq_alloc_block(ics->xics, num, num, lsi);
     } else {
-        first = xic->irq_alloc_block(ics->xics, num, 1);
+        first = xic->irq_alloc_block(ics->xics, num, 1, lsi);
     }
     if (first < 0) {
         error_setg(errp, "can't find a free %d-IRQ block", num);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ce314fcf38db..f14eae6196cd 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3596,7 +3596,8 @@ static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
     return !ICS_IRQ_FREE(ics, srcno);
 }
 
-static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
+static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align,
+                                      bool lsi)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
     ICSState *ics = spapr->ics;
@@ -3628,7 +3629,7 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
     }
 }
 
-static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
+static bool spapr_irq_is_lsi_2_11(XICSFabric *xi, int irq)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
     int srcno = irq - spapr->ics->offset;
@@ -3644,10 +3645,21 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
     return test_bit(srcno, spapr->irq_map);
 }
 
-static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
+
+/*
+ * Let's provision 4 LSIs per PHB
+ */
+#define SPAPR_MAX_LSI (SPAPR_MAX_PHBS * 4)
+
+/*
+ * Split the IRQ number space of the machine in two: first the LSIs
+ * and then the MSIs. This allows us to keep the LSI IRQ numbers in a
+ * well known range which is useful for PHB hotplug.
+ */
+static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align, bool lsi)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
-    int start = 0;
+    int start = lsi ? 0 : SPAPR_MAX_LSI;
     int srcno;
 
     /*
@@ -3664,6 +3676,10 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
         return -1;
     }
 
+    if (lsi && srcno >= SPAPR_MAX_LSI) {
+        return -1;
+    }
+
     bitmap_set(spapr->irq_map, srcno, count);
     return srcno + spapr->irq_base;
 }
@@ -3676,6 +3692,14 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
     bitmap_clear(spapr->irq_map, srcno, num);
 }
 
+static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
+    int srcno = irq - spapr->irq_base;
+
+    return (srcno >= 0) && (srcno < SPAPR_MAX_LSI);
+}
+
 static void spapr_pic_print_info(InterruptStatsProvider *obj,
                                  Monitor *mon)
 {
@@ -3860,6 +3884,7 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
     xic->irq_test = spapr_irq_test_2_11;
     xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
     xic->irq_free_block = spapr_irq_free_block_2_11;
+    xic->irq_is_lsi = spapr_irq_is_lsi_2_11;
 }
 
 DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 478f8e510179..292b929e88eb 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -177,7 +177,7 @@ typedef struct XICSFabricClass {
     ICPState *(*icp_get)(XICSFabric *xi, int server);
     /* IRQ allocator helpers */
     bool (*irq_test)(XICSFabric *xi, int irq);
-    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
+    int (*irq_alloc_block)(XICSFabric *xi, int count, int align, bool lsi);
     void (*irq_free_block)(XICSFabric *xi, int irq, int num);
     bool (*irq_is_lsi)(XICSFabric *xi, int irq);
 } XICSFabricClass;
-- 
2.13.6

* [Qemu-devel] [PATCH for-2.12 v3 10/11] sparp: merge ics_set_irq_type() in irq_alloc_block() operation
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
                   ` (8 preceding siblings ...)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 09/11] spapr: split the IRQ number space for LSI interrupts Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 11/11] spapr: use sPAPRMachineState in spapr_ics_ prototypes Cédric Le Goater
  10 siblings, 0 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

Setting the XICS_FLAGS_IRQ_LSI (or XICS_FLAGS_IRQ_MSI) flag for older
pseries machines can now be done directly in the irq_alloc_block()
operation.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
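
 Editorial sketch, not part of the patch: for the older machines the
 per-IRQ flags double as the "in use" marker, so folding the type
 assignment into the block allocation makes the reservation a single
 step. The Demo* names, sizes and flag values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FLAG_LSI   0x1
#define DEMO_FLAG_MSI   0x2
#define DEMO_FLAG_MASK  0x3     /* an entry is free when both bits are clear */
#define DEMO_NR_IRQS    32
#define DEMO_OFFSET     0x1000  /* assumed offset, for illustration only */

typedef struct DemoLegacySource {
    int offset;                     /* global number of source 0 */
    uint8_t flags[DEMO_NR_IRQS];    /* per-source type flags */
} DemoLegacySource;

/* Find 'count' free entries, first one aligned on 'align' (>= 1). */
static int demo_find_free_block(const DemoLegacySource *ics, int count,
                                int align)
{
    for (int first = 0; first + count <= DEMO_NR_IRQS; first += align) {
        int i;

        for (i = 0; i < count; i++) {
            if (ics->flags[first + i] & DEMO_FLAG_MASK) {
                break;
            }
        }
        if (i == count) {
            return first;
        }
    }
    return -1;
}

/* Allocate a block and record its type in the same operation. */
static int demo_alloc_block_legacy(DemoLegacySource *ics, int count,
                                   int align, bool lsi)
{
    int srcno = demo_find_free_block(ics, count, align);

    if (srcno == -1) {
        return -1;
    }
    for (int i = srcno; i < srcno + count; i++) {
        /* the search above guarantees the entry is still untyped */
        assert(!(ics->flags[i] & DEMO_FLAG_MASK));
        ics->flags[i] |= lsi ? DEMO_FLAG_LSI : DEMO_FLAG_MSI;
    }
    return srcno + ics->offset;
}
```

 Callers then no longer need a separate pass over the returned block
 to set the type.
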
 hw/intc/xics.c        |  8 --------
 hw/intc/xics_spapr.c  |  7 +------
 hw/ppc/pnv_psi.c      |  4 ----
 hw/ppc/spapr.c        | 13 +++++++++++++
 include/hw/ppc/xics.h |  1 -
 5 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 42880e736697..237eed3c11f8 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -710,14 +710,6 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
     return xic->icp_get(xi, server);
 }
 
-void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
-{
-    assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
-
-    ics->irqs[srcno].flags |=
-        lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
-}
-
 static void xics_register_types(void)
 {
     type_register_static(&ics_simple_info);
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index b8e91aaf52bd..f28e9136f2f6 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -267,7 +267,6 @@ int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
         }
     }
 
-    ics_set_irq_type(ics, irq - ics->offset, lsi);
     trace_xics_alloc(irq);
 
     return irq;
@@ -280,7 +279,7 @@ int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
 int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
                           bool align, Error **errp)
 {
-    int i, first = -1;
+    int first = -1;
     XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
 
     if (!ics) {
@@ -306,10 +305,6 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
         return -1;
     }
 
-    for (i = first; i < first + num; ++i) {
-        ics_set_irq_type(ics, i - ics->offset, lsi);
-    }
-
     trace_xics_alloc_block(first, num, lsi, align);
 
     return first;
diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index 9876c266223d..ee7fca311cbf 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -487,10 +487,6 @@ static void pnv_psi_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    for (i = 0; i < ics->nr_irqs; i++) {
-        ics_set_irq_type(ics, i, true);
-    }
-
     /* XSCOM region for PSI registers */
     pnv_xscom_region_init(&psi->xscom_regs, OBJECT(dev), &pnv_psi_xscom_ops,
                 psi, "xscom-psi", PNV_XSCOM_PSIHB_SIZE);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f14eae6196cd..8c2cff93f933 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3596,18 +3596,31 @@ static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
     return !ICS_IRQ_FREE(ics, srcno);
 }
 
+static void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
+{
+    assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
+
+    ics->irqs[srcno].flags |=
+        lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
+}
+
 static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align,
                                       bool lsi)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
     ICSState *ics = spapr->ics;
     int srcno;
+    int i;
 
     srcno = ics_find_free_block(ics, count, align);
     if (srcno == -1) {
         return -1;
     }
 
+    for (i = srcno; i < srcno + count; ++i) {
+        ics_set_irq_type(ics, i, lsi);
+    }
+
     return srcno + ics->offset;
 }
 
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 292b929e88eb..056cf37bc68f 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -203,7 +203,6 @@ void icp_eoi(ICPState *icp, uint32_t xirr);
 void ics_simple_write_xive(ICSState *ics, int nr, int server,
                            uint8_t priority, uint8_t saved_priority);
 
-void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
 void icp_pic_print_info(ICPState *icp, Monitor *mon);
 void ics_pic_print_info(ICSState *ics, Monitor *mon);
 bool ics_is_lsi(ICSState *ics, int srno);
-- 
2.13.6

* [Qemu-devel] [PATCH for-2.12 v3 11/11] spapr: use sPAPRMachineState in spapr_ics_ prototypes
  2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
                   ` (9 preceding siblings ...)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 10/11] sparp: merge ics_set_irq_type() in irq_alloc_block() operation Cédric Le Goater
@ 2017-11-10 15:20 ` Cédric Le Goater
  10 siblings, 0 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-10 15:20 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, David Gibson, Greg Kurz, Benjamin Herrenschmidt
  Cc: Cédric Le Goater

The routines manipulating the IRQ numbers for the sPAPR machine no
longer have any relation to the ICSState object. So use a
sPAPRMachineState parameter in their prototypes and prefix them with
spapr_irq_.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
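
 Editorial sketch, not part of the patch: the shape of a wrapper that
 takes the machine directly and goes through the allocator operations
 could look like this. The Demo* names, the toy backend and the
 string-based error reporting are hypothetical. As in the hunk below,
 a non-zero hint is only tested, and claiming the hinted number is
 left out of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_IRQS 32

typedef struct DemoMachine DemoMachine;

/* Stand-in for the allocator operations exposed by the machine. */
typedef struct DemoIrqOps {
    bool (*irq_test)(DemoMachine *m, int irq);
    int  (*irq_alloc_block)(DemoMachine *m, int count, int align, bool lsi);
} DemoIrqOps;

struct DemoMachine {
    const DemoIrqOps *ops;
    bool used[DEMO_NR_IRQS];
};

/* Toy backend: number 0 is reserved to mean "no hint". */
static bool demo_test(DemoMachine *m, int irq)
{
    return irq >= 0 && irq < DEMO_NR_IRQS && m->used[irq];
}

static int demo_alloc_block(DemoMachine *m, int count, int align, bool lsi)
{
    (void)align;    /* alignment and type are ignored by the toy backend */
    (void)lsi;

    for (int first = 1; first + count <= DEMO_NR_IRQS; first++) {
        int i;

        for (i = 0; i < count && !m->used[first + i]; i++) {
            /* keep scanning while entries are free */
        }
        if (i == count) {
            for (i = 0; i < count; i++) {
                m->used[first + i] = true;
            }
            return first;
        }
    }
    return -1;
}

static const DemoIrqOps demo_ops = { demo_test, demo_alloc_block };

/* Machine-level entry point: no interrupt source object is involved. */
static int demo_irq_alloc(DemoMachine *m, int irq_hint, bool lsi,
                          const char **errp)
{
    int irq;

    if (irq_hint) {
        if (m->ops->irq_test(m, irq_hint)) {
            *errp = "can't allocate IRQ: already in use";
            return -1;
        }
        return irq_hint;
    }
    irq = m->ops->irq_alloc_block(m, 1, 1, lsi);
    if (irq < 0) {
        *errp = "can't allocate IRQ: no IRQ left";
        return -1;
    }
    return irq;
}
```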
 hw/intc/xics_spapr.c  | 30 ++++++++++++------------------
 hw/ppc/spapr.c        |  5 +++--
 hw/ppc/spapr_events.c |  4 ++--
 hw/ppc/spapr_pci.c    |  8 ++++----
 hw/ppc/spapr_vio.c    |  2 +-
 include/hw/ppc/xics.h | 13 +++++++------
 6 files changed, 29 insertions(+), 33 deletions(-)

diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index f28e9136f2f6..b5c8b8fa0e89 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -245,22 +245,20 @@ void xics_spapr_init(sPAPRMachineState *spapr)
     spapr_register_hypercall(H_IPOLL, h_ipoll);
 }
 
-int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
+int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
+                    Error **errp)
 {
     int irq;
-    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
+    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(spapr);
 
-    if (!ics) {
-        return -1;
-    }
     if (irq_hint) {
-        if (xic->irq_test(ics->xics, irq_hint)) {
+        if (xic->irq_test(XICS_FABRIC(spapr), irq_hint)) {
             error_setg(errp, "can't allocate IRQ %d: already in use", irq_hint);
             return -1;
         }
         irq = irq_hint;
     } else {
-        irq = xic->irq_alloc_block(ics->xics, 1, 1, lsi);
+        irq = xic->irq_alloc_block(XICS_FABRIC(spapr), 1, 1, lsi);
         if (irq < 0) {
             error_setg(errp, "can't allocate IRQ: no IRQ left");
             return -1;
@@ -276,15 +274,11 @@ int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
  * Allocate block of consecutive IRQs, and return the number of the first IRQ in
  * the block. If align==true, aligns the first IRQ number to num.
  */
-int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
+int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
                           bool align, Error **errp)
 {
     int first = -1;
-    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
-
-    if (!ics) {
-        return -1;
-    }
+    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(spapr);
 
     /*
      * MSIMesage::data is used for storing VIRQ so
@@ -296,9 +290,9 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
     if (align) {
         assert((num == 1) || (num == 2) || (num == 4) ||
                (num == 8) || (num == 16) || (num == 32));
-        first = xic->irq_alloc_block(ics->xics, num, num, lsi);
+        first = xic->irq_alloc_block(XICS_FABRIC(spapr), num, num, lsi);
     } else {
-        first = xic->irq_alloc_block(ics->xics, num, 1, lsi);
+        first = xic->irq_alloc_block(XICS_FABRIC(spapr), num, 1, lsi);
     }
     if (first < 0) {
         error_setg(errp, "can't find a free %d-IRQ block", num);
@@ -310,11 +304,11 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
     return first;
 }
 
-void spapr_ics_free(ICSState *ics, int irq, int num)
+void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
 {
-    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
+    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(spapr);
 
-    xic->irq_free_block(ics->xics, irq, num);
+    xic->irq_free_block(XICS_FABRIC(spapr), irq, num);
 }
 
 void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8c2cff93f933..1ef09963519f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3669,7 +3669,8 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
  * and then the MSIs. This allows us to keep the LSI IRQ numbers in a
  * well known range which is useful for PHB hotplug.
  */
-static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align, bool lsi)
+static int spapr_irq_alloc_block_xi(XICSFabric *xi, int count, int align,
+                                    bool lsi)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
     int start = lsi ? 0 : SPAPR_MAX_LSI;
@@ -3808,7 +3809,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     xic->ics_resend = spapr_ics_resend;
     xic->icp_get = spapr_icp_get;
     xic->irq_test = spapr_irq_test;
-    xic->irq_alloc_block = spapr_irq_alloc_block;
+    xic->irq_alloc_block = spapr_irq_alloc_block_xi;
     xic->irq_free_block = spapr_irq_free_block;
     xic->irq_is_lsi = spapr_irq_is_lsi;
 
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index e377fc7ddea2..cead596f3e7a 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -718,7 +718,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
     spapr->event_sources = spapr_event_sources_new();
 
     spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_EPOW,
-                                 spapr_ics_alloc(spapr->ics, 0, false,
+                                 spapr_irq_alloc(spapr, 0, false,
                                                   &error_fatal));
 
     /* NOTE: if machine supports modern/dedicated hotplug event source,
@@ -731,7 +731,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
      */
     if (spapr->use_hotplug_event_source) {
         spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
-                                     spapr_ics_alloc(spapr->ics, 0, false,
+                                     spapr_irq_alloc(spapr, 0, false,
                                                       &error_fatal));
     }
 
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 5a3122a9f9f9..e0ef77a480e5 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -314,7 +314,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
             return;
         }
 
-        spapr_ics_free(spapr->ics, msi->first_irq, msi->num);
+        spapr_irq_free(spapr, msi->first_irq, msi->num);
         if (msi_present(pdev)) {
             spapr_msi_setmsg(pdev, 0, false, 0, 0);
         }
@@ -352,7 +352,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
     }
 
     /* Allocate MSIs */
-    irq = spapr_ics_alloc_block(spapr->ics, req_num, false,
+    irq = spapr_irq_alloc_block(spapr, req_num, false,
                            ret_intr_type == RTAS_TYPE_MSI, &err);
     if (err) {
         error_reportf_err(err, "Can't allocate MSIs for device %x: ",
@@ -363,7 +363,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 
     /* Release previous MSIs */
     if (msi) {
-        spapr_ics_free(spapr->ics, msi->first_irq, msi->num);
+        spapr_irq_free(spapr, msi->first_irq, msi->num);
         g_hash_table_remove(phb->msi, &config_addr);
     }
 
@@ -1675,7 +1675,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
         uint32_t irq;
         Error *local_err = NULL;
 
-        irq = spapr_ics_alloc_block(spapr->ics, 1, true, false, &local_err);
+        irq = spapr_irq_alloc_block(spapr, 1, true, false, &local_err);
         if (local_err) {
             error_propagate(errp, local_err);
             error_prepend(errp, "can't allocate LSIs: ");
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index ea3bc8bd9e21..bb7ed2c537b0 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -454,7 +454,7 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
         dev->qdev.id = id;
     }
 
-    dev->irq = spapr_ics_alloc(spapr->ics, dev->irq, false, &local_err);
+    dev->irq = spapr_irq_alloc(spapr, dev->irq, false, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 056cf37bc68f..dd3e2eacedb2 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -184,10 +184,13 @@ typedef struct XICSFabricClass {
 
 #define XICS_IRQS_SPAPR               1024
 
-int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp);
-int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi, bool align,
-                           Error **errp);
-void spapr_ics_free(ICSState *ics, int irq, int num);
+typedef struct sPAPRMachineState sPAPRMachineState;
+
+int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
+                    Error **errp);
+int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
+                          bool align, Error **errp);
+void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle);
 
 qemu_irq xics_get_qirq(XICSFabric *xi, int irq);
@@ -210,8 +213,6 @@ bool ics_is_lsi(ICSState *ics, int srno);
 void ics_resend(ICSState *ics);
 void icp_resend(ICPState *ss);
 
-typedef struct sPAPRMachineState sPAPRMachineState;
-
 int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
 void xics_spapr_init(sPAPRMachineState *spapr);
 
-- 
2.13.6

* Re: [Qemu-devel] [PATCH for-2.12 v3 02/11] ppc/xics: remove useless if condition
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 02/11] ppc/xics: remove useless if condition Cédric Le Goater
@ 2017-11-11 14:50   ` Greg Kurz
  2017-11-13  5:28   ` David Gibson
  1 sibling, 0 replies; 79+ messages in thread
From: Greg Kurz @ 2017-11-11 14:50 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Fri, 10 Nov 2017 15:20:08 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> The previous code section uses a 'first < 0' test and returns. Therefore,
> there is no need to test the 'first' variable against '>= 0' afterwards.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/intc/xics_spapr.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index d98ea8b13068..e8c0a1b3e903 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -329,10 +329,8 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
>          return -1;
>      }
>  
> -    if (first >= 0) {
> -        for (i = first; i < first + num; ++i) {
> -            ics_set_irq_type(ics, i, lsi);
> -        }
> +    for (i = first; i < first + num; ++i) {
> +        ics_set_irq_type(ics, i, lsi);
>      }
>      first += ics->offset;
>  

* Re: [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type Cédric Le Goater
@ 2017-11-11 15:15   ` Greg Kurz
  2017-11-13  5:51   ` David Gibson
  2017-11-13  7:14   ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Thomas Huth
  2 siblings, 0 replies; 79+ messages in thread
From: Greg Kurz @ 2017-11-11 15:15 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Fri, 10 Nov 2017 15:20:07 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d682f013d422..a2dcbee07214 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3687,6 +3687,20 @@ static const TypeInfo spapr_machine_info = {
>      type_init(spapr_machine_register_##suffix)
>  
>  /*
> + * pseries-2.12
> + */
> +static void spapr_machine_2_12_instance_options(MachineState *machine)
> +{
> +}
> +
> +static void spapr_machine_2_12_class_options(MachineClass *mc)
> +{
> +    /* Defaults for the latest behaviour inherited from the base class */
> +}
> +
> +DEFINE_SPAPR_MACHINE(2_12, "2.12", true);
> +
> +/*
>   * pseries-2.11
>   */
>  static void spapr_machine_2_11_instance_options(MachineState *machine)
> @@ -3698,7 +3712,7 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
>      /* Defaults for the latest behaviour inherited from the base class */
>  }
>  
> -DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> +DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
>  
>  /*
>   * pseries-2.10

* Re: [Qemu-devel] [PATCH for-2.12 v3 02/11] ppc/xics: remove useless if condition
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 02/11] ppc/xics: remove useless if condition Cédric Le Goater
  2017-11-11 14:50   ` Greg Kurz
@ 2017-11-13  5:28   ` David Gibson
  1 sibling, 0 replies; 79+ messages in thread
From: David Gibson @ 2017-11-13  5:28 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

On Fri, Nov 10, 2017 at 03:20:08PM +0000, Cédric Le Goater wrote:
> The previous code section uses a 'first < 0' test and returns. Therefore,
> there is no need to test the 'first' variable against '>= 0' afterwards.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  hw/intc/xics_spapr.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index d98ea8b13068..e8c0a1b3e903 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -329,10 +329,8 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
>          return -1;
>      }
>  
> -    if (first >= 0) {
> -        for (i = first; i < first + num; ++i) {
> -            ics_set_irq_type(ics, i, lsi);
> -        }
> +    for (i = first; i < first + num; ++i) {
> +        ics_set_irq_type(ics, i, lsi);
>      }
>      first += ics->offset;
>  

Applied to ppc-for-2.12.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type Cédric Le Goater
  2017-11-11 15:15   ` Greg Kurz
@ 2017-11-13  5:51   ` David Gibson
  2017-11-13  9:50     ` Greg Kurz
  2017-11-13  7:14   ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Thomas Huth
  2 siblings, 1 reply; 79+ messages in thread
From: David Gibson @ 2017-11-13  5:51 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

On Fri, Nov 10, 2017 at 03:20:07PM +0000, Cédric Le Goater wrote:
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  hw/ppc/spapr.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d682f013d422..a2dcbee07214 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3687,6 +3687,20 @@ static const TypeInfo spapr_machine_info = {
>      type_init(spapr_machine_register_##suffix)
>  
>  /*
> + * pseries-2.12
> + */
> +static void spapr_machine_2_12_instance_options(MachineState *machine)
> +{
> +}
> +
> +static void spapr_machine_2_12_class_options(MachineClass *mc)
> +{
> +    /* Defaults for the latest behaviour inherited from the base class */
> +}
> +
> +DEFINE_SPAPR_MACHINE(2_12, "2.12", true);
> +
> +/*
>   * pseries-2.11
>   */
>  static void spapr_machine_2_11_instance_options(MachineState *machine)
> @@ -3698,7 +3712,7 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
>      /* Defaults for the latest behaviour inherited from the base class */
>  }
>  
> -DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> +DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
>  
>  /*
>   * pseries-2.10

Uh.. not quite right.  You also need to chain the 2.11 hooks onto the
new 2.12 ones.  Never mind, we'll need it sooner or later, so I've put
my own version into ppc-for-2.12.
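
For readers following along, the chaining mentioned above can be sketched
as below. This is a standalone illustration, not the version that went
into ppc-for-2.12: MachineClass here is a two-field stand-in, and the
new_irq_allocator flag, the hooks_run counter and
machine_class_base_init() are invented for the example. Only the shape
(the 2.11 hook calling the 2.12 hook before applying its own overrides)
is the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's MachineClass; not the real layout. */
typedef struct MachineClass {
    bool new_irq_allocator;   /* invented flag, used only in this sketch */
    int hooks_run;            /* counts how many class_options hooks ran */
} MachineClass;

/* Base class defaults: the latest behaviour. */
static void machine_class_base_init(MachineClass *mc)
{
    assert(mc != NULL);
    mc->new_irq_allocator = true;
    mc->hooks_run = 0;
}

/* pseries-2.12: newest type, keeps the defaults of the base class. */
static void spapr_machine_2_12_class_options(MachineClass *mc)
{
    mc->hooks_run++;
}

/* pseries-2.11: chain to the 2.12 hook first, then apply compat overrides. */
static void spapr_machine_2_11_class_options(MachineClass *mc)
{
    spapr_machine_2_12_class_options(mc);
    mc->new_irq_allocator = false;   /* keep what 2.11 shipped with */
    mc->hooks_run++;
}
```

With this ordering, anything set up for 2.12 reaches every older machine
type through the chain, and each older type only has to undo what it must
keep for compatibility.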

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type)
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type Cédric Le Goater
  2017-11-11 15:15   ` Greg Kurz
  2017-11-13  5:51   ` David Gibson
@ 2017-11-13  7:14   ` Thomas Huth
  2017-11-13  9:53     ` Peter Maydell
  2017-11-23 10:03     ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Cornelia Huck
  2 siblings, 2 replies; 79+ messages in thread
From: Thomas Huth @ 2017-11-13  7:14 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel, David Gibson, Greg Kurz,
	Peter Maydell

On 10.11.2017 16:20, Cédric Le Goater wrote:
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  hw/ppc/spapr.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d682f013d422..a2dcbee07214 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3687,6 +3687,20 @@ static const TypeInfo spapr_machine_info = {
>      type_init(spapr_machine_register_##suffix)
>  
>  /*
> + * pseries-2.12
> + */
> +static void spapr_machine_2_12_instance_options(MachineState *machine)
> +{
> +}
> +
> +static void spapr_machine_2_12_class_options(MachineClass *mc)
> +{
> +    /* Defaults for the latest behaviour inherited from the base class */
> +}
> +
> +DEFINE_SPAPR_MACHINE(2_12, "2.12", true);

By the way, before everybody now introduces "2.12" machine types ... is
there already a consensus that the next version will be "2.12" ?

A couple of months ago, we discussed that we could maybe do a 3.0 after
2.11, e.g. here:

 https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html

I'd still like to see that happen... Peter, any thoughts on this?

 Thomas

* Re: [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type
  2017-11-13  5:51   ` David Gibson
@ 2017-11-13  9:50     ` Greg Kurz
  2017-11-14  9:08       ` David Gibson
  0 siblings, 1 reply; 79+ messages in thread
From: Greg Kurz @ 2017-11-13  9:50 UTC (permalink / raw)
  To: David Gibson
  Cc: Cédric Le Goater, qemu-ppc, qemu-devel, Benjamin Herrenschmidt

On Mon, 13 Nov 2017 16:51:03 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Nov 10, 2017 at 03:20:07PM +0000, Cédric Le Goater wrote:
> > Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > ---
> >  hw/ppc/spapr.c | 16 +++++++++++++++-
> >  1 file changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index d682f013d422..a2dcbee07214 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -3687,6 +3687,20 @@ static const TypeInfo spapr_machine_info = {
> >      type_init(spapr_machine_register_##suffix)
> >  
> >  /*
> > + * pseries-2.12
> > + */
> > +static void spapr_machine_2_12_instance_options(MachineState *machine)
> > +{
> > +}
> > +
> > +static void spapr_machine_2_12_class_options(MachineClass *mc)
> > +{
> > +    /* Defaults for the latest behaviour inherited from the base class */
> > +}
> > +
> > +DEFINE_SPAPR_MACHINE(2_12, "2.12", true);
> > +
> > +/*
> >   * pseries-2.11
> >   */
> >  static void spapr_machine_2_11_instance_options(MachineState *machine)
> > @@ -3698,7 +3712,7 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
> >      /* Defaults for the latest behaviour inherited from the base class */
> >  }
> >  
> > -DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> > +DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
> >  
> >  /*
> >   * pseries-2.10  
> 
> Uh.. not quite right.  You also need to chain the 2.11 hooks onto the

Well, this happens in patch 5, but you're right it probably makes more
sense to consolidate in a single patch.

Also, this patch should have dropped the "Defaults for the latest..."
comment...

> new 2.12 ones.  Never mind, we'll need it sooner or later, so I've put
> my own version into ppc-for-2.12.
> 

... and so should your version I guess.

* Re: [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type)
  2017-11-13  7:14   ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Thomas Huth
@ 2017-11-13  9:53     ` Peter Maydell
  2017-11-13 10:03       ` [Qemu-devel] QEMU 3.0 ? Cédric Le Goater
  2017-11-13 10:25       ` Thomas Huth
  2017-11-23 10:03     ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Cornelia Huck
  1 sibling, 2 replies; 79+ messages in thread
From: Peter Maydell @ 2017-11-13  9:53 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Cédric Le Goater, QEMU Developers, David Gibson, Greg Kurz

On 13 November 2017 at 07:14, Thomas Huth <thuth@redhat.com> wrote:
> By the way, before everybody now introduces "2.12" machine types ... is
> there already a consensus that the next version will be "2.12" ?
>
> A couple of months ago, we discussed that we could maybe do a 3.0 after
> 2.11, e.g. here:
>
>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
>
> I'd still like to see that happen... Peter, any thoughts on this?

I don't see the point in declaring a 3.0 unless we have some
sweeping change that merits it. I don't think we should do a
sweeping change unless we have a well laid out and agreed on
plan for how the transition works. So I would want to see the
plan discussed and agreed first, and then we can say "ok, and
we think we can do this in this timescale and so the version
at $DATE will be 3.0". Changing the version number should be
the last part of this process, not the first, in my view.

thanks
-- PMM

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-13  9:53     ` Peter Maydell
@ 2017-11-13 10:03       ` Cédric Le Goater
  2017-11-13 10:21         ` Peter Maydell
  2017-11-13 10:25       ` Thomas Huth
  1 sibling, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-13 10:03 UTC (permalink / raw)
  To: Peter Maydell, Thomas Huth; +Cc: QEMU Developers, David Gibson, Greg Kurz

On 11/13/2017 10:53 AM, Peter Maydell wrote:
> On 13 November 2017 at 07:14, Thomas Huth <thuth@redhat.com> wrote:
>> By the way, before everybody now introduces "2.12" machine types ... is
>> there already a consensus that the next version will be "2.12" ?
>>
>> A couple of months ago, we discussed that we could maybe do a 3.0 after
>> 2.11, e.g. here:
>>
>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
>>
>> I'd still like to see that happen... Peter, any thoughts on this?
> 
> I don't see the point in declaring a 3.0 unless we have some
> sweeping change that merits it. I don't think we should do a
> sweeping change unless we have a well laid out and agreed on
> plan for how the transition works. So I would want to see the
> plan discussed and agreed first, and then we can say "ok, and
> we think we can do this in this timescale and so the version
> at $DATE will be 3.0". Changing the version number should be
> the last part of this process, not the first, in my view.

One of the sweeping changes for 3.0 could be to stop maintaining
migration compatibility with older versions (2.x). Even if the
feature is really a must-have in some cluster environments, the
code (and the developer) is starting to suffer from it.


Thanks,

C.

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-13 10:03       ` [Qemu-devel] QEMU 3.0 ? Cédric Le Goater
@ 2017-11-13 10:21         ` Peter Maydell
  0 siblings, 0 replies; 79+ messages in thread
From: Peter Maydell @ 2017-11-13 10:21 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Thomas Huth, QEMU Developers, David Gibson, Greg Kurz

On 13 November 2017 at 10:03, Cédric Le Goater <clg@kaod.org> wrote:
> One of the sweeping changes for 3.0 could be to stop maintaining
> migration compatibility with older versions (2.x). Even if the
> feature is really a must-have in some cluster environments, the
> code (and the developer) is starting to suffer from it.

We certainly can't just drop all migration-compat. The closest
we might come would be to say we dropped migration-compat
from versions older than $X for some value of X.

But this kind of discussion is what I mean -- various
people have different ideas about what we might or might
not drop, but we need to come to a consensus about what
we can actually do first.

thanks
-- PMM

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-13  9:53     ` Peter Maydell
  2017-11-13 10:03       ` [Qemu-devel] QEMU 3.0 ? Cédric Le Goater
@ 2017-11-13 10:25       ` Thomas Huth
  1 sibling, 0 replies; 79+ messages in thread
From: Thomas Huth @ 2017-11-13 10:25 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Cédric Le Goater, QEMU Developers, David Gibson, Greg Kurz

On 13.11.2017 10:53, Peter Maydell wrote:
> On 13 November 2017 at 07:14, Thomas Huth <thuth@redhat.com> wrote:
>> By the way, before everybody now introduces "2.12" machine types ... is
>> there already a consensus that the next version will be "2.12" ?
>>
>> A couple of months ago, we discussed that we could maybe do a 3.0 after
>> 2.11, e.g. here:
>>
>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
>>
>> I'd still like to see that happen... Peter, any thoughts on this?
> 
> I don't see the point in declaring a 3.0 unless we have some
> sweeping change that merits it. I don't think we should do a
> sweeping change unless we have a well laid out and agreed on
> plan for how the transition works.

Since we declared a lot of interfaces / features as deprecated in QEMU
2.10, we could finally remove them in the release after 2.11. Looking at
https://qemu.weilnetz.de/doc/qemu-doc.html#Deprecated-features
that's quite a bit already. That's IMHO a good justification for a 3.0
already.

> So I would want to see the
> plan discussed and agreed first, and then we can say "ok, and
> we think we can do this in this timescale and so the version
> at $DATE will be 3.0".

We could maybe also start a wiki page to collect ideas for what we want
to do with "3.0" ... but I guess a lot of the possible changes will just
be turned down again since somebody will cry "we need to stay compatible
with older versions! Forever!". So I somehow doubt that this is worth
the effort.

> Changing the version number should be
> the last part of this process, not the first, in my view.

Yeah, but you know how this works in QEMU-Land: Once the 2.12 is
established in the heads of various people, we'll have a hard time
bumping the version number again, since there's always somebody complaining...

So I guess we'll likely end up doing it rather the Linux kernel way one
day - when we feel that the minor number got too big (three digits,
maybe?), we'll switch the major number without any further justification ;-)

 Thomas

* Re: [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator Cédric Le Goater
@ 2017-11-14  8:52   ` Greg Kurz
  2017-11-17  4:48   ` David Gibson
  1 sibling, 0 replies; 79+ messages in thread
From: Greg Kurz @ 2017-11-14  8:52 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Fri, 10 Nov 2017 15:20:09 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> Currently, the ICSState 'ics' object of the sPAPR machine acts as the
> global interrupt source handler and also as the IRQ number allocator
> for the machine. Some IRQ numbers are allocated very early in the
> machine initialization sequence to populate the device tree, and this
> is a problem for introducing the new POWER XIVE interrupt model, as it
> needs to share the IRQ numbers with the older model.
> 
> To prepare ground for XIVE, here is a set of new XICSFabric operations
> that let the machine handle the IRQ number allocation directly and
> decouple the allocation from the interrupt source object:
> 
>     bool (*irq_test)(XICSFabric *xi, int irq);
>     int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>     void (*irq_free_block)(XICSFabric *xi, int irq, int num);
> 
> In these prototypes, the 'irq' parameter refers to a number in the
> global IRQ number space. Indexes for arrays storing different state
> information on the interrupts, like the ICSIRQState, are usually
> named 'srcno'.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr.c        | 19 +++++++++++++++++++
>  include/hw/ppc/xics.h |  4 ++++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a2dcbee07214..84d68f2fdbae 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3536,6 +3536,21 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>      return cpu ? ICP(cpu->intc) : NULL;
>  }
>  
> +static bool spapr_irq_test(XICSFabric *xi, int irq)
> +{
> +    return false;
> +}
> +
> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> +{
> +    return -1;
> +}
> +
> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> +{
> +    ;
> +}
> +
>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>                                   Monitor *mon)
>  {
> @@ -3630,6 +3645,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      xic->ics_get = spapr_ics_get;
>      xic->ics_resend = spapr_ics_resend;
>      xic->icp_get = spapr_icp_get;
> +    xic->irq_test = spapr_irq_test;
> +    xic->irq_alloc_block = spapr_irq_alloc_block;
> +    xic->irq_free_block = spapr_irq_free_block;
> +
>      ispc->print_info = spapr_pic_print_info;
>      /* Force NUMA node memory size to be a multiple of
>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 28d248abad61..30e7f2e0a7dd 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -175,6 +175,10 @@ typedef struct XICSFabricClass {
>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
>      void (*ics_resend)(XICSFabric *xi);
>      ICPState *(*icp_get)(XICSFabric *xi, int server);
> +    /* IRQ allocator helpers */
> +    bool (*irq_test)(XICSFabric *xi, int irq);
> +    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
> +    void (*irq_free_block)(XICSFabric *xi, int irq, int num);
>  } XICSFabricClass;
>  
>  #define XICS_IRQS_SPAPR               1024
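
The stubs in this patch only wire up the three operations. The cover
letter says that on the latest pseries machines they end up backed by a
bitmap, so here is a rough standalone sketch of what such a backend could
look like. IrqFabric, IrqFabricOps, NR_IRQS and the bitmap_* helpers are
hypothetical names made up for the example, not QEMU code, and marking
the IRQs as allocated inside irq_alloc_block() is a simplification of
this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_IRQS 64   /* hypothetical size of the global IRQ number space */

typedef struct IrqFabric IrqFabric;

/* Same shape as the three operations added to XICSFabricClass. */
typedef struct IrqFabricOps {
    bool (*irq_test)(IrqFabric *xi, int irq);
    int (*irq_alloc_block)(IrqFabric *xi, int count, int align);
    void (*irq_free_block)(IrqFabric *xi, int irq, int num);
} IrqFabricOps;

struct IrqFabric {
    const IrqFabricOps *ops;
    uint8_t map[NR_IRQS / 8];   /* one bit per IRQ, set when allocated */
};

/* True when 'irq' is in range and currently allocated. */
static bool bitmap_irq_test(IrqFabric *xi, int irq)
{
    if (irq < 0 || irq >= NR_IRQS) {
        return false;
    }
    return (xi->map[irq / 8] >> (irq % 8)) & 1;
}

/* Find 'count' consecutive free IRQs starting on a multiple of 'align',
 * mark them allocated and return the first one, or -1 if none fits. */
static int bitmap_irq_alloc_block(IrqFabric *xi, int count, int align)
{
    int first, i;

    assert(count > 0 && align > 0);
    for (first = 0; first + count <= NR_IRQS; first += align) {
        for (i = first; i < first + count; i++) {
            if (bitmap_irq_test(xi, i)) {
                break;
            }
        }
        if (i == first + count) {
            for (i = first; i < first + count; i++) {
                xi->map[i / 8] |= (uint8_t)(1u << (i % 8));
            }
            return first;
        }
    }
    return -1;
}

/* Clear 'num' IRQs starting at 'irq'; out-of-range numbers are ignored. */
static void bitmap_irq_free_block(IrqFabric *xi, int irq, int num)
{
    int i;

    for (i = irq; i < irq + num; i++) {
        if (i >= 0 && i < NR_IRQS) {
            xi->map[i / 8] &= (uint8_t)~(1u << (i % 8));
        }
    }
}

static const IrqFabricOps bitmap_ops = {
    .irq_test = bitmap_irq_test,
    .irq_alloc_block = bitmap_irq_alloc_block,
    .irq_free_block = bitmap_irq_free_block,
};

/* Start with every IRQ free and the bitmap-backed operations installed. */
static void irq_fabric_init(IrqFabric *xi)
{
    memset(xi->map, 0, sizeof(xi->map));
    xi->ops = &bitmap_ops;
}
```

Callers go through xi->ops only, which is what would let a different
backend (for instance one kept for migration compatibility) be swapped in
without touching them.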

* Re: [Qemu-devel] [PATCH for-2.12 v3 04/11] spapr: move current IRQ allocation under the machine
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 04/11] spapr: move current IRQ allocation under the machine Cédric Le Goater
@ 2017-11-14  8:56   ` Greg Kurz
  0 siblings, 0 replies; 79+ messages in thread
From: Greg Kurz @ 2017-11-14  8:56 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Fri, 10 Nov 2017 15:20:10 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> Use the new XICSFabric operations to handle the IRQ number allocation
> directly under the machine. These changes only move code and adapt it
> to take into account the new API which uses IRQ numbers.
> 
> On PowerNV, only provide a basic irq_test() operation. For the moment,
> there is no need for more.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/intc/trace-events |  2 --
>  hw/intc/xics.c       |  3 ++-
>  hw/intc/xics_spapr.c | 57 +++++++++-------------------------------------------
>  hw/ppc/pnv.c         | 18 +++++++++++++++++
>  hw/ppc/spapr.c       | 56 ++++++++++++++++++++++++++++++++++++++++++++++++---
>  hw/ppc/trace-events  |  2 ++
>  6 files changed, 85 insertions(+), 53 deletions(-)
> 
> diff --git a/hw/intc/trace-events b/hw/intc/trace-events
> index b86f242b0fcf..e34ecf7a16e5 100644
> --- a/hw/intc/trace-events
> +++ b/hw/intc/trace-events
> @@ -65,8 +65,6 @@ xics_ics_simple_reject(int nr, int srcno) "reject irq 0x%x [src %d]"
>  xics_ics_simple_eoi(int nr) "ics_eoi: irq 0x%x"
>  xics_alloc(int irq) "irq %d"
>  xics_alloc_block(int first, int num, bool lsi, int align) "first irq %d, %d irqs, lsi=%d, alignnum %d"
> -xics_ics_free(int src, int irq, int num) "Source#%d, first irq %d, %d irqs"
> -xics_ics_free_warn(int src, int irq) "Source#%d, irq %d is already free"
>  
>  # hw/intc/s390_flic_kvm.c
>  flic_create_device(int err) "flic: create device failed %d"
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index cc9816e7f204..2c4899f278e2 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -53,6 +53,7 @@ void icp_pic_print_info(ICPState *icp, Monitor *mon)
>  void ics_pic_print_info(ICSState *ics, Monitor *mon)
>  {
>      uint32_t i;
> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
>  
>      monitor_printf(mon, "ICS %4x..%4x %p\n",
>                     ics->offset, ics->offset + ics->nr_irqs - 1, ics);
> @@ -64,7 +65,7 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon)
>      for (i = 0; i < ics->nr_irqs; i++) {
>          ICSIRQState *irq = ics->irqs + i;
>  
> -        if (!(irq->flags & XICS_FLAGS_IRQ_MASK)) {
> +        if (!xic->irq_test(ics->xics, i + ics->offset)) {
>              continue;
>          }
>          monitor_printf(mon, "  %4x %s %02x %02x\n",
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index e8c0a1b3e903..de9e65d35247 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -245,50 +245,26 @@ void xics_spapr_init(sPAPRMachineState *spapr)
>      spapr_register_hypercall(H_IPOLL, h_ipoll);
>  }
>  
> -#define ICS_IRQ_FREE(ics, srcno)   \
> -    (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
> -
> -static int ics_find_free_block(ICSState *ics, int num, int alignnum)
> -{
> -    int first, i;
> -
> -    for (first = 0; first < ics->nr_irqs; first += alignnum) {
> -        if (num > (ics->nr_irqs - first)) {
> -            return -1;
> -        }
> -        for (i = first; i < first + num; ++i) {
> -            if (!ICS_IRQ_FREE(ics, i)) {
> -                break;
> -            }
> -        }
> -        if (i == (first + num)) {
> -            return first;
> -        }
> -    }
> -
> -    return -1;
> -}
> -
>  int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
>  {
>      int irq;
> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
>  
>      if (!ics) {
>          return -1;
>      }
>      if (irq_hint) {
> -        if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) {
> +        if (xic->irq_test(ics->xics, irq_hint)) {
>              error_setg(errp, "can't allocate IRQ %d: already in use", irq_hint);
>              return -1;
>          }
>          irq = irq_hint;
>      } else {
> -        irq = ics_find_free_block(ics, 1, 1);
> +        irq = xic->irq_alloc_block(ics->xics, 1, 1);
>          if (irq < 0) {
>              error_setg(errp, "can't allocate IRQ: no IRQ left");
>              return -1;
>          }
> -        irq += ics->offset;
>      }
>  
>      ics_set_irq_type(ics, irq - ics->offset, lsi);
> @@ -305,6 +281,7 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
>                            bool align, Error **errp)
>  {
>      int i, first = -1;
> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
>  
>      if (!ics) {
>          return -1;
> @@ -320,9 +297,9 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
>      if (align) {
>          assert((num == 1) || (num == 2) || (num == 4) ||
>                 (num == 8) || (num == 16) || (num == 32));
> -        first = ics_find_free_block(ics, num, num);
> +        first = xic->irq_alloc_block(ics->xics, num, num);
>      } else {
> -        first = ics_find_free_block(ics, num, 1);
> +        first = xic->irq_alloc_block(ics->xics, num, 1);
>      }
>      if (first < 0) {
>          error_setg(errp, "can't find a free %d-IRQ block", num);
> @@ -330,33 +307,19 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
>      }
>  
>      for (i = first; i < first + num; ++i) {
> -        ics_set_irq_type(ics, i, lsi);
> +        ics_set_irq_type(ics, i - ics->offset, lsi);
>      }
> -    first += ics->offset;
>  
>      trace_xics_alloc_block(first, num, lsi, align);
>  
>      return first;
>  }
>  
> -static void ics_free(ICSState *ics, int srcno, int num)
> -{
> -    int i;
> -
> -    for (i = srcno; i < srcno + num; ++i) {
> -        if (ICS_IRQ_FREE(ics, i)) {
> -            trace_xics_ics_free_warn(0, i + ics->offset);
> -        }
> -        memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
> -    }
> -}
> -
>  void spapr_ics_free(ICSState *ics, int irq, int num)
>  {
> -    if (ics_valid_irq(ics, irq)) {
> -        trace_xics_ics_free(0, irq, num);
> -        ics_free(ics, irq - ics->offset, num);
> -    }
> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(ics->xics);
> +
> +    xic->irq_free_block(ics->xics, irq, num);
>  }
>  
>  void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle)
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index c35c439d816b..8288940ef9d7 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -1018,6 +1018,23 @@ static ICPState *pnv_icp_get(XICSFabric *xi, int pir)
>      return cpu ? ICP(cpu->intc) : NULL;
>  }
>  
> +static bool pnv_irq_test(XICSFabric *xi, int irq)
> +{
> +    PnvMachineState *pnv = POWERNV_MACHINE(xi);
> +    int i;
> +
> +    /* We don't have a IRQ allocator for the PowerNV machine yet, so
> +     * just check that the IRQ number is valid for the PSI source
> +     */
> +    for (i = 0; i < pnv->num_chips; i++) {
> +        ICSState *ics = &pnv->chips[i]->psi.ics;
> +        if (ics_valid_irq(ics, irq)) {
> +            return true;
> +        }
> +    }
> +    return false;
> +}
> +
>  static void pnv_pic_print_info(InterruptStatsProvider *obj,
>                                 Monitor *mon)
>  {
> @@ -1102,6 +1119,7 @@ static void powernv_machine_class_init(ObjectClass *oc, void *data)
>      xic->icp_get = pnv_icp_get;
>      xic->ics_get = pnv_ics_get;
>      xic->ics_resend = pnv_ics_resend;
> +    xic->irq_test = pnv_irq_test;
>      ispc->print_info = pnv_pic_print_info;
>  
>      powernv_machine_class_props_init(oc);
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 84d68f2fdbae..4bdceb45a14f 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3536,19 +3536,69 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>      return cpu ? ICP(cpu->intc) : NULL;
>  }
>  
> +#define ICS_IRQ_FREE(ics, srcno)   \
> +    (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
> +
> +static int ics_find_free_block(ICSState *ics, int num, int alignnum)
> +{
> +    int first, i;
> +
> +    for (first = 0; first < ics->nr_irqs; first += alignnum) {
> +        if (num > (ics->nr_irqs - first)) {
> +            return -1;
> +        }
> +        for (i = first; i < first + num; ++i) {
> +            if (!ICS_IRQ_FREE(ics, i)) {
> +                break;
> +            }
> +        }
> +        if (i == (first + num)) {
> +            return first;
> +        }
> +    }
> +
> +    return -1;
> +}
> +
>  static bool spapr_irq_test(XICSFabric *xi, int irq)
>  {
> -    return false;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> +    ICSState *ics = spapr->ics;
> +    int srcno = irq - ics->offset;
> +
> +    return !ICS_IRQ_FREE(ics, srcno);
>  }
>  
>  static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>  {
> -    return -1;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> +    ICSState *ics = spapr->ics;
> +    int srcno;
> +
> +    srcno = ics_find_free_block(ics, count, align);
> +    if (srcno == -1) {
> +        return -1;
> +    }
> +
> +    return srcno + ics->offset;
>  }
>  
>  static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>  {
> -    ;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> +    ICSState *ics = spapr->ics;
> +    int srcno = irq - ics->offset;
> +    int i;
> +
> +    if (ics_valid_irq(ics, irq)) {
> +        trace_spapr_irq_free(0, irq, num);
> +        for (i = srcno; i < srcno + num; ++i) {
> +            if (ICS_IRQ_FREE(ics, i)) {
> +                trace_spapr_irq_free_warn(0, i + ics->offset);
> +            }
> +            memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
> +        }
> +    }
>  }
>  
>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> diff --git a/hw/ppc/trace-events b/hw/ppc/trace-events
> index 4a6a6490fa78..dc9ab4c4deb3 100644
> --- a/hw/ppc/trace-events
> +++ b/hw/ppc/trace-events
> @@ -12,6 +12,8 @@ spapr_pci_msi_retry(unsigned config_addr, unsigned req_num, unsigned max_irqs) "
>  # hw/ppc/spapr.c
>  spapr_cas_failed(unsigned long n) "DT diff buffer is too small: %ld bytes"
>  spapr_cas_continue(unsigned long n) "Copy changes to the guest: %ld bytes"
> +spapr_irq_free(int src, int irq, int num) "Source#%d, first irq %d, %d irqs"
> +spapr_irq_free_warn(int src, int irq) "Source#%d, irq %d is already free"
>  
>  # hw/ppc/spapr_hcall.c
>  spapr_cas_pvr_try(uint32_t pvr) "0x%x"


* Re: [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type
  2017-11-13  9:50     ` Greg Kurz
@ 2017-11-14  9:08       ` David Gibson
  0 siblings, 0 replies; 79+ messages in thread
From: David Gibson @ 2017-11-14  9:08 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Cédric Le Goater, qemu-ppc, qemu-devel, Benjamin Herrenschmidt


On Mon, Nov 13, 2017 at 10:50:10AM +0100, Greg Kurz wrote:
> On Mon, 13 Nov 2017 16:51:03 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Fri, Nov 10, 2017 at 03:20:07PM +0000, Cédric Le Goater wrote:
> > > Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > > ---
> > >  hw/ppc/spapr.c | 16 +++++++++++++++-
> > >  1 file changed, 15 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index d682f013d422..a2dcbee07214 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -3687,6 +3687,20 @@ static const TypeInfo spapr_machine_info = {
> > >      type_init(spapr_machine_register_##suffix)
> > >  
> > >  /*
> > > + * pseries-2.12
> > > + */
> > > +static void spapr_machine_2_12_instance_options(MachineState *machine)
> > > +{
> > > +}
> > > +
> > > +static void spapr_machine_2_12_class_options(MachineClass *mc)
> > > +{
> > > +    /* Defaults for the latest behaviour inherited from the base class */
> > > +}
> > > +
> > > +DEFINE_SPAPR_MACHINE(2_12, "2.12", true);
> > > +
> > > +/*
> > >   * pseries-2.11
> > >   */
> > >  static void spapr_machine_2_11_instance_options(MachineState *machine)
> > > @@ -3698,7 +3712,7 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
> > >      /* Defaults for the latest behaviour inherited from the base class */
> > >  }
> > >  
> > > -DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> > > +DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
> > >  
> > >  /*
> > >   * pseries-2.10  
> > 
> > Uh.. not quite right.  You also need to chain the 2.11 hooks onto the
> 
> Well, this happens in patch 5, but you're right it probably makes more
> sense to consolidate in a single patch.
> 
> Also, this patch should have dropped the "Defaults for the latest..."
> comment...
> 
> > new 2.12 ones.  Never mind, we'll need it sooner or later, so I've put
> > my own version into ppc-for-2.12.
> > 
> 
> ... and so should your version I guess.

Good point, amended.
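
For illustration, the chaining discussed above can be sketched as a
standalone program. The Demo* names and the values are hypothetical
stand-ins, not the QEMU types; the point is only that the older
machine type first applies the newer defaults and then overrides what
must stay compatible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a machine class; not the QEMU MachineClass. */
enum DemoIrqBackend { DEMO_IRQ_BITMAP, DEMO_IRQ_ICS_ARRAY };

typedef struct DemoMachineClass {
    enum DemoIrqBackend irq_backend; /* which allocator this type uses */
    int nr_irqs;                     /* a default shared by all types */
} DemoMachineClass;

/* Newest machine type: sets the latest defaults. */
void demo_machine_2_12_class_options(DemoMachineClass *mc)
{
    assert(mc != NULL);
    mc->irq_backend = DEMO_IRQ_BITMAP;
    mc->nr_irqs = 1024; /* assumed value, for illustration only */
}

/* Older machine type: chain onto the newer hook, then override. */
void demo_machine_2_11_class_options(DemoMachineClass *mc)
{
    demo_machine_2_12_class_options(mc);
    mc->irq_backend = DEMO_IRQ_ICS_ARRAY; /* keep the old allocator */
}
```

With this shape, anything added to the 2.12 hook later is picked up by
2.11 unless 2.11 overrides it explicitly.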

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap Cédric Le Goater
@ 2017-11-14  9:42   ` Greg Kurz
  2017-11-14 11:54     ` Cédric Le Goater
  2017-11-17  4:50     ` David Gibson
  0 siblings, 2 replies; 79+ messages in thread
From: Greg Kurz @ 2017-11-14  9:42 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Fri, 10 Nov 2017 15:20:11 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> Let's define a new set of XICSFabric IRQ operations for the latest
> pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
> allocator.
> 
> The previous pseries machines keep the old set of IRQ operations using
> the ICSIRQState array.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  Changes since v2 :
> 
>  - introduced a second set of XICSFabric IRQ operations for older
>    pseries machines
> 
>  hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
>  include/hw/ppc/spapr.h |  3 ++
>  2 files changed, 74 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 4bdceb45a14f..4ef0b73559ca 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
>      },
>  };
>  
> +static bool spapr_irq_map_needed(void *opaque)
> +{
> +    return true;

I see that the next patch adds some code to avoid sending the
bitmap if it doesn't contain state, but I guess you should also
have this function explicitly return false for older machine
types (see remark below).

> +}
> +
> +static const VMStateDescription vmstate_spapr_irq_map = {
> +    .name = "spapr_irq_map",
> +    .version_id = 0,
> +    .minimum_version_id = 0,
> +    .needed = spapr_irq_map_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, nr_irqs),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  static const VMStateDescription vmstate_spapr = {
>      .name = "spapr",
>      .version_id = 3,
> @@ -1700,6 +1716,7 @@ static const VMStateDescription vmstate_spapr = {
>          &vmstate_spapr_ov5_cas,
>          &vmstate_spapr_patb_entry,
>          &vmstate_spapr_pending_events,
> +        &vmstate_spapr_irq_map,
>          NULL
>      }
>  };
> @@ -2337,8 +2354,12 @@ static void ppc_spapr_init(MachineState *machine)
>      /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
>      load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
>  
> +    /* Initialize the IRQ allocator */
> +    spapr->nr_irqs  = XICS_IRQS_SPAPR;
> +    spapr->irq_map  = bitmap_new(spapr->nr_irqs);
> +

I think you should introduce a sPAPRMachineClass::has_irq_bitmap boolean
so that the bitmap is only allocated for newer machine types. And you should
then use this flag in spapr_irq_map_needed() above.

Apart from that, the rest of the patch looks good.
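
A minimal standalone sketch of what I have in mind, assuming a
has_irq_bitmap flag on the class. The Demo* types and helpers are
hypothetical and only mimic the shape of the real code; calloc()
stands in for bitmap_new().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical class-level flag: true only for newer machine types. */
typedef struct DemoMachineClass {
    bool has_irq_bitmap;
} DemoMachineClass;

typedef struct DemoMachineState {
    const DemoMachineClass *klass;
    int nr_irqs;
    unsigned long *irq_map; /* NULL when the class has no bitmap */
} DemoMachineState;

/* Allocate the bitmap only when the machine type asks for it. */
void demo_machine_init(DemoMachineState *s, const DemoMachineClass *k,
                       int nr_irqs)
{
    assert(s != NULL && k != NULL && nr_irqs > 0);
    s->klass = k;
    s->nr_irqs = nr_irqs;
    s->irq_map = NULL;
    if (k->has_irq_bitmap) {
        size_t nlongs = ((size_t)nr_irqs + DEMO_BITS_PER_LONG - 1) /
                        DEMO_BITS_PER_LONG;
        s->irq_map = calloc(nlongs, sizeof(unsigned long));
    }
}

/* Migration hook: older machine types never send the subsection. */
bool demo_irq_map_needed(void *opaque)
{
    DemoMachineState *s = opaque;

    return s->klass->has_irq_bitmap;
}
```

The next patch could then combine this check with the comparison
against the reference bitmap.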

>      /* Set up Interrupt Controller before we create the VCPUs */
> -    xics_system_init(machine, XICS_IRQS_SPAPR, &error_fatal);
> +    xics_system_init(machine, spapr->nr_irqs, &error_fatal);
>  
>      /* Set up containers for ibm,client-architecture-support negotiated options
>       */
> @@ -3560,7 +3581,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>      return -1;
>  }
>  
> -static bool spapr_irq_test(XICSFabric *xi, int irq)
> +static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>      ICSState *ics = spapr->ics;
> @@ -3569,7 +3590,7 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
>      return !ICS_IRQ_FREE(ics, srcno);
>  }
>  
> -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>      ICSState *ics = spapr->ics;
> @@ -3583,7 +3604,7 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>      return srcno + ics->offset;
>  }
>  
> -static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> +static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>      ICSState *ics = spapr->ics;
> @@ -3601,6 +3622,46 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>      }
>  }
>  
> +static bool spapr_irq_test(XICSFabric *xi, int irq)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> +    int srcno = irq - spapr->ics->offset;
> +
> +    return test_bit(srcno, spapr->irq_map);
> +}
> +
> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> +    int start = 0;
> +    int srcno;
> +
> +    /*
> +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
> +     * should be one less than a power of 2; 0 means no
> +     * alignment. Adapt the 'align' value of the former allocator to
> +     * fit the requirements of bitmap_find_next_zero_area()
> +     */
> +    align -= 1;
> +
> +    srcno = bitmap_find_next_zero_area(spapr->irq_map, spapr->nr_irqs, start,
> +                                       count, align);
> +    if (srcno == spapr->nr_irqs) {
> +        return -1;
> +    }
> +
> +    bitmap_set(spapr->irq_map, srcno, count);
> +    return srcno + spapr->ics->offset;
> +}
> +
> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> +    int srcno = irq - spapr->ics->offset;
> +
> +    bitmap_clear(spapr->irq_map, srcno, num);
> +}
> +
>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>                                   Monitor *mon)
>  {
> @@ -3778,7 +3839,12 @@ static void spapr_machine_2_11_instance_options(MachineState *machine)
>  
>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>  {
> -    /* Defaults for the latest behaviour inherited from the base class */
> +    XICSFabricClass *xic = XICS_FABRIC_CLASS(mc);
> +
> +    spapr_machine_2_12_class_options(mc);
> +    xic->irq_test = spapr_irq_test_2_11;
> +    xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
> +    xic->irq_free_block = spapr_irq_free_block_2_11;
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 9d21ca9bde3a..5835c694caff 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -7,6 +7,7 @@
>  #include "hw/ppc/spapr_drc.h"
>  #include "hw/mem/pc-dimm.h"
>  #include "hw/ppc/spapr_ovec.h"
> +#include "qemu/bitmap.h"
>  
>  struct VIOsPAPRBus;
>  struct sPAPRPHBState;
> @@ -78,6 +79,8 @@ struct sPAPRMachineState {
>      struct VIOsPAPRBus *vio_bus;
>      QLIST_HEAD(, sPAPRPHBState) phbs;
>      struct sPAPRNVRAM *nvram;
> +    int32_t nr_irqs;
> +    unsigned long *irq_map;
>      ICSState *ics;
>      sPAPRRTCState rtc;
>  


* Re: [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-14  9:42   ` Greg Kurz
@ 2017-11-14 11:54     ` Cédric Le Goater
  2017-11-14 15:28       ` Greg Kurz
  2017-11-17  4:50     ` David Gibson
  1 sibling, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-14 11:54 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On 11/14/2017 09:42 AM, Greg Kurz wrote:
> On Fri, 10 Nov 2017 15:20:11 +0000
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> Let's define a new set of XICSFabric IRQ operations for the latest
>> pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
>> allocator.
>>
>> The previous pseries machines keep the old set of IRQ operations using
>> the ICSIRQState array.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>
>>  Changes since v2 :
>>
>>  - introduced a second set of XICSFabric IRQ operations for older
>>    pseries machines
>>
>>  hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
>>  include/hw/ppc/spapr.h |  3 ++
>>  2 files changed, 74 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 4bdceb45a14f..4ef0b73559ca 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
>>      },
>>  };
>>  
>> +static bool spapr_irq_map_needed(void *opaque)
>> +{
>> +    return true;
> 
> I see that the next patch adds some code to avoid sending the
> bitmap if it doesn't contain state, but I guess you should also
> have this function explicitly return false for older machine
> types (see remark below).
> 
>> +}
>> +
>> +static const VMStateDescription vmstate_spapr_irq_map = {
>> +    .name = "spapr_irq_map",
>> +    .version_id = 0,
>> +    .minimum_version_id = 0,
>> +    .needed = spapr_irq_map_needed,
>> +    .fields = (VMStateField[]) {
>> +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, nr_irqs),
>> +        VMSTATE_END_OF_LIST()
>> +    },
>> +};
>> +
>>  static const VMStateDescription vmstate_spapr = {
>>      .name = "spapr",
>>      .version_id = 3,
>> @@ -1700,6 +1716,7 @@ static const VMStateDescription vmstate_spapr = {
>>          &vmstate_spapr_ov5_cas,
>>          &vmstate_spapr_patb_entry,
>>          &vmstate_spapr_pending_events,
>> +        &vmstate_spapr_irq_map,
>>          NULL
>>      }
>>  };
>> @@ -2337,8 +2354,12 @@ static void ppc_spapr_init(MachineState *machine)
>>      /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
>>      load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
>>  
>> +    /* Initialize the IRQ allocator */
>> +    spapr->nr_irqs  = XICS_IRQS_SPAPR;
>> +    spapr->irq_map  = bitmap_new(spapr->nr_irqs);
>> +
> 
> I think you should introduce a sPAPRMachineClass::has_irq_bitmap boolean
> so that the bitmap is only allocated for newer machine types. And you should
> then use this flag in spapr_irq_map_needed() above.

Yes. I can add a bool to be more explicit about the use of the bitmap.

Thanks,

C. 


> 
> Apart from that, the rest of the patch looks good.
> 
>>      /* Set up Interrupt Controller before we create the VCPUs */
>> -    xics_system_init(machine, XICS_IRQS_SPAPR, &error_fatal);
>> +    xics_system_init(machine, spapr->nr_irqs, &error_fatal);
>>  
>>      /* Set up containers for ibm,client-architecture-support negotiated options
>>       */
>> @@ -3560,7 +3581,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>>      return -1;
>>  }
>>  
>> -static bool spapr_irq_test(XICSFabric *xi, int irq)
>> +static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>      ICSState *ics = spapr->ics;
>> @@ -3569,7 +3590,7 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
>>      return !ICS_IRQ_FREE(ics, srcno);
>>  }
>>  
>> -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>> +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>      ICSState *ics = spapr->ics;
>> @@ -3583,7 +3604,7 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>      return srcno + ics->offset;
>>  }
>>  
>> -static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>> +static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>      ICSState *ics = spapr->ics;
>> @@ -3601,6 +3622,46 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>      }
>>  }
>>  
>> +static bool spapr_irq_test(XICSFabric *xi, int irq)
>> +{
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>> +    int srcno = irq - spapr->ics->offset;
>> +
>> +    return test_bit(srcno, spapr->irq_map);
>> +}
>> +
>> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>> +{
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>> +    int start = 0;
>> +    int srcno;
>> +
>> +    /*
>> +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
>> +     * should be one less than a power of 2; 0 means no
>> +     * alignment. Adapt the 'align' value of the former allocator to
>> +     * fit the requirements of bitmap_find_next_zero_area()
>> +     */
>> +    align -= 1;
>> +
>> +    srcno = bitmap_find_next_zero_area(spapr->irq_map, spapr->nr_irqs, start,
>> +                                       count, align);
>> +    if (srcno == spapr->nr_irqs) {
>> +        return -1;
>> +    }
>> +
>> +    bitmap_set(spapr->irq_map, srcno, count);
>> +    return srcno + spapr->ics->offset;
>> +}
>> +
>> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>> +{
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>> +    int srcno = irq - spapr->ics->offset;
>> +
>> +    bitmap_clear(spapr->irq_map, srcno, num);
>> +}
>> +
>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>>                                   Monitor *mon)
>>  {
>> @@ -3778,7 +3839,12 @@ static void spapr_machine_2_11_instance_options(MachineState *machine)
>>  
>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>>  {
>> -    /* Defaults for the latest behaviour inherited from the base class */
>> +    XICSFabricClass *xic = XICS_FABRIC_CLASS(mc);
>> +
>> +    spapr_machine_2_12_class_options(mc);
>> +    xic->irq_test = spapr_irq_test_2_11;
>> +    xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
>> +    xic->irq_free_block = spapr_irq_free_block_2_11;
>>  }
>>  
>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 9d21ca9bde3a..5835c694caff 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -7,6 +7,7 @@
>>  #include "hw/ppc/spapr_drc.h"
>>  #include "hw/mem/pc-dimm.h"
>>  #include "hw/ppc/spapr_ovec.h"
>> +#include "qemu/bitmap.h"
>>  
>>  struct VIOsPAPRBus;
>>  struct sPAPRPHBState;
>> @@ -78,6 +79,8 @@ struct sPAPRMachineState {
>>      struct VIOsPAPRBus *vio_bus;
>>      QLIST_HEAD(, sPAPRPHBState) phbs;
>>      struct sPAPRNVRAM *nvram;
>> +    int32_t nr_irqs;
>> +    unsigned long *irq_map;
>>      ICSState *ics;
>>      sPAPRRTCState rtc;
>>  
> 


* Re: [Qemu-devel] [PATCH for-2.12 v3 06/11] spapr: store a reference IRQ bitmap
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 06/11] spapr: store a reference IRQ bitmap Cédric Le Goater
@ 2017-11-14 15:12   ` Greg Kurz
  0 siblings, 0 replies; 79+ messages in thread
From: Greg Kurz @ 2017-11-14 15:12 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Fri, 10 Nov 2017 15:20:12 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> To save some state when the guest is migrated, we capture the IRQ
> bitmap after all devices have been reset and store it as a reference
> for the machine.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  We should probably merge this patch with the previous in the next
>  versions of the patchset. For the moment, I thought it would be
>  interesting to isolate the topic for discussion.
> 

Indeed. So this will be able to catch all devices that are internally
created by the machine and the ones from the command line. If QEMU
is started with -S and the user does some device_add, the IRQs of the
corresponding devices won't be in the reference bitmap, so the two
bitmaps will differ and we'll migrate the bitmap.
It is also possible to capture the state when the guest is started for
the first time, but it would bring some extra complexity... I guess
your approach is an acceptable trade-off.
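
To make the trade-off concrete, here is a small standalone sketch of
the reference-bitmap idea. The Demo* names and the 1024-IRQ size are
assumptions for illustration; memcpy()/memcmp() stand in for
bitmap_copy()/bitmap_equal().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_NR_IRQS 1024 /* assumed size, for illustration only */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NLONGS \
    ((DEMO_NR_IRQS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

typedef struct DemoMachine {
    unsigned long irq_map[DEMO_NLONGS];     /* live allocation state */
    unsigned long irq_map_ref[DEMO_NLONGS]; /* snapshot taken at reset */
} DemoMachine;

/* Mark one source number as allocated in the live bitmap. */
void demo_irq_set(DemoMachine *m, int srcno)
{
    assert(srcno >= 0 && srcno < DEMO_NR_IRQS);
    m->irq_map[srcno / DEMO_BITS_PER_LONG] |=
        1UL << (srcno % DEMO_BITS_PER_LONG);
}

/* Called once all devices have been reset: record the reference. */
void demo_reset(DemoMachine *m)
{
    memcpy(m->irq_map_ref, m->irq_map, sizeof(m->irq_map));
}

/* Migrate the bitmap only if it changed since the reference. */
bool demo_irq_map_differs_from_ref(const DemoMachine *m)
{
    return memcmp(m->irq_map, m->irq_map_ref, sizeof(m->irq_map)) != 0;
}
```

An IRQ allocated after demo_reset(), as a later device_add would do,
makes the two bitmaps differ, so the state gets sent.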

>  hw/ppc/spapr.c         | 7 ++++++-
>  include/hw/ppc/spapr.h | 1 +
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 4ef0b73559ca..bf0e5b4f815b 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1437,6 +1437,9 @@ static void ppc_spapr_reset(void)
>      qemu_devices_reset();
>      spapr_clear_pending_events(spapr);
>  
> +    spapr->irq_map_ref = bitmap_new(spapr->nr_irqs);
> +    bitmap_copy(spapr->irq_map_ref, spapr->irq_map, spapr->nr_irqs);
> +
>      /*
>       * We place the device tree and RTAS just below either the top of the RMA,
>       * or just below 2GB, whichever is lowere, so that it can be
> @@ -1683,7 +1686,9 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
>  
>  static bool spapr_irq_map_needed(void *opaque)
>  {
> -    return true;
> +    sPAPRMachineState *spapr = opaque;
> +
> +    return !bitmap_equal(spapr->irq_map, spapr->irq_map_ref, spapr->nr_irqs);
>  }
>  
>  static const VMStateDescription vmstate_spapr_irq_map = {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 5835c694caff..023436c32b2a 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -81,6 +81,7 @@ struct sPAPRMachineState {
>      struct sPAPRNVRAM *nvram;
>      int32_t nr_irqs;
>      unsigned long *irq_map;
> +    unsigned long *irq_map_ref;
>      ICSState *ics;
>      sPAPRRTCState rtc;
>  


* Re: [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-14 11:54     ` Cédric Le Goater
@ 2017-11-14 15:28       ` Greg Kurz
  2017-11-15  8:47         ` Cédric Le Goater
  0 siblings, 1 reply; 79+ messages in thread
From: Greg Kurz @ 2017-11-14 15:28 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Tue, 14 Nov 2017 11:54:53 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/14/2017 09:42 AM, Greg Kurz wrote:
> > On Fri, 10 Nov 2017 15:20:11 +0000
> > Cédric Le Goater <clg@kaod.org> wrote:
> >   
> >> Let's define a new set of XICSFabric IRQ operations for the latest
> >> pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
> >> allocator.
> >>
> >> The previous pseries machines keep the old set of IRQ operations using
> >> the ICSIRQState array.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >>
> >>  Changes since v2 :
> >>
> >>  - introduced a second set of XICSFabric IRQ operations for older
> >>    pseries machines
> >>
> >>  hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
> >>  include/hw/ppc/spapr.h |  3 ++
> >>  2 files changed, 74 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index 4bdceb45a14f..4ef0b73559ca 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
> >>      },
> >>  };
> >>  
> >> +static bool spapr_irq_map_needed(void *opaque)
> >> +{
> >> +    return true;  
> > 
> > I see that the next patch adds some code to avoid sending the
> > bitmap if it doesn't contain state, but I guess you should also
> > have this function explicitly return false for older machine
> > types (see remark below).
> >   
> >> +}
> >> +
> >> +static const VMStateDescription vmstate_spapr_irq_map = {
> >> +    .name = "spapr_irq_map",
> >> +    .version_id = 0,
> >> +    .minimum_version_id = 0,
> >> +    .needed = spapr_irq_map_needed,
> >> +    .fields = (VMStateField[]) {
> >> +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, nr_irqs),
> >> +        VMSTATE_END_OF_LIST()
> >> +    },
> >> +};
> >> +
> >>  static const VMStateDescription vmstate_spapr = {
> >>      .name = "spapr",
> >>      .version_id = 3,
> >> @@ -1700,6 +1716,7 @@ static const VMStateDescription vmstate_spapr = {
> >>          &vmstate_spapr_ov5_cas,
> >>          &vmstate_spapr_patb_entry,
> >>          &vmstate_spapr_pending_events,
> >> +        &vmstate_spapr_irq_map,
> >>          NULL
> >>      }
> >>  };
> >> @@ -2337,8 +2354,12 @@ static void ppc_spapr_init(MachineState *machine)
> >>      /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
> >>      load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
> >>  
> >> +    /* Initialize the IRQ allocator */
> >> +    spapr->nr_irqs  = XICS_IRQS_SPAPR;

BTW, is this constant for the machine lifetime? If so, maybe it should go
to sPAPRMachineClass.

> >> +    spapr->irq_map  = bitmap_new(spapr->nr_irqs);
> >> +  
> > 
> > I think you should introduce a sPAPRMachineClass::has_irq_bitmap boolean
> > so that the bitmap is only allocated for newer machine types. And you should
> > then use this flag in spapr_irq_map_needed() above.  
> 
> > Yes. I can add a bool to be more explicit about the use of the bitmap.
> 
> Thanks,
> 
> C. 
> 
> 
> > 
> > Apart from that, the rest of the patch looks good.
> >   
> >>      /* Set up Interrupt Controller before we create the VCPUs */
> >> -    xics_system_init(machine, XICS_IRQS_SPAPR, &error_fatal);
> >> +    xics_system_init(machine, spapr->nr_irqs, &error_fatal);
> >>  
> >>      /* Set up containers for ibm,client-architecture-support negotiated options
> >>       */
> >> @@ -3560,7 +3581,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
> >>      return -1;
> >>  }
> >>  
> >> -static bool spapr_irq_test(XICSFabric *xi, int irq)
> >> +static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >>      ICSState *ics = spapr->ics;
> >> @@ -3569,7 +3590,7 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
> >>      return !ICS_IRQ_FREE(ics, srcno);
> >>  }
> >>  
> >> -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> >> +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >>      ICSState *ics = spapr->ics;
> >> @@ -3583,7 +3604,7 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> >>      return srcno + ics->offset;
> >>  }
> >>  
> >> -static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> >> +static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >>      ICSState *ics = spapr->ics;
> >> @@ -3601,6 +3622,46 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> >>      }
> >>  }
> >>  
> >> +static bool spapr_irq_test(XICSFabric *xi, int irq)
> >> +{
> >> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >> +    int srcno = irq - spapr->ics->offset;
> >> +
> >> +    return test_bit(srcno, spapr->irq_map);
> >> +}
> >> +
> >> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> >> +{
> >> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >> +    int start = 0;
> >> +    int srcno;
> >> +
> >> +    /*
> >> +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
> >> +     * should be one less than a power of 2; 0 means no
> >> +     * alignment. Adapt the 'align' value of the former allocator to
> >> +     * fit the requirements of bitmap_find_next_zero_area()
> >> +     */
> >> +    align -= 1;
> >> +
> >> +    srcno = bitmap_find_next_zero_area(spapr->irq_map, spapr->nr_irqs, start,
> >> +                                       count, align);
> >> +    if (srcno == spapr->nr_irqs) {
> >> +        return -1;
> >> +    }
> >> +
> >> +    bitmap_set(spapr->irq_map, srcno, count);
> >> +    return srcno + spapr->ics->offset;
> >> +}
> >> +
> >> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> >> +{
> >> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >> +    int srcno = irq - spapr->ics->offset;
> >> +
> >> +    bitmap_clear(spapr->irq_map, srcno, num);
> >> +}
> >> +
> >>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> >>                                   Monitor *mon)
> >>  {
> >> @@ -3778,7 +3839,12 @@ static void spapr_machine_2_11_instance_options(MachineState *machine)
> >>  
> >>  static void spapr_machine_2_11_class_options(MachineClass *mc)
> >>  {
> >> -    /* Defaults for the latest behaviour inherited from the base class */
> >> +    XICSFabricClass *xic = XICS_FABRIC_CLASS(mc);
> >> +
> >> +    spapr_machine_2_12_class_options(mc);
> >> +    xic->irq_test = spapr_irq_test_2_11;
> >> +    xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
> >> +    xic->irq_free_block = spapr_irq_free_block_2_11;
> >>  }
> >>  
> >>  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
> >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >> index 9d21ca9bde3a..5835c694caff 100644
> >> --- a/include/hw/ppc/spapr.h
> >> +++ b/include/hw/ppc/spapr.h
> >> @@ -7,6 +7,7 @@
> >>  #include "hw/ppc/spapr_drc.h"
> >>  #include "hw/mem/pc-dimm.h"
> >>  #include "hw/ppc/spapr_ovec.h"
> >> +#include "qemu/bitmap.h"
> >>  
> >>  struct VIOsPAPRBus;
> >>  struct sPAPRPHBState;
> >> @@ -78,6 +79,8 @@ struct sPAPRMachineState {
> >>      struct VIOsPAPRBus *vio_bus;
> >>      QLIST_HEAD(, sPAPRPHBState) phbs;
> >>      struct sPAPRNVRAM *nvram;
> >> +    int32_t nr_irqs;
> >> +    unsigned long *irq_map;
> >>      ICSState *ics;
> >>      sPAPRRTCState rtc;
> >>    
> >   
> 


* Re: [Qemu-devel] [PATCH for-2.12 v3 07/11] spapr: introduce an 'irq_base' number
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 07/11] spapr: introduce an 'irq_base' number Cédric Le Goater
@ 2017-11-14 15:45   ` Greg Kurz
  2017-11-15 15:24     ` Cédric Le Goater
  0 siblings, 1 reply; 79+ messages in thread
From: Greg Kurz @ 2017-11-14 15:45 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Fri, 10 Nov 2017 15:20:13 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> 'irq_base' is a base IRQ number which lets us allocate only the subset
> of the IRQ numbers used on the sPAPR platform. It is in sync with the
> ICSState 'offset' attribute and this is slightly redundant. We could
> also choose to waste some extra bytes (512) and allocate the whole
> number space. To be discussed.
> 
> More importantly, it removes a dependency on the ICSState object of
> the sPAPR machine which is required for XIVE.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  hw/ppc/spapr.c         | 7 ++++---
>  include/hw/ppc/spapr.h | 1 +
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index bf0e5b4f815b..1cbbd7715a85 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2362,6 +2362,7 @@ static void ppc_spapr_init(MachineState *machine)
>      /* Initialize the IRQ allocator */
>      spapr->nr_irqs  = XICS_IRQS_SPAPR;
>      spapr->irq_map  = bitmap_new(spapr->nr_irqs);
> +    spapr->irq_base = XICS_IRQ_BASE;
> 

Since this is a constant value, do we really need a machine-level value?

Especially now that all the code that needs it is in spapr.c, I guess it
can directly use the macro, no?
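
Something along these lines, as a standalone sketch. DEMO_IRQ_BASE and
DEMO_NR_IRQS are placeholder values I picked for illustration; the real
code would use the existing XICS macros, and the helper names are
hypothetical.

```c
#include <assert.h>

/* Placeholder constants; assumed values, not taken from the headers. */
#define DEMO_IRQ_BASE 0x1000
#define DEMO_NR_IRQS  1024

/*
 * Translate a global IRQ number into a bitmap index using the macro
 * directly. Returns -1 when the number is outside the allocated range.
 */
int demo_irq_to_srcno(int irq)
{
    int srcno = irq - DEMO_IRQ_BASE;

    return (srcno >= 0 && srcno < DEMO_NR_IRQS) ? srcno : -1;
}

/* Translate a bitmap index back into a global IRQ number. */
int demo_srcno_to_irq(int srcno)
{
    assert(srcno >= 0 && srcno < DEMO_NR_IRQS);
    return srcno + DEMO_IRQ_BASE;
}
```

No per-machine field is needed as long as the base never changes at
runtime.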

>      /* Set up Interrupt Controller before we create the VCPUs */
>      xics_system_init(machine, spapr->nr_irqs, &error_fatal);
> @@ -3630,7 +3631,7 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>  static bool spapr_irq_test(XICSFabric *xi, int irq)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> -    int srcno = irq - spapr->ics->offset;
> +    int srcno = irq - spapr->irq_base;
>  
>      return test_bit(srcno, spapr->irq_map);
>  }
> @@ -3656,13 +3657,13 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>      }
>  
>      bitmap_set(spapr->irq_map, srcno, count);
> -    return srcno + spapr->ics->offset;
> +    return srcno + spapr->irq_base;
>  }
>  
>  static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> -    int srcno = irq - spapr->ics->offset;
> +    int srcno = irq - spapr->irq_base;
>  
>      bitmap_clear(spapr->irq_map, srcno, num);
>  }
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 023436c32b2a..200667dcff9d 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -82,6 +82,7 @@ struct sPAPRMachineState {
>      int32_t nr_irqs;
>      unsigned long *irq_map;
>      unsigned long *irq_map_ref;
> +    uint32_t irq_base;
>      ICSState *ics;
>      sPAPRRTCState rtc;
>  


* Re: [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation Cédric Le Goater
@ 2017-11-14 16:21   ` Greg Kurz
  2017-11-17  4:54   ` David Gibson
  1 sibling, 0 replies; 79+ messages in thread
From: Greg Kurz @ 2017-11-14 16:21 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Fri, 10 Nov 2017 15:20:14 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> It will be used later on to distinguish the allocation of an LSI
> interrupt from an MSI and also to reduce the use of the ICSIRQState
> array of the ICSState object, which is on our way to introduce XIVE.
> 
> The 'irq' parameter continues to refer to the global IRQ number space.
> 
> On PowerNV, only the PSI controller interrupts are handled and they
> are all LSIs.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/intc/xics.c        | 26 +++++++++++++++++---------
>  hw/intc/xics_kvm.c    |  4 ++--
>  hw/ppc/pnv.c          | 16 ++++++++++++++++
>  hw/ppc/spapr.c        |  9 +++++++++
>  include/hw/ppc/xics.h |  2 ++
>  5 files changed, 46 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 2c4899f278e2..42880e736697 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -33,6 +33,7 @@
>  #include "trace.h"
>  #include "qemu/timer.h"
>  #include "hw/ppc/xics.h"
> +#include "hw/ppc/spapr.h"
>  #include "qemu/error-report.h"
>  #include "qapi/visitor.h"
>  #include "monitor/monitor.h"
> @@ -70,8 +71,7 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon)
>          }
>          monitor_printf(mon, "  %4x %s %02x %02x\n",
>                         ics->offset + i,
> -                       (irq->flags & XICS_FLAGS_IRQ_LSI) ?
> -                       "LSI" : "MSI",
> +                       ics_is_lsi(ics, i) ? "LSI" : "MSI",
>                         irq->priority, irq->status);
>      }
>  }
> @@ -377,6 +377,14 @@ static const TypeInfo icp_info = {
>  /*
>   * ICS: Source layer
>   */
> +bool ics_is_lsi(ICSState *ics, int srcno)
> +{
> +    XICSFabric *xi = ics->xics;
> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
> +
> +    return xic->irq_is_lsi(xi, srcno + ics->offset);
> +}
> +
>  static void ics_simple_resend_msi(ICSState *ics, int srcno)
>  {
>      ICSIRQState *irq = ics->irqs + srcno;
> @@ -435,7 +443,7 @@ static void ics_simple_set_irq(void *opaque, int srcno, int val)
>  {
>      ICSState *ics = (ICSState *)opaque;
>  
> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
> +    if (ics_is_lsi(ics, srcno)) {
>          ics_simple_set_irq_lsi(ics, srcno, val);
>      } else {
>          ics_simple_set_irq_msi(ics, srcno, val);
> @@ -472,7 +480,7 @@ void ics_simple_write_xive(ICSState *ics, int srcno, int server,
>      trace_xics_ics_simple_write_xive(ics->offset + srcno, srcno, server,
>                                       priority);
>  
> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
> +    if (ics_is_lsi(ics, srcno)) {
>          ics_simple_write_xive_lsi(ics, srcno);
>      } else {
>          ics_simple_write_xive_msi(ics, srcno);
> @@ -484,10 +492,10 @@ static void ics_simple_reject(ICSState *ics, uint32_t nr)
>      ICSIRQState *irq = ics->irqs + nr - ics->offset;
>  
>      trace_xics_ics_simple_reject(nr, nr - ics->offset);
> -    if (irq->flags & XICS_FLAGS_IRQ_MSI) {
> -        irq->status |= XICS_STATUS_REJECTED;
> -    } else if (irq->flags & XICS_FLAGS_IRQ_LSI) {
> +    if (ics_is_lsi(ics, nr - ics->offset)) {
>          irq->status &= ~XICS_STATUS_SENT;
> +    } else {
> +        irq->status |= XICS_STATUS_REJECTED;
>      }
>  }
>  
> @@ -497,7 +505,7 @@ static void ics_simple_resend(ICSState *ics)
>  
>      for (i = 0; i < ics->nr_irqs; i++) {
>          /* FIXME: filter by server#? */
> -        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
> +        if (ics_is_lsi(ics, i)) {
>              ics_simple_resend_lsi(ics, i);
>          } else {
>              ics_simple_resend_msi(ics, i);
> @@ -512,7 +520,7 @@ static void ics_simple_eoi(ICSState *ics, uint32_t nr)
>  
>      trace_xics_ics_simple_eoi(nr);
>  
> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
> +    if (ics_is_lsi(ics, srcno)) {
>          irq->status &= ~XICS_STATUS_SENT;
>      }
>  }
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index 3091ad3ac2c8..2f10637c9f7c 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -258,7 +258,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
>              state |= KVM_XICS_MASKED;
>          }
>  
> -        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
> +        if (ics_is_lsi(ics, i)) {
>              state |= KVM_XICS_LEVEL_SENSITIVE;
>              if (irq->status & XICS_STATUS_ASSERTED) {
>                  state |= KVM_XICS_PENDING;
> @@ -293,7 +293,7 @@ static void ics_kvm_set_irq(void *opaque, int srcno, int val)
>      int rc;
>  
>      args.irq = srcno + ics->offset;
> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MSI) {
> +    if (!ics_is_lsi(ics, srcno)) {
>          if (!val) {
>              return;
>          }
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 8288940ef9d7..958223376b4c 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -1035,6 +1035,21 @@ static bool pnv_irq_test(XICSFabric *xi, int irq)
>      return false;
>  }
>  
> +static bool pnv_irq_is_lsi(XICSFabric *xi, int irq)
> +{
> +    PnvMachineState *pnv = POWERNV_MACHINE(xi);
> +    int i;
> +
> +    /* PowerNV machine only has PSI interrupts which are all LSIs */
> +    for (i = 0; i < pnv->num_chips; i++) {
> +        ICSState *ics = &pnv->chips[i]->psi.ics;
> +        if (ics_valid_irq(ics, irq)) {
> +            return true;
> +        }
> +    }
> +    return false;
> +}
> +
>  static void pnv_pic_print_info(InterruptStatsProvider *obj,
>                                 Monitor *mon)
>  {
> @@ -1120,6 +1135,7 @@ static void powernv_machine_class_init(ObjectClass *oc, void *data)
>      xic->ics_get = pnv_ics_get;
>      xic->ics_resend = pnv_ics_resend;
>      xic->irq_test = pnv_irq_test;
> +    xic->irq_is_lsi = pnv_irq_is_lsi;
>      ispc->print_info = pnv_pic_print_info;
>  
>      powernv_machine_class_props_init(oc);
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 1cbbd7715a85..ce314fcf38db 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3628,6 +3628,14 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>      }
>  }
>  
> +static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> +    int srcno = irq - spapr->ics->offset;
> +
> +    return spapr->ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI;
> +}
> +
>  static bool spapr_irq_test(XICSFabric *xi, int irq)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> @@ -3765,6 +3773,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      xic->irq_test = spapr_irq_test;
>      xic->irq_alloc_block = spapr_irq_alloc_block;
>      xic->irq_free_block = spapr_irq_free_block;
> +    xic->irq_is_lsi = spapr_irq_is_lsi;
>  
>      ispc->print_info = spapr_pic_print_info;
>      /* Force NUMA node memory size to be a multiple of
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 30e7f2e0a7dd..478f8e510179 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -179,6 +179,7 @@ typedef struct XICSFabricClass {
>      bool (*irq_test)(XICSFabric *xi, int irq);
>      int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>      void (*irq_free_block)(XICSFabric *xi, int irq, int num);
> +    bool (*irq_is_lsi)(XICSFabric *xi, int irq);
>  } XICSFabricClass;
>  
>  #define XICS_IRQS_SPAPR               1024
> @@ -205,6 +206,7 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
>  void icp_pic_print_info(ICPState *icp, Monitor *mon);
>  void ics_pic_print_info(ICSState *ics, Monitor *mon);
> +bool ics_is_lsi(ICSState *ics, int srcno);
>  
>  void ics_resend(ICSState *ics);
>  void icp_resend(ICPState *ss);


* Re: [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-14 15:28       ` Greg Kurz
@ 2017-11-15  8:47         ` Cédric Le Goater
  0 siblings, 0 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-15  8:47 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On 11/14/2017 04:28 PM, Greg Kurz wrote:
> On Tue, 14 Nov 2017 11:54:53 +0000
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> On 11/14/2017 09:42 AM, Greg Kurz wrote:
>>> On Fri, 10 Nov 2017 15:20:11 +0000
>>> Cédric Le Goater <clg@kaod.org> wrote:
>>>   
>>>> Let's define a new set of XICSFabric IRQ operations for the latest
>>>> pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
>>>> allocator.
>>>>
>>>> The previous pseries machines keep the old set of IRQ operations using
>>>> the ICSIRQState array.
>>>>
>>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>>> ---
>>>>
>>>>  Changes since v2 :
>>>>
>>>>  - introduced a second set of XICSFabric IRQ operations for older
>>>>    pseries machines
>>>>
>>>>  hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
>>>>  include/hw/ppc/spapr.h |  3 ++
>>>>  2 files changed, 74 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index 4bdceb45a14f..4ef0b73559ca 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
>>>>      },
>>>>  };
>>>>  
>>>> +static bool spapr_irq_map_needed(void *opaque)
>>>> +{
>>>> +    return true;  
>>>
>>> I see that the next patch adds some code to avoid sending the
>>> bitmap if it doesn't contain state, but I guess you should also
>>> explicitly have this function to return false for older machine
>>> types (see remark below).
>>>   
>>>> +}
>>>> +
>>>> +static const VMStateDescription vmstate_spapr_irq_map = {
>>>> +    .name = "spapr_irq_map",
>>>> +    .version_id = 0,
>>>> +    .minimum_version_id = 0,
>>>> +    .needed = spapr_irq_map_needed,
>>>> +    .fields = (VMStateField[]) {
>>>> +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, nr_irqs),
>>>> +        VMSTATE_END_OF_LIST()
>>>> +    },
>>>> +};
>>>> +
>>>>  static const VMStateDescription vmstate_spapr = {
>>>>      .name = "spapr",
>>>>      .version_id = 3,
>>>> @@ -1700,6 +1716,7 @@ static const VMStateDescription vmstate_spapr = {
>>>>          &vmstate_spapr_ov5_cas,
>>>>          &vmstate_spapr_patb_entry,
>>>>          &vmstate_spapr_pending_events,
>>>> +        &vmstate_spapr_irq_map,
>>>>          NULL
>>>>      }
>>>>  };
>>>> @@ -2337,8 +2354,12 @@ static void ppc_spapr_init(MachineState *machine)
>>>>      /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
>>>>      load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
>>>>  
>>>> +    /* Initialize the IRQ allocator */
>>>> +    spapr->nr_irqs  = XICS_IRQS_SPAPR;
> 
> BTW, is this constant for the machine lifetime ? If so, maybe it should go
> to sPAPRMachineClass.

For XIVE, we will be increasing the value of 'nr_irqs' by max_cpus
to handle the IPIs.

C.


> 
>>>> +    spapr->irq_map  = bitmap_new(spapr->nr_irqs);
>>>> +  
>>>
>>> I think you should introduce a sPAPRMachineClass::has_irq_bitmap boolean
>>> so that the bitmap is only allocated for newer machine types. And you should
>>> then use this flag in spapr_irq_map_needed() above.  
>>
>> Yes. I can add a bool to make the use of the bitmap more explicit.
>>
>> Thanks,
>>
>> C. 
>>
>>
>>>
>>> Apart from that, the rest of the patch looks good.
>>>   
>>>>      /* Set up Interrupt Controller before we create the VCPUs */
>>>> -    xics_system_init(machine, XICS_IRQS_SPAPR, &error_fatal);
>>>> +    xics_system_init(machine, spapr->nr_irqs, &error_fatal);
>>>>  
>>>>      /* Set up containers for ibm,client-architecture-support negotiated options
>>>>       */
>>>> @@ -3560,7 +3581,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>>>>      return -1;
>>>>  }
>>>>  
>>>> -static bool spapr_irq_test(XICSFabric *xi, int irq)
>>>> +static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
>>>>  {
>>>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>>>      ICSState *ics = spapr->ics;
>>>> @@ -3569,7 +3590,7 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
>>>>      return !ICS_IRQ_FREE(ics, srcno);
>>>>  }
>>>>  
>>>> -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>>> +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
>>>>  {
>>>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>>>      ICSState *ics = spapr->ics;
>>>> @@ -3583,7 +3604,7 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>>>      return srcno + ics->offset;
>>>>  }
>>>>  
>>>> -static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>>> +static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>>>>  {
>>>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>>>      ICSState *ics = spapr->ics;
>>>> @@ -3601,6 +3622,46 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>>>      }
>>>>  }
>>>>  
>>>> +static bool spapr_irq_test(XICSFabric *xi, int irq)
>>>> +{
>>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>>> +    int srcno = irq - spapr->ics->offset;
>>>> +
>>>> +    return test_bit(srcno, spapr->irq_map);
>>>> +}
>>>> +
>>>> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>>> +{
>>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>>> +    int start = 0;
>>>> +    int srcno;
>>>> +
>>>> +    /*
>>>> +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
>>>> +     * should be one less than a power of 2; 0 means no
>>>> +     * alignment. Adapt the 'align' value of the former allocator to
>>>> +     * fit the requirements of bitmap_find_next_zero_area()
>>>> +     */
>>>> +    align -= 1;
>>>> +
>>>> +    srcno = bitmap_find_next_zero_area(spapr->irq_map, spapr->nr_irqs, start,
>>>> +                                       count, align);
>>>> +    if (srcno == spapr->nr_irqs) {
>>>> +        return -1;
>>>> +    }
>>>> +
>>>> +    bitmap_set(spapr->irq_map, srcno, count);
>>>> +    return srcno + spapr->ics->offset;
>>>> +}
>>>> +
>>>> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>>> +{
>>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>>> +    int srcno = irq - spapr->ics->offset;
>>>> +
>>>> +    bitmap_clear(spapr->irq_map, srcno, num);
>>>> +}
>>>> +
>>>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>>>>                                   Monitor *mon)
>>>>  {
>>>> @@ -3778,7 +3839,12 @@ static void spapr_machine_2_11_instance_options(MachineState *machine)
>>>>  
>>>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>>>>  {
>>>> -    /* Defaults for the latest behaviour inherited from the base class */
>>>> +    XICSFabricClass *xic = XICS_FABRIC_CLASS(mc);
>>>> +
>>>> +    spapr_machine_2_12_class_options(mc);
>>>> +    xic->irq_test = spapr_irq_test_2_11;
>>>> +    xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
>>>> +    xic->irq_free_block = spapr_irq_free_block_2_11;
>>>>  }
>>>>  
>>>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
>>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>>> index 9d21ca9bde3a..5835c694caff 100644
>>>> --- a/include/hw/ppc/spapr.h
>>>> +++ b/include/hw/ppc/spapr.h
>>>> @@ -7,6 +7,7 @@
>>>>  #include "hw/ppc/spapr_drc.h"
>>>>  #include "hw/mem/pc-dimm.h"
>>>>  #include "hw/ppc/spapr_ovec.h"
>>>> +#include "qemu/bitmap.h"
>>>>  
>>>>  struct VIOsPAPRBus;
>>>>  struct sPAPRPHBState;
>>>> @@ -78,6 +79,8 @@ struct sPAPRMachineState {
>>>>      struct VIOsPAPRBus *vio_bus;
>>>>      QLIST_HEAD(, sPAPRPHBState) phbs;
>>>>      struct sPAPRNVRAM *nvram;
>>>> +    int32_t nr_irqs;
>>>> +    unsigned long *irq_map;
>>>>      ICSState *ics;
>>>>      sPAPRRTCState rtc;
>>>>    
>>>   
>>
> 


* Re: [Qemu-devel] [PATCH for-2.12 v3 07/11] spapr: introduce an 'irq_base' number
  2017-11-14 15:45   ` Greg Kurz
@ 2017-11-15 15:24     ` Cédric Le Goater
  2017-11-15 16:43       ` Greg Kurz
  0 siblings, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-15 15:24 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On 11/14/2017 03:45 PM, Greg Kurz wrote:
> On Fri, 10 Nov 2017 15:20:13 +0000
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> 'irq_base' is a base IRQ number which lets us allocate only the subset
>> of the IRQ numbers used on the sPAPR platform. It is sync with the
>> ICSState 'offset' attribute and this is slightly redundant. We could
>> also choose to waste some extra bytes (512) and allocate the whole
>> number space. To be discussed.
>>
>> But more important, it removes a dependency on the ICSState object of
>> the sPAPR machine which is required for XIVE.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  hw/ppc/spapr.c         | 7 ++++---
>>  include/hw/ppc/spapr.h | 1 +
>>  2 files changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index bf0e5b4f815b..1cbbd7715a85 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -2362,6 +2362,7 @@ static void ppc_spapr_init(MachineState *machine)
>>      /* Initialize the IRQ allocator */
>>      spapr->nr_irqs  = XICS_IRQS_SPAPR;
>>      spapr->irq_map  = bitmap_new(spapr->nr_irqs);
>> +    spapr->irq_base = XICS_IRQ_BASE;
>>
> 
> Since this is a constant value, do we really need a machine-level value ?

No, I don't think so either.

But I would like to know why we start allocating IRQ numbers at 4096.
Only 2 is reserved for IPIs, so that seems a little large. I have not
found the reason, though.


Also, I am starting to think that we should probably segment the allocation
per device, as specified in the PAPR specs. Each device has one
or more Bus Unit IDentifiers (BUIDs), which act as a prefix for the IRQ
number. That would facilitate the IRQ numbering and fix some issues
in migration when devices are hotplugged. I am thinking mostly about
PHBs.

C.


> Especially now that all the code that needs it is in spapr.c, I guess it
> can directly use the macro, no ?
> 
>>      /* Set up Interrupt Controller before we create the VCPUs */
>>      xics_system_init(machine, spapr->nr_irqs, &error_fatal);
>> @@ -3630,7 +3631,7 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>>  static bool spapr_irq_test(XICSFabric *xi, int irq)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>> -    int srcno = irq - spapr->ics->offset;
>> +    int srcno = irq - spapr->irq_base;
>>  
>>      return test_bit(srcno, spapr->irq_map);
>>  }
>> @@ -3656,13 +3657,13 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>      }
>>  
>>      bitmap_set(spapr->irq_map, srcno, count);
>> -    return srcno + spapr->ics->offset;
>> +    return srcno + spapr->irq_base;
>>  }
>>  
>>  static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>> -    int srcno = irq - spapr->ics->offset;
>> +    int srcno = irq - spapr->irq_base;
>>  
>>      bitmap_clear(spapr->irq_map, srcno, num);
>>  }
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 023436c32b2a..200667dcff9d 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -82,6 +82,7 @@ struct sPAPRMachineState {
>>      int32_t nr_irqs;
>>      unsigned long *irq_map;
>>      unsigned long *irq_map_ref;
>> +    uint32_t irq_base;
>>      ICSState *ics;
>>      sPAPRRTCState rtc;
>>  
> 


* Re: [Qemu-devel] [PATCH for-2.12 v3 09/11] spapr: split the IRQ number space for LSI interrupts
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 09/11] spapr: split the IRQ number space for LSI interrupts Cédric Le Goater
@ 2017-11-15 15:52   ` Greg Kurz
  2017-11-15 16:08     ` Cédric Le Goater
  0 siblings, 1 reply; 79+ messages in thread
From: Greg Kurz @ 2017-11-15 15:52 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Fri, 10 Nov 2017 15:20:15 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> The type of an interrupt, MSI or LSI, is stored under the flag
> attribute of the ICSIRQState array. To reduce the use of this array
> and consequently of the ICSState object (This is needed to introduce
> the new XIVE model), we choose to split the IRQ number space of the
> machine in two: first the LSIs and then the MSIs.
> 
> This also has the benefit to keep the LSI IRQ numbers in a well known
> range which will be useful for PHB hotplug.
> 

Well... LSIs indeed land in a well known range, but it isn't enough for PHB
hotplug. Each PHB is uniquely identified by its 'index' property, and we
want each PHB to have fixed LSIs, so that they are invariant across migration.
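
To illustrate what "fixed LSIs" could mean, here is a rough standalone sketch
that derives the LSI numbers purely from the PHB 'index', so the result does
not depend on plug order. The 4 LSIs per PHB come from the patch;
SKETCH_MAX_PHBS is a placeholder value, the 0x1000 base is assumed, and
phb_lsi_irq() is a hypothetical helper, not an existing QEMU function.

```c
#include <assert.h>

#define SKETCH_IRQ_BASE     0x1000  /* assumed start of the IRQ number space */
#define SKETCH_MAX_PHBS     32      /* placeholder, not the real SPAPR_MAX_PHBS */
#define SKETCH_LSIS_PER_PHB 4       /* "4 LSIs per PHB", as in the patch */

/*
 * Hypothetical helper: IRQ number of LSI 'pin' (0..3) of the PHB whose
 * 'index' property is 'index'. The result only depends on its arguments,
 * so it is the same on the source and the destination of a migration.
 * Returns -1 if either argument is out of range.
 */
static int phb_lsi_irq(int index, int pin)
{
    if (index < 0 || index >= SKETCH_MAX_PHBS ||
        pin < 0 || pin >= SKETCH_LSIS_PER_PHB) {
        return -1;
    }
    return SKETCH_IRQ_BASE + index * SKETCH_LSIS_PER_PHB + pin;
}
```

With such a mapping the LSIs would not go through the first-fit allocator at
all; only the MSIs would still be allocated dynamically.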

> This change only applies to the latest pseries machines. Older
> machines still use the ICSIRQState array to define the IRQ type.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  Changes since v2 :
> 
>  - introduced a second set of XICSFabric IRQ operations for older
>    pseries machines
> 
>  hw/intc/xics_spapr.c  |  6 +++---
>  hw/ppc/spapr.c        | 33 +++++++++++++++++++++++++++++----
>  include/hw/ppc/xics.h |  2 +-
>  3 files changed, 33 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index de9e65d35247..b8e91aaf52bd 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -260,7 +260,7 @@ int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
>          }
>          irq = irq_hint;
>      } else {
> -        irq = xic->irq_alloc_block(ics->xics, 1, 1);
> +        irq = xic->irq_alloc_block(ics->xics, 1, 1, lsi);
>          if (irq < 0) {
>              error_setg(errp, "can't allocate IRQ: no IRQ left");
>              return -1;
> @@ -297,9 +297,9 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
>      if (align) {
>          assert((num == 1) || (num == 2) || (num == 4) ||
>                 (num == 8) || (num == 16) || (num == 32));
> -        first = xic->irq_alloc_block(ics->xics, num, num);
> +        first = xic->irq_alloc_block(ics->xics, num, num, lsi);
>      } else {
> -        first = xic->irq_alloc_block(ics->xics, num, 1);
> +        first = xic->irq_alloc_block(ics->xics, num, 1, lsi);
>      }
>      if (first < 0) {
>          error_setg(errp, "can't find a free %d-IRQ block", num);
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index ce314fcf38db..f14eae6196cd 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3596,7 +3596,8 @@ static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
>      return !ICS_IRQ_FREE(ics, srcno);
>  }
>  
> -static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
> +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align,
> +                                      bool lsi)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>      ICSState *ics = spapr->ics;
> @@ -3628,7 +3629,7 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>      }
>  }
>  
> -static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
> +static bool spapr_irq_is_lsi_2_11(XICSFabric *xi, int irq)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>      int srcno = irq - spapr->ics->offset;
> @@ -3644,10 +3645,21 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
>      return test_bit(srcno, spapr->irq_map);
>  }
>  
> -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> +
> +/*
> + * Let's provision 4 LSIs per PHBs
> + */
> +#define SPAPR_MAX_LSI (SPAPR_MAX_PHBS * 4)
> +
> +/*
> + * Split the IRQ number space of the machine in two: first the LSIs
> + * and then the MSIs. This allows us to keep the LSI IRQ numbers in a
> + * well known range which is useful for PHB hotplug.
> + */
> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align, bool lsi)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> -    int start = 0;
> +    int start = lsi ? 0 : SPAPR_MAX_LSI;
>      int srcno;
>  
>      /*
> @@ -3664,6 +3676,10 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>          return -1;
>      }
>  
> +    if (lsi && srcno >= SPAPR_MAX_LSI) {
> +        return -1;
> +    }
> +
>      bitmap_set(spapr->irq_map, srcno, count);
>      return srcno + spapr->irq_base;
>  }
> @@ -3676,6 +3692,14 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>      bitmap_clear(spapr->irq_map, srcno, num);
>  }
>  
> +static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> +    int srcno = irq - spapr->irq_base;
> +
> +    return (srcno >= 0) && (srcno < SPAPR_MAX_LSI);
> +}
> +
>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>                                   Monitor *mon)
>  {
> @@ -3860,6 +3884,7 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
>      xic->irq_test = spapr_irq_test_2_11;
>      xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
>      xic->irq_free_block = spapr_irq_free_block_2_11;
> +    xic->irq_is_lsi = spapr_irq_is_lsi_2_11;
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 478f8e510179..292b929e88eb 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -177,7 +177,7 @@ typedef struct XICSFabricClass {
>      ICPState *(*icp_get)(XICSFabric *xi, int server);
>      /* IRQ allocator helpers */
>      bool (*irq_test)(XICSFabric *xi, int irq);
> -    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
> +    int (*irq_alloc_block)(XICSFabric *xi, int count, int align, bool lsi);
>      void (*irq_free_block)(XICSFabric *xi, int irq, int num);
>      bool (*irq_is_lsi)(XICSFabric *xi, int irq);
>  } XICSFabricClass;


* Re: [Qemu-devel] [PATCH for-2.12 v3 09/11] spapr: split the IRQ number space for LSI interrupts
  2017-11-15 15:52   ` Greg Kurz
@ 2017-11-15 16:08     ` Cédric Le Goater
  2017-11-15 20:27       ` Greg Kurz
  0 siblings, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-15 16:08 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On 11/15/2017 03:52 PM, Greg Kurz wrote:
> On Fri, 10 Nov 2017 15:20:15 +0000
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> The type of an interrupt, MSI or LSI, is stored under the flag
>> attribute of the ICSIRQState array. To reduce the use of this array
>> and consequently of the ICSState object (This is needed to introduce
>> the new XIVE model), we choose to split the IRQ number space of the
>> machine in two: first the LSIs and then the MSIs.
>>
>> This also has the benefit to keep the LSI IRQ numbers in a well known
>> range which will be useful for PHB hotplug.
>>
> 
> Well... LSIs indeed land in a well known range, but it isn't enough for PHB
> hotplug. Each PHB is uniquely identified by its 'index' property, and we
> want each PHB to have fixed LSIs, so that they are invariant across migration.

OK.

So, as said in another email, we should think about segmenting
the allocation per device, at least for PHBs. This is specified in
the PAPR specs: each device has one or more Bus Unit IDentifiers (BUIDs)
acting as a prefix for the IRQ number.

We could model that by using a specific range for each PHB in the
overall IRQ number space, depending on some index. LSIs would be
allocated at the beginning of this range when the device is realized,
and MSIs later on when the guest starts.

Identifying an LSI could be done using a mask on the IRQ number range
of each PHB. It should be fast enough. I don't see other devices
using LSIs under the sPAPR platform.
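
As a rough, untested sketch of that idea: give each PHB a power-of-two window
in the IRQ number space, put its LSIs in the first slots of the window, and
test the low bits to tell an LSI from an MSI. The window size (128), the
0x1000 base and all the names below are assumptions made for the example;
none of them come from the series or from PAPR.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_IRQ_BASE      0x1000                 /* assumed base */
#define SKETCH_PHB_IRQ_SHIFT 7                      /* 128 IRQs per PHB */
#define SKETCH_PHB_IRQ_RANGE (1 << SKETCH_PHB_IRQ_SHIFT)
#define SKETCH_PHB_IRQ_MASK  (SKETCH_PHB_IRQ_RANGE - 1)
#define SKETCH_PHB_NR_LSIS   4                      /* first slots of a window */

/* First IRQ number of the window owned by the PHB with this index. */
static int phb_irq_base(int phb_index)
{
    return SKETCH_IRQ_BASE + (phb_index << SKETCH_PHB_IRQ_SHIFT);
}

/*
 * An IRQ is an LSI when its offset inside its PHB window falls in the
 * first SKETCH_PHB_NR_LSIS slots. No per-IRQ state is consulted.
 */
static bool phb_irq_is_lsi(int irq)
{
    if (irq < SKETCH_IRQ_BASE) {
        return false;
    }
    return ((irq - SKETCH_IRQ_BASE) & SKETCH_PHB_IRQ_MASK) < SKETCH_PHB_NR_LSIS;
}
```

This assumes every IRQ belongs to some PHB window; devices other than PHBs
would need a separate range that the mask test never reports as LSI.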


C.



>> This change only applies to the latest pseries machines. Older
>> machines still use the ICSIRQState array to define the IRQ type.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>
>>  Changes since v2 :
>>
>>  - introduced a second set of XICSFabric IRQ operations for older
>>    pseries machines
>>
>>  hw/intc/xics_spapr.c  |  6 +++---
>>  hw/ppc/spapr.c        | 33 +++++++++++++++++++++++++++++----
>>  include/hw/ppc/xics.h |  2 +-
>>  3 files changed, 33 insertions(+), 8 deletions(-)
>>
>> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
>> index de9e65d35247..b8e91aaf52bd 100644
>> --- a/hw/intc/xics_spapr.c
>> +++ b/hw/intc/xics_spapr.c
>> @@ -260,7 +260,7 @@ int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
>>          }
>>          irq = irq_hint;
>>      } else {
>> -        irq = xic->irq_alloc_block(ics->xics, 1, 1);
>> +        irq = xic->irq_alloc_block(ics->xics, 1, 1, lsi);
>>          if (irq < 0) {
>>              error_setg(errp, "can't allocate IRQ: no IRQ left");
>>              return -1;
>> @@ -297,9 +297,9 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
>>      if (align) {
>>          assert((num == 1) || (num == 2) || (num == 4) ||
>>                 (num == 8) || (num == 16) || (num == 32));
>> -        first = xic->irq_alloc_block(ics->xics, num, num);
>> +        first = xic->irq_alloc_block(ics->xics, num, num, lsi);
>>      } else {
>> -        first = xic->irq_alloc_block(ics->xics, num, 1);
>> +        first = xic->irq_alloc_block(ics->xics, num, 1, lsi);
>>      }
>>      if (first < 0) {
>>          error_setg(errp, "can't find a free %d-IRQ block", num);
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index ce314fcf38db..f14eae6196cd 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -3596,7 +3596,8 @@ static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
>>      return !ICS_IRQ_FREE(ics, srcno);
>>  }
>>  
>> -static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
>> +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align,
>> +                                      bool lsi)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>      ICSState *ics = spapr->ics;
>> @@ -3628,7 +3629,7 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>>      }
>>  }
>>  
>> -static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
>> +static bool spapr_irq_is_lsi_2_11(XICSFabric *xi, int irq)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>      int srcno = irq - spapr->ics->offset;
>> @@ -3644,10 +3645,21 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
>>      return test_bit(srcno, spapr->irq_map);
>>  }
>>  
>> -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>> +
>> +/*
>> + * Let's provision 4 LSIs per PHBs
>> + */
>> +#define SPAPR_MAX_LSI (SPAPR_MAX_PHBS * 4)
>> +
>> +/*
>> + * Split the IRQ number space of the machine in two: first the LSIs
>> + * and then the MSIs. This allows us to keep the LSI IRQ numbers in a
>> + * well known range which is useful for PHB hotplug.
>> + */
>> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align, bool lsi)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>> -    int start = 0;
>> +    int start = lsi ? 0 : SPAPR_MAX_LSI;
>>      int srcno;
>>  
>>      /*
>> @@ -3664,6 +3676,10 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>          return -1;
>>      }
>>  
>> +    if (lsi && srcno >= SPAPR_MAX_LSI) {
>> +        return -1;
>> +    }
>> +
>>      bitmap_set(spapr->irq_map, srcno, count);
>>      return srcno + spapr->irq_base;
>>  }
>> @@ -3676,6 +3692,14 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>      bitmap_clear(spapr->irq_map, srcno, num);
>>  }
>>  
>> +static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
>> +{
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>> +    int srcno = irq - spapr->irq_base;
>> +
>> +    return (srcno >= 0) && (srcno < SPAPR_MAX_LSI);
>> +}
>> +
>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>>                                   Monitor *mon)
>>  {
>> @@ -3860,6 +3884,7 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
>>      xic->irq_test = spapr_irq_test_2_11;
>>      xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
>>      xic->irq_free_block = spapr_irq_free_block_2_11;
>> +    xic->irq_is_lsi = spapr_irq_is_lsi_2_11;
>>  }
>>  
>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>> index 478f8e510179..292b929e88eb 100644
>> --- a/include/hw/ppc/xics.h
>> +++ b/include/hw/ppc/xics.h
>> @@ -177,7 +177,7 @@ typedef struct XICSFabricClass {
>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
>>      /* IRQ allocator helpers */
>>      bool (*irq_test)(XICSFabric *xi, int irq);
>> -    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>> +    int (*irq_alloc_block)(XICSFabric *xi, int count, int align, bool lsi);
>>      void (*irq_free_block)(XICSFabric *xi, int irq, int num);
>>      bool (*irq_is_lsi)(XICSFabric *xi, int irq);
>>  } XICSFabricClass;
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 07/11] spapr: introduce an 'irq_base' number
  2017-11-15 15:24     ` Cédric Le Goater
@ 2017-11-15 16:43       ` Greg Kurz
  0 siblings, 0 replies; 79+ messages in thread
From: Greg Kurz @ 2017-11-15 16:43 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Wed, 15 Nov 2017 15:24:08 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/14/2017 03:45 PM, Greg Kurz wrote:
> > On Fri, 10 Nov 2017 15:20:13 +0000
> > Cédric Le Goater <clg@kaod.org> wrote:
> >   
> >> 'irq_base' is a base IRQ number which lets us allocate only the subset
> >> of the IRQ numbers used on the sPAPR platform. It is in sync with the
> >> ICSState 'offset' attribute and this is slightly redundant. We could
> >> also choose to waste some extra bytes (512) and allocate the whole
> >> number space. To be discussed.
> >>
> >> But more important, it removes a dependency on the ICSState object of
> >> the sPAPR machine which is required for XIVE.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >>  hw/ppc/spapr.c         | 7 ++++---
> >>  include/hw/ppc/spapr.h | 1 +
> >>  2 files changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index bf0e5b4f815b..1cbbd7715a85 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -2362,6 +2362,7 @@ static void ppc_spapr_init(MachineState *machine)
> >>      /* Initialize the IRQ allocator */
> >>      spapr->nr_irqs  = XICS_IRQS_SPAPR;
> >>      spapr->irq_map  = bitmap_new(spapr->nr_irqs);
> >> +    spapr->irq_base = XICS_IRQ_BASE;
> >>  
> > 
> > Since this is a constant value, do we really need a machine-level value ?  
> 
> No, I don't think so either.
> 
> But I would like to know why we are starting to allocate IRQ numbers 
> at 4096 ? Only 2 is reserved for IPIs. So that seems a little large. 
> I have not found the reason though.
> 

Same here... I've tried to git blame/log and google qemu-devel archives
and couldn't find anything either.

> 
> Also I am starting to think that we should probably segment the allocation 
> per device like this is specified in the PAPR specs. Each device has one 
> or more Bus Unit IDentifier (BUID) which acts as a prefix for the IRQ 
> number. That would facilitate the IRQ numbering and fix some issues 
> in migration when devices are hotplugged. I am thinking about phbs
> mostly.

Makes sense. Also there's something we should clarify: we create one ICS for
the entire machine, able to handle XICS_IRQS_SPAPR (== 1024) irqs. But each PHB
advertises it can provide XICS_IRQS_SPAPR MSIs through the “ibm,pe-total-#msi”
DT prop... this looks wrong.

> 
> C.
> 
> 
> > Especially now that all the code that needs it is in spapr.c, I guess it
> > can directly use the macro, no ?
> >   
> >>      /* Set up Interrupt Controller before we create the VCPUs */
> >>      xics_system_init(machine, spapr->nr_irqs, &error_fatal);
> >> @@ -3630,7 +3631,7 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
> >>  static bool spapr_irq_test(XICSFabric *xi, int irq)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >> -    int srcno = irq - spapr->ics->offset;
> >> +    int srcno = irq - spapr->irq_base;
> >>  
> >>      return test_bit(srcno, spapr->irq_map);
> >>  }
> >> @@ -3656,13 +3657,13 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> >>      }
> >>  
> >>      bitmap_set(spapr->irq_map, srcno, count);
> >> -    return srcno + spapr->ics->offset;
> >> +    return srcno + spapr->irq_base;
> >>  }
> >>  
> >>  static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >> -    int srcno = irq - spapr->ics->offset;
> >> +    int srcno = irq - spapr->irq_base;
> >>  
> >>      bitmap_clear(spapr->irq_map, srcno, num);
> >>  }
> >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >> index 023436c32b2a..200667dcff9d 100644
> >> --- a/include/hw/ppc/spapr.h
> >> +++ b/include/hw/ppc/spapr.h
> >> @@ -82,6 +82,7 @@ struct sPAPRMachineState {
> >>      int32_t nr_irqs;
> >>      unsigned long *irq_map;
> >>      unsigned long *irq_map_ref;
> >> +    uint32_t irq_base;
> >>      ICSState *ics;
> >>      sPAPRRTCState rtc;
> >>    
> >   
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 09/11] spapr: split the IRQ number space for LSI interrupts
  2017-11-15 16:08     ` Cédric Le Goater
@ 2017-11-15 20:27       ` Greg Kurz
  0 siblings, 0 replies; 79+ messages in thread
From: Greg Kurz @ 2017-11-15 20:27 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, David Gibson, Benjamin Herrenschmidt

On Wed, 15 Nov 2017 16:08:32 +0000
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/15/2017 03:52 PM, Greg Kurz wrote:
> > On Fri, 10 Nov 2017 15:20:15 +0000
> > Cédric Le Goater <clg@kaod.org> wrote:
> >   
> >> The type of an interrupt, MSI or LSI, is stored under the flag
> >> attribute of the ICSIRQState array. To reduce the use of this array
> >> and consequently of the ICSState object (This is needed to introduce
> >> the new XIVE model), we choose to split the IRQ number space of the
> >> machine in two: first the LSIs and then the MSIs.
> >>
> >> This also has the benefit of keeping the LSI IRQ numbers in a well known
> >> range which will be useful for PHB hotplug.
> >>  
> > 
> > Well... LSIs indeed land in a well known range, but it isn't enough for PHB
> > hotplug. Each PHB is uniquely identified by its 'index' property, and we
> > want each PHB to have fixed LSIs, so that they are invariant across migration.  
> 
> ok. 
> 
> So, as said in another email, we should think about segmenting
> the allocation per device. At least for PHBs. This is specified in
> the PAPR specs, each device has one or more Bus Unit IDentifier (BUID)

BTW, the code currently uses "BUID" to refer to the PHB Unit ID which
is a different concept in the PAPR spec. Maybe this should be fixed
for the sake of clarity ?

> acting as a prefix for the IRQ number.
> 

Since the user can instantiate multiple VIO devices, should we also
have an irq segment for the VIO bus as well ?

> We could model that by using a specific range for each PHB in the 
> overall IRQ number space, depending on some index. LSIs would be 

Some index should be the "index" property I was mentioning before.

> allocated at the beginning of this range, when the device is realized 
> and MSIs later on when the guest starts.  
> 
> Identifying an LSI could be done using a mask on the IRQ number range 
> of each PHB. It should be fast enough. I don't see other devices 
> using LSIs under the sPAPR platform.
> 

There aren't any other AFAICT.
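
To make the proposal a bit more concrete, here is a rough sketch of what
a per-PHB segmentation could look like. Everything in it is an assumption
for illustration only: the DEMO_* names, the 256-IRQ window per PHB and
the helpers do not exist in QEMU. Only the 4 LSIs per PHB and the 4096
base come from this series. Each PHB gets a fixed window derived from
its "index", LSIs take the first slots of the window, and an LSI is
recognized with a mask on the offset inside the window:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: 0x1000 mirrors the 4096 base discussed in this thread. */
#define DEMO_IRQ_BASE      0x1000u
/* Hypothetical per-PHB window size; must be a power of 2 for the mask. */
#define DEMO_IRQS_PER_PHB  0x100u
/* The patch provisions 4 LSIs per PHB. */
#define DEMO_LSIS_PER_PHB  4u

/* First IRQ number of the window owned by the PHB with this index. */
static inline uint32_t demo_phb_irq_base(uint32_t phb_index)
{
    return DEMO_IRQ_BASE + phb_index * DEMO_IRQS_PER_PHB;
}

/* Fixed IRQ number of LSI 'pin' (0..3) of a PHB. */
static inline uint32_t demo_phb_lsi(uint32_t phb_index, uint32_t pin)
{
    assert(pin < DEMO_LSIS_PER_PHB);
    return demo_phb_irq_base(phb_index) + pin;
}

/* An IRQ is an LSI if it falls in the first slots of some PHB window. */
static inline bool demo_irq_is_lsi(uint32_t irq)
{
    if (irq < DEMO_IRQ_BASE) {
        return false;
    }
    return ((irq - DEMO_IRQ_BASE) & (DEMO_IRQS_PER_PHB - 1))
           < DEMO_LSIS_PER_PHB;
}
```

With such a layout the LSIs of a PHB depend only on its index, so they
would not change with the hotplug order or across migration. VIO devices
and an upper bound on the number of PHBs are left out of this sketch.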

> 
> C.
> 
> 
> 
> >> This change only applies to the latest pseries machines. Older
> >> machines still use the ICSIRQState array to define the IRQ type.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >>
> >>  Changes since v2 :
> >>
> >>  - introduced a second set of XICSFabric IRQ operations for older
> >>    pseries machines
> >>
> >>  hw/intc/xics_spapr.c  |  6 +++---
> >>  hw/ppc/spapr.c        | 33 +++++++++++++++++++++++++++++----
> >>  include/hw/ppc/xics.h |  2 +-
> >>  3 files changed, 33 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> >> index de9e65d35247..b8e91aaf52bd 100644
> >> --- a/hw/intc/xics_spapr.c
> >> +++ b/hw/intc/xics_spapr.c
> >> @@ -260,7 +260,7 @@ int spapr_ics_alloc(ICSState *ics, int irq_hint, bool lsi, Error **errp)
> >>          }
> >>          irq = irq_hint;
> >>      } else {
> >> -        irq = xic->irq_alloc_block(ics->xics, 1, 1);
> >> +        irq = xic->irq_alloc_block(ics->xics, 1, 1, lsi);
> >>          if (irq < 0) {
> >>              error_setg(errp, "can't allocate IRQ: no IRQ left");
> >>              return -1;
> >> @@ -297,9 +297,9 @@ int spapr_ics_alloc_block(ICSState *ics, int num, bool lsi,
> >>      if (align) {
> >>          assert((num == 1) || (num == 2) || (num == 4) ||
> >>                 (num == 8) || (num == 16) || (num == 32));
> >> -        first = xic->irq_alloc_block(ics->xics, num, num);
> >> +        first = xic->irq_alloc_block(ics->xics, num, num, lsi);
> >>      } else {
> >> -        first = xic->irq_alloc_block(ics->xics, num, 1);
> >> +        first = xic->irq_alloc_block(ics->xics, num, 1, lsi);
> >>      }
> >>      if (first < 0) {
> >>          error_setg(errp, "can't find a free %d-IRQ block", num);
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index ce314fcf38db..f14eae6196cd 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -3596,7 +3596,8 @@ static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
> >>      return !ICS_IRQ_FREE(ics, srcno);
> >>  }
> >>  
> >> -static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
> >> +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align,
> >> +                                      bool lsi)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >>      ICSState *ics = spapr->ics;
> >> @@ -3628,7 +3629,7 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
> >>      }
> >>  }
> >>  
> >> -static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
> >> +static bool spapr_irq_is_lsi_2_11(XICSFabric *xi, int irq)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >>      int srcno = irq - spapr->ics->offset;
> >> @@ -3644,10 +3645,21 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
> >>      return test_bit(srcno, spapr->irq_map);
> >>  }
> >>  
> >> -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> >> +
> >> +/*
> >> + * Let's provision 4 LSIs per PHBs
> >> + */
> >> +#define SPAPR_MAX_LSI (SPAPR_MAX_PHBS * 4)
> >> +
> >> +/*
> >> + * Split the IRQ number space of the machine in two: first the LSIs
> >> + * and then the MSIs. This allows us to keep the LSI IRQ numbers in a
> >> + * well known range which is useful for PHB hotplug.
> >> + */
> >> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align, bool lsi)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >> -    int start = 0;
> >> +    int start = lsi ? 0 : SPAPR_MAX_LSI;
> >>      int srcno;
> >>  
> >>      /*
> >> @@ -3664,6 +3676,10 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> >>          return -1;
> >>      }
> >>  
> >> +    if (lsi && srcno >= SPAPR_MAX_LSI) {
> >> +        return -1;
> >> +    }
> >> +
> >>      bitmap_set(spapr->irq_map, srcno, count);
> >>      return srcno + spapr->irq_base;
> >>  }
> >> @@ -3676,6 +3692,14 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> >>      bitmap_clear(spapr->irq_map, srcno, num);
> >>  }
> >>  
> >> +static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
> >> +{
> >> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >> +    int srcno = irq - spapr->irq_base;
> >> +
> >> +    return (srcno >= 0) && (srcno < SPAPR_MAX_LSI);
> >> +}
> >> +
> >>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> >>                                   Monitor *mon)
> >>  {
> >> @@ -3860,6 +3884,7 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
> >>      xic->irq_test = spapr_irq_test_2_11;
> >>      xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
> >>      xic->irq_free_block = spapr_irq_free_block_2_11;
> >> +    xic->irq_is_lsi = spapr_irq_is_lsi_2_11;
> >>  }
> >>  
> >>  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
> >> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> >> index 478f8e510179..292b929e88eb 100644
> >> --- a/include/hw/ppc/xics.h
> >> +++ b/include/hw/ppc/xics.h
> >> @@ -177,7 +177,7 @@ typedef struct XICSFabricClass {
> >>      ICPState *(*icp_get)(XICSFabric *xi, int server);
> >>      /* IRQ allocator helpers */
> >>      bool (*irq_test)(XICSFabric *xi, int irq);
> >> -    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
> >> +    int (*irq_alloc_block)(XICSFabric *xi, int count, int align, bool lsi);
> >>      void (*irq_free_block)(XICSFabric *xi, int irq, int num);
> >>      bool (*irq_is_lsi)(XICSFabric *xi, int irq);
> >>  } XICSFabricClass;  
> >   
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator Cédric Le Goater
  2017-11-14  8:52   ` Greg Kurz
@ 2017-11-17  4:48   ` David Gibson
  2017-11-17  7:16     ` Cédric Le Goater
  1 sibling, 1 reply; 79+ messages in thread
From: David Gibson @ 2017-11-17  4:48 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 3753 bytes --]

On Fri, Nov 10, 2017 at 03:20:09PM +0000, Cédric Le Goater wrote:
> Currently, the ICSState 'ics' object of the sPAPR machine acts as the
> global interrupt source handler and also as the IRQ number allocator
> for the machine. Some IRQ numbers are allocated very early in the
> machine initialization sequence to populate the device tree, and this
> is a problem to introduce the new POWER XIVE interrupt model, as it
> needs to share the IRQ numbers with the older model.
> 
> To prepare ground for XIVE, here is a set of new XICSFabric operations
> to let the machine handle directly the IRQ number allocation and to
> decorrelate the allocation from the interrupt source object :
> 
>     bool (*irq_test)(XICSFabric *xi, int irq);
>     int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>     void (*irq_free_block)(XICSFabric *xi, int irq, int num);
> 
> In these prototypes, the 'irq' parameter refers to a number in the
> global IRQ number space. Indexes for arrays storing different state
> information about the interrupts, like the ICSIRQState, are usually
> named 'srcno'.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

This doesn't seem sensible to me.  When I said you should move irq
allocation to the machine, I meant actually move the code.  The only
user of irq allocation should be in the machine, so we shouldn't need
to indirect via the XICSFabric interface to do that.

And, we shouldn't be using XICSFabric things for XIVE.

> ---
>  hw/ppc/spapr.c        | 19 +++++++++++++++++++
>  include/hw/ppc/xics.h |  4 ++++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a2dcbee07214..84d68f2fdbae 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3536,6 +3536,21 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>      return cpu ? ICP(cpu->intc) : NULL;
>  }
>  
> +static bool spapr_irq_test(XICSFabric *xi, int irq)
> +{
> +    return false;
> +}
> +
> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> +{
> +    return -1;
> +}
> +
> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> +{
> +    ;
> +}
> +
>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>                                   Monitor *mon)
>  {
> @@ -3630,6 +3645,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      xic->ics_get = spapr_ics_get;
>      xic->ics_resend = spapr_ics_resend;
>      xic->icp_get = spapr_icp_get;
> +    xic->irq_test = spapr_irq_test;
> +    xic->irq_alloc_block = spapr_irq_alloc_block;
> +    xic->irq_free_block = spapr_irq_free_block;
> +
>      ispc->print_info = spapr_pic_print_info;
>      /* Force NUMA node memory size to be a multiple of
>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 28d248abad61..30e7f2e0a7dd 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -175,6 +175,10 @@ typedef struct XICSFabricClass {
>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
>      void (*ics_resend)(XICSFabric *xi);
>      ICPState *(*icp_get)(XICSFabric *xi, int server);
> +    /* IRQ allocator helpers */
> +    bool (*irq_test)(XICSFabric *xi, int irq);
> +    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
> +    void (*irq_free_block)(XICSFabric *xi, int irq, int num);
>  } XICSFabricClass;
>  
>  #define XICS_IRQS_SPAPR               1024

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-14  9:42   ` Greg Kurz
  2017-11-14 11:54     ` Cédric Le Goater
@ 2017-11-17  4:50     ` David Gibson
  2017-11-17  7:19       ` Cédric Le Goater
  2017-11-20 12:07       ` Greg Kurz
  1 sibling, 2 replies; 79+ messages in thread
From: David Gibson @ 2017-11-17  4:50 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Cédric Le Goater, qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 7652 bytes --]

On Tue, Nov 14, 2017 at 10:42:24AM +0100, Greg Kurz wrote:
> On Fri, 10 Nov 2017 15:20:11 +0000
> Cédric Le Goater <clg@kaod.org> wrote:
> 
> > Let's define a new set of XICSFabric IRQ operations for the latest
> > > pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
> > allocator.
> > 
> > The previous pseries machines keep the old set of IRQ operations using
> > the ICSIRQState array.
> > 
> > Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > ---
> > 
> >  Changes since v2 :
> > 
> >  - introduced a second set of XICSFabric IRQ operations for older
> >    pseries machines
> > 
> >  hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
> >  include/hw/ppc/spapr.h |  3 ++
> >  2 files changed, 74 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 4bdceb45a14f..4ef0b73559ca 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
> >      },
> >  };
> >  
> > +static bool spapr_irq_map_needed(void *opaque)
> > +{
> > +    return true;
> 
> I see that the next patch adds some code to avoid sending the
> bitmap if it doesn't contain state, but I guess you should also
> explicitly have this function to return false for older machine
> types (see remark below).

I don't see that you should need to migrate this at all.  The machine
needs to reliably allocate the same interrupts each time, and that
means source and dest should have the same allocations without
migrating data.
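
As a minimal illustration of that point (a stand-alone sketch, not the
QEMU code: the Demo* names are hypothetical and a byte array stands in
for the real bitmap), a first-fit allocator is a pure function of the
sequence of requests. Replaying the same requests in the same order on
source and destination gives the same IRQ numbers, with nothing to
migrate:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NR_IRQS   1024   /* same size as XICS_IRQS_SPAPR */
#define DEMO_IRQ_START 4096   /* assumed base of the IRQ number space */

/* One byte per source number; non-zero means allocated. */
typedef struct DemoIrqMap {
    uint8_t used[DEMO_NR_IRQS];
} DemoIrqMap;

static void demo_irq_map_init(DemoIrqMap *m)
{
    memset(m->used, 0, sizeof(m->used));
}

/*
 * First-fit search for 'count' free consecutive entries starting at a
 * multiple of 'align'. Returns the global IRQ number, or -1 if no block
 * is available.
 */
static int demo_irq_alloc_block(DemoIrqMap *m, int count, int align)
{
    assert(count > 0 && align > 0);

    for (int start = 0; start + count <= DEMO_NR_IRQS; start += align) {
        int i;

        for (i = 0; i < count && !m->used[start + i]; i++) {
            /* scan until the first used entry or the end of the block */
        }
        if (i == count) {
            memset(&m->used[start], 1, count);
            return start + DEMO_IRQ_START;
        }
    }
    return -1;
}
```

This only holds if both sides issue the requests in the same order, which
is exactly what hotplugged devices can break.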

> 
> > +}
> > +
> > +static const VMStateDescription vmstate_spapr_irq_map = {
> > +    .name = "spapr_irq_map",
> > +    .version_id = 0,
> > +    .minimum_version_id = 0,
> > +    .needed = spapr_irq_map_needed,
> > +    .fields = (VMStateField[]) {
> > +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, nr_irqs),
> > +        VMSTATE_END_OF_LIST()
> > +    },
> > +};
> > +
> >  static const VMStateDescription vmstate_spapr = {
> >      .name = "spapr",
> >      .version_id = 3,
> > @@ -1700,6 +1716,7 @@ static const VMStateDescription vmstate_spapr = {
> >          &vmstate_spapr_ov5_cas,
> >          &vmstate_spapr_patb_entry,
> >          &vmstate_spapr_pending_events,
> > +        &vmstate_spapr_irq_map,
> >          NULL
> >      }
> >  };
> > @@ -2337,8 +2354,12 @@ static void ppc_spapr_init(MachineState *machine)
> >      /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
> >      load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
> >  
> > +    /* Initialize the IRQ allocator */
> > +    spapr->nr_irqs  = XICS_IRQS_SPAPR;
> > +    spapr->irq_map  = bitmap_new(spapr->nr_irqs);
> > +
> 
> I think you should introduce a sPAPRMachineClass::has_irq_bitmap boolean
> so that the bitmap is only allocated for newer machine types. And you should
> then use this flag in spapr_irq_map_needed() above.
> 
> Apart from that, the rest of the patch looks good.
> 
> >      /* Set up Interrupt Controller before we create the VCPUs */
> > -    xics_system_init(machine, XICS_IRQS_SPAPR, &error_fatal);
> > +    xics_system_init(machine, spapr->nr_irqs, &error_fatal);
> >  
> >      /* Set up containers for ibm,client-architecture-support negotiated options
> >       */
> > @@ -3560,7 +3581,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
> >      return -1;
> >  }
> >  
> > -static bool spapr_irq_test(XICSFabric *xi, int irq)
> > +static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
> >  {
> >      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >      ICSState *ics = spapr->ics;
> > @@ -3569,7 +3590,7 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
> >      return !ICS_IRQ_FREE(ics, srcno);
> >  }
> >  
> > -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> > +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
> >  {
> >      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >      ICSState *ics = spapr->ics;
> > @@ -3583,7 +3604,7 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> >      return srcno + ics->offset;
> >  }
> >  
> > -static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> > +static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
> >  {
> >      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> >      ICSState *ics = spapr->ics;
> > @@ -3601,6 +3622,46 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> >      }
> >  }
> >  
> > +static bool spapr_irq_test(XICSFabric *xi, int irq)
> > +{
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> > +    int srcno = irq - spapr->ics->offset;
> > +
> > +    return test_bit(srcno, spapr->irq_map);
> > +}
> > +
> > +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> > +{
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> > +    int start = 0;
> > +    int srcno;
> > +
> > +    /*
> > +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
> > +     * should be one less than a power of 2; 0 means no
> > +     * alignment. Adapt the 'align' value of the former allocator to
> > +     * fit the requirements of bitmap_find_next_zero_area()
> > +     */
> > +    align -= 1;
> > +
> > +    srcno = bitmap_find_next_zero_area(spapr->irq_map, spapr->nr_irqs, start,
> > +                                       count, align);
> > +    if (srcno == spapr->nr_irqs) {
> > +        return -1;
> > +    }
> > +
> > +    bitmap_set(spapr->irq_map, srcno, count);
> > +    return srcno + spapr->ics->offset;
> > +}
> > +
> > +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> > +{
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> > +    int srcno = irq - spapr->ics->offset;
> > +
> > +    bitmap_clear(spapr->irq_map, srcno, num);
> > +}
> > +
> >  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> >                                   Monitor *mon)
> >  {
> > @@ -3778,7 +3839,12 @@ static void spapr_machine_2_11_instance_options(MachineState *machine)
> >  
> >  static void spapr_machine_2_11_class_options(MachineClass *mc)
> >  {
> > -    /* Defaults for the latest behaviour inherited from the base class */
> > +    XICSFabricClass *xic = XICS_FABRIC_CLASS(mc);
> > +
> > +    spapr_machine_2_12_class_options(mc);
> > +    xic->irq_test = spapr_irq_test_2_11;
> > +    xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
> > +    xic->irq_free_block = spapr_irq_free_block_2_11;
> >  }
> >  
> >  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 9d21ca9bde3a..5835c694caff 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -7,6 +7,7 @@
> >  #include "hw/ppc/spapr_drc.h"
> >  #include "hw/mem/pc-dimm.h"
> >  #include "hw/ppc/spapr_ovec.h"
> > +#include "qemu/bitmap.h"
> >  
> >  struct VIOsPAPRBus;
> >  struct sPAPRPHBState;
> > @@ -78,6 +79,8 @@ struct sPAPRMachineState {
> >      struct VIOsPAPRBus *vio_bus;
> >      QLIST_HEAD(, sPAPRPHBState) phbs;
> >      struct sPAPRNVRAM *nvram;
> > +    int32_t nr_irqs;
> > +    unsigned long *irq_map;
> >      ICSState *ics;
> >      sPAPRRTCState rtc;
> >  
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation
  2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation Cédric Le Goater
  2017-11-14 16:21   ` Greg Kurz
@ 2017-11-17  4:54   ` David Gibson
  2017-11-17  7:23     ` Cédric Le Goater
  1 sibling, 1 reply; 79+ messages in thread
From: David Gibson @ 2017-11-17  4:54 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 8388 bytes --]

On Fri, Nov 10, 2017 at 03:20:14PM +0000, Cédric Le Goater wrote:
> It will be used later on to distinguish the allocation of an LSI
> interrupt from an MSI and also to reduce the use of the ICSIRQState
> array of the ICSState object, which is on our way to introduce XIVE.
> 
> The 'irq' parameter continues to refer to the global IRQ number space.
> 
> On PowerNV, only the PSI controller interrupts are handled and they
> are all LSIs.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

!?! AFAICT this is a step backwards.  The users of ics_is_lsi() here
are in the xics code.  So they already have the right information
locally in the ICSState object.  Why on earth would you indirect
through the fabric?

> ---
>  hw/intc/xics.c        | 26 +++++++++++++++++---------
>  hw/intc/xics_kvm.c    |  4 ++--
>  hw/ppc/pnv.c          | 16 ++++++++++++++++
>  hw/ppc/spapr.c        |  9 +++++++++
>  include/hw/ppc/xics.h |  2 ++
>  5 files changed, 46 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 2c4899f278e2..42880e736697 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -33,6 +33,7 @@
>  #include "trace.h"
>  #include "qemu/timer.h"
>  #include "hw/ppc/xics.h"
> +#include "hw/ppc/spapr.h"
>  #include "qemu/error-report.h"
>  #include "qapi/visitor.h"
>  #include "monitor/monitor.h"
> @@ -70,8 +71,7 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon)
>          }
>          monitor_printf(mon, "  %4x %s %02x %02x\n",
>                         ics->offset + i,
> -                       (irq->flags & XICS_FLAGS_IRQ_LSI) ?
> -                       "LSI" : "MSI",
> +                       ics_is_lsi(ics, i) ? "LSI" : "MSI",

!?! 

>                         irq->priority, irq->status);
>      }
>  }
> @@ -377,6 +377,14 @@ static const TypeInfo icp_info = {
>  /*
>   * ICS: Source layer
>   */
> +bool ics_is_lsi(ICSState *ics, int srcno)
> +{
> +    XICSFabric *xi = ics->xics;
> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
> +
> +    return xic->irq_is_lsi(xi, srcno + ics->offset);
> +}
> +
>  static void ics_simple_resend_msi(ICSState *ics, int srcno)
>  {
>      ICSIRQState *irq = ics->irqs + srcno;
> @@ -435,7 +443,7 @@ static void ics_simple_set_irq(void *opaque, int srcno, int val)
>  {
>      ICSState *ics = (ICSState *)opaque;
>  
> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
> +    if (ics_is_lsi(ics, srcno)) {
>          ics_simple_set_irq_lsi(ics, srcno, val);
>      } else {
>          ics_simple_set_irq_msi(ics, srcno, val);
> @@ -472,7 +480,7 @@ void ics_simple_write_xive(ICSState *ics, int srcno, int server,
>      trace_xics_ics_simple_write_xive(ics->offset + srcno, srcno, server,
>                                       priority);
>  
> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
> +    if (ics_is_lsi(ics, srcno)) {
>          ics_simple_write_xive_lsi(ics, srcno);
>      } else {
>          ics_simple_write_xive_msi(ics, srcno);
> @@ -484,10 +492,10 @@ static void ics_simple_reject(ICSState *ics, uint32_t nr)
>      ICSIRQState *irq = ics->irqs + nr - ics->offset;
>  
>      trace_xics_ics_simple_reject(nr, nr - ics->offset);
> -    if (irq->flags & XICS_FLAGS_IRQ_MSI) {
> -        irq->status |= XICS_STATUS_REJECTED;
> -    } else if (irq->flags & XICS_FLAGS_IRQ_LSI) {
> +    if (ics_is_lsi(ics, nr - ics->offset)) {
>          irq->status &= ~XICS_STATUS_SENT;
> +    } else {
> +        irq->status |= XICS_STATUS_REJECTED;
>      }
>  }
>  
> @@ -497,7 +505,7 @@ static void ics_simple_resend(ICSState *ics)
>  
>      for (i = 0; i < ics->nr_irqs; i++) {
>          /* FIXME: filter by server#? */
> -        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
> +        if (ics_is_lsi(ics, i)) {
>              ics_simple_resend_lsi(ics, i);
>          } else {
>              ics_simple_resend_msi(ics, i);
> @@ -512,7 +520,7 @@ static void ics_simple_eoi(ICSState *ics, uint32_t nr)
>  
>      trace_xics_ics_simple_eoi(nr);
>  
> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
> +    if (ics_is_lsi(ics, srcno)) {
>          irq->status &= ~XICS_STATUS_SENT;
>      }
>  }
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index 3091ad3ac2c8..2f10637c9f7c 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -258,7 +258,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
>              state |= KVM_XICS_MASKED;
>          }
>  
> -        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
> +        if (ics_is_lsi(ics, i)) {
>              state |= KVM_XICS_LEVEL_SENSITIVE;
>              if (irq->status & XICS_STATUS_ASSERTED) {
>                  state |= KVM_XICS_PENDING;
> @@ -293,7 +293,7 @@ static void ics_kvm_set_irq(void *opaque, int srcno, int val)
>      int rc;
>  
>      args.irq = srcno + ics->offset;
> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MSI) {
> +    if (!ics_is_lsi(ics, srcno)) {
>          if (!val) {
>              return;
>          }
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 8288940ef9d7..958223376b4c 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -1035,6 +1035,21 @@ static bool pnv_irq_test(XICSFabric *xi, int irq)
>      return false;
>  }
>  
> +static bool pnv_irq_is_lsi(XICSFabric *xi, int irq)
> +{
> +    PnvMachineState *pnv = POWERNV_MACHINE(xi);
> +    int i;
> +
> +    /* PowerNV machine only has PSI interrupts which are all LSIs */
> +    for (i = 0; i < pnv->num_chips; i++) {
> +        ICSState *ics = &pnv->chips[i]->psi.ics;
> +        if (ics_valid_irq(ics, irq)) {
> +            return true;
> +        }
> +    }
> +    return false;
> +}
> +
>  static void pnv_pic_print_info(InterruptStatsProvider *obj,
>                                 Monitor *mon)
>  {
> @@ -1120,6 +1135,7 @@ static void powernv_machine_class_init(ObjectClass *oc, void *data)
>      xic->ics_get = pnv_ics_get;
>      xic->ics_resend = pnv_ics_resend;
>      xic->irq_test = pnv_irq_test;
> +    xic->irq_is_lsi = pnv_irq_is_lsi;
>      ispc->print_info = pnv_pic_print_info;
>  
>      powernv_machine_class_props_init(oc);
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 1cbbd7715a85..ce314fcf38db 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3628,6 +3628,14 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>      }
>  }
>  
> +static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> +    int srcno = irq - spapr->ics->offset;
> +
> +    return spapr->ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI;
> +}
> +
>  static bool spapr_irq_test(XICSFabric *xi, int irq)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> @@ -3765,6 +3773,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      xic->irq_test = spapr_irq_test;
>      xic->irq_alloc_block = spapr_irq_alloc_block;
>      xic->irq_free_block = spapr_irq_free_block;
> +    xic->irq_is_lsi = spapr_irq_is_lsi;
>  
>      ispc->print_info = spapr_pic_print_info;
>      /* Force NUMA node memory size to be a multiple of
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 30e7f2e0a7dd..478f8e510179 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -179,6 +179,7 @@ typedef struct XICSFabricClass {
>      bool (*irq_test)(XICSFabric *xi, int irq);
>      int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>      void (*irq_free_block)(XICSFabric *xi, int irq, int num);
> +    bool (*irq_is_lsi)(XICSFabric *xi, int irq);
>  } XICSFabricClass;
>  
>  #define XICS_IRQS_SPAPR               1024
> @@ -205,6 +206,7 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
>  void icp_pic_print_info(ICPState *icp, Monitor *mon);
>  void ics_pic_print_info(ICSState *ics, Monitor *mon);
> +bool ics_is_lsi(ICSState *ics, int srno);
>  
>  void ics_resend(ICSState *ics);
>  void icp_resend(ICPState *ss);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator
  2017-11-17  4:48   ` David Gibson
@ 2017-11-17  7:16     ` Cédric Le Goater
  2017-11-23 11:07       ` David Gibson
  0 siblings, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-17  7:16 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

On 11/17/2017 05:48 AM, David Gibson wrote:
> On Fri, Nov 10, 2017 at 03:20:09PM +0000, Cédric Le Goater wrote:
>> Currently, the ICSState 'ics' object of the sPAPR machine acts as the
>> global interrupt source handler and also as the IRQ number allocator
>> for the machine. Some IRQ numbers are allocated very early in the
>> machine initialization sequence to populate the device tree, and this
>> is a problem to introduce the new POWER XIVE interrupt model, as it
>> needs to share the IRQ numbers with the older model.
>>
>> To prepare ground for XIVE, here is a set of new XICSFabric operations
>> to let the machine handle directly the IRQ number allocation and to
>> decorrelate the allocation from the interrupt source object :
>>
>>     bool (*irq_test)(XICSFabric *xi, int irq);
>>     int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>>     void (*irq_free_block)(XICSFabric *xi, int irq, int num);
>>
>> In these prototypes, the 'irq' parameter refers to a number in the
>> global IRQ number space. Indexes for arrays storing different state
>> informations on the interrupts, like the ICSIRQState, are usually
>> named 'srcno'.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> 
> This doesn't seem sensible to me.  When I said you should move irq
> allocation to the machine, I mean actually move the code.  The only
> user of irq allocation should be in the machine, so we shouldn't need
> to indirect via the XICSFabric interface to do that.

OK. So we can probably do the same with machine class handlers, because
we do need an indirection to handle the way older pseries machines
allocate IRQs, which will change with newer machines supporting XIVE.
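
To make the idea concrete, here is a minimal standalone sketch of such an
indirection. It is not QEMU code: ToyMachine, ToyMachineClass and the
toy_* functions are hypothetical names, and the two allocators only mimic
the split discussed in this series (a per-IRQ flags array for older
machines, a bitmap for newer ones).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_IRQS        8    /* hypothetical, tiny IRQ space */
#define TOY_FLAG_ALLOCATED 0x1  /* hypothetical "in use" flag */

typedef struct ToyMachine ToyMachine;

/* Per-machine-version handlers, selected once at class init time */
typedef struct {
    const char *name;
    int (*irq_alloc)(ToyMachine *m);
} ToyMachineClass;

struct ToyMachine {
    const ToyMachineClass *mc;
    unsigned char flags[TOY_NR_IRQS]; /* state used by the older scheme */
    unsigned int irq_map;             /* one bit per IRQ, newer scheme */
};

/* Older machines: allocation state lives in the per-IRQ flags array */
static int toy_irq_alloc_2_11(ToyMachine *m)
{
    assert(m != NULL);
    for (int i = 0; i < TOY_NR_IRQS; i++) {
        if (!m->flags[i]) {
            m->flags[i] = TOY_FLAG_ALLOCATED;
            return i;
        }
    }
    return -1;
}

/* Newer machines: allocation state lives in a machine-level bitmap */
static int toy_irq_alloc_2_12(ToyMachine *m)
{
    assert(m != NULL);
    for (int i = 0; i < TOY_NR_IRQS; i++) {
        if (!(m->irq_map & (1u << i))) {
            m->irq_map |= 1u << i;
            return i;
        }
    }
    return -1;
}

static const ToyMachineClass toy_machine_2_11 = { "2.11", toy_irq_alloc_2_11 };
static const ToyMachineClass toy_machine_2_12 = { "2.12", toy_irq_alloc_2_12 };
```

Callers only go through m->mc->irq_alloc(m), so the interrupt source code
never needs to know which backing store is in use.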

> And, we shouldn't be using XICSFabric things for XIVE.

OK. The spapr machine should be enough.

Thanks,

C.
 
>> ---
>>  hw/ppc/spapr.c        | 19 +++++++++++++++++++
>>  include/hw/ppc/xics.h |  4 ++++
>>  2 files changed, 23 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index a2dcbee07214..84d68f2fdbae 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -3536,6 +3536,21 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>>      return cpu ? ICP(cpu->intc) : NULL;
>>  }
>>  
>> +static bool spapr_irq_test(XICSFabric *xi, int irq)
>> +{
>> +    return false;
>> +}
>> +
>> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>> +{
>> +    return -1;
>> +}
>> +
>> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>> +{
>> +    ;
>> +}
>> +
>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>>                                   Monitor *mon)
>>  {
>> @@ -3630,6 +3645,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>      xic->ics_get = spapr_ics_get;
>>      xic->ics_resend = spapr_ics_resend;
>>      xic->icp_get = spapr_icp_get;
>> +    xic->irq_test = spapr_irq_test;
>> +    xic->irq_alloc_block = spapr_irq_alloc_block;
>> +    xic->irq_free_block = spapr_irq_free_block;
>> +
>>      ispc->print_info = spapr_pic_print_info;
>>      /* Force NUMA node memory size to be a multiple of
>>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>> index 28d248abad61..30e7f2e0a7dd 100644
>> --- a/include/hw/ppc/xics.h
>> +++ b/include/hw/ppc/xics.h
>> @@ -175,6 +175,10 @@ typedef struct XICSFabricClass {
>>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
>>      void (*ics_resend)(XICSFabric *xi);
>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
>> +    /* IRQ allocator helpers */
>> +    bool (*irq_test)(XICSFabric *xi, int irq);
>> +    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>> +    void (*irq_free_block)(XICSFabric *xi, int irq, int num);
>>  } XICSFabricClass;
>>  
>>  #define XICS_IRQS_SPAPR               1024
> 

* Re: [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-17  4:50     ` David Gibson
@ 2017-11-17  7:19       ` Cédric Le Goater
  2017-11-23 11:08         ` David Gibson
  2017-11-20 12:07       ` Greg Kurz
  1 sibling, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-17  7:19 UTC (permalink / raw)
  To: David Gibson, Greg Kurz; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

On 11/17/2017 05:50 AM, David Gibson wrote:
> On Tue, Nov 14, 2017 at 10:42:24AM +0100, Greg Kurz wrote:
>> On Fri, 10 Nov 2017 15:20:11 +0000
>> Cédric Le Goater <clg@kaod.org> wrote:
>>
>>> Let's define a new set of XICSFabric IRQ operations for the latest
>>> pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
>>> allocator.
>>>
>>> The previous pseries machines keep the old set of IRQ operations using
>>> the ICSIRQState array.
>>>
>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>> ---
>>>
>>>  Changes since v2 :
>>>
>>>  - introduced a second set of XICSFabric IRQ operations for older
>>>    pseries machines
>>>
>>>  hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
>>>  include/hw/ppc/spapr.h |  3 ++
>>>  2 files changed, 74 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 4bdceb45a14f..4ef0b73559ca 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
>>>      },
>>>  };
>>>  
>>> +static bool spapr_irq_map_needed(void *opaque)
>>> +{
>>> +    return true;
>>
>> I see that the next patch adds some code to avoid sending the
>> bitmap if it doesn't contain state, but I guess you should also
>> explicitly have this function to return false for older machine
>> types (see remark below).
> 
> I don't see that you should need to migrate this at all.  The machine
> needs to reliably allocate the same interrupts each time, and that
> means source and dest should have the same allocations without
> migrating data.

OK. So we need to make sure that hotplugging devices or CPUs does
not break that scheme. That is not the case today unless you follow
the exact same order on the monitor.
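
A small standalone sketch of why the order matters with a first-fit
allocator. This is not the QEMU code: ToyIrqMap and the toy_* helpers are
hypothetical, and a bool array stands in for the 'irq_map' bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NR_IRQS 64  /* hypothetical; the patch uses XICS_IRQS_SPAPR */

typedef struct {
    bool used[TOY_NR_IRQS];  /* toy stand-in for the 'irq_map' bitmap */
} ToyIrqMap;

static void toy_irq_map_init(ToyIrqMap *m)
{
    memset(m->used, 0, sizeof(m->used));
}

/*
 * First-fit search for 'count' free contiguous entries starting on a
 * multiple of 'align' (a power of 2). Returns the first source number,
 * or -1 if no block fits.
 */
static int toy_irq_alloc_block(ToyIrqMap *m, int count, int align)
{
    assert(count > 0 && align > 0 && (align & (align - 1)) == 0);

    for (int start = 0; start + count <= TOY_NR_IRQS; start += align) {
        int i = 0;

        while (i < count && !m->used[start + i]) {
            i++;
        }
        if (i == count) {
            for (i = 0; i < count; i++) {
                m->used[start + i] = true;
            }
            return start;
        }
    }
    return -1;
}

static void toy_irq_free_block(ToyIrqMap *m, int first, int count)
{
    assert(first >= 0 && count > 0 && first + count <= TOY_NR_IRQS);
    for (int i = 0; i < count; i++) {
        m->used[first + i] = false;
    }
}
```

If the source plugs a device needing 1 IRQ and then one needing a block
of 4 aligned on 4, it gets source numbers 0 and 4. If the destination
plugs them in the opposite order, the block of 4 lands at 0 and the
single IRQ at 4, so the same device ends up with a different number.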

C.

>>
>>> +}
>>> +
>>> +static const VMStateDescription vmstate_spapr_irq_map = {
>>> +    .name = "spapr_irq_map",
>>> +    .version_id = 0,
>>> +    .minimum_version_id = 0,
>>> +    .needed = spapr_irq_map_needed,
>>> +    .fields = (VMStateField[]) {
>>> +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, nr_irqs),
>>> +        VMSTATE_END_OF_LIST()
>>> +    },
>>> +};
>>> +
>>>  static const VMStateDescription vmstate_spapr = {
>>>      .name = "spapr",
>>>      .version_id = 3,
>>> @@ -1700,6 +1716,7 @@ static const VMStateDescription vmstate_spapr = {
>>>          &vmstate_spapr_ov5_cas,
>>>          &vmstate_spapr_patb_entry,
>>>          &vmstate_spapr_pending_events,
>>> +        &vmstate_spapr_irq_map,
>>>          NULL
>>>      }
>>>  };
>>> @@ -2337,8 +2354,12 @@ static void ppc_spapr_init(MachineState *machine)
>>>      /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
>>>      load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
>>>  
>>> +    /* Initialize the IRQ allocator */
>>> +    spapr->nr_irqs  = XICS_IRQS_SPAPR;
>>> +    spapr->irq_map  = bitmap_new(spapr->nr_irqs);
>>> +
>>
>> I think you should introduce a sPAPRMachineClass::has_irq_bitmap boolean
>> so that the bitmap is only allocated for newer machine types. And you should
>> then use this flag in spapr_irq_map_needed() above.
>>
>> Apart from that, the rest of the patch looks good.
>>
>>>      /* Set up Interrupt Controller before we create the VCPUs */
>>> -    xics_system_init(machine, XICS_IRQS_SPAPR, &error_fatal);
>>> +    xics_system_init(machine, spapr->nr_irqs, &error_fatal);
>>>  
>>>      /* Set up containers for ibm,client-architecture-support negotiated options
>>>       */
>>> @@ -3560,7 +3581,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>>>      return -1;
>>>  }
>>>  
>>> -static bool spapr_irq_test(XICSFabric *xi, int irq)
>>> +static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
>>>  {
>>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>>      ICSState *ics = spapr->ics;
>>> @@ -3569,7 +3590,7 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
>>>      return !ICS_IRQ_FREE(ics, srcno);
>>>  }
>>>  
>>> -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>> +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
>>>  {
>>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>>      ICSState *ics = spapr->ics;
>>> @@ -3583,7 +3604,7 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>>      return srcno + ics->offset;
>>>  }
>>>  
>>> -static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>> +static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>>>  {
>>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>>      ICSState *ics = spapr->ics;
>>> @@ -3601,6 +3622,46 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>>      }
>>>  }
>>>  
>>> +static bool spapr_irq_test(XICSFabric *xi, int irq)
>>> +{
>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>> +    int srcno = irq - spapr->ics->offset;
>>> +
>>> +    return test_bit(srcno, spapr->irq_map);
>>> +}
>>> +
>>> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>> +{
>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>> +    int start = 0;
>>> +    int srcno;
>>> +
>>> +    /*
>>> +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
>>> +     * should be one less than a power of 2; 0 means no
>>> +     * alignment. Adapt the 'align' value of the former allocator to
>>> +     * fit the requirements of bitmap_find_next_zero_area()
>>> +     */
>>> +    align -= 1;
>>> +
>>> +    srcno = bitmap_find_next_zero_area(spapr->irq_map, spapr->nr_irqs, start,
>>> +                                       count, align);
>>> +    if (srcno == spapr->nr_irqs) {
>>> +        return -1;
>>> +    }
>>> +
>>> +    bitmap_set(spapr->irq_map, srcno, count);
>>> +    return srcno + spapr->ics->offset;
>>> +}
>>> +
>>> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>> +{
>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>>> +    int srcno = irq - spapr->ics->offset;
>>> +
>>> +    bitmap_clear(spapr->irq_map, srcno, num);
>>> +}
>>> +
>>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>>>                                   Monitor *mon)
>>>  {
>>> @@ -3778,7 +3839,12 @@ static void spapr_machine_2_11_instance_options(MachineState *machine)
>>>  
>>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>>>  {
>>> -    /* Defaults for the latest behaviour inherited from the base class */
>>> +    XICSFabricClass *xic = XICS_FABRIC_CLASS(mc);
>>> +
>>> +    spapr_machine_2_12_class_options(mc);
>>> +    xic->irq_test = spapr_irq_test_2_11;
>>> +    xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
>>> +    xic->irq_free_block = spapr_irq_free_block_2_11;
>>>  }
>>>  
>>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>> index 9d21ca9bde3a..5835c694caff 100644
>>> --- a/include/hw/ppc/spapr.h
>>> +++ b/include/hw/ppc/spapr.h
>>> @@ -7,6 +7,7 @@
>>>  #include "hw/ppc/spapr_drc.h"
>>>  #include "hw/mem/pc-dimm.h"
>>>  #include "hw/ppc/spapr_ovec.h"
>>> +#include "qemu/bitmap.h"
>>>  
>>>  struct VIOsPAPRBus;
>>>  struct sPAPRPHBState;
>>> @@ -78,6 +79,8 @@ struct sPAPRMachineState {
>>>      struct VIOsPAPRBus *vio_bus;
>>>      QLIST_HEAD(, sPAPRPHBState) phbs;
>>>      struct sPAPRNVRAM *nvram;
>>> +    int32_t nr_irqs;
>>> +    unsigned long *irq_map;
>>>      ICSState *ics;
>>>      sPAPRRTCState rtc;
>>>  
>>
> 

* Re: [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation
  2017-11-17  4:54   ` David Gibson
@ 2017-11-17  7:23     ` Cédric Le Goater
  2017-11-23 11:12       ` David Gibson
  0 siblings, 1 reply; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-17  7:23 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

On 11/17/2017 05:54 AM, David Gibson wrote:
> On Fri, Nov 10, 2017 at 03:20:14PM +0000, Cédric Le Goater wrote:
>> It will be used later on to distinguish the allocation of an LSI
>> interrupt from an MSI and also to reduce the use of the ICSIRQState
>> array of the ICSState object, which is on our way to introduce XIVE.
>>
>> The 'irq' parameter continues to refer to the global IRQ number space.
>>
>> On PowerNV, only the PSI controller interrupts are handled and they
>> are all LSIs.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> 
> !?! AFAICT this is a step backwards.  The users of ics_is_lsi() here
> are in the xics code.  So they already have the right information
> locally in the ICSState object.  Why on earth would you indirect
> through the fabric.

Because I am trying to get rid of the fact that the current ICS
allocator handles two things at the same time: IRQ allocation
and IRQ type. That is a problem because the ICSState object
is used everywhere in the code.
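
As a rough illustration of decoupling the type from the ICSIRQState
flags, here is a standalone sketch of the range split mentioned in the
cover letter, assuming LSIs occupy the start of the IRQ number space.
The TOY_* constants and toy_* functions are hypothetical; only the 1024
total matches XICS_IRQS_SPAPR.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_IRQ_BASE 0x1000  /* hypothetical first global IRQ number */
#define TOY_NR_IRQS  1024    /* same value as XICS_IRQS_SPAPR */
#define TOY_NR_LSIS  16      /* hypothetical size of the LSI range */

/* True if 'irq' is inside the machine's global IRQ number space */
static bool toy_irq_valid(int irq)
{
    return irq >= TOY_IRQ_BASE && irq < TOY_IRQ_BASE + TOY_NR_IRQS;
}

/*
 * The type is derived from the position in the IRQ number space, so no
 * per-IRQ flag has to be consulted: the first TOY_NR_LSIS numbers are
 * LSIs, everything after that is an MSI.
 */
static bool toy_irq_is_lsi(int irq)
{
    assert(toy_irq_valid(irq));
    return irq - TOY_IRQ_BASE < TOY_NR_LSIS;
}
```

With such a layout, the machine can answer "LSI or MSI?" from the IRQ
number alone, whatever interrupt controller model sits behind it.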

OK, so that is not to your liking. I will come up with some
other solution. Thanks for looking.

C.   
  
> 
>> ---
>>  hw/intc/xics.c        | 26 +++++++++++++++++---------
>>  hw/intc/xics_kvm.c    |  4 ++--
>>  hw/ppc/pnv.c          | 16 ++++++++++++++++
>>  hw/ppc/spapr.c        |  9 +++++++++
>>  include/hw/ppc/xics.h |  2 ++
>>  5 files changed, 46 insertions(+), 11 deletions(-)
>>
>> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
>> index 2c4899f278e2..42880e736697 100644
>> --- a/hw/intc/xics.c
>> +++ b/hw/intc/xics.c
>> @@ -33,6 +33,7 @@
>>  #include "trace.h"
>>  #include "qemu/timer.h"
>>  #include "hw/ppc/xics.h"
>> +#include "hw/ppc/spapr.h"
>>  #include "qemu/error-report.h"
>>  #include "qapi/visitor.h"
>>  #include "monitor/monitor.h"
>> @@ -70,8 +71,7 @@ void ics_pic_print_info(ICSState *ics, Monitor *mon)
>>          }
>>          monitor_printf(mon, "  %4x %s %02x %02x\n",
>>                         ics->offset + i,
>> -                       (irq->flags & XICS_FLAGS_IRQ_LSI) ?
>> -                       "LSI" : "MSI",
>> +                       ics_is_lsi(ics, i) ? "LSI" : "MSI",
> 
> !?! 
> 
>>                         irq->priority, irq->status);
>>      }
>>  }
>> @@ -377,6 +377,14 @@ static const TypeInfo icp_info = {
>>  /*
>>   * ICS: Source layer
>>   */
>> +bool ics_is_lsi(ICSState *ics, int srcno)
>> +{
>> +    XICSFabric *xi = ics->xics;
>> +    XICSFabricClass *xic = XICS_FABRIC_GET_CLASS(xi);
>> +
>> +    return xic->irq_is_lsi(xi, srcno + ics->offset);
>> +}
>> +
>>  static void ics_simple_resend_msi(ICSState *ics, int srcno)
>>  {
>>      ICSIRQState *irq = ics->irqs + srcno;
>> @@ -435,7 +443,7 @@ static void ics_simple_set_irq(void *opaque, int srcno, int val)
>>  {
>>      ICSState *ics = (ICSState *)opaque;
>>  
>> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
>> +    if (ics_is_lsi(ics, srcno)) {
>>          ics_simple_set_irq_lsi(ics, srcno, val);
>>      } else {
>>          ics_simple_set_irq_msi(ics, srcno, val);
>> @@ -472,7 +480,7 @@ void ics_simple_write_xive(ICSState *ics, int srcno, int server,
>>      trace_xics_ics_simple_write_xive(ics->offset + srcno, srcno, server,
>>                                       priority);
>>  
>> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
>> +    if (ics_is_lsi(ics, srcno)) {
>>          ics_simple_write_xive_lsi(ics, srcno);
>>      } else {
>>          ics_simple_write_xive_msi(ics, srcno);
>> @@ -484,10 +492,10 @@ static void ics_simple_reject(ICSState *ics, uint32_t nr)
>>      ICSIRQState *irq = ics->irqs + nr - ics->offset;
>>  
>>      trace_xics_ics_simple_reject(nr, nr - ics->offset);
>> -    if (irq->flags & XICS_FLAGS_IRQ_MSI) {
>> -        irq->status |= XICS_STATUS_REJECTED;
>> -    } else if (irq->flags & XICS_FLAGS_IRQ_LSI) {
>> +    if (ics_is_lsi(ics, nr - ics->offset)) {
>>          irq->status &= ~XICS_STATUS_SENT;
>> +    } else {
>> +        irq->status |= XICS_STATUS_REJECTED;
>>      }
>>  }
>>  
>> @@ -497,7 +505,7 @@ static void ics_simple_resend(ICSState *ics)
>>  
>>      for (i = 0; i < ics->nr_irqs; i++) {
>>          /* FIXME: filter by server#? */
>> -        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
>> +        if (ics_is_lsi(ics, i)) {
>>              ics_simple_resend_lsi(ics, i);
>>          } else {
>>              ics_simple_resend_msi(ics, i);
>> @@ -512,7 +520,7 @@ static void ics_simple_eoi(ICSState *ics, uint32_t nr)
>>  
>>      trace_xics_ics_simple_eoi(nr);
>>  
>> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI) {
>> +    if (ics_is_lsi(ics, srcno)) {
>>          irq->status &= ~XICS_STATUS_SENT;
>>      }
>>  }
>> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
>> index 3091ad3ac2c8..2f10637c9f7c 100644
>> --- a/hw/intc/xics_kvm.c
>> +++ b/hw/intc/xics_kvm.c
>> @@ -258,7 +258,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
>>              state |= KVM_XICS_MASKED;
>>          }
>>  
>> -        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
>> +        if (ics_is_lsi(ics, i)) {
>>              state |= KVM_XICS_LEVEL_SENSITIVE;
>>              if (irq->status & XICS_STATUS_ASSERTED) {
>>                  state |= KVM_XICS_PENDING;
>> @@ -293,7 +293,7 @@ static void ics_kvm_set_irq(void *opaque, int srcno, int val)
>>      int rc;
>>  
>>      args.irq = srcno + ics->offset;
>> -    if (ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MSI) {
>> +    if (!ics_is_lsi(ics, srcno)) {
>>          if (!val) {
>>              return;
>>          }
>> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
>> index 8288940ef9d7..958223376b4c 100644
>> --- a/hw/ppc/pnv.c
>> +++ b/hw/ppc/pnv.c
>> @@ -1035,6 +1035,21 @@ static bool pnv_irq_test(XICSFabric *xi, int irq)
>>      return false;
>>  }
>>  
>> +static bool pnv_irq_is_lsi(XICSFabric *xi, int irq)
>> +{
>> +    PnvMachineState *pnv = POWERNV_MACHINE(xi);
>> +    int i;
>> +
>> +    /* PowerNV machine only has PSI interrupts which are all LSIs */
>> +    for (i = 0; i < pnv->num_chips; i++) {
>> +        ICSState *ics = &pnv->chips[i]->psi.ics;
>> +        if (ics_valid_irq(ics, irq)) {
>> +            return true;
>> +        }
>> +    }
>> +    return false;
>> +}
>> +
>>  static void pnv_pic_print_info(InterruptStatsProvider *obj,
>>                                 Monitor *mon)
>>  {
>> @@ -1120,6 +1135,7 @@ static void powernv_machine_class_init(ObjectClass *oc, void *data)
>>      xic->ics_get = pnv_ics_get;
>>      xic->ics_resend = pnv_ics_resend;
>>      xic->irq_test = pnv_irq_test;
>> +    xic->irq_is_lsi = pnv_irq_is_lsi;
>>      ispc->print_info = pnv_pic_print_info;
>>  
>>      powernv_machine_class_props_init(oc);
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 1cbbd7715a85..ce314fcf38db 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -3628,6 +3628,14 @@ static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
>>      }
>>  }
>>  
>> +static bool spapr_irq_is_lsi(XICSFabric *xi, int irq)
>> +{
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>> +    int srcno = irq - spapr->ics->offset;
>> +
>> +    return spapr->ics->irqs[srcno].flags & XICS_FLAGS_IRQ_LSI;
>> +}
>> +
>>  static bool spapr_irq_test(XICSFabric *xi, int irq)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
>> @@ -3765,6 +3773,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>      xic->irq_test = spapr_irq_test;
>>      xic->irq_alloc_block = spapr_irq_alloc_block;
>>      xic->irq_free_block = spapr_irq_free_block;
>> +    xic->irq_is_lsi = spapr_irq_is_lsi;
>>  
>>      ispc->print_info = spapr_pic_print_info;
>>      /* Force NUMA node memory size to be a multiple of
>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>> index 30e7f2e0a7dd..478f8e510179 100644
>> --- a/include/hw/ppc/xics.h
>> +++ b/include/hw/ppc/xics.h
>> @@ -179,6 +179,7 @@ typedef struct XICSFabricClass {
>>      bool (*irq_test)(XICSFabric *xi, int irq);
>>      int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>>      void (*irq_free_block)(XICSFabric *xi, int irq, int num);
>> +    bool (*irq_is_lsi)(XICSFabric *xi, int irq);
>>  } XICSFabricClass;
>>  
>>  #define XICS_IRQS_SPAPR               1024
>> @@ -205,6 +206,7 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
>>  void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
>>  void icp_pic_print_info(ICPState *icp, Monitor *mon);
>>  void ics_pic_print_info(ICSState *ics, Monitor *mon);
>> +bool ics_is_lsi(ICSState *ics, int srno);
>>  
>>  void ics_resend(ICSState *ics);
>>  void icp_resend(ICPState *ss);
> 

* Re: [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-17  4:50     ` David Gibson
  2017-11-17  7:19       ` Cédric Le Goater
@ 2017-11-20 12:07       ` Greg Kurz
  2017-11-23 11:13         ` David Gibson
  1 sibling, 1 reply; 79+ messages in thread
From: Greg Kurz @ 2017-11-20 12:07 UTC (permalink / raw)
  To: David Gibson
  Cc: Cédric Le Goater, qemu-ppc, qemu-devel, Benjamin Herrenschmidt

On Fri, 17 Nov 2017 15:50:53 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Nov 14, 2017 at 10:42:24AM +0100, Greg Kurz wrote:
> > On Fri, 10 Nov 2017 15:20:11 +0000
> > Cédric Le Goater <clg@kaod.org> wrote:
> >   
> > > Let's define a new set of XICSFabric IRQ operations for the latest
> > > pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
> > > allocator.
> > > 
> > > The previous pseries machines keep the old set of IRQ operations using
> > > the ICSIRQState array.
> > > 
> > > Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > > ---
> > > 
> > >  Changes since v2 :
> > > 
> > >  - introduced a second set of XICSFabric IRQ operations for older
> > >    pseries machines
> > > 
> > >  hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
> > >  include/hw/ppc/spapr.h |  3 ++
> > >  2 files changed, 74 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 4bdceb45a14f..4ef0b73559ca 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
> > >      },
> > >  };
> > >  
> > > +static bool spapr_irq_map_needed(void *opaque)
> > > +{
> > > +    return true;  
> > 
> > I see that the next patch adds some code to avoid sending the
> > bitmap if it doesn't contain state, but I guess you should also
> > explicitly have this function to return false for older machine
> > types (see remark below).  
> 
> I don't see that you should need to migrate this at all.  The machine
> needs to reliably allocate the same interrupts each time, and that
> means source and dest should have the same allocations without
> migrating data.
> 

Is this true for MSIs? With the current code, the guest can change
the allocation of such interrupts with the ibm,change-msi RTAS
call. How can the dest know about that?

> >   
> > > +}
> > > +
> > > +static const VMStateDescription vmstate_spapr_irq_map = {
> > > +    .name = "spapr_irq_map",
> > > +    .version_id = 0,
> > > +    .minimum_version_id = 0,
> > > +    .needed = spapr_irq_map_needed,
> > > +    .fields = (VMStateField[]) {
> > > +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, nr_irqs),
> > > +        VMSTATE_END_OF_LIST()
> > > +    },
> > > +};
> > > +
> > >  static const VMStateDescription vmstate_spapr = {
> > >      .name = "spapr",
> > >      .version_id = 3,
> > > @@ -1700,6 +1716,7 @@ static const VMStateDescription vmstate_spapr = {
> > >          &vmstate_spapr_ov5_cas,
> > >          &vmstate_spapr_patb_entry,
> > >          &vmstate_spapr_pending_events,
> > > +        &vmstate_spapr_irq_map,
> > >          NULL
> > >      }
> > >  };
> > > @@ -2337,8 +2354,12 @@ static void ppc_spapr_init(MachineState *machine)
> > >      /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
> > >      load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
> > >  
> > > +    /* Initialize the IRQ allocator */
> > > +    spapr->nr_irqs  = XICS_IRQS_SPAPR;
> > > +    spapr->irq_map  = bitmap_new(spapr->nr_irqs);
> > > +  
> > 
> > I think you should introduce a sPAPRMachineClass::has_irq_bitmap boolean
> > so that the bitmap is only allocated for newer machine types. And you should
> > then use this flag in spapr_irq_map_needed() above.
> > 
> > Apart from that, the rest of the patch looks good.
> >   
> > >      /* Set up Interrupt Controller before we create the VCPUs */
> > > -    xics_system_init(machine, XICS_IRQS_SPAPR, &error_fatal);
> > > +    xics_system_init(machine, spapr->nr_irqs, &error_fatal);
> > >  
> > >      /* Set up containers for ibm,client-architecture-support negotiated options
> > >       */
> > > @@ -3560,7 +3581,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
> > >      return -1;
> > >  }
> > >  
> > > -static bool spapr_irq_test(XICSFabric *xi, int irq)
> > > +static bool spapr_irq_test_2_11(XICSFabric *xi, int irq)
> > >  {
> > >      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> > >      ICSState *ics = spapr->ics;
> > > @@ -3569,7 +3590,7 @@ static bool spapr_irq_test(XICSFabric *xi, int irq)
> > >      return !ICS_IRQ_FREE(ics, srcno);
> > >  }
> > >  
> > > -static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> > > +static int spapr_irq_alloc_block_2_11(XICSFabric *xi, int count, int align)
> > >  {
> > >      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> > >      ICSState *ics = spapr->ics;
> > > @@ -3583,7 +3604,7 @@ static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> > >      return srcno + ics->offset;
> > >  }
> > >  
> > > -static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> > > +static void spapr_irq_free_block_2_11(XICSFabric *xi, int irq, int num)
> > >  {
> > >      sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> > >      ICSState *ics = spapr->ics;
> > > @@ -3601,6 +3622,46 @@ static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> > >      }
> > >  }
> > >  
> > > +static bool spapr_irq_test(XICSFabric *xi, int irq)
> > > +{
> > > +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> > > +    int srcno = irq - spapr->ics->offset;
> > > +
> > > +    return test_bit(srcno, spapr->irq_map);
> > > +}
> > > +
> > > +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> > > +{
> > > +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> > > +    int start = 0;
> > > +    int srcno;
> > > +
> > > +    /*
> > > +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
> > > +     * should be one less than a power of 2; 0 means no
> > > +     * alignment. Adapt the 'align' value of the former allocator to
> > > +     * fit the requirements of bitmap_find_next_zero_area()
> > > +     */
> > > +    align -= 1;
> > > +
> > > +    srcno = bitmap_find_next_zero_area(spapr->irq_map, spapr->nr_irqs, start,
> > > +                                       count, align);
> > > +    if (srcno == spapr->nr_irqs) {
> > > +        return -1;
> > > +    }
> > > +
> > > +    bitmap_set(spapr->irq_map, srcno, count);
> > > +    return srcno + spapr->ics->offset;
> > > +}
> > > +
> > > +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> > > +{
> > > +    sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
> > > +    int srcno = irq - spapr->ics->offset;
> > > +
> > > +    bitmap_clear(spapr->irq_map, srcno, num);
> > > +}
> > > +
> > >  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> > >                                   Monitor *mon)
> > >  {
> > > @@ -3778,7 +3839,12 @@ static void spapr_machine_2_11_instance_options(MachineState *machine)
> > >  
> > >  static void spapr_machine_2_11_class_options(MachineClass *mc)
> > >  {
> > > -    /* Defaults for the latest behaviour inherited from the base class */
> > > +    XICSFabricClass *xic = XICS_FABRIC_CLASS(mc);
> > > +
> > > +    spapr_machine_2_12_class_options(mc);
> > > +    xic->irq_test = spapr_irq_test_2_11;
> > > +    xic->irq_alloc_block = spapr_irq_alloc_block_2_11;
> > > +    xic->irq_free_block = spapr_irq_free_block_2_11;
> > >  }
> > >  
> > >  DEFINE_SPAPR_MACHINE(2_11, "2.11", false);
> > > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > > index 9d21ca9bde3a..5835c694caff 100644
> > > --- a/include/hw/ppc/spapr.h
> > > +++ b/include/hw/ppc/spapr.h
> > > @@ -7,6 +7,7 @@
> > >  #include "hw/ppc/spapr_drc.h"
> > >  #include "hw/mem/pc-dimm.h"
> > >  #include "hw/ppc/spapr_ovec.h"
> > > +#include "qemu/bitmap.h"
> > >  
> > >  struct VIOsPAPRBus;
> > >  struct sPAPRPHBState;
> > > @@ -78,6 +79,8 @@ struct sPAPRMachineState {
> > >      struct VIOsPAPRBus *vio_bus;
> > >      QLIST_HEAD(, sPAPRPHBState) phbs;
> > >      struct sPAPRNVRAM *nvram;
> > > +    int32_t nr_irqs;
> > > +    unsigned long *irq_map;
> > >      ICSState *ics;
> > >      sPAPRRTCState rtc;
> > >    
> >   
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
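
The alignment handling in the bitmap allocator quoted above can be
sketched standalone. This is only an illustration under stated
assumptions: the demo_* names, the 64-entry map and the use of one bool
per IRQ (instead of QEMU's packed bitmap helpers) are hypothetical, and
the sketch returns the raw index without adding ics->offset. It also
asserts that 'align' is a power of 2, a check the quoted hunk does not
show.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_IRQS 64                 /* hypothetical size of the IRQ space */

static bool demo_irq_map[DEMO_NR_IRQS]; /* true means the IRQ is allocated */

/* Report whether a given index is currently allocated. */
static bool demo_irq_test(int irq)
{
    return demo_irq_map[irq];
}

/*
 * Find 'count' consecutive free entries whose first index is a multiple
 * of 'align' (a power of 2), mark them allocated and return the first
 * index. Returns -1 when no such block exists.
 */
static int demo_irq_alloc_block(int count, int align)
{
    int align_mask;
    int start = 0;

    assert(count > 0 && align > 0 && (align & (align - 1)) == 0);
    align_mask = align - 1;   /* same adaptation as 'align -= 1' in the patch */

    while (start + count <= DEMO_NR_IRQS) {
        int i;

        /* Round the candidate start up to the next aligned index. */
        start = (start + align_mask) & ~align_mask;
        if (start + count > DEMO_NR_IRQS) {
            break;
        }
        for (i = 0; i < count; i++) {
            if (demo_irq_map[start + i]) {
                break;
            }
        }
        if (i == count) {
            for (i = 0; i < count; i++) {
                demo_irq_map[start + i] = true;
            }
            return start;
        }
        /* Skip past the entry that was found busy and retry. */
        start += i + 1;
    }
    return -1;
}

/* Release 'num' entries starting at 'irq'. */
static void demo_irq_free_block(int irq, int num)
{
    int i;

    for (i = 0; i < num; i++) {
        demo_irq_map[irq + i] = false;
    }
}
```

With an empty map, a request for one IRQ followed by a request for four
IRQs aligned on 4 yields indexes 0 and 4; indexes 1 to 3 stay free for
later unaligned requests.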

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type)
  2017-11-13  7:14   ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Thomas Huth
  2017-11-13  9:53     ` Peter Maydell
@ 2017-11-23 10:03     ` Cornelia Huck
  2017-11-23 10:17       ` Peter Maydell
  1 sibling, 1 reply; 79+ messages in thread
From: Cornelia Huck @ 2017-11-23 10:03 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Cédric Le Goater, qemu-devel, David Gibson, Greg Kurz,
	Peter Maydell

On Mon, 13 Nov 2017 08:14:28 +0100
Thomas Huth <thuth@redhat.com> wrote:

> By the way, before everybody now introduces "2.12" machine types ... is
> there already a consensus that the next version will be "2.12" ?
> 
> A couple of months ago, we discussed that we could maybe do a 3.0 after
> 2.11, e.g. here:
> 
>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
> 
> I'd still like to see that happen... Peter, any thoughts on this?

So, as I just thought about preparing the new machine for s390x as
well: Did we reach any consensus about what the next qemu version will
be called?

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type)
  2017-11-23 10:03     ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Cornelia Huck
@ 2017-11-23 10:17       ` Peter Maydell
  2017-11-23 10:57         ` [Qemu-devel] QEMU 3.0 ? Thomas Huth
  2017-11-23 11:14         ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Daniel P. Berrange
  0 siblings, 2 replies; 79+ messages in thread
From: Peter Maydell @ 2017-11-23 10:17 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Thomas Huth, Cédric Le Goater, QEMU Developers,
	David Gibson, Greg Kurz

On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:
> On Mon, 13 Nov 2017 08:14:28 +0100
> Thomas Huth <thuth@redhat.com> wrote:
>
>> By the way, before everybody now introduces "2.12" machine types ... is
>> there already a consensus that the next version will be "2.12" ?
>>
>> A couple of months ago, we discussed that we could maybe do a 3.0 after
>> 2.11, e.g. here:
>>
>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
>>
>> I'd still like to see that happen... Peter, any thoughts on this?
>
> So, as I just thought about preparing the new machine for s390x as
> well: Did we reach any consensus about what the next qemu version will
> be called?

I haven't seen any sufficiently solid plan to make me want to
pick anything except "2.12".

thanks
-- PMM

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 10:17       ` Peter Maydell
@ 2017-11-23 10:57         ` Thomas Huth
  2017-11-23 11:11           ` Daniel P. Berrange
  2017-11-23 11:17           ` Paolo Bonzini
  2017-11-23 11:14         ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Daniel P. Berrange
  1 sibling, 2 replies; 79+ messages in thread
From: Thomas Huth @ 2017-11-23 10:57 UTC (permalink / raw)
  To: Peter Maydell, Cornelia Huck
  Cc: Cédric Le Goater, QEMU Developers, David Gibson, Greg Kurz,
	Markus Armbruster, Paolo Bonzini, Daniel P. Berrange, Eric Blake,
	Philippe Mathieu-Daudé

On 23.11.2017 11:17, Peter Maydell wrote:
> On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:
>> On Mon, 13 Nov 2017 08:14:28 +0100
>> Thomas Huth <thuth@redhat.com> wrote:
>>
>>> By the way, before everybody now introduces "2.12" machine types ... is
>>> there already a consensus that the next version will be "2.12" ?
>>>
>>> A couple of months ago, we discussed that we could maybe do a 3.0 after
>>> 2.11, e.g. here:
>>>
>>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
>>>
>>> I'd still like to see that happen... Peter, any thoughts on this?
>>
>> So, as I just thought about preparing the new machine for s390x as
>> well: Did we reach any consensus about what the next qemu version will
>> be called?
> 
> I haven't seen any sufficiently solid plan to make me want to
> pick anything except "2.12".

I still don't think that we need a big plan for this... The change from
1.7 to 2.0 was also rather arbitrary, wasn't it?

Anyway, I've now started a Wiki page to collect ideas:

 https://wiki.qemu.org/Features/Version3.0

Maybe we can jump to version 3.0 if there are enough doable items on the
list that we can all agree upon.

I've put "--accel kvm:hax:tcg" also on the doable list since I don't
remember any objections to that idea so far -- feel free to move it to
the controversial list instead if you think it needs more discussion.

 Thomas

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator
  2017-11-17  7:16     ` Cédric Le Goater
@ 2017-11-23 11:07       ` David Gibson
  2017-11-23 13:22         ` Cédric Le Goater
  0 siblings, 1 reply; 79+ messages in thread
From: David Gibson @ 2017-11-23 11:07 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 4633 bytes --]

On Fri, Nov 17, 2017 at 08:16:47AM +0100, Cédric Le Goater wrote:
> On 11/17/2017 05:48 AM, David Gibson wrote:
> > On Fri, Nov 10, 2017 at 03:20:09PM +0000, Cédric Le Goater wrote:
> >> Currently, the ICSState 'ics' object of the sPAPR machine acts as the
> >> global interrupt source handler and also as the IRQ number allocator
> >> for the machine. Some IRQ numbers are allocated very early in the
> >> machine initialization sequence to populate the device tree, and this
> >> is a problem to introduce the new POWER XIVE interrupt model, as it
> >> needs to share the IRQ numbers with the older model.
> >>
> >> To prepare ground for XIVE, here is a set of new XICSFabric operations
> >> to let the machine handle directly the IRQ number allocation and to
> >> decorrelate the allocation from the interrupt source object :
> >>
> >>     bool (*irq_test)(XICSFabric *xi, int irq);
> >>     int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
> >>     void (*irq_free_block)(XICSFabric *xi, int irq, int num);
> >>
> >> In these prototypes, the 'irq' parameter refers to a number in the
> >> global IRQ number space. Indexes for arrays storing different state
> >> informations on the interrupts, like the ICSIRQState, are usually
> >> named 'srcno'.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > 
> > This doesn't seem sensible to me.  When I said you should move irq
> > allocation to the machine, I mean actually move the code.  The only
> > user of irq allocation should be in the machine, so we shouldn't need
> > to indirect via the XICSFabric interface to do that.
> 
> OK. so we can probably do the same with machine class handlers because 
> we do need an indirection to handle the way older pseries machines 
> allocate IRQs that will change with newer machines  supporting XIVE.

Right.  You could do it either with a MachineClass callback (similar
to the phb placement one), or with just a flag in the MachineClass
that's checked explicitly.  I'd be ok with either approach.
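
A minimal sketch of those two options, for illustration only. All
names here (DemoMachineClass, demo_alloc_legacy, demo_alloc_new and so
on) are hypothetical and are not QEMU's real types; the two allocators
are reduced to simple counters so that only the dispatch differs.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct DemoMachine DemoMachine;

typedef struct DemoMachineClass {
    /* Option 1: a per-machine-version callback. */
    int (*irq_alloc)(DemoMachine *m, int count);
    /* Option 2: a flag that common code checks explicitly. */
    bool legacy_irq_allocation;
} DemoMachineClass;

struct DemoMachine {
    const DemoMachineClass *klass;
    int next_legacy;   /* stand-in for the older allocator's state */
    int next_new;      /* stand-in for the newer allocator's state */
};

/* Stand-in for the allocation scheme of older machine types. */
static int demo_alloc_legacy(DemoMachine *m, int count)
{
    int irq = m->next_legacy;

    m->next_legacy += count;
    return irq;
}

/* Stand-in for the allocation scheme of the newest machine type. */
static int demo_alloc_new(DemoMachine *m, int count)
{
    int irq = m->next_new;

    m->next_new += count;
    return irq;
}

/* Option 1: indirect through the class callback. */
static int demo_irq_alloc_via_callback(DemoMachine *m, int count)
{
    return m->klass->irq_alloc(m, count);
}

/* Option 2: branch on the class flag in common code. */
static int demo_irq_alloc_via_flag(DemoMachine *m, int count)
{
    if (m->klass->legacy_irq_allocation) {
        return demo_alloc_legacy(m, count);
    }
    return demo_alloc_new(m, count);
}

static const DemoMachineClass demo_class_2_11 = {
    .irq_alloc = demo_alloc_legacy,
    .legacy_irq_allocation = true,
};

static const DemoMachineClass demo_class_2_12 = {
    .irq_alloc = demo_alloc_new,
    .legacy_irq_allocation = false,
};
```

Either way the callers stay inside the machine code, so nothing needs
to go through the XICSFabric interface.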

> > And, we shouldn't be using XICSFabric things for XIVE.
> 
> ok. The spapr machine should be enough. 
> 
> Thanks,
> 
> C.
>  
> >> ---
> >>  hw/ppc/spapr.c        | 19 +++++++++++++++++++
> >>  include/hw/ppc/xics.h |  4 ++++
> >>  2 files changed, 23 insertions(+)
> >>
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index a2dcbee07214..84d68f2fdbae 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -3536,6 +3536,21 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
> >>      return cpu ? ICP(cpu->intc) : NULL;
> >>  }
> >>  
> >> +static bool spapr_irq_test(XICSFabric *xi, int irq)
> >> +{
> >> +    return false;
> >> +}
> >> +
> >> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
> >> +{
> >> +    return -1;
> >> +}
> >> +
> >> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
> >> +{
> >> +    ;
> >> +}
> >> +
> >>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
> >>                                   Monitor *mon)
> >>  {
> >> @@ -3630,6 +3645,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >>      xic->ics_get = spapr_ics_get;
> >>      xic->ics_resend = spapr_ics_resend;
> >>      xic->icp_get = spapr_icp_get;
> >> +    xic->irq_test = spapr_irq_test;
> >> +    xic->irq_alloc_block = spapr_irq_alloc_block;
> >> +    xic->irq_free_block = spapr_irq_free_block;
> >> +
> >>      ispc->print_info = spapr_pic_print_info;
> >>      /* Force NUMA node memory size to be a multiple of
> >>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
> >> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> >> index 28d248abad61..30e7f2e0a7dd 100644
> >> --- a/include/hw/ppc/xics.h
> >> +++ b/include/hw/ppc/xics.h
> >> @@ -175,6 +175,10 @@ typedef struct XICSFabricClass {
> >>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
> >>      void (*ics_resend)(XICSFabric *xi);
> >>      ICPState *(*icp_get)(XICSFabric *xi, int server);
> >> +    /* IRQ allocator helpers */
> >> +    bool (*irq_test)(XICSFabric *xi, int irq);
> >> +    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
> >> +    void (*irq_free_block)(XICSFabric *xi, int irq, int num);
> >>  } XICSFabricClass;
> >>  
> >>  #define XICS_IRQS_SPAPR               1024
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-17  7:19       ` Cédric Le Goater
@ 2017-11-23 11:08         ` David Gibson
  0 siblings, 0 replies; 79+ messages in thread
From: David Gibson @ 2017-11-23 11:08 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Greg Kurz, qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 2575 bytes --]

On Fri, Nov 17, 2017 at 08:19:23AM +0100, Cédric Le Goater wrote:
> On 11/17/2017 05:50 AM, David Gibson wrote:
> > On Tue, Nov 14, 2017 at 10:42:24AM +0100, Greg Kurz wrote:
> >> On Fri, 10 Nov 2017 15:20:11 +0000
> >> Cédric Le Goater <clg@kaod.org> wrote:
> >>
> >>> Let's define a new set of XICSFabric IRQ operations for the latest
> >>> pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
> >>> allocator.
> >>>
> >>> The previous pseries machines keep the old set of IRQ operations using
> >>> the ICSIRQState array.
> >>>
> >>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >>> ---
> >>>
> >>>  Changes since v2 :
> >>>
> >>>  - introduced a second set of XICSFabric IRQ operations for older
> >>>    pseries machines
> >>>
> >>>  hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
> >>>  include/hw/ppc/spapr.h |  3 ++
> >>>  2 files changed, 74 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>> index 4bdceb45a14f..4ef0b73559ca 100644
> >>> --- a/hw/ppc/spapr.c
> >>> +++ b/hw/ppc/spapr.c
> >>> @@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
> >>>      },
> >>>  };
> >>>  
> >>> +static bool spapr_irq_map_needed(void *opaque)
> >>> +{
> >>> +    return true;
> >>
> >> I see that the next patch adds some code to avoid sending the
> >> bitmap if it doesn't contain state, but I guess you should also
> >> explicitly have this function to return false for older machine
> >> types (see remark below).
> > 
> > I don't see that you should need to migrate this at all.  The machine
> > needs to reliably allocate the same interrupts each time, and that
> > means source and dest should have the same allocations without
> > migrating data.
> 
> ok. so we need to make sure that hot plugging devices or CPUs does
> not break that scheme. This is not the case today if you don't follow
> the exact same order on the monitor.

Ok, that's already broken then :/.

AFAIK plugging CPUs shouldn't matter though.  Plugging devices might -
which is exactly why the approach of using an irq "allocator" as such
isn't a very good one.  We might have to have something for MSI-X,
which will then require migration of some data, but we shouldn't need
it for the intx and other fixed interrupts.
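
To illustrate what "fixed" could mean for the intx case: a sketch that
derives the IRQ number purely from the PHB index and the pin, so source
and destination compute the same value without any allocator state.
The base value, the layout and the demo_* names are assumptions made up
for this example, not something taken from the series.

```c
#include <assert.h>

#define DEMO_IRQ_PCI_LSI_BASE 0x1100  /* hypothetical start of the LSI range */
#define DEMO_PCI_NUM_PINS     4       /* INTA..INTD */

/*
 * Compute the IRQ number of a legacy PCI interrupt pin from static
 * configuration only: no lookup, no allocation, nothing to migrate.
 */
static int demo_phb_lsi_irq(int phb_index, int pin)
{
    assert(phb_index >= 0);
    assert(pin >= 0 && pin < DEMO_PCI_NUM_PINS);
    return DEMO_IRQ_PCI_LSI_BASE + phb_index * DEMO_PCI_NUM_PINS + pin;
}
```

The result does not depend on the order in which devices were plugged,
which is the property a run-time allocator cannot guarantee.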

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 10:57         ` [Qemu-devel] QEMU 3.0 ? Thomas Huth
@ 2017-11-23 11:11           ` Daniel P. Berrange
  2017-11-23 11:24             ` Thomas Huth
  2017-11-23 11:17           ` Paolo Bonzini
  1 sibling, 1 reply; 79+ messages in thread
From: Daniel P. Berrange @ 2017-11-23 11:11 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Peter Maydell, Cornelia Huck, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Paolo Bonzini, Eric Blake, Philippe Mathieu-Daudé

On Thu, Nov 23, 2017 at 11:57:34AM +0100, Thomas Huth wrote:
> On 23.11.2017 11:17, Peter Maydell wrote:
> > On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:
> >> On Mon, 13 Nov 2017 08:14:28 +0100
> >> Thomas Huth <thuth@redhat.com> wrote:
> >>
> >>> By the way, before everybody now introduces "2.12" machine types ... is
> >>> there already a consensus that the next version will be "2.12" ?
> >>>
> >>> A couple of months ago, we discussed that we could maybe do a 3.0 after
> >>> 2.11, e.g. here:
> >>>
> >>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
> >>>
> >>> I'd still like to see that happen... Peter, any thoughts on this?
> >>
> >> So, as I just thought about preparing the new machine for s390x as
> >> well: Did we reach any consensus about what the next qemu version will
> >> be called?
> > 
> > I haven't seen any sufficiently solid plan to make me want to
> > pick anything except "2.12".
> 
> I still don't think that we need a big plan for this... The change from
> 1.7 to 2.0 was also rather arbitrary, wasn't it?
>
> Anyway, I've now started a Wiki page to collect ideas:
> 
>  https://wiki.qemu.org/Features/Version3.0
> 
> Maybe we can jump to version 3.0 if there are enough doable items on the
> list that we can all agree upon.

From the mgmt app / libvirt POV, I'm against the idea of doing such an
API incompatible release associated with major versions. The whole point
of the documented deprecation timeframe, was that we can incrementally
remove legacy interfaces without having a big bang break the whole
world. It gives management apps a clear warning on what will go away,
and a consistent timeframe for how long they have to adapt before it
goes away.

Tying breakage to "major" releases gives a very inconsistent lead up
for mgmt apps - some features will live on the "to be removed" list
for years, while other features put on the 3.0 kill list may only have
less than a single release warning before being removed. What's worse
is that, for features which change their impl rather than being deleted,
apps have to adapt to everything at the same time. It

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation
  2017-11-17  7:23     ` Cédric Le Goater
@ 2017-11-23 11:12       ` David Gibson
  2017-11-23 13:26         ` Cédric Le Goater
  0 siblings, 1 reply; 79+ messages in thread
From: David Gibson @ 2017-11-23 11:12 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 1959 bytes --]

On Fri, Nov 17, 2017 at 08:23:00AM +0100, Cédric Le Goater wrote:
> On 11/17/2017 05:54 AM, David Gibson wrote:
> > On Fri, Nov 10, 2017 at 03:20:14PM +0000, Cédric Le Goater wrote:
> >> It will be used later on to distinguish the allocation of an LSI
> >> interrupt from an MSI and also to reduce the use of the ICSIRQState
> >> array of the ICSState object, which is on our way to introduce XIVE.
> >>
> >> The 'irq' parameter continues to refer to the global IRQ number space.
> >>
> >> On PowerNV, only the PSI controller interrupts are handled and they
> >> are all LSIs.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > 
> > !?! AFAICT this is a step backwards.  The users of ics_is_lsi() here
> > are in the xics code.  So they already have the right information
> > locally in the ICSState object.  Why on earth would you indirect
> > through the fabric.
> 
> because I am trying to get rid of the fact that the current ICS 
> allocator handles two things at the same time : IRQ allocation 
> and IRQ type. And that's a problem because the ICSState object 
> is just used everywhere in the code.

That's a good goal, but I don't think this is the right way there.
One of the reasons that conflation of allocation and typing is bad is
that the typing is essentially local information to the PIC itself,
whereas the allocation is a machine concern.

By using a XICSFabric hook you're essentially wiring the irq typing
through the machine, which again seems like a step in the wrong
direction.

> OK, so that is not to your liking; I will come up with some 
> other solution. Thanks for looking.

Sorry if I was a bit harsh; quite a few unrelated things have been
frustrating me, which has made me grumpy.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap
  2017-11-20 12:07       ` Greg Kurz
@ 2017-11-23 11:13         ` David Gibson
  0 siblings, 0 replies; 79+ messages in thread
From: David Gibson @ 2017-11-23 11:13 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Cédric Le Goater, qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 2459 bytes --]

On Mon, Nov 20, 2017 at 01:07:42PM +0100, Greg Kurz wrote:
> On Fri, 17 Nov 2017 15:50:53 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Tue, Nov 14, 2017 at 10:42:24AM +0100, Greg Kurz wrote:
> > > On Fri, 10 Nov 2017 15:20:11 +0000
> > > Cédric Le Goater <clg@kaod.org> wrote:
> > >   
> > > > Let's define a new set of XICSFabric IRQ operations for the latest
> > > > pseries machine. These simply use a bitmap 'irq_map' as an IRQ number
> > > > allocator.
> > > > 
> > > > The previous pseries machines keep the old set of IRQ operations using
> > > > the ICSIRQState array.
> > > > 
> > > > Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > > > ---
> > > > 
> > > >  Changes since v2 :
> > > > 
> > > >  - introduced a second set of XICSFabric IRQ operations for older
> > > >    pseries machines
> > > > 
> > > >  hw/ppc/spapr.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++++----
> > > >  include/hw/ppc/spapr.h |  3 ++
> > > >  2 files changed, 74 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > index 4bdceb45a14f..4ef0b73559ca 100644
> > > > --- a/hw/ppc/spapr.c
> > > > +++ b/hw/ppc/spapr.c
> > > > @@ -1681,6 +1681,22 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
> > > >      },
> > > >  };
> > > >  
> > > > +static bool spapr_irq_map_needed(void *opaque)
> > > > +{
> > > > +    return true;  
> > > 
> > > I see that the next patch adds some code to avoid sending the
> > > bitmap if it doesn't contain state, but I guess you should also
> > > explicitly have this function to return false for older machine
> > > types (see remark below).  
> > 
> > I don't see that you should need to migrate this at all.  The machine
> > needs to reliably allocate the same interrupts each time, and that
> > means source and dest should have the same allocations without
> > migrating data.
> > 
> 
> Is this true for MSIs ? With the current code, the guest can change
> the allocation of such interrupts with the ibm,rtas-change-msi RTAS
> call. How can the dest know about that ?

Yeah, true.  The MSIs really are dynamically allocated by the guest
and so will need to have information migrated.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type)
  2017-11-23 10:17       ` Peter Maydell
  2017-11-23 10:57         ` [Qemu-devel] QEMU 3.0 ? Thomas Huth
@ 2017-11-23 11:14         ` Daniel P. Berrange
  2017-11-23 11:26           ` [Qemu-devel] QEMU 3.0 ? Thomas Huth
  1 sibling, 1 reply; 79+ messages in thread
From: Daniel P. Berrange @ 2017-11-23 11:14 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Cornelia Huck, Greg Kurz, Thomas Huth, David Gibson,
	Cédric Le Goater, QEMU Developers

On Thu, Nov 23, 2017 at 10:17:48AM +0000, Peter Maydell wrote:
> On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:
> > On Mon, 13 Nov 2017 08:14:28 +0100
> > Thomas Huth <thuth@redhat.com> wrote:
> >
> >> By the way, before everybody now introduces "2.12" machine types ... is
> >> there already a consensus that the next version will be "2.12" ?
> >>
> >> A couple of months ago, we discussed that we could maybe do a 3.0 after
> >> 2.11, e.g. here:
> >>
> >>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
> >>
> >> I'd still like to see that happen... Peter, any thoughts on this?
> >
> > So, as I just thought about preparing the new machine for s390x as
> > well: Did we reach any consensus about what the next qemu version will
> > be called?
> 
> I haven't seen any sufficiently solid plan to make me want to
> pick anything except "2.12".

I would suggest we just make major version number changes explicitly an
arbitrary choice, i.e. just bump the major version for the first release
of each year. Or just always bump it when we get to x.5 or x.9.

We have a planned deprecation process for making incompatible changes
at an arbitrary time, with no need to batch it up in a "big bang" break
the whole world release. This kind of incremental change is much preferred
from libvirt POV, as adapting to a huge pile of changes at the same time
is a much bigger burden to deal with.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 10:57         ` [Qemu-devel] QEMU 3.0 ? Thomas Huth
  2017-11-23 11:11           ` Daniel P. Berrange
@ 2017-11-23 11:17           ` Paolo Bonzini
  2017-11-23 11:57             ` Thomas Huth
  2017-11-23 14:57             ` Igor Mammedov
  1 sibling, 2 replies; 79+ messages in thread
From: Paolo Bonzini @ 2017-11-23 11:17 UTC (permalink / raw)
  To: Thomas Huth, Peter Maydell, Cornelia Huck
  Cc: Cédric Le Goater, QEMU Developers, David Gibson, Greg Kurz,
	Markus Armbruster, Daniel P. Berrange, Eric Blake,
	Philippe Mathieu-Daudé

On 23/11/2017 11:57, Thomas Huth wrote:
> On 23.11.2017 11:17, Peter Maydell wrote:
>> On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:
>>> On Mon, 13 Nov 2017 08:14:28 +0100
>>> Thomas Huth <thuth@redhat.com> wrote:
>>>
>>>> By the way, before everybody now introduces "2.12" machine types ... is
>>>> there already a consensus that the next version will be "2.12" ?
>>>>
>>>> A couple of months ago, we discussed that we could maybe do a 3.0 after
>>>> 2.11, e.g. here:
>>>>
>>>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
>>>>
>>>> I'd still like to see that happen... Peter, any thoughts on this?
>>>
>>> So, as I just thought about preparing the new machine for s390x as
>>> well: Did we reach any consensus about what the next qemu version will
>>> be called?
>>
>> I haven't seen any sufficiently solid plan to make me want to
>> pick anything except "2.12".
> 
> I still don't think that we need a big plan for this... The change from
> 1.7 to 2.0 was also rather arbitrary, wasn't it?
> 
> Anyway, I've now started a Wiki page to collect ideas:
> 
>  https://wiki.qemu.org/Features/Version3.0
> 
> Maybe we can jump to version 3.0 if there are enough doable items on the
> list that we can all agree upon.
> 
> I've put "--accel kvm:hax:tcg" also on the doable list since I don't
> remember any objections to that idea so far -- feel free to move it to
> the controversial list instead if you think it needs more discussion.

"hax" is very far from feature parity with TCG, it doesn't even support
CPUID (-cpu).  "-accel kvm:hvf:tcg" could be a possibility, but only if
we have resources to test it.  As far as I know the only active x86
developer who owns a Mac is Igor?

Paolo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 11:11           ` Daniel P. Berrange
@ 2017-11-23 11:24             ` Thomas Huth
  2017-11-23 11:33               ` Daniel P. Berrange
  0 siblings, 1 reply; 79+ messages in thread
From: Thomas Huth @ 2017-11-23 11:24 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Peter Maydell, Philippe Mathieu-Daudé,
	Markus Armbruster, Cornelia Huck, Greg Kurz, QEMU Developers,
	Cédric Le Goater, Paolo Bonzini, David Gibson

On 23.11.2017 12:11, Daniel P. Berrange wrote:
> On Thu, Nov 23, 2017 at 11:57:34AM +0100, Thomas Huth wrote:
>> On 23.11.2017 11:17, Peter Maydell wrote:
>>> On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:
>>>> On Mon, 13 Nov 2017 08:14:28 +0100
>>>> Thomas Huth <thuth@redhat.com> wrote:
>>>>
>>>>> By the way, before everybody now introduces "2.12" machine types ... is
>>>>> there already a consensus that the next version will be "2.12" ?
>>>>>
>>>>> A couple of months ago, we discussed that we could maybe do a 3.0 after
>>>>> 2.11, e.g. here:
>>>>>
>>>>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
>>>>>
>>>>> I'd still like to see that happen... Peter, any thoughts on this?
>>>>
>>>> So, as I just thought about preparing the new machine for s390x as
>>>> well: Did we reach any consensus about what the next qemu version will
>>>> be called?
>>>
>>> I haven't seen any sufficiently solid plan to make me want to
>>> pick anything except "2.12".
>>
>> I still don't think that we need a big plan for this... The change from
>> 1.7 to 2.0 was also rather arbitrary, wasn't it?
>>
>> Anyway, I've now started a Wiki page to collect ideas:
>>
>>  https://wiki.qemu.org/Features/Version3.0
>>
>> Maybe we can jump to version 3.0 if there are enough doable items on the
>> list that we can all agree upon.
> 
> From the mgmt app / libvirt POV, I'm against the idea of doing such an
> API incompatible release associated with major versions. The whole point
> of the documented deprecation timeframe, was that we can incrementally
> remove legacy interfaces without having a big bang break the whole
> world.

Yes, I agree ... that's why I tried to split the list into a "doable"
part (which hopefully does not mean any breakage for the upper stack),
and a "controversial" part (which we could use for collecting ideas, but
it is likely not feasible to do it any time soon). Sorry for not stating
this more clearly.

 Thomas

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 11:14         ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Daniel P. Berrange
@ 2017-11-23 11:26           ` Thomas Huth
  0 siblings, 0 replies; 79+ messages in thread
From: Thomas Huth @ 2017-11-23 11:26 UTC (permalink / raw)
  To: Daniel P. Berrange, Peter Maydell
  Cc: Cornelia Huck, QEMU Developers, Greg Kurz, Cédric Le Goater,
	David Gibson

On 23.11.2017 12:14, Daniel P. Berrange wrote:
> On Thu, Nov 23, 2017 at 10:17:48AM +0000, Peter Maydell wrote:
>> On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:
>>> On Mon, 13 Nov 2017 08:14:28 +0100
>>> Thomas Huth <thuth@redhat.com> wrote:
>>>
>>>> By the way, before everybody now introduces "2.12" machine types ... is
>>>> there already a consensus that the next version will be "2.12" ?
>>>>
>>>> A couple of months ago, we discussed that we could maybe do a 3.0 after
>>>> 2.11, e.g. here:
>>>>
>>>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
>>>>
>>>> I'd still like to see that happen... Peter, any thoughts on this?
>>>
>>> So, as I just thought about preparing the new machine for s390x as
>>> well: Did we reach any consensus about what the next qemu version will
>>> be called?
>>
>> I haven't seen any sufficiently solid plan to make me want to
>> pick anything except "2.12".
> 
> I would suggest we just make major version number changes explicitly an
> arbitrary choice. ie just bump the major version for the first release
> of each year. Or just always bump it when we get to x.5 or x.9.

FWIW, I like the idea to bump it when we get to x.9. (Having mixed up
x.1 and x.10 for other projects in the past already, I don't like two
digit minor release version numbers very much).

 Thomas

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 11:24             ` Thomas Huth
@ 2017-11-23 11:33               ` Daniel P. Berrange
  2017-11-23 11:40                 ` Thomas Huth
  0 siblings, 1 reply; 79+ messages in thread
From: Daniel P. Berrange @ 2017-11-23 11:33 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Peter Maydell, Philippe Mathieu-Daudé,
	Markus Armbruster, Cornelia Huck, Greg Kurz, QEMU Developers,
	Cédric Le Goater, Paolo Bonzini, David Gibson

On Thu, Nov 23, 2017 at 12:24:24PM +0100, Thomas Huth wrote:
> On 23.11.2017 12:11, Daniel P. Berrange wrote:
> > On Thu, Nov 23, 2017 at 11:57:34AM +0100, Thomas Huth wrote:
> >> On 23.11.2017 11:17, Peter Maydell wrote:
> >>> On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:
> >>>> On Mon, 13 Nov 2017 08:14:28 +0100
> >>>> Thomas Huth <thuth@redhat.com> wrote:
> >>>>
> >>>>> By the way, before everybody now introduces "2.12" machine types ... is
> >>>>> there already a consensus that the next version will be "2.12" ?
> >>>>>
> >>>>> A couple of months ago, we discussed that we could maybe do a 3.0 after
> >>>>> 2.11, e.g. here:
> >>>>>
> >>>>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
> >>>>>
> >>>>> I'd still like to see that happen... Peter, any thoughts on this?
> >>>>
> >>>> So, as I just thought about preparing the new machine for s390x as
> >>>> well: Did we reach any consensus about what the next qemu version will
> >>>> be called?
> >>>
> >>> I haven't seen any sufficiently solid plan to make me want to
> >>> pick anything except "2.12".
> >>
> >> I still don't think that we need a big plan for this... The change from
> >> 1.7 to 2.0 was also rather arbitrary, wasn't it?
> >>
> >> Anyway, I've now started a Wiki page to collect ideas:
> >>
> >>  https://wiki.qemu.org/Features/Version3.0
> >>
> >> Maybe we can jump to version 3.0 if there are enough doable items on the
> >> list that we can all agree upon.
> > 
> > From the mgmt app / libvirt POV, I'm against the idea of doing such an
> > API incompatible release associated with major versions. The whole point
> > of the documented deprecation timeframe, was that we can incrementally
> > remove legacy interfaces without having a big bang break the whole
> > world.
> 
> Yes, I agree ... that's why I tried to split the list into a "doable"
> part (which hopefully does not mean any breakage for the upper stack),
> and a "controversial" part (which we could use for collecting ideas, but
> it is likely not feasible to do it any time soon). Sorry for not stating
> this more clearly.

Your "doable" list includes removing all deprecated features, which
basically just nullifies the deprecation process, which declared
that they would live for 2 releases with a warning and then be
deleted. That will break the upper stack.

For the --accel item though, if that's doable without breakage then
there's no point delaying that until a 3.0 release. Just do it as
soon as it's functionally ready for any release.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 11:33               ` Daniel P. Berrange
@ 2017-11-23 11:40                 ` Thomas Huth
  0 siblings, 0 replies; 79+ messages in thread
From: Thomas Huth @ 2017-11-23 11:40 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Peter Maydell, Philippe Mathieu-Daudé,
	Markus Armbruster, Cornelia Huck, Greg Kurz, QEMU Developers,
	Cédric Le Goater, Paolo Bonzini, David Gibson

On 23.11.2017 12:33, Daniel P. Berrange wrote:
> On Thu, Nov 23, 2017 at 12:24:24PM +0100, Thomas Huth wrote:
>> On 23.11.2017 12:11, Daniel P. Berrange wrote:
>>> On Thu, Nov 23, 2017 at 11:57:34AM +0100, Thomas Huth wrote:
>>>> On 23.11.2017 11:17, Peter Maydell wrote:
>>>>> On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:
>>>>>> On Mon, 13 Nov 2017 08:14:28 +0100
>>>>>> Thomas Huth <thuth@redhat.com> wrote:
>>>>>>
>>>>>>> By the way, before everybody now introduces "2.12" machine types ... is
>>>>>>> there already a consensus that the next version will be "2.12" ?
>>>>>>>
>>>>>>> A couple of months ago, we discussed that we could maybe do a 3.0 after
>>>>>>> 2.11, e.g. here:
>>>>>>>
>>>>>>>  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05056.html
>>>>>>>
>>>>>>> I'd still like to see that happen... Peter, any thoughts on this?
>>>>>>
>>>>>> So, as I just thought about preparing the new machine for s390x as
>>>>>> well: Did we reach any consensus about what the next qemu version will
>>>>>> be called?
>>>>>
>>>>> I haven't seen any sufficiently solid plan to make me want to
>>>>> pick anything except "2.12".
>>>>
>>>> I still don't think that we need a big plan for this... The change from
>>>> 1.7 to 2.0 was also rather arbitrary, wasn't it?
>>>>
>>>> Anyway, I've now started a Wiki page to collect ideas:
>>>>
>>>>  https://wiki.qemu.org/Features/Version3.0
>>>>
>>>> Maybe we can jump to version 3.0 if there are enough doable items on the
>>>> list that we can all agree upon.
>>>
>>> From the mgmt app / libvirt POV, I'm against the idea of doing such an
>>> API incompatible release associated with major versions. The whole point
>>> of the documented deprecation timeframe, was that we can incrementally
>>> remove legacy interfaces without having a big bang break the whole
>>> world.
>>
>> Yes, I agree ... that's why I tried to split the list into a "doable"
>> part (which hopefully does not mean any breakage for the upper stack),
>> and a "controversial" part (which we could use for collecting ideas, but
>> it is likely not feasible to do it any time soon). Sorry for not stating
>> this more clearly.
> 
> Your "doable" list includes removing all deprecated features, which
> basically just nullifies the deprecation process, which declared
> that they would live for 2 releases with a warning and then be
> deleted. That will break the upper stack.

Sorry again, that's not what I meant here. I meant to respect the 2
release cycle warning period - and used "accordingly" here to express
that. Looks like this was not very clear :-( I've rephrased the sentence
now, I hope it is now not ambiguous any more.

 Thomas

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 11:17           ` Paolo Bonzini
@ 2017-11-23 11:57             ` Thomas Huth
  2017-11-23 12:05               ` Paolo Bonzini
  2017-11-23 14:57             ` Igor Mammedov
  1 sibling, 1 reply; 79+ messages in thread
From: Thomas Huth @ 2017-11-23 11:57 UTC (permalink / raw)
  To: Paolo Bonzini, Peter Maydell, Cornelia Huck
  Cc: Cédric Le Goater, QEMU Developers, David Gibson, Greg Kurz,
	Markus Armbruster, Daniel P. Berrange, Eric Blake,
	Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On 23.11.2017 12:17, Paolo Bonzini wrote:
> On 23/11/2017 11:57, Thomas Huth wrote:
[...]
>> I've put "--accel kvm:hax:tcg" also on the doable list since I don't
>> remember any objections to that idea so far -- feel free to move it to
>> the controversial list instead if you think it needs more discussion.
> 
> "hax" is very far from feature parity with TCG, it doesn't even support
> CPUID (-cpu).  "-accel kvm:hvf:tcg" could be a possibility, but only if
> we have resources to test it.  As far as I know the only active x86
> developer who owns a Mac is Igor?

hvf hasn't been merged yet ... do you expect it to hit master after 2.11
has been released? Otherwise, we should maybe rather simply go with
"--accel kvm:tcg"?

 Thomas

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 11:57             ` Thomas Huth
@ 2017-11-23 12:05               ` Paolo Bonzini
  2017-11-23 12:09                 ` Cornelia Huck
  0 siblings, 1 reply; 79+ messages in thread
From: Paolo Bonzini @ 2017-11-23 12:05 UTC (permalink / raw)
  To: Thomas Huth, Peter Maydell, Cornelia Huck
  Cc: Cédric Le Goater, QEMU Developers, David Gibson, Greg Kurz,
	Markus Armbruster, Daniel P. Berrange, Eric Blake,
	Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On 23/11/2017 12:57, Thomas Huth wrote:
> On 23.11.2017 12:17, Paolo Bonzini wrote:
>> On 23/11/2017 11:57, Thomas Huth wrote:
> [...]
>>> I've put "--accel kvm:hax:tcg" also on the doable list since I don't
>>> remember any objections to that idea so far -- feel free to move it to
>>> the controversial list instead if you think it needs more discussion.
>>
>> "hax" is very far from feature parity with TCG, it doesn't even support
>> CPUID (-cpu).  "-accel kvm:hvf:tcg" could be a possibility, but only if
>> we have resources to test it.  As far as I know the only active x86
>> developer who owns a Mac is Igor?
> 
> hvf hasn't been merged yet ... do you expect it to hit master after 2.11
> has been released?

Yes, more or less.

> Otherwise, we should maybe rather simply go with
> "--accel kvm:tcg"?

Yes, HVF can come later.

Paolo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 12:05               ` Paolo Bonzini
@ 2017-11-23 12:09                 ` Cornelia Huck
  2017-11-23 12:26                   ` Paolo Bonzini
  0 siblings, 1 reply; 79+ messages in thread
From: Cornelia Huck @ 2017-11-23 12:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Daniel P. Berrange, Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On Thu, 23 Nov 2017 13:05:33 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 23/11/2017 12:57, Thomas Huth wrote:
> > On 23.11.2017 12:17, Paolo Bonzini wrote:  
> >> On 23/11/2017 11:57, Thomas Huth wrote:  
> > [...]  
> >>> I've put "--accel kvm:hax:tcg" also on the doable list since I don't
> >>> remember any objections to that idea so far -- feel free to move it to
> >>> the controversial list instead if you think it needs more discussion.  
> >>
> >> "hax" is very far from feature parity with TCG, it doesn't even support
> >> CPUID (-cpu).  "-accel kvm:hvf:tcg" could be a possibility, but only if
> >> we have resources to test it.  As far as I know the only active x86
> >> developer who owns a Mac is Igor?  
> > 
> > hvf hasn't been merged yet ... do you expect it to hit master after 2.11
> > has been released?  
> 
> Yes, more or less.
> 
> > Otherwise, we should maybe rather simply go with
> > "--accel kvm:tcg"?  
> 
> Yes, HVF can come later.

This switch sounds like something we can easily do for the next
release; I'd hope that anyone explicitly requiring tcg already
specifies it.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 12:09                 ` Cornelia Huck
@ 2017-11-23 12:26                   ` Paolo Bonzini
  2017-11-23 12:39                     ` Cornelia Huck
  0 siblings, 1 reply; 79+ messages in thread
From: Paolo Bonzini @ 2017-11-23 12:26 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Daniel P. Berrange, Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On 23/11/2017 13:09, Cornelia Huck wrote:
> On Thu, 23 Nov 2017 13:05:33 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> On 23/11/2017 12:57, Thomas Huth wrote:
>>> On 23.11.2017 12:17, Paolo Bonzini wrote:  
>>>> On 23/11/2017 11:57, Thomas Huth wrote:  
>>> [...]  
>>>>> I've put "--accel kvm:hax:tcg" also on the doable list since I don't
>>>>> remember any objections to that idea so far -- feel free to move it to
>>>>> the controversial list instead if you think it needs more discussion.  
>>>>
>>>> "hax" is very far from feature parity with TCG, it doesn't even support
>>>> CPUID (-cpu).  "-accel kvm:hvf:tcg" could be a possibility, but only if
>>>> we have resources to test it.  As far as I know the only active x86
>>>> developer who owns a Mac is Igor?  
>>>
>>> hvf hasn't been merged yet ... do you expect it to hit master after 2.11
>>> has been released?  
>>
>> Yes, more or less.
>>
>>> Otherwise, we should maybe rather simply go with
>>> "--accel kvm:tcg"?  
>>
>> Yes, HVF can come later.
> 
> This switch sounds like something we can easily do for the next
> release; I'd hope that anyone explicitly requiring tcg already
> specifies it.

I seriously doubt that.  Most people are probably using a
distro-provided qemu-kvm script for KVM, and qemu-system-x86_64 for TCG.
 In fact, that is probably the best of both worlds for anybody who
doesn't compile its own QEMU; and since KVM is Linux-only, there are
very few non-developers in the intersection of "compile its own QEMU"
and "use KVM".

And in fact that is the main reason why we have never bothered switching
the default... only RHEL does it, because it ships the QEMU binary as
qemu-kvm rather than qemu-system-xxx plus a wrapper script.

Perhaps we could:

1) look for "qemu-{kvm,hvf,hax}" in argv[0] and change the "-accel" default?

2) change "make install" to install one or more of qemu-kvm/hvf/hax
based on target architecture and OS.

Then distros can do away with the script and Windows/Mac users can learn
to use qemu-hvf and qemu-hax.
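
To make idea 1) concrete, here is a minimal sketch of selecting the
"-accel" default from the program name. It is only an illustration: the
function name default_accel_for() is hypothetical and not existing QEMU
code, it matches the basename exactly, and it ignores Windows details
such as backslash separators or an ".exe" suffix.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not part of QEMU: map the name the binary was
 * invoked as (argv[0]) to a default accelerator. Any other name keeps
 * today's behaviour and falls back to TCG.
 */
const char *default_accel_for(const char *argv0)
{
    const char *base;

    assert(argv0 != NULL);

    /* Strip a leading directory part, if any. */
    base = strrchr(argv0, '/');
    base = base ? base + 1 : argv0;

    if (strcmp(base, "qemu-kvm") == 0) {
        return "kvm";
    }
    if (strcmp(base, "qemu-hvf") == 0) {
        return "hvf";
    }
    if (strcmp(base, "qemu-hax") == 0) {
        return "hax";
    }
    return "tcg";
}
```

An explicit "-accel" on the command line would still override whatever
this returns; the sketch only covers the default.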

Paolo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 12:26                   ` Paolo Bonzini
@ 2017-11-23 12:39                     ` Cornelia Huck
  2017-11-23 12:59                       ` Daniel P. Berrange
  2017-11-23 13:02                       ` Paolo Bonzini
  0 siblings, 2 replies; 79+ messages in thread
From: Cornelia Huck @ 2017-11-23 12:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Daniel P. Berrange, Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On Thu, 23 Nov 2017 13:26:14 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 23/11/2017 13:09, Cornelia Huck wrote:
> > On Thu, 23 Nov 2017 13:05:33 +0100
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> >   
> >> On 23/11/2017 12:57, Thomas Huth wrote:  
> >>> On 23.11.2017 12:17, Paolo Bonzini wrote:    
> >>>> On 23/11/2017 11:57, Thomas Huth wrote:    
> >>> [...]    
> >>>>> I've put "--accel kvm:hax:tcg" also on the doable list since I don't
> >>>>> remember any objections to that idea so far -- feel free to move it to
> >>>>> the controversial list instead if you think it needs more discussion.    
> >>>>
> >>>> "hax" is very far from feature parity with TCG, it doesn't even support
> >>>> CPUID (-cpu).  "-accel kvm:hvf:tcg" could be a possibility, but only if
> >>>> we have resources to test it.  As far as I know the only active x86
> >>>> developer who owns a Mac is Igor?    
> >>>
> >>> hvf hasn't been merged yet ... do you expect it to hit master after 2.11
> >>> has been released?    
> >>
> >> Yes, more or less.
> >>  
> >>> Otherwise, we should maybe rather simply go with
> >>> "--accel kvm:tcg"?    
> >>
> >> Yes, HVF can come later.  
> > 
> > This switch sounds like something we can easily do for the next
> > release; I'd hope that anyone explicitly requiring tcg already
> > specifies it.  
> 
> I seriously doubt that.  Most people are probably using a
> distro-provided qemu-kvm script for KVM, and qemu-system-x86_64 for TCG.
>  In fact, that is probably the best of both worlds for anybody who
> doesn't compile its own QEMU; and since KVM is Linux-only, there are
> very few non-developers in the intersection of "compile its own QEMU"
> and "use KVM".

I'm wondering how many people want to run e.g. x86_64-on-x86_64
_without_ using an available kvm (and I expect those people to
explicitly specify tcg).

> 
> And in fact that is the main reason why we have never bothered switching
> the default... only RHEL does it, because it ships the QEMU binary as
> qemu-kvm rather than qemu-system-xxx plus a wrapper script.
> 
> Perhaps we could:
> 
> 1) look for "qemu-{kvm,hvf,hax}" in argv[0] and change the "-accel" default?
> 
> 2) change "make install" to install one or more of qemu-kvm/hvf/hax
> based on target architecture and OS.
> 
> Then distros can do away with the script and Windows/Mac users can learn
> to use qemu-hvf and qemu-hax.

I'm not sure I like that. For me, qemu-kvm comes with the connotation
of "there used to be a fork of qemu for kvm usage, and we stuck with
the name because it is likely scattered through scripts".

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 12:39                     ` Cornelia Huck
@ 2017-11-23 12:59                       ` Daniel P. Berrange
  2017-11-23 13:08                         ` Paolo Bonzini
  2017-11-23 13:02                       ` Paolo Bonzini
  1 sibling, 1 reply; 79+ messages in thread
From: Daniel P. Berrange @ 2017-11-23 12:59 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Paolo Bonzini, Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On Thu, Nov 23, 2017 at 01:39:24PM +0100, Cornelia Huck wrote:
> On Thu, 23 Nov 2017 13:26:14 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > On 23/11/2017 13:09, Cornelia Huck wrote:
> > > On Thu, 23 Nov 2017 13:05:33 +0100
> > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > >   
> > >> On 23/11/2017 12:57, Thomas Huth wrote:  
> > >>> On 23.11.2017 12:17, Paolo Bonzini wrote:    
> > >>>> On 23/11/2017 11:57, Thomas Huth wrote:    
> > >>> [...]    
> > >>>>> I've put "--accel kvm:hax:tcg" also on the doable list since I don't
> > >>>>> remember any objections to that idea so far -- feel free to move it to
> > >>>>> the controversial list instead if you think it needs more discussion.    
> > >>>>
> > >>>> "hax" is very far from feature parity with TCG, it doesn't even support
> > >>>> CPUID (-cpu).  "-accel kvm:hvf:tcg" could be a possibility, but only if
> > >>>> we have resources to test it.  As far as I know the only active x86
> > >>>> developer who owns a Mac is Igor?    
> > >>>
> > >>> hvf hasn't been merged yet ... do you expect it to hit master after 2.11
> > >>> has been released?    
> > >>
> > >> Yes, more or less.
> > >>  
> > >>> Otherwise, we should maybe rather simply go with
> > >>> "--accel kvm:tcg"?    
> > >>
> > >> Yes, HVF can come later.  
> > > 
> > > This switch sounds like something we can easily do for the next
> > > release; I'd hope that anyone explicitly requiring tcg already
> > > specifies it.  
> > 
> > I seriously doubt that.  Most people are probably using a
> > distro-provided qemu-kvm script for KVM, and qemu-system-x86_64 for TCG.
> >  In fact, that is probably the best of both worlds for anybody who
> > doesn't compile its own QEMU; and since KVM is Linux-only, there are
> > very few non-developers in the intersection of "compile its own QEMU"
> > and "use KVM".
> 
> I'm wondering how many people want to run e.g. x86_64-on-x86_64
> _without_ using an available kvm (and I expect those people to
> explicitly specify tcg).

Libvirt at least always explicitly gives an --accel option, so the
question is only for people who directly run qemu.


> > And in fact that is the main reason why we have never bothered switching
> > the default... only RHEL does it, because it ships the QEMU binary as
> > qemu-kvm rather than qemu-system-xxx plus a wrapper script.
> > 
> > Perhaps we could:
> > 
> > 1) look for "qemu-{kvm,hvf,hax}" in argv[0] and change the "-accel" default?
> > 
> > 2) change "make install" to install one or more of qemu-kvm/hvf/hax
> > based on target architecture and OS.
> > 
> > Then distros can do away with the script and Windows/Mac users can learn
> > to use qemu-hvf and qemu-hax.
> 
> I'm not sure I like that. For me, qemu-kvm comes with the connotation
> of "there used to be a fork of qemu for kvm usage, and we stuck with
> the name because it is likely scattered through scripts".

Yes, qemu-kvm is a historical artifact in Fedora solely because of the
previous fork. We can't easily get rid of it because the path is encoded
in libvirt XML files and countless end user scripts/configs.

In RHEL it exists for similar historical reasons, but we explicitly want
to keep it, since RHEL only ships KVM, this allows people to optionally
install a full QEMU for TCG usage in parallel.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 12:39                     ` Cornelia Huck
  2017-11-23 12:59                       ` Daniel P. Berrange
@ 2017-11-23 13:02                       ` Paolo Bonzini
  2017-11-23 13:13                         ` Cornelia Huck
  2017-11-23 13:13                         ` Peter Maydell
  1 sibling, 2 replies; 79+ messages in thread
From: Paolo Bonzini @ 2017-11-23 13:02 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Daniel P. Berrange, Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On 23/11/2017 13:39, Cornelia Huck wrote:
> I'm wondering how many people want to run e.g. x86_64-on-x86_64
> _without_ using an available kvm (and I expect those people to
> explicitly specify tcg).

I disagree.  I expect them to be "power users" enough to know that
qemu-system-x86_64 defaults to TCG.

Another possibility is that users come here asking for help, we tell
them to test qemu.git, and they are confused by the lack of a qemu-kvm
binary.  Ok, maybe not that likely, but it's a category which we want to
have a smooth experience.

>> And in fact that is the main reason why we have never bothered switching
>> the default... only RHEL does it, because it ships the QEMU binary as
>> qemu-kvm rather than qemu-system-xxx plus a wrapper script.
>>
>> Perhaps we could:
>>
>> 1) look for "qemu-{kvm,hvf,hax}" in argv[0] and change the "-accel" default?
>>
>> 2) change "make install" to install one or more of qemu-kvm/hvf/hax
>> based on target architecture and OS.
>>
>> Then distros can do away with the script and Windows/Mac users can learn
>> to use qemu-hvf and qemu-hax.
> 
> I'm not sure I like that. For me, qemu-kvm comes with the connotation
> of "there used to be a fork of qemu for kvm usage, and we stuck with
> the name because it is likely scattered through scripts".

In theory I don't like it either (and I hadn't thought about it until
today).  In practice, qemu-kvm is not going away from
blogs/scripts/tutorials in a decade, so we might as well embrace it...
especially since it has fewer issues than the alternative and even some
advantages:

1) scripts that hardcode qemu-system-x86_64 expecting to use TCG keep
working

2) it ensures that qemu-kvm works even for those who compile their own QEMU

3) it keeps behavior consistent across all qemu-system-* binaries

4) it reserves the unwieldy name for the thing that you don't want
(think of "exit" vs "_exit" in the C library)

5) we don't have to think about including hax or hvf in the -accel default

Paolo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 12:59                       ` Daniel P. Berrange
@ 2017-11-23 13:08                         ` Paolo Bonzini
  2017-11-23 13:23                           ` Daniel P. Berrange
  0 siblings, 1 reply; 79+ messages in thread
From: Paolo Bonzini @ 2017-11-23 13:08 UTC (permalink / raw)
  To: Daniel P. Berrange, Cornelia Huck
  Cc: Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On 23/11/2017 13:59, Daniel P. Berrange wrote:
>> I'm not sure I like that. For me, qemu-kvm comes with the connotation
>> of "there used to be a fork of qemu for kvm usage, and we stuck with
>> the name because it is likely scattered through scripts".
>
> Yes, qemu-kvm is a historical artifact in Fedora solely because of the
> previous fork.

In Fedora, Debian, Ubuntu, Arch Linux and NixOS at least.  AFAICS only
Gentoo doesn't have it.

Looks like something that upstream should provide, let's not fight
windmills.

Paolo

> We can't easily get rid of it because the path is encoded
> in libvirt XML files and countless end user scripts/configs.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 13:02                       ` Paolo Bonzini
@ 2017-11-23 13:13                         ` Cornelia Huck
  2017-11-23 13:27                           ` Paolo Bonzini
  2017-11-23 13:13                         ` Peter Maydell
  1 sibling, 1 reply; 79+ messages in thread
From: Cornelia Huck @ 2017-11-23 13:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Daniel P. Berrange, Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On Thu, 23 Nov 2017 14:02:12 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 23/11/2017 13:39, Cornelia Huck wrote:
> > I'm wondering how many people want to run e.g. x86_64-on-x86_64
> > _without_ using an available kvm (and I expect those people to
> > explicitly specify tcg).  
> 
> I disagree.  I expect them to be "power users" enough to know that
> qemu-system-x86_64 defaults to TCG.

Do we have any idea (from questions, bugzillas, ...) about which kind
of people actually do that?

[Coming from s390x, where tcg cannot run most current distros, I'm
lacking data.]

> 
> Another possibility is that users come here asking for help, we tell
> them to test qemu.git, and they are confused by the lack of a qemu-kvm
> binary.  Ok, maybe not that likely, but it's a category which we want to
> have a smooth experience.
> 
> >> And in fact that is the main reason why we have never bothered switching
> >> the default... only RHEL does it, because it ships the QEMU binary as
> >> qemu-kvm rather than qemu-system-xxx plus a wrapper script.
> >>
> >> Perhaps we could:
> >>
> >> 1) look for "qemu-{kvm,hvf,hax}" in argv[0] and change the "-accel" default?
> >>
> >> 2) change "make install" to install one or more of qemu-kvm/hvf/hax
> >> based on target architecture and OS.
> >>
> >> Then distros can do away with the script and Windows/Mac users can learn
> >> to use qemu-hvf and qemu-hax.  
> > 
> > I'm not sure I like that. For me, qemu-kvm comes with the connotation
> > of "there used to be a fork of qemu for kvm usage, and we stuck with
> > the name because it is likely scattered through scripts".  
> 
> In theory I don't like it either (and I hadn't thought about it until
> today).  In practice, qemu-kvm is not going away from
> blogs/scripts/tutorials in a decade, so we might as well embrace it...
> especially since it has fewer issues than the alternative and even some
> advantages:
> 
> 1) scripts that hardcode qemu-system-x86_64 expecting to use TCG keep
> working
> 
> 2) it ensures that qemu-kvm works even for those who compile their own QEMU
> 
> 3) it keeps behavior consistent across all qemu-system-* binaries
> 
> 4) it reserves the unwieldy name for the thing that you don't want
> (think of "exit" vs "_exit" in the C library)
> 
> 5) we don't have to think about including hax or hvf in the -accel default

One issue I see is that this naming convention only works with
same-architecture accelerators. You can't have a qemu-tcg as you don't
know which architecture is supposed to be emulated. (Or if you use
qemu-tcg as a shorthand for same-architecture emulation, you still have
the long names for anything else, which is even more confusing.)

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 13:02                       ` Paolo Bonzini
  2017-11-23 13:13                         ` Cornelia Huck
@ 2017-11-23 13:13                         ` Peter Maydell
  2017-11-23 13:51                           ` Paolo Bonzini
  1 sibling, 1 reply; 79+ messages in thread
From: Peter Maydell @ 2017-11-23 13:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Cornelia Huck, Thomas Huth, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Daniel P. Berrange, Eric Blake, Philippe Mathieu-Daudé,
	Sergio Andrés Gómez Del Real, Alex Bligh

On 23 November 2017 at 13:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
> In theory I don't like it either (and I hadn't thought about it until
> today).  In practice, qemu-kvm is not going away from
> blogs/scripts/tutorials in a decade, so we might as well embrace it...

Isn't this distro-specific? In ubuntu by default there isn't
any wrapper, and if you do install the optional 'qemu-kvm' package
the wrapper it provides is /usr/bin/kvm, not /usr/bin/qemu-kvm.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator
  2017-11-23 11:07       ` David Gibson
@ 2017-11-23 13:22         ` Cédric Le Goater
  0 siblings, 0 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-23 13:22 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

On 11/23/2017 12:07 PM, David Gibson wrote:
> On Fri, Nov 17, 2017 at 08:16:47AM +0100, Cédric Le Goater wrote:
>> On 11/17/2017 05:48 AM, David Gibson wrote:
>>> On Fri, Nov 10, 2017 at 03:20:09PM +0000, Cédric Le Goater wrote:
>>>> Currently, the ICSState 'ics' object of the sPAPR machine acts as the
>>>> global interrupt source handler and also as the IRQ number allocator
>>>> for the machine. Some IRQ numbers are allocated very early in the
>>>> machine initialization sequence to populate the device tree, and this
>>>> is a problem to introduce the new POWER XIVE interrupt model, as it
>>>> needs to share the IRQ numbers with the older model.
>>>>
>>>> To prepare ground for XIVE, here is a set of new XICSFabric operations
>>>> to let the machine handle directly the IRQ number allocation and to
>>>> decorrelate the allocation from the interrupt source object :
>>>>
>>>>     bool (*irq_test)(XICSFabric *xi, int irq);
>>>>     int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>>>>     void (*irq_free_block)(XICSFabric *xi, int irq, int num);
>>>>
>>>> In these prototypes, the 'irq' parameter refers to a number in the
>>>> global IRQ number space. Indexes for arrays storing different state
>>>> information on the interrupts, like the ICSIRQState, are usually
>>>> named 'srcno'.
>>>>
>>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>>
>>> This doesn't seem sensible to me.  When I said you should move irq
>>> allocation to the machine, I mean actually move the code.  The only
>>> user of irq allocation should be in the machine, so we shouldn't need
>>> to indirect via the XICSFabric interface to do that.
>>
>> OK. so we can probably do the same with machine class handlers because 
>> we do need an indirection to handle the way older pseries machines 
>> allocate IRQs that will change with newer machines  supporting XIVE.
> 
> Right.  You could do it either with a MachineClass callback (similar
> to the phb placement one), or with just a flag in the MachineClass
> that's checked explicitly.  I'd be ok with either approach.

I have changed the approach in the latest XIVE patchset and I am no
longer using handlers of any sort. They do not seem necessary, but if
they are, you will have a global view of what XIVE requires to decide
which direction to take.

I will send the patchset later in the day.

Thanks,

C.  
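
For illustration, the two alternatives David mentions (a MachineClass
callback, or a MachineClass flag that is checked explicitly) could look
roughly like the sketch below. It is not QEMU code: MachineClassSketch,
MachineSketch and both toy allocators are hypothetical names made up for
this example, and the allocation policies are placeholders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a machine and its class; not the QEMU types. */
typedef struct MachineSketch MachineSketch;

typedef struct MachineClassSketch {
    /* Option 1: a per-class callback that newer machine types override. */
    int (*irq_alloc)(MachineSketch *m, int count);
    /* Option 2: a plain flag that the caller tests explicitly. */
    bool legacy_irq_allocation;
} MachineClassSketch;

struct MachineSketch {
    const MachineClassSketch *mc;
    int next_irq;               /* next free number for the toy allocators */
};

/* Toy "old" policy: hand out numbers one after another. */
static int legacy_irq_alloc(MachineSketch *m, int count)
{
    int first = m->next_irq;

    m->next_irq += count;
    return first;
}

/* Toy "new" policy: same, but every block starts on a multiple of 4. */
static int new_irq_alloc(MachineSketch *m, int count)
{
    int first = (m->next_irq + 3) & ~3;

    m->next_irq = first + count;
    return first;
}

/* Option 1: indirect through the class callback. */
static int alloc_via_callback(MachineSketch *m, int count)
{
    assert(m->mc->irq_alloc != NULL);
    return m->mc->irq_alloc(m, count);
}

/* Option 2: check the class flag at the call site. */
static int alloc_via_flag(MachineSketch *m, int count)
{
    return m->mc->legacy_irq_allocation ? legacy_irq_alloc(m, count)
                                        : new_irq_alloc(m, count);
}
```

Either way the choice lives in the machine class, so an older pseries
machine type keeps its allocation behaviour while a newer one can
change it.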

>>> And, we shouldn't be using XICSFabric things for XIVE.
>>
>> ok. The spapr machine should be enough. 
>>
>> Thanks,
>>
>> C.
>>  
>>>> ---
>>>>  hw/ppc/spapr.c        | 19 +++++++++++++++++++
>>>>  include/hw/ppc/xics.h |  4 ++++
>>>>  2 files changed, 23 insertions(+)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index a2dcbee07214..84d68f2fdbae 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -3536,6 +3536,21 @@ static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>>>>      return cpu ? ICP(cpu->intc) : NULL;
>>>>  }
>>>>  
>>>> +static bool spapr_irq_test(XICSFabric *xi, int irq)
>>>> +{
>>>> +    return false;
>>>> +}
>>>> +
>>>> +static int spapr_irq_alloc_block(XICSFabric *xi, int count, int align)
>>>> +{
>>>> +    return -1;
>>>> +}
>>>> +
>>>> +static void spapr_irq_free_block(XICSFabric *xi, int irq, int num)
>>>> +{
>>>> +    ;
>>>> +}
>>>> +
>>>>  static void spapr_pic_print_info(InterruptStatsProvider *obj,
>>>>                                   Monitor *mon)
>>>>  {
>>>> @@ -3630,6 +3645,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>>>      xic->ics_get = spapr_ics_get;
>>>>      xic->ics_resend = spapr_ics_resend;
>>>>      xic->icp_get = spapr_icp_get;
>>>> +    xic->irq_test = spapr_irq_test;
>>>> +    xic->irq_alloc_block = spapr_irq_alloc_block;
>>>> +    xic->irq_free_block = spapr_irq_free_block;
>>>> +
>>>>      ispc->print_info = spapr_pic_print_info;
>>>>      /* Force NUMA node memory size to be a multiple of
>>>>       * SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity
>>>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>>>> index 28d248abad61..30e7f2e0a7dd 100644
>>>> --- a/include/hw/ppc/xics.h
>>>> +++ b/include/hw/ppc/xics.h
>>>> @@ -175,6 +175,10 @@ typedef struct XICSFabricClass {
>>>>      ICSState *(*ics_get)(XICSFabric *xi, int irq);
>>>>      void (*ics_resend)(XICSFabric *xi);
>>>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
>>>> +    /* IRQ allocator helpers */
>>>> +    bool (*irq_test)(XICSFabric *xi, int irq);
>>>> +    int (*irq_alloc_block)(XICSFabric *xi, int count, int align);
>>>> +    void (*irq_free_block)(XICSFabric *xi, int irq, int num);
>>>>  } XICSFabricClass;
>>>>  
>>>>  #define XICS_IRQS_SPAPR               1024
>>>
>>
> 
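
The three prototypes in the patch above are left as stubs. As a rough
idea of what a bitmap-backed allocator living directly in the machine
could look like, here is a minimal sketch. It is an assumption-laden
illustration, not the code from the series: IrqBitmap, irq_set() and
SKETCH_NR_IRQS are hypothetical, the XICSFabric argument is dropped, and
no IRQ base offset is applied.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_NR_IRQS 1024     /* same size as XICS_IRQS_SPAPR in the diff */
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical allocator state: one bit per IRQ number, set when in use. */
typedef struct IrqBitmap {
    unsigned long words[SKETCH_NR_IRQS / (sizeof(unsigned long) * CHAR_BIT)];
} IrqBitmap;

/* Returns true when 'irq' is in range and currently allocated. */
static bool irq_test(const IrqBitmap *b, int irq)
{
    if (irq < 0 || irq >= SKETCH_NR_IRQS) {
        return false;
    }
    return (b->words[irq / BITS_PER_WORD] >> (irq % BITS_PER_WORD)) & 1UL;
}

/* Marks one in-range IRQ number as used or free. */
static void irq_set(IrqBitmap *b, int irq, bool used)
{
    unsigned long mask = 1UL << (irq % BITS_PER_WORD);

    if (used) {
        b->words[irq / BITS_PER_WORD] |= mask;
    } else {
        b->words[irq / BITS_PER_WORD] &= ~mask;
    }
}

/*
 * Finds 'count' consecutive free numbers whose first number is a
 * multiple of 'align', marks them used and returns the first one.
 * Returns -1 when no such run exists.
 */
static int irq_alloc_block(IrqBitmap *b, int count, int align)
{
    assert(count > 0 && align > 0);

    for (int first = 0; first + count <= SKETCH_NR_IRQS; first += align) {
        int i = 0;

        while (i < count && !irq_test(b, first + i)) {
            i++;
        }
        if (i == count) {
            for (i = 0; i < count; i++) {
                irq_set(b, first + i, true);
            }
            return first;
        }
    }
    return -1;
}

/* Releases 'num' numbers starting at 'irq'; out-of-range ones are skipped. */
static void irq_free_block(IrqBitmap *b, int irq, int num)
{
    for (int i = 0; i < num; i++) {
        if (irq + i >= 0 && irq + i < SKETCH_NR_IRQS) {
            irq_set(b, irq + i, false);
        }
    }
}
```

The linear scan is first-fit and is only meant to show the alignment
and free-run logic; a real implementation would more likely reuse
existing bitmap helpers.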


* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 13:08                         ` Paolo Bonzini
@ 2017-11-23 13:23                           ` Daniel P. Berrange
  2017-11-23 13:25                             ` Paolo Bonzini
  0 siblings, 1 reply; 79+ messages in thread
From: Daniel P. Berrange @ 2017-11-23 13:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Cornelia Huck, Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On Thu, Nov 23, 2017 at 02:08:42PM +0100, Paolo Bonzini wrote:
> On 23/11/2017 13:59, Daniel P. Berrange wrote:
> >> I'm not sure I like that. For me, qemu-kvm comes with the connotation
> >> of "there used to be a fork of qemu for kvm usage, and we stuck with
> >> the name because it is likely scattered through scripts".
> >
> > Yes, qemu-kvm is a historical artifact in Fedora solely because of the
> > previous fork.
> 
> In Fedora, Debian, Ubuntu, Arch Linux and NixOS at least.  AFAICS only
> Gentoo doesn't have it.
> 
> Looks like something that upstream should provide, let's not fight
> windmills.

I thought that Ubuntu / Debian had /usr/bin/kvm originally instead of
qemu-kvm


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 13:23                           ` Daniel P. Berrange
@ 2017-11-23 13:25                             ` Paolo Bonzini
  0 siblings, 0 replies; 79+ messages in thread
From: Paolo Bonzini @ 2017-11-23 13:25 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Cornelia Huck, Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On 23/11/2017 14:23, Daniel P. Berrange wrote:
> On Thu, Nov 23, 2017 at 02:08:42PM +0100, Paolo Bonzini wrote:
>> On 23/11/2017 13:59, Daniel P. Berrange wrote:
>>>> I'm not sure I like that. For me, qemu-kvm comes with the connotation
>>>> of "there used to be a fork of qemu for kvm usage, and we stuck with
>>>> the name because it is likely scattered through scripts".
>>>
>>> Yes, qemu-kvm is a historical artifact in Fedora solely because of the
>>> previous fork.
>>
>> In Fedora, Debian, Ubuntu, Arch Linux and NixOS at least.  AFAICS only
>> Gentoo doesn't have it.
>>
>> Looks like something that upstream should provide, let's not fight
>> windmills.
> 
> I thought that Ubuntu / Debian had /usr/bin/kvm originally instead of
> qemu-kvm

Yeah, though the package is named qemu-kvm (confusing...).  Still the
point stands that if there is distro confusion, we should consider
looking at what the de facto standards are, and provide it ourselves in
a way that makes sense (and /usr/bin/kvm doesn't).

Paolo


* Re: [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation
  2017-11-23 11:12       ` David Gibson
@ 2017-11-23 13:26         ` Cédric Le Goater
  0 siblings, 0 replies; 79+ messages in thread
From: Cédric Le Goater @ 2017-11-23 13:26 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Greg Kurz, Benjamin Herrenschmidt

On 11/23/2017 12:12 PM, David Gibson wrote:
> On Fri, Nov 17, 2017 at 08:23:00AM +0100, Cédric Le Goater wrote:
>> On 11/17/2017 05:54 AM, David Gibson wrote:
>>> On Fri, Nov 10, 2017 at 03:20:14PM +0000, Cédric Le Goater wrote:
>>>> It will be used later on to distinguish the allocation of an LSI
>>>> interrupt from an MSI and also to reduce the use of the ICSIRQState
>>>> array of the ICSState object, which is on our way to introduce XIVE.
>>>>
>>>> The 'irq' parameter continues to refer to the global IRQ number space.
>>>>
>>>> On PowerNV, only the PSI controller interrupts are handled and they
>>>> are all LSIs.
>>>>
>>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>>
>>> !?! AFAICT this is a step backwards.  The users of ics_is_lsi() here
>>> are in the xics code.  So they already have the right information
>>> locally in the ICSState object.  Why on earth would you indirect
>>> through the fabric.
>>
>> because I am trying to get rid of the fact that the current ICS 
>> allocator handles two things at the same time : IRQ allocation 
>> and IRQ type. And that's a problem because the ICSState object 
>> is just used everywhere in the code.
> 
> That's a good goal, but I don't think this is the right way there.
> One of the reasons that conflation of allocation and typing is bad is
> that the typing is essentially local information to the PIC itself,
> whereas the allocation is a machine concern.

OK.

> By using a XICSFabric hook you're essentially wiring the irq typing
> through the machine, which again seems like a step in the wrong
> direction.

I agree.
 
>> OK, So that's is not to your liking, I will come up with some 
>> other solution. Thanks for looking.
> 
> Sorry if I was a bit harsh; quite a few unrelated things have been
> frustrating me, which has made me grumpy.

No problem. I am also trying to cover too many things at the same
time. XIVE is a big enough piece on its own.

I think the XIVE patchset is stabilizing, so I will send it today
and see how it is perceived. But let's keep in mind that we have
some more needs for the PHBs and their IRQ placement.

Thanks,

C.    
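
To make the distinction discussed above concrete, here is a small
sketch of two ways an LSI/MSI test can be answered: from state that is
local to the interrupt source, or from the position of the number in a
split IRQ number space (LSIs first, then MSIs, as the cover letter
proposes). All names here (SourceSketch, SKETCH_FLAG_LSI, both
functions) are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FLAG_LSI 0x1     /* hypothetical per-IRQ flag bit */

/*
 * Hypothetical interrupt source owning the window
 * [offset, offset + nr_irqs) of the global IRQ number space.
 */
typedef struct SourceSketch {
    int offset;
    int nr_irqs;
    uint8_t flags[16];          /* indexed by 'srcno', not by global 'irq' */
} SourceSketch;

/* Local typing: the source answers from its own per-IRQ state. */
static bool source_irq_is_lsi(const SourceSketch *s, int irq)
{
    int srcno = irq - s->offset;        /* global number -> array index */

    assert(srcno >= 0 && srcno < s->nr_irqs);
    return (s->flags[srcno] & SKETCH_FLAG_LSI) != 0;
}

/* Range-based typing: the first 'nr_lsis' numbers from 'irq_base' are LSIs. */
static bool range_irq_is_lsi(int irq, int irq_base, int nr_lsis)
{
    return irq >= irq_base && irq < irq_base + nr_lsis;
}
```

The first form keeps the type next to the controller, which is the
property David argues for; the second needs no per-IRQ state at all but
fixes the layout of the number space.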

   


* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 13:13                         ` Cornelia Huck
@ 2017-11-23 13:27                           ` Paolo Bonzini
  0 siblings, 0 replies; 79+ messages in thread
From: Paolo Bonzini @ 2017-11-23 13:27 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Thomas Huth, Peter Maydell, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Daniel P. Berrange, Eric Blake, Philippe Mathieu-Daudé,
	sergio.g.delreal, alex

On 23/11/2017 14:13, Cornelia Huck wrote:
> On Thu, 23 Nov 2017 14:02:12 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> On 23/11/2017 13:39, Cornelia Huck wrote:
>>> I'm wondering how many people want to run e.g. x86_64-on-x86_64
>>> _without_ using an available kvm (and I expect those people to
>>> explicitly specify tcg).  
>>
>> I disagree.  I expect them to be "power users" enough to know that
>> qemu-system-x86_64 defaults to TCG.
> 
> Do we have any idea (from questions, bugzillas, ...) about which kind
> of people actually do that?
> 
> [Coming from s390x, where tcg cannot run most current distros, I'm
> lacking data.]

For example Linux or other OS developers that want to test x86 features
not in any shipping processor.  (See for example 5-level page tables
that were recently contributed to QEMU).

>> In theory I don't like it either (and I hadn't thought about it until
>> today).  In practice, qemu-kvm is not going away from
>> blogs/scripts/tutorials in a decade, so we might as well embrace it...
>
> One issue I see is that this naming convention only works with
> same-architecture accelerators. You can't have a qemu-tcg as you don't
> know which architecture is supposed to be emulated. (Or if you use
> qemu-tcg as a shorthand for same-architecture emulation, you still have
> the long names for anything else, which is even more confusing.)

Right, there would be no qemu-tcg, that would keep qemu-system-ARCH.

Paolo


* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 13:13                         ` Peter Maydell
@ 2017-11-23 13:51                           ` Paolo Bonzini
  2017-11-23 13:57                             ` Peter Maydell
  2017-11-23 13:57                             ` Daniel P. Berrange
  0 siblings, 2 replies; 79+ messages in thread
From: Paolo Bonzini @ 2017-11-23 13:51 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Cornelia Huck, Thomas Huth, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Daniel P. Berrange, Eric Blake, Philippe Mathieu-Daudé,
	Sergio Andrés Gómez Del Real, Alex Bligh

On 23/11/2017 14:13, Peter Maydell wrote:
> On 23 November 2017 at 13:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> In theory I don't like it either (and I hadn't thought about it until
>> today).  In practice, qemu-kvm is not going away from
>> blogs/scripts/tutorials in a decade, so we might as well embrace it...
> Isn't this distro-specific? In ubuntu by default there isn't
> any wrapper, and if you do install the optional 'qemu-kvm' package
> the wrapper it provides is /usr/bin/kvm, not /usr/bin/qemu-kvm.

Fedora also has no wrapper in the qemu-system-x86 package, and only
"qemu-kvm" installs one.  In practice if you install the virtualization
package group you get it.  What about Ubuntu?

Paolo


* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 13:51                           ` Paolo Bonzini
@ 2017-11-23 13:57                             ` Peter Maydell
  2017-11-23 14:01                               ` Thomas Huth
  2017-11-23 13:57                             ` Daniel P. Berrange
  1 sibling, 1 reply; 79+ messages in thread
From: Peter Maydell @ 2017-11-23 13:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Cornelia Huck, Thomas Huth, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Daniel P. Berrange, Eric Blake, Philippe Mathieu-Daudé,
	Sergio Andrés Gómez Del Real, Alex Bligh

On 23 November 2017 at 13:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 23/11/2017 14:13, Peter Maydell wrote:
>> On 23 November 2017 at 13:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> In theory I don't like it either (and I hadn't thought about it until
>>> today).  In practice, qemu-kvm is not going away from
>>> blogs/scripts/tutorials in a decade, so we might as well embrace it...
>> Isn't this distro-specific? In ubuntu by default there isn't
>> any wrapper, and if you do install the optional 'qemu-kvm' package
>> the wrapper it provides is /usr/bin/kvm, not /usr/bin/qemu-kvm.
>
> Fedora also has no wrapper in the qemu-system-x86 package, and only
> "qemu-kvm" installs one.  In practice if you install the virtualization
> package group you get it.  What about Ubuntu?

Well, I didn't have the qemu-kvm package installed until I
pulled it in to check the wrapper name.

My point is more that if there's no consensus between distros
about what the wrapper script name should be then as upstream
if we provide a qemu-kvm then we might be helping Fedora/RedHat
but we're just increasing confusion for those distros that
used a different name or have already transitioned away from
the wrapper entirely.

thanks
-- PMM


* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 13:51                           ` Paolo Bonzini
  2017-11-23 13:57                             ` Peter Maydell
@ 2017-11-23 13:57                             ` Daniel P. Berrange
  1 sibling, 0 replies; 79+ messages in thread
From: Daniel P. Berrange @ 2017-11-23 13:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Cornelia Huck, Thomas Huth, Cédric Le Goater,
	QEMU Developers, David Gibson, Greg Kurz, Markus Armbruster,
	Eric Blake, Philippe Mathieu-Daudé,
	Sergio Andrés Gómez Del Real, Alex Bligh

On Thu, Nov 23, 2017 at 02:51:51PM +0100, Paolo Bonzini wrote:
> On 23/11/2017 14:13, Peter Maydell wrote:
> > On 23 November 2017 at 13:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >> In theory I don't like it either (and I hadn't thought about it until
> >> today).  In practice, qemu-kvm is not going away from
> >> blogs/scripts/tutorials in a decade, so we might as well embrace it...
> > Isn't this distro-specific? In ubuntu by default there isn't
> > any wrapper, and if you do install the optional 'qemu-kvm' package
> > the wrapper it provides is /usr/bin/kvm, not /usr/bin/qemu-kvm.
> 
> Fedora also has no wrapper in the qemu-system-x86 package, and only
> "qemu-kvm" installs one.  In practice if you install the virtualization
> package group you get it.  What about Ubuntu?

Actually not quite correct. Historically '/usr/bin/qemu-kvm' was provided
by whichever 'qemu-system-$ARCH' RPM matches your host arch. With recent
modularization, it's now provided by 'qemu-system-$ARCH-core'. So everyone
will have qemu-kvm if they've installed the sub-RPM matching their host
arch.

The 'qemu-kvm' RPM is just an empty RPM that depends on whichever
'qemu-system-$ARCH' matches your host, and thus provides '/usr/bin/qemu-kvm'.
This is just a convenience to let downstream app RPMs depend on qemu-kvm
instead of a big set of per-arch conditionals.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 13:57                             ` Peter Maydell
@ 2017-11-23 14:01                               ` Thomas Huth
  2017-11-23 14:13                                 ` Paolo Bonzini
  0 siblings, 1 reply; 79+ messages in thread
From: Thomas Huth @ 2017-11-23 14:01 UTC (permalink / raw)
  To: Peter Maydell, Paolo Bonzini
  Cc: Cornelia Huck, Cédric Le Goater, QEMU Developers,
	David Gibson, Greg Kurz, Markus Armbruster, Daniel P. Berrange,
	Eric Blake, Philippe Mathieu-Daudé,
	Sergio Andrés Gómez Del Real, Alex Bligh

On 23.11.2017 14:57, Peter Maydell wrote:
> On 23 November 2017 at 13:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 23/11/2017 14:13, Peter Maydell wrote:
>>> On 23 November 2017 at 13:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> In theory I don't like it either (and I hadn't thought about it until
>>>> today).  In practice, qemu-kvm is not going away from
>>>> blogs/scripts/tutorials in a decade, so we might as well embrace it...
>>> Isn't this distro-specific? In ubuntu by default there isn't
>>> any wrapper, and if you do install the optional 'qemu-kvm' package
>>> the wrapper it provides is /usr/bin/kvm, not /usr/bin/qemu-kvm.
>>
>> Fedora also has no wrapper in the qemu-system-x86 package, and only
>> "qemu-kvm" installs one.  In practice if you install the virtualization
>> package group you get it.  What about Ubuntu?
> 
> Well, I didn't have the qemu-kvm package installed until I
> pulled it in to check the wrapper name.
> 
> My point is more that if there's no consensus between distros
> about what the wrapper script name should be then as upstream
> if we provide a qemu-kvm then we might be helping Fedora/RedHat
> but we're just increasing confusion for those distros that
> used a different name
We could simply check whether argv[0] ends in "kvm" ... that should work
with both "kvm" and "qemu-kvm".

 Thomas
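
A minimal sketch of that argv[0] check, for illustration only. The
helper names invoked_as_kvm() and default_accel_for() are made up here
and are not existing QEMU functions; the "kvm:tcg" and "tcg" strings are
the defaults being discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True when the program name (argv[0], with or without a path) ends in "kvm". */
static bool invoked_as_kvm(const char *argv0)
{
    static const char suffix[] = "kvm";
    size_t slen = sizeof(suffix) - 1;
    size_t len;

    assert(argv0 != NULL);
    len = strlen(argv0);
    return len >= slen && strcmp(argv0 + len - slen, suffix) == 0;
}

/* Picks the default accelerator list from the invocation name. */
static const char *default_accel_for(const char *argv0)
{
    return invoked_as_kvm(argv0) ? "kvm:tcg" : "tcg";
}
```

A plain suffix match accepts both "/usr/bin/kvm" and "qemu-kvm", but it
would also accept any other name that happens to end in "kvm", so a
stricter match on the basename may be preferable.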


* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 14:01                               ` Thomas Huth
@ 2017-11-23 14:13                                 ` Paolo Bonzini
  0 siblings, 0 replies; 79+ messages in thread
From: Paolo Bonzini @ 2017-11-23 14:13 UTC (permalink / raw)
  To: Thomas Huth, Peter Maydell
  Cc: Cornelia Huck, Cédric Le Goater, QEMU Developers,
	David Gibson, Greg Kurz, Markus Armbruster, Daniel P. Berrange,
	Eric Blake, Philippe Mathieu-Daudé,
	Sergio Andrés Gómez Del Real, Alex Bligh

On 23/11/2017 15:01, Thomas Huth wrote:
> On 23.11.2017 14:57, Peter Maydell wrote:
>> On 23 November 2017 at 13:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 23/11/2017 14:13, Peter Maydell wrote:
>>>> On 23 November 2017 at 13:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>> In theory I don't like it either (and I hadn't thought about it until
>>>>> today).  In practice, qemu-kvm is not going away from
>>>>> blogs/scripts/tutorials in a decade, so we might as well embrace it...
>>>> Isn't this distro-specific? In ubuntu by default there isn't
>>>> any wrapper, and if you do install the optional 'qemu-kvm' package
>>>> the wrapper it provides is /usr/bin/kvm, not /usr/bin/qemu-kvm.
>>>
>>> Fedora also has no wrapper in the qemu-system-x86 package, and only
>>> "qemu-kvm" installs one.  In practice if you install the virtualization
>>> package group you get it.  What about Ubuntu?
>>
>> Well, I didn't have the qemu-kvm package installed until I
>> pulled it in to check the wrapper name.
>>
>> My point is more that if there's no consensus between distros
>> about what the wrapper script name should be then as upstream
>> if we provide a qemu-kvm then we might be helping Fedora/RedHat
>> but we're just increasing confusion for those distros that
>> used a different name
>
> We could simply check whether argv[0] ends in "kvm" ... that should work
> with both "kvm" and "qemu-kvm".

We could also have two completely different binaries, one of which adds
a file accel/kvm/default-accel.c:

   const char *default_accel = "kvm:tcg";

(likewise for hax and hvf).  For everyone else stubs/default-accel.c
provides:

   const char *default_accel = "tcg";

It shouldn't be hard to add it to Makefile.target.  Then Debian can
install our qemu-kvm binary as /usr/bin/kvm.

Paolo
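
As a sketch of how such a colon-separated default could be consumed,
the code below walks the list and picks the first accelerator that a
probe reports as usable. This is an assumption about the consuming
side, not QEMU's actual accelerator setup code: pick_accel(),
accel_probe_fn and only_tcg() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * In the proposal this definition would come from either
 * accel/kvm/default-accel.c or stubs/default-accel.c at link time.
 */
static const char *default_accel = "kvm:tcg";

/* Hypothetical probe: is the accelerator named by 'name[0..len)' usable? */
typedef bool (*accel_probe_fn)(const char *name, size_t len);

/*
 * Copies the first usable accelerator name from the colon-separated
 * 'list' into 'out'. Returns false when none of the entries is usable.
 */
static bool pick_accel(const char *list, accel_probe_fn available,
                       char *out, size_t outsz)
{
    assert(list != NULL && available != NULL && out != NULL && outsz > 0);

    while (*list != '\0') {
        size_t len = strcspn(list, ":");

        if (len > 0 && len < outsz && available(list, len)) {
            memcpy(out, list, len);
            out[len] = '\0';
            return true;
        }
        list += len;
        if (*list == ':') {
            list++;             /* skip the separator */
        }
    }
    return false;
}

/* Example probe: pretend that only TCG is usable on this host. */
static bool only_tcg(const char *name, size_t len)
{
    return len == 3 && memcmp(name, "tcg", 3) == 0;
}
```

With the "kvm:tcg" default and a host where only TCG is usable, the
walk skips "kvm" and settles on "tcg", which is the fallback behaviour
the two-binary proposal relies on.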


* Re: [Qemu-devel] QEMU 3.0 ?
  2017-11-23 11:17           ` Paolo Bonzini
  2017-11-23 11:57             ` Thomas Huth
@ 2017-11-23 14:57             ` Igor Mammedov
  1 sibling, 0 replies; 79+ messages in thread
From: Igor Mammedov @ 2017-11-23 14:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Peter Maydell, Cornelia Huck,
	Philippe Mathieu-Daudé,
	QEMU Developers, Markus Armbruster, Greg Kurz,
	Cédric Le Goater, David Gibson

On Thu, 23 Nov 2017 12:17:54 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 23/11/2017 11:57, Thomas Huth wrote:
> > On 23.11.2017 11:17, Peter Maydell wrote:  
> >> On 23 November 2017 at 10:03, Cornelia Huck <cohuck@redhat.com> wrote:  
> >>> On Mon, 13 Nov 2017 08:14:28 +0100
> >>> Thomas Huth <thuth@redhat.com> wrote:
> >>>  
...  
> 
> "hax" is very far from feature parity with TCG, it doesn't even support
> CPUID (-cpu).  "-accel kvm:hvf:tcg" could be a possibility, but only if
> we have resources to test it.  As far as I know the only active x86
> developer who owns a Mac is Igor?
I can test occasionally, but the hw is out of warranty, so no hard promises from me to do it.

> Paolo
> 
> 


Thread overview: 79+ messages
2017-11-10 15:20 [Qemu-devel] [PATCH for-2.12 v3 00/11] spapr: introduce an IRQ allocator at the machine level Cédric Le Goater
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type Cédric Le Goater
2017-11-11 15:15   ` Greg Kurz
2017-11-13  5:51   ` David Gibson
2017-11-13  9:50     ` Greg Kurz
2017-11-14  9:08       ` David Gibson
2017-11-13  7:14   ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Thomas Huth
2017-11-13  9:53     ` Peter Maydell
2017-11-13 10:03       ` [Qemu-devel] QEMU 3.0 ? Cédric Le Goater
2017-11-13 10:21         ` Peter Maydell
2017-11-13 10:25       ` Thomas Huth
2017-11-23 10:03     ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Cornelia Huck
2017-11-23 10:17       ` Peter Maydell
2017-11-23 10:57         ` [Qemu-devel] QEMU 3.0 ? Thomas Huth
2017-11-23 11:11           ` Daniel P. Berrange
2017-11-23 11:24             ` Thomas Huth
2017-11-23 11:33               ` Daniel P. Berrange
2017-11-23 11:40                 ` Thomas Huth
2017-11-23 11:17           ` Paolo Bonzini
2017-11-23 11:57             ` Thomas Huth
2017-11-23 12:05               ` Paolo Bonzini
2017-11-23 12:09                 ` Cornelia Huck
2017-11-23 12:26                   ` Paolo Bonzini
2017-11-23 12:39                     ` Cornelia Huck
2017-11-23 12:59                       ` Daniel P. Berrange
2017-11-23 13:08                         ` Paolo Bonzini
2017-11-23 13:23                           ` Daniel P. Berrange
2017-11-23 13:25                             ` Paolo Bonzini
2017-11-23 13:02                       ` Paolo Bonzini
2017-11-23 13:13                         ` Cornelia Huck
2017-11-23 13:27                           ` Paolo Bonzini
2017-11-23 13:13                         ` Peter Maydell
2017-11-23 13:51                           ` Paolo Bonzini
2017-11-23 13:57                             ` Peter Maydell
2017-11-23 14:01                               ` Thomas Huth
2017-11-23 14:13                                 ` Paolo Bonzini
2017-11-23 13:57                             ` Daniel P. Berrange
2017-11-23 14:57             ` Igor Mammedov
2017-11-23 11:14         ` [Qemu-devel] QEMU 3.0 ? (was: [PATCH for-2.12 v3 01/11] spapr: add pseries 2.12 machine type) Daniel P. Berrange
2017-11-23 11:26           ` [Qemu-devel] QEMU 3.0 ? Thomas Huth
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 02/11] ppc/xics: remove useless if condition Cédric Le Goater
2017-11-11 14:50   ` Greg Kurz
2017-11-13  5:28   ` David Gibson
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 03/11] spapr: introduce new XICSFabric operations for an IRQ allocator Cédric Le Goater
2017-11-14  8:52   ` Greg Kurz
2017-11-17  4:48   ` David Gibson
2017-11-17  7:16     ` Cédric Le Goater
2017-11-23 11:07       ` David Gibson
2017-11-23 13:22         ` Cédric Le Goater
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 04/11] spapr: move current IRQ allocation under the machine Cédric Le Goater
2017-11-14  8:56   ` Greg Kurz
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 05/11] spapr: introduce an IRQ allocator using a bitmap Cédric Le Goater
2017-11-14  9:42   ` Greg Kurz
2017-11-14 11:54     ` Cédric Le Goater
2017-11-14 15:28       ` Greg Kurz
2017-11-15  8:47         ` Cédric Le Goater
2017-11-17  4:50     ` David Gibson
2017-11-17  7:19       ` Cédric Le Goater
2017-11-23 11:08         ` David Gibson
2017-11-20 12:07       ` Greg Kurz
2017-11-23 11:13         ` David Gibson
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 06/11] spapr: store a reference IRQ bitmap Cédric Le Goater
2017-11-14 15:12   ` Greg Kurz
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 07/11] spapr: introduce an 'irq_base' number Cédric Le Goater
2017-11-14 15:45   ` Greg Kurz
2017-11-15 15:24     ` Cédric Le Goater
2017-11-15 16:43       ` Greg Kurz
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 08/11] spapr: introduce a XICSFabric irq_is_lsi() operation Cédric Le Goater
2017-11-14 16:21   ` Greg Kurz
2017-11-17  4:54   ` David Gibson
2017-11-17  7:23     ` Cédric Le Goater
2017-11-23 11:12       ` David Gibson
2017-11-23 13:26         ` Cédric Le Goater
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 09/11] spapr: split the IRQ number space for LSI interrupts Cédric Le Goater
2017-11-15 15:52   ` Greg Kurz
2017-11-15 16:08     ` Cédric Le Goater
2017-11-15 20:27       ` Greg Kurz
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 10/11] sparp: merge ics_set_irq_type() in irq_alloc_block() operation Cédric Le Goater
2017-11-10 15:20 ` [Qemu-devel] [PATCH for-2.12 v3 11/11] spapr: use sPAPRMachineState in spapr_ics_ prototypes Cédric Le Goater
