All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/3] introduce a fixed IRQ number space
@ 2018-06-15 11:53 Cédric Le Goater
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence Cédric Le Goater
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-15 11:53 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

Hello,

Here is a proposal for a new IRQ number space layout using static
numbers and a bitmap allocator for the MSIs. The previous layout is
kept for compatibly in machines raising the 'xics_legacy' flag.

These are just the basics preparing ground for the new XIVE
controller. I would also like to introduce a structure like below to
abstract the IRQ controller in the sPAPR machine :


typedef struct sPAPRIrq {
    uint32_t    nr_irqs;
    uint8_t     ov5;

    void (*init)(sPAPRMachineState *spapr, uint32_t nr_servers, Error **errp);
    int (*claim)(sPAPRMachineState *spapr, uint32_t range, uint32_t irq,
                  Error **errp);
    void (*free)(sPAPRMachineState *spapr, int irq, int num, Error **errp);
    qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
    void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
    void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
                        void *fdt, uint32_t phandle);
    Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
                               Error **errp);
    void (*post_load)(sPAPRMachineState *spapr);
} sPAPRIrq;


Let's discuss !

Cheers,

C.

Cédric Le Goater (3):
  spapr: split the IRQ allocation sequence
  spapr: remove unused spapr_irq routines
  spapr: introduce a fixed IRQ number space

 include/hw/ppc/spapr.h     |  11 +++--
 include/hw/ppc/spapr_irq.h |  30 ++++++++++++
 hw/ppc/spapr.c             | 115 ++++++++++++++++++++++++++-------------------
 hw/ppc/spapr_events.c      |  26 ++++++++--
 hw/ppc/spapr_irq.c         |  57 ++++++++++++++++++++++
 hw/ppc/spapr_pci.c         |  35 ++++++++++++--
 hw/ppc/spapr_vio.c         |  17 ++++++-
 hw/ppc/Makefile.objs       |   2 +-
 8 files changed, 232 insertions(+), 61 deletions(-)
 create mode 100644 include/hw/ppc/spapr_irq.h
 create mode 100644 hw/ppc/spapr_irq.c

-- 
2.13.6

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence
  2018-06-15 11:53 [Qemu-devel] [PATCH 0/3] introduce a fixed IRQ number space Cédric Le Goater
@ 2018-06-15 11:53 ` Cédric Le Goater
  2018-06-15 13:14   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
  2018-06-18  4:00   ` [Qemu-devel] " David Gibson
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 2/3] spapr: remove unused spapr_irq routines Cédric Le Goater
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 3/3] spapr: introduce a fixed IRQ number space Cédric Le Goater
  2 siblings, 2 replies; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-15 11:53 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

Today, when a device requests for IRQ number in a sPAPR machine, the
spapr_irq_alloc() routine first scans the ICSState status array to
find an empty slot and then performs the assignement of the selected
numbers. Split this sequence in two distinct routines : spapr_irq_find()
for lookups and spapr_irq_claim() for claiming the IRQ numbers.

This will ease the introduction of a static layout of IRQ numbers.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 include/hw/ppc/spapr.h |  5 ++++
 hw/ppc/spapr.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_events.c  | 18 ++++++++++----
 hw/ppc/spapr_pci.c     | 19 ++++++++++++---
 hw/ppc/spapr_vio.c     | 10 +++++++-
 5 files changed, 108 insertions(+), 8 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 3388750fc795..6088f44c1b2a 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -776,6 +776,11 @@ int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
                     Error **errp);
 int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
                           bool align, Error **errp);
+int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
+                   Error **errp);
+#define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
+int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
+                    Error **errp);
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
 
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f59999daacfc..b1d19b328166 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3810,6 +3810,36 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
     return -1;
 }
 
+int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
+{
+    ICSState *ics = spapr->ics;
+    int first = -1;
+
+    assert(ics);
+
+    /*
+     * MSIMesage::data is used for storing VIRQ so
+     * it has to be aligned to num to support multiple
+     * MSI vectors. MSI-X is not affected by this.
+     * The hint is used for the first IRQ, the rest should
+     * be allocated continuously.
+     */
+    if (align) {
+        assert((num == 1) || (num == 2) || (num == 4) ||
+               (num == 8) || (num == 16) || (num == 32));
+        first = ics_find_free_block(ics, num, num);
+    } else {
+        first = ics_find_free_block(ics, num, 1);
+    }
+
+    if (first < 0) {
+        error_setg(errp, "can't find a free %d-IRQ block", num);
+        return -1;
+    }
+
+    return first + ics->offset;
+}
+
 /*
  * Allocate the IRQ number and set the IRQ type, LSI or MSI
  */
@@ -3888,6 +3918,40 @@ int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
     return first;
 }
 
+int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
+                    Error **errp)
+{
+    ICSState *ics = spapr->ics;
+    int i;
+    int srcno = irq - ics->offset;
+    int ret = 0;
+
+    assert(ics);
+
+    if (!ics_valid_irq(ics, irq) || !ics_valid_irq(ics, irq + num)) {
+        return -1;
+    }
+
+    for (i = srcno; i < srcno + num; ++i) {
+        if (ICS_IRQ_FREE(ics, i)) {
+            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
+        } else {
+            error_setg(errp, "IRQ %d is not free", i + ics->offset);
+            ret = -1;
+            break;
+        }
+    }
+
+    /* we could call spapr_irq_free() when rolling back */
+    if (ret) {
+        while (--i >= srcno) {
+            memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
+        }
+    }
+
+    return ret;
+}
+
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
 {
     ICSState *ics = spapr->ics;
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 86836f0626dc..3b6ae7272092 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -707,13 +707,18 @@ void spapr_clear_pending_events(sPAPRMachineState *spapr)
 
 void spapr_events_init(sPAPRMachineState *spapr)
 {
+    int epow_irq;
+
+    epow_irq = spapr_irq_findone(spapr, &error_fatal);
+
+    spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
+
     QTAILQ_INIT(&spapr->pending_events);
 
     spapr->event_sources = spapr_event_sources_new();
 
     spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_EPOW,
-                                 spapr_irq_alloc(spapr, 0, false,
-                                                  &error_fatal));
+                                 epow_irq);
 
     /* NOTE: if machine supports modern/dedicated hotplug event source,
      * we add it to the device-tree unconditionally. This means we may
@@ -724,9 +729,14 @@ void spapr_events_init(sPAPRMachineState *spapr)
      * checking that it's enabled.
      */
     if (spapr->use_hotplug_event_source) {
+        int hp_irq;
+
+        hp_irq = spapr_irq_findone(spapr, &error_fatal);
+
+        spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
+
         spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
-                                     spapr_irq_alloc(spapr, 0, false,
-                                                      &error_fatal));
+                                     hp_irq);
     }
 
     spapr->epow_notifier.notify = spapr_powerdown_req;
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index f936ce63effa..7394c62b4a8b 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -371,8 +371,14 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
     }
 
     /* Allocate MSIs */
-    irq = spapr_irq_alloc_block(spapr, req_num, false,
-                           ret_intr_type == RTAS_TYPE_MSI, &err);
+    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
+    if (err) {
+        error_reportf_err(err, "Can't allocate MSIs for device %x: ",
+                          config_addr);
+        rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
+        return;
+    }
+    spapr_irq_claim(spapr, irq, req_num, false, &err);
     if (err) {
         error_reportf_err(err, "Can't allocate MSIs for device %x: ",
                           config_addr);
@@ -1698,7 +1704,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
         uint32_t irq;
         Error *local_err = NULL;
 
-        irq = spapr_irq_alloc_block(spapr, 1, true, false, &local_err);
+        irq = spapr_irq_findone(spapr, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            error_prepend(errp, "can't allocate LSIs: ");
+            return;
+        }
+
+        spapr_irq_claim(spapr, irq, 1, true, &local_err);
         if (local_err) {
             error_propagate(errp, local_err);
             error_prepend(errp, "can't allocate LSIs: ");
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index 4555c648a8e2..ad9b56e28447 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -475,7 +475,15 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
         dev->qdev.id = id;
     }
 
-    dev->irq = spapr_irq_alloc(spapr, dev->irq, false, &local_err);
+    if (!dev->irq) {
+        dev->irq = spapr_irq_findone(spapr, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+
+    spapr_irq_claim(spapr, dev->irq, 1, false, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 2/3] spapr: remove unused spapr_irq routines
  2018-06-15 11:53 [Qemu-devel] [PATCH 0/3] introduce a fixed IRQ number space Cédric Le Goater
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence Cédric Le Goater
@ 2018-06-15 11:53 ` Cédric Le Goater
  2018-06-15 13:30   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
  2018-06-18  4:00   ` [Qemu-devel] " David Gibson
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 3/3] spapr: introduce a fixed IRQ number space Cédric Le Goater
  2 siblings, 2 replies; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-15 11:53 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

spapr_irq_alloc_block and spapr_irq_alloc() are now deprecated.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 include/hw/ppc/spapr.h |  4 ---
 hw/ppc/spapr.c         | 80 +-------------------------------------------------
 2 files changed, 1 insertion(+), 83 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 6088f44c1b2a..49cda04b1493 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -772,10 +772,6 @@ int spapr_get_vcpu_id(PowerPCCPU *cpu);
 void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp);
 PowerPCCPU *spapr_find_cpu(int vcpu_id);
 
-int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
-                    Error **errp);
-int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
-                          bool align, Error **errp);
 int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
                    Error **errp);
 #define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b1d19b328166..c464951747e3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3840,84 +3840,6 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
     return first + ics->offset;
 }
 
-/*
- * Allocate the IRQ number and set the IRQ type, LSI or MSI
- */
-static void spapr_irq_set_lsi(sPAPRMachineState *spapr, int irq, bool lsi)
-{
-    ics_set_irq_type(spapr->ics, irq - spapr->ics->offset, lsi);
-}
-
-int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
-                    Error **errp)
-{
-    ICSState *ics = spapr->ics;
-    int irq;
-
-    assert(ics);
-
-    if (irq_hint) {
-        if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) {
-            error_setg(errp, "can't allocate IRQ %d: already in use", irq_hint);
-            return -1;
-        }
-        irq = irq_hint;
-    } else {
-        irq = ics_find_free_block(ics, 1, 1);
-        if (irq < 0) {
-            error_setg(errp, "can't allocate IRQ: no IRQ left");
-            return -1;
-        }
-        irq += ics->offset;
-    }
-
-    spapr_irq_set_lsi(spapr, irq, lsi);
-    trace_spapr_irq_alloc(irq);
-
-    return irq;
-}
-
-/*
- * Allocate block of consecutive IRQs, and return the number of the first IRQ in
- * the block. If align==true, aligns the first IRQ number to num.
- */
-int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
-                          bool align, Error **errp)
-{
-    ICSState *ics = spapr->ics;
-    int i, first = -1;
-
-    assert(ics);
-
-    /*
-     * MSIMesage::data is used for storing VIRQ so
-     * it has to be aligned to num to support multiple
-     * MSI vectors. MSI-X is not affected by this.
-     * The hint is used for the first IRQ, the rest should
-     * be allocated continuously.
-     */
-    if (align) {
-        assert((num == 1) || (num == 2) || (num == 4) ||
-               (num == 8) || (num == 16) || (num == 32));
-        first = ics_find_free_block(ics, num, num);
-    } else {
-        first = ics_find_free_block(ics, num, 1);
-    }
-    if (first < 0) {
-        error_setg(errp, "can't find a free %d-IRQ block", num);
-        return -1;
-    }
-
-    first += ics->offset;
-    for (i = first; i < first + num; ++i) {
-        spapr_irq_set_lsi(spapr, i, lsi);
-    }
-
-    trace_spapr_irq_alloc_block(first, num, lsi, align);
-
-    return first;
-}
-
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
                     Error **errp)
 {
@@ -3934,7 +3856,7 @@ int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
 
     for (i = srcno; i < srcno + num; ++i) {
         if (ICS_IRQ_FREE(ics, i)) {
-            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
+            ics_set_irq_type(ics, i, lsi);
         } else {
             error_setg(errp, "IRQ %d is not free", i + ics->offset);
             ret = -1;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-15 11:53 [Qemu-devel] [PATCH 0/3] introduce a fixed IRQ number space Cédric Le Goater
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence Cédric Le Goater
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 2/3] spapr: remove unused spapr_irq routines Cédric Le Goater
@ 2018-06-15 11:53 ` Cédric Le Goater
  2018-06-15 16:03   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
  2 siblings, 1 reply; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-15 11:53 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

This proposal introduces a new IRQ number space layout using static
numbers for all devices and a bitmap allocator for the MSI numbers
which are negotiated by the guest at runtime.

The previous layout is kept in machines raising the 'xics_legacy'
flag.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 include/hw/ppc/spapr.h     |  4 ++++
 include/hw/ppc/spapr_irq.h | 30 ++++++++++++++++++++++++
 hw/ppc/spapr.c             | 31 +++++++++++++++++++++++++
 hw/ppc/spapr_events.c      | 12 ++++++++--
 hw/ppc/spapr_irq.c         | 57 ++++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_pci.c         | 28 ++++++++++++++++++-----
 hw/ppc/spapr_vio.c         | 15 ++++++++----
 hw/ppc/Makefile.objs       |  2 +-
 8 files changed, 166 insertions(+), 13 deletions(-)
 create mode 100644 include/hw/ppc/spapr_irq.h
 create mode 100644 hw/ppc/spapr_irq.c

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 49cda04b1493..094c7780130e 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -7,6 +7,7 @@
 #include "hw/ppc/spapr_drc.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/ppc/spapr_ovec.h"
+#include "hw/ppc/spapr_irq.h"
 
 struct VIOsPAPRBus;
 struct sPAPRPHBState;
@@ -164,6 +165,9 @@ struct sPAPRMachineState {
     char *kvm_type;
 
     const char *icp_type;
+    bool xics_legacy;
+    int32_t irq_map_nr;
+    unsigned long *irq_map;
 
     bool cmd_line_caps[SPAPR_CAP_NUM];
     sPAPRCapabilities def, eff, mig;
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
new file mode 100644
index 000000000000..07e36d1c7e74
--- /dev/null
+++ b/include/hw/ppc/spapr_irq.h
@@ -0,0 +1,30 @@
+/*
+ * QEMU PowerPC sPAPR IRQ backend definitions
+ *
+ * Copyright (c) 2018, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+#ifndef HW_SPAPR_IRQ_H
+#define HW_SPAPR_IRQ_H
+
+/*
+ * IRQ ranges per type
+ */
+#define SPAPR_IRQ_EPOW       0x1000  /* XICS_IRQ_BASE offset */
+#define SPAPR_IRQ_HOTPLUG    0x1001
+#define SPAPR_IRQ_VIO        0x1040
+#define SPAPR_IRQ_PCI_LSI    0x1080
+
+#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered
+                                      * by the bitmap allocator */
+
+void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
+                       Error **errp);
+int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
+                        Error **errp);
+void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num);
+void spapr_irq_msi_reset(sPAPRMachineState *spapr);
+
+#endif
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index c464951747e3..32aa5d869d9c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -187,6 +187,11 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
 
+    /* Initialize the MSI IRQ allocator. */
+    if (!spapr->xics_legacy) {
+        spapr_irq_msi_init(spapr, nr_irqs, &error_fatal);
+    }
+
     if (kvm_enabled()) {
         if (machine_kernel_irqchip_allowed(machine) &&
             !xics_kvm_init(spapr, errp)) {
@@ -1629,6 +1634,10 @@ static void spapr_machine_reset(void)
         ppc_set_compat(first_ppc_cpu, spapr->max_compat_pvr, &error_fatal);
     }
 
+    if (!spapr->xics_legacy) {
+        spapr_irq_msi_reset(spapr);
+    }
+
     qemu_devices_reset();
 
     /* DRC reset may cause a device to be unplugged. This will cause troubles
@@ -1903,6 +1912,24 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
     },
 };
 
+static bool spapr_irq_map_needed(void *opaque)
+{
+    sPAPRMachineState *spapr = opaque;
+
+    return spapr->irq_map && !bitmap_empty(spapr->irq_map, spapr->irq_map_nr);
+}
+
+static const VMStateDescription vmstate_spapr_irq_map = {
+    .name = "spapr_irq_map",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_irq_map_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, irq_map_nr),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static const VMStateDescription vmstate_spapr = {
     .name = "spapr",
     .version_id = 3,
@@ -1930,6 +1957,7 @@ static const VMStateDescription vmstate_spapr = {
         &vmstate_spapr_cap_cfpc,
         &vmstate_spapr_cap_sbbc,
         &vmstate_spapr_cap_ibs,
+        &vmstate_spapr_irq_map,
         NULL
     }
 };
@@ -4101,7 +4129,10 @@ DEFINE_SPAPR_MACHINE(3_0, "3.0", true);
 
 static void spapr_machine_2_12_instance_options(MachineState *machine)
 {
+    sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
+
     spapr_machine_3_0_instance_options(machine);
+    spapr->xics_legacy = true;
 }
 
 static void spapr_machine_2_12_class_options(MachineClass *mc)
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 3b6ae7272092..1ef7ce7f1509 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -709,7 +709,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
 {
     int epow_irq;
 
-    epow_irq = spapr_irq_findone(spapr, &error_fatal);
+    if (spapr->xics_legacy) {
+        epow_irq = spapr_irq_findone(spapr, &error_fatal);
+    } else {
+        epow_irq = SPAPR_IRQ_EPOW;
+    }
 
     spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
 
@@ -731,7 +735,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
     if (spapr->use_hotplug_event_source) {
         int hp_irq;
 
-        hp_irq = spapr_irq_findone(spapr, &error_fatal);
+        if (spapr->xics_legacy) {
+            hp_irq = spapr_irq_findone(spapr, &error_fatal);
+        } else {
+            hp_irq = SPAPR_IRQ_HOTPLUG;
+        }
 
         spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
 
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
new file mode 100644
index 000000000000..55fa764c85d1
--- /dev/null
+++ b/hw/ppc/spapr_irq.c
@@ -0,0 +1,57 @@
+/*
+ * QEMU PowerPC sPAPR IRQ interface
+ *
+ * Copyright (c) 2018, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/ppc/spapr.h"
+#include "hw/ppc/xics.h"
+
+void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
+                        Error **errp)
+{
+    spapr->irq_map_nr = XICS_IRQ_BASE + nr_irqs - SPAPR_IRQ_MSI;
+    spapr->irq_map = bitmap_new(spapr->irq_map_nr);
+}
+
+int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
+                        Error **errp)
+{
+    int irq;
+
+    /*
+     * The 'align_mask' parameter of bitmap_find_next_zero_area()
+     * should be one less than a power of 2; 0 means no
+     * alignment. Adapt the 'align' value of the former allocator
+     * to fit the requirements of bitmap_find_next_zero_area()
+     */
+    align -= 1;
+
+    irq = bitmap_find_next_zero_area(spapr->irq_map, spapr->irq_map_nr, 0, num,
+                                     align);
+    if (irq == spapr->irq_map_nr) {
+        error_setg(errp, "can't find a free %d-IRQ block", num);
+        return -1;
+    }
+
+    bitmap_set(spapr->irq_map, irq, num);
+
+    return irq + SPAPR_IRQ_MSI;
+}
+
+void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num)
+{
+    bitmap_clear(spapr->irq_map, irq - SPAPR_IRQ_MSI, num);
+}
+
+void spapr_irq_msi_reset(sPAPRMachineState *spapr)
+{
+    bitmap_clear(spapr->irq_map, 0, spapr->irq_map_nr);
+}
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 7394c62b4a8b..f3c4a25fb482 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -333,6 +333,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
             return;
         }
 
+        if (!spapr->xics_legacy) {
+            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
+        }
         spapr_irq_free(spapr, msi->first_irq, msi->num);
         if (msi_present(pdev)) {
             spapr_msi_setmsg(pdev, 0, false, 0, 0);
@@ -371,7 +374,13 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
     }
 
     /* Allocate MSIs */
-    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
+    if (spapr->xics_legacy) {
+        irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI,
+                             &err);
+    } else {
+        irq = spapr_irq_msi_alloc(spapr, req_num,
+                                  ret_intr_type == RTAS_TYPE_MSI, &err);
+    }
     if (err) {
         error_reportf_err(err, "Can't allocate MSIs for device %x: ",
                           config_addr);
@@ -388,6 +397,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 
     /* Release previous MSIs */
     if (msi) {
+        if (!spapr->xics_legacy) {
+            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
+        }
         spapr_irq_free(spapr, msi->first_irq, msi->num);
         g_hash_table_remove(phb->msi, &config_addr);
     }
@@ -1704,11 +1716,15 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
         uint32_t irq;
         Error *local_err = NULL;
 
-        irq = spapr_irq_findone(spapr, &local_err);
-        if (local_err) {
-            error_propagate(errp, local_err);
-            error_prepend(errp, "can't allocate LSIs: ");
-            return;
+        if (spapr->xics_legacy) {
+            irq = spapr_irq_findone(spapr, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                error_prepend(errp, "can't allocate LSIs: ");
+                return;
+            }
+        } else {
+            irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i;
         }
 
         spapr_irq_claim(spapr, irq, 1, true, &local_err);
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index ad9b56e28447..7707bda6a38d 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -436,6 +436,9 @@ static void spapr_vio_busdev_reset(DeviceState *qdev)
     }
 }
 
+/* TODO : poor VIO device indexing ... */
+static uint32_t vio_index;
+
 static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
@@ -476,10 +479,14 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
     }
 
     if (!dev->irq) {
-        dev->irq = spapr_irq_findone(spapr, &local_err);
-        if (local_err) {
-            error_propagate(errp, local_err);
-            return;
+        if (spapr->xics_legacy) {
+            dev->irq = spapr_irq_findone(spapr, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                return;
+            }
+        } else {
+            dev->irq = SPAPR_IRQ_VIO + vio_index++;
         }
     }
 
diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index 86d82a6ec3ac..4fe3b7804d43 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -4,7 +4,7 @@ obj-y += ppc.o ppc_booke.o fdt.o
 obj-$(CONFIG_PSERIES) += spapr.o spapr_caps.o spapr_vio.o spapr_events.o
 obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
 obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
-obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o
+obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o spapr_irq.o
 # IBM PowerNV
 obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o pnv_occ.o pnv_bmc.o
 ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/3] spapr: split the IRQ allocation sequence
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence Cédric Le Goater
@ 2018-06-15 13:14   ` Greg Kurz
  2018-06-15 17:21     ` Cédric Le Goater
  2018-06-18  4:00   ` [Qemu-devel] " David Gibson
  1 sibling, 1 reply; 21+ messages in thread
From: Greg Kurz @ 2018-06-15 13:14 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: David Gibson, qemu-ppc, qemu-devel

On Fri, 15 Jun 2018 13:53:01 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> Today, when a device requests for IRQ number in a sPAPR machine, the
> spapr_irq_alloc() routine first scans the ICSState status array to
> find an empty slot and then performs the assignement of the selected
> numbers. Split this sequence in two distinct routines : spapr_irq_find()
> for lookups and spapr_irq_claim() for claiming the IRQ numbers.
> 
> This will ease the introduction of a static layout of IRQ numbers.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  include/hw/ppc/spapr.h |  5 ++++
>  hw/ppc/spapr.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_events.c  | 18 ++++++++++----
>  hw/ppc/spapr_pci.c     | 19 ++++++++++++---
>  hw/ppc/spapr_vio.c     | 10 +++++++-
>  5 files changed, 108 insertions(+), 8 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 3388750fc795..6088f44c1b2a 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -776,6 +776,11 @@ int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
>                      Error **errp);
>  int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
>                            bool align, Error **errp);
> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
> +                   Error **errp);
> +#define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
> +                    Error **errp);
>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
>  
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index f59999daacfc..b1d19b328166 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3810,6 +3810,36 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>      return -1;
>  }
>  
> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
> +{
> +    ICSState *ics = spapr->ics;
> +    int first = -1;
> +
> +    assert(ics);
> +
> +    /*
> +     * MSIMesage::data is used for storing VIRQ so
> +     * it has to be aligned to num to support multiple
> +     * MSI vectors. MSI-X is not affected by this.
> +     * The hint is used for the first IRQ, the rest should
> +     * be allocated continuously.
> +     */
> +    if (align) {
> +        assert((num == 1) || (num == 2) || (num == 4) ||
> +               (num == 8) || (num == 16) || (num == 32));
> +        first = ics_find_free_block(ics, num, num);
> +    } else {
> +        first = ics_find_free_block(ics, num, 1);
> +    }
> +
> +    if (first < 0) {
> +        error_setg(errp, "can't find a free %d-IRQ block", num);
> +        return -1;
> +    }
> +
> +    return first + ics->offset;
> +}
> +
>  /*
>   * Allocate the IRQ number and set the IRQ type, LSI or MSI
>   */
> @@ -3888,6 +3918,40 @@ int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
>      return first;
>  }
>  
> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
> +                    Error **errp)
> +{
> +    ICSState *ics = spapr->ics;
> +    int i;
> +    int srcno = irq - ics->offset;
> +    int ret = 0;
> +
> +    assert(ics);
> +
> +    if (!ics_valid_irq(ics, irq) || !ics_valid_irq(ics, irq + num)) {

I guess it is okay to assume that if the first and last irqs are valid,
so are all numbers in between. Shouldn't the second check be against
irq + num - 1 though ?

This lacks an error_setg() BTW.

> +        return -1;
> +    }
> +
> +    for (i = srcno; i < srcno + num; ++i) {
> +        if (ICS_IRQ_FREE(ics, i)) {
> +            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
> +        } else {
> +            error_setg(errp, "IRQ %d is not free", i + ics->offset);
> +            ret = -1;
> +            break;
> +        }
> +    }
> +
> +    /* we could call spapr_irq_free() when rolling back */
> +    if (ret) {
> +        while (--i >= srcno) {
> +            memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
> +        }
> +    }

Hmm... I guess we should free the whole range otherwise we leak
srcno + num - i irqs, preferably using spapr_irq_free() to match
the traces from spapr_irq_alloc().

Alternatively, the rollback could be pushed to the callers.

Rest looks good.

> +
> +    return ret;
> +}
> +
>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
>  {
>      ICSState *ics = spapr->ics;
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 86836f0626dc..3b6ae7272092 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -707,13 +707,18 @@ void spapr_clear_pending_events(sPAPRMachineState *spapr)
>  
>  void spapr_events_init(sPAPRMachineState *spapr)
>  {
> +    int epow_irq;
> +
> +    epow_irq = spapr_irq_findone(spapr, &error_fatal);
> +
> +    spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
> +
>      QTAILQ_INIT(&spapr->pending_events);
>  
>      spapr->event_sources = spapr_event_sources_new();
>  
>      spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_EPOW,
> -                                 spapr_irq_alloc(spapr, 0, false,
> -                                                  &error_fatal));
> +                                 epow_irq);
>  
>      /* NOTE: if machine supports modern/dedicated hotplug event source,
>       * we add it to the device-tree unconditionally. This means we may
> @@ -724,9 +729,14 @@ void spapr_events_init(sPAPRMachineState *spapr)
>       * checking that it's enabled.
>       */
>      if (spapr->use_hotplug_event_source) {
> +        int hp_irq;
> +
> +        hp_irq = spapr_irq_findone(spapr, &error_fatal);
> +
> +        spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
> +
>          spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
> -                                     spapr_irq_alloc(spapr, 0, false,
> -                                                      &error_fatal));
> +                                     hp_irq);
>      }
>  
>      spapr->epow_notifier.notify = spapr_powerdown_req;
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index f936ce63effa..7394c62b4a8b 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -371,8 +371,14 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>      }
>  
>      /* Allocate MSIs */
> -    irq = spapr_irq_alloc_block(spapr, req_num, false,
> -                           ret_intr_type == RTAS_TYPE_MSI, &err);
> +    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
> +    if (err) {
> +        error_reportf_err(err, "Can't allocate MSIs for device %x: ",
> +                          config_addr);
> +        rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
> +        return;
> +    }
> +    spapr_irq_claim(spapr, irq, req_num, false, &err);
>      if (err) {
>          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
>                            config_addr);
> @@ -1698,7 +1704,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>          uint32_t irq;
>          Error *local_err = NULL;
>  
> -        irq = spapr_irq_alloc_block(spapr, 1, true, false, &local_err);
> +        irq = spapr_irq_findone(spapr, &local_err);
> +        if (local_err) {
> +            error_propagate(errp, local_err);
> +            error_prepend(errp, "can't allocate LSIs: ");
> +            return;
> +        }
> +
> +        spapr_irq_claim(spapr, irq, 1, true, &local_err);
>          if (local_err) {
>              error_propagate(errp, local_err);
>              error_prepend(errp, "can't allocate LSIs: ");
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index 4555c648a8e2..ad9b56e28447 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -475,7 +475,15 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>          dev->qdev.id = id;
>      }
>  
> -    dev->irq = spapr_irq_alloc(spapr, dev->irq, false, &local_err);
> +    if (!dev->irq) {
> +        dev->irq = spapr_irq_findone(spapr, &local_err);
> +        if (local_err) {
> +            error_propagate(errp, local_err);
> +            return;
> +        }
> +    }
> +
> +    spapr_irq_claim(spapr, dev->irq, 1, false, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/3] spapr: remove unused spapr_irq routines
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 2/3] spapr: remove unused spapr_irq routines Cédric Le Goater
@ 2018-06-15 13:30   ` Greg Kurz
  2018-06-18  4:00   ` [Qemu-devel] " David Gibson
  1 sibling, 0 replies; 21+ messages in thread
From: Greg Kurz @ 2018-06-15 13:30 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: David Gibson, qemu-ppc, qemu-devel

On Fri, 15 Jun 2018 13:53:02 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> spapr_irq_alloc_block and spapr_irq_alloc() are now deprecated.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  include/hw/ppc/spapr.h |  4 ---
>  hw/ppc/spapr.c         | 80 +-------------------------------------------------
>  2 files changed, 1 insertion(+), 83 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 6088f44c1b2a..49cda04b1493 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -772,10 +772,6 @@ int spapr_get_vcpu_id(PowerPCCPU *cpu);
>  void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp);
>  PowerPCCPU *spapr_find_cpu(int vcpu_id);
>  
> -int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
> -                    Error **errp);
> -int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
> -                          bool align, Error **errp);
>  int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
>                     Error **errp);
>  #define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b1d19b328166..c464951747e3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3840,84 +3840,6 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
>      return first + ics->offset;
>  }
>  
> -/*
> - * Allocate the IRQ number and set the IRQ type, LSI or MSI
> - */
> -static void spapr_irq_set_lsi(sPAPRMachineState *spapr, int irq, bool lsi)
> -{
> -    ics_set_irq_type(spapr->ics, irq - spapr->ics->offset, lsi);
> -}
> -
> -int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
> -                    Error **errp)
> -{
> -    ICSState *ics = spapr->ics;
> -    int irq;
> -
> -    assert(ics);
> -
> -    if (irq_hint) {
> -        if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) {
> -            error_setg(errp, "can't allocate IRQ %d: already in use", irq_hint);
> -            return -1;
> -        }
> -        irq = irq_hint;
> -    } else {
> -        irq = ics_find_free_block(ics, 1, 1);
> -        if (irq < 0) {
> -            error_setg(errp, "can't allocate IRQ: no IRQ left");
> -            return -1;
> -        }
> -        irq += ics->offset;
> -    }
> -
> -    spapr_irq_set_lsi(spapr, irq, lsi);
> -    trace_spapr_irq_alloc(irq);
> -
> -    return irq;
> -}
> -
> -/*
> - * Allocate block of consecutive IRQs, and return the number of the first IRQ in
> - * the block. If align==true, aligns the first IRQ number to num.
> - */
> -int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
> -                          bool align, Error **errp)
> -{
> -    ICSState *ics = spapr->ics;
> -    int i, first = -1;
> -
> -    assert(ics);
> -
> -    /*
> -     * MSIMesage::data is used for storing VIRQ so
> -     * it has to be aligned to num to support multiple
> -     * MSI vectors. MSI-X is not affected by this.
> -     * The hint is used for the first IRQ, the rest should
> -     * be allocated continuously.
> -     */
> -    if (align) {
> -        assert((num == 1) || (num == 2) || (num == 4) ||
> -               (num == 8) || (num == 16) || (num == 32));
> -        first = ics_find_free_block(ics, num, num);
> -    } else {
> -        first = ics_find_free_block(ics, num, 1);
> -    }
> -    if (first < 0) {
> -        error_setg(errp, "can't find a free %d-IRQ block", num);
> -        return -1;
> -    }
> -
> -    first += ics->offset;
> -    for (i = first; i < first + num; ++i) {
> -        spapr_irq_set_lsi(spapr, i, lsi);
> -    }
> -
> -    trace_spapr_irq_alloc_block(first, num, lsi, align);
> -
> -    return first;
> -}
> -
>  int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
>                      Error **errp)
>  {
> @@ -3934,7 +3856,7 @@ int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
>  
>      for (i = srcno; i < srcno + num; ++i) {
>          if (ICS_IRQ_FREE(ics, i)) {
> -            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
> +            ics_set_irq_type(ics, i, lsi);
>          } else {
>              error_setg(errp, "IRQ %d is not free", i + ics->offset);
>              ret = -1;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 3/3] spapr: introduce a fixed IRQ number space Cédric Le Goater
@ 2018-06-15 16:03   ` Greg Kurz
  2018-06-15 17:18     ` Cédric Le Goater
  2018-06-18 12:30     ` David Gibson
  0 siblings, 2 replies; 21+ messages in thread
From: Greg Kurz @ 2018-06-15 16:03 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: David Gibson, qemu-ppc, qemu-devel

On Fri, 15 Jun 2018 13:53:03 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> This proposal introduces a new IRQ number space layout using static
> numbers for all devices and a bitmap allocator for the MSI numbers
> which are negotiated by the guest at runtime.
> 
> The previous layout is kept in machines raising the 'xics_legacy'
> flag.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  include/hw/ppc/spapr.h     |  4 ++++
>  include/hw/ppc/spapr_irq.h | 30 ++++++++++++++++++++++++
>  hw/ppc/spapr.c             | 31 +++++++++++++++++++++++++
>  hw/ppc/spapr_events.c      | 12 ++++++++--
>  hw/ppc/spapr_irq.c         | 57 ++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_pci.c         | 28 ++++++++++++++++++-----
>  hw/ppc/spapr_vio.c         | 15 ++++++++----
>  hw/ppc/Makefile.objs       |  2 +-
>  8 files changed, 166 insertions(+), 13 deletions(-)
>  create mode 100644 include/hw/ppc/spapr_irq.h
>  create mode 100644 hw/ppc/spapr_irq.c
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 49cda04b1493..094c7780130e 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -7,6 +7,7 @@
>  #include "hw/ppc/spapr_drc.h"
>  #include "hw/mem/pc-dimm.h"
>  #include "hw/ppc/spapr_ovec.h"
> +#include "hw/ppc/spapr_irq.h"
>  
>  struct VIOsPAPRBus;
>  struct sPAPRPHBState;
> @@ -164,6 +165,9 @@ struct sPAPRMachineState {
>      char *kvm_type;
>  
>      const char *icp_type;
> +    bool xics_legacy;
> +    int32_t irq_map_nr;
> +    unsigned long *irq_map;
>  
>      bool cmd_line_caps[SPAPR_CAP_NUM];
>      sPAPRCapabilities def, eff, mig;
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> new file mode 100644
> index 000000000000..07e36d1c7e74
> --- /dev/null
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -0,0 +1,30 @@
> +/*
> + * QEMU PowerPC sPAPR IRQ backend definitions
> + *
> + * Copyright (c) 2018, IBM Corporation.
> + *
> + * This code is licensed under the GPL version 2 or later. See the
> + * COPYING file in the top-level directory.
> + */
> +#ifndef HW_SPAPR_IRQ_H
> +#define HW_SPAPR_IRQ_H
> +
> +/*
> + * IRQ ranges per type

Well, range offsets actually. Maybe you could document the range size
as well for each type.

> + */
> +#define SPAPR_IRQ_EPOW       0x1000  /* XICS_IRQ_BASE offset */
> +#define SPAPR_IRQ_HOTPLUG    0x1001

This is just 1 irq, ie, range 0x1002-0x103F is unused

> +#define SPAPR_IRQ_VIO        0x1040

This is 1 irq per device, hence up to 64 devices

> +#define SPAPR_IRQ_PCI_LSI    0x1080
> +

This is 4 irqs per PHB, hence up to 32 PHBs. Cool, we're currently
limited to 31 PHBs.

> +#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered

We only support dynamic MSIs with PCI, maybe rename to SPAPR_IRQ_PCI_MSI ?

> +                                      * by the bitmap allocator */

The range size is hence 1k (XICS_IRQS_SPAPR) for the time being.

> +
> +void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
> +                       Error **errp);
                         ^
                     missing space

> +int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
> +                        Error **errp);
> +void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num);
> +void spapr_irq_msi_reset(sPAPRMachineState *spapr);
> +
> +#endif
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index c464951747e3..32aa5d869d9c 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -187,6 +187,11 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
>  
> +    /* Initialize the MSI IRQ allocator. */
> +    if (!spapr->xics_legacy) {
> +        spapr_irq_msi_init(spapr, nr_irqs, &error_fatal);

The error should be propagated to the caller.

> +    }
> +
>      if (kvm_enabled()) {
>          if (machine_kernel_irqchip_allowed(machine) &&
>              !xics_kvm_init(spapr, errp)) {

And I now realize that the way errors are handled here is badly
broken.

In the default case, this function is supposed to try in-kernel XICS
first and if it fails, then fallback to userland XICS. But since we
pass errp down to xics_kvm_init() and the caller happens to pass
&error_fatal, we stop right here instead of trying the fallback.

And if the caller changes to pass a standard &local_err instead,
things would be even worse. With the same scenario, QEMU would
probably exit if it could fallback to the userland XICS, or abort
otherwise because error_set() would be called with an already set
*errp.

Since this beha^wug is mine (commit 3d85885a1b1f3), let me fix it
first :)

> @@ -1629,6 +1634,10 @@ static void spapr_machine_reset(void)
>          ppc_set_compat(first_ppc_cpu, spapr->max_compat_pvr, &error_fatal);
>      }
>  
> +    if (!spapr->xics_legacy) {
> +        spapr_irq_msi_reset(spapr);
> +    }
> +
>      qemu_devices_reset();
>  
>      /* DRC reset may cause a device to be unplugged. This will cause troubles
> @@ -1903,6 +1912,24 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
>      },
>  };
>  
> +static bool spapr_irq_map_needed(void *opaque)
> +{
> +    sPAPRMachineState *spapr = opaque;
> +
> +    return spapr->irq_map && !bitmap_empty(spapr->irq_map, spapr->irq_map_nr);

Maybe check !spapr->xics_legacy as well to preserve backward migration
to older machines ?

> +}
> +
> +static const VMStateDescription vmstate_spapr_irq_map = {
> +    .name = "spapr_irq_map",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = spapr_irq_map_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, irq_map_nr),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  static const VMStateDescription vmstate_spapr = {
>      .name = "spapr",
>      .version_id = 3,
> @@ -1930,6 +1957,7 @@ static const VMStateDescription vmstate_spapr = {
>          &vmstate_spapr_cap_cfpc,
>          &vmstate_spapr_cap_sbbc,
>          &vmstate_spapr_cap_ibs,
> +        &vmstate_spapr_irq_map,
>          NULL
>      }
>  };
> @@ -4101,7 +4129,10 @@ DEFINE_SPAPR_MACHINE(3_0, "3.0", true);
>  
>  static void spapr_machine_2_12_instance_options(MachineState *machine)
>  {
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> +
>      spapr_machine_3_0_instance_options(machine);
> +    spapr->xics_legacy = true;
>  }
>  
>  static void spapr_machine_2_12_class_options(MachineClass *mc)
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 3b6ae7272092..1ef7ce7f1509 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -709,7 +709,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
>  {
>      int epow_irq;
>  
> -    epow_irq = spapr_irq_findone(spapr, &error_fatal);
> +    if (spapr->xics_legacy) {
> +        epow_irq = spapr_irq_findone(spapr, &error_fatal);
> +    } else {
> +        epow_irq = SPAPR_IRQ_EPOW;
> +    }
>  
>      spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
>  
> @@ -731,7 +735,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
>      if (spapr->use_hotplug_event_source) {
>          int hp_irq;
>  
> -        hp_irq = spapr_irq_findone(spapr, &error_fatal);
> +        if (spapr->xics_legacy) {
> +            hp_irq = spapr_irq_findone(spapr, &error_fatal);
> +        } else {
> +            hp_irq = SPAPR_IRQ_HOTPLUG;
> +        }
>  
>          spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
>  
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> new file mode 100644
> index 000000000000..55fa764c85d1
> --- /dev/null
> +++ b/hw/ppc/spapr_irq.c
> @@ -0,0 +1,57 @@
> +/*
> + * QEMU PowerPC sPAPR IRQ interface
> + *
> + * Copyright (c) 2018, IBM Corporation.
> + *
> + * This code is licensed under the GPL version 2 or later. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "qemu/error-report.h"
> +#include "qapi/error.h"
> +#include "hw/ppc/spapr.h"
> +#include "hw/ppc/xics.h"
> +
> +void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
> +                        Error **errp)

What's the errp for ? It looks like this function can never return
an error. Maybe just drop it.

> +{
> +    spapr->irq_map_nr = XICS_IRQ_BASE + nr_irqs - SPAPR_IRQ_MSI;
> +    spapr->irq_map = bitmap_new(spapr->irq_map_nr);
> +}
> +
> +int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
> +                        Error **errp)
> +{
> +    int irq;
> +
> +    /*
> +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
> +     * should be one less than a power of 2; 0 means no
> +     * alignment. Adapt the 'align' value of the former allocator
> +     * to fit the requirements of bitmap_find_next_zero_area()
> +     */
> +    align -= 1;
> +
> +    irq = bitmap_find_next_zero_area(spapr->irq_map, spapr->irq_map_nr, 0, num,
> +                                     align);
> +    if (irq == spapr->irq_map_nr) {
> +        error_setg(errp, "can't find a free %d-IRQ block", num);
> +        return -1;
> +    }
> +
> +    bitmap_set(spapr->irq_map, irq, num);
> +
> +    return irq + SPAPR_IRQ_MSI;
> +}
> +
> +void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num)
> +{
> +    bitmap_clear(spapr->irq_map, irq - SPAPR_IRQ_MSI, num);
> +}
> +
> +void spapr_irq_msi_reset(sPAPRMachineState *spapr)
> +{
> +    bitmap_clear(spapr->irq_map, 0, spapr->irq_map_nr);
> +}
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 7394c62b4a8b..f3c4a25fb482 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -333,6 +333,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>              return;
>          }
>  
> +        if (!spapr->xics_legacy) {
> +            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
> +        }
>          spapr_irq_free(spapr, msi->first_irq, msi->num);

Shouldn't ^^ be in an else block ?

>          if (msi_present(pdev)) {
>              spapr_msi_setmsg(pdev, 0, false, 0, 0);
> @@ -371,7 +374,13 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>      }
>  
>      /* Allocate MSIs */
> -    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
> +    if (spapr->xics_legacy) {
> +        irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI,
> +                             &err);
> +    } else {
> +        irq = spapr_irq_msi_alloc(spapr, req_num,
> +                                  ret_intr_type == RTAS_TYPE_MSI, &err);
> +    }
>      if (err) {
>          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
>                            config_addr);
> @@ -388,6 +397,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>  
>      /* Release previous MSIs */
>      if (msi) {
> +        if (!spapr->xics_legacy) {
> +            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
> +        }
>          spapr_irq_free(spapr, msi->first_irq, msi->num);
>          g_hash_table_remove(phb->msi, &config_addr);
>      }
> @@ -1704,11 +1716,15 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>          uint32_t irq;
>          Error *local_err = NULL;
>  
> -        irq = spapr_irq_findone(spapr, &local_err);
> -        if (local_err) {
> -            error_propagate(errp, local_err);
> -            error_prepend(errp, "can't allocate LSIs: ");
> -            return;
> +        if (spapr->xics_legacy) {
> +            irq = spapr_irq_findone(spapr, &local_err);
> +            if (local_err) {
> +                error_propagate(errp, local_err);
> +                error_prepend(errp, "can't allocate LSIs: ");
> +                return;
> +            }
> +        } else {
> +            irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i;
>          }
>  
>          spapr_irq_claim(spapr, irq, 1, true, &local_err);
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index ad9b56e28447..7707bda6a38d 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -436,6 +436,9 @@ static void spapr_vio_busdev_reset(DeviceState *qdev)
>      }
>  }
>  
> +/* TODO : poor VIO device indexing ... */
> +static uint32_t vio_index;
> +
>  static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> @@ -476,10 +479,14 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>      }
>  
>      if (!dev->irq) {
> -        dev->irq = spapr_irq_findone(spapr, &local_err);
> -        if (local_err) {
> -            error_propagate(errp, local_err);
> -            return;
> +        if (spapr->xics_legacy) {
> +            dev->irq = spapr_irq_findone(spapr, &local_err);
> +            if (local_err) {
> +                error_propagate(errp, local_err);
> +                return;
> +            }
> +        } else {
> +            dev->irq = SPAPR_IRQ_VIO + vio_index++;

This can overlap the next range if we have more than 64 VIO devices...

>          }
>      }
>  
> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> index 86d82a6ec3ac..4fe3b7804d43 100644
> --- a/hw/ppc/Makefile.objs
> +++ b/hw/ppc/Makefile.objs
> @@ -4,7 +4,7 @@ obj-y += ppc.o ppc_booke.o fdt.o
>  obj-$(CONFIG_PSERIES) += spapr.o spapr_caps.o spapr_vio.o spapr_events.o
>  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
>  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
> -obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o
> +obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o spapr_irq.o
>  # IBM PowerNV
>  obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o pnv_occ.o pnv_bmc.o
>  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-15 16:03   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
@ 2018-06-15 17:18     ` Cédric Le Goater
  2018-06-18  8:42       ` Greg Kurz
  2018-06-18 12:30     ` David Gibson
  1 sibling, 1 reply; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-15 17:18 UTC (permalink / raw)
  To: Greg Kurz; +Cc: David Gibson, qemu-ppc, qemu-devel

Hello,

On 06/15/2018 06:03 PM, Greg Kurz wrote:
> On Fri, 15 Jun 2018 13:53:03 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> This proposal introduces a new IRQ number space layout using static
>> numbers for all devices and a bitmap allocator for the MSI numbers
>> which are negotiated by the guest at runtime.
>>
>> The previous layout is kept in machines raising the 'xics_legacy'
>> flag.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  include/hw/ppc/spapr.h     |  4 ++++
>>  include/hw/ppc/spapr_irq.h | 30 ++++++++++++++++++++++++
>>  hw/ppc/spapr.c             | 31 +++++++++++++++++++++++++
>>  hw/ppc/spapr_events.c      | 12 ++++++++--
>>  hw/ppc/spapr_irq.c         | 57 ++++++++++++++++++++++++++++++++++++++++++++++
>>  hw/ppc/spapr_pci.c         | 28 ++++++++++++++++++-----
>>  hw/ppc/spapr_vio.c         | 15 ++++++++----
>>  hw/ppc/Makefile.objs       |  2 +-
>>  8 files changed, 166 insertions(+), 13 deletions(-)
>>  create mode 100644 include/hw/ppc/spapr_irq.h
>>  create mode 100644 hw/ppc/spapr_irq.c
>>
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 49cda04b1493..094c7780130e 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -7,6 +7,7 @@
>>  #include "hw/ppc/spapr_drc.h"
>>  #include "hw/mem/pc-dimm.h"
>>  #include "hw/ppc/spapr_ovec.h"
>> +#include "hw/ppc/spapr_irq.h"
>>  
>>  struct VIOsPAPRBus;
>>  struct sPAPRPHBState;
>> @@ -164,6 +165,9 @@ struct sPAPRMachineState {
>>      char *kvm_type;
>>  
>>      const char *icp_type;
>> +    bool xics_legacy;
>> +    int32_t irq_map_nr;
>> +    unsigned long *irq_map;
>>  
>>      bool cmd_line_caps[SPAPR_CAP_NUM];
>>      sPAPRCapabilities def, eff, mig;
>> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
>> new file mode 100644
>> index 000000000000..07e36d1c7e74
>> --- /dev/null
>> +++ b/include/hw/ppc/spapr_irq.h
>> @@ -0,0 +1,30 @@
>> +/*
>> + * QEMU PowerPC sPAPR IRQ backend definitions
>> + *
>> + * Copyright (c) 2018, IBM Corporation.
>> + *
>> + * This code is licensed under the GPL version 2 or later. See the
>> + * COPYING file in the top-level directory.
>> + */
>> +#ifndef HW_SPAPR_IRQ_H
>> +#define HW_SPAPR_IRQ_H
>> +
>> +/*
>> + * IRQ ranges per type
> 
> Well, range offsets actually. Maybe you could document the range size
> as well for each type.

yes. I am waiting for people to scream : "I want more !"

>> + */
>> +#define SPAPR_IRQ_EPOW       0x1000  /* XICS_IRQ_BASE offset */
>> +#define SPAPR_IRQ_HOTPLUG    0x1001
> 
> This is just 1 irq, ie, range 0x1002-0x103F is unused
> 
>> +#define SPAPR_IRQ_VIO        0x1040
> 
> This is 1 irq per device, hence up to 64 devices
> 
>> +#define SPAPR_IRQ_PCI_LSI    0x1080
>> +
> 
> This is 4 irqs per PHB, hence up to 32 PHBs. Cool, we're currently
> limited to 31 PHBs.
> 
>> +#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered
> 
> We only support dynamic MSIs with PCI, maybe rename to SPAPR_IRQ_PCI_MSI ?

hmm, no. We could have CAPI devices there. remember ? ;)

>> +                                      * by the bitmap allocator */
> 
> The range size is hence 1k (XICS_IRQS_SPAPR) for the time being.

in fact we could this bogus limit and use spapr->irq_map_nr now.

>> +
>> +void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
>> +                       Error **errp);
>                          ^
>                      missing space
>
>> +int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
>> +                        Error **errp);
>> +void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num);
>> +void spapr_irq_msi_reset(sPAPRMachineState *spapr);
>> +
>> +#endif
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index c464951747e3..32aa5d869d9c 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -187,6 +187,11 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
>>  
>> +    /* Initialize the MSI IRQ allocator. */
>> +    if (!spapr->xics_legacy) {
>> +        spapr_irq_msi_init(spapr, nr_irqs, &error_fatal);
> 
> The error should be propagated to the caller.

I think the Error is useless here.


> 
>> +    }
>> +
>>      if (kvm_enabled()) {
>>          if (machine_kernel_irqchip_allowed(machine) &&
>>              !xics_kvm_init(spapr, errp)) {
> 
> And I now realize that the way errors are handled here is badly
> broken.
> 
> In the default case, this function is supposed to try in-kernel XICS
> first and if it fails, then fallback to userland XICS. But since we
> pass errp down to xics_kvm_init() and the caller happens to pass
> &error_fatal, we stop right here instead of trying the fallback.
> 
> And if the caller changes to pass a standard &local_err instead,
> things would be even worse. With the same scenario, QEMU would
> probably exit if it could fallback to the userland XICS, or abort
> otherwise because error_set() would be called with an already set
> *errp.
> 
> Since this beha^wug is mine (commit 3d85885a1b1f3), let me fix it
> first :)
> 
>> @@ -1629,6 +1634,10 @@ static void spapr_machine_reset(void)
>>          ppc_set_compat(first_ppc_cpu, spapr->max_compat_pvr, &error_fatal);
>>      }
>>  
>> +    if (!spapr->xics_legacy) {
>> +        spapr_irq_msi_reset(spapr);
>> +    }
>> +
>>      qemu_devices_reset();
>>  
>>      /* DRC reset may cause a device to be unplugged. This will cause troubles
>> @@ -1903,6 +1912,24 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
>>      },
>>  };
>>  
>> +static bool spapr_irq_map_needed(void *opaque)
>> +{
>> +    sPAPRMachineState *spapr = opaque;
>> +
>> +    return spapr->irq_map && !bitmap_empty(spapr->irq_map, spapr->irq_map_nr);
> 
> Maybe check !spapr->xics_legacy as well to preserve backward migration
> to older machines ?

no need. It is redudant with spapr->irq_map which is not allocated in that case.

>> +}
>> +
>> +static const VMStateDescription vmstate_spapr_irq_map = {
>> +    .name = "spapr_irq_map",
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +    .needed = spapr_irq_map_needed,
>> +    .fields = (VMStateField[]) {
>> +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, irq_map_nr),
>> +        VMSTATE_END_OF_LIST()
>> +    },
>> +};
>> +
>>  static const VMStateDescription vmstate_spapr = {
>>      .name = "spapr",
>>      .version_id = 3,
>> @@ -1930,6 +1957,7 @@ static const VMStateDescription vmstate_spapr = {
>>          &vmstate_spapr_cap_cfpc,
>>          &vmstate_spapr_cap_sbbc,
>>          &vmstate_spapr_cap_ibs,
>> +        &vmstate_spapr_irq_map,
>>          NULL
>>      }
>>  };
>> @@ -4101,7 +4129,10 @@ DEFINE_SPAPR_MACHINE(3_0, "3.0", true);
>>  
>>  static void spapr_machine_2_12_instance_options(MachineState *machine)
>>  {
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
>> +
>>      spapr_machine_3_0_instance_options(machine);
>> +    spapr->xics_legacy = true;
>>  }
>>  
>>  static void spapr_machine_2_12_class_options(MachineClass *mc)
>> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
>> index 3b6ae7272092..1ef7ce7f1509 100644
>> --- a/hw/ppc/spapr_events.c
>> +++ b/hw/ppc/spapr_events.c
>> @@ -709,7 +709,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
>>  {
>>      int epow_irq;
>>  
>> -    epow_irq = spapr_irq_findone(spapr, &error_fatal);
>> +    if (spapr->xics_legacy) {
>> +        epow_irq = spapr_irq_findone(spapr, &error_fatal);
>> +    } else {
>> +        epow_irq = SPAPR_IRQ_EPOW;
>> +    }
>>  
>>      spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
>>  
>> @@ -731,7 +735,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
>>      if (spapr->use_hotplug_event_source) {
>>          int hp_irq;
>>  
>> -        hp_irq = spapr_irq_findone(spapr, &error_fatal);
>> +        if (spapr->xics_legacy) {
>> +            hp_irq = spapr_irq_findone(spapr, &error_fatal);
>> +        } else {
>> +            hp_irq = SPAPR_IRQ_HOTPLUG;
>> +        }
>>  
>>          spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
>>  
>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
>> new file mode 100644
>> index 000000000000..55fa764c85d1
>> --- /dev/null
>> +++ b/hw/ppc/spapr_irq.c
>> @@ -0,0 +1,57 @@
>> +/*
>> + * QEMU PowerPC sPAPR IRQ interface
>> + *
>> + * Copyright (c) 2018, IBM Corporation.
>> + *
>> + * This code is licensed under the GPL version 2 or later. See the
>> + * COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/log.h"
>> +#include "qemu/error-report.h"
>> +#include "qapi/error.h"
>> +#include "hw/ppc/spapr.h"
>> +#include "hw/ppc/xics.h"
>> +
>> +void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
>> +                        Error **errp)
> 
> What's the errp for ? It looks like this function can never return
> an error. Maybe just drop it.

yes

> 
>> +{
>> +    spapr->irq_map_nr = XICS_IRQ_BASE + nr_irqs - SPAPR_IRQ_MSI;
>> +    spapr->irq_map = bitmap_new(spapr->irq_map_nr);
>> +}
>> +
>> +int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
>> +                        Error **errp)
>> +{
>> +    int irq;
>> +
>> +    /*
>> +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
>> +     * should be one less than a power of 2; 0 means no
>> +     * alignment. Adapt the 'align' value of the former allocator
>> +     * to fit the requirements of bitmap_find_next_zero_area()
>> +     */
>> +    align -= 1;
>> +
>> +    irq = bitmap_find_next_zero_area(spapr->irq_map, spapr->irq_map_nr, 0, num,
>> +                                     align);
>> +    if (irq == spapr->irq_map_nr) {
>> +        error_setg(errp, "can't find a free %d-IRQ block", num);
>> +        return -1;
>> +    }
>> +
>> +    bitmap_set(spapr->irq_map, irq, num);
>> +
>> +    return irq + SPAPR_IRQ_MSI;
>> +}
>> +
>> +void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num)
>> +{
>> +    bitmap_clear(spapr->irq_map, irq - SPAPR_IRQ_MSI, num);
>> +}
>> +
>> +void spapr_irq_msi_reset(sPAPRMachineState *spapr)
>> +{
>> +    bitmap_clear(spapr->irq_map, 0, spapr->irq_map_nr);
>> +}
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index 7394c62b4a8b..f3c4a25fb482 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -333,6 +333,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>>              return;
>>          }
>>  
>> +        if (!spapr->xics_legacy) {
>> +            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
>> +        }
>>          spapr_irq_free(spapr, msi->first_irq, msi->num);
> 
> Shouldn't ^^ be in an else block ?

no. one is for the MSI IRQ allocator, the other for the controller. 

> 
>>          if (msi_present(pdev)) {
>>              spapr_msi_setmsg(pdev, 0, false, 0, 0);
>> @@ -371,7 +374,13 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>>      }
>>  
>>      /* Allocate MSIs */
>> -    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
>> +    if (spapr->xics_legacy) {
>> +        irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI,
>> +                             &err);
>> +    } else {
>> +        irq = spapr_irq_msi_alloc(spapr, req_num,
>> +                                  ret_intr_type == RTAS_TYPE_MSI, &err);
>> +    }
>>      if (err) {
>>          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
>>                            config_addr);
>> @@ -388,6 +397,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>>  
>>      /* Release previous MSIs */
>>      if (msi) {
>> +        if (!spapr->xics_legacy) {
>> +            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
>> +        }
>>          spapr_irq_free(spapr, msi->first_irq, msi->num);
>>          g_hash_table_remove(phb->msi, &config_addr);
>>      }
>> @@ -1704,11 +1716,15 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>          uint32_t irq;
>>          Error *local_err = NULL;
>>  
>> -        irq = spapr_irq_findone(spapr, &local_err);
>> -        if (local_err) {
>> -            error_propagate(errp, local_err);
>> -            error_prepend(errp, "can't allocate LSIs: ");
>> -            return;
>> +        if (spapr->xics_legacy) {
>> +            irq = spapr_irq_findone(spapr, &local_err);
>> +            if (local_err) {
>> +                error_propagate(errp, local_err);
>> +                error_prepend(errp, "can't allocate LSIs: ");
>> +                return;
>> +            }
>> +        } else {
>> +            irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i;
>>          }
>>  
>>          spapr_irq_claim(spapr, irq, 1, true, &local_err);
>> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
>> index ad9b56e28447..7707bda6a38d 100644
>> --- a/hw/ppc/spapr_vio.c
>> +++ b/hw/ppc/spapr_vio.c
>> @@ -436,6 +436,9 @@ static void spapr_vio_busdev_reset(DeviceState *qdev)
>>      }
>>  }
>>  
>> +/* TODO : poor VIO device indexing ... */
>> +static uint32_t vio_index;
>> +
>>  static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>> @@ -476,10 +479,14 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>>      }
>>  
>>      if (!dev->irq) {
>> -        dev->irq = spapr_irq_findone(spapr, &local_err);
>> -        if (local_err) {
>> -            error_propagate(errp, local_err);
>> -            return;
>> +        if (spapr->xics_legacy) {
>> +            dev->irq = spapr_irq_findone(spapr, &local_err);
>> +            if (local_err) {
>> +                error_propagate(errp, local_err);
>> +                return;
>> +            }
>> +        } else {
>> +            dev->irq = SPAPR_IRQ_VIO + vio_index++;
> 
> This can overlap the next range if we have more than 64 VIO devices...

yes. but claim() should fail.

Thanks,

C.

>>          }
>>      }
>>  
>> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
>> index 86d82a6ec3ac..4fe3b7804d43 100644
>> --- a/hw/ppc/Makefile.objs
>> +++ b/hw/ppc/Makefile.objs
>> @@ -4,7 +4,7 @@ obj-y += ppc.o ppc_booke.o fdt.o
>>  obj-$(CONFIG_PSERIES) += spapr.o spapr_caps.o spapr_vio.o spapr_events.o
>>  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
>>  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
>> -obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o
>> +obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o spapr_irq.o
>>  # IBM PowerNV
>>  obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o pnv_occ.o pnv_bmc.o
>>  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/3] spapr: split the IRQ allocation sequence
  2018-06-15 13:14   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
@ 2018-06-15 17:21     ` Cédric Le Goater
  2018-06-18  6:52       ` Greg Kurz
  0 siblings, 1 reply; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-15 17:21 UTC (permalink / raw)
  To: Greg Kurz; +Cc: David Gibson, qemu-ppc, qemu-devel

On 06/15/2018 03:14 PM, Greg Kurz wrote:
> On Fri, 15 Jun 2018 13:53:01 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> Today, when a device requests for IRQ number in a sPAPR machine, the
>> spapr_irq_alloc() routine first scans the ICSState status array to
>> find an empty slot and then performs the assignement of the selected
>> numbers. Split this sequence in two distinct routines : spapr_irq_find()
>> for lookups and spapr_irq_claim() for claiming the IRQ numbers.
>>
>> This will ease the introduction of a static layout of IRQ numbers.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  include/hw/ppc/spapr.h |  5 ++++
>>  hw/ppc/spapr.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  hw/ppc/spapr_events.c  | 18 ++++++++++----
>>  hw/ppc/spapr_pci.c     | 19 ++++++++++++---
>>  hw/ppc/spapr_vio.c     | 10 +++++++-
>>  5 files changed, 108 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 3388750fc795..6088f44c1b2a 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -776,6 +776,11 @@ int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
>>                      Error **errp);
>>  int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
>>                            bool align, Error **errp);
>> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
>> +                   Error **errp);
>> +#define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
>> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
>> +                    Error **errp);
>>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
>>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
>>  
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index f59999daacfc..b1d19b328166 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -3810,6 +3810,36 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>>      return -1;
>>  }
>>  
>> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
>> +{
>> +    ICSState *ics = spapr->ics;
>> +    int first = -1;
>> +
>> +    assert(ics);
>> +
>> +    /*
>> +     * MSIMesage::data is used for storing VIRQ so
>> +     * it has to be aligned to num to support multiple
>> +     * MSI vectors. MSI-X is not affected by this.
>> +     * The hint is used for the first IRQ, the rest should
>> +     * be allocated continuously.
>> +     */
>> +    if (align) {
>> +        assert((num == 1) || (num == 2) || (num == 4) ||
>> +               (num == 8) || (num == 16) || (num == 32));
>> +        first = ics_find_free_block(ics, num, num);
>> +    } else {
>> +        first = ics_find_free_block(ics, num, 1);
>> +    }
>> +
>> +    if (first < 0) {
>> +        error_setg(errp, "can't find a free %d-IRQ block", num);
>> +        return -1;
>> +    }
>> +
>> +    return first + ics->offset;
>> +}
>> +
>>  /*
>>   * Allocate the IRQ number and set the IRQ type, LSI or MSI
>>   */
>> @@ -3888,6 +3918,40 @@ int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
>>      return first;
>>  }
>>  
>> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
>> +                    Error **errp)
>> +{
>> +    ICSState *ics = spapr->ics;
>> +    int i;
>> +    int srcno = irq - ics->offset;
>> +    int ret = 0;
>> +
>> +    assert(ics);
>> +
>> +    if (!ics_valid_irq(ics, irq) || !ics_valid_irq(ics, irq + num)) {
> 
> I guess it is okay to assume that if the first and last irqs are valid,
> so are all numbers in between. Shouldn't the second check be against
> irq + num - 1 though ?

yes. thanks.

> This lacks an error_setg() BTW.
> 
>> +        return -1;
>> +    }
>> +
>> +    for (i = srcno; i < srcno + num; ++i) {
>> +        if (ICS_IRQ_FREE(ics, i)) {
>> +            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
>> +        } else {
>> +            error_setg(errp, "IRQ %d is not free", i + ics->offset);
>> +            ret = -1;
>> +            break;
>> +        }
>> +    }
>> +
>> +    /* we could call spapr_irq_free() when rolling back */
>> +    if (ret) {
>> +        while (--i >= srcno) {
>> +            memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
>> +        }
>> +    }
> 
> Hmm... I guess we should free the whole range otherwise we leak
> srcno + num - i irqs, preferably using spapr_irq_free() to match
> the traces from spapr_irq_alloc().

I don't understand. This is undoing what has been done. 

> Alternatively, the rollback could be pushed to the callers.

Yes. Today none is done so we might just do nothing about it
and return an error.
 
> Rest looks good.
> 
>> +
>> +    return ret;
>> +}
>> +
>>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
>>  {
>>      ICSState *ics = spapr->ics;
>> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
>> index 86836f0626dc..3b6ae7272092 100644
>> --- a/hw/ppc/spapr_events.c
>> +++ b/hw/ppc/spapr_events.c
>> @@ -707,13 +707,18 @@ void spapr_clear_pending_events(sPAPRMachineState *spapr)
>>  
>>  void spapr_events_init(sPAPRMachineState *spapr)
>>  {
>> +    int epow_irq;
>> +
>> +    epow_irq = spapr_irq_findone(spapr, &error_fatal);
>> +
>> +    spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
>> +
>>      QTAILQ_INIT(&spapr->pending_events);
>>  
>>      spapr->event_sources = spapr_event_sources_new();
>>  
>>      spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_EPOW,
>> -                                 spapr_irq_alloc(spapr, 0, false,
>> -                                                  &error_fatal));
>> +                                 epow_irq);
>>  
>>      /* NOTE: if machine supports modern/dedicated hotplug event source,
>>       * we add it to the device-tree unconditionally. This means we may
>> @@ -724,9 +729,14 @@ void spapr_events_init(sPAPRMachineState *spapr)
>>       * checking that it's enabled.
>>       */
>>      if (spapr->use_hotplug_event_source) {
>> +        int hp_irq;
>> +
>> +        hp_irq = spapr_irq_findone(spapr, &error_fatal);
>> +
>> +        spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
>> +
>>          spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
>> -                                     spapr_irq_alloc(spapr, 0, false,
>> -                                                      &error_fatal));
>> +                                     hp_irq);
>>      }
>>  
>>      spapr->epow_notifier.notify = spapr_powerdown_req;
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index f936ce63effa..7394c62b4a8b 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -371,8 +371,14 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>>      }
>>  
>>      /* Allocate MSIs */
>> -    irq = spapr_irq_alloc_block(spapr, req_num, false,
>> -                           ret_intr_type == RTAS_TYPE_MSI, &err);
>> +    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
>> +    if (err) {
>> +        error_reportf_err(err, "Can't allocate MSIs for device %x: ",
>> +                          config_addr);
>> +        rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
>> +        return;
>> +    }
>> +    spapr_irq_claim(spapr, irq, req_num, false, &err);
>>      if (err) {
>>          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
>>                            config_addr);
>> @@ -1698,7 +1704,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>          uint32_t irq;
>>          Error *local_err = NULL;
>>  
>> -        irq = spapr_irq_alloc_block(spapr, 1, true, false, &local_err);
>> +        irq = spapr_irq_findone(spapr, &local_err);
>> +        if (local_err) {
>> +            error_propagate(errp, local_err);
>> +            error_prepend(errp, "can't allocate LSIs: ");
>> +            return;
>> +        }
>> +
>> +        spapr_irq_claim(spapr, irq, 1, true, &local_err);
>>          if (local_err) {
>>              error_propagate(errp, local_err);
>>              error_prepend(errp, "can't allocate LSIs: ");
>> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
>> index 4555c648a8e2..ad9b56e28447 100644
>> --- a/hw/ppc/spapr_vio.c
>> +++ b/hw/ppc/spapr_vio.c
>> @@ -475,7 +475,15 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>>          dev->qdev.id = id;
>>      }
>>  
>> -    dev->irq = spapr_irq_alloc(spapr, dev->irq, false, &local_err);
>> +    if (!dev->irq) {
>> +        dev->irq = spapr_irq_findone(spapr, &local_err);
>> +        if (local_err) {
>> +            error_propagate(errp, local_err);
>> +            return;
>> +        }
>> +    }
>> +
>> +    spapr_irq_claim(spapr, dev->irq, 1, false, &local_err);
>>      if (local_err) {
>>          error_propagate(errp, local_err);
>>          return;
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence Cédric Le Goater
  2018-06-15 13:14   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
@ 2018-06-18  4:00   ` David Gibson
  2018-06-18 12:16     ` David Gibson
  1 sibling, 1 reply; 21+ messages in thread
From: David Gibson @ 2018-06-18  4:00 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 8698 bytes --]

On Fri, Jun 15, 2018 at 01:53:01PM +0200, Cédric Le Goater wrote:
> Today, when a device requests for IRQ number in a sPAPR machine, the
> spapr_irq_alloc() routine first scans the ICSState status array to
> find an empty slot and then performs the assignement of the selected
> numbers. Split this sequence in two distinct routines : spapr_irq_find()
> for lookups and spapr_irq_claim() for claiming the IRQ numbers.
> 
> This will ease the introduction of a static layout of IRQ numbers.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  include/hw/ppc/spapr.h |  5 ++++
>  hw/ppc/spapr.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_events.c  | 18 ++++++++++----
>  hw/ppc/spapr_pci.c     | 19 ++++++++++++---
>  hw/ppc/spapr_vio.c     | 10 +++++++-
>  5 files changed, 108 insertions(+), 8 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 3388750fc795..6088f44c1b2a 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -776,6 +776,11 @@ int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
>                      Error **errp);
>  int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
>                            bool align, Error **errp);
> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
> +                   Error **errp);
> +#define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
> +                    Error **errp);

Unlike find, there's no reason we need to atomically claim a block of
irqs.  So I think claim should just grab a single irq.  Callers can
loop if they need multiple irqs.

Other than that and the minor points Greg noted, this looks good.

>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
>  
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index f59999daacfc..b1d19b328166 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3810,6 +3810,36 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>      return -1;
>  }
>  
> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
> +{
> +    ICSState *ics = spapr->ics;
> +    int first = -1;
> +
> +    assert(ics);
> +
> +    /*
> +     * MSIMesage::data is used for storing VIRQ so
> +     * it has to be aligned to num to support multiple
> +     * MSI vectors. MSI-X is not affected by this.
> +     * The hint is used for the first IRQ, the rest should
> +     * be allocated continuously.
> +     */
> +    if (align) {
> +        assert((num == 1) || (num == 2) || (num == 4) ||
> +               (num == 8) || (num == 16) || (num == 32));
> +        first = ics_find_free_block(ics, num, num);
> +    } else {
> +        first = ics_find_free_block(ics, num, 1);
> +    }
> +
> +    if (first < 0) {
> +        error_setg(errp, "can't find a free %d-IRQ block", num);
> +        return -1;
> +    }
> +
> +    return first + ics->offset;
> +}
> +
>  /*
>   * Allocate the IRQ number and set the IRQ type, LSI or MSI
>   */
> @@ -3888,6 +3918,40 @@ int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
>      return first;
>  }
>  
> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
> +                    Error **errp)
> +{
> +    ICSState *ics = spapr->ics;
> +    int i;
> +    int srcno = irq - ics->offset;
> +    int ret = 0;
> +
> +    assert(ics);
> +
> +    if (!ics_valid_irq(ics, irq) || !ics_valid_irq(ics, irq + num)) {
> +        return -1;
> +    }
> +
> +    for (i = srcno; i < srcno + num; ++i) {
> +        if (ICS_IRQ_FREE(ics, i)) {
> +            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
> +        } else {
> +            error_setg(errp, "IRQ %d is not free", i + ics->offset);
> +            ret = -1;
> +            break;
> +        }
> +    }
> +
> +    /* we could call spapr_irq_free() when rolling back */
> +    if (ret) {
> +        while (--i >= srcno) {
> +            memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
> +        }
> +    }
> +
> +    return ret;
> +}
> +
>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
>  {
>      ICSState *ics = spapr->ics;
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 86836f0626dc..3b6ae7272092 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -707,13 +707,18 @@ void spapr_clear_pending_events(sPAPRMachineState *spapr)
>  
>  void spapr_events_init(sPAPRMachineState *spapr)
>  {
> +    int epow_irq;
> +
> +    epow_irq = spapr_irq_findone(spapr, &error_fatal);
> +
> +    spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
> +
>      QTAILQ_INIT(&spapr->pending_events);
>  
>      spapr->event_sources = spapr_event_sources_new();
>  
>      spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_EPOW,
> -                                 spapr_irq_alloc(spapr, 0, false,
> -                                                  &error_fatal));
> +                                 epow_irq);
>  
>      /* NOTE: if machine supports modern/dedicated hotplug event source,
>       * we add it to the device-tree unconditionally. This means we may
> @@ -724,9 +729,14 @@ void spapr_events_init(sPAPRMachineState *spapr)
>       * checking that it's enabled.
>       */
>      if (spapr->use_hotplug_event_source) {
> +        int hp_irq;
> +
> +        hp_irq = spapr_irq_findone(spapr, &error_fatal);
> +
> +        spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
> +
>          spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
> -                                     spapr_irq_alloc(spapr, 0, false,
> -                                                      &error_fatal));
> +                                     hp_irq);
>      }
>  
>      spapr->epow_notifier.notify = spapr_powerdown_req;
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index f936ce63effa..7394c62b4a8b 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -371,8 +371,14 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>      }
>  
>      /* Allocate MSIs */
> -    irq = spapr_irq_alloc_block(spapr, req_num, false,
> -                           ret_intr_type == RTAS_TYPE_MSI, &err);
> +    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
> +    if (err) {
> +        error_reportf_err(err, "Can't allocate MSIs for device %x: ",
> +                          config_addr);
> +        rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
> +        return;
> +    }
> +    spapr_irq_claim(spapr, irq, req_num, false, &err);
>      if (err) {
>          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
>                            config_addr);
> @@ -1698,7 +1704,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>          uint32_t irq;
>          Error *local_err = NULL;
>  
> -        irq = spapr_irq_alloc_block(spapr, 1, true, false, &local_err);
> +        irq = spapr_irq_findone(spapr, &local_err);
> +        if (local_err) {
> +            error_propagate(errp, local_err);
> +            error_prepend(errp, "can't allocate LSIs: ");
> +            return;
> +        }
> +
> +        spapr_irq_claim(spapr, irq, 1, true, &local_err);
>          if (local_err) {
>              error_propagate(errp, local_err);
>              error_prepend(errp, "can't allocate LSIs: ");
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index 4555c648a8e2..ad9b56e28447 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -475,7 +475,15 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>          dev->qdev.id = id;
>      }
>  
> -    dev->irq = spapr_irq_alloc(spapr, dev->irq, false, &local_err);
> +    if (!dev->irq) {
> +        dev->irq = spapr_irq_findone(spapr, &local_err);
> +        if (local_err) {
> +            error_propagate(errp, local_err);
> +            return;
> +        }
> +    }
> +
> +    spapr_irq_claim(spapr, dev->irq, 1, false, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] spapr: remove unused spapr_irq routines
  2018-06-15 11:53 ` [Qemu-devel] [PATCH 2/3] spapr: remove unused spapr_irq routines Cédric Le Goater
  2018-06-15 13:30   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
@ 2018-06-18  4:00   ` David Gibson
  1 sibling, 0 replies; 21+ messages in thread
From: David Gibson @ 2018-06-18  4:00 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 4656 bytes --]

On Fri, Jun 15, 2018 at 01:53:02PM +0200, Cédric Le Goater wrote:
> spapr_irq_alloc_block and spapr_irq_alloc() are now deprecated.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  include/hw/ppc/spapr.h |  4 ---
>  hw/ppc/spapr.c         | 80 +-------------------------------------------------
>  2 files changed, 1 insertion(+), 83 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 6088f44c1b2a..49cda04b1493 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -772,10 +772,6 @@ int spapr_get_vcpu_id(PowerPCCPU *cpu);
>  void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp);
>  PowerPCCPU *spapr_find_cpu(int vcpu_id);
>  
> -int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
> -                    Error **errp);
> -int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
> -                          bool align, Error **errp);
>  int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
>                     Error **errp);
>  #define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b1d19b328166..c464951747e3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3840,84 +3840,6 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
>      return first + ics->offset;
>  }
>  
> -/*
> - * Allocate the IRQ number and set the IRQ type, LSI or MSI
> - */
> -static void spapr_irq_set_lsi(sPAPRMachineState *spapr, int irq, bool lsi)
> -{
> -    ics_set_irq_type(spapr->ics, irq - spapr->ics->offset, lsi);
> -}
> -
> -int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
> -                    Error **errp)
> -{
> -    ICSState *ics = spapr->ics;
> -    int irq;
> -
> -    assert(ics);
> -
> -    if (irq_hint) {
> -        if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) {
> -            error_setg(errp, "can't allocate IRQ %d: already in use", irq_hint);
> -            return -1;
> -        }
> -        irq = irq_hint;
> -    } else {
> -        irq = ics_find_free_block(ics, 1, 1);
> -        if (irq < 0) {
> -            error_setg(errp, "can't allocate IRQ: no IRQ left");
> -            return -1;
> -        }
> -        irq += ics->offset;
> -    }
> -
> -    spapr_irq_set_lsi(spapr, irq, lsi);
> -    trace_spapr_irq_alloc(irq);
> -
> -    return irq;
> -}
> -
> -/*
> - * Allocate block of consecutive IRQs, and return the number of the first IRQ in
> - * the block. If align==true, aligns the first IRQ number to num.
> - */
> -int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
> -                          bool align, Error **errp)
> -{
> -    ICSState *ics = spapr->ics;
> -    int i, first = -1;
> -
> -    assert(ics);
> -
> -    /*
> -     * MSIMesage::data is used for storing VIRQ so
> -     * it has to be aligned to num to support multiple
> -     * MSI vectors. MSI-X is not affected by this.
> -     * The hint is used for the first IRQ, the rest should
> -     * be allocated continuously.
> -     */
> -    if (align) {
> -        assert((num == 1) || (num == 2) || (num == 4) ||
> -               (num == 8) || (num == 16) || (num == 32));
> -        first = ics_find_free_block(ics, num, num);
> -    } else {
> -        first = ics_find_free_block(ics, num, 1);
> -    }
> -    if (first < 0) {
> -        error_setg(errp, "can't find a free %d-IRQ block", num);
> -        return -1;
> -    }
> -
> -    first += ics->offset;
> -    for (i = first; i < first + num; ++i) {
> -        spapr_irq_set_lsi(spapr, i, lsi);
> -    }
> -
> -    trace_spapr_irq_alloc_block(first, num, lsi, align);
> -
> -    return first;
> -}
> -
>  int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
>                      Error **errp)
>  {
> @@ -3934,7 +3856,7 @@ int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
>  
>      for (i = srcno; i < srcno + num; ++i) {
>          if (ICS_IRQ_FREE(ics, i)) {
> -            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
> +            ics_set_irq_type(ics, i, lsi);
>          } else {
>              error_setg(errp, "IRQ %d is not free", i + ics->offset);
>              ret = -1;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/3] spapr: split the IRQ allocation sequence
  2018-06-15 17:21     ` Cédric Le Goater
@ 2018-06-18  6:52       ` Greg Kurz
  0 siblings, 0 replies; 21+ messages in thread
From: Greg Kurz @ 2018-06-18  6:52 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: David Gibson, qemu-ppc, qemu-devel

On Fri, 15 Jun 2018 19:21:19 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 06/15/2018 03:14 PM, Greg Kurz wrote:
> > On Fri, 15 Jun 2018 13:53:01 +0200
> > Cédric Le Goater <clg@kaod.org> wrote:
> >   
> >> Today, when a device requests for IRQ number in a sPAPR machine, the
> >> spapr_irq_alloc() routine first scans the ICSState status array to
> >> find an empty slot and then performs the assignement of the selected
> >> numbers. Split this sequence in two distinct routines : spapr_irq_find()
> >> for lookups and spapr_irq_claim() for claiming the IRQ numbers.
> >>
> >> This will ease the introduction of a static layout of IRQ numbers.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >>  include/hw/ppc/spapr.h |  5 ++++
> >>  hw/ppc/spapr.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  hw/ppc/spapr_events.c  | 18 ++++++++++----
> >>  hw/ppc/spapr_pci.c     | 19 ++++++++++++---
> >>  hw/ppc/spapr_vio.c     | 10 +++++++-
> >>  5 files changed, 108 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >> index 3388750fc795..6088f44c1b2a 100644
> >> --- a/include/hw/ppc/spapr.h
> >> +++ b/include/hw/ppc/spapr.h
> >> @@ -776,6 +776,11 @@ int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
> >>                      Error **errp);
> >>  int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
> >>                            bool align, Error **errp);
> >> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
> >> +                   Error **errp);
> >> +#define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
> >> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
> >> +                    Error **errp);
> >>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
> >>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
> >>  
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index f59999daacfc..b1d19b328166 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -3810,6 +3810,36 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
> >>      return -1;
> >>  }
> >>  
> >> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
> >> +{
> >> +    ICSState *ics = spapr->ics;
> >> +    int first = -1;
> >> +
> >> +    assert(ics);
> >> +
> >> +    /*
> >> +     * MSIMesage::data is used for storing VIRQ so
> >> +     * it has to be aligned to num to support multiple
> >> +     * MSI vectors. MSI-X is not affected by this.
> >> +     * The hint is used for the first IRQ, the rest should
> >> +     * be allocated continuously.
> >> +     */
> >> +    if (align) {
> >> +        assert((num == 1) || (num == 2) || (num == 4) ||
> >> +               (num == 8) || (num == 16) || (num == 32));
> >> +        first = ics_find_free_block(ics, num, num);
> >> +    } else {
> >> +        first = ics_find_free_block(ics, num, 1);
> >> +    }
> >> +
> >> +    if (first < 0) {
> >> +        error_setg(errp, "can't find a free %d-IRQ block", num);
> >> +        return -1;
> >> +    }
> >> +
> >> +    return first + ics->offset;
> >> +}
> >> +
> >>  /*
> >>   * Allocate the IRQ number and set the IRQ type, LSI or MSI
> >>   */
> >> @@ -3888,6 +3918,40 @@ int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
> >>      return first;
> >>  }
> >>  
> >> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
> >> +                    Error **errp)
> >> +{
> >> +    ICSState *ics = spapr->ics;
> >> +    int i;
> >> +    int srcno = irq - ics->offset;
> >> +    int ret = 0;
> >> +
> >> +    assert(ics);
> >> +
> >> +    if (!ics_valid_irq(ics, irq) || !ics_valid_irq(ics, irq + num)) {  
> > 
> > I guess it is okay to assume that if the first and last irqs are valid,
> > so are all numbers in between. Shouldn't the second check be against
> > irq + num - 1 though ?  
> 
> yes. thanks.
> 
> > This lacks an error_setg() BTW.
> >   
> >> +        return -1;
> >> +    }
> >> +
> >> +    for (i = srcno; i < srcno + num; ++i) {
> >> +        if (ICS_IRQ_FREE(ics, i)) {
> >> +            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
> >> +        } else {
> >> +            error_setg(errp, "IRQ %d is not free", i + ics->offset);
> >> +            ret = -1;
> >> +            break;
> >> +        }
> >> +    }
> >> +
> >> +    /* we could call spapr_irq_free() when rolling back */
> >> +    if (ret) {
> >> +        while (--i >= srcno) {
> >> +            memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
> >> +        }
> >> +    }  
> > 
> > Hmm... I guess we should free the whole range otherwise we leak
> > srcno + num - i irqs, preferably using spapr_irq_free() to match
> > the traces from spapr_irq_alloc().  
> 
> I don't understand. This is undoing what has been done. 
> 

Oops sorry... I got confused by the suggestion to use spapr_irq_free().
So, indeed, rollback is okay. Only thing to address would be to have
matching traces in spapr_irq_claim() and spapr_irq_free() I guess.

> > Alternatively, the rollback could be pushed to the callers.  
> 
> Yes. Today none is done so we might just do nothing about it
> and return an error.
>  
> > Rest looks good.
> >   
> >> +
> >> +    return ret;
> >> +}
> >> +
> >>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
> >>  {
> >>      ICSState *ics = spapr->ics;
> >> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> >> index 86836f0626dc..3b6ae7272092 100644
> >> --- a/hw/ppc/spapr_events.c
> >> +++ b/hw/ppc/spapr_events.c
> >> @@ -707,13 +707,18 @@ void spapr_clear_pending_events(sPAPRMachineState *spapr)
> >>  
> >>  void spapr_events_init(sPAPRMachineState *spapr)
> >>  {
> >> +    int epow_irq;
> >> +
> >> +    epow_irq = spapr_irq_findone(spapr, &error_fatal);
> >> +
> >> +    spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
> >> +
> >>      QTAILQ_INIT(&spapr->pending_events);
> >>  
> >>      spapr->event_sources = spapr_event_sources_new();
> >>  
> >>      spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_EPOW,
> >> -                                 spapr_irq_alloc(spapr, 0, false,
> >> -                                                  &error_fatal));
> >> +                                 epow_irq);
> >>  
> >>      /* NOTE: if machine supports modern/dedicated hotplug event source,
> >>       * we add it to the device-tree unconditionally. This means we may
> >> @@ -724,9 +729,14 @@ void spapr_events_init(sPAPRMachineState *spapr)
> >>       * checking that it's enabled.
> >>       */
> >>      if (spapr->use_hotplug_event_source) {
> >> +        int hp_irq;
> >> +
> >> +        hp_irq = spapr_irq_findone(spapr, &error_fatal);
> >> +
> >> +        spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
> >> +
> >>          spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
> >> -                                     spapr_irq_alloc(spapr, 0, false,
> >> -                                                      &error_fatal));
> >> +                                     hp_irq);
> >>      }
> >>  
> >>      spapr->epow_notifier.notify = spapr_powerdown_req;
> >> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> >> index f936ce63effa..7394c62b4a8b 100644
> >> --- a/hw/ppc/spapr_pci.c
> >> +++ b/hw/ppc/spapr_pci.c
> >> @@ -371,8 +371,14 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >>      }
> >>  
> >>      /* Allocate MSIs */
> >> -    irq = spapr_irq_alloc_block(spapr, req_num, false,
> >> -                           ret_intr_type == RTAS_TYPE_MSI, &err);
> >> +    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
> >> +    if (err) {
> >> +        error_reportf_err(err, "Can't allocate MSIs for device %x: ",
> >> +                          config_addr);
> >> +        rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
> >> +        return;
> >> +    }
> >> +    spapr_irq_claim(spapr, irq, req_num, false, &err);
> >>      if (err) {
> >>          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
> >>                            config_addr);
> >> @@ -1698,7 +1704,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >>          uint32_t irq;
> >>          Error *local_err = NULL;
> >>  
> >> -        irq = spapr_irq_alloc_block(spapr, 1, true, false, &local_err);
> >> +        irq = spapr_irq_findone(spapr, &local_err);
> >> +        if (local_err) {
> >> +            error_propagate(errp, local_err);
> >> +            error_prepend(errp, "can't allocate LSIs: ");
> >> +            return;
> >> +        }
> >> +
> >> +        spapr_irq_claim(spapr, irq, 1, true, &local_err);
> >>          if (local_err) {
> >>              error_propagate(errp, local_err);
> >>              error_prepend(errp, "can't allocate LSIs: ");
> >> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> >> index 4555c648a8e2..ad9b56e28447 100644
> >> --- a/hw/ppc/spapr_vio.c
> >> +++ b/hw/ppc/spapr_vio.c
> >> @@ -475,7 +475,15 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
> >>          dev->qdev.id = id;
> >>      }
> >>  
> >> -    dev->irq = spapr_irq_alloc(spapr, dev->irq, false, &local_err);
> >> +    if (!dev->irq) {
> >> +        dev->irq = spapr_irq_findone(spapr, &local_err);
> >> +        if (local_err) {
> >> +            error_propagate(errp, local_err);
> >> +            return;
> >> +        }
> >> +    }
> >> +
> >> +    spapr_irq_claim(spapr, dev->irq, 1, false, &local_err);
> >>      if (local_err) {
> >>          error_propagate(errp, local_err);
> >>          return;  
> >   
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-15 17:18     ` Cédric Le Goater
@ 2018-06-18  8:42       ` Greg Kurz
  2018-06-18  8:56         ` Cédric Le Goater
  0 siblings, 1 reply; 21+ messages in thread
From: Greg Kurz @ 2018-06-18  8:42 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: David Gibson, qemu-ppc, qemu-devel

On Fri, 15 Jun 2018 19:18:08 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> Hello,
> 
> On 06/15/2018 06:03 PM, Greg Kurz wrote:
> > On Fri, 15 Jun 2018 13:53:03 +0200
> > Cédric Le Goater <clg@kaod.org> wrote:
> >   
> >> This proposal introduces a new IRQ number space layout using static
> >> numbers for all devices and a bitmap allocator for the MSI numbers
> >> which are negotiated by the guest at runtime.
> >>
> >> The previous layout is kept in machines raising the 'xics_legacy'
> >> flag.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >>  include/hw/ppc/spapr.h     |  4 ++++
> >>  include/hw/ppc/spapr_irq.h | 30 ++++++++++++++++++++++++
> >>  hw/ppc/spapr.c             | 31 +++++++++++++++++++++++++
> >>  hw/ppc/spapr_events.c      | 12 ++++++++--
> >>  hw/ppc/spapr_irq.c         | 57 ++++++++++++++++++++++++++++++++++++++++++++++
> >>  hw/ppc/spapr_pci.c         | 28 ++++++++++++++++++-----
> >>  hw/ppc/spapr_vio.c         | 15 ++++++++----
> >>  hw/ppc/Makefile.objs       |  2 +-
> >>  8 files changed, 166 insertions(+), 13 deletions(-)
> >>  create mode 100644 include/hw/ppc/spapr_irq.h
> >>  create mode 100644 hw/ppc/spapr_irq.c
> >>
> >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >> index 49cda04b1493..094c7780130e 100644
> >> --- a/include/hw/ppc/spapr.h
> >> +++ b/include/hw/ppc/spapr.h
> >> @@ -7,6 +7,7 @@
> >>  #include "hw/ppc/spapr_drc.h"
> >>  #include "hw/mem/pc-dimm.h"
> >>  #include "hw/ppc/spapr_ovec.h"
> >> +#include "hw/ppc/spapr_irq.h"
> >>  
> >>  struct VIOsPAPRBus;
> >>  struct sPAPRPHBState;
> >> @@ -164,6 +165,9 @@ struct sPAPRMachineState {
> >>      char *kvm_type;
> >>  
> >>      const char *icp_type;
> >> +    bool xics_legacy;
> >> +    int32_t irq_map_nr;
> >> +    unsigned long *irq_map;
> >>  
> >>      bool cmd_line_caps[SPAPR_CAP_NUM];
> >>      sPAPRCapabilities def, eff, mig;
> >> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> >> new file mode 100644
> >> index 000000000000..07e36d1c7e74
> >> --- /dev/null
> >> +++ b/include/hw/ppc/spapr_irq.h
> >> @@ -0,0 +1,30 @@
> >> +/*
> >> + * QEMU PowerPC sPAPR IRQ backend definitions
> >> + *
> >> + * Copyright (c) 2018, IBM Corporation.
> >> + *
> >> + * This code is licensed under the GPL version 2 or later. See the
> >> + * COPYING file in the top-level directory.
> >> + */
> >> +#ifndef HW_SPAPR_IRQ_H
> >> +#define HW_SPAPR_IRQ_H
> >> +
> >> +/*
> >> + * IRQ ranges per type  
> > 
> > Well, range offsets actually. Maybe you could document the range size
> > as well for each type.  
> 
> yes. I am waiting for people to scream : "I want more !"
> 
> >> + */
> >> +#define SPAPR_IRQ_EPOW       0x1000  /* XICS_IRQ_BASE offset */
> >> +#define SPAPR_IRQ_HOTPLUG    0x1001  
> > 
> > This is just 1 irq, ie, range 0x1002-0x103F is unused
> >   
> >> +#define SPAPR_IRQ_VIO        0x1040  
> > 
> > This is 1 irq per device, hence up to 64 devices
> >   
> >> +#define SPAPR_IRQ_PCI_LSI    0x1080
> >> +  
> > 
> > This is 4 irqs per PHB, hence up to 32 PHBs. Cool, we're currently
> > limited to 31 PHBs.
> >   
> >> +#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered  
> > 
> > We only support dynamic MSIs with PCI, maybe rename to SPAPR_IRQ_PCI_MSI ?  
> 
> hmm, no. We could have CAPI devices there. remember ? ;)
> 

Well... OpenCAPI devices are exposed to the OS as PCI devices, so I'm not
sure we need a CAPI specific range.

> >> +                                      * by the bitmap allocator */  
> > 
> > The range size is hence 1k (XICS_IRQS_SPAPR) for the time being.  
> 
> in fact we could this bogus limit and use spapr->irq_map_nr now.
> 

"we could *missing verb* this bogus limit"... so I'm not sure to
understand...

> >> +
> >> +void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
> >> +                       Error **errp);  
> >                          ^
> >                      missing space
> >  
> >> +int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
> >> +                        Error **errp);
> >> +void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num);
> >> +void spapr_irq_msi_reset(sPAPRMachineState *spapr);
> >> +
> >> +#endif
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index c464951747e3..32aa5d869d9c 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -187,6 +187,11 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> >>  
> >> +    /* Initialize the MSI IRQ allocator. */
> >> +    if (!spapr->xics_legacy) {
> >> +        spapr_irq_msi_init(spapr, nr_irqs, &error_fatal);  
> > 
> > The error should be propagated to the caller.  
> 
> I think the Error is useless here.
> 

Yeah, spapr_irq_msi_init() cannot fail.

> 
> >   
> >> +    }
> >> +
> >>      if (kvm_enabled()) {
> >>          if (machine_kernel_irqchip_allowed(machine) &&
> >>              !xics_kvm_init(spapr, errp)) {  
> > 
> > And I now realize that the way errors are handled here is badly
> > broken.
> > 
> > In the default case, this function is supposed to try in-kernel XICS
> > first and if it fails, then fallback to userland XICS. But since we
> > pass errp down to xics_kvm_init() and the caller happens to pass
> > &error_fatal, we stop right here instead of trying the fallback.
> > 
> > And if the caller changes to pass a standard &local_err instead,
> > things would be even worse. With the same scenario, QEMU would
> > probably exit if it could fallback to the userland XICS, or abort
> > otherwise because error_set() would be called with an already set
> > *errp.
> > 
> > Since this beha^wug is mine (commit 3d85885a1b1f3), let me fix it
> > first :)
> >   
> >> @@ -1629,6 +1634,10 @@ static void spapr_machine_reset(void)
> >>          ppc_set_compat(first_ppc_cpu, spapr->max_compat_pvr, &error_fatal);
> >>      }
> >>  
> >> +    if (!spapr->xics_legacy) {
> >> +        spapr_irq_msi_reset(spapr);
> >> +    }
> >> +
> >>      qemu_devices_reset();
> >>  
> >>      /* DRC reset may cause a device to be unplugged. This will cause troubles
> >> @@ -1903,6 +1912,24 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
> >>      },
> >>  };
> >>  
> >> +static bool spapr_irq_map_needed(void *opaque)
> >> +{
> >> +    sPAPRMachineState *spapr = opaque;
> >> +
> >> +    return spapr->irq_map && !bitmap_empty(spapr->irq_map, spapr->irq_map_nr);  
> > 
> > Maybe check !spapr->xics_legacy as well to preserve backward migration
> > to older machines ?  
> 
> no need. It is redudant with spapr->irq_map which is not allocated in that case.
> 

Right.

> >> +}
> >> +
> >> +static const VMStateDescription vmstate_spapr_irq_map = {
> >> +    .name = "spapr_irq_map",
> >> +    .version_id = 1,
> >> +    .minimum_version_id = 1,
> >> +    .needed = spapr_irq_map_needed,
> >> +    .fields = (VMStateField[]) {
> >> +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, irq_map_nr),
> >> +        VMSTATE_END_OF_LIST()
> >> +    },
> >> +};
> >> +
> >>  static const VMStateDescription vmstate_spapr = {
> >>      .name = "spapr",
> >>      .version_id = 3,
> >> @@ -1930,6 +1957,7 @@ static const VMStateDescription vmstate_spapr = {
> >>          &vmstate_spapr_cap_cfpc,
> >>          &vmstate_spapr_cap_sbbc,
> >>          &vmstate_spapr_cap_ibs,
> >> +        &vmstate_spapr_irq_map,
> >>          NULL
> >>      }
> >>  };
> >> @@ -4101,7 +4129,10 @@ DEFINE_SPAPR_MACHINE(3_0, "3.0", true);
> >>  
> >>  static void spapr_machine_2_12_instance_options(MachineState *machine)
> >>  {
> >> +    sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> >> +
> >>      spapr_machine_3_0_instance_options(machine);
> >> +    spapr->xics_legacy = true;
> >>  }
> >>  
> >>  static void spapr_machine_2_12_class_options(MachineClass *mc)
> >> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> >> index 3b6ae7272092..1ef7ce7f1509 100644
> >> --- a/hw/ppc/spapr_events.c
> >> +++ b/hw/ppc/spapr_events.c
> >> @@ -709,7 +709,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
> >>  {
> >>      int epow_irq;
> >>  
> >> -    epow_irq = spapr_irq_findone(spapr, &error_fatal);
> >> +    if (spapr->xics_legacy) {
> >> +        epow_irq = spapr_irq_findone(spapr, &error_fatal);
> >> +    } else {
> >> +        epow_irq = SPAPR_IRQ_EPOW;
> >> +    }
> >>  
> >>      spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
> >>  
> >> @@ -731,7 +735,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
> >>      if (spapr->use_hotplug_event_source) {
> >>          int hp_irq;
> >>  
> >> -        hp_irq = spapr_irq_findone(spapr, &error_fatal);
> >> +        if (spapr->xics_legacy) {
> >> +            hp_irq = spapr_irq_findone(spapr, &error_fatal);
> >> +        } else {
> >> +            hp_irq = SPAPR_IRQ_HOTPLUG;
> >> +        }
> >>  
> >>          spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
> >>  
> >> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> >> new file mode 100644
> >> index 000000000000..55fa764c85d1
> >> --- /dev/null
> >> +++ b/hw/ppc/spapr_irq.c
> >> @@ -0,0 +1,57 @@
> >> +/*
> >> + * QEMU PowerPC sPAPR IRQ interface
> >> + *
> >> + * Copyright (c) 2018, IBM Corporation.
> >> + *
> >> + * This code is licensed under the GPL version 2 or later. See the
> >> + * COPYING file in the top-level directory.
> >> + */
> >> +
> >> +#include "qemu/osdep.h"
> >> +#include "qemu/log.h"
> >> +#include "qemu/error-report.h"
> >> +#include "qapi/error.h"
> >> +#include "hw/ppc/spapr.h"
> >> +#include "hw/ppc/xics.h"
> >> +
> >> +void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
> >> +                        Error **errp)  
> > 
> > What's the errp for ? It looks like this function can never return
> > an error. Maybe just drop it.  
> 
> yes
> 
> >   
> >> +{
> >> +    spapr->irq_map_nr = XICS_IRQ_BASE + nr_irqs - SPAPR_IRQ_MSI;
> >> +    spapr->irq_map = bitmap_new(spapr->irq_map_nr);
> >> +}
> >> +
> >> +int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
> >> +                        Error **errp)
> >> +{
> >> +    int irq;
> >> +
> >> +    /*
> >> +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
> >> +     * should be one less than a power of 2; 0 means no
> >> +     * alignment. Adapt the 'align' value of the former allocator
> >> +     * to fit the requirements of bitmap_find_next_zero_area()
> >> +     */
> >> +    align -= 1;
> >> +
> >> +    irq = bitmap_find_next_zero_area(spapr->irq_map, spapr->irq_map_nr, 0, num,
> >> +                                     align);
> >> +    if (irq == spapr->irq_map_nr) {
> >> +        error_setg(errp, "can't find a free %d-IRQ block", num);
> >> +        return -1;
> >> +    }
> >> +
> >> +    bitmap_set(spapr->irq_map, irq, num);
> >> +
> >> +    return irq + SPAPR_IRQ_MSI;
> >> +}
> >> +
> >> +void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num)
> >> +{
> >> +    bitmap_clear(spapr->irq_map, irq - SPAPR_IRQ_MSI, num);
> >> +}
> >> +
> >> +void spapr_irq_msi_reset(sPAPRMachineState *spapr)
> >> +{
> >> +    bitmap_clear(spapr->irq_map, 0, spapr->irq_map_nr);
> >> +}
> >> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> >> index 7394c62b4a8b..f3c4a25fb482 100644
> >> --- a/hw/ppc/spapr_pci.c
> >> +++ b/hw/ppc/spapr_pci.c
> >> @@ -333,6 +333,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >>              return;
> >>          }
> >>  
> >> +        if (!spapr->xics_legacy) {
> >> +            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
> >> +        }
> >>          spapr_irq_free(spapr, msi->first_irq, msi->num);  
> > 
> > Shouldn't ^^ be in an else block ?  
> 
> no. one is for the MSI IRQ allocator, the other for the controller. 
> 

Indeed.

> >   
> >>          if (msi_present(pdev)) {
> >>              spapr_msi_setmsg(pdev, 0, false, 0, 0);
> >> @@ -371,7 +374,13 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >>      }
> >>  
> >>      /* Allocate MSIs */
> >> -    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
> >> +    if (spapr->xics_legacy) {
> >> +        irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI,
> >> +                             &err);
> >> +    } else {
> >> +        irq = spapr_irq_msi_alloc(spapr, req_num,
> >> +                                  ret_intr_type == RTAS_TYPE_MSI, &err);
> >> +    }
> >>      if (err) {
> >>          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
> >>                            config_addr);
> >> @@ -388,6 +397,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >>  
> >>      /* Release previous MSIs */
> >>      if (msi) {
> >> +        if (!spapr->xics_legacy) {
> >> +            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
> >> +        }
> >>          spapr_irq_free(spapr, msi->first_irq, msi->num);
> >>          g_hash_table_remove(phb->msi, &config_addr);
> >>      }
> >> @@ -1704,11 +1716,15 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >>          uint32_t irq;
> >>          Error *local_err = NULL;
> >>  
> >> -        irq = spapr_irq_findone(spapr, &local_err);
> >> -        if (local_err) {
> >> -            error_propagate(errp, local_err);
> >> -            error_prepend(errp, "can't allocate LSIs: ");
> >> -            return;
> >> +        if (spapr->xics_legacy) {
> >> +            irq = spapr_irq_findone(spapr, &local_err);
> >> +            if (local_err) {
> >> +                error_propagate(errp, local_err);
> >> +                error_prepend(errp, "can't allocate LSIs: ");
> >> +                return;
> >> +            }
> >> +        } else {
> >> +            irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i;
> >>          }
> >>  
> >>          spapr_irq_claim(spapr, irq, 1, true, &local_err);
> >> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> >> index ad9b56e28447..7707bda6a38d 100644
> >> --- a/hw/ppc/spapr_vio.c
> >> +++ b/hw/ppc/spapr_vio.c
> >> @@ -436,6 +436,9 @@ static void spapr_vio_busdev_reset(DeviceState *qdev)
> >>      }
> >>  }
> >>  
> >> +/* TODO : poor VIO device indexing ... */
> >> +static uint32_t vio_index;
> >> +
> >>  static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> >> @@ -476,10 +479,14 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
> >>      }
> >>  
> >>      if (!dev->irq) {
> >> -        dev->irq = spapr_irq_findone(spapr, &local_err);
> >> -        if (local_err) {
> >> -            error_propagate(errp, local_err);
> >> -            return;
> >> +        if (spapr->xics_legacy) {
> >> +            dev->irq = spapr_irq_findone(spapr, &local_err);
> >> +            if (local_err) {
> >> +                error_propagate(errp, local_err);
> >> +                return;
> >> +            }
> >> +        } else {
> >> +            dev->irq = SPAPR_IRQ_VIO + vio_index++;  
> > 
> > This can overlap the next range if we have more than 64 VIO devices...  
> 
> yes. but claim() should fail.
> 

Hmm... I have the impression claim() only fails if:
- irq < ics->offset (ie, XICS_IRQ_BASE == 4096)
- or irq >= ics->offset + ics->nr_irqs (ie, XICS_IRQS_SPAPR == 1024)
- or irq is already in use

I can't find code that would prevent dev->irq to reach SPAPR_IRQ_MSI.

> Thanks,
> 
> C.
> 
> >>          }
> >>      }
> >>  
> >> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> >> index 86d82a6ec3ac..4fe3b7804d43 100644
> >> --- a/hw/ppc/Makefile.objs
> >> +++ b/hw/ppc/Makefile.objs
> >> @@ -4,7 +4,7 @@ obj-y += ppc.o ppc_booke.o fdt.o
> >>  obj-$(CONFIG_PSERIES) += spapr.o spapr_caps.o spapr_vio.o spapr_events.o
> >>  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
> >>  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
> >> -obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o
> >> +obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o spapr_irq.o
> >>  # IBM PowerNV
> >>  obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o pnv_occ.o pnv_bmc.o
> >>  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)  
> >   
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-18  8:42       ` Greg Kurz
@ 2018-06-18  8:56         ` Cédric Le Goater
  2018-06-18  9:54           ` Greg Kurz
  0 siblings, 1 reply; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-18  8:56 UTC (permalink / raw)
  To: Greg Kurz; +Cc: David Gibson, qemu-ppc, qemu-devel

[ ... ]

>>> This is 4 irqs per PHB, hence up to 32 PHBs. Cool, we're currently
>>> limited to 31 PHBs.
>>>   
>>>> +#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered  
>>>
>>> We only support dynamic MSIs with PCI, maybe rename to SPAPR_IRQ_PCI_MSI ?  
>>
>> hmm, no. We could have CAPI devices there. remember ? ;)
>>
> 
> Well... OpenCAPI devices are exposed to the OS as PCI devices, so I'm not
> sure we need a CAPI specific range.

yes. so this range is common to all devices doing dynamic allocation
of IRQs. How should we call it ? 

>>>> +                                      * by the bitmap allocator */  
>>>
>>> The range size is hence 1k (XICS_IRQS_SPAPR) for the time being.  
>>
>> in fact we could this bogus limit and use spapr->irq_map_nr now.
>>
> 
> "we could *missing verb* this bogus limit"... so I'm not sure to
> understand...

oups. We could use spapr->irq_map_nr instead of XICS_IRQS_SPAPR when
defining : 

    _FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pe-total-#msi", XICS_IRQS_SPAPR));

in spapr_pci.c

[ ... ]

>>>> +        if (spapr->xics_legacy) {
>>>> +            dev->irq = spapr_irq_findone(spapr, &local_err);
>>>> +            if (local_err) {
>>>> +                error_propagate(errp, local_err);
>>>> +                return;
>>>> +            }
>>>> +        } else {
>>>> +            dev->irq = SPAPR_IRQ_VIO + vio_index++;  
>>>
>>> This can overlap the next range if we have more than 64 VIO devices...  
>>
>> yes. but claim() should fail.
>>
> 
> Hmm... I have the impression claim() only fails if:
> - irq < ics->offset (ie, XICS_IRQ_BASE == 4096)
> - or irq >= ics->offset + ics->nr_irqs (ie, XICS_IRQS_SPAPR == 1024)
> - or irq is already in use
> 
> I can't find code that would prevent dev->irq to reach SPAPR_IRQ_MSI.

Ah yes. It can overlap. 

My previous proposal took care of overlaps but something simpler was 
requested. That's how I understand it at least. We can introduce 
a maximum for the VIO range or live with it. 

C.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-18  8:56         ` Cédric Le Goater
@ 2018-06-18  9:54           ` Greg Kurz
  2018-06-18 11:31             ` Cédric Le Goater
  0 siblings, 1 reply; 21+ messages in thread
From: Greg Kurz @ 2018-06-18  9:54 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: David Gibson, qemu-ppc, qemu-devel

On Mon, 18 Jun 2018 10:56:55 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> [ ... ]
> 
> >>> This is 4 irqs per PHB, hence up to 32 PHBs. Cool, we're currently
> >>> limited to 31 PHBs.
> >>>     
> >>>> +#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered    
> >>>
> >>> We only support dynamic MSIs with PCI, maybe rename to SPAPR_IRQ_PCI_MSI ?    
> >>
> >> hmm, no. We could have CAPI devices there. remember ? ;)
> >>  
> > 
> > Well... OpenCAPI devices are exposed to the OS as PCI devices, so I'm not
> > sure we need a CAPI specific range.  
> 
> yes. so this range is common to all devices doing dynamic allocation
> of IRQs. How should we call it ? 
> 
> >>>> +                                      * by the bitmap allocator */    
> >>>
> >>> The range size is hence 1k (XICS_IRQS_SPAPR) for the time being.    
> >>
> >> in fact we could this bogus limit and use spapr->irq_map_nr now.
> >>  
> > 
> > "we could *missing verb* this bogus limit"... so I'm not sure to
> > understand...  
> 
> oups. We could use spapr->irq_map_nr instead of XICS_IRQS_SPAPR when
> defining : 
> 
>     _FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pe-total-#msi", XICS_IRQS_SPAPR));
> 
> in spapr_pci.c
> 

Ah... yeah, I've always found weird that all PHBs advertise the same number
of available MSIs as the total number of irqs for the whole machine. And
putting spapr->irq_map_nr looks weird all the same if all PHBs rely on the
same bitmap actually.

I'm thinking of doing the other way around: keep XICS_IRQS_SPAPR in
"ibm,pe-total-#msi" and have one XICS_IRQS_SPAPR sized bitmap per PHB.

> [ ... ]
> 
> >>>> +        if (spapr->xics_legacy) {
> >>>> +            dev->irq = spapr_irq_findone(spapr, &local_err);
> >>>> +            if (local_err) {
> >>>> +                error_propagate(errp, local_err);
> >>>> +                return;
> >>>> +            }
> >>>> +        } else {
> >>>> +            dev->irq = SPAPR_IRQ_VIO + vio_index++;    
> >>>
> >>> This can overlap the next range if we have more than 64 VIO devices...    
> >>
> >> yes. but claim() should fail.
> >>  
> > 
> > Hmm... I have the impression claim() only fails if:
> > - irq < ics->offset (ie, XICS_IRQ_BASE == 4096)
> > - or irq >= ics->offset + ics->nr_irqs (ie, XICS_IRQS_SPAPR == 1024)
> > - or irq is already in use
> > 
> > I can't find code that would prevent dev->irq to reach SPAPR_IRQ_MSI.  
> 
> Ah yes. It can overlap. 
> 
> My previous proposal took care of overlaps but something simpler was 
> requested. That's how I understand it at least. We can introduce 
> a maximum for the VIO range or live with it. 

Hmm... I'm not very comfortable with the VIO range silently stealing an
irq number from the bitmap allocator. This would cause spapr_irq_msi_alloc()
to return a value that causes spapr_irq_claim() to fail, which looks buggy.

> 
> C.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-18  9:54           ` Greg Kurz
@ 2018-06-18 11:31             ` Cédric Le Goater
  2018-06-18 12:46               ` Greg Kurz
  0 siblings, 1 reply; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-18 11:31 UTC (permalink / raw)
  To: Greg Kurz; +Cc: David Gibson, qemu-ppc, qemu-devel

On 06/18/2018 11:54 AM, Greg Kurz wrote:
> On Mon, 18 Jun 2018 10:56:55 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> [ ... ]
>>
>>>>> This is 4 irqs per PHB, hence up to 32 PHBs. Cool, we're currently
>>>>> limited to 31 PHBs.
>>>>>     
>>>>>> +#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered    
>>>>>
>>>>> We only support dynamic MSIs with PCI, maybe rename to SPAPR_IRQ_PCI_MSI ?    
>>>>
>>>> hmm, no. We could have CAPI devices there. remember ? ;)
>>>>  
>>>
>>> Well... OpenCAPI devices are exposed to the OS as PCI devices, so I'm not
>>> sure we need a CAPI specific range.  
>>
>> yes. so this range is common to all devices doing dynamic allocation
>> of IRQs. How should we call it ? 
>>
>>>>>> +                                      * by the bitmap allocator */    
>>>>>
>>>>> The range size is hence 1k (XICS_IRQS_SPAPR) for the time being.    
>>>>
>>>> in fact we could this bogus limit and use spapr->irq_map_nr now.
>>>>  
>>>
>>> "we could *missing verb* this bogus limit"... so I'm not sure to
>>> understand...  
>>
>> oups. We could use spapr->irq_map_nr instead of XICS_IRQS_SPAPR when
>> defining : 
>>
>>     _FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pe-total-#msi", XICS_IRQS_SPAPR));
>>
>> in spapr_pci.c
>>
> 
> Ah... yeah, I've always found weird that all PHBs advertise the same number
> of available MSIs as the total number of irqs for the whole machine. And
> putting spapr->irq_map_nr looks weird all the same if all PHBs rely on the
> same bitmap actually.
> 
> I'm thinking of doing the other way around: keep XICS_IRQS_SPAPR in
> "ibm,pe-total-#msi" and have one XICS_IRQS_SPAPR sized bitmap per PHB.

That could be the place where to put the msi allocator.

C.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence
  2018-06-18  4:00   ` [Qemu-devel] " David Gibson
@ 2018-06-18 12:16     ` David Gibson
  2018-06-18 16:15       ` Cédric Le Goater
  0 siblings, 1 reply; 21+ messages in thread
From: David Gibson @ 2018-06-18 12:16 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 9515 bytes --]

On Mon, Jun 18, 2018 at 02:00:06PM +1000, David Gibson wrote:
> On Fri, Jun 15, 2018 at 01:53:01PM +0200, Cédric Le Goater wrote:
> > Today, when a device requests for IRQ number in a sPAPR machine, the
> > spapr_irq_alloc() routine first scans the ICSState status array to
> > find an empty slot and then performs the assignement of the selected
> > numbers. Split this sequence in two distinct routines : spapr_irq_find()
> > for lookups and spapr_irq_claim() for claiming the IRQ numbers.
> > 
> > This will ease the introduction of a static layout of IRQ numbers.
> > 
> > Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > ---
> >  include/hw/ppc/spapr.h |  5 ++++
> >  hw/ppc/spapr.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/ppc/spapr_events.c  | 18 ++++++++++----
> >  hw/ppc/spapr_pci.c     | 19 ++++++++++++---
> >  hw/ppc/spapr_vio.c     | 10 +++++++-
> >  5 files changed, 108 insertions(+), 8 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 3388750fc795..6088f44c1b2a 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -776,6 +776,11 @@ int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
> >                      Error **errp);
> >  int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
> >                            bool align, Error **errp);
> > +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
> > +                   Error **errp);
> > +#define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
> > +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
> > +                    Error **errp);
> 
> Unlike find, there's no reason we need to atomically claim a block of
> irqs.  So I think claim should just grab a single irq.  Callers can
> loop if they need multiple irqs.

Actually.. I take that back.  We don't *need* a block-claim interface,
but if it's convenient there's no strong reason not to (as long as
therer aren't *both* single and block interfaces, which there aren't).

So feel free to run with that, once the rollback bugs Greg pointed out
are fixed.


> Other than that and the minor points Greg noted, this looks good.
> 
> >  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
> >  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
> >  
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index f59999daacfc..b1d19b328166 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -3810,6 +3810,36 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
> >      return -1;
> >  }
> >  
> > +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
> > +{
> > +    ICSState *ics = spapr->ics;
> > +    int first = -1;
> > +
> > +    assert(ics);
> > +
> > +    /*
> > +     * MSIMesage::data is used for storing VIRQ so
> > +     * it has to be aligned to num to support multiple
> > +     * MSI vectors. MSI-X is not affected by this.
> > +     * The hint is used for the first IRQ, the rest should
> > +     * be allocated continuously.
> > +     */
> > +    if (align) {
> > +        assert((num == 1) || (num == 2) || (num == 4) ||
> > +               (num == 8) || (num == 16) || (num == 32));
> > +        first = ics_find_free_block(ics, num, num);
> > +    } else {
> > +        first = ics_find_free_block(ics, num, 1);
> > +    }
> > +
> > +    if (first < 0) {
> > +        error_setg(errp, "can't find a free %d-IRQ block", num);
> > +        return -1;
> > +    }
> > +
> > +    return first + ics->offset;
> > +}
> > +
> >  /*
> >   * Allocate the IRQ number and set the IRQ type, LSI or MSI
> >   */
> > @@ -3888,6 +3918,40 @@ int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
> >      return first;
> >  }
> >  
> > +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
> > +                    Error **errp)
> > +{
> > +    ICSState *ics = spapr->ics;
> > +    int i;
> > +    int srcno = irq - ics->offset;
> > +    int ret = 0;
> > +
> > +    assert(ics);
> > +
> > +    if (!ics_valid_irq(ics, irq) || !ics_valid_irq(ics, irq + num)) {
> > +        return -1;
> > +    }
> > +
> > +    for (i = srcno; i < srcno + num; ++i) {
> > +        if (ICS_IRQ_FREE(ics, i)) {
> > +            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
> > +        } else {
> > +            error_setg(errp, "IRQ %d is not free", i + ics->offset);
> > +            ret = -1;
> > +            break;
> > +        }
> > +    }
> > +
> > +    /* we could call spapr_irq_free() when rolling back */
> > +    if (ret) {
> > +        while (--i >= srcno) {
> > +            memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
> > +        }
> > +    }
> > +
> > +    return ret;
> > +}
> > +
> >  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
> >  {
> >      ICSState *ics = spapr->ics;
> > diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> > index 86836f0626dc..3b6ae7272092 100644
> > --- a/hw/ppc/spapr_events.c
> > +++ b/hw/ppc/spapr_events.c
> > @@ -707,13 +707,18 @@ void spapr_clear_pending_events(sPAPRMachineState *spapr)
> >  
> >  void spapr_events_init(sPAPRMachineState *spapr)
> >  {
> > +    int epow_irq;
> > +
> > +    epow_irq = spapr_irq_findone(spapr, &error_fatal);
> > +
> > +    spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
> > +
> >      QTAILQ_INIT(&spapr->pending_events);
> >  
> >      spapr->event_sources = spapr_event_sources_new();
> >  
> >      spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_EPOW,
> > -                                 spapr_irq_alloc(spapr, 0, false,
> > -                                                  &error_fatal));
> > +                                 epow_irq);
> >  
> >      /* NOTE: if machine supports modern/dedicated hotplug event source,
> >       * we add it to the device-tree unconditionally. This means we may
> > @@ -724,9 +729,14 @@ void spapr_events_init(sPAPRMachineState *spapr)
> >       * checking that it's enabled.
> >       */
> >      if (spapr->use_hotplug_event_source) {
> > +        int hp_irq;
> > +
> > +        hp_irq = spapr_irq_findone(spapr, &error_fatal);
> > +
> > +        spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
> > +
> >          spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
> > -                                     spapr_irq_alloc(spapr, 0, false,
> > -                                                      &error_fatal));
> > +                                     hp_irq);
> >      }
> >  
> >      spapr->epow_notifier.notify = spapr_powerdown_req;
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index f936ce63effa..7394c62b4a8b 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -371,8 +371,14 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >      }
> >  
> >      /* Allocate MSIs */
> > -    irq = spapr_irq_alloc_block(spapr, req_num, false,
> > -                           ret_intr_type == RTAS_TYPE_MSI, &err);
> > +    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
> > +    if (err) {
> > +        error_reportf_err(err, "Can't allocate MSIs for device %x: ",
> > +                          config_addr);
> > +        rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
> > +        return;
> > +    }
> > +    spapr_irq_claim(spapr, irq, req_num, false, &err);
> >      if (err) {
> >          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
> >                            config_addr);
> > @@ -1698,7 +1704,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >          uint32_t irq;
> >          Error *local_err = NULL;
> >  
> > -        irq = spapr_irq_alloc_block(spapr, 1, true, false, &local_err);
> > +        irq = spapr_irq_findone(spapr, &local_err);
> > +        if (local_err) {
> > +            error_propagate(errp, local_err);
> > +            error_prepend(errp, "can't allocate LSIs: ");
> > +            return;
> > +        }
> > +
> > +        spapr_irq_claim(spapr, irq, 1, true, &local_err);
> >          if (local_err) {
> >              error_propagate(errp, local_err);
> >              error_prepend(errp, "can't allocate LSIs: ");
> > diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> > index 4555c648a8e2..ad9b56e28447 100644
> > --- a/hw/ppc/spapr_vio.c
> > +++ b/hw/ppc/spapr_vio.c
> > @@ -475,7 +475,15 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
> >          dev->qdev.id = id;
> >      }
> >  
> > -    dev->irq = spapr_irq_alloc(spapr, dev->irq, false, &local_err);
> > +    if (!dev->irq) {
> > +        dev->irq = spapr_irq_findone(spapr, &local_err);
> > +        if (local_err) {
> > +            error_propagate(errp, local_err);
> > +            return;
> > +        }
> > +    }
> > +
> > +    spapr_irq_claim(spapr, dev->irq, 1, false, &local_err);
> >      if (local_err) {
> >          error_propagate(errp, local_err);
> >          return;
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-15 16:03   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
  2018-06-15 17:18     ` Cédric Le Goater
@ 2018-06-18 12:30     ` David Gibson
  1 sibling, 0 replies; 21+ messages in thread
From: David Gibson @ 2018-06-18 12:30 UTC (permalink / raw)
  To: Greg Kurz; +Cc: Cédric Le Goater, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 15872 bytes --]

On Fri, Jun 15, 2018 at 06:03:11PM +0200, Greg Kurz wrote:
> On Fri, 15 Jun 2018 13:53:03 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
> > This proposal introduces a new IRQ number space layout using static
> > numbers for all devices and a bitmap allocator for the MSI numbers
> > which are negotiated by the guest at runtime.
> > 
> > The previous layout is kept in machines raising the 'xics_legacy'
> > flag.
> > 
> > Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > ---
> >  include/hw/ppc/spapr.h     |  4 ++++
> >  include/hw/ppc/spapr_irq.h | 30 ++++++++++++++++++++++++
> >  hw/ppc/spapr.c             | 31 +++++++++++++++++++++++++
> >  hw/ppc/spapr_events.c      | 12 ++++++++--
> >  hw/ppc/spapr_irq.c         | 57 ++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/ppc/spapr_pci.c         | 28 ++++++++++++++++++-----
> >  hw/ppc/spapr_vio.c         | 15 ++++++++----
> >  hw/ppc/Makefile.objs       |  2 +-
> >  8 files changed, 166 insertions(+), 13 deletions(-)
> >  create mode 100644 include/hw/ppc/spapr_irq.h
> >  create mode 100644 hw/ppc/spapr_irq.c
> > 
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 49cda04b1493..094c7780130e 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -7,6 +7,7 @@
> >  #include "hw/ppc/spapr_drc.h"
> >  #include "hw/mem/pc-dimm.h"
> >  #include "hw/ppc/spapr_ovec.h"
> > +#include "hw/ppc/spapr_irq.h"
> >  
> >  struct VIOsPAPRBus;
> >  struct sPAPRPHBState;
> > @@ -164,6 +165,9 @@ struct sPAPRMachineState {
> >      char *kvm_type;
> >  
> >      const char *icp_type;
> > +    bool xics_legacy;
> > +    int32_t irq_map_nr;
> > +    unsigned long *irq_map;
> >  
> >      bool cmd_line_caps[SPAPR_CAP_NUM];
> >      sPAPRCapabilities def, eff, mig;
> > diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> > new file mode 100644
> > index 000000000000..07e36d1c7e74
> > --- /dev/null
> > +++ b/include/hw/ppc/spapr_irq.h
> > @@ -0,0 +1,30 @@
> > +/*
> > + * QEMU PowerPC sPAPR IRQ backend definitions
> > + *
> > + * Copyright (c) 2018, IBM Corporation.
> > + *
> > + * This code is licensed under the GPL version 2 or later. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +#ifndef HW_SPAPR_IRQ_H
> > +#define HW_SPAPR_IRQ_H
> > +
> > +/*
> > + * IRQ ranges per type
> 
> Well, range offsets actually. Maybe you could document the range size
> as well for each type.
> 
> > + */
> > +#define SPAPR_IRQ_EPOW       0x1000  /* XICS_IRQ_BASE offset */
> > +#define SPAPR_IRQ_HOTPLUG    0x1001
> 
> This is just 1 irq, ie, range 0x1002-0x103F is unused

I'm good with that.  I'm thinking of that as the reserved range for
any more "one off" interrupts we get.

> > +#define SPAPR_IRQ_VIO        0x1040

I'd bump it up to 0x1100, makes for rounder numbers.  We can up the
total limit if we have to.

> This is 1 irq per device, hence up to 64 devices
> 
> > +#define SPAPR_IRQ_PCI_LSI    0x1080
> > +
> 
> This is 4 irqs per PHB, hence up to 32 PHBs. Cool, we're currently
> limited to 31 PHBs.

Likewise I'd suggest spacing out this range.  Rearranging all the irqs
would probably be more trouble than raising the #PHBs limit at this
point.  Well.. maybe.  Raising the PHBs limit does already involve
moving a bunch of address space blocks around.

> > +#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered
> 
> We only support dynamic MSIs with PCI, maybe rename to SPAPR_IRQ_PCI_MSI ?
> 
> > +                                      * by the bitmap allocator */
> 
> The range size is hence 1k (XICS_IRQS_SPAPR) for the time being.
> 
> > +
> > +void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
> > +                       Error **errp);
>                          ^
>                      missing space
> 
> > +int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
> > +                        Error **errp);
> > +void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num);
> > +void spapr_irq_msi_reset(sPAPRMachineState *spapr);
> > +
> > +#endif
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index c464951747e3..32aa5d869d9c 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -187,6 +187,11 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> >  {
> >      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> >  
> > +    /* Initialize the MSI IRQ allocator. */
> > +    if (!spapr->xics_legacy) {
> > +        spapr_irq_msi_init(spapr, nr_irqs, &error_fatal);
> 
> The error should be propagated to the caller.
> 
> > +    }
> > +
> >      if (kvm_enabled()) {
> >          if (machine_kernel_irqchip_allowed(machine) &&
> >              !xics_kvm_init(spapr, errp)) {
> 
> And I now realize that the way errors are handled here is badly
> broken.
> 
> In the default case, this function is supposed to try in-kernel XICS
> first and if it fails, then fallback to userland XICS. But since we
> pass errp down to xics_kvm_init() and the caller happens to pass
> &error_fatal, we stop right here instead of trying the fallback.
> 
> And if the caller changes to pass a standard &local_err instead,
> things would be even worse. With the same scenario, QEMU would
> probably exit if it could fallback to the userland XICS, or abort
> otherwise because error_set() would be called with an already set
> *errp.
> 
> Since this beha^wug is mine (commit 3d85885a1b1f3), let me fix it
> first :)
> 
> > @@ -1629,6 +1634,10 @@ static void spapr_machine_reset(void)
> >          ppc_set_compat(first_ppc_cpu, spapr->max_compat_pvr, &error_fatal);
> >      }
> >  
> > +    if (!spapr->xics_legacy) {
> > +        spapr_irq_msi_reset(spapr);
> > +    }
> > +
> >      qemu_devices_reset();
> >  
> >      /* DRC reset may cause a device to be unplugged. This will cause troubles
> > @@ -1903,6 +1912,24 @@ static const VMStateDescription vmstate_spapr_patb_entry = {
> >      },
> >  };
> >  
> > +static bool spapr_irq_map_needed(void *opaque)
> > +{
> > +    sPAPRMachineState *spapr = opaque;
> > +
> > +    return spapr->irq_map && !bitmap_empty(spapr->irq_map, spapr->irq_map_nr);
> 
> Maybe check !spapr->xics_legacy as well to preserve backward migration
> to older machines ?
> 
> > +}
> > +
> > +static const VMStateDescription vmstate_spapr_irq_map = {
> > +    .name = "spapr_irq_map",
> > +    .version_id = 1,
> > +    .minimum_version_id = 1,
> > +    .needed = spapr_irq_map_needed,
> > +    .fields = (VMStateField[]) {
> > +        VMSTATE_BITMAP(irq_map, sPAPRMachineState, 0, irq_map_nr),
> > +        VMSTATE_END_OF_LIST()
> > +    },
> > +};
> > +
> >  static const VMStateDescription vmstate_spapr = {
> >      .name = "spapr",
> >      .version_id = 3,
> > @@ -1930,6 +1957,7 @@ static const VMStateDescription vmstate_spapr = {
> >          &vmstate_spapr_cap_cfpc,
> >          &vmstate_spapr_cap_sbbc,
> >          &vmstate_spapr_cap_ibs,
> > +        &vmstate_spapr_irq_map,
> >          NULL
> >      }
> >  };
> > @@ -4101,7 +4129,10 @@ DEFINE_SPAPR_MACHINE(3_0, "3.0", true);
> >  
> >  static void spapr_machine_2_12_instance_options(MachineState *machine)
> >  {
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> > +
> >      spapr_machine_3_0_instance_options(machine);
> > +    spapr->xics_legacy = true;
> >  }
> >  
> >  static void spapr_machine_2_12_class_options(MachineClass *mc)
> > diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> > index 3b6ae7272092..1ef7ce7f1509 100644
> > --- a/hw/ppc/spapr_events.c
> > +++ b/hw/ppc/spapr_events.c
> > @@ -709,7 +709,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
> >  {
> >      int epow_irq;
> >  
> > -    epow_irq = spapr_irq_findone(spapr, &error_fatal);
> > +    if (spapr->xics_legacy) {
> > +        epow_irq = spapr_irq_findone(spapr, &error_fatal);
> > +    } else {
> > +        epow_irq = SPAPR_IRQ_EPOW;
> > +    }
> >  
> >      spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
> >  
> > @@ -731,7 +735,11 @@ void spapr_events_init(sPAPRMachineState *spapr)
> >      if (spapr->use_hotplug_event_source) {
> >          int hp_irq;
> >  
> > -        hp_irq = spapr_irq_findone(spapr, &error_fatal);
> > +        if (spapr->xics_legacy) {
> > +            hp_irq = spapr_irq_findone(spapr, &error_fatal);
> > +        } else {
> > +            hp_irq = SPAPR_IRQ_HOTPLUG;
> > +        }
> >  
> >          spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
> >  
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > new file mode 100644
> > index 000000000000..55fa764c85d1
> > --- /dev/null
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -0,0 +1,57 @@
> > +/*
> > + * QEMU PowerPC sPAPR IRQ interface
> > + *
> > + * Copyright (c) 2018, IBM Corporation.
> > + *
> > + * This code is licensed under the GPL version 2 or later. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "qemu/error-report.h"
> > +#include "qapi/error.h"
> > +#include "hw/ppc/spapr.h"
> > +#include "hw/ppc/xics.h"
> > +
> > +void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_irqs,
> > +                        Error **errp)
> 
> What's the errp for ? It looks like this function can never return
> an error. Maybe just drop it.
> 
> > +{
> > +    spapr->irq_map_nr = XICS_IRQ_BASE + nr_irqs - SPAPR_IRQ_MSI;
> > +    spapr->irq_map = bitmap_new(spapr->irq_map_nr);
> > +}
> > +
> > +int spapr_irq_msi_alloc(sPAPRMachineState *spapr, uint32_t num, bool align,
> > +                        Error **errp)
> > +{
> > +    int irq;
> > +
> > +    /*
> > +     * The 'align_mask' parameter of bitmap_find_next_zero_area()
> > +     * should be one less than a power of 2; 0 means no
> > +     * alignment. Adapt the 'align' value of the former allocator
> > +     * to fit the requirements of bitmap_find_next_zero_area()
> > +     */
> > +    align -= 1;
> > +
> > +    irq = bitmap_find_next_zero_area(spapr->irq_map, spapr->irq_map_nr, 0, num,
> > +                                     align);
> > +    if (irq == spapr->irq_map_nr) {
> > +        error_setg(errp, "can't find a free %d-IRQ block", num);
> > +        return -1;
> > +    }
> > +
> > +    bitmap_set(spapr->irq_map, irq, num);
> > +
> > +    return irq + SPAPR_IRQ_MSI;
> > +}
> > +
> > +void spapr_irq_msi_free(sPAPRMachineState *spapr, int irq, uint32_t num)
> > +{
> > +    bitmap_clear(spapr->irq_map, irq - SPAPR_IRQ_MSI, num);
> > +}
> > +
> > +void spapr_irq_msi_reset(sPAPRMachineState *spapr)
> > +{
> > +    bitmap_clear(spapr->irq_map, 0, spapr->irq_map_nr);
> > +}
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 7394c62b4a8b..f3c4a25fb482 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -333,6 +333,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >              return;
> >          }
> >  
> > +        if (!spapr->xics_legacy) {
> > +            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
> > +        }
> >          spapr_irq_free(spapr, msi->first_irq, msi->num);
> 
> Shouldn't ^^ be in an else block ?
> 
> >          if (msi_present(pdev)) {
> >              spapr_msi_setmsg(pdev, 0, false, 0, 0);
> > @@ -371,7 +374,13 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >      }
> >  
> >      /* Allocate MSIs */
> > -    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
> > +    if (spapr->xics_legacy) {
> > +        irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI,
> > +                             &err);
> > +    } else {
> > +        irq = spapr_irq_msi_alloc(spapr, req_num,
> > +                                  ret_intr_type == RTAS_TYPE_MSI, &err);
> > +    }
> >      if (err) {
> >          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
> >                            config_addr);
> > @@ -388,6 +397,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >  
> >      /* Release previous MSIs */
> >      if (msi) {
> > +        if (!spapr->xics_legacy) {
> > +            spapr_irq_msi_free(spapr, msi->first_irq, msi->num);
> > +        }
> >          spapr_irq_free(spapr, msi->first_irq, msi->num);
> >          g_hash_table_remove(phb->msi, &config_addr);
> >      }
> > @@ -1704,11 +1716,15 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >          uint32_t irq;
> >          Error *local_err = NULL;
> >  
> > -        irq = spapr_irq_findone(spapr, &local_err);
> > -        if (local_err) {
> > -            error_propagate(errp, local_err);
> > -            error_prepend(errp, "can't allocate LSIs: ");
> > -            return;
> > +        if (spapr->xics_legacy) {
> > +            irq = spapr_irq_findone(spapr, &local_err);
> > +            if (local_err) {
> > +                error_propagate(errp, local_err);
> > +                error_prepend(errp, "can't allocate LSIs: ");
> > +                return;
> > +            }
> > +        } else {
> > +            irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i;
> >          }
> >  
> >          spapr_irq_claim(spapr, irq, 1, true, &local_err);
> > diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> > index ad9b56e28447..7707bda6a38d 100644
> > --- a/hw/ppc/spapr_vio.c
> > +++ b/hw/ppc/spapr_vio.c
> > @@ -436,6 +436,9 @@ static void spapr_vio_busdev_reset(DeviceState *qdev)
> >      }
> >  }
> >  
> > +/* TODO : poor VIO device indexing ... */
> > +static uint32_t vio_index;
> > +
> >  static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
> >  {
> >      sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> > @@ -476,10 +479,14 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
> >      }
> >  
> >      if (!dev->irq) {
> > -        dev->irq = spapr_irq_findone(spapr, &local_err);
> > -        if (local_err) {
> > -            error_propagate(errp, local_err);
> > -            return;
> > +        if (spapr->xics_legacy) {
> > +            dev->irq = spapr_irq_findone(spapr, &local_err);
> > +            if (local_err) {
> > +                error_propagate(errp, local_err);
> > +                return;
> > +            }
> > +        } else {
> > +            dev->irq = SPAPR_IRQ_VIO + vio_index++;
> 
> This can overlap the next range if we have more than 64 VIO devices...
> 
> >          }
> >      }
> >  
> > diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> > index 86d82a6ec3ac..4fe3b7804d43 100644
> > --- a/hw/ppc/Makefile.objs
> > +++ b/hw/ppc/Makefile.objs
> > @@ -4,7 +4,7 @@ obj-y += ppc.o ppc_booke.o fdt.o
> >  obj-$(CONFIG_PSERIES) += spapr.o spapr_caps.o spapr_vio.o spapr_events.o
> >  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
> >  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
> > -obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o
> > +obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o spapr_irq.o
> >  # IBM PowerNV
> >  obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o pnv_occ.o pnv_bmc.o
> >  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-18 11:31             ` Cédric Le Goater
@ 2018-06-18 12:46               ` Greg Kurz
  2018-06-18 17:40                 ` Cédric Le Goater
  0 siblings, 1 reply; 21+ messages in thread
From: Greg Kurz @ 2018-06-18 12:46 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: David Gibson, qemu-ppc, qemu-devel

On Mon, 18 Jun 2018 13:31:26 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 06/18/2018 11:54 AM, Greg Kurz wrote:
> > On Mon, 18 Jun 2018 10:56:55 +0200
> > Cédric Le Goater <clg@kaod.org> wrote:
> >   
> >> [ ... ]
> >>  
> >>>>> This is 4 irqs per PHB, hence up to 32 PHBs. Cool, we're currently
> >>>>> limited to 31 PHBs.
> >>>>>       
> >>>>>> +#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered      
> >>>>>
> >>>>> We only support dynamic MSIs with PCI, maybe rename to SPAPR_IRQ_PCI_MSI ?      
> >>>>
> >>>> hmm, no. We could have CAPI devices there. remember ? ;)
> >>>>    
> >>>
> >>> Well... OpenCAPI devices are exposed to the OS as PCI devices, so I'm not
> >>> sure we need a CAPI specific range.    
> >>
> >> yes. so this range is common to all devices doing dynamic allocation
> >> of IRQs. How should we call it ? 
> >>  
> >>>>>> +                                      * by the bitmap allocator */      
> >>>>>
> >>>>> The range size is hence 1k (XICS_IRQS_SPAPR) for the time being.      
> >>>>
> >>>> in fact we could this bogus limit and use spapr->irq_map_nr now.
> >>>>    
> >>>
> >>> "we could *missing verb* this bogus limit"... so I'm not sure to
> >>> understand...    
> >>
> >> oups. We could use spapr->irq_map_nr instead of XICS_IRQS_SPAPR when
> >> defining : 
> >>
> >>     _FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pe-total-#msi", XICS_IRQS_SPAPR));
> >>
> >> in spapr_pci.c
> >>  
> > 
> > Ah... yeah, I've always found weird that all PHBs advertise the same number
> > of available MSIs as the total number of irqs for the whole machine. And
> > putting spapr->irq_map_nr looks weird all the same if all PHBs rely on the
> > same bitmap actually.
> > 
> > I'm thinking of doing the other way around: keep XICS_IRQS_SPAPR in
> > "ibm,pe-total-#msi" and have one XICS_IRQS_SPAPR sized bitmap per PHB.  
> 
> That could be the place where to put the msi allocator.
> 

Are you suggesting to move @irq_map and @irq_map_nr to sPAPRPHBState and
to have these setup during PHB realize ? I guess we could do that.

> C.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence
  2018-06-18 12:16     ` David Gibson
@ 2018-06-18 16:15       ` Cédric Le Goater
  0 siblings, 0 replies; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-18 16:15 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel

On 06/18/2018 02:16 PM, David Gibson wrote:
> On Mon, Jun 18, 2018 at 02:00:06PM +1000, David Gibson wrote:
>> On Fri, Jun 15, 2018 at 01:53:01PM +0200, Cédric Le Goater wrote:
>>> Today, when a device requests for IRQ number in a sPAPR machine, the
>>> spapr_irq_alloc() routine first scans the ICSState status array to
>>> find an empty slot and then performs the assignement of the selected
>>> numbers. Split this sequence in two distinct routines : spapr_irq_find()
>>> for lookups and spapr_irq_claim() for claiming the IRQ numbers.
>>>
>>> This will ease the introduction of a static layout of IRQ numbers.
>>>
>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>> ---
>>>  include/hw/ppc/spapr.h |  5 ++++
>>>  hw/ppc/spapr.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  hw/ppc/spapr_events.c  | 18 ++++++++++----
>>>  hw/ppc/spapr_pci.c     | 19 ++++++++++++---
>>>  hw/ppc/spapr_vio.c     | 10 +++++++-
>>>  5 files changed, 108 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>> index 3388750fc795..6088f44c1b2a 100644
>>> --- a/include/hw/ppc/spapr.h
>>> +++ b/include/hw/ppc/spapr.h
>>> @@ -776,6 +776,11 @@ int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,
>>>                      Error **errp);
>>>  int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
>>>                            bool align, Error **errp);
>>> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align,
>>> +                   Error **errp);
>>> +#define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
>>> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
>>> +                    Error **errp);
>>
>> Unlike find, there's no reason we need to atomically claim a block of
>> irqs.  So I think claim should just grab a single irq.  Callers can
>> loop if they need multiple irqs.
> 
> Actually.. I take that back.  We don't *need* a block-claim interface,
> but if it's convenient there's no strong reason not to (as long as
> therer aren't *both* single and block interfaces, which there aren't).

Ah ! I usually code important matters in the morning and so I have 
removed it already. It was only used under spapr_pci for the msi.  

> So feel free to run with that, once the rollback bugs Greg pointed out
> are fixed.

The code was ok I think. But anyway, it's dead now. We can always 
add it back.

Thanks,

C. 
 
>> Other than that and the minor points Greg noted, this looks good.
>>
>>>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
>>>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
>>>  
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index f59999daacfc..b1d19b328166 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -3810,6 +3810,36 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum)
>>>      return -1;
>>>  }
>>>  
>>> +int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
>>> +{
>>> +    ICSState *ics = spapr->ics;
>>> +    int first = -1;
>>> +
>>> +    assert(ics);
>>> +
>>> +    /*
>>> +     * MSIMesage::data is used for storing VIRQ so
>>> +     * it has to be aligned to num to support multiple
>>> +     * MSI vectors. MSI-X is not affected by this.
>>> +     * The hint is used for the first IRQ, the rest should
>>> +     * be allocated continuously.
>>> +     */
>>> +    if (align) {
>>> +        assert((num == 1) || (num == 2) || (num == 4) ||
>>> +               (num == 8) || (num == 16) || (num == 32));
>>> +        first = ics_find_free_block(ics, num, num);
>>> +    } else {
>>> +        first = ics_find_free_block(ics, num, 1);
>>> +    }
>>> +
>>> +    if (first < 0) {
>>> +        error_setg(errp, "can't find a free %d-IRQ block", num);
>>> +        return -1;
>>> +    }
>>> +
>>> +    return first + ics->offset;
>>> +}
>>> +
>>>  /*
>>>   * Allocate the IRQ number and set the IRQ type, LSI or MSI
>>>   */
>>> @@ -3888,6 +3918,40 @@ int spapr_irq_alloc_block(sPAPRMachineState *spapr, int num, bool lsi,
>>>      return first;
>>>  }
>>>  
>>> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, int num, bool lsi,
>>> +                    Error **errp)
>>> +{
>>> +    ICSState *ics = spapr->ics;
>>> +    int i;
>>> +    int srcno = irq - ics->offset;
>>> +    int ret = 0;
>>> +
>>> +    assert(ics);
>>> +
>>> +    if (!ics_valid_irq(ics, irq) || !ics_valid_irq(ics, irq + num)) {
>>> +        return -1;
>>> +    }
>>> +
>>> +    for (i = srcno; i < srcno + num; ++i) {
>>> +        if (ICS_IRQ_FREE(ics, i)) {
>>> +            spapr_irq_set_lsi(spapr, i + ics->offset, lsi);
>>> +        } else {
>>> +            error_setg(errp, "IRQ %d is not free", i + ics->offset);
>>> +            ret = -1;
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    /* we could call spapr_irq_free() when rolling back */
>>> +    if (ret) {
>>> +        while (--i >= srcno) {
>>> +            memset(&ics->irqs[i], 0, sizeof(ICSIRQState));
>>> +        }
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>> +
>>>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
>>>  {
>>>      ICSState *ics = spapr->ics;
>>> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
>>> index 86836f0626dc..3b6ae7272092 100644
>>> --- a/hw/ppc/spapr_events.c
>>> +++ b/hw/ppc/spapr_events.c
>>> @@ -707,13 +707,18 @@ void spapr_clear_pending_events(sPAPRMachineState *spapr)
>>>  
>>>  void spapr_events_init(sPAPRMachineState *spapr)
>>>  {
>>> +    int epow_irq;
>>> +
>>> +    epow_irq = spapr_irq_findone(spapr, &error_fatal);
>>> +
>>> +    spapr_irq_claim(spapr, epow_irq, 1, false, &error_fatal);
>>> +
>>>      QTAILQ_INIT(&spapr->pending_events);
>>>  
>>>      spapr->event_sources = spapr_event_sources_new();
>>>  
>>>      spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_EPOW,
>>> -                                 spapr_irq_alloc(spapr, 0, false,
>>> -                                                  &error_fatal));
>>> +                                 epow_irq);
>>>  
>>>      /* NOTE: if machine supports modern/dedicated hotplug event source,
>>>       * we add it to the device-tree unconditionally. This means we may
>>> @@ -724,9 +729,14 @@ void spapr_events_init(sPAPRMachineState *spapr)
>>>       * checking that it's enabled.
>>>       */
>>>      if (spapr->use_hotplug_event_source) {
>>> +        int hp_irq;
>>> +
>>> +        hp_irq = spapr_irq_findone(spapr, &error_fatal);
>>> +
>>> +        spapr_irq_claim(spapr, hp_irq, 1, false, &error_fatal);
>>> +
>>>          spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
>>> -                                     spapr_irq_alloc(spapr, 0, false,
>>> -                                                      &error_fatal));
>>> +                                     hp_irq);
>>>      }
>>>  
>>>      spapr->epow_notifier.notify = spapr_powerdown_req;
>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>>> index f936ce63effa..7394c62b4a8b 100644
>>> --- a/hw/ppc/spapr_pci.c
>>> +++ b/hw/ppc/spapr_pci.c
>>> @@ -371,8 +371,14 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>>>      }
>>>  
>>>      /* Allocate MSIs */
>>> -    irq = spapr_irq_alloc_block(spapr, req_num, false,
>>> -                           ret_intr_type == RTAS_TYPE_MSI, &err);
>>> +    irq = spapr_irq_find(spapr, req_num, ret_intr_type == RTAS_TYPE_MSI, &err);
>>> +    if (err) {
>>> +        error_reportf_err(err, "Can't allocate MSIs for device %x: ",
>>> +                          config_addr);
>>> +        rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
>>> +        return;
>>> +    }
>>> +    spapr_irq_claim(spapr, irq, req_num, false, &err);
>>>      if (err) {
>>>          error_reportf_err(err, "Can't allocate MSIs for device %x: ",
>>>                            config_addr);
>>> @@ -1698,7 +1704,14 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>>          uint32_t irq;
>>>          Error *local_err = NULL;
>>>  
>>> -        irq = spapr_irq_alloc_block(spapr, 1, true, false, &local_err);
>>> +        irq = spapr_irq_findone(spapr, &local_err);
>>> +        if (local_err) {
>>> +            error_propagate(errp, local_err);
>>> +            error_prepend(errp, "can't allocate LSIs: ");
>>> +            return;
>>> +        }
>>> +
>>> +        spapr_irq_claim(spapr, irq, 1, true, &local_err);
>>>          if (local_err) {
>>>              error_propagate(errp, local_err);
>>>              error_prepend(errp, "can't allocate LSIs: ");
>>> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
>>> index 4555c648a8e2..ad9b56e28447 100644
>>> --- a/hw/ppc/spapr_vio.c
>>> +++ b/hw/ppc/spapr_vio.c
>>> @@ -475,7 +475,15 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>>>          dev->qdev.id = id;
>>>      }
>>>  
>>> -    dev->irq = spapr_irq_alloc(spapr, dev->irq, false, &local_err);
>>> +    if (!dev->irq) {
>>> +        dev->irq = spapr_irq_findone(spapr, &local_err);
>>> +        if (local_err) {
>>> +            error_propagate(errp, local_err);
>>> +            return;
>>> +        }
>>> +    }
>>> +
>>> +    spapr_irq_claim(spapr, dev->irq, 1, false, &local_err);
>>>      if (local_err) {
>>>          error_propagate(errp, local_err);
>>>          return;
>>
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/3] spapr: introduce a fixed IRQ number space
  2018-06-18 12:46               ` Greg Kurz
@ 2018-06-18 17:40                 ` Cédric Le Goater
  0 siblings, 0 replies; 21+ messages in thread
From: Cédric Le Goater @ 2018-06-18 17:40 UTC (permalink / raw)
  To: Greg Kurz; +Cc: David Gibson, qemu-ppc, qemu-devel

On 06/18/2018 02:46 PM, Greg Kurz wrote:
> On Mon, 18 Jun 2018 13:31:26 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> On 06/18/2018 11:54 AM, Greg Kurz wrote:
>>> On Mon, 18 Jun 2018 10:56:55 +0200
>>> Cédric Le Goater <clg@kaod.org> wrote:
>>>   
>>>> [ ... ]
>>>>  
>>>>>>> This is 4 irqs per PHB, hence up to 32 PHBs. Cool, we're currently
>>>>>>> limited to 31 PHBs.
>>>>>>>       
>>>>>>>> +#define SPAPR_IRQ_MSI        0x1100  /* Offset of the dynamic range covered      
>>>>>>>
>>>>>>> We only support dynamic MSIs with PCI, maybe rename to SPAPR_IRQ_PCI_MSI ?      
>>>>>>
>>>>>> hmm, no. We could have CAPI devices there. remember ? ;)
>>>>>>    
>>>>>
>>>>> Well... OpenCAPI devices are exposed to the OS as PCI devices, so I'm not
>>>>> sure we need a CAPI specific range.    
>>>>
>>>> yes. so this range is common to all devices doing dynamic allocation
>>>> of IRQs. How should we call it ? 
>>>>  
>>>>>>>> +                                      * by the bitmap allocator */      
>>>>>>>
>>>>>>> The range size is hence 1k (XICS_IRQS_SPAPR) for the time being.      
>>>>>>
>>>>>> in fact we could this bogus limit and use spapr->irq_map_nr now.
>>>>>>    
>>>>>
>>>>> "we could *missing verb* this bogus limit"... so I'm not sure to
>>>>> understand...    
>>>>
>>>> oups. We could use spapr->irq_map_nr instead of XICS_IRQS_SPAPR when
>>>> defining : 
>>>>
>>>>     _FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pe-total-#msi", XICS_IRQS_SPAPR));
>>>>
>>>> in spapr_pci.c
>>>>  
>>>
>>> Ah... yeah, I've always found weird that all PHBs advertise the same number
>>> of available MSIs as the total number of irqs for the whole machine. And
>>> putting spapr->irq_map_nr looks weird all the same if all PHBs rely on the
>>> same bitmap actually.
>>>
>>> I'm thinking of doing the other way around: keep XICS_IRQS_SPAPR in
>>> "ibm,pe-total-#msi" and have one XICS_IRQS_SPAPR sized bitmap per PHB.  
>>
>> That could be the place where to put the msi allocator.
>>
> 
> Are you suggesting to move @irq_map and @irq_map_nr to sPAPRPHBState and
> to have these setup during PHB realize ? I guess we could do that.

Hmm, I think it's better to keep it under the machine. We only need 
to define an IRQ number "quantity" per phb and not a specific IRQ 
number range. The claimed numbers can be machine wide numbers.  

C.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-06-18 17:40 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-15 11:53 [Qemu-devel] [PATCH 0/3] introduce a fixed IRQ number space Cédric Le Goater
2018-06-15 11:53 ` [Qemu-devel] [PATCH 1/3] spapr: split the IRQ allocation sequence Cédric Le Goater
2018-06-15 13:14   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2018-06-15 17:21     ` Cédric Le Goater
2018-06-18  6:52       ` Greg Kurz
2018-06-18  4:00   ` [Qemu-devel] " David Gibson
2018-06-18 12:16     ` David Gibson
2018-06-18 16:15       ` Cédric Le Goater
2018-06-15 11:53 ` [Qemu-devel] [PATCH 2/3] spapr: remove unused spapr_irq routines Cédric Le Goater
2018-06-15 13:30   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2018-06-18  4:00   ` [Qemu-devel] " David Gibson
2018-06-15 11:53 ` [Qemu-devel] [PATCH 3/3] spapr: introduce a fixed IRQ number space Cédric Le Goater
2018-06-15 16:03   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2018-06-15 17:18     ` Cédric Le Goater
2018-06-18  8:42       ` Greg Kurz
2018-06-18  8:56         ` Cédric Le Goater
2018-06-18  9:54           ` Greg Kurz
2018-06-18 11:31             ` Cédric Le Goater
2018-06-18 12:46               ` Greg Kurz
2018-06-18 17:40                 ` Cédric Le Goater
2018-06-18 12:30     ` David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.