* [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug
@ 2019-02-12 18:23 Greg Kurz
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq Greg Kurz
                   ` (14 more replies)
  0 siblings, 15 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:23 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

This series allows PHBs to be hotplugged and unplugged. I successfully tested:
- using in-kernel XICS, emulated XICS and XIVE
- hotplug/unplug with e1000 device to validate LSIs
- hotplug/unplug with virtio-net device to validate MSIs
- some simple migration scenarios

Please comment.

Changes in v4:

- added an LSI bitmap to XICS
- no longer need compat property in XICS
- simplified the patches to access the name and the phandle of the
  interrupt controller
- delay the creation of the PHB drc->fdt to RTAS ibm,configure-connector

Changes in v3:
- reworked phandle related code some more
- disentangle allocation/"type setting" of interrupts
- identify LSIs at machine init

Changes in v2:
- rebased on current ppc-for-4.0
- added some preliminary cleanup
- call unrealize from realize error path
- advertise PHB hotplug in last patch
- reworked phandle related code
- sync LSIs to KVM

--
Greg

---

Greg Kurz (8):
      spapr_irq: Add an @xics_offset field to sPAPRIrq
      xive: Only set source type for LSIs
      spapr_irq: Set LSIs at interrupt controller init
      spapr: Expose the name of the interrupt controller node
      spapr_irq: Expose the phandle of the interrupt controller
      spapr_pci: add PHB unrealize
      spapr_drc: Allow FDT fragment to be added later
      spapr: add hotplug hooks for PHB hotplug

Michael Roth (6):
      spapr: create DR connectors for PHBs
      spapr_events: add support for phb hotplug events
      qdev: pass an Object * to qbus_set_hotplug_handler()
      spapr_pci: provide node start offset via spapr_populate_pci_dt()
      spapr_pci: add ibm,my-drc-index property for PHB hotplug
      spapr: enable PHB hotplug for default pseries machine type

Nathan Fontenot (1):
      spapr: populate PHB DRC entries for root DT node


 hw/acpi/pcihp.c               |    2 -
 hw/acpi/piix4.c               |    2 -
 hw/char/virtio-serial-bus.c   |    2 -
 hw/core/bus.c                 |   11 +--
 hw/intc/spapr_xive.c          |   17 +----
 hw/intc/xics.c                |   18 ++---
 hw/intc/xics_kvm.c            |    2 -
 hw/intc/xics_spapr.c          |    2 -
 hw/pci/pcie.c                 |    2 -
 hw/pci/shpc.c                 |    2 -
 hw/ppc/pnv_psi.c              |    3 +
 hw/ppc/spapr.c                |  146 ++++++++++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_drc.c            |   53 +++++++++++++--
 hw/ppc/spapr_events.c         |    7 +-
 hw/ppc/spapr_irq.c            |  122 ++++++++++++++++++++++++++++------
 hw/ppc/spapr_pci.c            |  113 +++++++++++++++++++++++++-------
 hw/ppc/spapr_vio.c            |    2 -
 hw/s390x/css-bridge.c         |    2 -
 hw/s390x/s390-pci-bus.c       |    6 +-
 hw/scsi/virtio-scsi.c         |    2 -
 hw/scsi/vmw_pvscsi.c          |    2 -
 hw/usb/dev-smartcard-reader.c |    2 -
 include/hw/pci-host/spapr.h   |    7 ++
 include/hw/ppc/spapr.h        |    6 +-
 include/hw/ppc/spapr_drc.h    |   14 ++++
 include/hw/ppc/spapr_irq.h    |    9 ++-
 include/hw/ppc/spapr_xive.h   |    5 +
 include/hw/ppc/xics.h         |    4 +
 include/hw/ppc/xics_spapr.h   |    2 +
 include/hw/ppc/xive.h         |    7 +-
 include/hw/qdev-core.h        |    3 -
 31 files changed, 461 insertions(+), 116 deletions(-)


* [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-12 20:07   ` Cédric Le Goater
  2019-02-13  3:26   ` David Gibson
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 02/15] xive: Only set source type for LSIs Greg Kurz
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

Only pseries machines, either recent ones started with ic-mode=xics
or older ones using the legacy irq allocation scheme, need to set the
@offset of the ICS to XICS_IRQ_BASE. Recent pseries started with
ic-mode=dual set it to 0 and powernv machines set it to some other
value at runtime.

It thus doesn't really help to set the default value of the ICS offset
to XICS_IRQ_BASE in ics_base_instance_init().

Drop that code from XICS and let the pseries code set the offset
explicitly for clarity.
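
For illustration only, the idea can be sketched outside of QEMU as follows.
This is a standalone model, not the patched code: the IrqBackend type, the
SKETCH_XICS_IRQ_BASE constant, the irq_to_srcno() helper and the nr_irqs
values are hypothetical names and numbers loosely modeled on sPAPRIrq. Each
backend description carries its own offset, and the ICS source number is the
global IRQ number minus that offset.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for XICS_IRQ_BASE; the comment moved by this patch gives 0x1000. */
#define SKETCH_XICS_IRQ_BASE 0x1000u

/* Hypothetical, reduced counterpart of sPAPRIrq: only what the sketch needs. */
typedef struct IrqBackend {
    uint32_t nr_irqs;       /* illustrative size of the IRQ number space */
    uint32_t xics_offset;   /* first global IRQ number served by the ICS */
} IrqBackend;

/* The offset is part of the backend description, not an ICS-wide default. */
static const IrqBackend backend_xics = {
    .nr_irqs     = 0x1000,
    .xics_offset = SKETCH_XICS_IRQ_BASE,
};

static const IrqBackend backend_dual = {
    .nr_irqs     = 0x2000,
    .xics_offset = 0,
};

/* Translate a global IRQ number into an ICS source number, -1 if out of range. */
static int irq_to_srcno(const IrqBackend *be, uint32_t irq)
{
    assert(be);
    if (irq < be->xics_offset || irq - be->xics_offset >= be->nr_irqs) {
        return -1;
    }
    return (int)(irq - be->xics_offset);
}
```

With these assumed values, IRQ 0x1000 maps to source 0 for the xics backend
and to source 0x1000 for the dual backend.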

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/xics.c             |    8 --------
 hw/ppc/spapr_irq.c         |   33 ++++++++++++++++++++-------------
 include/hw/ppc/spapr_irq.h |    1 +
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 16e8ffa2aaf7..7cac138067e2 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -638,13 +638,6 @@ static void ics_base_realize(DeviceState *dev, Error **errp)
     ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
 }
 
-static void ics_base_instance_init(Object *obj)
-{
-    ICSState *ics = ICS_BASE(obj);
-
-    ics->offset = XICS_IRQ_BASE;
-}
-
 static int ics_base_dispatch_pre_save(void *opaque)
 {
     ICSState *ics = opaque;
@@ -720,7 +713,6 @@ static const TypeInfo ics_base_info = {
     .parent = TYPE_DEVICE,
     .abstract = true,
     .instance_size = sizeof(ICSState),
-    .instance_init = ics_base_instance_init,
     .class_init = ics_base_class_init,
     .class_size = sizeof(ICSStateClass),
 };
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 80b0083b8e38..8217e0215411 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -68,10 +68,11 @@ void spapr_irq_msi_reset(sPAPRMachineState *spapr)
 
 static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
                                   const char *type_ics,
-                                  int nr_irqs, Error **errp)
+                                  int nr_irqs, int offset, Error **errp)
 {
     Error *local_err = NULL;
     Object *obj;
+    ICSState *ics;
 
     obj = object_new(type_ics);
     object_property_add_child(OBJECT(spapr), "ics", obj, &error_abort);
@@ -86,7 +87,10 @@ static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
         goto error;
     }
 
-    return ICS_BASE(obj);
+    ics = ICS_BASE(obj);
+    ics->offset = offset;
+
+    return ics;
 
 error:
     error_propagate(errp, local_err);
@@ -104,6 +108,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
             !xics_kvm_init(spapr, &local_err)) {
             spapr->icp_type = TYPE_KVM_ICP;
             spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs,
+                                          spapr->irq->xics_offset,
                                           &local_err);
         }
         if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
@@ -119,6 +124,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
         xics_spapr_init(spapr);
         spapr->icp_type = TYPE_ICP;
         spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs,
+                                      spapr->irq->xics_offset,
                                       &local_err);
     }
 
@@ -246,6 +252,7 @@ sPAPRIrq spapr_irq_xics = {
     .nr_irqs     = SPAPR_IRQ_XICS_NR_IRQS,
     .nr_msis     = SPAPR_IRQ_XICS_NR_MSIS,
     .ov5         = SPAPR_OV5_XIVE_LEGACY,
+    .xics_offset = XICS_IRQ_BASE,
 
     .init        = spapr_irq_init_xics,
     .claim       = spapr_irq_claim_xics,
@@ -451,17 +458,6 @@ static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
         return;
     }
 
-    /*
-     * Align the XICS and the XIVE IRQ number space under QEMU.
-     *
-     * However, the XICS KVM device still considers that the IRQ
-     * numbers should start at XICS_IRQ_BASE (0x1000). Either we
-     * should introduce a KVM device ioctl to set the offset or ignore
-     * the lower 4K numbers when using the get/set ioctl of the XICS
-     * KVM device. The second option seems the least intrusive.
-     */
-    spapr->ics->offset = 0;
-
     spapr_irq_xive.init(spapr, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
@@ -582,6 +578,16 @@ sPAPRIrq spapr_irq_dual = {
     .nr_irqs     = SPAPR_IRQ_DUAL_NR_IRQS,
     .nr_msis     = SPAPR_IRQ_DUAL_NR_MSIS,
     .ov5         = SPAPR_OV5_XIVE_BOTH,
+    /*
+     * Align the XICS and the XIVE IRQ number space under QEMU.
+     *
+     * However, the XICS KVM device still considers that the IRQ
+     * numbers should start at XICS_IRQ_BASE (0x1000). Either we
+     * should introduce a KVM device ioctl to set the offset or ignore
+     * the lower 4K numbers when using the get/set ioctl of the XICS
+     * KVM device. The second option seems the least intrusive.
+     */
+    .xics_offset = 0,
 
     .init        = spapr_irq_init_dual,
     .claim       = spapr_irq_claim_dual,
@@ -712,6 +718,7 @@ sPAPRIrq spapr_irq_xics_legacy = {
     .nr_irqs     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
     .nr_msis     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
     .ov5         = SPAPR_OV5_XIVE_LEGACY,
+    .xics_offset = XICS_IRQ_BASE,
 
     .init        = spapr_irq_init_xics,
     .claim       = spapr_irq_claim_xics,
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 14b02c3aca33..5e30858dc22a 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -34,6 +34,7 @@ typedef struct sPAPRIrq {
     uint32_t    nr_irqs;
     uint32_t    nr_msis;
     uint8_t     ov5;
+    uint32_t    xics_offset;
 
     void (*init)(sPAPRMachineState *spapr, Error **errp);
     int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);


* [Qemu-devel] [PATCH v4 02/15] xive: Only set source type for LSIs
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-13  3:27   ` David Gibson
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 03/15] spapr_irq: Set LSIs at interrupt controller init Greg Kurz
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

MSI is the default and LSI specific code is guarded by the
xive_source_irq_is_lsi() helper. The xive_source_irq_set()
helper is a nop for MSIs.

Simplify the code by turning xive_source_irq_set() into
xive_source_irq_set_lsi() and only call it for LSIs. The
call to xive_source_irq_set(false) in spapr_xive_irq_free()
is also a nop. Just drop it.
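
A minimal standalone sketch of the resulting pattern, assuming a fixed-size
bitmap instead of QEMU's bitmap helpers (SketchSource, sketch_irq_is_lsi()
and sketch_irq_set_lsi() are hypothetical names, not the XIVE API): sources
are MSIs unless their bit is set, so only LSI users ever touch the map.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_IRQS 64u
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical, reduced stand-in for a XIVE source: just the LSI bitmap. */
typedef struct SketchSource {
    uint32_t nr_irqs;
    unsigned long lsi_map[(SKETCH_NR_IRQS + BITS_PER_WORD - 1) / BITS_PER_WORD];
} SketchSource;

/* A clear bit means MSI, which is the default for a zeroed map. */
static bool sketch_irq_is_lsi(const SketchSource *src, uint32_t srcno)
{
    assert(srcno < src->nr_irqs);
    return (src->lsi_map[srcno / BITS_PER_WORD] >> (srcno % BITS_PER_WORD)) & 1ul;
}

/* Only called for LSIs; there is no boolean argument to get wrong. */
static void sketch_irq_set_lsi(SketchSource *src, uint32_t srcno)
{
    assert(srcno < src->nr_irqs);
    src->lsi_map[srcno / BITS_PER_WORD] |= 1ul << (srcno % BITS_PER_WORD);
}
```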

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
---
 hw/intc/spapr_xive.c  |    7 +++----
 include/hw/ppc/xive.h |    7 ++-----
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index a0f5ff929447..290a290e43a5 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -489,20 +489,19 @@ bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi)
     }
 
     xive->eat[lisn].w |= cpu_to_be64(EAS_VALID);
-    xive_source_irq_set(xsrc, lisn, lsi);
+    if (lsi) {
+        xive_source_irq_set_lsi(xsrc, lisn);
+    }
     return true;
 }
 
 bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn)
 {
-    XiveSource *xsrc = &xive->source;
-
     if (lisn >= xive->nr_irqs) {
         return false;
     }
 
     xive->eat[lisn].w &= cpu_to_be64(~EAS_VALID);
-    xive_source_irq_set(xsrc, lisn, false);
     return true;
 }
 
diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index ec3bb2aae45a..13a487527b11 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -283,13 +283,10 @@ static inline bool xive_source_irq_is_lsi(XiveSource *xsrc, uint32_t srcno)
     return test_bit(srcno, xsrc->lsi_map);
 }
 
-static inline void xive_source_irq_set(XiveSource *xsrc, uint32_t srcno,
-                                       bool lsi)
+static inline void xive_source_irq_set_lsi(XiveSource *xsrc, uint32_t srcno)
 {
     assert(srcno < xsrc->nr_irqs);
-    if (lsi) {
-        bitmap_set(xsrc->lsi_map, srcno, 1);
-    }
+    bitmap_set(xsrc->lsi_map, srcno, 1);
 }
 
 void xive_source_set_irq(void *opaque, int srcno, int val);


* [Qemu-devel] [PATCH v4 03/15] spapr_irq: Set LSIs at interrupt controller init
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq Greg Kurz
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 02/15] xive: Only set source type for LSIs Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-12 20:17   ` Cédric Le Goater
  2019-02-13  3:48   ` David Gibson
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 04/15] spapr: Expose the name of the interrupt controller node Greg Kurz
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

The pseries machine only uses LSIs to support legacy PCI devices. Every
PHB claims 4 LSIs at realize time. When using in-kernel XICS (or upcoming
in-kernel XIVE), QEMU synchronizes the state of all irqs, including these
LSIs, later on at machine reset.

In order to support PHB hotplug, we need a way to tell KVM about the LSIs
that doesn't require a machine reset.

Since recent machine types allocate all these LSIs in a fixed range for
the machine lifetime, identify them when initializing the interrupt
controller, long before they get passed to KVM.

In order to do that, first disentangle interrupt typing and allocation.
Since the vast majority of interrupts are MSIs, make that the default
and have only the LSI users explicitly set the type.

It is rather straightforward for XIVE. XICS needs some extra care
though: allocation state and type are mixed up in the same bits of the
flags field within the interrupt state. Setting the LSI bit there at
init time would mean the interrupt is de facto allocated, even if no
device asked for it. Introduce a bitmap to track LSIs at the ICS level.
In order to keep the patch minimal, the bitmap is only used when writing
the source state to KVM and when the interrupt is claimed, so that the
code that checks the interrupt type through the flags stays untouched.

With older pseries machines using the XICS legacy IRQ allocation scheme,
all interrupt numbers come from a common pool and there's no such thing
as a fixed range for LSIs. Introduce a helper so that these older
machine types can continue to set the type when allocating the LSI.
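
The split between typing and allocation for XICS can be sketched as below.
This is a simplified standalone model under stated assumptions, not the code
from this patch: SketchIcs, the SK_FLAG_* values and the sk_*() helpers are
hypothetical, and a single 32-bit word stands in for the real bitmap. Marking
a source as an LSI leaves it free; the type only lands in the flags when the
source is claimed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_FLAG_MSI  0x1u
#define SK_FLAG_LSI  0x2u
#define SK_FLAG_MASK (SK_FLAG_MSI | SK_FLAG_LSI)
#define SK_NR_IRQS   32u

/* Hypothetical, reduced stand-in for an ICS. */
typedef struct SketchIcs {
    uint8_t  flags[SK_NR_IRQS]; /* non-zero type bits mean "allocated" */
    uint32_t lsi_map;           /* one bit per source, set at init time */
} SketchIcs;

/* Typing: record that a source is level-sensitive without allocating it. */
static void sk_set_lsi(SketchIcs *ics, unsigned srcno)
{
    assert(srcno < SK_NR_IRQS);
    ics->lsi_map |= UINT32_C(1) << srcno;
}

static bool sk_is_free(const SketchIcs *ics, unsigned srcno)
{
    assert(srcno < SK_NR_IRQS);
    return !(ics->flags[srcno] & SK_FLAG_MASK);
}

/* Allocation: the type is looked up in the bitmap, MSI being the default. */
static void sk_claim(SketchIcs *ics, unsigned srcno)
{
    assert(sk_is_free(ics, srcno));
    ics->flags[srcno] |=
        ((ics->lsi_map >> srcno) & 1u) ? SK_FLAG_LSI : SK_FLAG_MSI;
}
```

Code that inspects the flags of a claimed source keeps working unchanged in
this model, which mirrors the stated goal of leaving the flag-based type
checks untouched.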

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/intc/spapr_xive.c        |    7 +------
 hw/intc/xics.c              |   10 ++++++++--
 hw/intc/xics_kvm.c          |    2 +-
 hw/ppc/pnv_psi.c            |    3 ++-
 hw/ppc/spapr_events.c       |    4 ++--
 hw/ppc/spapr_irq.c          |   42 ++++++++++++++++++++++++++++++++----------
 hw/ppc/spapr_pci.c          |    6 ++++--
 hw/ppc/spapr_vio.c          |    2 +-
 include/hw/ppc/spapr_irq.h  |    5 +++--
 include/hw/ppc/spapr_xive.h |    2 +-
 include/hw/ppc/xics.h       |    4 +++-
 11 files changed, 58 insertions(+), 29 deletions(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 290a290e43a5..815263ca72ab 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -480,18 +480,13 @@ static void spapr_xive_register_types(void)
 
 type_init(spapr_xive_register_types)
 
-bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi)
+bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn)
 {
-    XiveSource *xsrc = &xive->source;
-
     if (lisn >= xive->nr_irqs) {
         return false;
     }
 
     xive->eat[lisn].w |= cpu_to_be64(EAS_VALID);
-    if (lsi) {
-        xive_source_irq_set_lsi(xsrc, lisn);
-    }
     return true;
 }
 
diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 7cac138067e2..26e8940d7329 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -636,6 +636,7 @@ static void ics_base_realize(DeviceState *dev, Error **errp)
         return;
     }
     ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
+    ics->lsi_map = bitmap_new(ics->nr_irqs);
 }
 
 static int ics_base_dispatch_pre_save(void *opaque)
@@ -733,12 +734,17 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
     return xic->icp_get(xi, server);
 }
 
-void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
+void ics_set_lsi(ICSState *ics, int srcno)
+{
+    set_bit(srcno, ics->lsi_map);
+}
+
+void ics_claim_irq(ICSState *ics, int srcno)
 {
     assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
 
     ics->irqs[srcno].flags |=
-        lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
+        test_bit(srcno, ics->lsi_map) ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
 }
 
 static void xics_register_types(void)
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index dff13300504c..e63979abc7fc 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -271,7 +271,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
             state |= KVM_XICS_MASKED;
         }
 
-        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
+        if (test_bit(i, ics->lsi_map)) {
             state |= KVM_XICS_LEVEL_SENSITIVE;
             if (irq->status & XICS_STATUS_ASSERTED) {
                 state |= KVM_XICS_PENDING;
diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index 8ced09506321..e6089e1035c0 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -487,7 +487,8 @@ static void pnv_psi_realize(DeviceState *dev, Error **errp)
     }
 
     for (i = 0; i < ics->nr_irqs; i++) {
-        ics_set_irq_type(ics, i, true);
+        ics_set_lsi(ics, i);
+        ics_claim_irq(ics, i);
     }
 
     psi->qirqs = qemu_allocate_irqs(ics_simple_set_irq, ics, ics->nr_irqs);
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index b9c7ecb9e987..559026d0981c 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -713,7 +713,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
         epow_irq = spapr_irq_findone(spapr, &error_fatal);
     }
 
-    spapr_irq_claim(spapr, epow_irq, false, &error_fatal);
+    spapr_irq_claim(spapr, epow_irq, &error_fatal);
 
     QTAILQ_INIT(&spapr->pending_events);
 
@@ -737,7 +737,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
             hp_irq = spapr_irq_findone(spapr, &error_fatal);
         }
 
-        spapr_irq_claim(spapr, hp_irq, false, &error_fatal);
+        spapr_irq_claim(spapr, hp_irq, &error_fatal);
 
         spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
                                      hp_irq);
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 8217e0215411..3fc34d7c8a43 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -16,10 +16,13 @@
 #include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xics.h"
 #include "hw/ppc/xics_spapr.h"
+#include "hw/pci-host/spapr.h"
 #include "sysemu/kvm.h"
 
 #include "trace.h"
 
+#define SPAPR_IRQ_PCI_LSI_NR     (SPAPR_MAX_PHBS * PCI_NUM_PINS)
+
 void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_msis)
 {
     spapr->irq_map_nr = nr_msis;
@@ -102,6 +105,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
     MachineState *machine = MACHINE(spapr);
     int nr_irqs = spapr->irq->nr_irqs;
     Error *local_err = NULL;
+    int i;
 
     if (kvm_enabled()) {
         if (machine_kernel_irqchip_allowed(machine) &&
@@ -128,6 +132,14 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
                                       &local_err);
     }
 
+    /* Identify the PCI LSIs */
+    if (!SPAPR_MACHINE_GET_CLASS(spapr)->legacy_irq_allocation) {
+        for (i = 0; i < SPAPR_IRQ_PCI_LSI_NR; ++i) {
+            ics_set_lsi(spapr->ics,
+                        i + SPAPR_IRQ_PCI_LSI - spapr->irq->xics_offset);
+        }
+    }
+
 error:
     error_propagate(errp, local_err);
 }
@@ -135,7 +147,7 @@ error:
 #define ICS_IRQ_FREE(ics, srcno)   \
     (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
 
-static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq, bool lsi,
+static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq,
                                 Error **errp)
 {
     ICSState *ics = spapr->ics;
@@ -152,7 +164,7 @@ static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq, bool lsi,
         return -1;
     }
 
-    ics_set_irq_type(ics, irq - ics->offset, lsi);
+    ics_claim_irq(ics, irq - ics->offset);
     return 0;
 }
 
@@ -296,16 +308,21 @@ static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
 
     /* Enable the CPU IPIs */
     for (i = 0; i < nr_servers; ++i) {
-        spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false);
+        spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i);
+    }
+
+    /* Identify the PCI LSIs */
+    for (i = 0; i < SPAPR_IRQ_PCI_LSI_NR; ++i) {
+        xive_source_irq_set_lsi(&spapr->xive->source, SPAPR_IRQ_PCI_LSI + i);
     }
 
     spapr_xive_hcall_init(spapr);
 }
 
-static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq, bool lsi,
+static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq,
                                 Error **errp)
 {
-    if (!spapr_xive_irq_claim(spapr->xive, irq, lsi)) {
+    if (!spapr_xive_irq_claim(spapr->xive, irq)) {
         error_setg(errp, "IRQ %d is invalid", irq);
         return -1;
     }
@@ -465,19 +482,19 @@ static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
     }
 }
 
-static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq, bool lsi,
+static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq,
                                 Error **errp)
 {
     Error *local_err = NULL;
     int ret;
 
-    ret = spapr_irq_xics.claim(spapr, irq, lsi, &local_err);
+    ret = spapr_irq_xics.claim(spapr, irq, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return ret;
     }
 
-    ret = spapr_irq_xive.claim(spapr, irq, lsi, &local_err);
+    ret = spapr_irq_xive.claim(spapr, irq, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return ret;
@@ -630,9 +647,9 @@ void spapr_irq_init(sPAPRMachineState *spapr, Error **errp)
                                       spapr->irq->nr_irqs);
 }
 
-int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp)
+int spapr_irq_claim(sPAPRMachineState *spapr, int irq, Error **errp)
 {
-    return spapr->irq->claim(spapr, irq, lsi, errp);
+    return spapr->irq->claim(spapr, irq, errp);
 }
 
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
@@ -712,6 +729,11 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
     return first + ics->offset;
 }
 
+void spapr_irq_set_lsi_legacy(sPAPRMachineState *spapr, int irq)
+{
+    ics_set_lsi(spapr->ics, irq - spapr->irq->xics_offset);
+}
+
 #define SPAPR_IRQ_XICS_LEGACY_NR_IRQS     0x400
 
 sPAPRIrq spapr_irq_xics_legacy = {
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index c3fb0ac884b0..d68595531d5a 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -391,7 +391,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
     }
 
     for (i = 0; i < req_num; i++) {
-        spapr_irq_claim(spapr, irq + i, false, &err);
+        spapr_irq_claim(spapr, irq + i, &err);
         if (err) {
             if (i) {
                 spapr_irq_free(spapr, irq, i);
@@ -1742,9 +1742,11 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
                                         "can't allocate LSIs: ");
                 return;
             }
+
+            spapr_irq_set_lsi_legacy(spapr, irq);
         }
 
-        spapr_irq_claim(spapr, irq, true, &local_err);
+        spapr_irq_claim(spapr, irq, &local_err);
         if (local_err) {
             error_propagate_prepend(errp, local_err, "can't allocate LSIs: ");
             return;
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index 2b7e7ecac57f..b1beefc24be5 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -512,7 +512,7 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
         }
     }
 
-    spapr_irq_claim(spapr, dev->irq, false, &local_err);
+    spapr_irq_claim(spapr, dev->irq, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 5e30858dc22a..0e6c65d55430 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -37,7 +37,7 @@ typedef struct sPAPRIrq {
     uint32_t    xics_offset;
 
     void (*init)(sPAPRMachineState *spapr, Error **errp);
-    int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
+    int (*claim)(sPAPRMachineState *spapr, int irq, Error **errp);
     void (*free)(sPAPRMachineState *spapr, int irq, int num);
     qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
     void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
@@ -56,7 +56,7 @@ extern sPAPRIrq spapr_irq_xive;
 extern sPAPRIrq spapr_irq_dual;
 
 void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
-int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
+int spapr_irq_claim(sPAPRMachineState *spapr, int irq, Error **errp);
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
 int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
@@ -67,5 +67,6 @@ void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
  */
 int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp);
 #define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
+void spapr_irq_set_lsi_legacy(sPAPRMachineState *spapr, int irq);
 
 #endif
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 9bec9192e4a0..885ca169cb29 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -37,7 +37,7 @@ typedef struct sPAPRXive {
     MemoryRegion  tm_mmio;
 } sPAPRXive;
 
-bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi);
+bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn);
 bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
 void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
 
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index fad786e8b22d..18b083fe2aec 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -133,6 +133,7 @@ struct ICSState {
     uint32_t offset;
     ICSIRQState *irqs;
     XICSFabric *xics;
+    unsigned long *lsi_map;
 };
 
 #define ICS_PROP_XICS "xics"
@@ -193,7 +194,8 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
 void ics_simple_set_irq(void *opaque, int srcno, int val);
 void ics_kvm_set_irq(void *opaque, int srcno, int val);
 
-void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
+void ics_set_lsi(ICSState *ics, int srcno);
+void ics_claim_irq(ICSState *ics, int srcno);
 void icp_pic_print_info(ICPState *icp, Monitor *mon);
 void ics_pic_print_info(ICSState *ics, Monitor *mon);
 


* [Qemu-devel] [PATCH v4 04/15] spapr: Expose the name of the interrupt controller node
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (2 preceding siblings ...)
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 03/15] spapr_irq: Set LSIs at interrupt controller init Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-13  3:50   ` David Gibson
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 05/15] spapr_irq: Expose the phandle of the interrupt controller Greg Kurz
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

This will be needed by PHB hotplug in order to access the "phandle"
property of the interrupt controller node.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
v4: - folded some changes from patches 15, 16 and 17 of v3
    - dropped useless helpers
---
 hw/intc/spapr_xive.c        |    9 ++++-----
 hw/intc/xics_spapr.c        |    2 +-
 hw/ppc/spapr_irq.c          |   21 ++++++++++++++++++++-
 include/hw/ppc/spapr_irq.h  |    1 +
 include/hw/ppc/spapr_xive.h |    3 +++
 include/hw/ppc/xics_spapr.h |    2 ++
 6 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 815263ca72ab..f14e436ad4b9 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -317,6 +317,9 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
     /* Map all regions */
     spapr_xive_map_mmio(xive);
 
+    xive->nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
+                           xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
+
     qemu_register_reset(spapr_xive_reset, dev);
 }
 
@@ -1443,7 +1446,6 @@ void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
         cpu_to_be32(7),    /* start */
         cpu_to_be32(0xf8), /* count */
     };
-    gchar *nodename;
 
     /* Thread Interrupt Management Area : User (ring 3) and OS (ring 2) */
     timas[0] = cpu_to_be64(xive->tm_base +
@@ -1453,10 +1455,7 @@ void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
                            XIVE_TM_OS_PAGE * (1ull << TM_SHIFT));
     timas[3] = cpu_to_be64(1ull << TM_SHIFT);
 
-    nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
-                           xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
-    _FDT(node = fdt_add_subnode(fdt, 0, nodename));
-    g_free(nodename);
+    _FDT(node = fdt_add_subnode(fdt, 0, xive->nodename));
 
     _FDT(fdt_setprop_string(fdt, node, "device_type", "power-ivpe"));
     _FDT(fdt_setprop(fdt, node, "reg", timas, sizeof(timas)));
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index e2d8b3818336..53bda6661b2a 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -254,7 +254,7 @@ void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
     };
     int node;
 
-    _FDT(node = fdt_add_subnode(fdt, 0, "interrupt-controller"));
+    _FDT(node = fdt_add_subnode(fdt, 0, XICS_NODENAME));
 
     _FDT(fdt_setprop_string(fdt, node, "device_type",
                             "PowerPC-External-Interrupt-Presentation"));
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 3fc34d7c8a43..b8d725e251ba 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -256,6 +256,11 @@ static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
     /* TODO: create the KVM XICS device */
 }
 
+static const char *spapr_irq_get_nodename_xics(sPAPRMachineState *spapr)
+{
+    return XICS_NODENAME;
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS     \
     (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -276,6 +281,7 @@ sPAPRIrq spapr_irq_xics = {
     .post_load   = spapr_irq_post_load_xics,
     .reset       = spapr_irq_reset_xics,
     .set_irq     = spapr_irq_set_irq_xics,
+    .get_nodename = spapr_irq_get_nodename_xics,
 };
 
 /*
@@ -415,6 +421,11 @@ static void spapr_irq_set_irq_xive(void *opaque, int srcno, int val)
     xive_source_set_irq(&spapr->xive->source, srcno, val);
 }
 
+static const char *spapr_irq_get_nodename_xive(sPAPRMachineState *spapr)
+{
+    return spapr->xive->nodename;
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 8K to be compatible
  * with XICS.
@@ -438,6 +449,7 @@ sPAPRIrq spapr_irq_xive = {
     .post_load   = spapr_irq_post_load_xive,
     .reset       = spapr_irq_reset_xive,
     .set_irq     = spapr_irq_set_irq_xive,
+    .get_nodename = spapr_irq_get_nodename_xive,
 };
 
 /*
@@ -585,6 +597,11 @@ static void spapr_irq_set_irq_dual(void *opaque, int srcno, int val)
     spapr_irq_current(spapr)->set_irq(spapr, srcno, val);
 }
 
+static const char *spapr_irq_get_nodename_dual(sPAPRMachineState *spapr)
+{
+    return spapr_irq_current(spapr)->get_nodename(spapr);
+}
+
 /*
  * Define values in sync with the XIVE and XICS backend
  */
@@ -615,7 +632,8 @@ sPAPRIrq spapr_irq_dual = {
     .cpu_intc_create = spapr_irq_cpu_intc_create_dual,
     .post_load   = spapr_irq_post_load_dual,
     .reset       = spapr_irq_reset_dual,
-    .set_irq     = spapr_irq_set_irq_dual
+    .set_irq     = spapr_irq_set_irq_dual,
+    .get_nodename = spapr_irq_get_nodename_dual,
 };
 
 /*
@@ -751,4 +769,5 @@ sPAPRIrq spapr_irq_xics_legacy = {
     .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
     .post_load   = spapr_irq_post_load_xics,
     .set_irq     = spapr_irq_set_irq_xics,
+    .get_nodename = spapr_irq_get_nodename_xics,
 };
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 0e6c65d55430..ad7127355441 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -48,6 +48,7 @@ typedef struct sPAPRIrq {
     int (*post_load)(sPAPRMachineState *spapr, int version_id);
     void (*reset)(sPAPRMachineState *spapr, Error **errp);
     void (*set_irq)(void *opaque, int srcno, int val);
+    const char *(*get_nodename)(sPAPRMachineState *spapr);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 885ca169cb29..2c57a59a3f5b 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -26,6 +26,9 @@ typedef struct sPAPRXive {
     XiveENDSource end_source;
     hwaddr        end_base;
 
+    /* DT */
+    gchar *nodename;
+
     /* Routing table */
     XiveEAS       *eat;
     uint32_t      nr_irqs;
diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
index b1ab27d022cf..b8d924baf437 100644
--- a/include/hw/ppc/xics_spapr.h
+++ b/include/hw/ppc/xics_spapr.h
@@ -29,6 +29,8 @@
 
 #include "hw/ppc/spapr.h"
 
+#define XICS_NODENAME "interrupt-controller"
+
 void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
                    uint32_t phandle);
 int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);


* [Qemu-devel] [PATCH v4 05/15] spapr_irq: Expose the phandle of the interrupt controller
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (3 preceding siblings ...)
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 04/15] spapr: Expose the name of the interrupt controller node Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-13  3:52   ` David Gibson
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 06/15] spapr_pci: add PHB unrealize Greg Kurz
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

This will be used by PHB hotplug in order to create the "interrupt-map"
property of the PHB node.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
v4: - return phandle via a pointer
---
 hw/ppc/spapr_irq.c         |   26 ++++++++++++++++++++++++++
 include/hw/ppc/spapr_irq.h |    2 ++
 2 files changed, 28 insertions(+)

diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index b8d725e251ba..31495033c37c 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -692,6 +692,32 @@ void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp)
     }
 }
 
+int spapr_irq_get_phandle(sPAPRMachineState *spapr, void *fdt,
+                          uint32_t *phandle, Error **errp)
+{
+    const char *nodename = spapr->irq->get_nodename(spapr);
+    int offset, ph;
+
+    offset = fdt_subnode_offset(fdt, 0, nodename);
+    if (offset < 0) {
+        error_setg(errp, "Can't find node \"%s\": %s", nodename,
+                   fdt_strerror(offset));
+        return -1;
+    }
+
+    ph = fdt_get_phandle(fdt, offset);
+    if (!ph) {
+        error_setg(errp, "Can't get phandle of node \"%s\"", nodename);
+        return -1;
+    }
+
+    if (phandle) {
+        *phandle = ph;
+    }
+
+    return 0;
+}
+
 /*
  * XICS legacy routines - to deprecate one day
  */
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index ad7127355441..4b3303ef4f6a 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -62,6 +62,8 @@ void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
 int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
 void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
+int spapr_irq_get_phandle(sPAPRMachineState *spapr, void *fdt,
+                          uint32_t *phandle, Error **errp);
 
 /*
  * XICS legacy routines


* [Qemu-devel] [PATCH v4 06/15] spapr_pci: add PHB unrealize
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (4 preceding siblings ...)
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 05/15] spapr_irq: Expose the phandle of the interrupt controller Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-13  3:56   ` David Gibson
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 07/15] spapr: create DR connectors for PHBs Greg Kurz
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

To support PHB hotplug we need to clean up lingering references,
memory, child properties, etc. prior to the PHB object being
finalized. Generally this will be called as a result of calling
object_unparent() on the PHB object, which in turn would normally
be called as the result of an unplug() operation.

When the PHB is finalized, child objects will be unparented in
turn, and finalized if the PHB was the only reference holder, so
we don't bother to explicitly unparent child objects of the PHB
(spapr_iommu, spapr_drc, etc).

The formula that gives the number of DMA windows is moved to an
inline function in the hw/pci-host/spapr.h header because it
will have other users.

The unrealize function is able to cope with partially realized PHBs.
It is hence used to implement proper rollback on the realize error
path.
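
As a rough illustration of the rollback idea described above (not the
actual QEMU code), here is a minimal, self-contained sketch. All names
(DemoDev, demo_realize, demo_unrealize, fail_at) are hypothetical and
exist only for this example. The point is that a teardown routine which
tolerates unset resources can double as the error path of the setup
routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_NUM_IRQS 4

/* Hypothetical stand-in for a device that owns three kinds of resources. */
typedef struct DemoDev {
    int *msi_table;            /* allocated last during realize */
    int irqs[DEMO_NUM_IRQS];   /* 0 means "not claimed" */
    bool window_mapped;        /* set first during realize */
} DemoDev;

/* Teardown that copes with a partially realized device. */
static void demo_unrealize(DemoDev *d)
{
    assert(d != NULL);

    free(d->msi_table);        /* free(NULL) is a no-op */
    d->msi_table = NULL;

    /* Release in reverse order; entries that were never claimed are 0. */
    for (int i = DEMO_NUM_IRQS - 1; i >= 0; i--) {
        d->irqs[i] = 0;
    }

    d->window_mapped = false;
}

/*
 * Setup with injected failures, for illustration only:
 * fail_at == 0: succeed, 1: fail while claiming IRQs, 2: fail at the end.
 */
static int demo_realize(DemoDev *d, int fail_at)
{
    d->window_mapped = true;

    for (int i = 0; i < DEMO_NUM_IRQS; i++) {
        if (fail_at == 1 && i == 2) {
            goto unrealize;    /* some IRQs claimed, table not allocated */
        }
        d->irqs[i] = 100 + i;
    }

    if (fail_at == 2) {
        goto unrealize;
    }

    d->msi_table = calloc(16, sizeof(*d->msi_table));
    if (!d->msi_table) {
        goto unrealize;
    }
    return 0;

unrealize:
    /* Single rollback path shared with the normal teardown. */
    demo_unrealize(d);
    return -1;
}
```

Whether every step of a real unrealize function is safe on a partially
initialized device has to be checked step by step; the sketch only shows
the shape of the pattern.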

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
v4: - reverted to v2
v3: - don't free LSIs at unrealize
v2: - implement rollback with unrealize function
---
 hw/ppc/spapr_pci.c          |   75 +++++++++++++++++++++++++++++++++++++++++--
 include/hw/pci-host/spapr.h |    5 +++
 2 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index d68595531d5a..e3781dd110b2 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1565,6 +1565,64 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
     }
 }
 
+static void spapr_phb_finalizefn(Object *obj)
+{
+    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(obj);
+
+    g_free(sphb->dtbusname);
+    sphb->dtbusname = NULL;
+}
+
+static void spapr_phb_unrealize(DeviceState *dev, Error **errp)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
+    SysBusDevice *s = SYS_BUS_DEVICE(dev);
+    PCIHostState *phb = PCI_HOST_BRIDGE(s);
+    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(phb);
+    sPAPRTCETable *tcet;
+    int i;
+    const unsigned windows_supported = spapr_phb_windows_supported(sphb);
+
+    if (sphb->msi) {
+        g_hash_table_unref(sphb->msi);
+        sphb->msi = NULL;
+    }
+
+    /*
+     * Remove IO/MMIO subregions and aliases, rest should get cleaned
+     * via PHB's unrealize->object_finalize
+     */
+    for (i = windows_supported - 1; i >= 0; i--) {
+        tcet = spapr_tce_find_by_liobn(sphb->dma_liobn[i]);
+        if (tcet) {
+            memory_region_del_subregion(&sphb->iommu_root,
+                                        spapr_tce_get_iommu(tcet));
+        }
+    }
+
+    for (i = PCI_NUM_PINS - 1; i >= 0; i--) {
+        if (sphb->lsi_table[i].irq) {
+            spapr_irq_free(spapr, sphb->lsi_table[i].irq, 1);
+            sphb->lsi_table[i].irq = 0;
+        }
+    }
+
+    QLIST_REMOVE(sphb, list);
+
+    memory_region_del_subregion(&sphb->iommu_root, &sphb->msiwindow);
+
+    address_space_destroy(&sphb->iommu_as);
+
+    qbus_set_hotplug_handler(BUS(phb->bus), NULL, &error_abort);
+    pci_unregister_root_bus(phb->bus);
+
+    memory_region_del_subregion(get_system_memory(), &sphb->iowindow);
+    if (sphb->mem64_win_pciaddr != (hwaddr)-1) {
+        memory_region_del_subregion(get_system_memory(), &sphb->mem64window);
+    }
+    memory_region_del_subregion(get_system_memory(), &sphb->mem32window);
+}
+
 static void spapr_phb_realize(DeviceState *dev, Error **errp)
 {
     /* We don't use SPAPR_MACHINE() in order to exit gracefully if the user
@@ -1582,8 +1640,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
     PCIBus *bus;
     uint64_t msi_window_size = 4096;
     sPAPRTCETable *tcet;
-    const unsigned windows_supported =
-        sphb->ddw_enabled ? SPAPR_PCI_DMA_MAX_WINDOWS : 1;
+    const unsigned windows_supported = spapr_phb_windows_supported(sphb);
 
     if (!spapr) {
         error_setg(errp, TYPE_SPAPR_PCI_HOST_BRIDGE " needs a pseries machine");
@@ -1740,6 +1797,10 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
             if (local_err) {
                 error_propagate_prepend(errp, local_err,
                                         "can't allocate LSIs: ");
+                /*
+                 * Older machines will never support PHB hotplug, ie, this is an
+                 * init only path and QEMU will terminate. No need to rollback.
+                 */
                 return;
             }
 
@@ -1749,7 +1810,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
         spapr_irq_claim(spapr, irq, &local_err);
         if (local_err) {
             error_propagate_prepend(errp, local_err, "can't allocate LSIs: ");
-            return;
+            goto unrealize;
         }
 
         sphb->lsi_table[i].irq = irq;
@@ -1769,13 +1830,17 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
         if (!tcet) {
             error_setg(errp, "Creating window#%d failed for %s",
                        i, sphb->dtbusname);
-            return;
+            goto unrealize;
         }
         memory_region_add_subregion(&sphb->iommu_root, 0,
                                     spapr_tce_get_iommu(tcet));
     }
 
     sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free);
+    return;
+
+unrealize:
+    spapr_phb_unrealize(dev, NULL);
 }
 
 static int spapr_phb_children_reset(Object *child, void *opaque)
@@ -1974,6 +2039,7 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data)
 
     hc->root_bus_path = spapr_phb_root_bus_path;
     dc->realize = spapr_phb_realize;
+    dc->unrealize = spapr_phb_unrealize;
     dc->props = spapr_phb_properties;
     dc->reset = spapr_phb_reset;
     dc->vmsd = &vmstate_spapr_pci;
@@ -1989,6 +2055,7 @@ static const TypeInfo spapr_phb_info = {
     .name          = TYPE_SPAPR_PCI_HOST_BRIDGE,
     .parent        = TYPE_PCI_HOST_BRIDGE,
     .instance_size = sizeof(sPAPRPHBState),
+    .instance_finalize = spapr_phb_finalizefn,
     .class_init    = spapr_phb_class_init,
     .interfaces    = (InterfaceInfo[]) {
         { TYPE_HOTPLUG_HANDLER },
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 51d81c4b7ce8..7cfce54a9449 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -163,4 +163,9 @@ static inline void spapr_phb_vfio_reset(DeviceState *qdev)
 
 void spapr_phb_dma_reset(sPAPRPHBState *sphb);
 
+static inline unsigned spapr_phb_windows_supported(sPAPRPHBState *sphb)
+{
+    return sphb->ddw_enabled ? SPAPR_PCI_DMA_MAX_WINDOWS : 1;
+}
+
 #endif /* PCI_HOST_SPAPR_H */


* [Qemu-devel] [PATCH v4 07/15] spapr: create DR connectors for PHBs
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (5 preceding siblings ...)
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 06/15] spapr_pci: add PHB unrealize Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 08/15] spapr: populate PHB DRC entries for root DT node Greg Kurz
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

From: Michael Roth <mdroth@linux.vnet.ibm.com>

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c             |   13 +++++++++++++
 hw/ppc/spapr_drc.c         |   17 +++++++++++++++++
 include/hw/ppc/spapr.h     |    1 +
 include/hw/ppc/spapr_drc.h |    8 ++++++++
 4 files changed, 39 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 850cfe28c414..590c67805e52 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2796,6 +2796,19 @@ static void spapr_machine_init(MachineState *machine)
     /* We always have at least the nvram device on VIO */
     spapr_create_nvram(spapr);
 
+    /*
+     * Setup hotplug / dynamic-reconfiguration connectors. top-level
+     * connectors (described in root DT node's "ibm,drc-types" property)
+     * are pre-initialized here. additional child connectors (such as
+     * connectors for a PHBs PCI slots) are added as needed during their
+     * parent's realization.
+     */
+    if (smc->dr_phb_enabled) {
+        for (i = 0; i < SPAPR_MAX_PHBS; i++) {
+            spapr_dr_connector_new(OBJECT(machine), TYPE_SPAPR_DRC_PHB, i);
+        }
+    }
+
     /* Set up PCI */
     spapr_pci_rtas_init();
 
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 2edb7d1e9c8c..189ee681062a 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -696,6 +696,15 @@ static void spapr_drc_lmb_class_init(ObjectClass *k, void *data)
     drck->release = spapr_lmb_release;
 }
 
+static void spapr_drc_phb_class_init(ObjectClass *k, void *data)
+{
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
+
+    drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_PHB;
+    drck->typename = "PHB";
+    drck->drc_name_prefix = "PHB ";
+}
+
 static const TypeInfo spapr_dr_connector_info = {
     .name          = TYPE_SPAPR_DR_CONNECTOR,
     .parent        = TYPE_DEVICE,
@@ -739,6 +748,13 @@ static const TypeInfo spapr_drc_lmb_info = {
     .class_init    = spapr_drc_lmb_class_init,
 };
 
+static const TypeInfo spapr_drc_phb_info = {
+    .name          = TYPE_SPAPR_DRC_PHB,
+    .parent        = TYPE_SPAPR_DRC_LOGICAL,
+    .instance_size = sizeof(sPAPRDRConnector),
+    .class_init    = spapr_drc_phb_class_init,
+};
+
 /* helper functions for external users */
 
 sPAPRDRConnector *spapr_drc_by_index(uint32_t index)
@@ -1189,6 +1205,7 @@ static void spapr_drc_register_types(void)
     type_register_static(&spapr_drc_cpu_info);
     type_register_static(&spapr_drc_pci_info);
     type_register_static(&spapr_drc_lmb_info);
+    type_register_static(&spapr_drc_phb_info);
 
     spapr_rtas_register(RTAS_SET_INDICATOR, "set-indicator",
                         rtas_set_indicator);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index cbd276ed2b6a..a3074e7fea37 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -104,6 +104,7 @@ struct sPAPRMachineClass {
 
     /*< public >*/
     bool dr_lmb_enabled;       /* enable dynamic-reconfig/hotplug of LMBs */
+    bool dr_phb_enabled;       /* enable dynamic-reconfig/hotplug of PHBs */
     bool update_dt_enabled;    /* enable KVMPPC_H_UPDATE_DT */
     bool use_ohci_by_default;  /* use USB-OHCI instead of XHCI */
     bool pre_2_10_has_unused_icps;
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index f6ff32e7e2f2..56bba36ad4da 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -70,6 +70,14 @@
 #define SPAPR_DRC_LMB(obj) OBJECT_CHECK(sPAPRDRConnector, (obj), \
                                         TYPE_SPAPR_DRC_LMB)
 
+#define TYPE_SPAPR_DRC_PHB "spapr-drc-phb"
+#define SPAPR_DRC_PHB_GET_CLASS(obj) \
+        OBJECT_GET_CLASS(sPAPRDRConnectorClass, obj, TYPE_SPAPR_DRC_PHB)
+#define SPAPR_DRC_PHB_CLASS(klass) \
+        OBJECT_CLASS_CHECK(sPAPRDRConnectorClass, klass, TYPE_SPAPR_DRC_PHB)
+#define SPAPR_DRC_PHB(obj) OBJECT_CHECK(sPAPRDRConnector, (obj), \
+                                        TYPE_SPAPR_DRC_PHB)
+
 /*
  * Various hotplug types managed by sPAPRDRConnector
  *


* [Qemu-devel] [PATCH v4 08/15] spapr: populate PHB DRC entries for root DT node
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (6 preceding siblings ...)
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 07/15] spapr: create DR connectors for PHBs Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 09/15] spapr_events: add support for phb hotplug events Greg Kurz
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

From: Nathan Fontenot <nfont@linux.vnet.ibm.com>

This adds entries to the root OF node to advertise our PHBs as being
DR-capable in accordance with the PAPR specification.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 590c67805e52..03183c52f57b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1348,6 +1348,14 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr)
         exit(1);
     }
 
+    if (smc->dr_phb_enabled) {
+        ret = spapr_drc_populate_dt(fdt, 0, NULL, SPAPR_DR_CONNECTOR_TYPE_PHB);
+        if (ret < 0) {
+            error_report("Couldn't set up PHB DR device tree properties");
+            exit(1);
+        }
+    }
+
     return fdt;
 }
 


* [Qemu-devel] [PATCH v4 09/15] spapr_events: add support for phb hotplug events
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (7 preceding siblings ...)
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 08/15] spapr: populate PHB DRC entries for root DT node Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 10/15] qdev: pass an Object * to qbus_set_hotplug_handler() Greg Kurz
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

From: Michael Roth <mdroth@linux.vnet.ibm.com>

Extend the existing EPOW event format we use for PCI
devices to emit PHB plug/unplug events.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr_events.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 559026d0981c..6d5a925d03cb 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -526,6 +526,9 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
     case SPAPR_DR_CONNECTOR_TYPE_CPU:
         hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_CPU;
         break;
+    case SPAPR_DR_CONNECTOR_TYPE_PHB:
+        hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PHB;
+        break;
     default:
         /* we shouldn't be signaling hotplug events for resources
          * that don't support them


* [Qemu-devel] [PATCH v4 10/15] qdev: pass an Object * to qbus_set_hotplug_handler()
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (8 preceding siblings ...)
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 09/15] spapr_events: add support for phb hotplug events Greg Kurz
@ 2019-02-12 18:24 ` Greg Kurz
  2019-02-13  3:59   ` David Gibson
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 11/15] spapr_pci: provide node start offset via spapr_populate_pci_dt() Greg Kurz
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

From: Michael Roth <mdroth@linux.vnet.ibm.com>

Certain device types, like memory/CPU, are now being handled using a
hotplug interface provided by a top-level MachineClass. Hotpluggable
host bridges are another such device where it makes sense to use a
machine-level hotplug handler. However, unlike those devices,
host-bridges have a parent bus (the main system bus), and devices with
a parent bus use a different mechanism for registering their hotplug
handlers: qbus_set_hotplug_handler(). This interface currently expects
a handler to be a subclass of DeviceClass, but this is not the case
for MachineClass, which derives directly from ObjectClass.

Internally, the interface only requires an ObjectClass, so expose that
in qbus_set_hotplug_handler().
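
To make the type argument above concrete, here is a small standalone
sketch. It does not use QOM; DemoObject, DemoDevice, DemoMachine,
DemoBus and demo_bus_set_hotplug_handler() are hypothetical names made
up for this example. It only shows why a setter that takes the common
base type can accept both a device and a machine, whereas one typed on
the device struct could not take the machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical common base type, playing the role of Object. */
typedef struct DemoObject {
    const char *type_name;
} DemoObject;

/* A device embeds the base as its first member. */
typedef struct DemoDevice {
    DemoObject parent;
    int id;
} DemoDevice;

/* A machine also derives directly from the base, not from the device. */
typedef struct DemoMachine {
    DemoObject parent;
    int nr_cpus;
} DemoMachine;

/* The bus only stores a link to the handler; it never needs device fields. */
typedef struct DemoBus {
    DemoObject *hotplug_handler;
} DemoBus;

/* Upcast helper, loosely analogous in spirit to an OBJECT() cast. */
#define DEMO_OBJECT(p) (&(p)->parent)

/* Taking the base type lets any derived type act as the handler. */
static void demo_bus_set_hotplug_handler(DemoBus *bus, DemoObject *handler)
{
    assert(bus != NULL);
    bus->hotplug_handler = handler;   /* NULL clears the handler */
}
```

With this signature, callers that used to pass a device only need to
switch their cast to the base type, which is the mechanical change made
across the files listed below.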

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/acpi/pcihp.c               |    2 +-
 hw/acpi/piix4.c               |    2 +-
 hw/char/virtio-serial-bus.c   |    2 +-
 hw/core/bus.c                 |   11 ++---------
 hw/pci/pcie.c                 |    2 +-
 hw/pci/shpc.c                 |    2 +-
 hw/ppc/spapr_pci.c            |    2 +-
 hw/s390x/css-bridge.c         |    2 +-
 hw/s390x/s390-pci-bus.c       |    6 +++---
 hw/scsi/virtio-scsi.c         |    2 +-
 hw/scsi/vmw_pvscsi.c          |    2 +-
 hw/usb/dev-smartcard-reader.c |    2 +-
 include/hw/qdev-core.h        |    3 +--
 13 files changed, 16 insertions(+), 24 deletions(-)

diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
index 7bc7a723407b..942918132376 100644
--- a/hw/acpi/pcihp.c
+++ b/hw/acpi/pcihp.c
@@ -251,7 +251,7 @@ void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s,
             object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
             PCIBus *sec = pci_bridge_get_sec_bus(PCI_BRIDGE(pdev));
 
-            qbus_set_hotplug_handler(BUS(sec), DEVICE(hotplug_dev),
+            qbus_set_hotplug_handler(BUS(sec), OBJECT(hotplug_dev),
                                      &error_abort);
             /* We don't have to overwrite any other hotplug handler yet */
             assert(QLIST_EMPTY(&sec->child));
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 88f9a9ec0912..df8c0db909ce 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -536,7 +536,7 @@ static void piix4_pm_realize(PCIDevice *dev, Error **errp)
 
     piix4_acpi_system_hot_add_init(pci_address_space_io(dev),
                                    pci_get_bus(dev), s);
-    qbus_set_hotplug_handler(BUS(pci_get_bus(dev)), DEVICE(s), &error_abort);
+    qbus_set_hotplug_handler(BUS(pci_get_bus(dev)), OBJECT(s), &error_abort);
 
     piix4_pm_add_propeties(s);
 }
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index d76351d7487d..bdd917bbb83c 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -1052,7 +1052,7 @@ static void virtio_serial_device_realize(DeviceState *dev, Error **errp)
     /* Spawn a new virtio-serial bus on which the ports will ride as devices */
     qbus_create_inplace(&vser->bus, sizeof(vser->bus), TYPE_VIRTIO_SERIAL_BUS,
                         dev, vdev->bus_name);
-    qbus_set_hotplug_handler(BUS(&vser->bus), DEVICE(vser), errp);
+    qbus_set_hotplug_handler(BUS(&vser->bus), OBJECT(vser), errp);
     vser->bus.vser = vser;
     QTAILQ_INIT(&vser->ports);
 
diff --git a/hw/core/bus.c b/hw/core/bus.c
index 4651f244864c..e09843f6abea 100644
--- a/hw/core/bus.c
+++ b/hw/core/bus.c
@@ -22,22 +22,15 @@
 #include "hw/qdev.h"
 #include "qapi/error.h"
 
-static void qbus_set_hotplug_handler_internal(BusState *bus, Object *handler,
-                                              Error **errp)
+void qbus_set_hotplug_handler(BusState *bus, Object *handler, Error **errp)
 {
-
     object_property_set_link(OBJECT(bus), OBJECT(handler),
                              QDEV_HOTPLUG_HANDLER_PROPERTY, errp);
 }
 
-void qbus_set_hotplug_handler(BusState *bus, DeviceState *handler, Error **errp)
-{
-    qbus_set_hotplug_handler_internal(bus, OBJECT(handler), errp);
-}
-
 void qbus_set_bus_hotplug_handler(BusState *bus, Error **errp)
 {
-    qbus_set_hotplug_handler_internal(bus, OBJECT(bus), errp);
+    qbus_set_hotplug_handler(bus, OBJECT(bus), errp);
 }
 
 int qbus_walk_children(BusState *bus,
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 230478faab12..3f7c36609313 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -543,7 +543,7 @@ void pcie_cap_slot_init(PCIDevice *dev, uint16_t slot)
     dev->exp.hpev_notified = false;
 
     qbus_set_hotplug_handler(BUS(pci_bridge_get_sec_bus(PCI_BRIDGE(dev))),
-                             DEVICE(dev), NULL);
+                             OBJECT(dev), NULL);
 }
 
 void pcie_cap_slot_reset(PCIDevice *dev)
diff --git a/hw/pci/shpc.c b/hw/pci/shpc.c
index 45053b39b92c..52ccdc5ae3b9 100644
--- a/hw/pci/shpc.c
+++ b/hw/pci/shpc.c
@@ -648,7 +648,7 @@ int shpc_init(PCIDevice *d, PCIBus *sec_bus, MemoryRegion *bar,
     shpc_cap_update_dword(d);
     memory_region_add_subregion(bar, offset, &shpc->mmio);
 
-    qbus_set_hotplug_handler(BUS(sec_bus), DEVICE(d), NULL);
+    qbus_set_hotplug_handler(BUS(sec_bus), OBJECT(d), NULL);
 
     d->cap_present |= QEMU_PCI_CAP_SHPC;
     return 0;
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index e3781dd110b2..0d4bad7bbe73 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1743,7 +1743,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
                                 &sphb->memspace, &sphb->iospace,
                                 PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS);
     phb->bus = bus;
-    qbus_set_hotplug_handler(BUS(phb->bus), DEVICE(sphb), NULL);
+    qbus_set_hotplug_handler(BUS(phb->bus), OBJECT(sphb), NULL);
 
     /*
      * Initialize PHB address space.
diff --git a/hw/s390x/css-bridge.c b/hw/s390x/css-bridge.c
index 1bd6c8b45860..7573c40badbd 100644
--- a/hw/s390x/css-bridge.c
+++ b/hw/s390x/css-bridge.c
@@ -108,7 +108,7 @@ VirtualCssBus *virtual_css_bus_init(void)
     cbus = VIRTUAL_CSS_BUS(bus);
 
     /* Enable hotplugging */
-    qbus_set_hotplug_handler(bus, dev, &error_abort);
+    qbus_set_hotplug_handler(bus, OBJECT(dev), &error_abort);
 
     css_register_io_adapters(CSS_IO_ADAPTER_VIRTIO, true, false,
                              0, &error_abort);
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 80ff1ce33f72..5998942b4c15 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -742,7 +742,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
     pci_setup_iommu(b, s390_pci_dma_iommu, s);
 
     bus = BUS(b);
-    qbus_set_hotplug_handler(bus, dev, &local_err);
+    qbus_set_hotplug_handler(bus, OBJECT(dev), &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
@@ -750,7 +750,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
     phb->bus = b;
 
     s->bus = S390_PCI_BUS(qbus_create(TYPE_S390_PCI_BUS, dev, NULL));
-    qbus_set_hotplug_handler(BUS(s->bus), dev, &local_err);
+    qbus_set_hotplug_handler(BUS(s->bus), OBJECT(dev), &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
@@ -912,7 +912,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         pci_bridge_map_irq(pb, dev->id, s390_pci_map_irq);
         pci_setup_iommu(&pb->sec_bus, s390_pci_dma_iommu, s);
 
-        qbus_set_hotplug_handler(BUS(&pb->sec_bus), DEVICE(s), errp);
+        qbus_set_hotplug_handler(BUS(&pb->sec_bus), OBJECT(s), errp);
 
         if (dev->hotplugged) {
             pci_default_write_config(pdev, PCI_PRIMARY_BUS,
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index eb90288f4741..ce99d288b035 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -906,7 +906,7 @@ static void virtio_scsi_device_realize(DeviceState *dev, Error **errp)
     scsi_bus_new(&s->bus, sizeof(s->bus), dev,
                  &virtio_scsi_scsi_info, vdev->bus_name);
     /* override default SCSI bus hotplug-handler, with virtio-scsi's one */
-    qbus_set_hotplug_handler(BUS(&s->bus), dev, &error_abort);
+    qbus_set_hotplug_handler(BUS(&s->bus), OBJECT(dev), &error_abort);
 
     virtio_scsi_dataplane_setup(s, errp);
 }
diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index a3a019e30a74..584b4be07e79 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -1142,7 +1142,7 @@ pvscsi_realizefn(PCIDevice *pci_dev, Error **errp)
     scsi_bus_new(&s->bus, sizeof(s->bus), DEVICE(pci_dev),
                  &pvscsi_scsi_info, NULL);
     /* override default SCSI bus hotplug-handler, with pvscsi's one */
-    qbus_set_hotplug_handler(BUS(&s->bus), DEVICE(s), &error_abort);
+    qbus_set_hotplug_handler(BUS(&s->bus), OBJECT(s), &error_abort);
     pvscsi_reset_state(s);
 }
 
diff --git a/hw/usb/dev-smartcard-reader.c b/hw/usb/dev-smartcard-reader.c
index 8f716fc165a3..6b0137bb7699 100644
--- a/hw/usb/dev-smartcard-reader.c
+++ b/hw/usb/dev-smartcard-reader.c
@@ -1322,7 +1322,7 @@ static void ccid_realize(USBDevice *dev, Error **errp)
     usb_desc_init(dev);
     qbus_create_inplace(&s->bus, sizeof(s->bus), TYPE_CCID_BUS, DEVICE(dev),
                         NULL);
-    qbus_set_hotplug_handler(BUS(&s->bus), DEVICE(dev), &error_abort);
+    qbus_set_hotplug_handler(BUS(&s->bus), OBJECT(dev), &error_abort);
     s->intr = usb_ep_get(dev, USB_TOKEN_IN, CCID_INT_IN_EP);
     s->bulk = usb_ep_get(dev, USB_TOKEN_IN, CCID_BULK_IN_EP);
     s->card = NULL;
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 0a84c427561c..e70a4bfa498f 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -430,8 +430,7 @@ char *qdev_get_dev_path(DeviceState *dev);
 
 GSList *qdev_build_hotpluggable_device_list(Object *peripheral);
 
-void qbus_set_hotplug_handler(BusState *bus, DeviceState *handler,
-                              Error **errp);
+void qbus_set_hotplug_handler(BusState *bus, Object *handler, Error **errp);
 
 void qbus_set_bus_hotplug_handler(BusState *bus, Error **errp);
 

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v4 11/15] spapr_pci: provide node start offset via spapr_populate_pci_dt()
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (9 preceding siblings ...)
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 10/15] qdev: pass an Object * to qbus_set_hotplug_handler() Greg Kurz
@ 2019-02-12 18:25 ` Greg Kurz
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 12/15] spapr_pci: add ibm,my-drc-index property for PHB hotplug Greg Kurz
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:25 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

From: Michael Roth <mdroth@linux.vnet.ibm.com>

PHB hotplug re-uses PHB device tree generation code and passes
it to a guest via RTAS. Doing this requires knowledge of where
exactly in the device tree the node describing the PHB begins.

Provide this via a new optional pointer that can be used to
store the PHB node's start offset.
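
The optional out-pointer convention described above can be sketched in
isolation as follows. This is only an illustration, not the QEMU code:
populate_node() and the offset arithmetic are hypothetical stand-ins for
spapr_populate_pci_dt() and fdt_add_subnode().

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for spapr_populate_pci_dt(): it pretends to
 * create a node and returns 0 on success. The offset computation is
 * invented for illustration only.
 */
static int populate_node(int parent_offset, int *node_offset)
{
    int bus_off = parent_offset + 16; /* pretend fdt_add_subnode() result */

    /* The out-pointer is optional: existing callers simply pass NULL. */
    if (node_offset) {
        *node_offset = bus_off;
    }

    return 0;
}
```

A boot-time caller that does not care about the offset would call
populate_node(0, NULL), while a hotplug caller would pass the address of
an int and read the node start offset back from it.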

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c              |    2 +-
 hw/ppc/spapr_pci.c          |    5 ++++-
 include/hw/pci-host/spapr.h |    2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 03183c52f57b..021758825b7e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1295,7 +1295,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr)
 
     QLIST_FOREACH(phb, &spapr->phbs, list) {
         ret = spapr_populate_pci_dt(phb, PHANDLE_INTC, fdt,
-                                    spapr->irq->nr_msis);
+                                    spapr->irq->nr_msis, NULL);
         if (ret < 0) {
             error_report("couldn't setup PCI devices in fdt");
             exit(1);
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 0d4bad7bbe73..4f184a80df5d 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -2139,7 +2139,7 @@ static void spapr_phb_pci_enumerate(sPAPRPHBState *phb)
 }
 
 int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t intc_phandle, void *fdt,
-                          uint32_t nr_msis)
+                          uint32_t nr_msis, int *node_offset)
 {
     int bus_off, i, j, ret;
     gchar *nodename;
@@ -2194,6 +2194,9 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t intc_phandle, void *fdt,
     nodename = g_strdup_printf("pci@%" PRIx64, phb->buid);
     _FDT(bus_off = fdt_add_subnode(fdt, 0, nodename));
     g_free(nodename);
+    if (node_offset) {
+        *node_offset = bus_off;
+    }
 
     /* Write PHB properties */
     _FDT(fdt_setprop_string(fdt, bus_off, "device_type", "pci"));
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 7cfce54a9449..c05cdaec481f 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -113,7 +113,7 @@ static inline qemu_irq spapr_phb_lsi_qirq(struct sPAPRPHBState *phb, int pin)
 }
 
 int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t intc_phandle, void *fdt,
-                          uint32_t nr_msis);
+                          uint32_t nr_msis, int *node_offset);
 
 void spapr_pci_rtas_init(void);
 

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v4 12/15] spapr_pci: add ibm,my-drc-index property for PHB hotplug
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (10 preceding siblings ...)
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 11/15] spapr_pci: provide node start offset via spapr_populate_pci_dt() Greg Kurz
@ 2019-02-12 18:25 ` Greg Kurz
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 13/15] spapr_drc: Allow FDT fragment to be added later Greg Kurz
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:25 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

From: Michael Roth <mdroth@linux.vnet.ibm.com>

This is needed to denote a boot-time PHB as being hot-pluggable.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr_pci.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 4f184a80df5d..7df7f6502f93 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -2189,6 +2189,7 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t intc_phandle, void *fdt,
     sPAPRTCETable *tcet;
     PCIBus *bus = PCI_HOST_BRIDGE(phb)->bus;
     sPAPRFDT s_fdt;
+    sPAPRDRConnector *drc;
 
     /* Start populating the FDT */
     nodename = g_strdup_printf("pci@%" PRIx64, phb->buid);
@@ -2255,6 +2256,14 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t intc_phandle, void *fdt,
                  tcet->liobn, tcet->bus_offset,
                  tcet->nb_table << tcet->page_shift);
 
+    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, phb->index);
+    if (drc) {
+        uint32_t drc_index = cpu_to_be32(spapr_drc_index(drc));
+
+        _FDT(fdt_setprop(fdt, bus_off, "ibm,my-drc-index", &drc_index,
+                         sizeof(drc_index)));
+    }
+
     /* Walk the bridges and program the bus numbers*/
     spapr_phb_pci_enumerate(phb);
     _FDT(fdt_setprop_cell(fdt, bus_off, "qemu,phb-enumerated", 0x1));

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v4 13/15] spapr_drc: Allow FDT fragment to be added later
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (11 preceding siblings ...)
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 12/15] spapr_pci: add ibm,my-drc-index property for PHB hotplug Greg Kurz
@ 2019-02-12 18:25 ` Greg Kurz
  2019-02-13  4:05   ` David Gibson
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug Greg Kurz
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 15/15] spapr: enable PHB hotplug for default pseries machine type Greg Kurz
  14 siblings, 1 reply; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:25 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

The current logic is to provide the FDT fragment when attaching a device
to a DRC. This works perfectly fine for our current hotplug support, but
soon we will add support for PHB hotplug, which has some constraints that
CPU, PCI and LMB devices don't seem to have.

The first constraint is that the "ibm,dma-window" property of the PHB
node requires the IOMMU to be configured, i.e., spapr_tce_table_enable()
has been called, which happens during PHB reset. This is okay in the case
of hotplug since the device is reset before the hotplug handler is
called. With coldplug, on the contrary, the hotplug handler is called
first and the device is only reset during the initial system reset. Trying
to create the FDT fragment on the hotplug path in this case would
result in something like this:

ibm,dma-window = < 0x80000000 0x00 0x00 0x00 0x00 >;

This will cause Linux in the guest to panic, simply by removing and
re-adding the PHB using the drmgr command:

	page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz));
	if (!page)
		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);

The second and maybe more problematic constraint is that the
"interrupt-map" property needs to reference the interrupt controller
node using the very same phandle that SLOF has already exposed to the
guest. QEMU requires SLOF to call the private KVMPPC_H_UPDATE_DT hcall
at some point to know about this phandle. With the latest QEMU and SLOF,
this happens when SLOF gets quiesced. This means that if the PHB gets
hotplugged after CAS but before SLOF quiesce, then we're sure that the
phandle is not known when the hotplug handler is called.

The FDT is actually only needed when the guest first invokes RTAS to
configure the connector, long after SLOF quiesce. Let's postpone the
creation of FDT fragments for PHBs to rtas_ibm_configure_connector().

Since we only need this for PHBs, introduce a new method in the base
DRC class for that. It will be implemented for "spapr-drc-phb" DRCs in
a subsequent patch.

Allow spapr_drc_attach() to be passed a NULL fdt argument if the method
is available.
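
The deferral described above can be sketched as a self-contained toy
model. Everything in it (Connector, ConnectorClass, connector_attach(),
connector_configure(), demo_populate() and the offset 4) is a
hypothetical name or value chosen for illustration; it mirrors the shape
of the change, not the actual sPAPRDRConnector code.

```c
#include <stddef.h>

typedef struct Connector Connector;

typedef struct ConnectorClass {
    /* Optional: build the FDT fragment on demand. Returns 0 on success. */
    int (*populate_dt)(Connector *c, void **fdt, int *start_offset);
} ConnectorClass;

struct Connector {
    const ConnectorClass *klass;
    void *fdt;
    int fdt_start_offset;
};

/* Attach time: a NULL fdt is only accepted if the class can build it later. */
static int connector_attach(Connector *c, void *fdt, int start_offset)
{
    if (!fdt && !c->klass->populate_dt) {
        return -1;
    }
    if (fdt) {
        c->fdt = fdt;
        c->fdt_start_offset = start_offset;
    }
    return 0;
}

/* Configure time: build the fragment now if attach did not provide one. */
static int connector_configure(Connector *c)
{
    if (!c->fdt) {
        void *fdt = NULL;

        if (!c->klass->populate_dt ||
            c->klass->populate_dt(c, &fdt, &c->fdt_start_offset)) {
            return -1;
        }
        c->fdt = fdt;
    }
    return 0;
}

/* Demo callback: hands out a static blob and an invented start offset. */
static char demo_blob[16];

static int demo_populate(Connector *c, void **fdt, int *start_offset)
{
    (void)c;
    *fdt = demo_blob;
    *start_offset = 4;
    return 0;
}
```

The point of the split is that connector_attach() can run before the
information needed for the fragment exists, while connector_configure()
runs at the moment the fragment is first consumed.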

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr_drc.c         |   34 +++++++++++++++++++++++++++++-----
 include/hw/ppc/spapr_drc.h |    6 ++++++
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 189ee681062a..c5a281915665 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -22,6 +22,7 @@
 #include "qemu/error-report.h"
 #include "hw/ppc/spapr.h" /* for RTAS return codes */
 #include "hw/pci-host/spapr.h" /* spapr_phb_remove_pci_device_cb callback */
+#include "sysemu/device_tree.h"
 #include "trace.h"
 
 #define DRC_CONTAINER_PATH "/dr-connector"
@@ -376,6 +377,8 @@ static void prop_get_fdt(Object *obj, Visitor *v, const char *name,
 void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
                       int fdt_start_offset, Error **errp)
 {
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
     trace_spapr_drc_attach(spapr_drc_index(drc));
 
     if (drc->dev) {
@@ -384,11 +387,14 @@ void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
     }
     g_assert((drc->state == SPAPR_DRC_STATE_LOGICAL_UNUSABLE)
              || (drc->state == SPAPR_DRC_STATE_PHYSICAL_POWERON));
-    g_assert(fdt);
+    g_assert(fdt || drck->populate_dt);
 
     drc->dev = d;
-    drc->fdt = fdt;
-    drc->fdt_start_offset = fdt_start_offset;
+
+    if (fdt) {
+        drc->fdt = fdt;
+        drc->fdt_start_offset = fdt_start_offset;
+    }
 
     object_property_add_link(OBJECT(drc), "device",
                              object_get_typename(OBJECT(drc->dev)),
@@ -1118,10 +1124,28 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
         goto out;
     }
 
-    g_assert(drc->fdt);
-
     drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
+    g_assert(drc->fdt || drck->populate_dt);
+
+    if (!drc->fdt) {
+        Error *local_err = NULL;
+        void *fdt;
+        int fdt_size;
+
+        fdt = create_device_tree(&fdt_size);
+
+        if (drck->populate_dt(drc->dev, spapr, fdt, &drc->fdt_start_offset,
+                               &local_err)) {
+            g_free(fdt);
+            error_free(local_err);
+            rc = SPAPR_DR_CC_RESPONSE_ERROR;
+            goto out;
+        }
+
+        drc->fdt = fdt;
+    }
+
     do {
         uint32_t tag;
         const char *name;
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 56bba36ad4da..e947d6987bf2 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -18,6 +18,7 @@
 #include "qom/object.h"
 #include "sysemu/sysemu.h"
 #include "hw/qdev.h"
+#include "qapi/error.h"
 
 #define TYPE_SPAPR_DR_CONNECTOR "spapr-dr-connector"
 #define SPAPR_DR_CONNECTOR_GET_CLASS(obj) \
@@ -221,6 +222,8 @@ typedef struct sPAPRDRConnector {
     int fdt_start_offset;
 } sPAPRDRConnector;
 
+struct sPAPRMachineState;
+
 typedef struct sPAPRDRConnectorClass {
     /*< private >*/
     DeviceClass parent;
@@ -236,6 +239,9 @@ typedef struct sPAPRDRConnectorClass {
     uint32_t (*isolate)(sPAPRDRConnector *drc);
     uint32_t (*unisolate)(sPAPRDRConnector *drc);
     void (*release)(DeviceState *dev);
+
+    int (*populate_dt)(DeviceState *dev, struct sPAPRMachineState *spapr,
+                       void *fdt, int *fdt_start_offset, Error **errp);
 } sPAPRDRConnectorClass;
 
 typedef struct sPAPRDRCPhysical {

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (12 preceding siblings ...)
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 13/15] spapr_drc: Allow FDT fragment to be added later Greg Kurz
@ 2019-02-12 18:25 ` Greg Kurz
  2019-02-13  4:13   ` David Gibson
  2019-02-13  9:25   ` David Hildenbrand
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 15/15] spapr: enable PHB hotplug for default pseries machine type Greg Kurz
  14 siblings, 2 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:25 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

Hotplugging PHBs is a machine-level operation, but PHBs reside on the
main system bus, so we register spapr machine as the handler for the
main system bus.

Provide the usual pre-plug, plug and unplug-request handlers.

Move the checking of the PHB index to the pre-plug handler. It is okay
to do that and assert in the realize function because the pre-plug
handler is always called, even for the oldest machine types we support.

Unlike with other device types, there are some cases where we cannot
provide the FDT fragment of the PHB from the plug handler, e.g., before
KVMPPC_H_UPDATE_DT was called. Do this from a DRC callback that is
called just before the first FDT fragment is exposed to the guest.
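
The "validate in pre-plug, assert in realize" split mentioned above can
be sketched with a toy model. Bridge, bridge_pre_plug(), bridge_realize(),
bridge_plug() and MAX_PHBS are hypothetical names for illustration; in
the patch the range check is delegated to smc->phb_placement().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INDEX_UNSET ((uint32_t)-1)
#define MAX_PHBS    32u /* hypothetical limit, for illustration only */

typedef struct Bridge {
    uint32_t index;
    int realized;
} Bridge;

/* Pre-plug: user-triggerable errors are reported, never asserted. */
static const char *bridge_pre_plug(const Bridge *b)
{
    if (b->index == INDEX_UNSET) {
        return "\"index\" is mandatory";
    }
    if (b->index >= MAX_PHBS) {
        return "\"index\" is out of range";
    }
    return NULL;
}

/* Realize: may assert, because pre-plug always runs first. */
static void bridge_realize(Bridge *b)
{
    assert(b->index != INDEX_UNSET);
    b->realized = 1;
}

/* Plug sequence: realize is only reached when pre-plug succeeded. */
static const char *bridge_plug(Bridge *b)
{
    const char *err = bridge_pre_plug(b);

    if (err) {
        return err;
    }
    bridge_realize(b);
    return NULL;
}
```

The assertion in bridge_realize() is only safe under the assumption
stated in the commit message, namely that the pre-plug handler is always
called before realize.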

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(Fixed interrupt controller phandle in "interrupt-map" and
 TCE table size in "ibm,dma-window" FDT fragment, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
---
v4: - populate FDT fragment in a DRC callback
v3: - reworked phandle handling some more
v2: - reworked phandle handling
    - sync LSIs to KVM
---
---
 hw/ppc/spapr.c         |  121 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_drc.c     |    2 +
 hw/ppc/spapr_pci.c     |   16 ------
 include/hw/ppc/spapr.h |    5 ++
 4 files changed, 127 insertions(+), 17 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 021758825b7e..06ce0babcb54 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2930,6 +2930,11 @@ static void spapr_machine_init(MachineState *machine)
     register_savevm_live(NULL, "spapr/htab", -1, 1,
                          &savevm_htab_handlers, spapr);
 
+    if (smc->dr_phb_enabled) {
+        qbus_set_hotplug_handler(sysbus_get_default(), OBJECT(machine),
+                                 &error_fatal);
+    }
+
     qemu_register_boot_set(spapr_boot_set, spapr);
 
     if (kvm_enabled()) {
@@ -3733,6 +3738,108 @@ out:
     error_propagate(errp, local_err);
 }
 
+int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
+                 int *fdt_start_offset, Error **errp)
+{
+    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
+    uint32_t intc_phandle;
+
+    if (spapr_irq_get_phandle(spapr, spapr->fdt_blob, &intc_phandle, errp)) {
+        return -1;
+    }
+
+    if (spapr_populate_pci_dt(sphb, intc_phandle, fdt, spapr->irq->nr_msis,
+                              fdt_start_offset)) {
+        error_setg(errp, "unable to create FDT node for PHB %d", sphb->index);
+        return -1;
+    }
+
+    /* generally SLOF creates these, for hotplug it's up to QEMU */
+    _FDT(fdt_setprop_string(fdt, *fdt_start_offset, "name", "pci"));
+
+    return 0;
+}
+
+static void spapr_phb_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                               Error **errp)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
+    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+    const unsigned windows_supported = spapr_phb_windows_supported(sphb);
+
+    if (sphb->index == (uint32_t)-1) {
+        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
+        return;
+    }
+
+    /*
+     * This will check that sphb->index doesn't exceed the maximum number of
+     * PHBs for the current machine type.
+     */
+    smc->phb_placement(spapr, sphb->index,
+                       &sphb->buid, &sphb->io_win_addr,
+                       &sphb->mem_win_addr, &sphb->mem64_win_addr,
+                       windows_supported, sphb->dma_liobn, errp);
+}
+
+static void spapr_phb_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                           Error **errp)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
+    sPAPRDRConnector *drc;
+    bool hotplugged = spapr_drc_hotplugged(dev);
+    Error *local_err = NULL;
+
+    if (!smc->dr_phb_enabled) {
+        return;
+    }
+
+    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
+    /* hotplug hooks should check it's enabled before getting this far */
+    assert(drc);
+
+    /*
+     * The FDT fragment will be added during the first invocation of RTAS
+     * ibm,configure-connector for this device, when we're sure
+     * that the IOMMU is configured and that QEMU knows the phandle of the
+     * interrupt controller.
+     */
+    spapr_drc_attach(drc, DEVICE(dev), NULL, 0, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    if (hotplugged) {
+        spapr_hotplug_req_add_by_index(drc);
+    } else {
+        spapr_drc_reset(drc);
+    }
+}
+
+void spapr_phb_release(DeviceState *dev)
+{
+    object_unparent(OBJECT(dev));
+}
+
+static void spapr_phb_unplug_request(HotplugHandler *hotplug_dev,
+                                     DeviceState *dev, Error **errp)
+{
+    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
+    sPAPRDRConnector *drc;
+
+    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
+    assert(drc);
+
+    if (!spapr_drc_unplug_requested(drc)) {
+        spapr_drc_detach(drc);
+        spapr_hotplug_req_remove_by_index(drc);
+    }
+}
+
 static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
 {
@@ -3740,6 +3847,8 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
         spapr_memory_plug(hotplug_dev, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
         spapr_core_plug(hotplug_dev, dev, errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
+        spapr_phb_plug(hotplug_dev, dev, errp);
     }
 }
 
@@ -3758,6 +3867,7 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
 {
     sPAPRMachineState *sms = SPAPR_MACHINE(OBJECT(hotplug_dev));
     MachineClass *mc = MACHINE_GET_CLASS(sms);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
         if (spapr_ovec_test(sms->ov5_cas, OV5_HP_EVT)) {
@@ -3777,6 +3887,12 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
             return;
         }
         spapr_core_unplug_request(hotplug_dev, dev, errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
+        if (!smc->dr_phb_enabled) {
+            error_setg(errp, "PHB hot unplug not supported on this machine");
+            return;
+        }
+        spapr_phb_unplug_request(hotplug_dev, dev, errp);
     }
 }
 
@@ -3787,6 +3903,8 @@ static void spapr_machine_device_pre_plug(HotplugHandler *hotplug_dev,
         spapr_memory_pre_plug(hotplug_dev, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
         spapr_core_pre_plug(hotplug_dev, dev, errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
+        spapr_phb_pre_plug(hotplug_dev, dev, errp);
     }
 }
 
@@ -3794,7 +3912,8 @@ static HotplugHandler *spapr_get_hotplug_handler(MachineState *machine,
                                                  DeviceState *dev)
 {
     if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
-        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE) ||
+        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
         return HOTPLUG_HANDLER(machine);
     }
     return NULL;
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index c5a281915665..22563a381a37 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -709,6 +709,8 @@ static void spapr_drc_phb_class_init(ObjectClass *k, void *data)
     drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_PHB;
     drck->typename = "PHB";
     drck->drc_name_prefix = "PHB ";
+    drck->release = spapr_phb_release;
+    drck->populate_dt = spapr_dt_phb;
 }
 
 static const TypeInfo spapr_dr_connector_info = {
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 7df7f6502f93..d0caca627455 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1647,21 +1647,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    if (sphb->index != (uint32_t)-1) {
-        Error *local_err = NULL;
-
-        smc->phb_placement(spapr, sphb->index,
-                           &sphb->buid, &sphb->io_win_addr,
-                           &sphb->mem_win_addr, &sphb->mem64_win_addr,
-                           windows_supported, sphb->dma_liobn, &local_err);
-        if (local_err) {
-            error_propagate(errp, local_err);
-            return;
-        }
-    } else {
-        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
-        return;
-    }
+    assert(sphb->index != (uint32_t)-1); /* checked in spapr_phb_pre_plug() */
 
     if (sphb->mem64_win_size != 0) {
         if (sphb->mem_win_size > SPAPR_PCI_MEM32_WIN_SIZE) {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index a3074e7fea37..69d9c2196ca2 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -764,9 +764,12 @@ void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
 void spapr_clear_pending_events(sPAPRMachineState *spapr);
 int spapr_max_server_number(sPAPRMachineState *spapr);
 
-/* CPU and LMB DRC release callbacks. */
+/* DRC callbacks. */
 void spapr_core_release(DeviceState *dev);
 void spapr_lmb_release(DeviceState *dev);
+void spapr_phb_release(DeviceState *dev);
+int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
+                 int *fdt_start_offset, Error **errp);
 
 void spapr_rtc_read(sPAPRRTCState *rtc, struct tm *tm, uint32_t *ns);
 int spapr_rtc_import_offset(sPAPRRTCState *rtc, int64_t legacy_offset);

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v4 15/15] spapr: enable PHB hotplug for default pseries machine type
  2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
                   ` (13 preceding siblings ...)
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug Greg Kurz
@ 2019-02-12 18:25 ` Greg Kurz
  2019-02-13  4:13   ` David Gibson
  14 siblings, 1 reply; 35+ messages in thread
From: Greg Kurz @ 2019-02-12 18:25 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Greg Kurz,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, David Hildenbrand, Cornelia Huck, Gerd Hoffmann,
	Dmitry Fleytman, Thomas Huth

From: Michael Roth <mdroth@linux.vnet.ibm.com>

The 'dr_phb_enabled' field of sPAPRMachineClass can be set as part of
machine-specific init code. It will be used to conditionally
enable creation of DRC objects and device-tree description to
facilitate hotplug of PHBs.

Since we can't migrate this state to older machine types,
default the option to true and disable it for older machine
types.
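
The "default to true, disable for older machine types" pattern can be
sketched as below. MachineOptions, machine_latest_options() and
machine_3_1_options() are hypothetical names for illustration; the real
code sets the field in spapr_machine_class_init() and overrides it in
spapr_machine_3_1_class_options().

```c
#include <stdbool.h>

/* Toy stand-in for the per-machine-type class options. */
typedef struct MachineOptions {
    bool dr_phb_enabled;
} MachineOptions;

/* Newest machine type: the feature is on by default. */
static void machine_latest_options(MachineOptions *opts)
{
    opts->dr_phb_enabled = true;
}

/*
 * Older machine type: start from the newer defaults, then switch off
 * whatever the older type never offered, so that the guest-visible
 * configuration stays the same across migration.
 */
static void machine_3_1_options(MachineOptions *opts)
{
    machine_latest_options(opts);
    opts->dr_phb_enabled = false;
}
```

With this layout, each new feature needs one default in the newest type
and one override in the most recent older type; all types older than
that inherit the override.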

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 06ce0babcb54..4a6b2f7f3f62 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4166,6 +4166,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
     spapr_caps_add_properties(smc, &error_abort);
     smc->irq = &spapr_irq_xics;
+    smc->dr_phb_enabled = true;
 }
 
 static const TypeInfo spapr_machine_info = {
@@ -4231,6 +4232,7 @@ static void spapr_machine_3_1_class_options(MachineClass *mc)
     compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
     mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
     smc->update_dt_enabled = false;
+    smc->dr_phb_enabled = false;
 }
 
 DEFINE_SPAPR_MACHINE(3_1, "3.1", false);

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq Greg Kurz
@ 2019-02-12 20:07   ` Cédric Le Goater
  2019-02-13  3:26   ` David Gibson
  1 sibling, 0 replies; 35+ messages in thread
From: Cédric Le Goater @ 2019-02-12 20:07 UTC (permalink / raw)
  To: Greg Kurz, David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Michael Roth, Paolo Bonzini, Michael S. Tsirkin,
	Marcel Apfelbaum, Eduardo Habkost, David Hildenbrand,
	Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman, Thomas Huth

On 2/12/19 7:24 PM, Greg Kurz wrote:
> Only pseries machines, either recent ones started with ic-mode=xics
> or older ones using the legacy irq allocation scheme, need to set the
> @offset of the ICS to XICS_IRQ_BASE. Recent pseries started with
> ic-mode=dual set it to 0 and powernv machines set it to some other
> value at runtime.
> 
> It thus doesn't really help to set the default value of the ICS offset
> to XICS_IRQ_BASE in ics_base_instance_init().
> 
> Drop that code from XICS and let the pseries code set the offset
> explicitly for clarity.

Looks OK to me. I would have called it 'offset' and not 'xics_offset'
though, because we might want to create an sPAPR IRQ XIVE device with
some offset one day. There is still some work to be done before that
is possible. Anyhow,

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.


> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/intc/xics.c             |    8 --------
>  hw/ppc/spapr_irq.c         |   33 ++++++++++++++++++++-------------
>  include/hw/ppc/spapr_irq.h |    1 +
>  3 files changed, 21 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 16e8ffa2aaf7..7cac138067e2 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -638,13 +638,6 @@ static void ics_base_realize(DeviceState *dev, Error **errp)
>      ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
>  }
>  
> -static void ics_base_instance_init(Object *obj)
> -{
> -    ICSState *ics = ICS_BASE(obj);
> -
> -    ics->offset = XICS_IRQ_BASE;
> -}
> -
>  static int ics_base_dispatch_pre_save(void *opaque)
>  {
>      ICSState *ics = opaque;
> @@ -720,7 +713,6 @@ static const TypeInfo ics_base_info = {
>      .parent = TYPE_DEVICE,
>      .abstract = true,
>      .instance_size = sizeof(ICSState),
> -    .instance_init = ics_base_instance_init,
>      .class_init = ics_base_class_init,
>      .class_size = sizeof(ICSStateClass),
>  };
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 80b0083b8e38..8217e0215411 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -68,10 +68,11 @@ void spapr_irq_msi_reset(sPAPRMachineState *spapr)
>  
>  static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
>                                    const char *type_ics,
> -                                  int nr_irqs, Error **errp)
> +                                  int nr_irqs, int offset, Error **errp)
>  {
>      Error *local_err = NULL;
>      Object *obj;
> +    ICSState *ics;
>  
>      obj = object_new(type_ics);
>      object_property_add_child(OBJECT(spapr), "ics", obj, &error_abort);
> @@ -86,7 +87,10 @@ static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
>          goto error;
>      }
>  
> -    return ICS_BASE(obj);
> +    ics = ICS_BASE(obj);
> +    ics->offset = offset;
> +
> +    return ics;
>  
>  error:
>      error_propagate(errp, local_err);
> @@ -104,6 +108,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
>              !xics_kvm_init(spapr, &local_err)) {
>              spapr->icp_type = TYPE_KVM_ICP;
>              spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs,
> +                                          spapr->irq->xics_offset,
>                                            &local_err);
>          }
>          if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
> @@ -119,6 +124,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
>          xics_spapr_init(spapr);
>          spapr->icp_type = TYPE_ICP;
>          spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs,
> +                                      spapr->irq->xics_offset,
>                                        &local_err);
>      }
>  
> @@ -246,6 +252,7 @@ sPAPRIrq spapr_irq_xics = {
>      .nr_irqs     = SPAPR_IRQ_XICS_NR_IRQS,
>      .nr_msis     = SPAPR_IRQ_XICS_NR_MSIS,
>      .ov5         = SPAPR_OV5_XIVE_LEGACY,
> +    .xics_offset = XICS_IRQ_BASE,
>  
>      .init        = spapr_irq_init_xics,
>      .claim       = spapr_irq_claim_xics,
> @@ -451,17 +458,6 @@ static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
>          return;
>      }
>  
> -    /*
> -     * Align the XICS and the XIVE IRQ number space under QEMU.
> -     *
> -     * However, the XICS KVM device still considers that the IRQ
> -     * numbers should start at XICS_IRQ_BASE (0x1000). Either we
> -     * should introduce a KVM device ioctl to set the offset or ignore
> -     * the lower 4K numbers when using the get/set ioctl of the XICS
> -     * KVM device. The second option seems the least intrusive.
> -     */
> -    spapr->ics->offset = 0;
> -
>      spapr_irq_xive.init(spapr, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
> @@ -582,6 +578,16 @@ sPAPRIrq spapr_irq_dual = {
>      .nr_irqs     = SPAPR_IRQ_DUAL_NR_IRQS,
>      .nr_msis     = SPAPR_IRQ_DUAL_NR_MSIS,
>      .ov5         = SPAPR_OV5_XIVE_BOTH,
> +    /*
> +     * Align the XICS and the XIVE IRQ number space under QEMU.
> +     *
> +     * However, the XICS KVM device still considers that the IRQ
> +     * numbers should start at XICS_IRQ_BASE (0x1000). Either we
> +     * should introduce a KVM device ioctl to set the offset or ignore
> +     * the lower 4K numbers when using the get/set ioctl of the XICS
> +     * KVM device. The second option seems the least intrusive.
> +     */
> +    .xics_offset = 0,
>  
>      .init        = spapr_irq_init_dual,
>      .claim       = spapr_irq_claim_dual,
> @@ -712,6 +718,7 @@ sPAPRIrq spapr_irq_xics_legacy = {
>      .nr_irqs     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
>      .nr_msis     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
>      .ov5         = SPAPR_OV5_XIVE_LEGACY,
> +    .xics_offset = XICS_IRQ_BASE,
>  
>      .init        = spapr_irq_init_xics,
>      .claim       = spapr_irq_claim_xics,
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 14b02c3aca33..5e30858dc22a 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -34,6 +34,7 @@ typedef struct sPAPRIrq {
>      uint32_t    nr_irqs;
>      uint32_t    nr_msis;
>      uint8_t     ov5;
> +    uint32_t    xics_offset;
>  
>      void (*init)(sPAPRMachineState *spapr, Error **errp);
>      int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> 

* Re: [Qemu-devel] [PATCH v4 03/15] spapr_irq: Set LSIs at interrupt controller init
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 03/15] spapr_irq: Set LSIs at interrupt controller init Greg Kurz
@ 2019-02-12 20:17   ` Cédric Le Goater
  2019-02-13  3:48   ` David Gibson
  1 sibling, 0 replies; 35+ messages in thread
From: Cédric Le Goater @ 2019-02-12 20:17 UTC (permalink / raw)
  To: Greg Kurz, David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Michael Roth, Paolo Bonzini, Michael S. Tsirkin,
	Marcel Apfelbaum, Eduardo Habkost, David Hildenbrand,
	Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman, Thomas Huth

On 2/12/19 7:24 PM, Greg Kurz wrote:
> The pseries machine only uses LSIs to support legacy PCI devices. Every
> PHB claims 4 LSIs at realize time. When using in-kernel XICS (or upcoming
> in-kernel XIVE), QEMU synchronizes the state of all irqs, including these
> LSIs, later on at machine reset.
> 
> In order to support PHB hotplug, we need a way to tell KVM about the LSIs
> that doesn't require a machine reset.
>
> Since recent machine types allocate all these LSIs in a fixed range for
> the machine lifetime, identify them when initializing the interrupt
> controller, long before they get passed to KVM.
> 
> In order to do that, first disentangle interrupt typing and allocation.
> Since the vast majority of interrupts are MSIs, make that the default
> and have only the LSI users explicitly set the type.
> 
> It is rather straightforward for XIVE. XICS needs some extra care
> though: allocation state and type are mixed up in the same bits of the
> flags field within the interrupt state. Setting the LSI bit there at
> init time would mean the interrupt is de facto allocated, even if no
> device asked for it. Introduce a bitmap to track LSIs at the ICS level.
> In order to keep the patch minimal, the bitmap is only used when writing
> the source state to KVM and when the interrupt is claimed, so that the
> code that checks the interrupt type through the flags stays untouched.
> 
> With older pseries machines using the XICS legacy IRQ allocation scheme,
> all interrupt numbers come from a common pool and there's no such thing
> as a fixed range for LSIs. Introduce a helper so that these older
> machine types can continue to set the type when allocating the LSI.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
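
To make the XICS part of the description concrete, here is a minimal
standalone sketch of the idea: the interrupt type is recorded in a separate
bitmap at init time, and the per-source flags are only written when the
interrupt is claimed. All Sketch*/SK_* names and the flag values are
hypothetical stand-ins, not the QEMU definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the XICS flag bits; values are illustrative. */
#define SK_FLAG_LSI  0x1u
#define SK_FLAG_MSI  0x2u
#define SK_FLAG_MASK (SK_FLAG_LSI | SK_FLAG_MSI)

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

typedef struct SketchICS {
    uint32_t nr_irqs;
    uint8_t *flags;         /* per-source flags: non-zero means allocated */
    unsigned long *lsi_map; /* one bit per source, set at init for LSIs */
} SketchICS;

static inline SketchICS *sketch_ics_new(uint32_t nr_irqs)
{
    size_t words = (nr_irqs + SK_BITS_PER_LONG - 1) / SK_BITS_PER_LONG;
    SketchICS *ics = calloc(1, sizeof(*ics));

    assert(nr_irqs > 0 && ics);
    ics->nr_irqs = nr_irqs;
    ics->flags = calloc(nr_irqs, sizeof(*ics->flags));
    ics->lsi_map = calloc(words, sizeof(*ics->lsi_map));
    assert(ics->flags && ics->lsi_map);
    return ics;
}

static inline void sketch_ics_free(SketchICS *ics)
{
    free(ics->lsi_map);
    free(ics->flags);
    free(ics);
}

/* Record the type only; the source stays unallocated. */
static inline void sketch_set_lsi(SketchICS *ics, uint32_t srcno)
{
    assert(srcno < ics->nr_irqs);
    ics->lsi_map[srcno / SK_BITS_PER_LONG] |= 1ul << (srcno % SK_BITS_PER_LONG);
}

static inline bool sketch_is_lsi(const SketchICS *ics, uint32_t srcno)
{
    assert(srcno < ics->nr_irqs);
    return (ics->lsi_map[srcno / SK_BITS_PER_LONG] >>
            (srcno % SK_BITS_PER_LONG)) & 1ul;
}

static inline bool sketch_is_free(const SketchICS *ics, uint32_t srcno)
{
    assert(srcno < ics->nr_irqs);
    return !(ics->flags[srcno] & SK_FLAG_MASK);
}

/* Allocate the source; its type comes from the bitmap, MSI by default. */
static inline void sketch_claim(SketchICS *ics, uint32_t srcno)
{
    assert(sketch_is_free(ics, srcno));
    ics->flags[srcno] |= sketch_is_lsi(ics, srcno) ? SK_FLAG_LSI : SK_FLAG_MSI;
}
```

In this sketch, marking a source as LSI leaves it free, so code that treats
non-zero flags as "allocated" keeps working unchanged.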

There are multiple changes in this patch but having them all at once 
makes the overall picture clearer. Having a set_lsi method would 
probably help for the "Identify the PCI LSIs" section.
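
For illustration only, a set_lsi method could look roughly like the
standalone sketch below: each backend provides an optional hook, and one
generic loop identifies the fixed PCI LSI range. The Sketch*/SK_* names, the
range constants and the hook signature are all assumptions made for this
example, not something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_NR_IRQS      64
#define SK_PCI_LSI_BASE 16 /* hypothetical start of the fixed LSI range */
#define SK_PCI_LSI_NR   8  /* hypothetical size of that range */

typedef struct SketchMachine SketchMachine;

/* Hypothetical per-backend description, loosely modelled on sPAPRIrq. */
typedef struct SketchIrqBackend {
    int offset; /* source number = global IRQ number - offset */
    /* Optional hook; NULL when the backend has no fixed LSI range. */
    void (*set_lsi)(SketchMachine *m, int irq);
} SketchIrqBackend;

struct SketchMachine {
    const SketchIrqBackend *irq;
    bool lsi[SK_NR_IRQS]; /* stand-in for the controller's LSI bitmap */
};

/* Example hook: translate the global number and record the type. */
static inline void sketch_backend_set_lsi(SketchMachine *m, int irq)
{
    int srcno = irq - m->irq->offset;

    assert(srcno >= 0 && srcno < SK_NR_IRQS);
    m->lsi[srcno] = true;
}

/* Generic "Identify the PCI LSIs" step shared by all backends. */
static inline void sketch_identify_pci_lsis(SketchMachine *m)
{
    if (!m->irq->set_lsi) {
        return;
    }
    for (int i = 0; i < SK_PCI_LSI_NR; i++) {
        m->irq->set_lsi(m, SK_PCI_LSI_BASE + i);
    }
}
```

With such a hook the init functions would not need to open-code the loop for
each interrupt controller; whether that fits the real sPAPRIrq structure is
left open here.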

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.

> ---
>  hw/intc/spapr_xive.c        |    7 +------
>  hw/intc/xics.c              |   10 ++++++++--
>  hw/intc/xics_kvm.c          |    2 +-
>  hw/ppc/pnv_psi.c            |    3 ++-
>  hw/ppc/spapr_events.c       |    4 ++--
>  hw/ppc/spapr_irq.c          |   42 ++++++++++++++++++++++++++++++++----------
>  hw/ppc/spapr_pci.c          |    6 ++++--
>  hw/ppc/spapr_vio.c          |    2 +-
>  include/hw/ppc/spapr_irq.h  |    5 +++--
>  include/hw/ppc/spapr_xive.h |    2 +-
>  include/hw/ppc/xics.h       |    4 +++-
>  11 files changed, 58 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 290a290e43a5..815263ca72ab 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -480,18 +480,13 @@ static void spapr_xive_register_types(void)
>  
>  type_init(spapr_xive_register_types)
>  
> -bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi)
> +bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn)
>  {
> -    XiveSource *xsrc = &xive->source;
> -
>      if (lisn >= xive->nr_irqs) {
>          return false;
>      }
>  
>      xive->eat[lisn].w |= cpu_to_be64(EAS_VALID);
> -    if (lsi) {
> -        xive_source_irq_set_lsi(xsrc, lisn);
> -    }
>      return true;
>  }
>  
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 7cac138067e2..26e8940d7329 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -636,6 +636,7 @@ static void ics_base_realize(DeviceState *dev, Error **errp)
>          return;
>      }
>      ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
> +    ics->lsi_map = bitmap_new(ics->nr_irqs);
>  }
>  
>  static int ics_base_dispatch_pre_save(void *opaque)
> @@ -733,12 +734,17 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
>      return xic->icp_get(xi, server);
>  }
>  
> -void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
> +void ics_set_lsi(ICSState *ics, int srcno)
> +{
> +    set_bit(srcno, ics->lsi_map);
> +}
> +
> +void ics_claim_irq(ICSState *ics, int srcno)
>  {
>      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
>  
>      ics->irqs[srcno].flags |=
> -        lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
> +        test_bit(srcno, ics->lsi_map) ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
>  }
>  
>  static void xics_register_types(void)
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index dff13300504c..e63979abc7fc 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -271,7 +271,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
>              state |= KVM_XICS_MASKED;
>          }
>  
> -        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
> +        if (test_bit(i, ics->lsi_map)) {
>              state |= KVM_XICS_LEVEL_SENSITIVE;
>              if (irq->status & XICS_STATUS_ASSERTED) {
>                  state |= KVM_XICS_PENDING;
> diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
> index 8ced09506321..e6089e1035c0 100644
> --- a/hw/ppc/pnv_psi.c
> +++ b/hw/ppc/pnv_psi.c
> @@ -487,7 +487,8 @@ static void pnv_psi_realize(DeviceState *dev, Error **errp)
>      }
>  
>      for (i = 0; i < ics->nr_irqs; i++) {
> -        ics_set_irq_type(ics, i, true);
> +        ics_set_lsi(ics, i);
> +        ics_claim_irq(ics, i);
>      }
>  
>      psi->qirqs = qemu_allocate_irqs(ics_simple_set_irq, ics, ics->nr_irqs);
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index b9c7ecb9e987..559026d0981c 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -713,7 +713,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
>          epow_irq = spapr_irq_findone(spapr, &error_fatal);
>      }
>  
> -    spapr_irq_claim(spapr, epow_irq, false, &error_fatal);
> +    spapr_irq_claim(spapr, epow_irq, &error_fatal);
>  
>      QTAILQ_INIT(&spapr->pending_events);
>  
> @@ -737,7 +737,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
>              hp_irq = spapr_irq_findone(spapr, &error_fatal);
>          }
>  
> -        spapr_irq_claim(spapr, hp_irq, false, &error_fatal);
> +        spapr_irq_claim(spapr, hp_irq, &error_fatal);
>  
>          spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
>                                       hp_irq);
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 8217e0215411..3fc34d7c8a43 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -16,10 +16,13 @@
>  #include "hw/ppc/spapr_xive.h"
>  #include "hw/ppc/xics.h"
>  #include "hw/ppc/xics_spapr.h"
> +#include "hw/pci-host/spapr.h"
>  #include "sysemu/kvm.h"
>  
>  #include "trace.h"
>  
> +#define SPAPR_IRQ_PCI_LSI_NR     (SPAPR_MAX_PHBS * PCI_NUM_PINS)
> +
>  void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_msis)
>  {
>      spapr->irq_map_nr = nr_msis;
> @@ -102,6 +105,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
>      MachineState *machine = MACHINE(spapr);
>      int nr_irqs = spapr->irq->nr_irqs;
>      Error *local_err = NULL;
> +    int i;
>  
>      if (kvm_enabled()) {
>          if (machine_kernel_irqchip_allowed(machine) &&
> @@ -128,6 +132,14 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
>                                        &local_err);
>      }
>  
> +    /* Identify the PCI LSIs */
> +    if (!SPAPR_MACHINE_GET_CLASS(spapr)->legacy_irq_allocation) {
> +        for (i = 0; i < SPAPR_IRQ_PCI_LSI_NR; ++i) {
> +            ics_set_lsi(spapr->ics,
> +                        i + SPAPR_IRQ_PCI_LSI - spapr->irq->xics_offset);
> +        }
> +    }
> +
>  error:
>      error_propagate(errp, local_err);
>  }
> @@ -135,7 +147,7 @@ error:
>  #define ICS_IRQ_FREE(ics, srcno)   \
>      (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
>  
> -static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq, bool lsi,
> +static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq,
>                                  Error **errp)
>  {
>      ICSState *ics = spapr->ics;
> @@ -152,7 +164,7 @@ static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq, bool lsi,
>          return -1;
>      }
>  
> -    ics_set_irq_type(ics, irq - ics->offset, lsi);
> +    ics_claim_irq(ics, irq - ics->offset);
>      return 0;
>  }
>  
> @@ -296,16 +308,21 @@ static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
>  
>      /* Enable the CPU IPIs */
>      for (i = 0; i < nr_servers; ++i) {
> -        spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false);
> +        spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i);
> +    }
> +
> +    /* Identify the PCI LSIs */
> +    for (i = 0; i < SPAPR_IRQ_PCI_LSI_NR; ++i) {
> +        xive_source_irq_set_lsi(&spapr->xive->source, SPAPR_IRQ_PCI_LSI + i);
>      }
>  
>      spapr_xive_hcall_init(spapr);
>  }
>  
> -static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq, bool lsi,
> +static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq,
>                                  Error **errp)
>  {
> -    if (!spapr_xive_irq_claim(spapr->xive, irq, lsi)) {
> +    if (!spapr_xive_irq_claim(spapr->xive, irq)) {
>          error_setg(errp, "IRQ %d is invalid", irq);
>          return -1;
>      }
> @@ -465,19 +482,19 @@ static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
>      }
>  }
>  
> -static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq, bool lsi,
> +static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq,
>                                  Error **errp)
>  {
>      Error *local_err = NULL;
>      int ret;
>  
> -    ret = spapr_irq_xics.claim(spapr, irq, lsi, &local_err);
> +    ret = spapr_irq_xics.claim(spapr, irq, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return ret;
>      }
>  
> -    ret = spapr_irq_xive.claim(spapr, irq, lsi, &local_err);
> +    ret = spapr_irq_xive.claim(spapr, irq, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return ret;
> @@ -630,9 +647,9 @@ void spapr_irq_init(sPAPRMachineState *spapr, Error **errp)
>                                        spapr->irq->nr_irqs);
>  }
>  
> -int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp)
> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, Error **errp)
>  {
> -    return spapr->irq->claim(spapr, irq, lsi, errp);
> +    return spapr->irq->claim(spapr, irq, errp);
>  }
>  
>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
> @@ -712,6 +729,11 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
>      return first + ics->offset;
>  }
>  
> +void spapr_irq_set_lsi_legacy(sPAPRMachineState *spapr, int irq)
> +{
> +    ics_set_lsi(spapr->ics, irq - spapr->irq->xics_offset);
> +}
> +
>  #define SPAPR_IRQ_XICS_LEGACY_NR_IRQS     0x400
>  
>  sPAPRIrq spapr_irq_xics_legacy = {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index c3fb0ac884b0..d68595531d5a 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -391,7 +391,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>      }
>  
>      for (i = 0; i < req_num; i++) {
> -        spapr_irq_claim(spapr, irq + i, false, &err);
> +        spapr_irq_claim(spapr, irq + i, &err);
>          if (err) {
>              if (i) {
>                  spapr_irq_free(spapr, irq, i);
> @@ -1742,9 +1742,11 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>                                          "can't allocate LSIs: ");
>                  return;
>              }
> +
> +            spapr_irq_set_lsi_legacy(spapr, irq);
>          }
>  
> -        spapr_irq_claim(spapr, irq, true, &local_err);
> +        spapr_irq_claim(spapr, irq, &local_err);
>          if (local_err) {
>              error_propagate_prepend(errp, local_err, "can't allocate LSIs: ");
>              return;
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index 2b7e7ecac57f..b1beefc24be5 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -512,7 +512,7 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>          }
>      }
>  
> -    spapr_irq_claim(spapr, dev->irq, false, &local_err);
> +    spapr_irq_claim(spapr, dev->irq, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 5e30858dc22a..0e6c65d55430 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -37,7 +37,7 @@ typedef struct sPAPRIrq {
>      uint32_t    xics_offset;
>  
>      void (*init)(sPAPRMachineState *spapr, Error **errp);
> -    int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> +    int (*claim)(sPAPRMachineState *spapr, int irq, Error **errp);
>      void (*free)(sPAPRMachineState *spapr, int irq, int num);
>      qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
>      void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
> @@ -56,7 +56,7 @@ extern sPAPRIrq spapr_irq_xive;
>  extern sPAPRIrq spapr_irq_dual;
>  
>  void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
> -int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, Error **errp);
>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
>  int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
> @@ -67,5 +67,6 @@ void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
>   */
>  int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp);
>  #define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
> +void spapr_irq_set_lsi_legacy(sPAPRMachineState *spapr, int irq);
>  
>  #endif
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 9bec9192e4a0..885ca169cb29 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -37,7 +37,7 @@ typedef struct sPAPRXive {
>      MemoryRegion  tm_mmio;
>  } sPAPRXive;
>  
> -bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi);
> +bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn);
>  bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
>  void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
>  
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index fad786e8b22d..18b083fe2aec 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -133,6 +133,7 @@ struct ICSState {
>      uint32_t offset;
>      ICSIRQState *irqs;
>      XICSFabric *xics;
> +    unsigned long *lsi_map;
>  };
>  
>  #define ICS_PROP_XICS "xics"
> @@ -193,7 +194,8 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
>  void ics_simple_set_irq(void *opaque, int srcno, int val);
>  void ics_kvm_set_irq(void *opaque, int srcno, int val);
>  
> -void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
> +void ics_set_lsi(ICSState *ics, int srcno);
> +void ics_claim_irq(ICSState *ics, int srcno);
>  void icp_pic_print_info(ICPState *icp, Monitor *mon);
>  void ics_pic_print_info(ICSState *ics, Monitor *mon);
>  
> 

* Re: [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq Greg Kurz
  2019-02-12 20:07   ` Cédric Le Goater
@ 2019-02-13  3:26   ` David Gibson
  2019-02-13 12:23     ` Greg Kurz
  1 sibling, 1 reply; 35+ messages in thread
From: David Gibson @ 2019-02-13  3:26 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:24:00PM +0100, Greg Kurz wrote:
> Only pseries machines, either recent ones started with ic-mode=xics
> or older ones using the legacy irq allocation scheme, need to set the
> @offset of the ICS to XICS_IRQ_BASE. Recent pseries started with
> ic-mode=dual set it to 0 and powernv machines set it to some other
> value at runtime.
> 
> It thus doesn't really help to set the default value of the ICS offset
> to XICS_IRQ_BASE in ics_base_instance_init().
> 
> Drop that code from XICS and let the pseries code set the offset
> explicitly for clarity.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

So this actually relates to a discussion I've had on some of Cédric's
more recent patches.  Changing the ics offset in ic-mode=dual doesn't
make sense to me.  The global (guest) interrupt numbers need to match
between XICS and XIVE, but the global interrupt numbers don't have to
match the ICS source numbers, which is what ics->offset is about.
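
A small standalone sketch of that distinction, under the assumption that the
offset is simply the global number of source 0. The 0x1000 base is the
XICS_IRQ_BASE value quoted in the patch comment; the Sketch* names and the
sizes used in the usage note are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical source controller: maps global IRQ numbers to sources. */
typedef struct SketchSourceCtrl {
    uint32_t offset;  /* global IRQ number of source 0 */
    uint32_t nr_irqs; /* number of sources */
} SketchSourceCtrl;

/* True if the global number falls inside this controller's range. */
static inline bool sketch_irq_valid(const SketchSourceCtrl *c, uint32_t irq)
{
    return irq >= c->offset && irq - c->offset < c->nr_irqs;
}

/* Global (guest-visible) number -> controller-local source number. */
static inline uint32_t sketch_irq_to_srcno(const SketchSourceCtrl *c,
                                           uint32_t irq)
{
    assert(sketch_irq_valid(c, irq));
    return irq - c->offset;
}

/* Controller-local source number -> global (guest-visible) number. */
static inline uint32_t sketch_srcno_to_irq(const SketchSourceCtrl *c,
                                           uint32_t srcno)
{
    assert(srcno < c->nr_irqs);
    return srcno + c->offset;
}
```

For example, with one controller at offset 0x1000 and another at offset 0,
the same global number 0x1200 resolves to source 0x200 on the first and
source 0x1200 on the second, so the global numbers can agree even though the
source numbers differ.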

> ---
>  hw/intc/xics.c             |    8 --------
>  hw/ppc/spapr_irq.c         |   33 ++++++++++++++++++++-------------
>  include/hw/ppc/spapr_irq.h |    1 +
>  3 files changed, 21 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 16e8ffa2aaf7..7cac138067e2 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -638,13 +638,6 @@ static void ics_base_realize(DeviceState *dev, Error **errp)
>      ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
>  }
>  
> -static void ics_base_instance_init(Object *obj)
> -{
> -    ICSState *ics = ICS_BASE(obj);
> -
> -    ics->offset = XICS_IRQ_BASE;
> -}
> -
>  static int ics_base_dispatch_pre_save(void *opaque)
>  {
>      ICSState *ics = opaque;
> @@ -720,7 +713,6 @@ static const TypeInfo ics_base_info = {
>      .parent = TYPE_DEVICE,
>      .abstract = true,
>      .instance_size = sizeof(ICSState),
> -    .instance_init = ics_base_instance_init,
>      .class_init = ics_base_class_init,
>      .class_size = sizeof(ICSStateClass),
>  };
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 80b0083b8e38..8217e0215411 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -68,10 +68,11 @@ void spapr_irq_msi_reset(sPAPRMachineState *spapr)
>  
>  static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
>                                    const char *type_ics,
> -                                  int nr_irqs, Error **errp)
> +                                  int nr_irqs, int offset, Error **errp)
>  {
>      Error *local_err = NULL;
>      Object *obj;
> +    ICSState *ics;
>  
>      obj = object_new(type_ics);
>      object_property_add_child(OBJECT(spapr), "ics", obj, &error_abort);
> @@ -86,7 +87,10 @@ static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
>          goto error;
>      }
>  
> -    return ICS_BASE(obj);
> +    ics = ICS_BASE(obj);
> +    ics->offset = offset;
> +
> +    return ics;
>  
>  error:
>      error_propagate(errp, local_err);
> @@ -104,6 +108,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
>              !xics_kvm_init(spapr, &local_err)) {
>              spapr->icp_type = TYPE_KVM_ICP;
>              spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs,
> +                                          spapr->irq->xics_offset,
>                                            &local_err);
>          }
>          if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
> @@ -119,6 +124,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
>          xics_spapr_init(spapr);
>          spapr->icp_type = TYPE_ICP;
>          spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs,
> +                                      spapr->irq->xics_offset,
>                                        &local_err);
>      }
>  
> @@ -246,6 +252,7 @@ sPAPRIrq spapr_irq_xics = {
>      .nr_irqs     = SPAPR_IRQ_XICS_NR_IRQS,
>      .nr_msis     = SPAPR_IRQ_XICS_NR_MSIS,
>      .ov5         = SPAPR_OV5_XIVE_LEGACY,
> +    .xics_offset = XICS_IRQ_BASE,
>  
>      .init        = spapr_irq_init_xics,
>      .claim       = spapr_irq_claim_xics,
> @@ -451,17 +458,6 @@ static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
>          return;
>      }
>  
> -    /*
> -     * Align the XICS and the XIVE IRQ number space under QEMU.
> -     *
> -     * However, the XICS KVM device still considers that the IRQ
> -     * numbers should start at XICS_IRQ_BASE (0x1000). Either we
> -     * should introduce a KVM device ioctl to set the offset or ignore
> -     * the lower 4K numbers when using the get/set ioctl of the XICS
> -     * KVM device. The second option seems the least intrusive.
> -     */
> -    spapr->ics->offset = 0;
> -
>      spapr_irq_xive.init(spapr, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
> @@ -582,6 +578,16 @@ sPAPRIrq spapr_irq_dual = {
>      .nr_irqs     = SPAPR_IRQ_DUAL_NR_IRQS,
>      .nr_msis     = SPAPR_IRQ_DUAL_NR_MSIS,
>      .ov5         = SPAPR_OV5_XIVE_BOTH,
> +    /*
> +     * Align the XICS and the XIVE IRQ number space under QEMU.
> +     *
> +     * However, the XICS KVM device still considers that the IRQ
> +     * numbers should start at XICS_IRQ_BASE (0x1000). Either we
> +     * should introduce a KVM device ioctl to set the offset or ignore
> +     * the lower 4K numbers when using the get/set ioctl of the XICS
> +     * KVM device. The second option seems the least intrusive.
> +     */
> +    .xics_offset = 0,
>  
>      .init        = spapr_irq_init_dual,
>      .claim       = spapr_irq_claim_dual,
> @@ -712,6 +718,7 @@ sPAPRIrq spapr_irq_xics_legacy = {
>      .nr_irqs     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
>      .nr_msis     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
>      .ov5         = SPAPR_OV5_XIVE_LEGACY,
> +    .xics_offset = XICS_IRQ_BASE,
>  
>      .init        = spapr_irq_init_xics,
>      .claim       = spapr_irq_claim_xics,
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 14b02c3aca33..5e30858dc22a 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -34,6 +34,7 @@ typedef struct sPAPRIrq {
>      uint32_t    nr_irqs;
>      uint32_t    nr_msis;
>      uint8_t     ov5;
> +    uint32_t    xics_offset;
>  
>      void (*init)(sPAPRMachineState *spapr, Error **errp);
>      int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH v4 02/15] xive: Only set source type for LSIs
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 02/15] xive: Only set source type for LSIs Greg Kurz
@ 2019-02-13  3:27   ` David Gibson
  0 siblings, 0 replies; 35+ messages in thread
From: David Gibson @ 2019-02-13  3:27 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:24:06PM +0100, Greg Kurz wrote:
> MSI is the default and LSI specific code is guarded by the
> xive_source_irq_is_lsi() helper. The xive_source_irq_set()
> helper is a nop for MSIs.
> 
> Simplify the code by turning xive_source_irq_set() into
> xive_source_irq_set_lsi() and only call it for LSIs. The
> call to xive_source_irq_set(false) in spapr_xive_irq_free()
> is also a nop. Just drop it.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Reviewed-by: Cédric Le Goater <clg@kaod.org>

Looks like a reasonable cleanup regardless of the rest of the series.
Applied to ppc-for-4.0.

> ---
>  hw/intc/spapr_xive.c  |    7 +++----
>  include/hw/ppc/xive.h |    7 ++-----
>  2 files changed, 5 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index a0f5ff929447..290a290e43a5 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -489,20 +489,19 @@ bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi)
>      }
>  
>      xive->eat[lisn].w |= cpu_to_be64(EAS_VALID);
> -    xive_source_irq_set(xsrc, lisn, lsi);
> +    if (lsi) {
> +        xive_source_irq_set_lsi(xsrc, lisn);
> +    }
>      return true;
>  }
>  
>  bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn)
>  {
> -    XiveSource *xsrc = &xive->source;
> -
>      if (lisn >= xive->nr_irqs) {
>          return false;
>      }
>  
>      xive->eat[lisn].w &= cpu_to_be64(~EAS_VALID);
> -    xive_source_irq_set(xsrc, lisn, false);
>      return true;
>  }
>  
> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> index ec3bb2aae45a..13a487527b11 100644
> --- a/include/hw/ppc/xive.h
> +++ b/include/hw/ppc/xive.h
> @@ -283,13 +283,10 @@ static inline bool xive_source_irq_is_lsi(XiveSource *xsrc, uint32_t srcno)
>      return test_bit(srcno, xsrc->lsi_map);
>  }
>  
> -static inline void xive_source_irq_set(XiveSource *xsrc, uint32_t srcno,
> -                                       bool lsi)
> +static inline void xive_source_irq_set_lsi(XiveSource *xsrc, uint32_t srcno)
>  {
>      assert(srcno < xsrc->nr_irqs);
> -    if (lsi) {
> -        bitmap_set(xsrc->lsi_map, srcno, 1);
> -    }
> +    bitmap_set(xsrc->lsi_map, srcno, 1);
>  }
>  
>  void xive_source_set_irq(void *opaque, int srcno, int val);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH v4 03/15] spapr_irq: Set LSIs at interrupt controller init
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 03/15] spapr_irq: Set LSIs at interrupt controller init Greg Kurz
  2019-02-12 20:17   ` Cédric Le Goater
@ 2019-02-13  3:48   ` David Gibson
  2019-02-13 12:44     ` Greg Kurz
  1 sibling, 1 reply; 35+ messages in thread
From: David Gibson @ 2019-02-13  3:48 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:24:13PM +0100, Greg Kurz wrote:
> The pseries machine only uses LSIs to support legacy PCI devices. Every
> PHB claims 4 LSIs at realize time. When using in-kernel XICS (or upcoming
> in-kernel XIVE), QEMU synchronizes the state of all irqs, including these
> LSIs, later on at machine reset.
> 
> In order to support PHB hotplug, we need a way to tell KVM about the LSIs
> that doesn't require a machine reset.
> 
> Since recent machine types allocate all these LSIs in a fixed range for
> the machine lifetime, identify them when initializing the interrupt
> controller, long before they get passed to KVM.
> 
> In order to do that, first disintricate interrupt typing and allocation.
> Since the vast majority of interrupts are MSIs, make that the default
> and have only the LSI users to explicitely set the type.
> 
> It is rather straightforward for XIVE. XICS needs some extra care
> though: allocation state and type are mixed up in the same bits of the
> flags field within the interrupt state. Setting the LSI bit there at
> init time would mean the interrupt is de facto allocated, even if no
> device asked for it. Introduce a bitmap to track LSIs at the ICS level.
> In order to keep the patch minimal, the bitmap is only used when writing
> the source state to KVM and when the interrupt is claimed, so that the
> code that checks the interrupt type through the flags stays untouched.
> 
> With older pseries machines using the XICS legacy IRQ allocation scheme,
> all interrupt numbers come from a common pool and there's no such thing
> as a fixed range for LSIs. Introduce a helper so that these older
> machine types can continue to set the type when allocating the LSI.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/intc/spapr_xive.c        |    7 +------
>  hw/intc/xics.c              |   10 ++++++++--
>  hw/intc/xics_kvm.c          |    2 +-
>  hw/ppc/pnv_psi.c            |    3 ++-
>  hw/ppc/spapr_events.c       |    4 ++--
>  hw/ppc/spapr_irq.c          |   42 ++++++++++++++++++++++++++++++++----------
>  hw/ppc/spapr_pci.c          |    6 ++++--
>  hw/ppc/spapr_vio.c          |    2 +-
>  include/hw/ppc/spapr_irq.h  |    5 +++--
>  include/hw/ppc/spapr_xive.h |    2 +-
>  include/hw/ppc/xics.h       |    4 +++-
>  11 files changed, 58 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 290a290e43a5..815263ca72ab 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -480,18 +480,13 @@ static void spapr_xive_register_types(void)
>  
>  type_init(spapr_xive_register_types)
>  
> -bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi)
> +bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn)
>  {
> -    XiveSource *xsrc = &xive->source;
> -
>      if (lisn >= xive->nr_irqs) {
>          return false;
>      }
>  
>      xive->eat[lisn].w |= cpu_to_be64(EAS_VALID);
> -    if (lsi) {
> -        xive_source_irq_set_lsi(xsrc, lisn);
> -    }
>      return true;
>  }
>  
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 7cac138067e2..26e8940d7329 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -636,6 +636,7 @@ static void ics_base_realize(DeviceState *dev, Error **errp)
>          return;
>      }
>      ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
> +    ics->lsi_map = bitmap_new(ics->nr_irqs);
>  }
>  
>  static int ics_base_dispatch_pre_save(void *opaque)
> @@ -733,12 +734,17 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
>      return xic->icp_get(xi, server);
>  }
>  
> -void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
> +void ics_set_lsi(ICSState *ics, int srcno)
> +{
> +    set_bit(srcno, ics->lsi_map);
> +}
> +
> +void ics_claim_irq(ICSState *ics, int srcno)
>  {
>      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
>  
>      ics->irqs[srcno].flags |=
> -        lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
> +        test_bit(srcno, ics->lsi_map) ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;

I really don't like having the trigger type redundantly stored in the
lsi_map and then again in the flags fields.

In a sense the natural way to do this would be more like the hardware
- have two source objects, one for MSIs and one for LSIs, and make the
trigger type a per-ICSState property rather than a per-IRQState one.
But that would make life hard for the legacy support.

But... thinking about it, isn't all this overkill anyway?  Can't we
fix the problem by simply forcing an ics_set_kvm_state() (and the xive
equivalent) at claim time?  It's not like it's a hot path.
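
A rough standalone sketch of that idea, with toy types standing in for
ICSState and for the push to KVM.  Every toy_* name below is made up for
illustration; this is not the actual QEMU or KVM interface, only the
shape of "set the type in the flags, then sync that one source":

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins, assumed for illustration; not the QEMU definitions. */
#define TOY_FLAG_MSI  0x1
#define TOY_FLAG_LSI  0x2
#define TOY_FLAG_MASK (TOY_FLAG_MSI | TOY_FLAG_LSI)
#define TOY_NR_IRQS   8

struct toy_ics {
    uint8_t flags[TOY_NR_IRQS];   /* QEMU-side trigger type, 0 == free */
    bool kvm_level[TOY_NR_IRQS];  /* what the "kernel" was last told */
    bool kvm_enabled;
    int kvm_syncs;                /* number of pushes to the "kernel" */
};

/* Hypothetical per-source sync, standing in for a write to KVM. */
void toy_ics_sync_one(struct toy_ics *ics, int srcno)
{
    ics->kvm_level[srcno] = (ics->flags[srcno] & TOY_FLAG_LSI) != 0;
    ics->kvm_syncs++;
}

/*
 * Claim a source: the trigger type lives only in the flags, and the
 * kernel is told about it right away instead of at machine reset.
 */
void toy_ics_claim(struct toy_ics *ics, int srcno, bool lsi)
{
    assert(srcno >= 0 && srcno < TOY_NR_IRQS);
    assert(!(ics->flags[srcno] & TOY_FLAG_MASK)); /* must be free */

    ics->flags[srcno] |= lsi ? TOY_FLAG_LSI : TOY_FLAG_MSI;
    if (ics->kvm_enabled) {
        toy_ics_sync_one(ics, srcno);
    }
}
```

With something along these lines there is a single copy of the trigger
type, and a hotplugged PHB gets its LSIs known to the kernel as a side
effect of claiming them.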

>  }
>  
>  static void xics_register_types(void)
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index dff13300504c..e63979abc7fc 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -271,7 +271,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
>              state |= KVM_XICS_MASKED;
>          }
>  
> -        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
> +        if (test_bit(i, ics->lsi_map)) {
>              state |= KVM_XICS_LEVEL_SENSITIVE;
>              if (irq->status & XICS_STATUS_ASSERTED) {
>                  state |= KVM_XICS_PENDING;
> diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
> index 8ced09506321..e6089e1035c0 100644
> --- a/hw/ppc/pnv_psi.c
> +++ b/hw/ppc/pnv_psi.c
> @@ -487,7 +487,8 @@ static void pnv_psi_realize(DeviceState *dev, Error **errp)
>      }
>  
>      for (i = 0; i < ics->nr_irqs; i++) {
> -        ics_set_irq_type(ics, i, true);
> +        ics_set_lsi(ics, i);
> +        ics_claim_irq(ics, i);
>      }
>  
>      psi->qirqs = qemu_allocate_irqs(ics_simple_set_irq, ics, ics->nr_irqs);
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index b9c7ecb9e987..559026d0981c 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -713,7 +713,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
>          epow_irq = spapr_irq_findone(spapr, &error_fatal);
>      }
>  
> -    spapr_irq_claim(spapr, epow_irq, false, &error_fatal);
> +    spapr_irq_claim(spapr, epow_irq, &error_fatal);
>  
>      QTAILQ_INIT(&spapr->pending_events);
>  
> @@ -737,7 +737,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
>              hp_irq = spapr_irq_findone(spapr, &error_fatal);
>          }
>  
> -        spapr_irq_claim(spapr, hp_irq, false, &error_fatal);
> +        spapr_irq_claim(spapr, hp_irq, &error_fatal);
>  
>          spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
>                                       hp_irq);
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 8217e0215411..3fc34d7c8a43 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -16,10 +16,13 @@
>  #include "hw/ppc/spapr_xive.h"
>  #include "hw/ppc/xics.h"
>  #include "hw/ppc/xics_spapr.h"
> +#include "hw/pci-host/spapr.h"
>  #include "sysemu/kvm.h"
>  
>  #include "trace.h"
>  
> +#define SPAPR_IRQ_PCI_LSI_NR     (SPAPR_MAX_PHBS * PCI_NUM_PINS)
> +
>  void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_msis)
>  {
>      spapr->irq_map_nr = nr_msis;
> @@ -102,6 +105,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
>      MachineState *machine = MACHINE(spapr);
>      int nr_irqs = spapr->irq->nr_irqs;
>      Error *local_err = NULL;
> +    int i;
>  
>      if (kvm_enabled()) {
>          if (machine_kernel_irqchip_allowed(machine) &&
> @@ -128,6 +132,14 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
>                                        &local_err);
>      }
>  
> +    /* Identify the PCI LSIs */
> +    if (!SPAPR_MACHINE_GET_CLASS(spapr)->legacy_irq_allocation) {
> +        for (i = 0; i < SPAPR_IRQ_PCI_LSI_NR; ++i) {
> +            ics_set_lsi(spapr->ics,
> +                        i + SPAPR_IRQ_PCI_LSI - spapr->irq->xics_offset);
> +        }
> +    }
> +
>  error:
>      error_propagate(errp, local_err);
>  }
> @@ -135,7 +147,7 @@ error:
>  #define ICS_IRQ_FREE(ics, srcno)   \
>      (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
>  
> -static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq, bool lsi,
> +static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq,
>                                  Error **errp)
>  {
>      ICSState *ics = spapr->ics;
> @@ -152,7 +164,7 @@ static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq, bool lsi,
>          return -1;
>      }
>  
> -    ics_set_irq_type(ics, irq - ics->offset, lsi);
> +    ics_claim_irq(ics, irq - ics->offset);
>      return 0;
>  }
>  
> @@ -296,16 +308,21 @@ static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
>  
>      /* Enable the CPU IPIs */
>      for (i = 0; i < nr_servers; ++i) {
> -        spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false);
> +        spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i);
> +    }
> +
> +    /* Identify the PCI LSIs */
> +    for (i = 0; i < SPAPR_IRQ_PCI_LSI_NR; ++i) {
> +        xive_source_irq_set_lsi(&spapr->xive->source, SPAPR_IRQ_PCI_LSI + i);
>      }
>  
>      spapr_xive_hcall_init(spapr);
>  }
>  
> -static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq, bool lsi,
> +static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq,
>                                  Error **errp)
>  {
> -    if (!spapr_xive_irq_claim(spapr->xive, irq, lsi)) {
> +    if (!spapr_xive_irq_claim(spapr->xive, irq)) {
>          error_setg(errp, "IRQ %d is invalid", irq);
>          return -1;
>      }
> @@ -465,19 +482,19 @@ static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
>      }
>  }
>  
> -static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq, bool lsi,
> +static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq,
>                                  Error **errp)
>  {
>      Error *local_err = NULL;
>      int ret;
>  
> -    ret = spapr_irq_xics.claim(spapr, irq, lsi, &local_err);
> +    ret = spapr_irq_xics.claim(spapr, irq, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return ret;
>      }
>  
> -    ret = spapr_irq_xive.claim(spapr, irq, lsi, &local_err);
> +    ret = spapr_irq_xive.claim(spapr, irq, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return ret;
> @@ -630,9 +647,9 @@ void spapr_irq_init(sPAPRMachineState *spapr, Error **errp)
>                                        spapr->irq->nr_irqs);
>  }
>  
> -int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp)
> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, Error **errp)
>  {
> -    return spapr->irq->claim(spapr, irq, lsi, errp);
> +    return spapr->irq->claim(spapr, irq, errp);
>  }
>  
>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
> @@ -712,6 +729,11 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
>      return first + ics->offset;
>  }
>  
> +void spapr_irq_set_lsi_legacy(sPAPRMachineState *spapr, int irq)
> +{
> +    ics_set_lsi(spapr->ics, irq - spapr->irq->xics_offset);
> +}
> +
>  #define SPAPR_IRQ_XICS_LEGACY_NR_IRQS     0x400
>  
>  sPAPRIrq spapr_irq_xics_legacy = {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index c3fb0ac884b0..d68595531d5a 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -391,7 +391,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>      }
>  
>      for (i = 0; i < req_num; i++) {
> -        spapr_irq_claim(spapr, irq + i, false, &err);
> +        spapr_irq_claim(spapr, irq + i, &err);
>          if (err) {
>              if (i) {
>                  spapr_irq_free(spapr, irq, i);
> @@ -1742,9 +1742,11 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>                                          "can't allocate LSIs: ");
>                  return;
>              }
> +
> +            spapr_irq_set_lsi_legacy(spapr, irq);
>          }
>  
> -        spapr_irq_claim(spapr, irq, true, &local_err);
> +        spapr_irq_claim(spapr, irq, &local_err);
>          if (local_err) {
>              error_propagate_prepend(errp, local_err, "can't allocate LSIs: ");
>              return;
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index 2b7e7ecac57f..b1beefc24be5 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -512,7 +512,7 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
>          }
>      }
>  
> -    spapr_irq_claim(spapr, dev->irq, false, &local_err);
> +    spapr_irq_claim(spapr, dev->irq, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 5e30858dc22a..0e6c65d55430 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -37,7 +37,7 @@ typedef struct sPAPRIrq {
>      uint32_t    xics_offset;
>  
>      void (*init)(sPAPRMachineState *spapr, Error **errp);
> -    int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> +    int (*claim)(sPAPRMachineState *spapr, int irq, Error **errp);
>      void (*free)(sPAPRMachineState *spapr, int irq, int num);
>      qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
>      void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
> @@ -56,7 +56,7 @@ extern sPAPRIrq spapr_irq_xive;
>  extern sPAPRIrq spapr_irq_dual;
>  
>  void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
> -int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, Error **errp);
>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
>  int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
> @@ -67,5 +67,6 @@ void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
>   */
>  int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp);
>  #define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
> +void spapr_irq_set_lsi_legacy(sPAPRMachineState *spapr, int irq);
>  
>  #endif
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 9bec9192e4a0..885ca169cb29 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -37,7 +37,7 @@ typedef struct sPAPRXive {
>      MemoryRegion  tm_mmio;
>  } sPAPRXive;
>  
> -bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi);
> +bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn);
>  bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
>  void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
>  
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index fad786e8b22d..18b083fe2aec 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -133,6 +133,7 @@ struct ICSState {
>      uint32_t offset;
>      ICSIRQState *irqs;
>      XICSFabric *xics;
> +    unsigned long *lsi_map;
>  };
>  
>  #define ICS_PROP_XICS "xics"
> @@ -193,7 +194,8 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
>  void ics_simple_set_irq(void *opaque, int srcno, int val);
>  void ics_kvm_set_irq(void *opaque, int srcno, int val);
>  
> -void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
> +void ics_set_lsi(ICSState *ics, int srcno);
> +void ics_claim_irq(ICSState *ics, int srcno);
>  void icp_pic_print_info(ICPState *icp, Monitor *mon);
>  void ics_pic_print_info(ICSState *ics, Monitor *mon);
>  
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH v4 04/15] spapr: Expose the name of the interrupt controller node
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 04/15] spapr: Expose the name of the interrupt controller node Greg Kurz
@ 2019-02-13  3:50   ` David Gibson
  0 siblings, 0 replies; 35+ messages in thread
From: David Gibson @ 2019-02-13  3:50 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:24:19PM +0100, Greg Kurz wrote:
> This will be needed by PHB hotplug in order to access the "phandle"
> property of the interrupt controller node.
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
> v4: - folded some changes from patches 15, 16 and 17 of v3
>     - dropped useless helpers
> ---
>  hw/intc/spapr_xive.c        |    9 ++++-----
>  hw/intc/xics_spapr.c        |    2 +-
>  hw/ppc/spapr_irq.c          |   21 ++++++++++++++++++++-
>  include/hw/ppc/spapr_irq.h  |    1 +
>  include/hw/ppc/spapr_xive.h |    3 +++
>  include/hw/ppc/xics_spapr.h |    2 ++
>  6 files changed, 31 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 815263ca72ab..f14e436ad4b9 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -317,6 +317,9 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>      /* Map all regions */
>      spapr_xive_map_mmio(xive);
>  
> +    xive->nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
> +                           xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
> +
>      qemu_register_reset(spapr_xive_reset, dev);
>  }
>  
> @@ -1443,7 +1446,6 @@ void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
>          cpu_to_be32(7),    /* start */
>          cpu_to_be32(0xf8), /* count */
>      };
> -    gchar *nodename;
>  
>      /* Thread Interrupt Management Area : User (ring 3) and OS (ring 2) */
>      timas[0] = cpu_to_be64(xive->tm_base +
> @@ -1453,10 +1455,7 @@ void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
>                             XIVE_TM_OS_PAGE * (1ull << TM_SHIFT));
>      timas[3] = cpu_to_be64(1ull << TM_SHIFT);
>  
> -    nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
> -                           xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
> -    _FDT(node = fdt_add_subnode(fdt, 0, nodename));
> -    g_free(nodename);
> +    _FDT(node = fdt_add_subnode(fdt, 0, xive->nodename));
>  
>      _FDT(fdt_setprop_string(fdt, node, "device_type", "power-ivpe"));
>      _FDT(fdt_setprop(fdt, node, "reg", timas, sizeof(timas)));
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index e2d8b3818336..53bda6661b2a 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -254,7 +254,7 @@ void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
>      };
>      int node;
>  
> -    _FDT(node = fdt_add_subnode(fdt, 0, "interrupt-controller"));
> +    _FDT(node = fdt_add_subnode(fdt, 0, XICS_NODENAME));
>  
>      _FDT(fdt_setprop_string(fdt, node, "device_type",
>                              "PowerPC-External-Interrupt-Presentation"));
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 3fc34d7c8a43..b8d725e251ba 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -256,6 +256,11 @@ static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
>      /* TODO: create the KVM XICS device */
>  }
>  
> +static const char *spapr_irq_get_nodename_xics(sPAPRMachineState *spapr)
> +{
> +    return XICS_NODENAME;
> +}
> +
>  #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
>  #define SPAPR_IRQ_XICS_NR_MSIS     \
>      (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
> @@ -276,6 +281,7 @@ sPAPRIrq spapr_irq_xics = {
>      .post_load   = spapr_irq_post_load_xics,
>      .reset       = spapr_irq_reset_xics,
>      .set_irq     = spapr_irq_set_irq_xics,
> +    .get_nodename = spapr_irq_get_nodename_xics,
>  };
>  
>  /*
> @@ -415,6 +421,11 @@ static void spapr_irq_set_irq_xive(void *opaque, int srcno, int val)
>      xive_source_set_irq(&spapr->xive->source, srcno, val);
>  }
>  
> +static const char *spapr_irq_get_nodename_xive(sPAPRMachineState *spapr)
> +{
> +    return spapr->xive->nodename;
> +}
> +
>  /*
>   * XIVE uses the full IRQ number space. Set it to 8K to be compatible
>   * with XICS.
> @@ -438,6 +449,7 @@ sPAPRIrq spapr_irq_xive = {
>      .post_load   = spapr_irq_post_load_xive,
>      .reset       = spapr_irq_reset_xive,
>      .set_irq     = spapr_irq_set_irq_xive,
> +    .get_nodename = spapr_irq_get_nodename_xive,
>  };
>  
>  /*
> @@ -585,6 +597,11 @@ static void spapr_irq_set_irq_dual(void *opaque, int srcno, int val)
>      spapr_irq_current(spapr)->set_irq(spapr, srcno, val);
>  }
>  
> +static const char *spapr_irq_get_nodename_dual(sPAPRMachineState *spapr)
> +{
> +    return spapr_irq_current(spapr)->get_nodename(spapr);
> +}
> +
>  /*
>   * Define values in sync with the XIVE and XICS backend
>   */
> @@ -615,7 +632,8 @@ sPAPRIrq spapr_irq_dual = {
>      .cpu_intc_create = spapr_irq_cpu_intc_create_dual,
>      .post_load   = spapr_irq_post_load_dual,
>      .reset       = spapr_irq_reset_dual,
> -    .set_irq     = spapr_irq_set_irq_dual
> +    .set_irq     = spapr_irq_set_irq_dual,
> +    .get_nodename = spapr_irq_get_nodename_dual,
>  };
>  
>  /*
> @@ -751,4 +769,5 @@ sPAPRIrq spapr_irq_xics_legacy = {
>      .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
>      .post_load   = spapr_irq_post_load_xics,
>      .set_irq     = spapr_irq_set_irq_xics,
> +    .get_nodename = spapr_irq_get_nodename_xics,
>  };
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 0e6c65d55430..ad7127355441 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -48,6 +48,7 @@ typedef struct sPAPRIrq {
>      int (*post_load)(sPAPRMachineState *spapr, int version_id);
>      void (*reset)(sPAPRMachineState *spapr, Error **errp);
>      void (*set_irq)(void *opaque, int srcno, int val);
> +    const char *(*get_nodename)(sPAPRMachineState *spapr);
>  } sPAPRIrq;
>  
>  extern sPAPRIrq spapr_irq_xics;
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 885ca169cb29..2c57a59a3f5b 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -26,6 +26,9 @@ typedef struct sPAPRXive {
>      XiveENDSource end_source;
>      hwaddr        end_base;
>  
> +    /* DT */
> +    gchar *nodename;
> +
>      /* Routing table */
>      XiveEAS       *eat;
>      uint32_t      nr_irqs;
> diff --git a/include/hw/ppc/xics_spapr.h b/include/hw/ppc/xics_spapr.h
> index b1ab27d022cf..b8d924baf437 100644
> --- a/include/hw/ppc/xics_spapr.h
> +++ b/include/hw/ppc/xics_spapr.h
> @@ -29,6 +29,8 @@
>  
>  #include "hw/ppc/spapr.h"
>  
> +#define XICS_NODENAME "interrupt-controller"
> +
>  void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
>                     uint32_t phandle);
>  int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH v4 05/15] spapr_irq: Expose the phandle of the interrupt controller
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 05/15] spapr_irq: Expose the phandle of the interrupt controller Greg Kurz
@ 2019-02-13  3:52   ` David Gibson
  2019-02-13 13:11     ` Greg Kurz
  0 siblings, 1 reply; 35+ messages in thread
From: David Gibson @ 2019-02-13  3:52 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:24:26PM +0100, Greg Kurz wrote:
> This will be used by PHB hotplug in order to create the "interrupt-map"
> property of the PHB node.
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
> v4: - return phandle via a pointer

You don't really need to do this.  You already have an Error ** to
return errors via, so you don't need an error return code.  Plus
phandles are not permitted to be 0 or -1, so you have some safe values
even for that case.
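
Roughly the following shape, shown as a standalone sketch.  The toy_node
table stands in for the device tree and the const char ** stands in for
Error **; both are assumptions made so the example compiles on its own,
and toy_get_phandle() is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a device tree node, assumed for illustration. */
struct toy_node {
    const char *name;
    uint32_t phandle;   /* 0 means the node has no phandle */
};

/*
 * Return the phandle of the node called @name, or 0 on failure.
 * 0 is never a valid phandle, so it can double as the error value,
 * while *errp carries the description (in the spirit of Error **).
 */
uint32_t toy_get_phandle(const struct toy_node *nodes, size_t nr_nodes,
                         const char *name, const char **errp)
{
    size_t i;

    assert(name != NULL);

    for (i = 0; i < nr_nodes; i++) {
        if (strcmp(nodes[i].name, name) != 0) {
            continue;
        }
        if (nodes[i].phandle == 0) {
            if (errp) {
                *errp = "node has no phandle";
            }
            return 0;
        }
        return nodes[i].phandle;
    }

    if (errp) {
        *errp = "node not found";
    }
    return 0;
}
```

The caller then checks the error argument (or the 0 return) and has no
output pointer to pass around.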

> ---
>  hw/ppc/spapr_irq.c         |   26 ++++++++++++++++++++++++++
>  include/hw/ppc/spapr_irq.h |    2 ++
>  2 files changed, 28 insertions(+)
> 
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index b8d725e251ba..31495033c37c 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -692,6 +692,32 @@ void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp)
>      }
>  }
>  
> +int spapr_irq_get_phandle(sPAPRMachineState *spapr, void *fdt,
> +                          uint32_t *phandle, Error **errp)
> +{
> +    const char *nodename = spapr->irq->get_nodename(spapr);
> +    int offset, ph;
> +
> +    offset = fdt_subnode_offset(fdt, 0, nodename);
> +    if (offset < 0) {
> +        error_setg(errp, "Can't find node \"%s\": %s", nodename,
> +                   fdt_strerror(offset));
> +        return -1;
> +    }
> +
> +    ph = fdt_get_phandle(fdt, offset);
> +    if (!ph) {
> +        error_setg(errp, "Can't get phandle of node \"%s\"", nodename);
> +        return -1;
> +    }
> +
> +    if (phandle) {
> +        *phandle = ph;
> +    }
> +
> +    return 0;
> +}
> +
>  /*
>   * XICS legacy routines - to deprecate one day
>   */
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index ad7127355441..4b3303ef4f6a 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -62,6 +62,8 @@ void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
>  int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
>  void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
> +int spapr_irq_get_phandle(sPAPRMachineState *spapr, void *fdt,
> +                          uint32_t *phandle, Error **errp);
>  
>  /*
>   * XICS legacy routines
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH v4 06/15] spapr_pci: add PHB unrealize
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 06/15] spapr_pci: add PHB unrealize Greg Kurz
@ 2019-02-13  3:56   ` David Gibson
  0 siblings, 0 replies; 35+ messages in thread
From: David Gibson @ 2019-02-13  3:56 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:24:33PM +0100, Greg Kurz wrote:
> To support PHB hotplug we need to clean up lingering references,
> memory, child properties, etc. prior to the PHB object being
> finalized. Generally this will be called as a result of calling
> object_unparent() on the PHB object, which in turn would normally
> be called as the result of an unplug() operation.
> 
> When the PHB is finalized, child objects will be unparented in
> turn, and finalized if the PHB was the only reference holder, so
> we don't bother to explicitly unparent child objects of the PHB
> (spapr_iommu, spapr_drc, etc).
> 
> The formula that gives the number of DMA windows is moved to an
> inline function in the hw/pci-host/spapr.h header because it
> will have other users.
> 
> The unrealize function is able to cope with partially realized PHBs.
> It is hence used to implement proper rollback on the realize error
> path.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
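
For readers following along, the rollback pattern described above boils
down to something like this standalone sketch.  The toy_phb type and the
fail_at knob are invented for illustration and are not the sPAPRPHBState
code; the point is only that a teardown which skips unclaimed resources
can be reused on the realize error path:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_PINS 4

/* Toy stand-in for the PHB state, assumed for illustration. */
struct toy_phb {
    int lsi[TOY_NR_PINS];   /* 0 == not claimed */
    int freed;              /* number of LSIs released so far */
};

/* Teardown that tolerates partial state: frees only what was claimed. */
void toy_phb_unrealize(struct toy_phb *phb)
{
    for (int i = TOY_NR_PINS - 1; i >= 0; i--) {
        if (phb->lsi[i]) {
            phb->lsi[i] = 0;
            phb->freed++;
        }
    }
}

/*
 * Claim one LSI per pin.  @fail_at simulates a claim failure at that
 * pin (-1 for no failure); on failure, roll back through unrealize.
 */
bool toy_phb_realize(struct toy_phb *phb, int first_irq, int fail_at)
{
    assert(first_irq > 0);  /* 0 is reserved for "not claimed" */

    for (int i = 0; i < TOY_NR_PINS; i++) {
        if (i == fail_at) {
            goto unrealize;
        }
        phb->lsi[i] = first_irq + i;
    }
    return true;

unrealize:
    toy_phb_unrealize(phb);
    return false;
}
```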

> ---
> v4: - reverted to v2
> v3: - don't free LSIs at unrealize
> v2: - implement rollback with unrealize function
> ---
>  hw/ppc/spapr_pci.c          |   75 +++++++++++++++++++++++++++++++++++++++++--
>  include/hw/pci-host/spapr.h |    5 +++
>  2 files changed, 76 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index d68595531d5a..e3781dd110b2 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1565,6 +1565,64 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
>      }
>  }
>  
> +static void spapr_phb_finalizefn(Object *obj)
> +{
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(obj);
> +
> +    g_free(sphb->dtbusname);
> +    sphb->dtbusname = NULL;
> +}
> +
> +static void spapr_phb_unrealize(DeviceState *dev, Error **errp)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> +    SysBusDevice *s = SYS_BUS_DEVICE(dev);
> +    PCIHostState *phb = PCI_HOST_BRIDGE(s);
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(phb);
> +    sPAPRTCETable *tcet;
> +    int i;
> +    const unsigned windows_supported = spapr_phb_windows_supported(sphb);
> +
> +    if (sphb->msi) {
> +        g_hash_table_unref(sphb->msi);
> +        sphb->msi = NULL;
> +    }
> +
> +    /*
> +     * Remove IO/MMIO subregions and aliases, rest should get cleaned
> +     * via PHB's unrealize->object_finalize
> +     */
> +    for (i = windows_supported - 1; i >= 0; i--) {
> +        tcet = spapr_tce_find_by_liobn(sphb->dma_liobn[i]);
> +        if (tcet) {
> +            memory_region_del_subregion(&sphb->iommu_root,
> +                                        spapr_tce_get_iommu(tcet));
> +        }
> +    }
> +
> +    for (i = PCI_NUM_PINS - 1; i >= 0; i--) {
> +        if (sphb->lsi_table[i].irq) {
> +            spapr_irq_free(spapr, sphb->lsi_table[i].irq, 1);
> +            sphb->lsi_table[i].irq = 0;
> +        }
> +    }
> +
> +    QLIST_REMOVE(sphb, list);
> +
> +    memory_region_del_subregion(&sphb->iommu_root, &sphb->msiwindow);
> +
> +    address_space_destroy(&sphb->iommu_as);
> +
> +    qbus_set_hotplug_handler(BUS(phb->bus), NULL, &error_abort);
> +    pci_unregister_root_bus(phb->bus);
> +
> +    memory_region_del_subregion(get_system_memory(), &sphb->iowindow);
> +    if (sphb->mem64_win_pciaddr != (hwaddr)-1) {
> +        memory_region_del_subregion(get_system_memory(), &sphb->mem64window);
> +    }
> +    memory_region_del_subregion(get_system_memory(), &sphb->mem32window);
> +}
> +
>  static void spapr_phb_realize(DeviceState *dev, Error **errp)
>  {
>      /* We don't use SPAPR_MACHINE() in order to exit gracefully if the user
> @@ -1582,8 +1640,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>      PCIBus *bus;
>      uint64_t msi_window_size = 4096;
>      sPAPRTCETable *tcet;
> -    const unsigned windows_supported =
> -        sphb->ddw_enabled ? SPAPR_PCI_DMA_MAX_WINDOWS : 1;
> +    const unsigned windows_supported = spapr_phb_windows_supported(sphb);
>  
>      if (!spapr) {
>          error_setg(errp, TYPE_SPAPR_PCI_HOST_BRIDGE " needs a pseries machine");
> @@ -1740,6 +1797,10 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>              if (local_err) {
>                  error_propagate_prepend(errp, local_err,
>                                          "can't allocate LSIs: ");
> +                /*
> +                 * Older machines will never support PHB hotplug, ie, this is an
> +                 * init only path and QEMU will terminate. No need to rollback.
> +                 */
>                  return;
>              }
>  
> @@ -1749,7 +1810,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>          spapr_irq_claim(spapr, irq, &local_err);
>          if (local_err) {
>              error_propagate_prepend(errp, local_err, "can't allocate LSIs: ");
> -            return;
> +            goto unrealize;
>          }
>  
>          sphb->lsi_table[i].irq = irq;
> @@ -1769,13 +1830,17 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>          if (!tcet) {
>              error_setg(errp, "Creating window#%d failed for %s",
>                         i, sphb->dtbusname);
> -            return;
> +            goto unrealize;
>          }
>          memory_region_add_subregion(&sphb->iommu_root, 0,
>                                      spapr_tce_get_iommu(tcet));
>      }
>  
>      sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free);
> +    return;
> +
> +unrealize:
> +    spapr_phb_unrealize(dev, NULL);
>  }
>  
>  static int spapr_phb_children_reset(Object *child, void *opaque)
> @@ -1974,6 +2039,7 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data)
>  
>      hc->root_bus_path = spapr_phb_root_bus_path;
>      dc->realize = spapr_phb_realize;
> +    dc->unrealize = spapr_phb_unrealize;
>      dc->props = spapr_phb_properties;
>      dc->reset = spapr_phb_reset;
>      dc->vmsd = &vmstate_spapr_pci;
> @@ -1989,6 +2055,7 @@ static const TypeInfo spapr_phb_info = {
>      .name          = TYPE_SPAPR_PCI_HOST_BRIDGE,
>      .parent        = TYPE_PCI_HOST_BRIDGE,
>      .instance_size = sizeof(sPAPRPHBState),
> +    .instance_finalize = spapr_phb_finalizefn,
>      .class_init    = spapr_phb_class_init,
>      .interfaces    = (InterfaceInfo[]) {
>          { TYPE_HOTPLUG_HANDLER },
> diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
> index 51d81c4b7ce8..7cfce54a9449 100644
> --- a/include/hw/pci-host/spapr.h
> +++ b/include/hw/pci-host/spapr.h
> @@ -163,4 +163,9 @@ static inline void spapr_phb_vfio_reset(DeviceState *qdev)
>  
>  void spapr_phb_dma_reset(sPAPRPHBState *sphb);
>  
> +static inline unsigned spapr_phb_windows_supported(sPAPRPHBState *sphb)
> +{
> +    return sphb->ddw_enabled ? SPAPR_PCI_DMA_MAX_WINDOWS : 1;
> +}
> +
>  #endif /* PCI_HOST_SPAPR_H */
> 
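
The error path in the quoted realize hunk (jump to a label that calls the unrealize function) only works if unrealize tolerates partially initialized state. A minimal standalone sketch of that pattern follows; ToyPhb, toy_phb_realize(), toy_phb_unrealize() and the fail_at_step parameter are hypothetical names for illustration only and are not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a host bridge; not the real sPAPRPHBState. */
typedef struct ToyPhb {
    int *lsi_table;   /* allocated first during realize */
    int *tce_table;   /* allocated second during realize */
    bool realized;
} ToyPhb;

/* Undo whatever realize managed to set up; safe on partial state
 * because free(NULL) is a no-op and pointers are reset afterwards. */
static void toy_phb_unrealize(ToyPhb *phb)
{
    assert(phb != NULL);
    free(phb->tce_table);
    phb->tce_table = NULL;
    free(phb->lsi_table);
    phb->lsi_table = NULL;
    phb->realized = false;
}

/*
 * fail_at_step is a test hook (an assumption of this sketch):
 * 0 = succeed, 1 = fail at the first allocation, 2 = fail at the second.
 * Returns 0 on success, -1 on failure with all state rolled back.
 */
static int toy_phb_realize(ToyPhb *phb, int fail_at_step)
{
    phb->lsi_table = (fail_at_step == 1) ? NULL : calloc(4, sizeof(int));
    if (!phb->lsi_table) {
        goto unrealize;
    }

    phb->tce_table = (fail_at_step == 2) ? NULL : calloc(16, sizeof(int));
    if (!phb->tce_table) {
        goto unrealize;
    }

    phb->realized = true;
    return 0;

unrealize:
    toy_phb_unrealize(phb);
    return -1;
}
```

The single rollback label keeps each failure site to one line, at the cost of requiring the teardown function to cope with any prefix of the setup steps.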

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH v4 10/15] qdev: pass an Object * to qbus_set_hotplug_handler()
  2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 10/15] qdev: pass an Object * to qbus_set_hotplug_handler() Greg Kurz
@ 2019-02-13  3:59   ` David Gibson
  0 siblings, 0 replies; 35+ messages in thread
From: David Gibson @ 2019-02-13  3:59 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:24:59PM +0100, Greg Kurz wrote:
> From: Michael Roth <mdroth@linux.vnet.ibm.com>
> 
> Certain device types, like memory/CPU, are now being handled using a
> hotplug interface provided by a top-level MachineClass. Hotpluggable
> host bridges are another such device where it makes sense to use a
> machine-level hotplug handler. However, unlike those devices,
> host-bridges have a parent bus (the main system bus), and devices with
> a parent bus use a different mechanism for registering their hotplug
> handlers: qbus_set_hotplug_handler(). This interface currently expects
> a handler to be a subclass of DeviceClass, but this is not the case
> for MachineClass, which derives directly from ObjectClass.
> 
> Internally, the interface only requires an ObjectClass, so expose that
> in qbus_set_hotplug_handler().
> 
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> Acked-by: Halil Pasic <pasic@linux.ibm.com>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

Applied to ppc-for-4.0, this will be useful for something I have in
mind as well.
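
As a rough illustration of the point made in the quoted commit message: a setter that takes the common base type can register either a device or a machine as handler, whereas one typed on the device struct cannot. The sketch below is standalone C with toy types; Object, DeviceState, MachineState and BusState here are simplified stand-ins, and toy_bus_set_hotplug_handler() and the two upcast helpers are hypothetical names, not the QEMU API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for QOM types; none of these are the real definitions. */
typedef struct Object { const char *type_name; } Object;

/* A device embeds Object as its first member ("derives" from it). */
typedef struct DeviceState { Object parent_obj; } DeviceState;

/* A machine also derives from Object, but not from DeviceState. */
typedef struct MachineState { Object parent_obj; } MachineState;

typedef struct BusState { Object *hotplug_handler; } BusState;

/* Upcast helpers, playing the role of an OBJECT()-style cast. */
static Object *device_as_object(DeviceState *d) { return &d->parent_obj; }
static Object *machine_as_object(MachineState *m) { return &m->parent_obj; }

/* Taking Object * lets both devices and machines be registered;
 * passing NULL clears the handler. */
static void toy_bus_set_hotplug_handler(BusState *bus, Object *handler)
{
    assert(bus != NULL);
    bus->hotplug_handler = handler;
}
```

Callers do the upcast at the call site, which mirrors the mechanical DEVICE() to OBJECT() substitutions in the diff below.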

> ---
>  hw/acpi/pcihp.c               |    2 +-
>  hw/acpi/piix4.c               |    2 +-
>  hw/char/virtio-serial-bus.c   |    2 +-
>  hw/core/bus.c                 |   11 ++---------
>  hw/pci/pcie.c                 |    2 +-
>  hw/pci/shpc.c                 |    2 +-
>  hw/ppc/spapr_pci.c            |    2 +-
>  hw/s390x/css-bridge.c         |    2 +-
>  hw/s390x/s390-pci-bus.c       |    6 +++---
>  hw/scsi/virtio-scsi.c         |    2 +-
>  hw/scsi/vmw_pvscsi.c          |    2 +-
>  hw/usb/dev-smartcard-reader.c |    2 +-
>  include/hw/qdev-core.h        |    3 +--
>  13 files changed, 16 insertions(+), 24 deletions(-)
> 
> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> index 7bc7a723407b..942918132376 100644
> --- a/hw/acpi/pcihp.c
> +++ b/hw/acpi/pcihp.c
> @@ -251,7 +251,7 @@ void acpi_pcihp_device_plug_cb(HotplugHandler *hotplug_dev, AcpiPciHpState *s,
>              object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>              PCIBus *sec = pci_bridge_get_sec_bus(PCI_BRIDGE(pdev));
>  
> -            qbus_set_hotplug_handler(BUS(sec), DEVICE(hotplug_dev),
> +            qbus_set_hotplug_handler(BUS(sec), OBJECT(hotplug_dev),
>                                       &error_abort);
>              /* We don't have to overwrite any other hotplug handler yet */
>              assert(QLIST_EMPTY(&sec->child));
> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> index 88f9a9ec0912..df8c0db909ce 100644
> --- a/hw/acpi/piix4.c
> +++ b/hw/acpi/piix4.c
> @@ -536,7 +536,7 @@ static void piix4_pm_realize(PCIDevice *dev, Error **errp)
>  
>      piix4_acpi_system_hot_add_init(pci_address_space_io(dev),
>                                     pci_get_bus(dev), s);
> -    qbus_set_hotplug_handler(BUS(pci_get_bus(dev)), DEVICE(s), &error_abort);
> +    qbus_set_hotplug_handler(BUS(pci_get_bus(dev)), OBJECT(s), &error_abort);
>  
>      piix4_pm_add_propeties(s);
>  }
> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> index d76351d7487d..bdd917bbb83c 100644
> --- a/hw/char/virtio-serial-bus.c
> +++ b/hw/char/virtio-serial-bus.c
> @@ -1052,7 +1052,7 @@ static void virtio_serial_device_realize(DeviceState *dev, Error **errp)
>      /* Spawn a new virtio-serial bus on which the ports will ride as devices */
>      qbus_create_inplace(&vser->bus, sizeof(vser->bus), TYPE_VIRTIO_SERIAL_BUS,
>                          dev, vdev->bus_name);
> -    qbus_set_hotplug_handler(BUS(&vser->bus), DEVICE(vser), errp);
> +    qbus_set_hotplug_handler(BUS(&vser->bus), OBJECT(vser), errp);
>      vser->bus.vser = vser;
>      QTAILQ_INIT(&vser->ports);
>  
> diff --git a/hw/core/bus.c b/hw/core/bus.c
> index 4651f244864c..e09843f6abea 100644
> --- a/hw/core/bus.c
> +++ b/hw/core/bus.c
> @@ -22,22 +22,15 @@
>  #include "hw/qdev.h"
>  #include "qapi/error.h"
>  
> -static void qbus_set_hotplug_handler_internal(BusState *bus, Object *handler,
> -                                              Error **errp)
> +void qbus_set_hotplug_handler(BusState *bus, Object *handler, Error **errp)
>  {
> -
>      object_property_set_link(OBJECT(bus), OBJECT(handler),
>                               QDEV_HOTPLUG_HANDLER_PROPERTY, errp);
>  }
>  
> -void qbus_set_hotplug_handler(BusState *bus, DeviceState *handler, Error **errp)
> -{
> -    qbus_set_hotplug_handler_internal(bus, OBJECT(handler), errp);
> -}
> -
>  void qbus_set_bus_hotplug_handler(BusState *bus, Error **errp)
>  {
> -    qbus_set_hotplug_handler_internal(bus, OBJECT(bus), errp);
> +    qbus_set_hotplug_handler(bus, OBJECT(bus), errp);
>  }
>  
>  int qbus_walk_children(BusState *bus,
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index 230478faab12..3f7c36609313 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -543,7 +543,7 @@ void pcie_cap_slot_init(PCIDevice *dev, uint16_t slot)
>      dev->exp.hpev_notified = false;
>  
>      qbus_set_hotplug_handler(BUS(pci_bridge_get_sec_bus(PCI_BRIDGE(dev))),
> -                             DEVICE(dev), NULL);
> +                             OBJECT(dev), NULL);
>  }
>  
>  void pcie_cap_slot_reset(PCIDevice *dev)
> diff --git a/hw/pci/shpc.c b/hw/pci/shpc.c
> index 45053b39b92c..52ccdc5ae3b9 100644
> --- a/hw/pci/shpc.c
> +++ b/hw/pci/shpc.c
> @@ -648,7 +648,7 @@ int shpc_init(PCIDevice *d, PCIBus *sec_bus, MemoryRegion *bar,
>      shpc_cap_update_dword(d);
>      memory_region_add_subregion(bar, offset, &shpc->mmio);
>  
> -    qbus_set_hotplug_handler(BUS(sec_bus), DEVICE(d), NULL);
> +    qbus_set_hotplug_handler(BUS(sec_bus), OBJECT(d), NULL);
>  
>      d->cap_present |= QEMU_PCI_CAP_SHPC;
>      return 0;
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index e3781dd110b2..0d4bad7bbe73 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1743,7 +1743,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>                                  &sphb->memspace, &sphb->iospace,
>                                  PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS);
>      phb->bus = bus;
> -    qbus_set_hotplug_handler(BUS(phb->bus), DEVICE(sphb), NULL);
> +    qbus_set_hotplug_handler(BUS(phb->bus), OBJECT(sphb), NULL);
>  
>      /*
>       * Initialize PHB address space.
> diff --git a/hw/s390x/css-bridge.c b/hw/s390x/css-bridge.c
> index 1bd6c8b45860..7573c40badbd 100644
> --- a/hw/s390x/css-bridge.c
> +++ b/hw/s390x/css-bridge.c
> @@ -108,7 +108,7 @@ VirtualCssBus *virtual_css_bus_init(void)
>      cbus = VIRTUAL_CSS_BUS(bus);
>  
>      /* Enable hotplugging */
> -    qbus_set_hotplug_handler(bus, dev, &error_abort);
> +    qbus_set_hotplug_handler(bus, OBJECT(dev), &error_abort);
>  
>      css_register_io_adapters(CSS_IO_ADAPTER_VIRTIO, true, false,
>                               0, &error_abort);
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 80ff1ce33f72..5998942b4c15 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -742,7 +742,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
>      pci_setup_iommu(b, s390_pci_dma_iommu, s);
>  
>      bus = BUS(b);
> -    qbus_set_hotplug_handler(bus, dev, &local_err);
> +    qbus_set_hotplug_handler(bus, OBJECT(dev), &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
> @@ -750,7 +750,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
>      phb->bus = b;
>  
>      s->bus = S390_PCI_BUS(qbus_create(TYPE_S390_PCI_BUS, dev, NULL));
> -    qbus_set_hotplug_handler(BUS(s->bus), dev, &local_err);
> +    qbus_set_hotplug_handler(BUS(s->bus), OBJECT(dev), &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
> @@ -912,7 +912,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>          pci_bridge_map_irq(pb, dev->id, s390_pci_map_irq);
>          pci_setup_iommu(&pb->sec_bus, s390_pci_dma_iommu, s);
>  
> -        qbus_set_hotplug_handler(BUS(&pb->sec_bus), DEVICE(s), errp);
> +        qbus_set_hotplug_handler(BUS(&pb->sec_bus), OBJECT(s), errp);
>  
>          if (dev->hotplugged) {
>              pci_default_write_config(pdev, PCI_PRIMARY_BUS,
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index eb90288f4741..ce99d288b035 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -906,7 +906,7 @@ static void virtio_scsi_device_realize(DeviceState *dev, Error **errp)
>      scsi_bus_new(&s->bus, sizeof(s->bus), dev,
>                   &virtio_scsi_scsi_info, vdev->bus_name);
>      /* override default SCSI bus hotplug-handler, with virtio-scsi's one */
> -    qbus_set_hotplug_handler(BUS(&s->bus), dev, &error_abort);
> +    qbus_set_hotplug_handler(BUS(&s->bus), OBJECT(dev), &error_abort);
>  
>      virtio_scsi_dataplane_setup(s, errp);
>  }
> diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
> index a3a019e30a74..584b4be07e79 100644
> --- a/hw/scsi/vmw_pvscsi.c
> +++ b/hw/scsi/vmw_pvscsi.c
> @@ -1142,7 +1142,7 @@ pvscsi_realizefn(PCIDevice *pci_dev, Error **errp)
>      scsi_bus_new(&s->bus, sizeof(s->bus), DEVICE(pci_dev),
>                   &pvscsi_scsi_info, NULL);
>      /* override default SCSI bus hotplug-handler, with pvscsi's one */
> -    qbus_set_hotplug_handler(BUS(&s->bus), DEVICE(s), &error_abort);
> +    qbus_set_hotplug_handler(BUS(&s->bus), OBJECT(s), &error_abort);
>      pvscsi_reset_state(s);
>  }
>  
> diff --git a/hw/usb/dev-smartcard-reader.c b/hw/usb/dev-smartcard-reader.c
> index 8f716fc165a3..6b0137bb7699 100644
> --- a/hw/usb/dev-smartcard-reader.c
> +++ b/hw/usb/dev-smartcard-reader.c
> @@ -1322,7 +1322,7 @@ static void ccid_realize(USBDevice *dev, Error **errp)
>      usb_desc_init(dev);
>      qbus_create_inplace(&s->bus, sizeof(s->bus), TYPE_CCID_BUS, DEVICE(dev),
>                          NULL);
> -    qbus_set_hotplug_handler(BUS(&s->bus), DEVICE(dev), &error_abort);
> +    qbus_set_hotplug_handler(BUS(&s->bus), OBJECT(dev), &error_abort);
>      s->intr = usb_ep_get(dev, USB_TOKEN_IN, CCID_INT_IN_EP);
>      s->bulk = usb_ep_get(dev, USB_TOKEN_IN, CCID_BULK_IN_EP);
>      s->card = NULL;
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 0a84c427561c..e70a4bfa498f 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -430,8 +430,7 @@ char *qdev_get_dev_path(DeviceState *dev);
>  
>  GSList *qdev_build_hotpluggable_device_list(Object *peripheral);
>  
> -void qbus_set_hotplug_handler(BusState *bus, DeviceState *handler,
> -                              Error **errp);
> +void qbus_set_hotplug_handler(BusState *bus, Object *handler, Error **errp);
>  
>  void qbus_set_bus_hotplug_handler(BusState *bus, Error **errp);
>  
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH v4 13/15] spapr_drc: Allow FDT fragment to be added later
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 13/15] spapr_drc: Allow FDT fragment to be added later Greg Kurz
@ 2019-02-13  4:05   ` David Gibson
  2019-02-13 13:15     ` Greg Kurz
  0 siblings, 1 reply; 35+ messages in thread
From: David Gibson @ 2019-02-13  4:05 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:25:19PM +0100, Greg Kurz wrote:
> The current logic is to provide the FDT fragment when attaching a device
> to a DRC. This works perfectly fine for our current hotplug support, but
> soon we will add support for PHB hotplug which has some constraints, that
> CPU, PCI and LMB devices don't seem to have.
> 
> The first constraint is that the "ibm,dma-window" property of the PHB
> node requires the IOMMU to be configured, ie, spapr_tce_table_enable()
> has been called, which happens during PHB reset. It is okay in the case
> of hotplug since the device is reset before the hotplug handler is
> called. On the contrary with coldplug, the hotplug handler is called
> first and device is only reset during the initial system reset. Trying
> to create the FDT fragment on the hotplug path in this case, would
> result in something like this:
> 
> ibm,dma-window = < 0x80000000 0x00 0x00 0x00 0x00 >;
> 
> This will cause linux in the guest to panic, by simply removing and
> re-adding the PHB using the drmgr command:
> 
> 	page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz));
> 	if (!page)
> 		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);
> 
> The second and maybe more problematic constraint is that the
> "interrupt-map" property needs to reference the interrupt controller
> node using the very same phandle that SLOF has already exposed to the
> guest. QEMU requires SLOF to call the private KVMPPC_H_UPDATE_DT hcall
> at some point to know about this phandle. With the latest QEMU and SLOF,
> this happens when SLOF gets quiesced. This means that if the PHB gets
> hotplugged after CAS but before SLOF quiesce, then we're sure that the
> phandle is not known when the hotplug handler is called.
> 
> The FDT is only needed when the guest first invokes RTAS to configure
> the connector actually, long after SLOF quiesce. Let's postpone the
> creation of FDT fragments for PHBs to rtas_ibm_configure_connector().
> 
> Since we only need this for PHBs, introduce a new method in the base
> DRC class for that. It will be implemented for "spapr-drc-phb" DRCs in
> a subsequent patch.
> 
> Allow spapr_drc_attach() to be passed a NULL fdt argument if the method
> is available.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

The basic solution looks fine.  However I don't much like the fact
that this leaves us with two ways to handle the fdt fragment - either
at connect time or at configure connector time via a callback.  qemu
already has way too many places where there are confusingly multiple
ways to do things.

I know it's a detour, but I'd really prefer to convert the existing
DRC handling to this new callback scheme, rather than have two
different approaches.
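
For reference, the callback scheme under discussion can be sketched in standalone C as below: attach may be given no fragment, and the configure step builds it on first use through a per-class callback. ToyDrc, toy_drc_attach(), toy_drc_configure() and toy_phb_populate() are hypothetical names, and a plain string stands in for the FDT blob; this is an illustration under those assumptions, not the patch itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyDrc ToyDrc;

/* Hypothetical callback type: builds the fragment, returns 0 on success. */
typedef int (*toy_populate_fn)(ToyDrc *drc, char **fdt_out);

struct ToyDrc {
    void *dev;                 /* attached device, NULL when empty */
    char *fdt;                 /* fragment, NULL until it has been built */
    toy_populate_fn populate;  /* optional deferred builder */
};

/* Attach a device; fdt may be NULL only if a deferred builder exists. */
static void toy_drc_attach(ToyDrc *drc, void *dev, char *fdt)
{
    assert(fdt || drc->populate);
    drc->dev = dev;
    if (fdt) {
        drc->fdt = fdt;
    }
}

/* Configure step: build the fragment on first use, reuse it afterwards.
 * Returns 0 on success, -1 if the builder failed. */
static int toy_drc_configure(ToyDrc *drc)
{
    assert(drc->fdt || drc->populate);
    if (!drc->fdt) {
        char *fdt = NULL;

        if (drc->populate(drc, &fdt) != 0) {
            free(fdt);
            return -1;
        }
        drc->fdt = fdt;
    }
    return 0;
}

/* Example builder: the "fragment" is just a heap-allocated node name. */
static int toy_phb_populate(ToyDrc *drc, char **fdt_out)
{
    const char text[] = "pci";

    (void)drc;
    *fdt_out = malloc(sizeof(text));
    if (!*fdt_out) {
        return -1;
    }
    memcpy(*fdt_out, text, sizeof(text));
    return 0;
}
```

If every connector type supplied such a builder, the fdt argument to attach and the two-way assertions could go away, which is the single-path design asked for above.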

> ---
>  hw/ppc/spapr_drc.c         |   34 +++++++++++++++++++++++++++++-----
>  include/hw/ppc/spapr_drc.h |    6 ++++++
>  2 files changed, 35 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 189ee681062a..c5a281915665 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -22,6 +22,7 @@
>  #include "qemu/error-report.h"
>  #include "hw/ppc/spapr.h" /* for RTAS return codes */
>  #include "hw/pci-host/spapr.h" /* spapr_phb_remove_pci_device_cb callback */
> +#include "sysemu/device_tree.h"
>  #include "trace.h"
>  
>  #define DRC_CONTAINER_PATH "/dr-connector"
> @@ -376,6 +377,8 @@ static void prop_get_fdt(Object *obj, Visitor *v, const char *name,
>  void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
>                        int fdt_start_offset, Error **errp)
>  {
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +
>      trace_spapr_drc_attach(spapr_drc_index(drc));
>  
>      if (drc->dev) {
> @@ -384,11 +387,14 @@ void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
>      }
>      g_assert((drc->state == SPAPR_DRC_STATE_LOGICAL_UNUSABLE)
>               || (drc->state == SPAPR_DRC_STATE_PHYSICAL_POWERON));
> -    g_assert(fdt);
> +    g_assert(fdt || drck->populate_dt);
>  
>      drc->dev = d;
> -    drc->fdt = fdt;
> -    drc->fdt_start_offset = fdt_start_offset;
> +
> +    if (fdt) {
> +        drc->fdt = fdt;
> +        drc->fdt_start_offset = fdt_start_offset;
> +    }
>  
>      object_property_add_link(OBJECT(drc), "device",
>                               object_get_typename(OBJECT(drc->dev)),
> @@ -1118,10 +1124,28 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
>          goto out;
>      }
>  
> -    g_assert(drc->fdt);
> -
>      drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>  
> +    g_assert(drc->fdt || drck->populate_dt);
> +
> +    if (!drc->fdt) {
> +        Error *local_err = NULL;
> +        void *fdt;
> +        int fdt_size;
> +
> +        fdt = create_device_tree(&fdt_size);
> +
> +        if (drck->populate_dt(drc->dev, spapr, fdt, &drc->fdt_start_offset,
> +                               &local_err)) {
> +            g_free(fdt);
> +            error_free(local_err);
> +            rc = SPAPR_DR_CC_RESPONSE_ERROR;
> +            goto out;
> +        }
> +
> +        drc->fdt = fdt;
> +    }
> +
>      do {
>          uint32_t tag;
>          const char *name;
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index 56bba36ad4da..e947d6987bf2 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -18,6 +18,7 @@
>  #include "qom/object.h"
>  #include "sysemu/sysemu.h"
>  #include "hw/qdev.h"
> +#include "qapi/error.h"
>  
>  #define TYPE_SPAPR_DR_CONNECTOR "spapr-dr-connector"
>  #define SPAPR_DR_CONNECTOR_GET_CLASS(obj) \
> @@ -221,6 +222,8 @@ typedef struct sPAPRDRConnector {
>      int fdt_start_offset;
>  } sPAPRDRConnector;
>  
> +struct sPAPRMachineState;
> +
>  typedef struct sPAPRDRConnectorClass {
>      /*< private >*/
>      DeviceClass parent;
> @@ -236,6 +239,9 @@ typedef struct sPAPRDRConnectorClass {
>      uint32_t (*isolate)(sPAPRDRConnector *drc);
>      uint32_t (*unisolate)(sPAPRDRConnector *drc);
>      void (*release)(DeviceState *dev);
> +
> +    int (*populate_dt)(DeviceState *dev, struct sPAPRMachineState *spapr,
> +                       void *fdt, int *fdt_start_offset, Error **errp);
>  } sPAPRDRConnectorClass;
>  
>  typedef struct sPAPRDRCPhysical {
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug Greg Kurz
@ 2019-02-13  4:13   ` David Gibson
  2019-02-13 13:24     ` Greg Kurz
  2019-02-13  9:25   ` David Hildenbrand
  1 sibling, 1 reply; 35+ messages in thread
From: David Gibson @ 2019-02-13  4:13 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:25:25PM +0100, Greg Kurz wrote:
> Hotplugging PHBs is a machine-level operation, but PHBs reside on the
> main system bus, so we register spapr machine as the handler for the
> main system bus.
> 
> Provide the usual pre-plug, plug and unplug-request handlers.
> 
> Move the checking of the PHB index to the pre-plug handler. It is okay
> to do that and assert in the realize function because the pre-plug
> handler is always called, even for the oldest machine types we support.
> 
> Unlike with other device types, there are some cases where we cannot
> provide the FDT fragment of the PHB from the plug handler, eg, before
> KVMPPC_H_UPDATE_DT was called. Do this from a DRC callback that is
> called just before the first FDT fragment is exposed to the guest.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> (Fixed interrupt controller phandle in "interrupt-map" and
>  TCE table size in "ibm,dma-window" FDT fragment, Greg Kurz)
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
> v4: - populate FDT fragment in a DRC callback
> v3: - reworked phandle handling some more
> v2: - reworked phandle handling
>     - sync LSIs to KVM
> ---
> ---
>  hw/ppc/spapr.c         |  121 ++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_drc.c     |    2 +
>  hw/ppc/spapr_pci.c     |   16 ------
>  include/hw/ppc/spapr.h |    5 ++
>  4 files changed, 127 insertions(+), 17 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 021758825b7e..06ce0babcb54 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2930,6 +2930,11 @@ static void spapr_machine_init(MachineState *machine)
>      register_savevm_live(NULL, "spapr/htab", -1, 1,
>                           &savevm_htab_handlers, spapr);
>  
> +    if (smc->dr_phb_enabled) {
> +        qbus_set_hotplug_handler(sysbus_get_default(), OBJECT(machine),
> +                                 &error_fatal);
> +    }

I think you could do this unconditionally and just check
dr_phb_enabled at pre_plug.  That makes it more consistent with the
other hotplug types, and I suspect will give us better error messages.
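
A minimal standalone sketch of what such a pre_plug check could look like. It assumes the check should reject only devices added at run time, so that cold-plugged PHBs on older machine types keep working; that condition, the ToyMachineClass type, toy_phb_pre_plug() and the first error string are assumptions of this sketch, not code from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the machine class flag used in the patch. */
typedef struct ToyMachineClass {
    bool dr_phb_enabled;
} ToyMachineClass;

/*
 * Returns NULL when the plug may proceed, or a static error message.
 * The handler is assumed to be registered unconditionally, so this is
 * the only place that decides whether PHB hotplug is allowed.
 */
static const char *toy_phb_pre_plug(const ToyMachineClass *smc,
                                    unsigned index, bool hotplugged)
{
    assert(smc != NULL);

    /* Assumption: cold-plugged PHBs are always allowed. */
    if (hotplugged && !smc->dr_phb_enabled) {
        return "PHB hotplug not supported on this machine";
    }

    /* Same mandatory-index check as in the quoted pre-plug handler. */
    if (index == (unsigned)-1) {
        return "\"index\" for PAPR PHB is mandatory";
    }

    return NULL;
}
```

With the policy in pre_plug, a rejected hotplug produces a PHB-specific message instead of whatever the generic code reports when the bus has no handler.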

>      qemu_register_boot_set(spapr_boot_set, spapr);
>  
>      if (kvm_enabled()) {
> @@ -3733,6 +3738,108 @@ out:
>      error_propagate(errp, local_err);
>  }
>  
> +int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
> +                 int *fdt_start_offset, Error **errp)
> +{
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> +    uint32_t intc_phandle;
> +
> +    if (spapr_irq_get_phandle(spapr, spapr->fdt_blob, &intc_phandle, errp)) {
> +        return -1;
> +    }
> +
> +    if (spapr_populate_pci_dt(sphb, intc_phandle, fdt, spapr->irq->nr_msis,
> +                              fdt_start_offset)) {
> +        error_setg(errp, "unable to create FDT node for PHB %d", sphb->index);
> +        return -1;
> +    }
> +
> +    /* generally SLOF creates these, for hotplug it's up to QEMU */
> +    _FDT(fdt_setprop_string(fdt, *fdt_start_offset, "name", "pci"));
> +
> +    return 0;
> +}
> +
> +static void spapr_phb_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                               Error **errp)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> +    const unsigned windows_supported = spapr_phb_windows_supported(sphb);
> +
> +    if (sphb->index == (uint32_t)-1) {
> +        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
> +        return;
> +    }
> +
> +    /*
> +     * This will check that sphb->index doesn't exceed the maximum number of
> +     * PHBs for the current machine type.
> +     */
> +    smc->phb_placement(spapr, sphb->index,
> +                       &sphb->buid, &sphb->io_win_addr,
> +                       &sphb->mem_win_addr, &sphb->mem64_win_addr,
> +                       windows_supported, sphb->dma_liobn, errp);
> +}
> +
> +static void spapr_phb_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                           Error **errp)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> +    sPAPRDRConnector *drc;
> +    bool hotplugged = spapr_drc_hotplugged(dev);
> +    Error *local_err = NULL;
> +
> +    if (!smc->dr_phb_enabled) {
> +        return;
> +    }
> +
> +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
> +    /* hotplug hooks should check it's enabled before getting this far */
> +    assert(drc);
> +
> +    /*
> +     * The FDT fragment will be added during the first invocation of RTAS
> +     * ibm,configure-connector for this device, when we're sure
> +     * that the IOMMU is configured and that QEMU knows the phandle of the
> +     * interrupt controller.
> +     */
> +    spapr_drc_attach(drc, DEVICE(dev), NULL, 0, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    if (hotplugged) {
> +        spapr_hotplug_req_add_by_index(drc);
> +    } else {
> +        spapr_drc_reset(drc);
> +    }
> +}
> +
> +void spapr_phb_release(DeviceState *dev)
> +{
> +    object_unparent(OBJECT(dev));
> +}
> +
> +static void spapr_phb_unplug_request(HotplugHandler *hotplug_dev,
> +                                     DeviceState *dev, Error **errp)
> +{
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> +    sPAPRDRConnector *drc;
> +
> +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
> +    assert(drc);
> +
> +    if (!spapr_drc_unplug_requested(drc)) {
> +        spapr_drc_detach(drc);
> +        spapr_hotplug_req_remove_by_index(drc);
> +    }
> +}
> +
>  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>                                        DeviceState *dev, Error **errp)
>  {
> @@ -3740,6 +3847,8 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>          spapr_memory_plug(hotplug_dev, dev, errp);
>      } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
>          spapr_core_plug(hotplug_dev, dev, errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> +        spapr_phb_plug(hotplug_dev, dev, errp);
>      }
>  }
>  
> @@ -3758,6 +3867,7 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
>  {
>      sPAPRMachineState *sms = SPAPR_MACHINE(OBJECT(hotplug_dev));
>      MachineClass *mc = MACHINE_GET_CLASS(sms);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
>  
>      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>          if (spapr_ovec_test(sms->ov5_cas, OV5_HP_EVT)) {
> @@ -3777,6 +3887,12 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
>              return;
>          }
>          spapr_core_unplug_request(hotplug_dev, dev, errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> +        if (!smc->dr_phb_enabled) {
> +            error_setg(errp, "PHB hot unplug not supported on this machine");
> +            return;
> +        }
> +        spapr_phb_unplug_request(hotplug_dev, dev, errp);
>      }
>  }
>  
> @@ -3787,6 +3903,8 @@ static void spapr_machine_device_pre_plug(HotplugHandler *hotplug_dev,
>          spapr_memory_pre_plug(hotplug_dev, dev, errp);
>      } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
>          spapr_core_pre_plug(hotplug_dev, dev, errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> +        spapr_phb_pre_plug(hotplug_dev, dev, errp);
>      }
>  }
>  
> @@ -3794,7 +3912,8 @@ static HotplugHandler *spapr_get_hotplug_handler(MachineState *machine,
>                                                   DeviceState *dev)
>  {
>      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
> -        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
> +        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE) ||
> +        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
>          return HOTPLUG_HANDLER(machine);
>      }
>      return NULL;
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index c5a281915665..22563a381a37 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -709,6 +709,8 @@ static void spapr_drc_phb_class_init(ObjectClass *k, void *data)
>      drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_PHB;
>      drck->typename = "PHB";
>      drck->drc_name_prefix = "PHB ";
> +    drck->release = spapr_phb_release;
> +    drck->populate_dt = spapr_dt_phb;
>  }
>  
>  static const TypeInfo spapr_dr_connector_info = {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 7df7f6502f93..d0caca627455 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1647,21 +1647,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>          return;
>      }
>  
> -    if (sphb->index != (uint32_t)-1) {
> -        Error *local_err = NULL;
> -
> -        smc->phb_placement(spapr, sphb->index,
> -                           &sphb->buid, &sphb->io_win_addr,
> -                           &sphb->mem_win_addr, &sphb->mem64_win_addr,
> -                           windows_supported, sphb->dma_liobn, &local_err);
> -        if (local_err) {
> -            error_propagate(errp, local_err);
> -            return;
> -        }
> -    } else {
> -        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
> -        return;
> -    }
> +    assert(sphb->index != (uint32_t)-1); /* checked in spapr_phb_pre_plug() */
>  
>      if (sphb->mem64_win_size != 0) {
>          if (sphb->mem_win_size > SPAPR_PCI_MEM32_WIN_SIZE) {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index a3074e7fea37..69d9c2196ca2 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -764,9 +764,12 @@ void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
>  void spapr_clear_pending_events(sPAPRMachineState *spapr);
>  int spapr_max_server_number(sPAPRMachineState *spapr);
>  
> -/* CPU and LMB DRC release callbacks. */
> +/* DRC callbacks. */
>  void spapr_core_release(DeviceState *dev);
>  void spapr_lmb_release(DeviceState *dev);
> +void spapr_phb_release(DeviceState *dev);
> +int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
> +                 int *fdt_start_offset, Error **errp);
>  
>  void spapr_rtc_read(sPAPRRTCState *rtc, struct tm *tm, uint32_t *ns);
>  int spapr_rtc_import_offset(sPAPRRTCState *rtc, int64_t legacy_offset);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH v4 15/15] spapr: enable PHB hotplug for default pseries machine type
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 15/15] spapr: enable PHB hotplug for default pseries machine type Greg Kurz
@ 2019-02-13  4:13   ` David Gibson
  0 siblings, 0 replies; 35+ messages in thread
From: David Gibson @ 2019-02-13  4:13 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Tue, Feb 12, 2019 at 07:25:32PM +0100, Greg Kurz wrote:
> From: Michael Roth <mdroth@linux.vnet.ibm.com>
> 
> The 'dr_phb_enabled' field of that class can be set as part of
> machine-specific init code. It will be used to conditionally
> enable creation of DRC objects and device-tree description to
> facilitate hotplug of PHBs.
> 
> Since we can't migrate this state to older machine types,
> default the option to true and disable it for older machine
> types.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  hw/ppc/spapr.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 06ce0babcb54..4a6b2f7f3f62 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -4166,6 +4166,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
>      spapr_caps_add_properties(smc, &error_abort);
>      smc->irq = &spapr_irq_xics;
> +    smc->dr_phb_enabled = true;
>  }
>  
>  static const TypeInfo spapr_machine_info = {
> @@ -4231,6 +4232,7 @@ static void spapr_machine_3_1_class_options(MachineClass *mc)
>      compat_props_add(mc->compat_props, hw_compat_3_1, hw_compat_3_1_len);
>      mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
>      smc->update_dt_enabled = false;
> +    smc->dr_phb_enabled = false;
>  }
>  
>  DEFINE_SPAPR_MACHINE(3_1, "3.1", false);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug
  2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug Greg Kurz
  2019-02-13  4:13   ` David Gibson
@ 2019-02-13  9:25   ` David Hildenbrand
  2019-02-13 13:25     ` Greg Kurz
  1 sibling, 1 reply; 35+ messages in thread
From: David Hildenbrand @ 2019-02-13  9:25 UTC (permalink / raw)
  To: Greg Kurz, David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman, Thomas Huth

On 12.02.19 19:25, Greg Kurz wrote:
> Hotplugging PHBs is a machine-level operation, but PHBs reside on the
> main system bus, so we register spapr machine as the handler for the
> main system bus.
> 
> Provide the usual pre-plug, plug and unplug-request handlers.
> 
> Move the checking of the PHB index to the pre-plug handler. It is okay
> to do that and assert in the realize function because the pre-plug
> handler is always called, even for the oldest machine types we support.
> 
> Unlike with other device types, there are some cases where we cannot
> provide the FDT fragment of the PHB from the plug handler, e.g. before
> KVMPPC_H_UPDATE_DT was called. Do this from a DRC callback that is
> called just before the first FDT fragment is exposed to the guest.
> 
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> (Fixed interrupt controller phandle in "interrupt-map" and
>  TCE table size in "ibm,dma-window" FDT fragment, Greg Kurz)
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
> v4: - populate FDT fragment in a DRC callback
> v3: - reworked phandle handling some more
> v2: - reworked phandle handling
>     - sync LSIs to KVM
> ---
> ---
>  hw/ppc/spapr.c         |  121 ++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_drc.c     |    2 +
>  hw/ppc/spapr_pci.c     |   16 ------
>  include/hw/ppc/spapr.h |    5 ++
>  4 files changed, 127 insertions(+), 17 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 021758825b7e..06ce0babcb54 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2930,6 +2930,11 @@ static void spapr_machine_init(MachineState *machine)
>      register_savevm_live(NULL, "spapr/htab", -1, 1,
>                           &savevm_htab_handlers, spapr);
>  
> +    if (smc->dr_phb_enabled) {
> +        qbus_set_hotplug_handler(sysbus_get_default(), OBJECT(machine),
> +                                 &error_fatal);
> +    }
> +
>      qemu_register_boot_set(spapr_boot_set, spapr);
>  
>      if (kvm_enabled()) {
> @@ -3733,6 +3738,108 @@ out:
>      error_propagate(errp, local_err);
>  }
>  
> +int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
> +                 int *fdt_start_offset, Error **errp)
> +{
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> +    uint32_t intc_phandle;
> +
> +    if (spapr_irq_get_phandle(spapr, spapr->fdt_blob, &intc_phandle, errp)) {
> +        return -1;
> +    }
> +
> +    if (spapr_populate_pci_dt(sphb, intc_phandle, fdt, spapr->irq->nr_msis,
> +                              fdt_start_offset)) {
> +        error_setg(errp, "unable to create FDT node for PHB %d", sphb->index);
> +        return -1;
> +    }
> +
> +    /* generally SLOF creates these, for hotplug it's up to QEMU */
> +    _FDT(fdt_setprop_string(fdt, *fdt_start_offset, "name", "pci"));
> +
> +    return 0;
> +}
> +
> +static void spapr_phb_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                               Error **errp)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> +    const unsigned windows_supported = spapr_phb_windows_supported(sphb);
> +
> +    if (sphb->index == (uint32_t)-1) {
> +        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
> +        return;
> +    }
> +
> +    /*
> +     * This will check that sphb->index doesn't exceed the maximum number of
> +     * PHBs for the current machine type.
> +     */
> +    smc->phb_placement(spapr, sphb->index,
> +                       &sphb->buid, &sphb->io_win_addr,
> +                       &sphb->mem_win_addr, &sphb->mem64_win_addr,
> +                       windows_supported, sphb->dma_liobn, errp);
> +}
> +
> +static void spapr_phb_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                           Error **errp)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> +    sPAPRDRConnector *drc;
> +    bool hotplugged = spapr_drc_hotplugged(dev);
> +    Error *local_err = NULL;
> +
> +    if (!smc->dr_phb_enabled) {
> +        return;
> +    }
> +
> +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
> +    /* hotplug hooks should check it's enabled before getting this far */
> +    assert(drc);
> +
> +    /*
> +     * The FDT fragment will be added during the first invocation of RTAS
> +     * ibm,configure-connector for this device, when we're sure
> +     * that the IOMMU is configured and that QEMU knows the phandle of the
> +     * interrupt controller.
> +     */
> +    spapr_drc_attach(drc, DEVICE(dev), NULL, 0, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    if (hotplugged) {
> +        spapr_hotplug_req_add_by_index(drc);
> +    } else {
> +        spapr_drc_reset(drc);
> +    }
> +}
> +
> +void spapr_phb_release(DeviceState *dev)
> +{
> +    object_unparent(OBJECT(dev));
> +}
> +

Please call the unplug handler here just like we already do with
spapr_phb_remove_pci_device_cb().

And add a _unplug handler that simply calls e.g.

qdev_simple_device_unplug_cb


Otherwise this will break with
[PATCH RFCv2 0/9] qdev: Hotplug handler chaining + virtio-pmem


> +static void spapr_phb_unplug_request(HotplugHandler *hotplug_dev,
> +                                     DeviceState *dev, Error **errp)
> +{
> +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> +    sPAPRDRConnector *drc;
> +
> +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
> +    assert(drc);
> +
> +    if (!spapr_drc_unplug_requested(drc)) {
> +        spapr_drc_detach(drc);
> +        spapr_hotplug_req_remove_by_index(drc);
> +    }
> +}
> +
>  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>                                        DeviceState *dev, Error **errp)
>  {
> @@ -3740,6 +3847,8 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>          spapr_memory_plug(hotplug_dev, dev, errp);
>      } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
>          spapr_core_plug(hotplug_dev, dev, errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> +        spapr_phb_plug(hotplug_dev, dev, errp);
>      }
>  }
>  
> @@ -3758,6 +3867,7 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
>  {
>      sPAPRMachineState *sms = SPAPR_MACHINE(OBJECT(hotplug_dev));
>      MachineClass *mc = MACHINE_GET_CLASS(sms);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
>  
>      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>          if (spapr_ovec_test(sms->ov5_cas, OV5_HP_EVT)) {
> @@ -3777,6 +3887,12 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
>              return;
>          }
>          spapr_core_unplug_request(hotplug_dev, dev, errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> +        if (!smc->dr_phb_enabled) {
> +            error_setg(errp, "PHB hot unplug not supported on this machine");
> +            return;
> +        }
> +        spapr_phb_unplug_request(hotplug_dev, dev, errp);
>      }
>  }
>  
> @@ -3787,6 +3903,8 @@ static void spapr_machine_device_pre_plug(HotplugHandler *hotplug_dev,
>          spapr_memory_pre_plug(hotplug_dev, dev, errp);
>      } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
>          spapr_core_pre_plug(hotplug_dev, dev, errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> +        spapr_phb_pre_plug(hotplug_dev, dev, errp);
>      }
>  }
>  
> @@ -3794,7 +3912,8 @@ static HotplugHandler *spapr_get_hotplug_handler(MachineState *machine,
>                                                   DeviceState *dev)
>  {
>      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
> -        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
> +        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE) ||
> +        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
>          return HOTPLUG_HANDLER(machine);
>      }
>      return NULL;
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index c5a281915665..22563a381a37 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -709,6 +709,8 @@ static void spapr_drc_phb_class_init(ObjectClass *k, void *data)
>      drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_PHB;
>      drck->typename = "PHB";
>      drck->drc_name_prefix = "PHB ";
> +    drck->release = spapr_phb_release;
> +    drck->populate_dt = spapr_dt_phb;
>  }
>  
>  static const TypeInfo spapr_dr_connector_info = {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 7df7f6502f93..d0caca627455 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1647,21 +1647,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>          return;
>      }
>  
> -    if (sphb->index != (uint32_t)-1) {
> -        Error *local_err = NULL;
> -
> -        smc->phb_placement(spapr, sphb->index,
> -                           &sphb->buid, &sphb->io_win_addr,
> -                           &sphb->mem_win_addr, &sphb->mem64_win_addr,
> -                           windows_supported, sphb->dma_liobn, &local_err);
> -        if (local_err) {
> -            error_propagate(errp, local_err);
> -            return;
> -        }
> -    } else {
> -        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
> -        return;
> -    }
> +    assert(sphb->index != (uint32_t)-1); /* checked in spapr_phb_pre_plug() */
>  
>      if (sphb->mem64_win_size != 0) {
>          if (sphb->mem_win_size > SPAPR_PCI_MEM32_WIN_SIZE) {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index a3074e7fea37..69d9c2196ca2 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -764,9 +764,12 @@ void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
>  void spapr_clear_pending_events(sPAPRMachineState *spapr);
>  int spapr_max_server_number(sPAPRMachineState *spapr);
>  
> -/* CPU and LMB DRC release callbacks. */
> +/* DRC callbacks. */
>  void spapr_core_release(DeviceState *dev);
>  void spapr_lmb_release(DeviceState *dev);
> +void spapr_phb_release(DeviceState *dev);
> +int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
> +                 int *fdt_start_offset, Error **errp);
>  
>  void spapr_rtc_read(sPAPRRTCState *rtc, struct tm *tm, uint32_t *ns);
>  int spapr_rtc_import_offset(sPAPRRTCState *rtc, int64_t legacy_offset);
> 


-- 

Thanks,

David / dhildenb


* Re: [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq
  2019-02-13  3:26   ` David Gibson
@ 2019-02-13 12:23     ` Greg Kurz
  0 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-13 12:23 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Wed, 13 Feb 2019 14:26:01 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Feb 12, 2019 at 07:24:00PM +0100, Greg Kurz wrote:
> > Only pseries machines, either recent ones started with ic-mode=xics
> > or older ones using the legacy irq allocation scheme, need to set the
> > @offset of the ICS to XICS_IRQ_BASE. Recent pseries started with
> > ic-mode=dual set it to 0 and powernv machines set it to some other
> > value at runtime.
> > 
> > It thus doesn't really help to set the default value of the ICS offset
> > to XICS_IRQ_BASE in ics_base_instance_init().
> > 
> > > Drop that code from XICS and let the pseries code set the offset
> > > explicitly for clarity.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>  
> 
> So this actually relates to a discussion I've had on some of Cédric's
> more recent patches.  Changing the ics offset in ic-mode=dual doesn't
> make sense to me.  The global (guest) interrupt numbers need to match
> between XICS and XIVE, but the global interrupt numbers don't have to
> match the ICS source numbers, which is what ics->offset is about.
> 

Yeah. We'll see what comes out of the discussion at:

    https://patchwork.ozlabs.org/patch/1021496/
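
For readers following along, the distinction between a global (guest) interrupt number and an ICS source number can be sketched as below. This is a standalone illustration rather than QEMU code: ToyICS, toy_ics_valid_irq() and toy_ics_srcno() are made-up names, and the only value taken from the patch is XICS_IRQ_BASE (0x1000), as quoted in the comment that the patch moves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_XICS_IRQ_BASE 0x1000u  /* value quoted in the patch comment */

/* Hypothetical stand-in for the two ICS fields that matter here. */
typedef struct ToyICS {
    uint32_t nr_irqs;  /* number of sources owned by this ICS */
    uint32_t offset;   /* global IRQ number of source 0 */
} ToyICS;

/* A global IRQ belongs to the ICS if it lies in [offset, offset + nr_irqs). */
static bool toy_ics_valid_irq(const ToyICS *ics, uint32_t irq)
{
    return irq >= ics->offset && irq - ics->offset < ics->nr_irqs;
}

/* Source number = global IRQ number minus the ICS offset. */
static uint32_t toy_ics_srcno(const ToyICS *ics, uint32_t irq)
{
    assert(toy_ics_valid_irq(ics, irq));
    return irq - ics->offset;
}
```

With an offset of 0x1000, global IRQ 0x1005 maps to source 5; with an offset of 0, as ic-mode=dual currently does, the two number spaces coincide.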

> > ---
> >  hw/intc/xics.c             |    8 --------
> >  hw/ppc/spapr_irq.c         |   33 ++++++++++++++++++++-------------
> >  include/hw/ppc/spapr_irq.h |    1 +
> >  3 files changed, 21 insertions(+), 21 deletions(-)
> > 
> > diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> > index 16e8ffa2aaf7..7cac138067e2 100644
> > --- a/hw/intc/xics.c
> > +++ b/hw/intc/xics.c
> > @@ -638,13 +638,6 @@ static void ics_base_realize(DeviceState *dev, Error **errp)
> >      ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
> >  }
> >  
> > -static void ics_base_instance_init(Object *obj)
> > -{
> > -    ICSState *ics = ICS_BASE(obj);
> > -
> > -    ics->offset = XICS_IRQ_BASE;
> > -}
> > -
> >  static int ics_base_dispatch_pre_save(void *opaque)
> >  {
> >      ICSState *ics = opaque;
> > @@ -720,7 +713,6 @@ static const TypeInfo ics_base_info = {
> >      .parent = TYPE_DEVICE,
> >      .abstract = true,
> >      .instance_size = sizeof(ICSState),
> > -    .instance_init = ics_base_instance_init,
> >      .class_init = ics_base_class_init,
> >      .class_size = sizeof(ICSStateClass),
> >  };
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 80b0083b8e38..8217e0215411 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -68,10 +68,11 @@ void spapr_irq_msi_reset(sPAPRMachineState *spapr)
> >  
> >  static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
> >                                    const char *type_ics,
> > -                                  int nr_irqs, Error **errp)
> > +                                  int nr_irqs, int offset, Error **errp)
> >  {
> >      Error *local_err = NULL;
> >      Object *obj;
> > +    ICSState *ics;
> >  
> >      obj = object_new(type_ics);
> >      object_property_add_child(OBJECT(spapr), "ics", obj, &error_abort);
> > @@ -86,7 +87,10 @@ static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
> >          goto error;
> >      }
> >  
> > -    return ICS_BASE(obj);
> > +    ics = ICS_BASE(obj);
> > +    ics->offset = offset;
> > +
> > +    return ics;
> >  
> >  error:
> >      error_propagate(errp, local_err);
> > @@ -104,6 +108,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
> >              !xics_kvm_init(spapr, &local_err)) {
> >              spapr->icp_type = TYPE_KVM_ICP;
> >              spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs,
> > +                                          spapr->irq->xics_offset,
> >                                            &local_err);
> >          }
> >          if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
> > @@ -119,6 +124,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
> >          xics_spapr_init(spapr);
> >          spapr->icp_type = TYPE_ICP;
> >          spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs,
> > +                                      spapr->irq->xics_offset,
> >                                        &local_err);
> >      }
> >  
> > @@ -246,6 +252,7 @@ sPAPRIrq spapr_irq_xics = {
> >      .nr_irqs     = SPAPR_IRQ_XICS_NR_IRQS,
> >      .nr_msis     = SPAPR_IRQ_XICS_NR_MSIS,
> >      .ov5         = SPAPR_OV5_XIVE_LEGACY,
> > +    .xics_offset = XICS_IRQ_BASE,
> >  
> >      .init        = spapr_irq_init_xics,
> >      .claim       = spapr_irq_claim_xics,
> > @@ -451,17 +458,6 @@ static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
> >          return;
> >      }
> >  
> > -    /*
> > -     * Align the XICS and the XIVE IRQ number space under QEMU.
> > -     *
> > -     * However, the XICS KVM device still considers that the IRQ
> > -     * numbers should start at XICS_IRQ_BASE (0x1000). Either we
> > -     * should introduce a KVM device ioctl to set the offset or ignore
> > -     * the lower 4K numbers when using the get/set ioctl of the XICS
> > -     * KVM device. The second option seems the least intrusive.
> > -     */
> > -    spapr->ics->offset = 0;
> > -
> >      spapr_irq_xive.init(spapr, &local_err);
> >      if (local_err) {
> >          error_propagate(errp, local_err);
> > @@ -582,6 +578,16 @@ sPAPRIrq spapr_irq_dual = {
> >      .nr_irqs     = SPAPR_IRQ_DUAL_NR_IRQS,
> >      .nr_msis     = SPAPR_IRQ_DUAL_NR_MSIS,
> >      .ov5         = SPAPR_OV5_XIVE_BOTH,
> > +    /*
> > +     * Align the XICS and the XIVE IRQ number space under QEMU.
> > +     *
> > +     * However, the XICS KVM device still considers that the IRQ
> > +     * numbers should start at XICS_IRQ_BASE (0x1000). Either we
> > +     * should introduce a KVM device ioctl to set the offset or ignore
> > +     * the lower 4K numbers when using the get/set ioctl of the XICS
> > +     * KVM device. The second option seems the least intrusive.
> > +     */
> > +    .xics_offset = 0,
> >  
> >      .init        = spapr_irq_init_dual,
> >      .claim       = spapr_irq_claim_dual,
> > @@ -712,6 +718,7 @@ sPAPRIrq spapr_irq_xics_legacy = {
> >      .nr_irqs     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
> >      .nr_msis     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
> >      .ov5         = SPAPR_OV5_XIVE_LEGACY,
> > +    .xics_offset = XICS_IRQ_BASE,
> >  
> >      .init        = spapr_irq_init_xics,
> >      .claim       = spapr_irq_claim_xics,
> > diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> > index 14b02c3aca33..5e30858dc22a 100644
> > --- a/include/hw/ppc/spapr_irq.h
> > +++ b/include/hw/ppc/spapr_irq.h
> > @@ -34,6 +34,7 @@ typedef struct sPAPRIrq {
> >      uint32_t    nr_irqs;
> >      uint32_t    nr_msis;
> >      uint8_t     ov5;
> > +    uint32_t    xics_offset;
> >  
> >      void (*init)(sPAPRMachineState *spapr, Error **errp);
> >      int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> >   
> 



* Re: [Qemu-devel] [PATCH v4 03/15] spapr_irq: Set LSIs at interrupt controller init
  2019-02-13  3:48   ` David Gibson
@ 2019-02-13 12:44     ` Greg Kurz
  0 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-13 12:44 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Wed, 13 Feb 2019 14:48:44 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Feb 12, 2019 at 07:24:13PM +0100, Greg Kurz wrote:
> > The pseries machine only uses LSIs to support legacy PCI devices. Every
> > PHB claims 4 LSIs at realize time. When using in-kernel XICS (or upcoming
> > in-kernel XIVE), QEMU synchronizes the state of all irqs, including these
> > LSIs, later on at machine reset.
> > 
> > In order to support PHB hotplug, we need a way to tell KVM about the LSIs
> > that doesn't require a machine reset.
> > 
> > Since recent machine types allocate all these LSIs in a fixed range for
> > the machine lifetime, identify them when initializing the interrupt
> > controller, long before they get passed to KVM.
> > 
> > > In order to do that, first disentangle interrupt typing and allocation.
> > > Since the vast majority of interrupts are MSIs, make that the default
> > > and have only the LSI users explicitly set the type.
> > 
> > > It is rather straightforward for XIVE. XICS needs some extra care
> > though: allocation state and type are mixed up in the same bits of the
> > flags field within the interrupt state. Setting the LSI bit there at
> > init time would mean the interrupt is de facto allocated, even if no
> > device asked for it. Introduce a bitmap to track LSIs at the ICS level.
> > In order to keep the patch minimal, the bitmap is only used when writing
> > the source state to KVM and when the interrupt is claimed, so that the
> > code that checks the interrupt type through the flags stays untouched.
> > 
> > > With older pseries machines using the XICS legacy IRQ allocation scheme,
> > > all interrupt numbers come from a common pool and there's no such thing
> > > as a fixed range for LSIs. Introduce a helper so that these older
> > > machine types can continue to set the type when allocating the LSI.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/intc/spapr_xive.c        |    7 +------
> >  hw/intc/xics.c              |   10 ++++++++--
> >  hw/intc/xics_kvm.c          |    2 +-
> >  hw/ppc/pnv_psi.c            |    3 ++-
> >  hw/ppc/spapr_events.c       |    4 ++--
> >  hw/ppc/spapr_irq.c          |   42 ++++++++++++++++++++++++++++++++----------
> >  hw/ppc/spapr_pci.c          |    6 ++++--
> >  hw/ppc/spapr_vio.c          |    2 +-
> >  include/hw/ppc/spapr_irq.h  |    5 +++--
> >  include/hw/ppc/spapr_xive.h |    2 +-
> >  include/hw/ppc/xics.h       |    4 +++-
> >  11 files changed, 58 insertions(+), 29 deletions(-)
> > 
> > diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> > index 290a290e43a5..815263ca72ab 100644
> > --- a/hw/intc/spapr_xive.c
> > +++ b/hw/intc/spapr_xive.c
> > @@ -480,18 +480,13 @@ static void spapr_xive_register_types(void)
> >  
> >  type_init(spapr_xive_register_types)
> >  
> > -bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi)
> > +bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn)
> >  {
> > -    XiveSource *xsrc = &xive->source;
> > -
> >      if (lisn >= xive->nr_irqs) {
> >          return false;
> >      }
> >  
> >      xive->eat[lisn].w |= cpu_to_be64(EAS_VALID);
> > -    if (lsi) {
> > -        xive_source_irq_set_lsi(xsrc, lisn);
> > -    }
> >      return true;
> >  }
> >  
> > diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> > index 7cac138067e2..26e8940d7329 100644
> > --- a/hw/intc/xics.c
> > +++ b/hw/intc/xics.c
> > @@ -636,6 +636,7 @@ static void ics_base_realize(DeviceState *dev, Error **errp)
> >          return;
> >      }
> >      ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
> > +    ics->lsi_map = bitmap_new(ics->nr_irqs);
> >  }
> >  
> >  static int ics_base_dispatch_pre_save(void *opaque)
> > @@ -733,12 +734,17 @@ ICPState *xics_icp_get(XICSFabric *xi, int server)
> >      return xic->icp_get(xi, server);
> >  }
> >  
> > -void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
> > +void ics_set_lsi(ICSState *ics, int srcno)
> > +{
> > +    set_bit(srcno, ics->lsi_map);
> > +}
> > +
> > +void ics_claim_irq(ICSState *ics, int srcno)
> >  {
> >      assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
> >  
> >      ics->irqs[srcno].flags |=
> > -        lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
> > +        test_bit(srcno, ics->lsi_map) ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;  
> 
> I really don't like having the trigger type redundantly stored in the
> lsi_map and then again in the flags fields.
> 
> In a sense the natural way to do this would be more like the hardware
> - have two source objects, one for MSIs and one for LSIs, and make the
> trigger a per ICSState rather than per IRQState.  But that would make
> life hard for the legacy support.
> 
> > But... thinking about it, isn't all this overkill anyway?  Can't we
> > fix the problem by simply forcing an ics_set_kvm_state() (and the xive
> > equivalent) at claim time?  It's not like it's a hot path.
> 

I had kinda followed this approach in earlier versions. I'll try
again.
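
For reference, the split this patch makes between "typing" and "allocation" can be modelled in a few lines. The sketch below is a standalone toy, not the QEMU implementation: the ToyICS structure, the toy_* helpers, the fixed size of 64 sources and the flag values are all assumptions chosen for illustration. It only shows the bitmap approach of this version of the patch, not the alternative of pushing the state to KVM at claim time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values are assumptions made for this sketch. */
#define TOY_FLAG_IRQ_LSI   0x1u
#define TOY_FLAG_IRQ_MSI   0x2u
#define TOY_FLAG_IRQ_MASK  (TOY_FLAG_IRQ_LSI | TOY_FLAG_IRQ_MSI)

#define TOY_NR_IRQS 64

typedef struct ToyICS {
    uint8_t  flags[TOY_NR_IRQS];  /* per-source allocation state and type */
    uint64_t lsi_map;             /* bit N set: source N is level-sensitive */
} ToyICS;

/* Record the trigger type at init time without allocating the source. */
static void toy_ics_set_lsi(ToyICS *ics, unsigned srcno)
{
    assert(srcno < TOY_NR_IRQS);
    ics->lsi_map |= UINT64_C(1) << srcno;
}

static bool toy_ics_is_lsi(const ToyICS *ics, unsigned srcno)
{
    assert(srcno < TOY_NR_IRQS);
    return (ics->lsi_map >> srcno) & 1;
}

/* A source is free as long as neither type bit is set in its flags. */
static bool toy_ics_irq_free(const ToyICS *ics, unsigned srcno)
{
    assert(srcno < TOY_NR_IRQS);
    return !(ics->flags[srcno] & TOY_FLAG_IRQ_MASK);
}

/* Allocate the source; the type comes from the bitmap, MSI by default. */
static void toy_ics_claim_irq(ToyICS *ics, unsigned srcno)
{
    assert(toy_ics_irq_free(ics, srcno));
    ics->flags[srcno] |= toy_ics_is_lsi(ics, srcno) ? TOY_FLAG_IRQ_LSI
                                                    : TOY_FLAG_IRQ_MSI;
}
```

Marking a source as LSI leaves it free until something claims it, which is why the type cannot simply be stored in the flags at init time. The cost is the redundancy pointed out above: once claimed, the type lives both in the bitmap and in the flags.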

> >  }
> >  
> >  static void xics_register_types(void)
> > diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> > index dff13300504c..e63979abc7fc 100644
> > --- a/hw/intc/xics_kvm.c
> > +++ b/hw/intc/xics_kvm.c
> > @@ -271,7 +271,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
> >              state |= KVM_XICS_MASKED;
> >          }
> >  
> > -        if (ics->irqs[i].flags & XICS_FLAGS_IRQ_LSI) {
> > +        if (test_bit(i, ics->lsi_map)) {
> >              state |= KVM_XICS_LEVEL_SENSITIVE;
> >              if (irq->status & XICS_STATUS_ASSERTED) {
> >                  state |= KVM_XICS_PENDING;
> > diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
> > index 8ced09506321..e6089e1035c0 100644
> > --- a/hw/ppc/pnv_psi.c
> > +++ b/hw/ppc/pnv_psi.c
> > @@ -487,7 +487,8 @@ static void pnv_psi_realize(DeviceState *dev, Error **errp)
> >      }
> >  
> >      for (i = 0; i < ics->nr_irqs; i++) {
> > -        ics_set_irq_type(ics, i, true);
> > +        ics_set_lsi(ics, i);
> > +        ics_claim_irq(ics, i);
> >      }
> >  
> >      psi->qirqs = qemu_allocate_irqs(ics_simple_set_irq, ics, ics->nr_irqs);
> > diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> > index b9c7ecb9e987..559026d0981c 100644
> > --- a/hw/ppc/spapr_events.c
> > +++ b/hw/ppc/spapr_events.c
> > @@ -713,7 +713,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
> >          epow_irq = spapr_irq_findone(spapr, &error_fatal);
> >      }
> >  
> > -    spapr_irq_claim(spapr, epow_irq, false, &error_fatal);
> > +    spapr_irq_claim(spapr, epow_irq, &error_fatal);
> >  
> >      QTAILQ_INIT(&spapr->pending_events);
> >  
> > @@ -737,7 +737,7 @@ void spapr_events_init(sPAPRMachineState *spapr)
> >              hp_irq = spapr_irq_findone(spapr, &error_fatal);
> >          }
> >  
> > -        spapr_irq_claim(spapr, hp_irq, false, &error_fatal);
> > +        spapr_irq_claim(spapr, hp_irq, &error_fatal);
> >  
> >          spapr_event_sources_register(spapr->event_sources, EVENT_CLASS_HOT_PLUG,
> >                                       hp_irq);
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 8217e0215411..3fc34d7c8a43 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -16,10 +16,13 @@
> >  #include "hw/ppc/spapr_xive.h"
> >  #include "hw/ppc/xics.h"
> >  #include "hw/ppc/xics_spapr.h"
> > +#include "hw/pci-host/spapr.h"
> >  #include "sysemu/kvm.h"
> >  
> >  #include "trace.h"
> >  
> > +#define SPAPR_IRQ_PCI_LSI_NR     (SPAPR_MAX_PHBS * PCI_NUM_PINS)
> > +
> >  void spapr_irq_msi_init(sPAPRMachineState *spapr, uint32_t nr_msis)
> >  {
> >      spapr->irq_map_nr = nr_msis;
> > @@ -102,6 +105,7 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
> >      MachineState *machine = MACHINE(spapr);
> >      int nr_irqs = spapr->irq->nr_irqs;
> >      Error *local_err = NULL;
> > +    int i;
> >  
> >      if (kvm_enabled()) {
> >          if (machine_kernel_irqchip_allowed(machine) &&
> > @@ -128,6 +132,14 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
> >                                        &local_err);
> >      }
> >  
> > +    /* Identify the PCI LSIs */
> > +    if (!SPAPR_MACHINE_GET_CLASS(spapr)->legacy_irq_allocation) {
> > +        for (i = 0; i < SPAPR_IRQ_PCI_LSI_NR; ++i) {
> > +            ics_set_lsi(spapr->ics,
> > +                        i + SPAPR_IRQ_PCI_LSI - spapr->irq->xics_offset);
> > +        }
> > +    }
> > +
> >  error:
> >      error_propagate(errp, local_err);
> >  }
> > @@ -135,7 +147,7 @@ error:
> >  #define ICS_IRQ_FREE(ics, srcno)   \
> >      (!((ics)->irqs[(srcno)].flags & (XICS_FLAGS_IRQ_MASK)))
> >  
> > -static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq, bool lsi,
> > +static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq,
> >                                  Error **errp)
> >  {
> >      ICSState *ics = spapr->ics;
> > @@ -152,7 +164,7 @@ static int spapr_irq_claim_xics(sPAPRMachineState *spapr, int irq, bool lsi,
> >          return -1;
> >      }
> >  
> > -    ics_set_irq_type(ics, irq - ics->offset, lsi);
> > +    ics_claim_irq(ics, irq - ics->offset);
> >      return 0;
> >  }
> >  
> > @@ -296,16 +308,21 @@ static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
> >  
> >      /* Enable the CPU IPIs */
> >      for (i = 0; i < nr_servers; ++i) {
> > -        spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false);
> > +        spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i);
> > +    }
> > +
> > +    /* Identify the PCI LSIs */
> > +    for (i = 0; i < SPAPR_IRQ_PCI_LSI_NR; ++i) {
> > +        xive_source_irq_set_lsi(&spapr->xive->source, SPAPR_IRQ_PCI_LSI + i);
> >      }
> >  
> >      spapr_xive_hcall_init(spapr);
> >  }
> >  
> > -static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq, bool lsi,
> > +static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq,
> >                                  Error **errp)
> >  {
> > -    if (!spapr_xive_irq_claim(spapr->xive, irq, lsi)) {
> > +    if (!spapr_xive_irq_claim(spapr->xive, irq)) {
> >          error_setg(errp, "IRQ %d is invalid", irq);
> >          return -1;
> >      }
> > @@ -465,19 +482,19 @@ static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
> >      }
> >  }
> >  
> > -static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq, bool lsi,
> > +static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq,
> >                                  Error **errp)
> >  {
> >      Error *local_err = NULL;
> >      int ret;
> >  
> > -    ret = spapr_irq_xics.claim(spapr, irq, lsi, &local_err);
> > +    ret = spapr_irq_xics.claim(spapr, irq, &local_err);
> >      if (local_err) {
> >          error_propagate(errp, local_err);
> >          return ret;
> >      }
> >  
> > -    ret = spapr_irq_xive.claim(spapr, irq, lsi, &local_err);
> > +    ret = spapr_irq_xive.claim(spapr, irq, &local_err);
> >      if (local_err) {
> >          error_propagate(errp, local_err);
> >          return ret;
> > @@ -630,9 +647,9 @@ void spapr_irq_init(sPAPRMachineState *spapr, Error **errp)
> >                                        spapr->irq->nr_irqs);
> >  }
> >  
> > -int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp)
> > +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, Error **errp)
> >  {
> > -    return spapr->irq->claim(spapr, irq, lsi, errp);
> > +    return spapr->irq->claim(spapr, irq, errp);
> >  }
> >  
> >  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
> > @@ -712,6 +729,11 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
> >      return first + ics->offset;
> >  }
> >  
> > +void spapr_irq_set_lsi_legacy(sPAPRMachineState *spapr, int irq)
> > +{
> > +    ics_set_lsi(spapr->ics, irq - spapr->irq->xics_offset);
> > +}
> > +
> >  #define SPAPR_IRQ_XICS_LEGACY_NR_IRQS     0x400
> >  
> >  sPAPRIrq spapr_irq_xics_legacy = {
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index c3fb0ac884b0..d68595531d5a 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -391,7 +391,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >      }
> >  
> >      for (i = 0; i < req_num; i++) {
> > -        spapr_irq_claim(spapr, irq + i, false, &err);
> > +        spapr_irq_claim(spapr, irq + i, &err);
> >          if (err) {
> >              if (i) {
> >                  spapr_irq_free(spapr, irq, i);
> > @@ -1742,9 +1742,11 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >                                          "can't allocate LSIs: ");
> >                  return;
> >              }
> > +
> > +            spapr_irq_set_lsi_legacy(spapr, irq);
> >          }
> >  
> > -        spapr_irq_claim(spapr, irq, true, &local_err);
> > +        spapr_irq_claim(spapr, irq, &local_err);
> >          if (local_err) {
> >              error_propagate_prepend(errp, local_err, "can't allocate LSIs: ");
> >              return;
> > diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> > index 2b7e7ecac57f..b1beefc24be5 100644
> > --- a/hw/ppc/spapr_vio.c
> > +++ b/hw/ppc/spapr_vio.c
> > @@ -512,7 +512,7 @@ static void spapr_vio_busdev_realize(DeviceState *qdev, Error **errp)
> >          }
> >      }
> >  
> > -    spapr_irq_claim(spapr, dev->irq, false, &local_err);
> > +    spapr_irq_claim(spapr, dev->irq, &local_err);
> >      if (local_err) {
> >          error_propagate(errp, local_err);
> >          return;
> > diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> > index 5e30858dc22a..0e6c65d55430 100644
> > --- a/include/hw/ppc/spapr_irq.h
> > +++ b/include/hw/ppc/spapr_irq.h
> > @@ -37,7 +37,7 @@ typedef struct sPAPRIrq {
> >      uint32_t    xics_offset;
> >  
> >      void (*init)(sPAPRMachineState *spapr, Error **errp);
> > -    int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> > +    int (*claim)(sPAPRMachineState *spapr, int irq, Error **errp);
> >      void (*free)(sPAPRMachineState *spapr, int irq, int num);
> >      qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
> >      void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
> > @@ -56,7 +56,7 @@ extern sPAPRIrq spapr_irq_xive;
> >  extern sPAPRIrq spapr_irq_dual;
> >  
> >  void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
> > -int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> > +int spapr_irq_claim(sPAPRMachineState *spapr, int irq, Error **errp);
> >  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
> >  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
> >  int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
> > @@ -67,5 +67,6 @@ void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
> >   */
> >  int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp);
> >  #define spapr_irq_findone(spapr, errp) spapr_irq_find(spapr, 1, false, errp)
> > +void spapr_irq_set_lsi_legacy(sPAPRMachineState *spapr, int irq);
> >  
> >  #endif
> > diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> > index 9bec9192e4a0..885ca169cb29 100644
> > --- a/include/hw/ppc/spapr_xive.h
> > +++ b/include/hw/ppc/spapr_xive.h
> > @@ -37,7 +37,7 @@ typedef struct sPAPRXive {
> >      MemoryRegion  tm_mmio;
> >  } sPAPRXive;
> >  
> > -bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi);
> > +bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn);
> >  bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
> >  void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
> >  
> > diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> > index fad786e8b22d..18b083fe2aec 100644
> > --- a/include/hw/ppc/xics.h
> > +++ b/include/hw/ppc/xics.h
> > @@ -133,6 +133,7 @@ struct ICSState {
> >      uint32_t offset;
> >      ICSIRQState *irqs;
> >      XICSFabric *xics;
> > +    unsigned long *lsi_map;
> >  };
> >  
> >  #define ICS_PROP_XICS "xics"
> > @@ -193,7 +194,8 @@ void ics_simple_write_xive(ICSState *ics, int nr, int server,
> >  void ics_simple_set_irq(void *opaque, int srcno, int val);
> >  void ics_kvm_set_irq(void *opaque, int srcno, int val);
> >  
> > -void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
> > +void ics_set_lsi(ICSState *ics, int srcno);
> > +void ics_claim_irq(ICSState *ics, int srcno);
> >  void icp_pic_print_info(ICPState *icp, Monitor *mon);
> >  void ics_pic_print_info(ICSState *ics, Monitor *mon);
> >  
> >   
> 
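
For readers following the thread, the split made by the quoted patch (mark the LSIs once at interrupt controller init, then claim interrupts without passing a type) can be illustrated with a small standalone sketch. Everything below is hypothetical and only mirrors the idea: IrqSource, source_mark_lsi(), source_is_lsi() and source_claim() are made-up names, not the QEMU ICS API.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_IRQS        64
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
#define MAP_WORDS      ((NR_IRQS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical interrupt source: one bit per LSI, one flag per claimed IRQ. */
typedef struct IrqSource {
    unsigned long lsi_map[MAP_WORDS];
    bool claimed[NR_IRQS];
} IrqSource;

/* Done once at init time: record that @srcno is level-triggered. */
static void source_mark_lsi(IrqSource *s, int srcno)
{
    assert(srcno >= 0 && srcno < NR_IRQS);
    s->lsi_map[srcno / BITS_PER_WORD] |= 1UL << (srcno % BITS_PER_WORD);
}

/* The type is looked up from the bitmap instead of being passed around. */
static bool source_is_lsi(const IrqSource *s, int srcno)
{
    assert(srcno >= 0 && srcno < NR_IRQS);
    return (s->lsi_map[srcno / BITS_PER_WORD] >> (srcno % BITS_PER_WORD)) & 1UL;
}

/* Claiming no longer needs to know whether the IRQ is an LSI or an MSI. */
static void source_claim(IrqSource *s, int srcno)
{
    assert(srcno >= 0 && srcno < NR_IRQS);
    s->claimed[srcno] = true;
}
```

With this shape, a caller that claims an interrupt later (for example a hotplugged PHB) does not have to carry the LSI/MSI distinction itself.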



* Re: [Qemu-devel] [PATCH v4 05/15] spapr_irq: Expose the phandle of the interrupt controller
  2019-02-13  3:52   ` David Gibson
@ 2019-02-13 13:11     ` Greg Kurz
  0 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-13 13:11 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Wed, 13 Feb 2019 14:52:04 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Feb 12, 2019 at 07:24:26PM +0100, Greg Kurz wrote:
> > This will be used by PHB hotplug in order to create the "interrupt-map"
> > property of the PHB node.
> > 
> > Reviewed-by: Cédric Le Goater <clg@kaod.org>
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> > v4: - return phandle via a pointer  
> 
> You don't really need to do this.  You already have an Error ** to
> return errors via, so you don't need an error return code.  Plus
> phandles are not permitted to be 0 or -1, so you have some safe values
> even for that case.
> 

Ok, I'll use the return value for the phandle.
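
A minimal standalone sketch of that shape, for illustration only: the Error and Node types below are simplified stand-ins (QEMU's real Error API and libfdt are not used here), and get_phandle() is a hypothetical name. The point is that 0 is never a valid phandle, so it can serve as the failure value while the reason travels through the error argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an error object: just a message slot. */
typedef struct Error {
    const char *msg;
} Error;

/* Hypothetical stand-in for a device tree node. */
typedef struct Node {
    const char *name;
    uint32_t phandle;
} Node;

/*
 * Return the phandle of the node called @nodename, or 0 on failure.
 * 0 and 0xffffffff are not valid phandles, so 0 is a safe error value;
 * the cause of the failure is reported through @err.
 */
static uint32_t get_phandle(const Node *nodes, size_t n,
                            const char *nodename, Error *err)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(nodes[i].name, nodename) == 0) {
            if (nodes[i].phandle == 0 || nodes[i].phandle == UINT32_MAX) {
                err->msg = "node has no valid phandle";
                return 0;
            }
            return nodes[i].phandle;
        }
    }
    err->msg = "node not found";
    return 0;
}
```

The caller then only has to test the returned value (or the error object) and no longer needs an output pointer.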

> > ---
> >  hw/ppc/spapr_irq.c         |   26 ++++++++++++++++++++++++++
> >  include/hw/ppc/spapr_irq.h |    2 ++
> >  2 files changed, 28 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index b8d725e251ba..31495033c37c 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -692,6 +692,32 @@ void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp)
> >      }
> >  }
> >  
> > +int spapr_irq_get_phandle(sPAPRMachineState *spapr, void *fdt,
> > +                          uint32_t *phandle, Error **errp)
> > +{
> > +    const char *nodename = spapr->irq->get_nodename(spapr);
> > +    int offset, ph;
> > +
> > +    offset = fdt_subnode_offset(fdt, 0, nodename);
> > +    if (offset < 0) {
> > +        error_setg(errp, "Can't find node \"%s\": %s", nodename,
> > +                   fdt_strerror(offset));
> > +        return -1;
> > +    }
> > +
> > +    ph = fdt_get_phandle(fdt, offset);
> > +    if (!ph) {
> > +        error_setg(errp, "Can't get phandle of node \"%s\"", nodename);
> > +        return -1;
> > +    }
> > +
> > +    if (phandle) {
> > +        *phandle = ph;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> >  /*
> >   * XICS legacy routines - to deprecate one day
> >   */
> > diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> > index ad7127355441..4b3303ef4f6a 100644
> > --- a/include/hw/ppc/spapr_irq.h
> > +++ b/include/hw/ppc/spapr_irq.h
> > @@ -62,6 +62,8 @@ void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
> >  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
> >  int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
> >  void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
> > +int spapr_irq_get_phandle(sPAPRMachineState *spapr, void *fdt,
> > +                          uint32_t *phandle, Error **errp);
> >  
> >  /*
> >   * XICS legacy routines
> >   
> 



* Re: [Qemu-devel] [PATCH v4 13/15] spapr_drc: Allow FDT fragment to be added later
  2019-02-13  4:05   ` David Gibson
@ 2019-02-13 13:15     ` Greg Kurz
  0 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-13 13:15 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Wed, 13 Feb 2019 15:05:24 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Feb 12, 2019 at 07:25:19PM +0100, Greg Kurz wrote:
> > The current logic is to provide the FDT fragment when attaching a device
> > to a DRC. This works perfectly fine for our current hotplug support, but
> > soon we will add support for PHB hotplug, which has some constraints that
> > CPU, PCI and LMB devices don't seem to have.
> > 
> > The first constraint is that the "ibm,dma-window" property of the PHB
> > node requires the IOMMU to be configured, ie, spapr_tce_table_enable()
> > has been called, which happens during PHB reset. It is okay in the case
> > of hotplug since the device is reset before the hotplug handler is
> > called. On the contrary with coldplug, the hotplug handler is called
> > first and device is only reset during the initial system reset. Trying
> > to create the FDT fragment on the hotplug path in this case, would
> > result in somthing like this:
> > 
> > ibm,dma-window = < 0x80000000 0x00 0x00 0x00 0x00 >;
> > 
> > This causes Linux in the guest to panic when the PHB is simply removed
> > and re-added using the drmgr command:
> > 
> > 	page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz));
> > 	if (!page)
> > 		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);
> > 
> > The second and maybe more problematic constraint is that the
> > "interrupt-map" property needs to reference the interrupt controller
> > node using the very same phandle that SLOF has already exposed to the
> > guest. QEMU requires SLOF to call the private KVMPPC_H_UPDATE_DT hcall
> > at some point to know about this phandle. With the latest QEMU and SLOF,
> > this happens when SLOF gets quiesced. This means that if the PHB gets
> > hotplugged after CAS but before SLOF quiesce, then we're sure that the
> > phandle is not known when the hotplug handler is called.
> > 
> > The FDT is only needed when the guest first invokes RTAS to configure
> > the connector actually, long after SLOF quiesce. Let's postpone the
> > creation of FDT fragments for PHBs to rtas_ibm_configure_connector().
> > 
> > Since we only need this for PHBs, introduce a new method in the base
> > DRC class for that. It will implemented for "spapr-drc-phb" DRCs in
> > a subsequent patch.
> > 
> > Allow spapr_drc_attach() to be passed a NULL fdt argument if the method
> > is available.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>  
> 
> The basic solution looks fine.  However I don't much like the fact
> that this leaves us with two ways to handle the fdt fragment - either
> at connect time or at configure connector time via a callback.  QEMU
> already has way too many places where there are confusingly multiple
> ways to do things.
> 
> I know it's a detour, but I'd really prefer to convert the existing
> DRC handling to this new callback scheme, rather than have two
> different approaches.
> 

Ok. I'll introduce the new callback scheme and convert the existing code
in a separate series.
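
To make the target of that conversion concrete, here is a rough standalone sketch of a single callback-based scheme: attach only records the callback, and the fragment is built the first time the configure step needs it. All names (Connector, populate_fn, connector_attach(), connector_configure(), populate_phb()) are hypothetical, and a plain string stands in for the FDT blob.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FRAGMENT_SIZE 64

typedef struct Connector Connector;

/* Hypothetical callback: fill @buf with the fragment, return 0 on success. */
typedef int (*populate_fn)(Connector *c, char *buf, size_t len);

struct Connector {
    char *fdt;            /* NULL until the guest first asks for it */
    populate_fn populate; /* the only way a fragment gets produced */
    int populate_calls;   /* bookkeeping for this example only */
};

/* Example callback for a PHB-like device. */
static int populate_phb(Connector *c, char *buf, size_t len)
{
    c->populate_calls++;
    if (len < sizeof("pci")) {
        return -1;
    }
    strcpy(buf, "pci");
    return 0;
}

/* Attach time: no fragment is passed in, only the callback is recorded. */
static void connector_attach(Connector *c, populate_fn populate)
{
    c->fdt = NULL;
    c->populate = populate;
    c->populate_calls = 0;
}

/* Configure time: build on first use, reuse afterwards; NULL on error. */
static const char *connector_configure(Connector *c)
{
    if (!c->fdt) {
        char *buf = malloc(FRAGMENT_SIZE);

        if (!buf) {
            return NULL;
        }
        if (c->populate(c, buf, FRAGMENT_SIZE) != 0) {
            free(buf);
            return NULL;
        }
        c->fdt = buf;
    }
    return c->fdt;
}
```

With every connector type going through the callback, the attach path never needs a fragment argument, which is what removes the second way of doing things.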

> > ---
> >  hw/ppc/spapr_drc.c         |   34 +++++++++++++++++++++++++++++-----
> >  include/hw/ppc/spapr_drc.h |    6 ++++++
> >  2 files changed, 35 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > index 189ee681062a..c5a281915665 100644
> > --- a/hw/ppc/spapr_drc.c
> > +++ b/hw/ppc/spapr_drc.c
> > @@ -22,6 +22,7 @@
> >  #include "qemu/error-report.h"
> >  #include "hw/ppc/spapr.h" /* for RTAS return codes */
> >  #include "hw/pci-host/spapr.h" /* spapr_phb_remove_pci_device_cb callback */
> > +#include "sysemu/device_tree.h"
> >  #include "trace.h"
> >  
> >  #define DRC_CONTAINER_PATH "/dr-connector"
> > @@ -376,6 +377,8 @@ static void prop_get_fdt(Object *obj, Visitor *v, const char *name,
> >  void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
> >                        int fdt_start_offset, Error **errp)
> >  {
> > +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > +
> >      trace_spapr_drc_attach(spapr_drc_index(drc));
> >  
> >      if (drc->dev) {
> > @@ -384,11 +387,14 @@ void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
> >      }
> >      g_assert((drc->state == SPAPR_DRC_STATE_LOGICAL_UNUSABLE)
> >               || (drc->state == SPAPR_DRC_STATE_PHYSICAL_POWERON));
> > -    g_assert(fdt);
> > +    g_assert(fdt || drck->populate_dt);
> >  
> >      drc->dev = d;
> > -    drc->fdt = fdt;
> > -    drc->fdt_start_offset = fdt_start_offset;
> > +
> > +    if (fdt) {
> > +        drc->fdt = fdt;
> > +        drc->fdt_start_offset = fdt_start_offset;
> > +    }
> >  
> >      object_property_add_link(OBJECT(drc), "device",
> >                               object_get_typename(OBJECT(drc->dev)),
> > @@ -1118,10 +1124,28 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
> >          goto out;
> >      }
> >  
> > -    g_assert(drc->fdt);
> > -
> >      drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> >  
> > +    g_assert(drc->fdt || drck->populate_dt);
> > +
> > +    if (!drc->fdt) {
> > +        Error *local_err = NULL;
> > +        void *fdt;
> > +        int fdt_size;
> > +
> > +        fdt = create_device_tree(&fdt_size);
> > +
> > +        if (drck->populate_dt(drc->dev, spapr, fdt, &drc->fdt_start_offset,
> > +                               &local_err)) {
> > +            g_free(fdt);
> > +            error_free(local_err);
> > +            rc = SPAPR_DR_CC_RESPONSE_ERROR;
> > +            goto out;
> > +        }
> > +
> > +        drc->fdt = fdt;
> > +    }
> > +
> >      do {
> >          uint32_t tag;
> >          const char *name;
> > diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> > index 56bba36ad4da..e947d6987bf2 100644
> > --- a/include/hw/ppc/spapr_drc.h
> > +++ b/include/hw/ppc/spapr_drc.h
> > @@ -18,6 +18,7 @@
> >  #include "qom/object.h"
> >  #include "sysemu/sysemu.h"
> >  #include "hw/qdev.h"
> > +#include "qapi/error.h"
> >  
> >  #define TYPE_SPAPR_DR_CONNECTOR "spapr-dr-connector"
> >  #define SPAPR_DR_CONNECTOR_GET_CLASS(obj) \
> > @@ -221,6 +222,8 @@ typedef struct sPAPRDRConnector {
> >      int fdt_start_offset;
> >  } sPAPRDRConnector;
> >  
> > +struct sPAPRMachineState;
> > +
> >  typedef struct sPAPRDRConnectorClass {
> >      /*< private >*/
> >      DeviceClass parent;
> > @@ -236,6 +239,9 @@ typedef struct sPAPRDRConnectorClass {
> >      uint32_t (*isolate)(sPAPRDRConnector *drc);
> >      uint32_t (*unisolate)(sPAPRDRConnector *drc);
> >      void (*release)(DeviceState *dev);
> > +
> > +    int (*populate_dt)(DeviceState *dev, struct sPAPRMachineState *spapr,
> > +                       void *fdt, int *fdt_start_offset, Error **errp);
> >  } sPAPRDRConnectorClass;
> >  
> >  typedef struct sPAPRDRCPhysical {
> >   
> 



* Re: [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug
  2019-02-13  4:13   ` David Gibson
@ 2019-02-13 13:24     ` Greg Kurz
  0 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-13 13:24 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-devel, qemu-ppc, qemu-s390x, Alexey Kardashevskiy,
	Cédric Le Goater, Michael Roth, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Eduardo Habkost,
	David Hildenbrand, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Wed, 13 Feb 2019 15:13:08 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Feb 12, 2019 at 07:25:25PM +0100, Greg Kurz wrote:
> > Hotplugging PHBs is a machine-level operation, but PHBs reside on the
> > main system bus, so we register spapr machine as the handler for the
> > main system bus.
> > 
> > Provide the usual pre-plug, plug and unplug-request handlers.
> > 
> > Move the checking of the PHB index to the pre-plug handler. It is okay
> > to do that and assert in the realize function because the pre-plug
> > handler is always called, even for the oldest machine types we support.
> > 
> > Unlike with other device types, there are some cases where we cannot
> > provide the FDT fragment of the PHB from the plug handler, eg, before
> > KVMPPC_H_UPDATE_DT was called. Do this from a DRC callback that is
> > called just before the first FDT fragment is exposed to the guest.
> > 
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > (Fixed interrupt controller phandle in "interrupt-map" and
> >  TCE table size in "ibm,dma-window" FDT fragment, Greg Kurz)
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> > v4: - populate FDT fragment in a DRC callback
> > v3: - reworked phandle handling some more
> > v2: - reworked phandle handling
> >     - sync LSIs to KVM
> > ---
> > ---
> >  hw/ppc/spapr.c         |  121 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/ppc/spapr_drc.c     |    2 +
> >  hw/ppc/spapr_pci.c     |   16 ------
> >  include/hw/ppc/spapr.h |    5 ++
> >  4 files changed, 127 insertions(+), 17 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 021758825b7e..06ce0babcb54 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2930,6 +2930,11 @@ static void spapr_machine_init(MachineState *machine)
> >      register_savevm_live(NULL, "spapr/htab", -1, 1,
> >                           &savevm_htab_handlers, spapr);
> >  
> > +    if (smc->dr_phb_enabled) {
> > +        qbus_set_hotplug_handler(sysbus_get_default(), OBJECT(machine),
> > +                                 &error_fatal);
> > +    }  
> 
> I think you could do this unconditionally and just check
> dr_phb_enabled at pre_plug.  That makes it more consistent with the
> other hotplug types, and I suspect will give us better error messages.
> 

Ok.
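
Roughly, the check would then live in the pre-plug path. The following is only a standalone sketch under assumptions: Machine, Phb and phb_pre_plug() are simplified, hypothetical stand-ins, and rejecting only hotplugged PHBs (so that coldplugged ones keep working on older machine types) is my assumption of how the check would be written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PHB_INDEX_UNSET ((unsigned)-1)

/* Hypothetical stand-ins for the machine class flag and the PHB state. */
typedef struct Machine {
    bool dr_phb_enabled;
} Machine;

typedef struct Phb {
    unsigned index;
    bool hotplugged;
} Phb;

/*
 * Pre-plug check with the handler registered unconditionally.
 * Returns NULL on success, or a static error message on failure.
 */
static const char *phb_pre_plug(const Machine *m, const Phb *phb)
{
    if (phb->hotplugged && !m->dr_phb_enabled) {
        return "PHB hotplug not supported for this machine";
    }
    if (phb->index == PHB_INDEX_UNSET) {
        return "\"index\" for PAPR PHB is mandatory";
    }
    return NULL;
}
```

This way the user gets an explicit message from the pre-plug handler instead of whatever generic error the bus produces when no hotplug handler is set.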

> >      qemu_register_boot_set(spapr_boot_set, spapr);
> >  
> >      if (kvm_enabled()) {
> > @@ -3733,6 +3738,108 @@ out:
> >      error_propagate(errp, local_err);
> >  }
> >  
> > +int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
> > +                 int *fdt_start_offset, Error **errp)
> > +{
> > +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > +    uint32_t intc_phandle;
> > +
> > +    if (spapr_irq_get_phandle(spapr, spapr->fdt_blob, &intc_phandle, errp)) {
> > +        return -1;
> > +    }
> > +
> > +    if (spapr_populate_pci_dt(sphb, intc_phandle, fdt, spapr->irq->nr_msis,
> > +                              fdt_start_offset)) {
> > +        error_setg(errp, "unable to create FDT node for PHB %d", sphb->index);
> > +        return -1;
> > +    }
> > +
> > +    /* generally SLOF creates these, for hotplug it's up to QEMU */
> > +    _FDT(fdt_setprop_string(fdt, *fdt_start_offset, "name", "pci"));
> > +
> > +    return 0;
> > +}
> > +
> > +static void spapr_phb_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                               Error **errp)
> > +{
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
> > +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> > +    const unsigned windows_supported = spapr_phb_windows_supported(sphb);
> > +
> > +    if (sphb->index == (uint32_t)-1) {
> > +        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
> > +        return;
> > +    }
> > +
> > +    /*
> > +     * This will check that sphb->index doesn't exceed the maximum number of
> > +     * PHBs for the current machine type.
> > +     */
> > +    smc->phb_placement(spapr, sphb->index,
> > +                       &sphb->buid, &sphb->io_win_addr,
> > +                       &sphb->mem_win_addr, &sphb->mem64_win_addr,
> > +                       windows_supported, sphb->dma_liobn, errp);
> > +}
> > +
> > +static void spapr_phb_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                           Error **errp)
> > +{
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> > +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > +    sPAPRDRConnector *drc;
> > +    bool hotplugged = spapr_drc_hotplugged(dev);
> > +    Error *local_err = NULL;
> > +
> > +    if (!smc->dr_phb_enabled) {
> > +        return;
> > +    }
> > +
> > +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
> > +    /* hotplug hooks should check it's enabled before getting this far */
> > +    assert(drc);
> > +
> > +    /*
> > +     * The FDT fragment will be added during the first invocation of RTAS
> > +     * ibm,configure-connector for this device, when we're sure
> > +     * that the IOMMU is configured and that QEMU knows the phandle of the
> > +     * interrupt controller.
> > +     */
> > +    spapr_drc_attach(drc, DEVICE(dev), NULL, 0, &local_err);
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        return;
> > +    }
> > +
> > +    if (hotplugged) {
> > +        spapr_hotplug_req_add_by_index(drc);
> > +    } else {
> > +        spapr_drc_reset(drc);
> > +    }
> > +}
> > +
> > +void spapr_phb_release(DeviceState *dev)
> > +{
> > +    object_unparent(OBJECT(dev));
> > +}
> > +
> > +static void spapr_phb_unplug_request(HotplugHandler *hotplug_dev,
> > +                                     DeviceState *dev, Error **errp)
> > +{
> > +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > +    sPAPRDRConnector *drc;
> > +
> > +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
> > +    assert(drc);
> > +
> > +    if (!spapr_drc_unplug_requested(drc)) {
> > +        spapr_drc_detach(drc);
> > +        spapr_hotplug_req_remove_by_index(drc);
> > +    }
> > +}
> > +
> >  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >                                        DeviceState *dev, Error **errp)
> >  {
> > @@ -3740,6 +3847,8 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >          spapr_memory_plug(hotplug_dev, dev, errp);
> >      } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
> >          spapr_core_plug(hotplug_dev, dev, errp);
> > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> > +        spapr_phb_plug(hotplug_dev, dev, errp);
> >      }
> >  }
> >  
> > @@ -3758,6 +3867,7 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
> >  {
> >      sPAPRMachineState *sms = SPAPR_MACHINE(OBJECT(hotplug_dev));
> >      MachineClass *mc = MACHINE_GET_CLASS(sms);
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> >  
> >      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> >          if (spapr_ovec_test(sms->ov5_cas, OV5_HP_EVT)) {
> > @@ -3777,6 +3887,12 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
> >              return;
> >          }
> >          spapr_core_unplug_request(hotplug_dev, dev, errp);
> > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> > +        if (!smc->dr_phb_enabled) {
> > +            error_setg(errp, "PHB hot unplug not supported on this machine");
> > +            return;
> > +        }
> > +        spapr_phb_unplug_request(hotplug_dev, dev, errp);
> >      }
> >  }
> >  
> > @@ -3787,6 +3903,8 @@ static void spapr_machine_device_pre_plug(HotplugHandler *hotplug_dev,
> >          spapr_memory_pre_plug(hotplug_dev, dev, errp);
> >      } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
> >          spapr_core_pre_plug(hotplug_dev, dev, errp);
> > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> > +        spapr_phb_pre_plug(hotplug_dev, dev, errp);
> >      }
> >  }
> >  
> > @@ -3794,7 +3912,8 @@ static HotplugHandler *spapr_get_hotplug_handler(MachineState *machine,
> >                                                   DeviceState *dev)
> >  {
> >      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
> > -        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
> > +        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE) ||
> > +        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> >          return HOTPLUG_HANDLER(machine);
> >      }
> >      return NULL;
> > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > index c5a281915665..22563a381a37 100644
> > --- a/hw/ppc/spapr_drc.c
> > +++ b/hw/ppc/spapr_drc.c
> > @@ -709,6 +709,8 @@ static void spapr_drc_phb_class_init(ObjectClass *k, void *data)
> >      drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_PHB;
> >      drck->typename = "PHB";
> >      drck->drc_name_prefix = "PHB ";
> > +    drck->release = spapr_phb_release;
> > +    drck->populate_dt = spapr_dt_phb;
> >  }
> >  
> >  static const TypeInfo spapr_dr_connector_info = {
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 7df7f6502f93..d0caca627455 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -1647,21 +1647,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >          return;
> >      }
> >  
> > -    if (sphb->index != (uint32_t)-1) {
> > -        Error *local_err = NULL;
> > -
> > -        smc->phb_placement(spapr, sphb->index,
> > -                           &sphb->buid, &sphb->io_win_addr,
> > -                           &sphb->mem_win_addr, &sphb->mem64_win_addr,
> > -                           windows_supported, sphb->dma_liobn, &local_err);
> > -        if (local_err) {
> > -            error_propagate(errp, local_err);
> > -            return;
> > -        }
> > -    } else {
> > -        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
> > -        return;
> > -    }
> > +    assert(sphb->index != (uint32_t)-1); /* checked in spapr_phb_pre_plug() */
> >  
> >      if (sphb->mem64_win_size != 0) {
> >          if (sphb->mem_win_size > SPAPR_PCI_MEM32_WIN_SIZE) {
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index a3074e7fea37..69d9c2196ca2 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -764,9 +764,12 @@ void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
> >  void spapr_clear_pending_events(sPAPRMachineState *spapr);
> >  int spapr_max_server_number(sPAPRMachineState *spapr);
> >  
> > -/* CPU and LMB DRC release callbacks. */
> > +/* DRC callbacks. */
> >  void spapr_core_release(DeviceState *dev);
> >  void spapr_lmb_release(DeviceState *dev);
> > +void spapr_phb_release(DeviceState *dev);
> > +int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
> > +                 int *fdt_start_offset, Error **errp);
> >  
> >  void spapr_rtc_read(sPAPRRTCState *rtc, struct tm *tm, uint32_t *ns);
> >  int spapr_rtc_import_offset(sPAPRRTCState *rtc, int64_t legacy_offset);
> >   
> 



* Re: [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug
  2019-02-13  9:25   ` David Hildenbrand
@ 2019-02-13 13:25     ` Greg Kurz
  0 siblings, 0 replies; 35+ messages in thread
From: Greg Kurz @ 2019-02-13 13:25 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: David Gibson, qemu-devel, qemu-ppc, qemu-s390x,
	Alexey Kardashevskiy, Cédric Le Goater, Michael Roth,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Eduardo Habkost, Cornelia Huck, Gerd Hoffmann, Dmitry Fleytman,
	Thomas Huth

On Wed, 13 Feb 2019 10:25:07 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 12.02.19 19:25, Greg Kurz wrote:
> > Hotplugging PHBs is a machine-level operation, but PHBs reside on the
> > main system bus, so we register spapr machine as the handler for the
> > main system bus.
> > 
> > Provide the usual pre-plug, plug and unplug-request handlers.
> > 
> > Move the checking of the PHB index to the pre-plug handler. It is okay
> > to do that and assert in the realize function because the pre-plug
> > handler is always called, even for the oldest machine types we support.
> > 
> > Unlike with other device types, there are some cases where we cannot
> > provide the FDT fragment of the PHB from the plug handler, e.g. before
> > KVMPPC_H_UPDATE_DT has been called. Do this from a DRC callback that is
> > called just before the first FDT fragment is exposed to the guest.
> > 
> > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > (Fixed interrupt controller phandle in "interrupt-map" and
> >  TCE table size in "ibm,dma-window" FDT fragment, Greg Kurz)
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> > v4: - populate FDT fragment in a DRC callback
> > v3: - reworked phandle handling some more
> > v2: - reworked phandle handling
> >     - sync LSIs to KVM
> > ---
> >  hw/ppc/spapr.c         |  121 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/ppc/spapr_drc.c     |    2 +
> >  hw/ppc/spapr_pci.c     |   16 ------
> >  include/hw/ppc/spapr.h |    5 ++
> >  4 files changed, 127 insertions(+), 17 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 021758825b7e..06ce0babcb54 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2930,6 +2930,11 @@ static void spapr_machine_init(MachineState *machine)
> >      register_savevm_live(NULL, "spapr/htab", -1, 1,
> >                           &savevm_htab_handlers, spapr);
> >  
> > +    if (smc->dr_phb_enabled) {
> > +        qbus_set_hotplug_handler(sysbus_get_default(), OBJECT(machine),
> > +                                 &error_fatal);
> > +    }
> > +
> >      qemu_register_boot_set(spapr_boot_set, spapr);
> >  
> >      if (kvm_enabled()) {
> > @@ -3733,6 +3738,108 @@ out:
> >      error_propagate(errp, local_err);
> >  }
> >  
> > +int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
> > +                 int *fdt_start_offset, Error **errp)
> > +{
> > +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > +    uint32_t intc_phandle;
> > +
> > +    if (spapr_irq_get_phandle(spapr, spapr->fdt_blob, &intc_phandle, errp)) {
> > +        return -1;
> > +    }
> > +
> > +    if (spapr_populate_pci_dt(sphb, intc_phandle, fdt, spapr->irq->nr_msis,
> > +                              fdt_start_offset)) {
> > +        error_setg(errp, "unable to create FDT node for PHB %d", sphb->index);
> > +        return -1;
> > +    }
> > +
> > +    /* generally SLOF creates these, for hotplug it's up to QEMU */
> > +    _FDT(fdt_setprop_string(fdt, *fdt_start_offset, "name", "pci"));
> > +
> > +    return 0;
> > +}
> > +
> > +static void spapr_phb_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                               Error **errp)
> > +{
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
> > +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> > +    const unsigned windows_supported = spapr_phb_windows_supported(sphb);
> > +
> > +    if (sphb->index == (uint32_t)-1) {
> > +        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
> > +        return;
> > +    }
> > +
> > +    /*
> > +     * This will check that sphb->index doesn't exceed the maximum number of
> > +     * PHBs for the current machine type.
> > +     */
> > +    smc->phb_placement(spapr, sphb->index,
> > +                       &sphb->buid, &sphb->io_win_addr,
> > +                       &sphb->mem_win_addr, &sphb->mem64_win_addr,
> > +                       windows_supported, sphb->dma_liobn, errp);
> > +}
> > +
> > +static void spapr_phb_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                           Error **errp)
> > +{
> > +    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> > +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > +    sPAPRDRConnector *drc;
> > +    bool hotplugged = spapr_drc_hotplugged(dev);
> > +    Error *local_err = NULL;
> > +
> > +    if (!smc->dr_phb_enabled) {
> > +        return;
> > +    }
> > +
> > +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
> > +    /* hotplug hooks should check it's enabled before getting this far */
> > +    assert(drc);
> > +
> > +    /*
> > +     * The FDT fragment will be added during the first invocation of RTAS
> > +     * ibm,configure-connector for this device, when we're sure that the
> > +     * IOMMU is configured and that QEMU knows the phandle of the interrupt
> > +     * controller.
> > +     */
> > +    spapr_drc_attach(drc, DEVICE(dev), NULL, 0, &local_err);
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        return;
> > +    }
> > +
> > +    if (hotplugged) {
> > +        spapr_hotplug_req_add_by_index(drc);
> > +    } else {
> > +        spapr_drc_reset(drc);
> > +    }
> > +}
> > +
> > +void spapr_phb_release(DeviceState *dev)
> > +{
> > +    object_unparent(OBJECT(dev));
> > +}
> > +  
> 
> Please call the unplug handler here, just like we already do in
> spapr_phb_remove_pci_device_cb().
> 
> And add an unplug handler that simply calls, e.g.,
> qdev_simple_device_unplug_cb().
> 
> Otherwise this will break with
> [PATCH RFCv2 0/9] qdev: Hotplug handler chaining + virtio-pmem
> 

Yes, I'll do that.
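
For readers following the thread, the shape of the requested change can be
sketched roughly as below. This is a standalone toy model, not QEMU code:
Handler, Device, simple_unplug_cb() and phb_release() are hypothetical
stand-ins for the hotplug handler, the PHB device, a simple unplug callback
and the DRC release callback. The only point it illustrates is that the
release path goes through the handler's unplug hook instead of dropping the
device directly, so that a chained handler still gets a chance to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT QEMU's real types. */
typedef struct Device Device;
typedef struct Handler Handler;

struct Handler {
    /* Final unplug step; a chained handler could wrap this hook. */
    void (*unplug)(Handler *h, Device *dev);
    int unplug_calls;   /* bookkeeping for the illustration only */
};

struct Device {
    Handler *handler;   /* who manages hotplug for this device */
    bool parented;      /* stand-in for "still attached to its parent" */
};

/* Stand-in for a simple unplug callback: it just drops the device. */
static void simple_unplug_cb(Handler *h, Device *dev)
{
    h->unplug_calls++;
    dev->parented = false;
}

/*
 * Stand-in for the release callback. Rather than unparenting the device
 * itself, it delegates to the handler's unplug hook.
 */
static void phb_release(Device *dev)
{
    assert(dev->handler && dev->handler->unplug);
    dev->handler->unplug(dev->handler, dev);
}
```

With a Handler whose unplug hook is simple_unplug_cb and a Device pointing at
it, calling phb_release() on the device runs the hook exactly once and leaves
the device unparented.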

> 
> > +static void spapr_phb_unplug_request(HotplugHandler *hotplug_dev,
> > +                                     DeviceState *dev, Error **errp)
> > +{
> > +    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > +    sPAPRDRConnector *drc;
> > +
> > +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PHB, sphb->index);
> > +    assert(drc);
> > +
> > +    if (!spapr_drc_unplug_requested(drc)) {
> > +        spapr_drc_detach(drc);
> > +        spapr_hotplug_req_remove_by_index(drc);
> > +    }
> > +}
> > +
> >  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >                                        DeviceState *dev, Error **errp)
> >  {
> > @@ -3740,6 +3847,8 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> >          spapr_memory_plug(hotplug_dev, dev, errp);
> >      } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
> >          spapr_core_plug(hotplug_dev, dev, errp);
> > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> > +        spapr_phb_plug(hotplug_dev, dev, errp);
> >      }
> >  }
> >  
> > @@ -3758,6 +3867,7 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
> >  {
> >      sPAPRMachineState *sms = SPAPR_MACHINE(OBJECT(hotplug_dev));
> >      MachineClass *mc = MACHINE_GET_CLASS(sms);
> > +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> >  
> >      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> >          if (spapr_ovec_test(sms->ov5_cas, OV5_HP_EVT)) {
> > @@ -3777,6 +3887,12 @@ static void spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
> >              return;
> >          }
> >          spapr_core_unplug_request(hotplug_dev, dev, errp);
> > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> > +        if (!smc->dr_phb_enabled) {
> > +            error_setg(errp, "PHB hot unplug not supported on this machine");
> > +            return;
> > +        }
> > +        spapr_phb_unplug_request(hotplug_dev, dev, errp);
> >      }
> >  }
> >  
> > @@ -3787,6 +3903,8 @@ static void spapr_machine_device_pre_plug(HotplugHandler *hotplug_dev,
> >          spapr_memory_pre_plug(hotplug_dev, dev, errp);
> >      } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
> >          spapr_core_pre_plug(hotplug_dev, dev, errp);
> > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> > +        spapr_phb_pre_plug(hotplug_dev, dev, errp);
> >      }
> >  }
> >  
> > @@ -3794,7 +3912,8 @@ static HotplugHandler *spapr_get_hotplug_handler(MachineState *machine,
> >                                                   DeviceState *dev)
> >  {
> >      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
> > -        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE)) {
> > +        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_CPU_CORE) ||
> > +        object_dynamic_cast(OBJECT(dev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> >          return HOTPLUG_HANDLER(machine);
> >      }
> >      return NULL;
> > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > index c5a281915665..22563a381a37 100644
> > --- a/hw/ppc/spapr_drc.c
> > +++ b/hw/ppc/spapr_drc.c
> > @@ -709,6 +709,8 @@ static void spapr_drc_phb_class_init(ObjectClass *k, void *data)
> >      drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_PHB;
> >      drck->typename = "PHB";
> >      drck->drc_name_prefix = "PHB ";
> > +    drck->release = spapr_phb_release;
> > +    drck->populate_dt = spapr_dt_phb;
> >  }
> >  
> >  static const TypeInfo spapr_dr_connector_info = {
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 7df7f6502f93..d0caca627455 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -1647,21 +1647,7 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >          return;
> >      }
> >  
> > -    if (sphb->index != (uint32_t)-1) {
> > -        Error *local_err = NULL;
> > -
> > -        smc->phb_placement(spapr, sphb->index,
> > -                           &sphb->buid, &sphb->io_win_addr,
> > -                           &sphb->mem_win_addr, &sphb->mem64_win_addr,
> > -                           windows_supported, sphb->dma_liobn, &local_err);
> > -        if (local_err) {
> > -            error_propagate(errp, local_err);
> > -            return;
> > -        }
> > -    } else {
> > -        error_setg(errp, "\"index\" for PAPR PHB is mandatory");
> > -        return;
> > -    }
> > +    assert(sphb->index != (uint32_t)-1); /* checked in spapr_phb_pre_plug() */
> >  
> >      if (sphb->mem64_win_size != 0) {
> >          if (sphb->mem_win_size > SPAPR_PCI_MEM32_WIN_SIZE) {
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index a3074e7fea37..69d9c2196ca2 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -764,9 +764,12 @@ void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
> >  void spapr_clear_pending_events(sPAPRMachineState *spapr);
> >  int spapr_max_server_number(sPAPRMachineState *spapr);
> >  
> > -/* CPU and LMB DRC release callbacks. */
> > +/* DRC callbacks. */
> >  void spapr_core_release(DeviceState *dev);
> >  void spapr_lmb_release(DeviceState *dev);
> > +void spapr_phb_release(DeviceState *dev);
> > +int spapr_dt_phb(DeviceState *dev, sPAPRMachineState *spapr, void *fdt,
> > +                 int *fdt_start_offset, Error **errp);
> >  
> >  void spapr_rtc_read(sPAPRRTCState *rtc, struct tm *tm, uint32_t *ns);
> >  int spapr_rtc_import_offset(sPAPRRTCState *rtc, int64_t legacy_offset);
> >   
> 
> 


end of thread, other threads:[~2019-02-13 13:26 UTC | newest]

Thread overview: 35+ messages
2019-02-12 18:23 [Qemu-devel] [PATCH v4 00/15] spapr: Add support for PHB hotplug Greg Kurz
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 01/15] spapr_irq: Add an @xics_offset field to sPAPRIrq Greg Kurz
2019-02-12 20:07   ` Cédric Le Goater
2019-02-13  3:26   ` David Gibson
2019-02-13 12:23     ` Greg Kurz
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 02/15] xive: Only set source type for LSIs Greg Kurz
2019-02-13  3:27   ` David Gibson
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 03/15] spapr_irq: Set LSIs at interrupt controller init Greg Kurz
2019-02-12 20:17   ` Cédric Le Goater
2019-02-13  3:48   ` David Gibson
2019-02-13 12:44     ` Greg Kurz
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 04/15] spapr: Expose the name of the interrupt controller node Greg Kurz
2019-02-13  3:50   ` David Gibson
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 05/15] spapr_irq: Expose the phandle of the interrupt controller Greg Kurz
2019-02-13  3:52   ` David Gibson
2019-02-13 13:11     ` Greg Kurz
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 06/15] spapr_pci: add PHB unrealize Greg Kurz
2019-02-13  3:56   ` David Gibson
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 07/15] spapr: create DR connectors for PHBs Greg Kurz
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 08/15] spapr: populate PHB DRC entries for root DT node Greg Kurz
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 09/15] spapr_events: add support for phb hotplug events Greg Kurz
2019-02-12 18:24 ` [Qemu-devel] [PATCH v4 10/15] qdev: pass an Object * to qbus_set_hotplug_handler() Greg Kurz
2019-02-13  3:59   ` David Gibson
2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 11/15] spapr_pci: provide node start offset via spapr_populate_pci_dt() Greg Kurz
2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 12/15] spapr_pci: add ibm,my-drc-index property for PHB hotplug Greg Kurz
2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 13/15] spapr_drc: Allow FDT fragment to be added later Greg Kurz
2019-02-13  4:05   ` David Gibson
2019-02-13 13:15     ` Greg Kurz
2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 14/15] spapr: add hotplug hooks for PHB hotplug Greg Kurz
2019-02-13  4:13   ` David Gibson
2019-02-13 13:24     ` Greg Kurz
2019-02-13  9:25   ` David Hildenbrand
2019-02-13 13:25     ` Greg Kurz
2019-02-12 18:25 ` [Qemu-devel] [PATCH v4 15/15] spapr: enable PHB hotplug for default pseries machine type Greg Kurz
2019-02-13  4:13   ` David Gibson
