All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9)
@ 2018-12-11 22:38 Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 01/12] spapr: introduce a new machine IRQ backend for XIVE Cédric Le Goater
                   ` (12 more replies)
  0 siblings, 13 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

Hello,

Here is the version 8 of the QEMU models adding support for the XIVE
interrupt controller to the sPAPR machine, under TCG only. KVM support
will be proposed in an other patchset, along with the KVM XIVE device
patchset, so will PowerNV.

The most important changes for sPAPR are the introduction of a 'ic-mode'
machine option to select the interrupt mode of the machine and the fix
for CPU hotplug.

Thanks,

C.

Changes in v8 (sPAPR only) :

 - reworked spapr_irq_init_xive() to use qdev_create()
 - changed the EQ sizes list property to contain 64K only
 - fixed OS CAM line of hotplugged CPUs (TCG)
 - introduced a check on CPU type when activating XIVE.
 - introduced an 'ic-mode' machine option
 - renamed spapr_xive_enable_mmio() to spapr_xive_mmio_set_enabled()
 - changed default CPU type to POWER9
 
Changes in v7 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg01760.html

Changes in v6 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00965.html

Changes in v5 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg03218.html

Changes in v4 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg01672.html


= XIVE =================================================================


The POWER9 processor comes with a new interrupt controller, called
XIVE as "eXternal Interrupt Virtualization Engine".


* Overall architecture


             XIVE Interrupt Controller
             +------------------------------------+      IPIs
             | +---------+ +---------+ +--------+ |    +-------+
             | |VC       | |CQ       | |PC      |----> | CORES |
             | |     esb | |         | |        |----> |       |
             | |     eas | |  Bridge | |   tctx |----> |       |
             | |SC   end | |         | |    nvt | |    |       |
 +------+    | +---------+ +----+----+ +--------+ |    +-+-+-+-+
 | RAM  |    +------------------|-----------------+      | | |
 |      |                       |                        | | |
 |      |                       |                        | | |
 |      |  +--------------------v------------------------v-v-v--+    other
 |      <--+                     Power Bus                      +--> chips
 |  esb |  +---------+-----------------------+------------------+
 |  eas |            |                       |
 |  end |         +--|------+                |
 |  nvt |       +----+----+ |           +----+----+
 +------+       |SC       | |           |SC       |
                |         | |           |         |
                | PQ-bits | |           | PQ-bits |
                | local   |-+           |  in VC  |
                +---------+             +---------+
                   PCIe                 NX,NPU,CAPI

                  SC: Source Controller (aka. IVSE)
                  VC: Virtualization Controller (aka. IVRE)
                  PC: Presentation Controller (aka. IVPE)
                  CQ: Common Queue (Bridge)

             PQ-bits: 2 bits source state machine (P:pending Q:queued)
                 esb: Event State Buffer (Array of PQ bits in an IVSE)
                 eas: Event Assignment Structure
                 end: Event Notification Descriptor
                 nvt: Notification Virtual Target
                tctx: Thread interrupt Context


The XIVE IC is composed of three sub-engines :

  - Interrupt Virtualization Source Engine (IVSE), or Source
    Controller (SC). These are found in PCI PHBs, in the PSI host
    bridge controller, but also inside the main controller for the
    core IPIs and other sub-chips (NX, CAP, NPU) of the
    chip/processor. They are configured to feed the IVRE with events.

  - Interrupt Virtualization Routing Engine (IVRE) or Virtualization
    Controller (VC). Its job is to match an event source with an Event
    Notification Descriptor (END).

  - Interrupt Virtualization Presentation Engine (IVPE) or Presentation
    Controller (PC). It maintains the interrupt context state of each
    thread and handles the delivery of the external exception to the
    thread.


* XIVE internal tables

Each of the sub-engines uses a set of tables to redirect exceptions
from event sources to CPU threads.

                                          +-------+
  User or OS                              |  EQ   |
      or                          +------>|entries|
  Hypervisor                      |       |  ..   |
    Memory                        |       +-------+
                                  |           ^
                                  |           |
             +-------------------------------------------------+
                                  |           |
  Hypervisor      +------+    +---+--+    +---+--+   +------+
    Memory        | ESB  |    | EAT  |    | ENDT |   | NVTT |
   (skiboot)      +----+-+    +----+-+    +----+-+   +------+
                    ^  |        ^  |        ^  |       ^
                    |  |        |  |        |  |       |
             +-------------------------------------------------+
                    |  |        |  |        |  |       |
                    |  |        |  |        |  |       |
               +----|--|--------|--|--------|--|-+   +-|-----+    +------+
               |    |  |        |  |        |  | |   | | tctx|    |Thread|
   IPI or   ---+    +  v        +  v        +  v |---| +  .. |----->     |
  HW events    |                                 |   |       |    |      |
               |             IVRE                |   | IVPE  |    +------+
               +---------------------------------+   +-------+
            


The IVSE have a 2-bits state machine, P for pending and Q for queued,
for each source that allows events to be triggered. They are stored in
an Event State Buffer (ESB) array and can be controlled by MMIOs.

If the event is let through, the IVRE looks up in the Event Assignment
Structure (EAS) table for an Event Notification Descriptor (END)
configured for the source. Each Event Notification Descriptor defines
a notification path to a CPU and an in-memory Event Queue, in which
will be enqueued an EQ data for the OS to pull.

The IVPE determines if a Notification Virtual Target (NVT) can handle
the event by scanning the thread contexts of the VCPUs dispatched on
the processor HW threads. It maintains the interrupt context state of
each thread in a NVT table.


* Overview of the QEMU models for the XIVE sub-engines

The XiveSource models the IVSE in general, internal and external. It
handles the source ESBs and the MMIO interface to control them.

The XiveNotifier is a small helper interface interconnecting the
XiveSource to the XiveRouter.

The XiveRouter is an abstract model acting as a combined IVRE and
IVPE. It routes event notifications using the IVE and EQD tables to
the IVPE sub-engine which does a CAM scan to find a CPU to deliver the
exception. Storage should be provided by the inheriting classes.

XiveENDSource is a special source object. It exposes the EQ ESB MMIOs of
the Event Queues which are used for coalescing event notifications and
for escalation. Not used on the field, only to sync the EQ cache in
OPAL.

Finally, the XiveTCTX contains the interrupt state context of a thread,
four sets of registers, one for each exception that can be delivered
to a CPU. These contexts are scanned by the IVPE to find a matching VP
when a notification is triggered. It also models the Thread Interrupt
Management Area (TIMA), which exposes the thread context registers to
the CPU for interrupt management.


* XIVE for sPAPR

sPAPRXive models the XIVE interrupt controller of a sPAPR machine. It
inherits from the XiveRouter and provisions storage for the IVE and
END tables. The NVT table does not need a backend in sPAPR. It owns a
XiveSource object for the IPIs and the virtual device interrupts, a
memory region for the TIMA and a XiveENDSource to manage the END ESBs.
(not used by Linux).

These choices were made to have a sPAPR interrupt controller
consistent with the one found on baremetal and to facilitate KVM
support, the main difficulty being the host memory regions exposed to
the guest.

The NVT and tbe END indexing needs some care and a set of helpers are
defined to ease the conversion between the CPU id as seen by the guest
and the XIVE identifiers manipulated by the models. 


* Integration in the sPAPR machine, xive only and dual

A new sPAPR IRQ backend is defined for XIVE. It introduces a couple of
new operations to handle the differences in the creation of the device
tree and in the allocation of the CPU interrupt controller. A new
'xive' only pseries machine is defined using this XIVE backend.

Being able to support both interrupt mode in the same machine requires
some more changes. As the machine chooses the interrupt mode at CAS
time, it is activated after a reconfiguration done in a reset. This is
handled by a new 'dual' sPAPR IRQ backend which is built on top of the
XICS and XIVE backend. A new 'dual' pseries machine is defined using
this backend.


* KVM support

Support for KVM introduces a set of specific XIVE models, very much
like XICS does, which self-connect to their KVM counterparts in the
Linux kernel. Two host memory regions are exposed to the guest and
need special care at initialization :

  - ESB mmios
  - Thread Interrupt Management Area (TIMA)

The models uses KVM accessors to synchronize the QEMU state with
KVM. The states are :

  - the source configuration (EAT)
  - the END configuration (ENDT)
  - the OS EQ state (toggle bit and index)
  - the thread interrupt context registers.

Hybrid guest using KVM and an emulated irqchip (kernel_irqchip=off) is
supported. Migration under KVM is supported.

KVM support for the 'dual' machine required some more changes. Both
interrupt mode need to be initialized at the QEMU level to keep the
IRQ number space in sync and to allow switching from one mode to
another. At the KVM level, the whole initialization of the KVM device,
sources and presenters, needs to be done in the reset handler when the
interrupt mode is chosen. This is a major change in the KVM models.

KVM being initialized at reset, we loose the possiblity to fallback to
the QEMU emulated mode in case of failure and failures become fatal to
the machine.


* PowerNV models

The PnvXIVE model uses the XiveRouter abstract model just like
sPAPRXive. It provides accessors to the EAS, END and NVT tables which
are stored in the QEMU PowerNV machine and not in QEMU anymore. It
owns a set of memory regions for the IC registers, the ESBs, the END
ESBs, the TIMA, the notification MMIO.

Multichip is supported and the available IVSEs are the internal one
for the IPIS, the PSI host bridge controller and PHB4.

The next interesting step would be to add escalation events and model
the VCPU dispatching to support emulated KVM guests.


* GitHub trees
 
QEMU sPAPR:

  https://github.com/legoater/qemu/commits/xive-v8-4.0
  
QEMU PowerNV:

  https://github.com/legoater/qemu/commits/powernv-3.1

Linux/KVM:

  https://github.com/legoater/linux/commits/xive-4.20

OPAL:

  https://github.com/legoater/skiboot/commits/xive

Cédric Le Goater (12):
  spapr: introduce a new machine IRQ backend for XIVE
  spapr: add hcalls support for the XIVE exploitation interrupt mode
  spapr: add device tree support for the XIVE exploitation mode
  spapr: allocate the interrupt thread context under the CPU core
  spapr: extend the sPAPR IRQ backend for XICS migration
  spapr: add a 'reset' method to the sPAPR IRQ backend
  spapr: add an extra OV5 field to the sPAPR IRQ backend
  spapr: introduce an 'ic-mode' machine option
  spapr: set the interrupt presenter at reset
  spapr: enable XIVE MMIOs at reset
  spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS
  spapr: change default CPU type to POWER9

 include/hw/ppc/spapr.h          |   24 +-
 include/hw/ppc/spapr_cpu_core.h |    2 +
 include/hw/ppc/spapr_irq.h      |   12 +
 include/hw/ppc/spapr_xive.h     |    8 +
 include/hw/ppc/xics.h           |    4 +-
 include/hw/ppc/xive.h           |    1 +
 hw/intc/spapr_xive.c            | 1053 +++++++++++++++++++++++++++++++
 hw/intc/xics_spapr.c            |    3 +-
 hw/intc/xive.c                  |   22 +
 hw/ppc/spapr.c                  |  105 ++-
 hw/ppc/spapr_cpu_core.c         |   30 +-
 hw/ppc/spapr_hcall.c            |   11 +
 hw/ppc/spapr_irq.c              |  349 +++++++++-
 13 files changed, 1584 insertions(+), 40 deletions(-)

-- 
2.17.2

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 01/12] spapr: introduce a new machine IRQ backend for XIVE
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 02/12] spapr: add hcalls support for the XIVE exploitation interrupt mode Cédric Le Goater
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

The XIVE IRQ backend uses the same layout as the new XICS backend but
covers the full range of the IRQ number space. The IRQ numbers for the
CPU IPIs are allocated at the bottom of this space, below 4K, to
preserve compatibility with XICS which does not use that range.

This should be enough given that the maximum number of CPUs is 1024
for the sPAPR machine under QEMU. For the record, the biggest POWER8
or POWER9 system has a maximum of 1536 HW threads (16 sockets, 192
cores, SMT8).

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---

 Changes since v7:

 - removed spapr_xive_create()
 - reworked spapr_irq_init_xive() to use qdev_create()
 
 include/hw/ppc/spapr.h     |  2 +
 include/hw/ppc/spapr_irq.h |  2 +
 hw/ppc/spapr_irq.c         | 93 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 97 insertions(+)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 198764066dc9..cb3082d319af 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -16,6 +16,7 @@ typedef struct sPAPREventLogEntry sPAPREventLogEntry;
 typedef struct sPAPREventSource sPAPREventSource;
 typedef struct sPAPRPendingHPT sPAPRPendingHPT;
 typedef struct ICSState ICSState;
+typedef struct sPAPRXive sPAPRXive;
 
 #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
 #define SPAPR_ENTRY_POINT       0x100
@@ -175,6 +176,7 @@ struct sPAPRMachineState {
     const char *icp_type;
     int32_t irq_map_nr;
     unsigned long *irq_map;
+    sPAPRXive  *xive;
 
     bool cmd_line_caps[SPAPR_CAP_NUM];
     sPAPRCapabilities def, eff, mig;
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index bd7301e6d9c6..23cdb51b879e 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -13,6 +13,7 @@
 /*
  * IRQ range offsets per device type
  */
+#define SPAPR_IRQ_IPI        0x0
 #define SPAPR_IRQ_EPOW       0x1000  /* XICS_IRQ_BASE offset */
 #define SPAPR_IRQ_HOTPLUG    0x1001
 #define SPAPR_IRQ_VIO        0x1100  /* 256 VIO devices */
@@ -42,6 +43,7 @@ typedef struct sPAPRIrq {
 
 extern sPAPRIrq spapr_irq_xics;
 extern sPAPRIrq spapr_irq_xics_legacy;
+extern sPAPRIrq spapr_irq_xive;
 
 void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index f8b651de0ec9..1f5aac55d370 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -12,6 +12,7 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xics.h"
 #include "sysemu/kvm.h"
 
@@ -205,6 +206,98 @@ sPAPRIrq spapr_irq_xics = {
     .print_info  = spapr_irq_print_info_xics,
 };
 
+/*
+ * XIVE IRQ backend.
+ */
+static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
+{
+    MachineState *machine = MACHINE(spapr);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+    uint32_t nr_servers = spapr_max_server_number(spapr);
+    DeviceState *dev;
+    int i;
+
+    /* KVM XIVE device not yet available */
+    if (kvm_enabled()) {
+        if (machine_kernel_irqchip_required(machine)) {
+            error_setg(errp, "kernel_irqchip requested. no KVM XIVE support");
+            return;
+        }
+    }
+
+    dev = qdev_create(NULL, TYPE_SPAPR_XIVE);
+    qdev_prop_set_uint32(dev, "nr-irqs", smc->irq->nr_irqs);
+    /*
+     * 8 XIVE END structures per CPU. One for each available priority
+     */
+    qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
+    qdev_init_nofail(dev);
+
+    spapr->xive = SPAPR_XIVE(dev);
+
+    /* Enable the CPU IPIs */
+    for (i = 0; i < nr_servers; ++i) {
+        spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false);
+    }
+}
+
+static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq, bool lsi,
+                                Error **errp)
+{
+    if (!spapr_xive_irq_claim(spapr->xive, irq, lsi)) {
+        error_setg(errp, "IRQ %d is invalid", irq);
+        return -1;
+    }
+    return 0;
+}
+
+static void spapr_irq_free_xive(sPAPRMachineState *spapr, int irq, int num)
+{
+    int i;
+
+    for (i = irq; i < irq + num; ++i) {
+        spapr_xive_irq_free(spapr->xive, i);
+    }
+}
+
+static qemu_irq spapr_qirq_xive(sPAPRMachineState *spapr, int irq)
+{
+    return spapr_xive_qirq(spapr->xive, irq);
+}
+
+static void spapr_irq_print_info_xive(sPAPRMachineState *spapr,
+                                      Monitor *mon)
+{
+    CPUState *cs;
+
+    CPU_FOREACH(cs) {
+        PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+        xive_tctx_pic_print_info(XIVE_TCTX(cpu->intc), mon);
+    }
+
+    spapr_xive_pic_print_info(spapr->xive, mon);
+}
+
+/*
+ * XIVE uses the full IRQ number space. Set it to 8K to be compatible
+ * with XICS.
+ */
+
+#define SPAPR_IRQ_XIVE_NR_IRQS     0x2000
+#define SPAPR_IRQ_XIVE_NR_MSIS     (SPAPR_IRQ_XIVE_NR_IRQS - SPAPR_IRQ_MSI)
+
+sPAPRIrq spapr_irq_xive = {
+    .nr_irqs     = SPAPR_IRQ_XIVE_NR_IRQS,
+    .nr_msis     = SPAPR_IRQ_XIVE_NR_MSIS,
+
+    .init        = spapr_irq_init_xive,
+    .claim       = spapr_irq_claim_xive,
+    .free        = spapr_irq_free_xive,
+    .qirq        = spapr_qirq_xive,
+    .print_info  = spapr_irq_print_info_xive,
+};
+
 /*
  * sPAPR IRQ frontend routines for devices
  */
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 02/12] spapr: add hcalls support for the XIVE exploitation interrupt mode
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 01/12] spapr: introduce a new machine IRQ backend for XIVE Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 03/12] spapr: add device tree support for the XIVE exploitation mode Cédric Le Goater
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

The different XIVE virtualization structures (sources and event queues)
are configured with a set of Hypervisor calls :

 - H_INT_GET_SOURCE_INFO

   used to obtain the address of the MMIO page of the Event State
   Buffer (ESB) entry associated with the source.

 - H_INT_SET_SOURCE_CONFIG

   assigns a source to a "target".

 - H_INT_GET_SOURCE_CONFIG

   determines which "target" and "priority" is assigned to a source

 - H_INT_GET_QUEUE_INFO

   returns the address of the notification management page associated
   with the specified "target" and "priority".

 - H_INT_SET_QUEUE_CONFIG

   sets or resets the event queue for a given "target" and "priority".
   It is also used to set the notification configuration associated
   with the queue, only unconditional notification is supported for
   the moment. Reset is performed with a queue size of 0 and queueing
   is disabled in that case.

 - H_INT_GET_QUEUE_CONFIG

   returns the queue settings for a given "target" and "priority".

 - H_INT_RESET

   resets all of the guest's internal interrupt structures to their
   initial state, losing all configuration set via the hcalls
   H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.

 - H_INT_SYNC

   issue a synchronisation on a source to make sure all notifications
   have reached their queue.

Calls that still need to be addressed :

   H_INT_SET_OS_REPORTING_LINE
   H_INT_GET_OS_REPORTING_LINE

See the code for more documentation on each hcall.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 include/hw/ppc/spapr.h      |  15 +-
 include/hw/ppc/spapr_xive.h |   4 +
 hw/intc/spapr_xive.c        | 963 ++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_irq.c          |   2 +
 4 files changed, 983 insertions(+), 1 deletion(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index cb3082d319af..6bf028a02fe2 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -452,7 +452,20 @@ struct sPAPRMachineState {
 #define H_INVALIDATE_PID        0x378
 #define H_REGISTER_PROC_TBL     0x37C
 #define H_SIGNAL_SYS_RESET      0x380
-#define MAX_HCALL_OPCODE        H_SIGNAL_SYS_RESET
+
+#define H_INT_GET_SOURCE_INFO   0x3A8
+#define H_INT_SET_SOURCE_CONFIG 0x3AC
+#define H_INT_GET_SOURCE_CONFIG 0x3B0
+#define H_INT_GET_QUEUE_INFO    0x3B4
+#define H_INT_SET_QUEUE_CONFIG  0x3B8
+#define H_INT_GET_QUEUE_CONFIG  0x3BC
+#define H_INT_SET_OS_REPORTING_LINE 0x3C0
+#define H_INT_GET_OS_REPORTING_LINE 0x3C4
+#define H_INT_ESB               0x3C8
+#define H_INT_SYNC              0x3CC
+#define H_INT_RESET             0x3D0
+
+#define MAX_HCALL_OPCODE        H_INT_RESET
 
 /* The hcalls above are standardized in PAPR and implemented by pHyp
  * as well.
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index f087959b9924..9506a8f4d10a 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -42,4 +42,8 @@ bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
 void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
 qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
 
+typedef struct sPAPRMachineState sPAPRMachineState;
+
+void spapr_xive_hcall_init(sPAPRMachineState *spapr);
+
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 3ade419fdbb1..982ac6e17051 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -38,6 +38,13 @@
 
 #define SPAPR_XIVE_NVT_BASE 0x400
 
+/*
+ * The sPAPR machine has a unique XIVE IC device. Assign a fixed value
+ * to the controller block id value. It can nevertheless be changed
+ * for testing purpose.
+ */
+#define SPAPR_XIVE_BLOCK_ID 0x0
+
 /*
  * sPAPR NVT and END indexing helpers
  */
@@ -46,6 +53,64 @@ static uint32_t spapr_xive_nvt_to_target(uint8_t nvt_blk, uint32_t nvt_idx)
     return nvt_idx - SPAPR_XIVE_NVT_BASE;
 }
 
+static void spapr_xive_cpu_to_nvt(PowerPCCPU *cpu,
+                                  uint8_t *out_nvt_blk, uint32_t *out_nvt_idx)
+{
+    assert(cpu);
+
+    if (out_nvt_blk) {
+        *out_nvt_blk = SPAPR_XIVE_BLOCK_ID;
+    }
+
+    if (out_nvt_blk) {
+        *out_nvt_idx = SPAPR_XIVE_NVT_BASE + cpu->vcpu_id;
+    }
+}
+
+static int spapr_xive_target_to_nvt(uint32_t target,
+                                    uint8_t *out_nvt_blk, uint32_t *out_nvt_idx)
+{
+    PowerPCCPU *cpu = spapr_find_cpu(target);
+
+    if (!cpu) {
+        return -1;
+    }
+
+    spapr_xive_cpu_to_nvt(cpu, out_nvt_blk, out_nvt_idx);
+    return 0;
+}
+
+/*
+ * sPAPR END indexing uses a simple mapping of the CPU vcpu_id, 8
+ * priorities per CPU
+ */
+static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
+                                  uint8_t *out_end_blk, uint32_t *out_end_idx)
+{
+    assert(cpu);
+
+    if (out_end_blk) {
+        *out_end_blk = SPAPR_XIVE_BLOCK_ID;
+    }
+
+    if (out_end_idx) {
+        *out_end_idx = (cpu->vcpu_id << 3) + prio;
+    }
+}
+
+static int spapr_xive_target_to_end(uint32_t target, uint8_t prio,
+                                    uint8_t *out_end_blk, uint32_t *out_end_idx)
+{
+    PowerPCCPU *cpu = spapr_find_cpu(target);
+
+    if (!cpu) {
+        return -1;
+    }
+
+    spapr_xive_cpu_to_end(cpu, prio, out_end_blk, out_end_idx);
+    return 0;
+}
+
 /*
  * On sPAPR machines, use a simplified output for the XIVE END
  * structure dumping only the information related to the OS EQ.
@@ -418,3 +483,901 @@ qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn)
 
     return xive_source_qirq(xsrc, lisn);
 }
+
+/*
+ * XIVE hcalls
+ *
+ * The terminology used by the XIVE hcalls is the following :
+ *
+ *   TARGET vCPU number
+ *   EQ     Event Queue assigned by OS to receive event data
+ *   ESB    page for source interrupt management
+ *   LISN   Logical Interrupt Source Number identifying a source in the
+ *          machine
+ *   EISN   Effective Interrupt Source Number used by guest OS to
+ *          identify source in the guest
+ *
+ * The EAS, END, NVT structures are not exposed.
+ */
+
+/*
+ * Linux hosts under OPAL reserve priority 7 for their own escalation
+ * interrupts (DD2.X POWER9). So we only allow the guest to use
+ * priorities [0..6].
+ */
+static bool spapr_xive_priority_is_reserved(uint8_t priority)
+{
+    switch (priority) {
+    case 0 ... 6:
+        return false;
+    case 7: /* OPAL escalation queue */
+    default:
+        return true;
+    }
+}
+
+/*
+ * The H_INT_GET_SOURCE_INFO hcall() is used to obtain the logical
+ * real address of the MMIO page through which the Event State Buffer
+ * entry associated with the value of the "lisn" parameter is managed.
+ *
+ * Parameters:
+ * Input
+ * - R4: "flags"
+ *         Bits 0-63 reserved
+ * - R5: "lisn" is per "interrupts", "interrupt-map", or
+ *       "ibm,xive-lisn-ranges" properties, or as returned by the
+ *       ibm,query-interrupt-source-number RTAS call, or as returned
+ *       by the H_ALLOCATE_VAS_WINDOW hcall
+ *
+ * Output
+ * - R4: "flags"
+ *         Bits 0-59: Reserved
+ *         Bit 60: H_INT_ESB must be used for Event State Buffer
+ *                 management
+ *         Bit 61: 1 == LSI  0 == MSI
+ *         Bit 62: the full function page supports trigger
+ *         Bit 63: Store EOI Supported
+ * - R5: Logical Real address of full function Event State Buffer
+ *       management page, -1 if H_INT_ESB hcall flag is set to 1.
+ * - R6: Logical Real Address of trigger only Event State Buffer
+ *       management page or -1.
+ * - R7: Power of 2 page size for the ESB management pages returned in
+ *       R5 and R6.
+ */
+
+#define SPAPR_XIVE_SRC_H_INT_ESB     PPC_BIT(60) /* ESB manage with H_INT_ESB */
+#define SPAPR_XIVE_SRC_LSI           PPC_BIT(61) /* Virtual LSI type */
+#define SPAPR_XIVE_SRC_TRIGGER       PPC_BIT(62) /* Trigger and management
+                                                    on same page */
+#define SPAPR_XIVE_SRC_STORE_EOI     PPC_BIT(63) /* Store EOI support */
+
+static target_ulong h_int_get_source_info(PowerPCCPU *cpu,
+                                          sPAPRMachineState *spapr,
+                                          target_ulong opcode,
+                                          target_ulong *args)
+{
+    sPAPRXive *xive = spapr->xive;
+    XiveSource *xsrc = &xive->source;
+    target_ulong flags  = args[0];
+    target_ulong lisn   = args[1];
+
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    if (flags) {
+        return H_PARAMETER;
+    }
+
+    if (lisn >= xive->nr_irqs) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Unknown LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    if (!xive_eas_is_valid(&xive->eat[lisn])) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Invalid LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    /* All sources are emulated under the main XIVE object and share
+     * the same characteristics.
+     */
+    args[0] = 0;
+    if (!xive_source_esb_has_2page(xsrc)) {
+        args[0] |= SPAPR_XIVE_SRC_TRIGGER;
+    }
+    if (xsrc->esb_flags & XIVE_SRC_STORE_EOI) {
+        args[0] |= SPAPR_XIVE_SRC_STORE_EOI;
+    }
+
+    /*
+     * Force the use of the H_INT_ESB hcall in case of an LSI
+     * interrupt. This is necessary under KVM to re-trigger the
+     * interrupt if the level is still asserted
+     */
+    if (xive_source_irq_is_lsi(xsrc, lisn)) {
+        args[0] |= SPAPR_XIVE_SRC_H_INT_ESB | SPAPR_XIVE_SRC_LSI;
+    }
+
+    if (!(args[0] & SPAPR_XIVE_SRC_H_INT_ESB)) {
+        args[1] = xive->vc_base + xive_source_esb_mgmt(xsrc, lisn);
+    } else {
+        args[1] = -1;
+    }
+
+    if (xive_source_esb_has_2page(xsrc) &&
+        !(args[0] & SPAPR_XIVE_SRC_H_INT_ESB)) {
+        args[2] = xive->vc_base + xive_source_esb_page(xsrc, lisn);
+    } else {
+        args[2] = -1;
+    }
+
+    if (xive_source_esb_has_2page(xsrc)) {
+        args[3] = xsrc->esb_shift - 1;
+    } else {
+        args[3] = xsrc->esb_shift;
+    }
+
+    return H_SUCCESS;
+}
+
+/*
+ * The H_INT_SET_SOURCE_CONFIG hcall() is used to assign a Logical
+ * Interrupt Source to a target. The Logical Interrupt Source is
+ * designated with the "lisn" parameter and the target is designated
+ * with the "target" and "priority" parameters.  Upon return from the
+ * hcall(), no additional interrupts will be directed to the old EQ.
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-61: Reserved
+ *         Bit 62: set the "eisn" in the EAS
+ *         Bit 63: masks the interrupt source in the hardware interrupt
+ *       control structure. An interrupt masked by this mechanism will
+ *       be dropped, but it's source state bits will still be
+ *       set. There is no race-free way of unmasking and restoring the
+ *       source. Thus this should only be used in interrupts that are
+ *       also masked at the source, and only in cases where the
+ *       interrupt is not meant to be used for a large amount of time
+ *       because no valid target exists for it for example
+ * - R5: "lisn" is per "interrupts", "interrupt-map", or
+ *       "ibm,xive-lisn-ranges" properties, or as returned by the
+ *       ibm,query-interrupt-source-number RTAS call, or as returned by
+ *       the H_ALLOCATE_VAS_WINDOW hcall
+ * - R6: "target" is per "ibm,ppc-interrupt-server#s" or
+ *       "ibm,ppc-interrupt-gserver#s"
+ * - R7: "priority" is a valid priority not in
+ *       "ibm,plat-res-int-priorities"
+ * - R8: "eisn" is the guest EISN associated with the "lisn"
+ *
+ * Output:
+ * - None
+ */
+
+#define SPAPR_XIVE_SRC_SET_EISN PPC_BIT(62)
+#define SPAPR_XIVE_SRC_MASK     PPC_BIT(63)
+
+static target_ulong h_int_set_source_config(PowerPCCPU *cpu,
+                                            sPAPRMachineState *spapr,
+                                            target_ulong opcode,
+                                            target_ulong *args)
+{
+    sPAPRXive *xive = spapr->xive;
+    XiveEAS eas, new_eas;
+    target_ulong flags    = args[0];
+    target_ulong lisn     = args[1];
+    target_ulong target   = args[2];
+    target_ulong priority = args[3];
+    target_ulong eisn     = args[4];
+    uint8_t end_blk;
+    uint32_t end_idx;
+
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    if (flags & ~(SPAPR_XIVE_SRC_SET_EISN | SPAPR_XIVE_SRC_MASK)) {
+        return H_PARAMETER;
+    }
+
+    if (lisn >= xive->nr_irqs) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Unknown LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    eas = xive->eat[lisn];
+    if (!xive_eas_is_valid(&eas)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Invalid LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    /* priority 0xff is used to reset the EAS */
+    if (priority == 0xff) {
+        new_eas.w = cpu_to_be64(EAS_VALID | EAS_MASKED);
+        goto out;
+    }
+
+    if (flags & SPAPR_XIVE_SRC_MASK) {
+        new_eas.w = eas.w | cpu_to_be64(EAS_MASKED);
+    } else {
+        new_eas.w = eas.w & cpu_to_be64(~EAS_MASKED);
+    }
+
+    if (spapr_xive_priority_is_reserved(priority)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: priority %ld is reserved\n",
+                      priority);
+        return H_P4;
+    }
+
+    /* Validate that "target" is part of the list of threads allocated
+     * to the partition. For that, find the END corresponding to the
+     * target.
+     */
+    if (spapr_xive_target_to_end(target, priority, &end_blk, &end_idx)) {
+        return H_P3;
+    }
+
+    new_eas.w = SETFIELD_BE64(EAS_END_BLOCK, new_eas.w, end_blk);
+    new_eas.w = SETFIELD_BE64(EAS_END_INDEX, new_eas.w, end_idx);
+
+    if (flags & SPAPR_XIVE_SRC_SET_EISN) {
+        new_eas.w = SETFIELD_BE64(EAS_END_DATA, new_eas.w, eisn);
+    }
+
+out:
+    xive->eat[lisn] = new_eas;
+    return H_SUCCESS;
+}
+
+/*
+ * The H_INT_GET_SOURCE_CONFIG hcall() is used to determine to which
+ * target/priority pair is assigned to the specified Logical Interrupt
+ * Source.
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-63 Reserved
+ * - R5: "lisn" is per "interrupts", "interrupt-map", or
+ *       "ibm,xive-lisn-ranges" properties, or as returned by the
+ *       ibm,query-interrupt-source-number RTAS call, or as
+ *       returned by the H_ALLOCATE_VAS_WINDOW hcall
+ *
+ * Output:
+ * - R4: Target to which the specified Logical Interrupt Source is
+ *       assigned
+ * - R5: Priority to which the specified Logical Interrupt Source is
+ *       assigned
+ * - R6: EISN for the specified Logical Interrupt Source (this will be
+ *       equivalent to the LISN if not changed by H_INT_SET_SOURCE_CONFIG)
+ */
+static target_ulong h_int_get_source_config(PowerPCCPU *cpu,
+                                            sPAPRMachineState *spapr,
+                                            target_ulong opcode,
+                                            target_ulong *args)
+{
+    sPAPRXive *xive = spapr->xive;
+    target_ulong flags = args[0];
+    target_ulong lisn = args[1];
+    XiveEAS eas;
+    XiveEND *end;
+    uint8_t nvt_blk;
+    uint32_t end_idx, nvt_idx;
+
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    if (flags) {
+        return H_PARAMETER;
+    }
+
+    if (lisn >= xive->nr_irqs) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Unknown LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    eas = xive->eat[lisn];
+    if (!xive_eas_is_valid(&eas)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Invalid LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    /* EAS_END_BLOCK is unused on sPAPR */
+    end_idx = GETFIELD_BE64(EAS_END_INDEX, eas.w);
+
+    assert(end_idx < xive->nr_ends);
+    end = &xive->endt[end_idx];
+
+    nvt_blk = GETFIELD_BE32(END_W6_NVT_BLOCK, end->w6);
+    nvt_idx = GETFIELD_BE32(END_W6_NVT_INDEX, end->w6);
+    args[0] = spapr_xive_nvt_to_target(nvt_blk, nvt_idx);
+
+    if (xive_eas_is_masked(&eas)) {
+        args[1] = 0xff;
+    } else {
+        args[1] = GETFIELD_BE32(END_W7_F0_PRIORITY, end->w7);
+    }
+
+    args[2] = GETFIELD_BE64(EAS_END_DATA, eas.w);
+
+    return H_SUCCESS;
+}
+
+/*
+ * The H_INT_GET_QUEUE_INFO hcall() is used to get the logical real
+ * address of the notification management page associated with the
+ * specified target and priority.
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-63 Reserved
+ * - R5: "target" is per "ibm,ppc-interrupt-server#s" or
+ *       "ibm,ppc-interrupt-gserver#s"
+ * - R6: "priority" is a valid priority not in
+ *       "ibm,plat-res-int-priorities"
+ *
+ * Output:
+ * - R4: Logical real address of notification page
+ * - R5: Power of 2 page size of the notification page
+ */
+static target_ulong h_int_get_queue_info(PowerPCCPU *cpu,
+                                         sPAPRMachineState *spapr,
+                                         target_ulong opcode,
+                                         target_ulong *args)
+{
+    sPAPRXive *xive = spapr->xive;
+    XiveENDSource *end_xsrc = &xive->end_source;
+    target_ulong flags = args[0];
+    target_ulong target = args[1];
+    target_ulong priority = args[2];
+    XiveEND *end;
+    uint8_t end_blk;
+    uint32_t end_idx;
+
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    if (flags) {
+        return H_PARAMETER;
+    }
+
+    /*
+     * H_STATE should be returned if a H_INT_RESET is in progress.
+     * This is not needed when running the emulation under QEMU
+     */
+
+    if (spapr_xive_priority_is_reserved(priority)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: priority %ld is reserved\n",
+                      priority);
+        return H_P3;
+    }
+
+    /* Validate that "target" is part of the list of threads allocated
+     * to the partition. For that, find the END corresponding to the
+     * target.
+     */
+    if (spapr_xive_target_to_end(target, priority, &end_blk, &end_idx)) {
+        return H_P2;
+    }
+
+    assert(end_idx < xive->nr_ends);
+    end = &xive->endt[end_idx];
+
+    args[0] = xive->end_base + (1ull << (end_xsrc->esb_shift + 1)) * end_idx;
+    if (xive_end_is_enqueue(end)) {
+        args[1] = GETFIELD_BE32(END_W0_QSIZE, end->w0) + 12;
+    } else {
+        args[1] = 0;
+    }
+
+    return H_SUCCESS;
+}
+
+/*
+ * The H_INT_SET_QUEUE_CONFIG hcall() is used to set or reset a EQ for
+ * a given "target" and "priority".  It is also used to set the
+ * notification config associated with the EQ.  An EQ size of 0 is
+ * used to reset the EQ config for a given target and priority. If
+ * resetting the EQ config, the END associated with the given "target"
+ * and "priority" will be changed to disable queueing.
+ *
+ * Upon return from the hcall(), no additional interrupts will be
+ * directed to the old EQ (if one was set). The old EQ (if one was
+ * set) should be investigated for interrupts that occurred prior to
+ * or during the hcall().
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-62: Reserved
+ *         Bit 63: Unconditional Notify (n) per the XIVE spec
+ * - R5: "target" is per "ibm,ppc-interrupt-server#s" or
+ *       "ibm,ppc-interrupt-gserver#s"
+ * - R6: "priority" is a valid priority not in
+ *       "ibm,plat-res-int-priorities"
+ * - R7: "eventQueue": The logical real address of the start of the EQ
+ * - R8: "eventQueueSize": The power of 2 EQ size per "ibm,xive-eq-sizes"
+ *
+ * Output:
+ * - None
+ */
+
+#define SPAPR_XIVE_END_ALWAYS_NOTIFY PPC_BIT(63)
+
+static target_ulong h_int_set_queue_config(PowerPCCPU *cpu,
+                                           sPAPRMachineState *spapr,
+                                           target_ulong opcode,
+                                           target_ulong *args)
+{
+    sPAPRXive *xive = spapr->xive;
+    target_ulong flags = args[0];
+    target_ulong target = args[1];
+    target_ulong priority = args[2];
+    target_ulong qpage = args[3];
+    target_ulong qsize = args[4];
+    XiveEND end;
+    uint8_t end_blk, nvt_blk;
+    uint32_t end_idx, nvt_idx;
+
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    if (flags & ~SPAPR_XIVE_END_ALWAYS_NOTIFY) {
+        return H_PARAMETER;
+    }
+
+    /*
+     * H_STATE should be returned if a H_INT_RESET is in progress.
+     * This is not needed when running the emulation under QEMU
+     */
+
+    if (spapr_xive_priority_is_reserved(priority)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: priority %ld is reserved\n",
+                      priority);
+        return H_P3;
+    }
+
+    /* Validate that "target" is part of the list of threads allocated
+     * to the partition. For that, find the END corresponding to the
+     * target.
+     */
+
+    if (spapr_xive_target_to_end(target, priority, &end_blk, &end_idx)) {
+        return H_P2;
+    }
+
+    assert(end_idx < xive->nr_ends);
+    memcpy(&end, &xive->endt[end_idx], sizeof(XiveEND));
+
+    switch (qsize) {
+    case 12:
+    case 16:
+    case 21:
+    case 24:
+        end.w2 = cpu_to_be32((qpage >> 32) & 0x0fffffff);
+        end.w3 = cpu_to_be32(qpage & 0xffffffff);
+        end.w0 |= cpu_to_be32(END_W0_ENQUEUE);
+        end.w0 = SETFIELD_BE32(END_W0_QSIZE, end.w0, qsize - 12);
+        break;
+    case 0:
+        /* reset queue and disable queueing */
+        spapr_xive_end_reset(&end);
+        goto out;
+
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid EQ size %"PRIx64"\n",
+                      qsize);
+        return H_P5;
+    }
+
+    if (qsize) {
+        hwaddr plen = 1 << qsize;
+        void *eq;
+
+        /*
+         * Validate the guest EQ. We should also check that the queue
+         * has been zeroed by the OS.
+         */
+        eq = address_space_map(CPU(cpu)->as, qpage, &plen, true,
+                               MEMTXATTRS_UNSPECIFIED);
+        if (plen != 1 << qsize) {
+            qemu_log_mask(LOG_GUEST_ERROR, "XIVE: failed to map EQ @0x%"
+                          HWADDR_PRIx "\n", qpage);
+            return H_P4;
+        }
+        address_space_unmap(CPU(cpu)->as, eq, plen, true, plen);
+    }
+
+    /* "target" should have been validated above */
+    if (spapr_xive_target_to_nvt(target, &nvt_blk, &nvt_idx)) {
+        g_assert_not_reached();
+    }
+
+    /* Ensure the priority and target are correctly set (they will not
+     * be right after allocation)
+     */
+    end.w6 = SETFIELD_BE32(END_W6_NVT_BLOCK, 0ul, nvt_blk) |
+        SETFIELD_BE32(END_W6_NVT_INDEX, 0ul, nvt_idx);
+    end.w7 = SETFIELD_BE32(END_W7_F0_PRIORITY, 0ul, priority);
+
+    if (flags & SPAPR_XIVE_END_ALWAYS_NOTIFY) {
+        end.w0 |= cpu_to_be32(END_W0_UCOND_NOTIFY);
+    } else {
+        end.w0 &= cpu_to_be32((uint32_t)~END_W0_UCOND_NOTIFY);
+    }
+
+    /* The generation bit for the END starts at 1 and The END page
+     * offset counter starts at 0.
+     */
+    end.w1 = cpu_to_be32(END_W1_GENERATION) |
+        SETFIELD_BE32(END_W1_PAGE_OFF, 0ul, 0ul);
+    end.w0 |= cpu_to_be32(END_W0_VALID);
+
+    /* TODO: issue syncs required to ensure all in-flight interrupts
+     * are complete on the old END */
+
+out:
+    /* Update END */
+    memcpy(&xive->endt[end_idx], &end, sizeof(XiveEND));
+    return H_SUCCESS;
+}
+
+/*
+ * The H_INT_GET_QUEUE_CONFIG hcall() is used to get a EQ for a given
+ * target and priority.
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-62: Reserved
+ *         Bit 63: Debug: Return debug data
+ * - R5: "target" is per "ibm,ppc-interrupt-server#s" or
+ *       "ibm,ppc-interrupt-gserver#s"
+ * - R6: "priority" is a valid priority not in
+ *       "ibm,plat-res-int-priorities"
+ *
+ * Output:
+ * - R4: "flags":
+ *       Bits 0-61: Reserved
+ *       Bit 62: The value of Event Queue Generation Number (g) per
+ *              the XIVE spec if "Debug" = 1
+ *       Bit 63: The value of Unconditional Notify (n) per the XIVE spec
+ * - R5: The logical real address of the start of the EQ
+ * - R6: The power of 2 EQ size per "ibm,xive-eq-sizes"
+ * - R7: The value of Event Queue Offset Counter per XIVE spec
+ *       if "Debug" = 1, else 0
+ *
+ */
+
+#define SPAPR_XIVE_END_DEBUG     PPC_BIT(63)
+
+static target_ulong h_int_get_queue_config(PowerPCCPU *cpu,
+                                           sPAPRMachineState *spapr,
+                                           target_ulong opcode,
+                                           target_ulong *args)
+{
+    sPAPRXive *xive = spapr->xive;
+    target_ulong flags = args[0];
+    target_ulong target = args[1];
+    target_ulong priority = args[2];
+    XiveEND *end;
+    uint8_t end_blk;
+    uint32_t end_idx;
+
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    if (flags & ~SPAPR_XIVE_END_DEBUG) {
+        return H_PARAMETER;
+    }
+
+    /*
+     * H_STATE should be returned if a H_INT_RESET is in progress.
+     * This is not needed when running the emulation under QEMU
+     */
+
+    if (spapr_xive_priority_is_reserved(priority)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: priority %ld is reserved\n",
+                      priority);
+        return H_P3;
+    }
+
+    /* Validate that "target" is part of the list of threads allocated
+     * to the partition. For that, find the END corresponding to the
+     * target.
+     */
+    if (spapr_xive_target_to_end(target, priority, &end_blk, &end_idx)) {
+        return H_P2;
+    }
+
+    assert(end_idx < xive->nr_ends);
+    end = &xive->endt[end_idx];
+
+    args[0] = 0;
+    if (xive_end_is_notify(end)) {
+        args[0] |= SPAPR_XIVE_END_ALWAYS_NOTIFY;
+    }
+
+    if (xive_end_is_enqueue(end)) {
+        args[1] = (uint64_t) be32_to_cpu(end->w2 & 0x0fffffff) << 32
+            | be32_to_cpu(end->w3);
+        args[2] = GETFIELD_BE32(END_W0_QSIZE, end->w0) + 12;
+    } else {
+        args[1] = 0;
+        args[2] = 0;
+    }
+
+    /* TODO: do we need any locking on the END ? */
+    if (flags & SPAPR_XIVE_END_DEBUG) {
+        /* Load the event queue generation number into the return flags */
+        args[0] |= (uint64_t)GETFIELD_BE32(END_W1_GENERATION, end->w1) << 62;
+
+        /* Load R7 with the event queue offset counter */
+        args[3] = GETFIELD_BE32(END_W1_PAGE_OFF, end->w1);
+    } else {
+        args[3] = 0;
+    }
+
+    return H_SUCCESS;
+}
+
+/*
+ * The H_INT_SET_OS_REPORTING_LINE hcall() is used to set the
+ * reporting cache line pair for the calling thread.  The reporting
+ * cache lines will contain the OS interrupt context when the OS
+ * issues a CI store byte to @TIMA+0xC10 to acknowledge the OS
+ * interrupt. The reporting cache lines can be reset by inputting -1
+ * in "reportingLine".  Issuing the CI store byte without reporting
+ * cache lines registered will result in the data not being accessible
+ * to the OS.
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-63: Reserved
+ * - R5: "reportingLine": The logical real address of the reporting cache
+ *       line pair
+ *
+ * Output:
+ * - None
+ */
+static target_ulong h_int_set_os_reporting_line(PowerPCCPU *cpu,
+                                                sPAPRMachineState *spapr,
+                                                target_ulong opcode,
+                                                target_ulong *args)
+{
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    /*
+     * H_STATE should be returned if a H_INT_RESET is in progress.
+     * This is not needed when running the emulation under QEMU
+     */
+
+    /* TODO: H_INT_SET_OS_REPORTING_LINE */
+    return H_FUNCTION;
+}
+
+/*
+ * The H_INT_GET_OS_REPORTING_LINE hcall() is used to get the logical
+ * real address of the reporting cache line pair set for the input
+ * "target".  If no reporting cache line pair has been set, -1 is
+ * returned.
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-63: Reserved
+ * - R5: "target" is per "ibm,ppc-interrupt-server#s" or
+ *       "ibm,ppc-interrupt-gserver#s"
+ * - R6: "reportingLine": The logical real address of the reporting
+ *        cache line pair
+ *
+ * Output:
+ * - R4: The logical real address of the reporting line if set, else -1
+ */
+static target_ulong h_int_get_os_reporting_line(PowerPCCPU *cpu,
+                                                sPAPRMachineState *spapr,
+                                                target_ulong opcode,
+                                                target_ulong *args)
+{
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    /*
+     * H_STATE should be returned if a H_INT_RESET is in progress.
+     * This is not needed when running the emulation under QEMU
+     */
+
+    /* TODO: H_INT_GET_OS_REPORTING_LINE */
+    return H_FUNCTION;
+}
+
+/*
+ * The H_INT_ESB hcall() is used to issue a load or store to the ESB
+ * page for the input "lisn".  This hcall is only supported for LISNs
+ * that have the ESB hcall flag set to 1 when returned from hcall()
+ * H_INT_GET_SOURCE_INFO.
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-62: Reserved
+ *         bit 63: Store: Store=1, store operation, else load operation
+ * - R5: "lisn" is per "interrupts", "interrupt-map", or
+ *       "ibm,xive-lisn-ranges" properties, or as returned by the
+ *       ibm,query-interrupt-source-number RTAS call, or as
+ *       returned by the H_ALLOCATE_VAS_WINDOW hcall
+ * - R6: "esbOffset" is the offset into the ESB page for the load or
+ *       store operation
+ * - R7: "storeData" is the data to write for a store operation
+ *
+ * Output:
+ * - R4: The value of the load if load operation, else -1
+ */
+
+#define SPAPR_XIVE_ESB_STORE PPC_BIT(63)
+
+static target_ulong h_int_esb(PowerPCCPU *cpu,
+                              sPAPRMachineState *spapr,
+                              target_ulong opcode,
+                              target_ulong *args)
+{
+    sPAPRXive *xive = spapr->xive;
+    XiveEAS eas;
+    target_ulong flags  = args[0];
+    target_ulong lisn   = args[1];
+    target_ulong offset = args[2];
+    target_ulong data   = args[3];
+    hwaddr mmio_addr;
+    XiveSource *xsrc = &xive->source;
+
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    if (flags & ~SPAPR_XIVE_ESB_STORE) {
+        return H_PARAMETER;
+    }
+
+    if (lisn >= xive->nr_irqs) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Unknown LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    eas = xive->eat[lisn];
+    if (!xive_eas_is_valid(&eas)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Invalid LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    if (offset > (1ull << xsrc->esb_shift)) {
+        return H_P3;
+    }
+
+    mmio_addr = xive->vc_base + xive_source_esb_mgmt(xsrc, lisn) + offset;
+
+    if (dma_memory_rw(&address_space_memory, mmio_addr, &data, 8,
+                      (flags & SPAPR_XIVE_ESB_STORE))) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: failed to access ESB @0x%"
+                      HWADDR_PRIx "\n", mmio_addr);
+        return H_HARDWARE;
+    }
+    args[0] = (flags & SPAPR_XIVE_ESB_STORE) ? -1 : data;
+    return H_SUCCESS;
+}
+
+/*
+ * The H_INT_SYNC hcall() is used to issue hardware syncs that will
+ * ensure any in flight events for the input lisn are in the event
+ * queue.
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-63: Reserved
+ * - R5: "lisn" is per "interrupts", "interrupt-map", or
+ *       "ibm,xive-lisn-ranges" properties, or as returned by the
+ *       ibm,query-interrupt-source-number RTAS call, or as
+ *       returned by the H_ALLOCATE_VAS_WINDOW hcall
+ *
+ * Output:
+ * - None
+ */
+static target_ulong h_int_sync(PowerPCCPU *cpu,
+                               sPAPRMachineState *spapr,
+                               target_ulong opcode,
+                               target_ulong *args)
+{
+    sPAPRXive *xive = spapr->xive;
+    XiveEAS eas;
+    target_ulong flags = args[0];
+    target_ulong lisn = args[1];
+
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    if (flags) {
+        return H_PARAMETER;
+    }
+
+    if (lisn >= xive->nr_irqs) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Unknown LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    eas = xive->eat[lisn];
+    if (!xive_eas_is_valid(&eas)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: Invalid LISN %lx\n", lisn);
+        return H_P2;
+    }
+
+    /*
+     * H_STATE should be returned if a H_INT_RESET is in progress.
+     * This is not needed when running the emulation under QEMU
+     */
+
+    /* This is not real hardware. Nothing to be done */
+    return H_SUCCESS;
+}
+
+/*
+ * The H_INT_RESET hcall() is used to reset all of the partition's
+ * interrupt exploitation structures to their initial state.  This
+ * means losing all previously set interrupt state set via
+ * H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.
+ *
+ * Parameters:
+ * Input:
+ * - R4: "flags"
+ *         Bits 0-63: Reserved
+ *
+ * Output:
+ * - None
+ */
+static target_ulong h_int_reset(PowerPCCPU *cpu,
+                                sPAPRMachineState *spapr,
+                                target_ulong opcode,
+                                target_ulong *args)
+{
+    sPAPRXive *xive = spapr->xive;
+    target_ulong flags   = args[0];
+
+    if (!spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        return H_FUNCTION;
+    }
+
+    if (flags) {
+        return H_PARAMETER;
+    }
+
+    device_reset(DEVICE(xive));
+    return H_SUCCESS;
+}
+
+void spapr_xive_hcall_init(sPAPRMachineState *spapr)
+{
+    spapr_register_hypercall(H_INT_GET_SOURCE_INFO, h_int_get_source_info);
+    spapr_register_hypercall(H_INT_SET_SOURCE_CONFIG, h_int_set_source_config);
+    spapr_register_hypercall(H_INT_GET_SOURCE_CONFIG, h_int_get_source_config);
+    spapr_register_hypercall(H_INT_GET_QUEUE_INFO, h_int_get_queue_info);
+    spapr_register_hypercall(H_INT_SET_QUEUE_CONFIG, h_int_set_queue_config);
+    spapr_register_hypercall(H_INT_GET_QUEUE_CONFIG, h_int_get_queue_config);
+    spapr_register_hypercall(H_INT_SET_OS_REPORTING_LINE,
+                             h_int_set_os_reporting_line);
+    spapr_register_hypercall(H_INT_GET_OS_REPORTING_LINE,
+                             h_int_get_os_reporting_line);
+    spapr_register_hypercall(H_INT_ESB, h_int_esb);
+    spapr_register_hypercall(H_INT_SYNC, h_int_sync);
+    spapr_register_hypercall(H_INT_RESET, h_int_reset);
+}
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 1f5aac55d370..9eca8a4c8c51 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -239,6 +239,8 @@ static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
     for (i = 0; i < nr_servers; ++i) {
         spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false);
     }
+
+    spapr_xive_hcall_init(spapr);
 }
 
 static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq, bool lsi,
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 03/12] spapr: add device tree support for the XIVE exploitation mode
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 01/12] spapr: introduce a new machine IRQ backend for XIVE Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 02/12] spapr: add hcalls support for the XIVE exploitation interrupt mode Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 04/12] spapr: allocate the interrupt thread context under the CPU core Cédric Le Goater
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

The XIVE interface for the guest is described in the device tree under
the "interrupt-controller" node. A couple of new properties are
specific to XIVE :

 - "reg"

   contains the base address and size of the thread interrupt
   managnement areas (TIMA), for the User level and for the Guest OS
   level. Only the Guest OS level is taken into account today.

 - "ibm,xive-eq-sizes"

   the size of the event queues. One cell per size supported, contains
   log2 of size, in ascending order.

 - "ibm,xive-lisn-ranges"

   the IRQ interrupt number ranges assigned to the guest for the IPIs.

and also under the root node :

 - "ibm,plat-res-int-priorities"

   contains a list of priorities that the hypervisor has reserved for
   its own use. OPAL uses the priority 7 queue to automatically
   escalate interrupts for all other queues (DD2.X POWER9). So only
   priorities [0..6] are allowed for the guest.

Extend the sPAPR IRQ backend with a new handler to populate the DT
with the appropriate "interrupt-controller" node.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 Changes since v7:

 - changed the EQ sizes list property to contain 64K only
 
 include/hw/ppc/spapr_irq.h  |  2 ++
 include/hw/ppc/spapr_xive.h |  2 ++
 include/hw/ppc/xics.h       |  4 +--
 hw/intc/spapr_xive.c        | 64 +++++++++++++++++++++++++++++++++++++
 hw/intc/xics_spapr.c        |  3 +-
 hw/ppc/spapr.c              |  3 +-
 hw/ppc/spapr_irq.c          |  3 ++
 7 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 23cdb51b879e..e51e9f052f63 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -39,6 +39,8 @@ typedef struct sPAPRIrq {
     void (*free)(sPAPRMachineState *spapr, int irq, int num);
     qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
     void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
+    void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
+                        void *fdt, uint32_t phandle);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 9506a8f4d10a..728a5e8dc163 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -45,5 +45,7 @@ qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
 typedef struct sPAPRMachineState sPAPRMachineState;
 
 void spapr_xive_hcall_init(sPAPRMachineState *spapr);
+void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+                   uint32_t phandle);
 
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 9958443d1984..14afda198cdb 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -181,8 +181,6 @@ typedef struct XICSFabricClass {
     ICPState *(*icp_get)(XICSFabric *xi, int server);
 } XICSFabricClass;
 
-void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle);
-
 ICPState *xics_icp_get(XICSFabric *xi, int server);
 
 /* Internal XICS interfaces */
@@ -204,6 +202,8 @@ void icp_resend(ICPState *ss);
 
 typedef struct sPAPRMachineState sPAPRMachineState;
 
+void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+                   uint32_t phandle);
 int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
 void xics_spapr_init(sPAPRMachineState *spapr);
 
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 982ac6e17051..9e5b2db2439a 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -14,6 +14,7 @@
 #include "target/ppc/cpu.h"
 #include "sysemu/cpus.h"
 #include "monitor/monitor.h"
+#include "hw/ppc/fdt.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xive.h"
@@ -1381,3 +1382,66 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr)
     spapr_register_hypercall(H_INT_SYNC, h_int_sync);
     spapr_register_hypercall(H_INT_RESET, h_int_reset);
 }
+
+void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+                   uint32_t phandle)
+{
+    sPAPRXive *xive = spapr->xive;
+    int node;
+    uint64_t timas[2 * 2];
+    /* Interrupt number ranges for the IPIs */
+    uint32_t lisn_ranges[] = {
+        cpu_to_be32(0),
+        cpu_to_be32(nr_servers),
+    };
+    /* EQ size - the sizes of pages supported by the system 4K, 64K,
+     * 2M, 16M. We only advertise 64K for the moment.
+     */
+    uint32_t eq_sizes[] = {
+        cpu_to_be32(16), /* 64K */
+    };
+    /* The following array is in sync with the reserved priorities
+     * defined by the 'spapr_xive_priority_is_reserved' routine.
+     */
+    uint32_t plat_res_int_priorities[] = {
+        cpu_to_be32(7),    /* start */
+        cpu_to_be32(0xf8), /* count */
+    };
+    gchar *nodename;
+
+    /* Thread Interrupt Management Area : User (ring 3) and OS (ring 2) */
+    timas[0] = cpu_to_be64(xive->tm_base +
+                           XIVE_TM_USER_PAGE * (1ull << TM_SHIFT));
+    timas[1] = cpu_to_be64(1ull << TM_SHIFT);
+    timas[2] = cpu_to_be64(xive->tm_base +
+                           XIVE_TM_OS_PAGE * (1ull << TM_SHIFT));
+    timas[3] = cpu_to_be64(1ull << TM_SHIFT);
+
+    nodename = g_strdup_printf("interrupt-controller@%" PRIx64,
+                           xive->tm_base + XIVE_TM_USER_PAGE * (1 << TM_SHIFT));
+    _FDT(node = fdt_add_subnode(fdt, 0, nodename));
+    g_free(nodename);
+
+    _FDT(fdt_setprop_string(fdt, node, "device_type", "power-ivpe"));
+    _FDT(fdt_setprop(fdt, node, "reg", timas, sizeof(timas)));
+
+    _FDT(fdt_setprop_string(fdt, node, "compatible", "ibm,power-ivpe"));
+    _FDT(fdt_setprop(fdt, node, "ibm,xive-eq-sizes", eq_sizes,
+                     sizeof(eq_sizes)));
+    _FDT(fdt_setprop(fdt, node, "ibm,xive-lisn-ranges", lisn_ranges,
+                     sizeof(lisn_ranges)));
+
+    /* For Linux to link the LSIs to the interrupt controller. */
+    _FDT(fdt_setprop(fdt, node, "interrupt-controller", NULL, 0));
+    _FDT(fdt_setprop_cell(fdt, node, "#interrupt-cells", 2));
+
+    /* For SLOF */
+    _FDT(fdt_setprop_cell(fdt, node, "linux,phandle", phandle));
+    _FDT(fdt_setprop_cell(fdt, node, "phandle", phandle));
+
+    /* The "ibm,plat-res-int-priorities" property defines the priority
+     * ranges reserved by the hypervisor
+     */
+    _FDT(fdt_setprop(fdt, 0, "ibm,plat-res-int-priorities",
+                     plat_res_int_priorities, sizeof(plat_res_int_priorities)));
+}
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 2e27b92b871a..f67d3c80bf3a 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -244,7 +244,8 @@ void xics_spapr_init(sPAPRMachineState *spapr)
     spapr_register_hypercall(H_IPOLL, h_ipoll);
 }
 
-void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle)
+void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+                   uint32_t phandle)
 {
     uint32_t interrupt_server_ranges_prop[] = {
         0, cpu_to_be32(nr_servers),
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8a18250cf74d..ab46460ddb8b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1268,7 +1268,8 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
     _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
     /* /interrupt controller */
-    spapr_dt_xics(spapr_max_server_number(spapr), fdt, PHANDLE_XICP);
+    smc->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
+                          PHANDLE_XICP);
 
     ret = spapr_populate_memory(spapr, fdt);
     if (ret < 0) {
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 9eca8a4c8c51..975954dc2712 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -204,6 +204,7 @@ sPAPRIrq spapr_irq_xics = {
     .free        = spapr_irq_free_xics,
     .qirq        = spapr_qirq_xics,
     .print_info  = spapr_irq_print_info_xics,
+    .dt_populate = spapr_dt_xics,
 };
 
 /*
@@ -298,6 +299,7 @@ sPAPRIrq spapr_irq_xive = {
     .free        = spapr_irq_free_xive,
     .qirq        = spapr_qirq_xive,
     .print_info  = spapr_irq_print_info_xive,
+    .dt_populate = spapr_dt_xive,
 };
 
 /*
@@ -402,4 +404,5 @@ sPAPRIrq spapr_irq_xics_legacy = {
     .free        = spapr_irq_free_xics,
     .qirq        = spapr_qirq_xics,
     .print_info  = spapr_irq_print_info_xics,
+    .dt_populate = spapr_dt_xics,
 };
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 04/12] spapr: allocate the interrupt thread context under the CPU core
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (2 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 03/12] spapr: add device tree support for the XIVE exploitation mode Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 05/12] spapr: extend the sPAPR IRQ backend for XICS migration Cédric Le Goater
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

Each interrupt mode has its own specific interrupt presenter object,
that we store under the CPU object, one for XICS and one for XIVE.

Extend the sPAPR IRQ backend with a new handler to support them both.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 include/hw/ppc/spapr_irq.h |  2 ++
 include/hw/ppc/xive.h      |  1 +
 hw/intc/xive.c             | 22 ++++++++++++++++++++++
 hw/ppc/spapr_cpu_core.c    |  5 ++---
 hw/ppc/spapr_irq.c         | 15 +++++++++++++++
 5 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index e51e9f052f63..13db0428ab51 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -41,6 +41,8 @@ typedef struct sPAPRIrq {
     void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
     void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
                         void *fdt, uint32_t phandle);
+    Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
+                               Error **errp);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 19309d1d65d1..18cd114eb244 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -419,6 +419,7 @@ typedef struct XiveTCTX {
 extern const MemoryRegionOps xive_tm_ops;
 
 void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
+Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp);
 
 static inline uint32_t xive_nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
 {
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index ea5385ff7784..53d2f191e8a3 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -526,6 +526,28 @@ static const TypeInfo xive_tctx_info = {
     .class_init    = xive_tctx_class_init,
 };
 
+Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp)
+{
+    Error *local_err = NULL;
+    Object *obj;
+
+    obj = object_new(TYPE_XIVE_TCTX);
+    object_property_add_child(cpu, TYPE_XIVE_TCTX, obj, &error_abort);
+    object_unref(obj);
+    object_property_add_const_link(obj, "cpu", cpu, &error_abort);
+    object_property_set_bool(obj, true, "realized", &local_err);
+    if (local_err) {
+        goto error;
+    }
+
+    return obj;
+
+error:
+    object_unparent(obj);
+    error_propagate(errp, local_err);
+    return NULL;
+}
+
 /*
  * XIVE ESB helpers
  */
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 2398ce62c0e7..1811cd48db90 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -11,7 +11,6 @@
 #include "hw/ppc/spapr_cpu_core.h"
 #include "target/ppc/cpu.h"
 #include "hw/ppc/spapr.h"
-#include "hw/ppc/xics.h" /* for icp_create() - to be removed */
 #include "hw/boards.h"
 #include "qapi/error.h"
 #include "sysemu/cpus.h"
@@ -215,6 +214,7 @@ static void spapr_cpu_core_unrealize(DeviceState *dev, Error **errp)
 static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
                                sPAPRCPUCore *sc, Error **errp)
 {
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
     CPUPPCState *env = &cpu->env;
     CPUState *cs = CPU(cpu);
     Error *local_err = NULL;
@@ -233,8 +233,7 @@ static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
     qemu_register_reset(spapr_cpu_reset, cpu);
     spapr_cpu_reset(cpu);
 
-    cpu->intc = icp_create(OBJECT(cpu), spapr->icp_type, XICS_FABRIC(spapr),
-                           &local_err);
+    cpu->intc = smc->irq->cpu_intc_create(spapr, OBJECT(cpu), &local_err);
     if (local_err) {
         goto error_unregister;
     }
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 975954dc2712..fdcc7795e446 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -191,6 +191,12 @@ static void spapr_irq_print_info_xics(sPAPRMachineState *spapr, Monitor *mon)
     ics_pic_print_info(spapr->ics, mon);
 }
 
+static Object *spapr_irq_cpu_intc_create_xics(sPAPRMachineState *spapr,
+                                              Object *cpu, Error **errp)
+{
+    return icp_create(cpu, spapr->icp_type, XICS_FABRIC(spapr), errp);
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS     \
     (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -205,6 +211,7 @@ sPAPRIrq spapr_irq_xics = {
     .qirq        = spapr_qirq_xics,
     .print_info  = spapr_irq_print_info_xics,
     .dt_populate = spapr_dt_xics,
+    .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
 };
 
 /*
@@ -282,6 +289,12 @@ static void spapr_irq_print_info_xive(sPAPRMachineState *spapr,
     spapr_xive_pic_print_info(spapr->xive, mon);
 }
 
+static Object *spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr,
+                                              Object *cpu, Error **errp)
+{
+    return xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 8K to be compatible
  * with XICS.
@@ -300,6 +313,7 @@ sPAPRIrq spapr_irq_xive = {
     .qirq        = spapr_qirq_xive,
     .print_info  = spapr_irq_print_info_xive,
     .dt_populate = spapr_dt_xive,
+    .cpu_intc_create = spapr_irq_cpu_intc_create_xive,
 };
 
 /*
@@ -405,4 +419,5 @@ sPAPRIrq spapr_irq_xics_legacy = {
     .qirq        = spapr_qirq_xics,
     .print_info  = spapr_irq_print_info_xics,
     .dt_populate = spapr_dt_xics,
+    .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
 };
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 05/12] spapr: extend the sPAPR IRQ backend for XICS migration
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (3 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 04/12] spapr: allocate the interrupt thread context under the CPU core Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 06/12] spapr: add a 'reset' method to the sPAPR IRQ backend Cédric Le Goater
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

Introduce a new sPAPR IRQ handler to handle resend after migration
when the machine is using a KVM XICS interrupt controller model.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 include/hw/ppc/spapr_irq.h |  2 ++
 hw/ppc/spapr.c             | 13 +++++--------
 hw/ppc/spapr_irq.c         | 27 +++++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 13db0428ab51..84a25ffb6c65 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -43,6 +43,7 @@ typedef struct sPAPRIrq {
                         void *fdt, uint32_t phandle);
     Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
                                Error **errp);
+    int (*post_load)(sPAPRMachineState *spapr, int version_id);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
@@ -53,6 +54,7 @@ void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
+int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
 
 /*
  * XICS legacy routines
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ab46460ddb8b..1f41ea2f3c6c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1730,14 +1730,6 @@ static int spapr_post_load(void *opaque, int version_id)
         return err;
     }
 
-    if (!object_dynamic_cast(OBJECT(spapr->ics), TYPE_ICS_KVM)) {
-        CPUState *cs;
-        CPU_FOREACH(cs) {
-            PowerPCCPU *cpu = POWERPC_CPU(cs);
-            icp_resend(ICP(cpu->intc));
-        }
-    }
-
     /* In earlier versions, there was no separate qdev for the PAPR
      * RTC, so the RTC offset was stored directly in sPAPREnvironment.
      * So when migrating from those versions, poke the incoming offset
@@ -1758,6 +1750,11 @@ static int spapr_post_load(void *opaque, int version_id)
         }
     }
 
+    err = spapr_irq_post_load(spapr, version_id);
+    if (err) {
+        return err;
+    }
+
     return err;
 }
 
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index fdcc7795e446..292c448a15fa 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -197,6 +197,18 @@ static Object *spapr_irq_cpu_intc_create_xics(sPAPRMachineState *spapr,
     return icp_create(cpu, spapr->icp_type, XICS_FABRIC(spapr), errp);
 }
 
+static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
+{
+    if (!object_dynamic_cast(OBJECT(spapr->ics), TYPE_ICS_KVM)) {
+        CPUState *cs;
+        CPU_FOREACH(cs) {
+            PowerPCCPU *cpu = POWERPC_CPU(cs);
+            icp_resend(ICP(cpu->intc));
+        }
+    }
+    return 0;
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS     \
     (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -212,6 +224,7 @@ sPAPRIrq spapr_irq_xics = {
     .print_info  = spapr_irq_print_info_xics,
     .dt_populate = spapr_dt_xics,
     .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
+    .post_load   = spapr_irq_post_load_xics,
 };
 
 /*
@@ -295,6 +308,11 @@ static Object *spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr,
     return xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
 }
 
+static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id)
+{
+    return 0;
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 8K to be compatible
  * with XICS.
@@ -314,6 +332,7 @@ sPAPRIrq spapr_irq_xive = {
     .print_info  = spapr_irq_print_info_xive,
     .dt_populate = spapr_dt_xive,
     .cpu_intc_create = spapr_irq_cpu_intc_create_xive,
+    .post_load   = spapr_irq_post_load_xive,
 };
 
 /*
@@ -352,6 +371,13 @@ qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq)
     return smc->irq->qirq(spapr, irq);
 }
 
+int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id)
+{
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+
+    return smc->irq->post_load(spapr, version_id);
+}
+
 /*
  * XICS legacy routines - to deprecate one day
  */
@@ -420,4 +446,5 @@ sPAPRIrq spapr_irq_xics_legacy = {
     .print_info  = spapr_irq_print_info_xics,
     .dt_populate = spapr_dt_xics,
     .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
+    .post_load   = spapr_irq_post_load_xics,
 };
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 06/12] spapr: add a 'reset' method to the sPAPR IRQ backend
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (4 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 05/12] spapr: extend the sPAPR IRQ backend for XICS migration Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 07/12] spapr: add an extra OV5 field " Cédric Le Goater
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

For the time being, the XIVE reset handler updates the OS CAM line of
the vCPU as it is done under a real hypervisor when a vCPU is
scheduled to run on a HW thread. This will let the XIVE presenter
engine find a match among the NVTs dispatched on the HW threads.

This handler will become even more useful when we introduce the
machine supporting both interrupt modes, XIVE and XICS. In this
machine, the interrupt mode is chosen by the CAS negotiation process
and activated after a reset.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 Changes since v7:

 - removed spapr_irq_reset_xics(). Will be introduce later.
 - replaced spapr_xive_reset_tctx() by spapr_xive_set_tctx_os_cam() to
   also cover hotplugged CPUs. The XiveTCTX and the CPU models lack a
   reset in case of hotplugged. 
 
 include/hw/ppc/spapr_irq.h  |  2 ++
 include/hw/ppc/spapr_xive.h |  1 +
 hw/intc/spapr_xive.c        | 17 +++++++++++++++++
 hw/ppc/spapr.c              |  5 +++++
 hw/ppc/spapr_irq.c          | 30 +++++++++++++++++++++++++++++-
 5 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 84a25ffb6c65..63061a009b4c 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -44,6 +44,7 @@ typedef struct sPAPRIrq {
     Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
                                Error **errp);
     int (*post_load)(sPAPRMachineState *spapr, int version_id);
+    void (*reset)(sPAPRMachineState *spapr, Error **errp);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
@@ -55,6 +56,7 @@ int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
 int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
+void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
 
 /*
  * XICS legacy routines
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 728a5e8dc163..728735dbcfbe 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -47,5 +47,6 @@ typedef struct sPAPRMachineState sPAPRMachineState;
 void spapr_xive_hcall_init(sPAPRMachineState *spapr);
 void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
                    uint32_t phandle);
+void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx);
 
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 9e5b2db2439a..aaa5865c4080 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -179,6 +179,23 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
     sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
 }
 
+/*
+ * When a Virtual Processor is scheduled to run on a HW thread, the
+ * hypervisor pushes its identifier in the OS CAM line. Emulate the
+ * same behavior under QEMU.
+ */
+void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx)
+{
+    uint8_t  nvt_blk;
+    uint32_t nvt_idx;
+    uint32_t nvt_cam;
+
+    spapr_xive_cpu_to_nvt(POWERPC_CPU(tctx->cs), &nvt_blk, &nvt_idx);
+
+    nvt_cam = cpu_to_be32(TM_QW1W2_VO | xive_nvt_cam_line(nvt_blk, nvt_idx));
+    memcpy(&tctx->regs[TM_QW1_OS + TM_WORD2], &nvt_cam, 4);
+}
+
 static void spapr_xive_end_reset(XiveEND *end)
 {
     memset(end, 0, sizeof(*end));
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1f41ea2f3c6c..57c0066edd56 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1619,6 +1619,11 @@ static void spapr_machine_reset(void)
 
     qemu_devices_reset();
 
+    /* This is fixing some of the default configuration of the XIVE
+     * devices. To be called after the reset of the machine devices.
+     */
+    spapr_irq_reset(spapr, &error_fatal);
+
     /* DRC reset may cause a device to be unplugged. This will cause troubles
      * if this device is used by another device (eg, a running vhost backend
      * will crash QEMU if the DIMM holding the vring goes away). To avoid such
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 292c448a15fa..f835ea939cbf 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -305,7 +305,13 @@ static void spapr_irq_print_info_xive(sPAPRMachineState *spapr,
 static Object *spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr,
                                               Object *cpu, Error **errp)
 {
-    return xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
+    Object *obj = xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
+
+    /* (TCG) Early setting the OS CAM line for hotplugged CPUs as they
+     * don't benificiate from the reset of the XIVE IRQ backend
+     */
+    spapr_xive_set_tctx_os_cam(XIVE_TCTX(obj));
+    return obj;
 }
 
 static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id)
@@ -313,6 +319,18 @@ static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id)
     return 0;
 }
 
+static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
+{
+    CPUState *cs;
+
+    CPU_FOREACH(cs) {
+        PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+        /* (TCG) Set the OS CAM line of the thread interrupt context. */
+        spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
+    }
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 8K to be compatible
  * with XICS.
@@ -333,6 +351,7 @@ sPAPRIrq spapr_irq_xive = {
     .dt_populate = spapr_dt_xive,
     .cpu_intc_create = spapr_irq_cpu_intc_create_xive,
     .post_load   = spapr_irq_post_load_xive,
+    .reset       = spapr_irq_reset_xive,
 };
 
 /*
@@ -378,6 +397,15 @@ int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id)
     return smc->irq->post_load(spapr, version_id);
 }
 
+void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp)
+{
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+
+    if (smc->irq->reset) {
+        smc->irq->reset(spapr, errp);
+    }
+}
+
 /*
  * XICS legacy routines - to deprecate one day
  */
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 07/12] spapr: add an extra OV5 field to the sPAPR IRQ backend
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (5 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 06/12] spapr: add a 'reset' method to the sPAPR IRQ backend Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-17  5:27   ` David Gibson
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 08/12] spapr: introduce an 'ic-mode' machine option Cédric Le Goater
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

The interrupt modes supported by the hypervisor are advertised to the
guest with new bits definitions of the option vector 5 of property
"ibm,arch-vec-5-platform-support. The byte 23 bits 0-1 of the OV5 are
defined as follow :

  0b00   PAPR 2.7 and earlier (Legacy systems)
  0b01   XIVE Exploitation mode only
  0b10   Either available

If the client/guest selects the XIVE interrupt mode, it informs the
hypervisor by returning the value 0b01 in byte 23 bits 0-1. A 0b00
value indicates the use of the XICS interrupt mode (Legacy systems).

The sPAPR IRQ backend is extended with these definitions and the
values are directly used to populate the "ibm,arch-vec-5-platform-support"
property. The interrupt mode is advertised under TCG and under KVM.
Although a KVM XIVE device is not yet available, the machine can still
operate with kernel_irqchip=off. However, we apply a restriction on
the CPU which is required to be a POWER9 when a XIVE interrupt
controller is in use.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 Changes since v7:

 - improved commit log
 - introduced a check on CPU type.

 include/hw/ppc/spapr.h     |  6 ++++++
 include/hw/ppc/spapr_irq.h |  1 +
 hw/ppc/spapr.c             | 36 ++++++++++++++++++++++++++++++------
 hw/ppc/spapr_irq.c         |  3 +++
 4 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 6bf028a02fe2..06765b4e9d79 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -824,5 +824,11 @@ int spapr_caps_post_migration(sPAPRMachineState *spapr);
 
 void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
                           Error **errp);
+/*
+ * XIVE definitions
+ */
+#define SPAPR_OV5_XIVE_LEGACY   0x0
+#define SPAPR_OV5_XIVE_EXPLOIT  0x40
+#define SPAPR_OV5_XIVE_BOTH     0x80 /* Only to advertise on the platform */
 
 #endif /* HW_SPAPR_H */
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 63061a009b4c..b34d5a00381b 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -33,6 +33,7 @@ void spapr_irq_msi_reset(sPAPRMachineState *spapr);
 typedef struct sPAPRIrq {
     uint32_t    nr_irqs;
     uint32_t    nr_msis;
+    uint8_t     ov5;
 
     void (*init)(sPAPRMachineState *spapr, Error **errp);
     int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 57c0066edd56..ff138f224a87 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1095,12 +1095,14 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
     spapr_dt_rtas_tokens(fdt, rtas);
 }
 
-/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU features
- * that the guest may request and thus the valid values for bytes 24..26 of
- * option vector 5: */
-static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
+/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU
+ * and the XIVE features that the guest may request and thus the valid
+ * values for bytes 23..26 of option vector 5: */
+static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
+                                          int chosen)
 {
     PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 
     char val[2 * 4] = {
         23, 0x00, /* Xive mode, filled in below. */
@@ -1111,9 +1113,17 @@ static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
 
     if (!ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
                           first_ppc_cpu->compat_pvr)) {
-        /* If we're in a pre POWER9 compat mode then the guest should do hash */
+        /* If we're in a pre POWER9 compat mode then the guest should do hash
+         * and use the legacy interrupt mode
+         */
+        val[1] = 0x00; /* XICS */
         val[3] = 0x00; /* Hash */
     } else if (kvm_enabled()) {
+        /* If the KVM XIVE device is not available, the machine can
+         * still operate with kernel_irqchip=off
+         */
+        val[1] = smc->irq->ov5;
+
         if (kvmppc_has_cap_mmu_radix() && kvmppc_has_cap_mmu_hash_v3()) {
             val[3] = 0x80; /* OV5_MMU_BOTH */
         } else if (kvmppc_has_cap_mmu_radix()) {
@@ -1122,6 +1132,9 @@ static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
             val[3] = 0x00; /* Hash */
         }
     } else {
+        /* In TCG, the interrupt mode is defined by the pseries machine */
+        val[1] = smc->irq->ov5;
+
         /* V3 MMU supports both hash and radix in tcg (with dynamic switching) */
         val[3] = 0xC0;
     }
@@ -1189,7 +1202,7 @@ static void spapr_dt_chosen(sPAPRMachineState *spapr, void *fdt)
         _FDT(fdt_setprop_string(fdt, chosen, "stdout-path", stdout_path));
     }
 
-    spapr_dt_ov5_platform_support(fdt, chosen);
+    spapr_dt_ov5_platform_support(spapr, fdt, chosen);
 
     g_free(stdout_path);
     g_free(bootlist);
@@ -2622,6 +2635,17 @@ static void spapr_machine_init(MachineState *machine)
     /* advertise support for ibm,dyamic-memory-v2 */
     spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
 
+    /* advertise XIVE on POWER9 machines */
+    if (smc->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
+        if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
+                                  0, spapr->max_compat_pvr)) {
+            spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
+        } else {
+            error_report("XIVE-only machines require a POWER9 CPU");
+            exit(1);
+        }
+    }
+
     /* init CPUs */
     spapr_init_cpus(spapr);
 
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index f835ea939cbf..79f06308995b 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -216,6 +216,7 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
 sPAPRIrq spapr_irq_xics = {
     .nr_irqs     = SPAPR_IRQ_XICS_NR_IRQS,
     .nr_msis     = SPAPR_IRQ_XICS_NR_MSIS,
+    .ov5         = SPAPR_OV5_XIVE_LEGACY,
 
     .init        = spapr_irq_init_xics,
     .claim       = spapr_irq_claim_xics,
@@ -342,6 +343,7 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
 sPAPRIrq spapr_irq_xive = {
     .nr_irqs     = SPAPR_IRQ_XIVE_NR_IRQS,
     .nr_msis     = SPAPR_IRQ_XIVE_NR_MSIS,
+    .ov5         = SPAPR_OV5_XIVE_EXPLOIT,
 
     .init        = spapr_irq_init_xive,
     .claim       = spapr_irq_claim_xive,
@@ -466,6 +468,7 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
 sPAPRIrq spapr_irq_xics_legacy = {
     .nr_irqs     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
     .nr_msis     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
+    .ov5         = SPAPR_OV5_XIVE_LEGACY,
 
     .init        = spapr_irq_init_xics,
     .claim       = spapr_irq_claim_xics,
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 08/12] spapr: introduce an 'ic-mode' machine option
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (6 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 07/12] spapr: add an extra OV5 field " Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-17  5:35   ` David Gibson
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset Cédric Le Goater
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

This option is used to select the interrupt controller mode (XICS or
XIVE) with which the machine will operate. XICS being the default
mode for now.

When running a machine with the XIVE interrupt mode backend, the guest
OS is required to have support for the XIVE exploitation mode. In the
case of legacy OS, the mode selected by CAS should be XICS and the OS
should fail to boot. However, QEMU could possibly detect it, terminate
the boot process and reset to stop in the SLOF firmware. This is not
yet handled.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 include/hw/ppc/spapr.h  |  1 +
 hw/ppc/spapr.c          | 52 ++++++++++++++++++++++++++++++++++-------
 hw/ppc/spapr_cpu_core.c |  3 +--
 hw/ppc/spapr_irq.c      | 34 ++++++++-------------------
 4 files changed, 56 insertions(+), 34 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 06765b4e9d79..2c77a8ba8810 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -177,6 +177,7 @@ struct sPAPRMachineState {
     int32_t irq_map_nr;
     unsigned long *irq_map;
     sPAPRXive  *xive;
+    sPAPRIrq *irq;
 
     bool cmd_line_caps[SPAPR_CAP_NUM];
     sPAPRCapabilities def, eff, mig;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ff138f224a87..5d985e38a598 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1102,7 +1102,6 @@ static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
                                           int chosen)
 {
     PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 
     char val[2 * 4] = {
         23, 0x00, /* Xive mode, filled in below. */
@@ -1122,7 +1121,7 @@ static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
         /* If the KVM XIVE device is not available, the machine can
          * still operate with kernel_irqchip=off
          */
-        val[1] = smc->irq->ov5;
+        val[1] = spapr->irq->ov5;
 
         if (kvmppc_has_cap_mmu_radix() && kvmppc_has_cap_mmu_hash_v3()) {
             val[3] = 0x80; /* OV5_MMU_BOTH */
@@ -1133,7 +1132,7 @@ static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
         }
     } else {
         /* In TCG, the interrupt mode is defined by the pseries machine */
-        val[1] = smc->irq->ov5;
+        val[1] = spapr->irq->ov5;
 
         /* V3 MMU supports both hash and radix in tcg (with dynamic switching) */
         val[3] = 0xC0;
@@ -1281,7 +1280,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
     _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
     /* /interrupt controller */
-    smc->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
+    spapr->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
                           PHANDLE_XICP);
 
     ret = spapr_populate_memory(spapr, fdt);
@@ -1302,7 +1301,8 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
     }
 
     QLIST_FOREACH(phb, &spapr->phbs, list) {
-        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt, smc->irq->nr_msis);
+        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt,
+                                    spapr->irq->nr_msis);
         if (ret < 0) {
             error_report("couldn't setup PCI devices in fdt");
             exit(1);
@@ -2636,7 +2636,7 @@ static void spapr_machine_init(MachineState *machine)
     spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
 
     /* advertise XIVE on POWER9 machines */
-    if (smc->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
+    if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
         if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
                                   0, spapr->max_compat_pvr)) {
             spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
@@ -3056,9 +3056,38 @@ static void spapr_set_vsmt(Object *obj, Visitor *v, const char *name,
     visit_type_uint32(v, name, (uint32_t *)opaque, errp);
 }
 
+static char *spapr_get_ic_mode(Object *obj, Error **errp)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+
+    if (spapr->irq == &spapr_irq_xics_legacy) {
+        return g_strdup("legacy");
+    } else if (spapr->irq == &spapr_irq_xics) {
+        return g_strdup("xics");
+    } else if (spapr->irq == &spapr_irq_xive) {
+        return g_strdup("xive");
+    }
+    g_assert_not_reached();
+}
+
+static void spapr_set_ic_mode(Object *obj, const char *value, Error **errp)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+
+    /* The legacy IRQ backend can not be set */
+    if (strcmp(value, "xics") == 0) {
+        spapr->irq = &spapr_irq_xics;
+    } else if (strcmp(value, "xive") == 0) {
+        spapr->irq = &spapr_irq_xive;
+    } else {
+        error_setg(errp, "Bad value for \"ic-mode\" property");
+    }
+}
+
 static void spapr_instance_init(Object *obj)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 
     spapr->htab_fd = -1;
     spapr->use_hotplug_event_source = true;
@@ -3092,6 +3121,14 @@ static void spapr_instance_init(Object *obj)
                                     " the host's SMT mode", &error_abort);
     object_property_add_bool(obj, "vfio-no-msix-emulation",
                              spapr_get_msix_emulation, NULL, NULL);
+
+    /* The machine class defines the default interrupt controller mode */
+    spapr->irq = smc->irq;
+    object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
+                            spapr_set_ic_mode, NULL);
+    object_property_set_description(obj, "ic-mode",
+                 "Specifies the interrupt controller mode (xics, xive)",
+                 NULL);
 }
 
 static void spapr_machine_finalizefn(Object *obj)
@@ -3814,9 +3851,8 @@ static void spapr_pic_print_info(InterruptStatsProvider *obj,
                                  Monitor *mon)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 
-    smc->irq->print_info(spapr, mon);
+    spapr->irq->print_info(spapr, mon);
 }
 
 int spapr_get_vcpu_id(PowerPCCPU *cpu)
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 1811cd48db90..82666436e9b4 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -214,7 +214,6 @@ static void spapr_cpu_core_unrealize(DeviceState *dev, Error **errp)
 static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
                                sPAPRCPUCore *sc, Error **errp)
 {
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
     CPUPPCState *env = &cpu->env;
     CPUState *cs = CPU(cpu);
     Error *local_err = NULL;
@@ -233,7 +232,7 @@ static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
     qemu_register_reset(spapr_cpu_reset, cpu);
     spapr_cpu_reset(cpu);
 
-    cpu->intc = smc->irq->cpu_intc_create(spapr, OBJECT(cpu), &local_err);
+    cpu->intc = spapr->irq->cpu_intc_create(spapr, OBJECT(cpu), &local_err);
     if (local_err) {
         goto error_unregister;
     }
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 79f06308995b..0999a2b2d69c 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -94,8 +94,7 @@ error:
 static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
 {
     MachineState *machine = MACHINE(spapr);
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
-    int nr_irqs = smc->irq->nr_irqs;
+    int nr_irqs = spapr->irq->nr_irqs;
     Error *local_err = NULL;
 
     if (kvm_enabled()) {
@@ -234,7 +233,6 @@ sPAPRIrq spapr_irq_xics = {
 static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
 {
     MachineState *machine = MACHINE(spapr);
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
     uint32_t nr_servers = spapr_max_server_number(spapr);
     DeviceState *dev;
     int i;
@@ -248,7 +246,7 @@ static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
     }
 
     dev = qdev_create(NULL, TYPE_SPAPR_XIVE);
-    qdev_prop_set_uint32(dev, "nr-irqs", smc->irq->nr_irqs);
+    qdev_prop_set_uint32(dev, "nr-irqs", spapr->irq->nr_irqs);
     /*
      * 8 XIVE END structures per CPU. One for each available priority
      */
@@ -361,50 +359,38 @@ sPAPRIrq spapr_irq_xive = {
  */
 void spapr_irq_init(sPAPRMachineState *spapr, Error **errp)
 {
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
-
     /* Initialize the MSI IRQ allocator. */
     if (!SPAPR_MACHINE_GET_CLASS(spapr)->legacy_irq_allocation) {
-        spapr_irq_msi_init(spapr, smc->irq->nr_msis);
+        spapr_irq_msi_init(spapr, spapr->irq->nr_msis);
     }
 
-    smc->irq->init(spapr, errp);
+    spapr->irq->init(spapr, errp);
 }
 
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp)
 {
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
-
-    return smc->irq->claim(spapr, irq, lsi, errp);
+    return spapr->irq->claim(spapr, irq, lsi, errp);
 }
 
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
 {
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
-
-    smc->irq->free(spapr, irq, num);
+    spapr->irq->free(spapr, irq, num);
 }
 
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq)
 {
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
-
-    return smc->irq->qirq(spapr, irq);
+    return spapr->irq->qirq(spapr, irq);
 }
 
 int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id)
 {
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
-
-    return smc->irq->post_load(spapr, version_id);
+    return spapr->irq->post_load(spapr, version_id);
 }
 
 void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp)
 {
-    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
-
-    if (smc->irq->reset) {
-        smc->irq->reset(spapr, errp);
+    if (spapr->irq->reset) {
+        spapr->irq->reset(spapr, errp);
     }
 }
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (7 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 08/12] spapr: introduce an 'ic-mode' machine option Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-13 12:52   ` Cédric Le Goater
  2018-12-17  5:37   ` David Gibson
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 10/12] spapr: enable XIVE MMIOs " Cédric Le Goater
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

Currently, the interrupt presenter of the vCPU is set at realize
time. Setting it at reset will become necessary when the new machine
supporting both interrupt modes is introduced. In this machine, the
interrupt mode is chosen at CAS time and activated after a reset.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 Changes since v7:

 - introduced spapr_irq_reset_xics(). 

 include/hw/ppc/spapr_cpu_core.h |  2 ++
 hw/ppc/spapr_cpu_core.c         | 26 ++++++++++++++++++++++++++
 hw/ppc/spapr_irq.c              | 13 +++++++++++++
 3 files changed, 41 insertions(+)

diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
index 9e2821e4b31f..fc8ea9021656 100644
--- a/include/hw/ppc/spapr_cpu_core.h
+++ b/include/hw/ppc/spapr_cpu_core.h
@@ -53,4 +53,6 @@ static inline sPAPRCPUState *spapr_cpu_state(PowerPCCPU *cpu)
     return (sPAPRCPUState *)cpu->machine_data;
 }
 
+void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type);
+
 #endif
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 82666436e9b4..afc5d4f4e739 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -397,3 +397,29 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
 };
 
 DEFINE_TYPES(spapr_cpu_core_type_infos)
+
+typedef struct ForeachFindIntCArgs {
+    const char *intc_type;
+    Object *intc;
+} ForeachFindIntCArgs;
+
+static int spapr_cpu_core_find_intc(Object *child, void *opaque)
+{
+    ForeachFindIntCArgs *args = opaque;
+
+    if (object_dynamic_cast(child, args->intc_type)) {
+        args->intc = child;
+    }
+
+    return args->intc != NULL;
+}
+
+void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type)
+{
+    ForeachFindIntCArgs args = { intc_type, NULL };
+
+    object_child_foreach(OBJECT(cpu), spapr_cpu_core_find_intc, &args);
+    g_assert(args.intc);
+
+    cpu->intc = args.intc;
+}
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 0999a2b2d69c..814500f22d34 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -12,6 +12,7 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_cpu_core.h"
 #include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xics.h"
 #include "sysemu/kvm.h"
@@ -208,6 +209,15 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
     return 0;
 }
 
+static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
+{
+   CPUState *cs;
+
+    CPU_FOREACH(cs) {
+        spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
+    }
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS     \
     (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -225,6 +235,7 @@ sPAPRIrq spapr_irq_xics = {
     .dt_populate = spapr_dt_xics,
     .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
     .post_load   = spapr_irq_post_load_xics,
+    .reset       = spapr_irq_reset_xics,
 };
 
 /*
@@ -325,6 +336,8 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
     CPU_FOREACH(cs) {
         PowerPCCPU *cpu = POWERPC_CPU(cs);
 
+        spapr_cpu_core_set_intc(cpu, TYPE_XIVE_TCTX);
+
         /* (TCG) Set the OS CAM line of the thread interrupt context. */
         spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
     }
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 10/12] spapr: enable XIVE MMIOs at reset
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (8 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-17  6:04   ` David Gibson
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS Cédric Le Goater
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

Depending on the interrupt mode of the machine, enable or disable the
XIVE MMIOs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 Changes since v7:

 - renamed spapr_xive_enable_mmio() to spapr_xive_mmio_set_enabled()

 include/hw/ppc/spapr_xive.h | 1 +
 hw/intc/spapr_xive.c        | 9 +++++++++
 hw/ppc/spapr_irq.c          | 8 ++++++++
 3 files changed, 18 insertions(+)

diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 728735dbcfbe..9b49871bdb1a 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -48,5 +48,6 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr);
 void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
                    uint32_t phandle);
 void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx);
+void spapr_xive_mmio_set_enabled(sPAPRXive *xive, bool enable);
 
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index aaa5865c4080..cd1b2c06f88b 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -179,6 +179,15 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
     sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
 }
 
+void spapr_xive_mmio_set_enabled(sPAPRXive *xive, bool enable)
+{
+    memory_region_set_enabled(&xive->source.esb_mmio, enable);
+    memory_region_set_enabled(&xive->tm_mmio, enable);
+
+    /* Disable the END ESBs until a guest OS makes use of them */
+    memory_region_set_enabled(&xive->end_source.esb_mmio, false);
+}
+
 /*
  * When a Virtual Processor is scheduled to run on a HW thread, the
  * hypervisor pushes its identifier in the OS CAM line. Emulate the
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 814500f22d34..b1319905327f 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -216,6 +216,11 @@ static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
     CPU_FOREACH(cs) {
         spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
     }
+
+    /* Deactivate the XIVE MMIOs */
+    if (spapr->xive) {
+        spapr_xive_mmio_set_enabled(spapr->xive, false);
+    }
 }
 
 #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
@@ -341,6 +346,9 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
         /* (TCG) Set the OS CAM line of the thread interrupt context. */
         spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
     }
+
+    /* Activate the XIVE MMIOs */
+    spapr_xive_mmio_set_enabled(spapr->xive, true);
 }
 
 /*
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (9 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 10/12] spapr: enable XIVE MMIOs " Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-17  6:07   ` David Gibson
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 12/12] spapr: change default CPU type to POWER9 Cédric Le Goater
  2018-12-17  5:29 ` [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) David Gibson
  12 siblings, 1 reply; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

The 'dual' sPAPR IRQ backend supports both interrupt mode, XIVE
exploitation mode and the legacy compatibility mode (XICS). both modes
are not supported at the same time.

The machine starts with the legacy mode and a new interrupt mode can
then be negotiated by the CAS process. In this case, the new mode is
activated after a reset to take into account the required changes in
the machine. These impact the device tree layout, the interrupt
presenter object and the exposed MMIO regions in the case of XIVE.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 Changes since v7:

 - usage of 'ic-mode' machine option

 include/hw/ppc/spapr_irq.h |   1 +
 hw/ppc/spapr.c             |  10 ++-
 hw/ppc/spapr_hcall.c       |  11 +++
 hw/ppc/spapr_irq.c         | 143 +++++++++++++++++++++++++++++++++++++
 4 files changed, 162 insertions(+), 3 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index b34d5a00381b..29936498dbc8 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -51,6 +51,7 @@ typedef struct sPAPRIrq {
 extern sPAPRIrq spapr_irq_xics;
 extern sPAPRIrq spapr_irq_xics_legacy;
 extern sPAPRIrq spapr_irq_xive;
+extern sPAPRIrq spapr_irq_dual;
 
 void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5d985e38a598..97a5e3c9929f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2636,11 +2636,11 @@ static void spapr_machine_init(MachineState *machine)
     spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
 
     /* advertise XIVE on POWER9 machines */
-    if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
+    if (spapr->irq->ov5 & (SPAPR_OV5_XIVE_EXPLOIT | SPAPR_OV5_XIVE_BOTH)) {
         if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
                                   0, spapr->max_compat_pvr)) {
             spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
-        } else {
+        } else if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
             error_report("XIVE-only machines require a POWER9 CPU");
             exit(1);
         }
@@ -3066,6 +3066,8 @@ static char *spapr_get_ic_mode(Object *obj, Error **errp)
         return g_strdup("xics");
     } else if (spapr->irq == &spapr_irq_xive) {
         return g_strdup("xive");
+    } else if (spapr->irq == &spapr_irq_dual) {
+        return g_strdup("dual");
     }
     g_assert_not_reached();
 }
@@ -3079,6 +3081,8 @@ static void spapr_set_ic_mode(Object *obj, const char *value, Error **errp)
         spapr->irq = &spapr_irq_xics;
     } else if (strcmp(value, "xive") == 0) {
         spapr->irq = &spapr_irq_xive;
+    } else if (strcmp(value, "dual") == 0) {
+        spapr->irq = &spapr_irq_dual;
     } else {
         error_setg(errp, "Bad value for \"ic-mode\" property");
     }
@@ -3127,7 +3131,7 @@ static void spapr_instance_init(Object *obj)
     object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
                             spapr_set_ic_mode, NULL);
     object_property_set_description(obj, "ic-mode",
-                 "Specifies the interrupt controller mode (xics, xive)",
+                 "Specifies the interrupt controller mode (xics, xive, dual)",
                  NULL);
 }
 
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index ae913d070f50..09386458f267 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1654,6 +1654,17 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
             (spapr_h_cas_compose_response(spapr, args[1], args[2],
                                           ov5_updates) != 0);
     }
+
+    /*
+     * Generate a machine reset when we have an update of the
+     * interrupt mode. Only required on the machine supporting both
+     * mode.
+     */
+    if (!spapr->cas_reboot) {
+        spapr->cas_reboot = spapr_ovec_test(ov5_updates, OV5_XIVE_EXPLOIT)
+            && spapr->irq->ov5 & SPAPR_OV5_XIVE_BOTH;
+    }
+
     spapr_ovec_cleanup(ov5_updates);
 
     if (spapr->cas_reboot) {
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index b1319905327f..31d195b08952 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -375,6 +375,149 @@ sPAPRIrq spapr_irq_xive = {
     .reset       = spapr_irq_reset_xive,
 };
 
+/*
+ * Dual XIVE and XICS IRQ backend.
+ *
+ * Both interrupt mode, XIVE and XICS, objects are created but the
+ * machine starts in legacy interrupt mode (XICS). It can be changed
+ * by the CAS negotiation process and, in that case, the new mode is
+ * activated after extra machine reset.
+ */
+
+/*
+ * Returns the sPAPR IRQ backend negotiated by CAS. XICS is the
+ * default.
+ */
+static sPAPRIrq *spapr_irq_current(sPAPRMachineState *spapr)
+{
+    return spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT) ?
+        &spapr_irq_xive : &spapr_irq_xics;
+}
+
+static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
+{
+    MachineState *machine = MACHINE(spapr);
+    Error *local_err = NULL;
+
+    if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
+        error_setg(errp, "No KVM support for the 'dual' machine");
+        return;
+    }
+
+    spapr_irq_xics.init(spapr, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    spapr_irq_xive.init(spapr, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+}
+
+static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq, bool lsi,
+                                Error **errp)
+{
+    int ret;
+    Error *local_err = NULL;
+
+    ret = spapr_irq_xive.claim(spapr, irq, lsi, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return ret;
+    }
+
+    ret = spapr_irq_xics.claim(spapr, irq, lsi, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
+
+    return ret;
+}
+
+static void spapr_irq_free_dual(sPAPRMachineState *spapr, int irq, int num)
+{
+    spapr_irq_xive.free(spapr, irq, num);
+    spapr_irq_xics.free(spapr, irq, num);
+}
+
+static qemu_irq spapr_qirq_dual(sPAPRMachineState *spapr, int irq)
+{
+    return spapr_irq_current(spapr)->qirq(spapr, irq);
+}
+
+static void spapr_irq_print_info_dual(sPAPRMachineState *spapr, Monitor *mon)
+{
+    spapr_irq_current(spapr)->print_info(spapr, mon);
+}
+
+static void spapr_irq_dt_populate_dual(sPAPRMachineState *spapr,
+                                       uint32_t nr_servers, void *fdt,
+                                       uint32_t phandle)
+{
+    spapr_irq_current(spapr)->dt_populate(spapr, nr_servers, fdt, phandle);
+}
+
+static Object *spapr_irq_cpu_intc_create_dual(sPAPRMachineState *spapr,
+                                              Object *cpu, Error **errp)
+{
+    Error *local_err = NULL;
+
+    spapr_irq_xive.cpu_intc_create(spapr, cpu, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return NULL;
+    }
+
+    /* Default to XICS interrupt mode */
+    return spapr_irq_xics.cpu_intc_create(spapr, cpu, errp);
+}
+
+static int spapr_irq_post_load_dual(sPAPRMachineState *spapr, int version_id)
+{
+    /*
+     * Force a reset of the XIVE backend after migration. The machine
+     * defaults to XICS at startup.
+     */
+    if (spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
+        spapr_irq_xive.reset(spapr, &error_fatal);
+    }
+
+    return spapr_irq_current(spapr)->post_load(spapr, version_id);
+}
+
+static void spapr_irq_reset_dual(sPAPRMachineState *spapr, Error **errp)
+{
+    /*
+     * Reset the interrupt mode selected by CAS.
+     */
+    spapr_irq_current(spapr)->reset(spapr, errp);
+}
+
+/*
+ * Define values in sync with the XIVE and XICS backend
+ */
+#define SPAPR_IRQ_DUAL_NR_IRQS     0x2000
+#define SPAPR_IRQ_DUAL_NR_MSIS     (SPAPR_IRQ_DUAL_NR_IRQS - SPAPR_IRQ_MSI)
+
+sPAPRIrq spapr_irq_dual = {
+    .nr_irqs     = SPAPR_IRQ_DUAL_NR_IRQS,
+    .nr_msis     = SPAPR_IRQ_DUAL_NR_MSIS,
+    .ov5         = SPAPR_OV5_XIVE_BOTH,
+
+    .init        = spapr_irq_init_dual,
+    .claim       = spapr_irq_claim_dual,
+    .free        = spapr_irq_free_dual,
+    .qirq        = spapr_qirq_dual,
+    .print_info  = spapr_irq_print_info_dual,
+    .dt_populate = spapr_irq_dt_populate_dual,
+    .cpu_intc_create = spapr_irq_cpu_intc_create_dual,
+    .post_load   = spapr_irq_post_load_dual,
+    .reset       = spapr_irq_reset_dual,
+};
+
 /*
  * sPAPR IRQ frontend routines for devices
  */
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Qemu-devel] [PATCH v8 12/12] spapr: change default CPU type to POWER9
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (10 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS Cédric Le Goater
@ 2018-12-11 22:38 ` Cédric Le Goater
  2018-12-17  6:07   ` David Gibson
  2018-12-17  5:29 ` [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) David Gibson
  12 siblings, 1 reply; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-11 22:38 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt, Cédric Le Goater

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 97a5e3c9929f..705d3f4057b0 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3938,7 +3938,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug = spapr_machine_device_unplug;
 
     smc->dr_lmb_enabled = true;
-    mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
+    mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
     mc->has_hotpluggable_cpus = true;
     smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED;
     fwc->get_dev_path = spapr_get_fw_dev_path;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset Cédric Le Goater
@ 2018-12-13 12:52   ` Cédric Le Goater
  2018-12-17  6:01     ` David Gibson
  2018-12-17  5:37   ` David Gibson
  1 sibling, 1 reply; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-13 12:52 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

On 12/11/18 11:38 PM, Cédric Le Goater wrote:
> Currently, the interrupt presenter of the vCPU is set at realize
> time. Setting it at reset will become necessary when the new machine
> supporting both interrupt modes is introduced. In this machine, the
> interrupt mode is chosen at CAS time and activated after a reset.

I think we need a second '->intc' pointer specific to XIVE instead. 
How about ->intc_xive ?  

We could just drop this patch and easly get rid of the XiveTCTX object 
leak in spapr_unrealize_vcpu().

C. 

> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  Changes since v7:
> 
>  - introduced spapr_irq_reset_xics(). 
> 
>  include/hw/ppc/spapr_cpu_core.h |  2 ++
>  hw/ppc/spapr_cpu_core.c         | 26 ++++++++++++++++++++++++++
>  hw/ppc/spapr_irq.c              | 13 +++++++++++++
>  3 files changed, 41 insertions(+)
> 
> diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
> index 9e2821e4b31f..fc8ea9021656 100644
> --- a/include/hw/ppc/spapr_cpu_core.h
> +++ b/include/hw/ppc/spapr_cpu_core.h
> @@ -53,4 +53,6 @@ static inline sPAPRCPUState *spapr_cpu_state(PowerPCCPU *cpu)
>      return (sPAPRCPUState *)cpu->machine_data;
>  }
>  
> +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type);
> +
>  #endif
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 82666436e9b4..afc5d4f4e739 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -397,3 +397,29 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
>  };
>  
>  DEFINE_TYPES(spapr_cpu_core_type_infos)
> +
> +typedef struct ForeachFindIntCArgs {
> +    const char *intc_type;
> +    Object *intc;
> +} ForeachFindIntCArgs;
> +
> +static int spapr_cpu_core_find_intc(Object *child, void *opaque)
> +{
> +    ForeachFindIntCArgs *args = opaque;
> +
> +    if (object_dynamic_cast(child, args->intc_type)) {
> +        args->intc = child;
> +    }
> +
> +    return args->intc != NULL;
> +}
> +
> +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type)
> +{
> +    ForeachFindIntCArgs args = { intc_type, NULL };
> +
> +    object_child_foreach(OBJECT(cpu), spapr_cpu_core_find_intc, &args);
> +    g_assert(args.intc);
> +
> +    cpu->intc = args.intc;
> +}
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 0999a2b2d69c..814500f22d34 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -12,6 +12,7 @@
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
>  #include "hw/ppc/spapr.h"
> +#include "hw/ppc/spapr_cpu_core.h"
>  #include "hw/ppc/spapr_xive.h"
>  #include "hw/ppc/xics.h"
>  #include "sysemu/kvm.h"
> @@ -208,6 +209,15 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
>      return 0;
>  }
>  
> +static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
> +{
> +   CPUState *cs;
> +
> +    CPU_FOREACH(cs) {
> +        spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
> +    }
> +}
> +
>  #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
>  #define SPAPR_IRQ_XICS_NR_MSIS     \
>      (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
> @@ -225,6 +235,7 @@ sPAPRIrq spapr_irq_xics = {
>      .dt_populate = spapr_dt_xics,
>      .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
>      .post_load   = spapr_irq_post_load_xics,
> +    .reset       = spapr_irq_reset_xics,
>  };
>  
>  /*
> @@ -325,6 +336,8 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
>      CPU_FOREACH(cs) {
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>  
> +        spapr_cpu_core_set_intc(cpu, TYPE_XIVE_TCTX);
> +
>          /* (TCG) Set the OS CAM line of the thread interrupt context. */
>          spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
>      }
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 07/12] spapr: add an extra OV5 field to the sPAPR IRQ backend
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 07/12] spapr: add an extra OV5 field " Cédric Le Goater
@ 2018-12-17  5:27   ` David Gibson
  2018-12-17  8:02     ` Cédric Le Goater
  0 siblings, 1 reply; 30+ messages in thread
From: David Gibson @ 2018-12-17  5:27 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 7812 bytes --]

On Tue, Dec 11, 2018 at 11:38:18PM +0100, Cédric Le Goater wrote:
> The interrupt modes supported by the hypervisor are advertised to the
> guest with new bits definitions of the option vector 5 of property
> "ibm,arch-vec-5-platform-support. The byte 23 bits 0-1 of the OV5 are
> defined as follow :
> 
>   0b00   PAPR 2.7 and earlier (Legacy systems)
>   0b01   XIVE Exploitation mode only
>   0b10   Either available
> 
> If the client/guest selects the XIVE interrupt mode, it informs the
> hypervisor by returning the value 0b01 in byte 23 bits 0-1. A 0b00
> value indicates the use of the XICS interrupt mode (Legacy systems).
> 
> The sPAPR IRQ backend is extended with these definitions and the
> values are directly used to populate the "ibm,arch-vec-5-platform-support"
> property. The interrupt mode is advertised under TCG and under KVM.
> Although a KVM XIVE device is not yet available, the machine can still
> operate with kernel_irqchip=off. However, we apply a restriction on
> the CPU which is required to be a POWER9 when a XIVE interrupt
> controller is in use.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  Changes since v7:
> 
>  - improved commit log
>  - introduced a check on CPU type.
> 
>  include/hw/ppc/spapr.h     |  6 ++++++
>  include/hw/ppc/spapr_irq.h |  1 +
>  hw/ppc/spapr.c             | 36 ++++++++++++++++++++++++++++++------
>  hw/ppc/spapr_irq.c         |  3 +++
>  4 files changed, 40 insertions(+), 6 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 6bf028a02fe2..06765b4e9d79 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -824,5 +824,11 @@ int spapr_caps_post_migration(sPAPRMachineState *spapr);
>  
>  void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
>                            Error **errp);
> +/*
> + * XIVE definitions
> + */
> +#define SPAPR_OV5_XIVE_LEGACY   0x0
> +#define SPAPR_OV5_XIVE_EXPLOIT  0x40
> +#define SPAPR_OV5_XIVE_BOTH     0x80 /* Only to advertise on the platform */
>  
>  #endif /* HW_SPAPR_H */
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 63061a009b4c..b34d5a00381b 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -33,6 +33,7 @@ void spapr_irq_msi_reset(sPAPRMachineState *spapr);
>  typedef struct sPAPRIrq {
>      uint32_t    nr_irqs;
>      uint32_t    nr_msis;
> +    uint8_t     ov5;
>  
>      void (*init)(sPAPRMachineState *spapr, Error **errp);
>      int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 57c0066edd56..ff138f224a87 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1095,12 +1095,14 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
>      spapr_dt_rtas_tokens(fdt, rtas);
>  }
>  
> -/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU features
> - * that the guest may request and thus the valid values for bytes 24..26 of
> - * option vector 5: */
> -static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
> +/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU
> + * and the XIVE features that the guest may request and thus the valid
> + * values for bytes 23..26 of option vector 5: */
> +static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
> +                                          int chosen)
>  {
>      PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>  
>      char val[2 * 4] = {
>          23, 0x00, /* Xive mode, filled in below. */
> @@ -1111,9 +1113,17 @@ static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
>  
>      if (!ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
>                            first_ppc_cpu->compat_pvr)) {
> -        /* If we're in a pre POWER9 compat mode then the guest should do hash */
> +        /* If we're in a pre POWER9 compat mode then the guest should do hash
> +         * and use the legacy interrupt mode
> +         */
> +        val[1] = 0x00; /* XICS */
>          val[3] = 0x00; /* Hash */
>      } else if (kvm_enabled()) {
> +        /* If the KVM XIVE device is not available, the machine can
> +         * still operate with kernel_irqchip=off
> +         */
> +        val[1] = smc->irq->ov5;


I know it means another layer of if, but I'd prefer to rearrange this
so setting the intc value is outside the if (kvm_enabled())..
> +
>          if (kvmppc_has_cap_mmu_radix() && kvmppc_has_cap_mmu_hash_v3()) {
>              val[3] = 0x80; /* OV5_MMU_BOTH */
>          } else if (kvmppc_has_cap_mmu_radix()) {
> @@ -1122,6 +1132,9 @@ static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
>              val[3] = 0x00; /* Hash */
>          }
>      } else {
> +        /* In TCG, the interrupt mode is defined by the pseries machine */
> +        val[1] = smc->irq->ov5;

..rather than duplicated in each branch.

> +
>          /* V3 MMU supports both hash and radix in tcg (with dynamic switching) */
>          val[3] = 0xC0;
>      }
> @@ -1189,7 +1202,7 @@ static void spapr_dt_chosen(sPAPRMachineState *spapr, void *fdt)
>          _FDT(fdt_setprop_string(fdt, chosen, "stdout-path", stdout_path));
>      }
>  
> -    spapr_dt_ov5_platform_support(fdt, chosen);
> +    spapr_dt_ov5_platform_support(spapr, fdt, chosen);
>  
>      g_free(stdout_path);
>      g_free(bootlist);
> @@ -2622,6 +2635,17 @@ static void spapr_machine_init(MachineState *machine)
>      /* advertise support for ibm,dyamic-memory-v2 */
>      spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
>  
> +    /* advertise XIVE on POWER9 machines */
> +    if (smc->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
> +        if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
> +                                  0, spapr->max_compat_pvr)) {
> +            spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
> +        } else {
> +            error_report("XIVE-only machines require a POWER9 CPU");
> +            exit(1);
> +        }
> +    }
> +
>      /* init CPUs */
>      spapr_init_cpus(spapr);
>  
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index f835ea939cbf..79f06308995b 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -216,6 +216,7 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
>  sPAPRIrq spapr_irq_xics = {
>      .nr_irqs     = SPAPR_IRQ_XICS_NR_IRQS,
>      .nr_msis     = SPAPR_IRQ_XICS_NR_MSIS,
> +    .ov5         = SPAPR_OV5_XIVE_LEGACY,
>  
>      .init        = spapr_irq_init_xics,
>      .claim       = spapr_irq_claim_xics,
> @@ -342,6 +343,7 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
>  sPAPRIrq spapr_irq_xive = {
>      .nr_irqs     = SPAPR_IRQ_XIVE_NR_IRQS,
>      .nr_msis     = SPAPR_IRQ_XIVE_NR_MSIS,
> +    .ov5         = SPAPR_OV5_XIVE_EXPLOIT,
>  
>      .init        = spapr_irq_init_xive,
>      .claim       = spapr_irq_claim_xive,
> @@ -466,6 +468,7 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
>  sPAPRIrq spapr_irq_xics_legacy = {
>      .nr_irqs     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
>      .nr_msis     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
> +    .ov5         = SPAPR_OV5_XIVE_LEGACY,
>  
>      .init        = spapr_irq_init_xics,
>      .claim       = spapr_irq_claim_xics,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9)
  2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
                   ` (11 preceding siblings ...)
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 12/12] spapr: change default CPU type to POWER9 Cédric Le Goater
@ 2018-12-17  5:29 ` David Gibson
  2018-12-17 10:16   ` Cédric Le Goater
  12 siblings, 1 reply; 30+ messages in thread
From: David Gibson @ 2018-12-17  5:29 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 920 bytes --]

On Tue, Dec 11, 2018 at 11:38:11PM +0100, Cédric Le Goater wrote:
> Hello,
> 
> Here is the version 8 of the QEMU models adding support for the XIVE
> interrupt controller to the sPAPR machine, under TCG only. KVM support
> will be proposed in an other patchset, along with the KVM XIVE device
> patchset, so will PowerNV.
> 
> The most important changes for sPAPR are the introduction of a 'ic-mode'
> machine option to select the interrupt mode of the machine and the fix
> for CPU hotplug.

I've applied 1-6, I have some minor comments on 7, still reviewing the
remainder.

We also need to fix that Windows build bug - I'm planning to fold /
insert that fix into the series so as not to break bisection.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 08/12] spapr: introduce an 'ic-mode' machine option
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 08/12] spapr: introduce an 'ic-mode' machine option Cédric Le Goater
@ 2018-12-17  5:35   ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2018-12-17  5:35 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 11056 bytes --]

On Tue, Dec 11, 2018 at 11:38:19PM +0100, Cédric Le Goater wrote:
> This option is used to select the interrupt controller mode (XICS or
> XIVE) with which the machine will operate. XICS being the default
> mode for now.
> 
> When running a machine with the XIVE interrupt mode backend, the guest
> OS is required to have support for the XIVE exploitation mode. In the
> case of legacy OS, the mode selected by CAS should be XICS and the OS
> should fail to boot. However, QEMU could possibly detect it, terminate
> the boot process and reset to stop in the SLOF firmware. This is not
> yet handled.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  include/hw/ppc/spapr.h  |  1 +
>  hw/ppc/spapr.c          | 52 ++++++++++++++++++++++++++++++++++-------
>  hw/ppc/spapr_cpu_core.c |  3 +--
>  hw/ppc/spapr_irq.c      | 34 ++++++++-------------------
>  4 files changed, 56 insertions(+), 34 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 06765b4e9d79..2c77a8ba8810 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -177,6 +177,7 @@ struct sPAPRMachineState {
>      int32_t irq_map_nr;
>      unsigned long *irq_map;
>      sPAPRXive  *xive;
> +    sPAPRIrq *irq;
>  
>      bool cmd_line_caps[SPAPR_CAP_NUM];
>      sPAPRCapabilities def, eff, mig;
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index ff138f224a87..5d985e38a598 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1102,7 +1102,6 @@ static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
>                                            int chosen)
>  {
>      PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>  
>      char val[2 * 4] = {
>          23, 0x00, /* Xive mode, filled in below. */
> @@ -1122,7 +1121,7 @@ static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
>          /* If the KVM XIVE device is not available, the machine can
>           * still operate with kernel_irqchip=off
>           */
> -        val[1] = smc->irq->ov5;
> +        val[1] = spapr->irq->ov5;
>  
>          if (kvmppc_has_cap_mmu_radix() && kvmppc_has_cap_mmu_hash_v3()) {
>              val[3] = 0x80; /* OV5_MMU_BOTH */
> @@ -1133,7 +1132,7 @@ static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
>          }
>      } else {
>          /* In TCG, the interrupt mode is defined by the pseries machine */
> -        val[1] = smc->irq->ov5;
> +        val[1] = spapr->irq->ov5;
>  
>          /* V3 MMU supports both hash and radix in tcg (with dynamic switching) */
>          val[3] = 0xC0;
> @@ -1281,7 +1280,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
>      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
>  
>      /* /interrupt controller */
> -    smc->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
> +    spapr->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
>                            PHANDLE_XICP);
>  
>      ret = spapr_populate_memory(spapr, fdt);
> @@ -1302,7 +1301,8 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
>      }
>  
>      QLIST_FOREACH(phb, &spapr->phbs, list) {
> -        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt, smc->irq->nr_msis);
> +        ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt,
> +                                    spapr->irq->nr_msis);
>          if (ret < 0) {
>              error_report("couldn't setup PCI devices in fdt");
>              exit(1);
> @@ -2636,7 +2636,7 @@ static void spapr_machine_init(MachineState *machine)
>      spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
>  
>      /* advertise XIVE on POWER9 machines */
> -    if (smc->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
> +    if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
>          if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
>                                    0, spapr->max_compat_pvr)) {
>              spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
> @@ -3056,9 +3056,38 @@ static void spapr_set_vsmt(Object *obj, Visitor *v, const char *name,
>      visit_type_uint32(v, name, (uint32_t *)opaque, errp);
>  }
>  
> +static char *spapr_get_ic_mode(Object *obj, Error **errp)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +
> +    if (spapr->irq == &spapr_irq_xics_legacy) {
> +        return g_strdup("legacy");
> +    } else if (spapr->irq == &spapr_irq_xics) {
> +        return g_strdup("xics");
> +    } else if (spapr->irq == &spapr_irq_xive) {
> +        return g_strdup("xive");
> +    }
> +    g_assert_not_reached();
> +}
> +
> +static void spapr_set_ic_mode(Object *obj, const char *value, Error **errp)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +
> +    /* The legacy IRQ backend can not be set */
> +    if (strcmp(value, "xics") == 0) {
> +        spapr->irq = &spapr_irq_xics;
> +    } else if (strcmp(value, "xive") == 0) {
> +        spapr->irq = &spapr_irq_xive;
> +    } else {
> +        error_setg(errp, "Bad value for \"ic-mode\" property");
> +    }
> +}
> +
>  static void spapr_instance_init(Object *obj)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>  
>      spapr->htab_fd = -1;
>      spapr->use_hotplug_event_source = true;
> @@ -3092,6 +3121,14 @@ static void spapr_instance_init(Object *obj)
>                                      " the host's SMT mode", &error_abort);
>      object_property_add_bool(obj, "vfio-no-msix-emulation",
>                               spapr_get_msix_emulation, NULL, NULL);
> +
> +    /* The machine class defines the default interrupt controller mode */
> +    spapr->irq = smc->irq;
> +    object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
> +                            spapr_set_ic_mode, NULL);
> +    object_property_set_description(obj, "ic-mode",
> +                 "Specifies the interrupt controller mode (xics, xive)",
> +                 NULL);
>  }
>  
>  static void spapr_machine_finalizefn(Object *obj)
> @@ -3814,9 +3851,8 @@ static void spapr_pic_print_info(InterruptStatsProvider *obj,
>                                   Monitor *mon)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>  
> -    smc->irq->print_info(spapr, mon);
> +    spapr->irq->print_info(spapr, mon);
>  }
>  
>  int spapr_get_vcpu_id(PowerPCCPU *cpu)
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 1811cd48db90..82666436e9b4 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -214,7 +214,6 @@ static void spapr_cpu_core_unrealize(DeviceState *dev, Error **errp)
>  static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>                                 sPAPRCPUCore *sc, Error **errp)
>  {
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>      CPUPPCState *env = &cpu->env;
>      CPUState *cs = CPU(cpu);
>      Error *local_err = NULL;
> @@ -233,7 +232,7 @@ static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>      qemu_register_reset(spapr_cpu_reset, cpu);
>      spapr_cpu_reset(cpu);
>  
> -    cpu->intc = smc->irq->cpu_intc_create(spapr, OBJECT(cpu), &local_err);
> +    cpu->intc = spapr->irq->cpu_intc_create(spapr, OBJECT(cpu), &local_err);
>      if (local_err) {
>          goto error_unregister;
>      }
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 79f06308995b..0999a2b2d69c 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -94,8 +94,7 @@ error:
>  static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
>  {
>      MachineState *machine = MACHINE(spapr);
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -    int nr_irqs = smc->irq->nr_irqs;
> +    int nr_irqs = spapr->irq->nr_irqs;
>      Error *local_err = NULL;
>  
>      if (kvm_enabled()) {
> @@ -234,7 +233,6 @@ sPAPRIrq spapr_irq_xics = {
>  static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
>  {
>      MachineState *machine = MACHINE(spapr);
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>      uint32_t nr_servers = spapr_max_server_number(spapr);
>      DeviceState *dev;
>      int i;
> @@ -248,7 +246,7 @@ static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
>      }
>  
>      dev = qdev_create(NULL, TYPE_SPAPR_XIVE);
> -    qdev_prop_set_uint32(dev, "nr-irqs", smc->irq->nr_irqs);
> +    qdev_prop_set_uint32(dev, "nr-irqs", spapr->irq->nr_irqs);
>      /*
>       * 8 XIVE END structures per CPU. One for each available priority
>       */
> @@ -361,50 +359,38 @@ sPAPRIrq spapr_irq_xive = {
>   */
>  void spapr_irq_init(sPAPRMachineState *spapr, Error **errp)
>  {
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -
>      /* Initialize the MSI IRQ allocator. */
>      if (!SPAPR_MACHINE_GET_CLASS(spapr)->legacy_irq_allocation) {
> -        spapr_irq_msi_init(spapr, smc->irq->nr_msis);
> +        spapr_irq_msi_init(spapr, spapr->irq->nr_msis);
>      }
>  
> -    smc->irq->init(spapr, errp);
> +    spapr->irq->init(spapr, errp);
>  }
>  
>  int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp)
>  {
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -
> -    return smc->irq->claim(spapr, irq, lsi, errp);
> +    return spapr->irq->claim(spapr, irq, lsi, errp);
>  }
>  
>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num)
>  {
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -
> -    smc->irq->free(spapr, irq, num);
> +    spapr->irq->free(spapr, irq, num);
>  }
>  
>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq)
>  {
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -
> -    return smc->irq->qirq(spapr, irq);
> +    return spapr->irq->qirq(spapr, irq);
>  }
>  
>  int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id)
>  {
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -
> -    return smc->irq->post_load(spapr, version_id);
> +    return spapr->irq->post_load(spapr, version_id);
>  }
>  
>  void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp)
>  {
> -    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> -
> -    if (smc->irq->reset) {
> -        smc->irq->reset(spapr, errp);
> +    if (spapr->irq->reset) {
> +        spapr->irq->reset(spapr, errp);
>      }
>  }
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset Cédric Le Goater
  2018-12-13 12:52   ` Cédric Le Goater
@ 2018-12-17  5:37   ` David Gibson
  1 sibling, 0 replies; 30+ messages in thread
From: David Gibson @ 2018-12-17  5:37 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 3976 bytes --]

On Tue, Dec 11, 2018 at 11:38:20PM +0100, Cédric Le Goater wrote:
> Currently, the interrupt presenter of the vCPU is set at realize
> time. Setting it at reset will become necessary when the new machine
> supporting both interrupt modes is introduced. In this machine, the
> interrupt mode is chosen at CAS time and activated after a reset.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Ugly, but necessary.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
> 
>  Changes since v7:
> 
>  - introduced spapr_irq_reset_xics(). 
> 
>  include/hw/ppc/spapr_cpu_core.h |  2 ++
>  hw/ppc/spapr_cpu_core.c         | 26 ++++++++++++++++++++++++++
>  hw/ppc/spapr_irq.c              | 13 +++++++++++++
>  3 files changed, 41 insertions(+)
> 
> diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
> index 9e2821e4b31f..fc8ea9021656 100644
> --- a/include/hw/ppc/spapr_cpu_core.h
> +++ b/include/hw/ppc/spapr_cpu_core.h
> @@ -53,4 +53,6 @@ static inline sPAPRCPUState *spapr_cpu_state(PowerPCCPU *cpu)
>      return (sPAPRCPUState *)cpu->machine_data;
>  }
>  
> +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type);
> +
>  #endif
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 82666436e9b4..afc5d4f4e739 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -397,3 +397,29 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
>  };
>  
>  DEFINE_TYPES(spapr_cpu_core_type_infos)
> +
> +typedef struct ForeachFindIntCArgs {
> +    const char *intc_type;
> +    Object *intc;
> +} ForeachFindIntCArgs;
> +
> +static int spapr_cpu_core_find_intc(Object *child, void *opaque)
> +{
> +    ForeachFindIntCArgs *args = opaque;
> +
> +    if (object_dynamic_cast(child, args->intc_type)) {
> +        args->intc = child;
> +    }
> +
> +    return args->intc != NULL;
> +}
> +
> +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type)
> +{
> +    ForeachFindIntCArgs args = { intc_type, NULL };
> +
> +    object_child_foreach(OBJECT(cpu), spapr_cpu_core_find_intc, &args);
> +    g_assert(args.intc);
> +
> +    cpu->intc = args.intc;
> +}
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 0999a2b2d69c..814500f22d34 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -12,6 +12,7 @@
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
>  #include "hw/ppc/spapr.h"
> +#include "hw/ppc/spapr_cpu_core.h"
>  #include "hw/ppc/spapr_xive.h"
>  #include "hw/ppc/xics.h"
>  #include "sysemu/kvm.h"
> @@ -208,6 +209,15 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
>      return 0;
>  }
>  
> +static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
> +{
> +   CPUState *cs;
> +
> +    CPU_FOREACH(cs) {
> +        spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
> +    }
> +}
> +
>  #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
>  #define SPAPR_IRQ_XICS_NR_MSIS     \
>      (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
> @@ -225,6 +235,7 @@ sPAPRIrq spapr_irq_xics = {
>      .dt_populate = spapr_dt_xics,
>      .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
>      .post_load   = spapr_irq_post_load_xics,
> +    .reset       = spapr_irq_reset_xics,
>  };
>  
>  /*
> @@ -325,6 +336,8 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
>      CPU_FOREACH(cs) {
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>  
> +        spapr_cpu_core_set_intc(cpu, TYPE_XIVE_TCTX);
> +
>          /* (TCG) Set the OS CAM line of the thread interrupt context. */
>          spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
>      }

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset
  2018-12-13 12:52   ` Cédric Le Goater
@ 2018-12-17  6:01     ` David Gibson
  2018-12-17 22:41       ` Cédric Le Goater
  0 siblings, 1 reply; 30+ messages in thread
From: David Gibson @ 2018-12-17  6:01 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 4766 bytes --]

On Thu, Dec 13, 2018 at 01:52:14PM +0100, Cédric Le Goater wrote:
> On 12/11/18 11:38 PM, Cédric Le Goater wrote:
> > Currently, the interrupt presenter of the vCPU is set at realize
> > time. Setting it at reset will become necessary when the new machine
> > supporting both interrupt modes is introduced. In this machine, the
> > interrupt mode is chosen at CAS time and activated after a reset.
> 
> I think we need a second '->intc' pointer specific to XIVE instead. 
> How about ->intc_xive ?

Ah, interesting.  If we're going to need that, we might as well go
to specific ->icp and ->tctx pointers with the right types.

I don't particularly like having those machine specific pointers
within the cpu structure, but we can look at fixing that later by
reviving my patches to move various machine specific stuff within the
cpu into a separate sub-allocation.

> 
> We could just drop this patch and easly get rid of the XiveTCTX object 
> leak in spapr_unrealize_vcpu().
> 
> C. 
> 
> > Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > ---
> > 
> >  Changes since v7:
> > 
> >  - introduced spapr_irq_reset_xics(). 
> > 
> >  include/hw/ppc/spapr_cpu_core.h |  2 ++
> >  hw/ppc/spapr_cpu_core.c         | 26 ++++++++++++++++++++++++++
> >  hw/ppc/spapr_irq.c              | 13 +++++++++++++
> >  3 files changed, 41 insertions(+)
> > 
> > diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
> > index 9e2821e4b31f..fc8ea9021656 100644
> > --- a/include/hw/ppc/spapr_cpu_core.h
> > +++ b/include/hw/ppc/spapr_cpu_core.h
> > @@ -53,4 +53,6 @@ static inline sPAPRCPUState *spapr_cpu_state(PowerPCCPU *cpu)
> >      return (sPAPRCPUState *)cpu->machine_data;
> >  }
> >  
> > +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type);
> > +
> >  #endif
> > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> > index 82666436e9b4..afc5d4f4e739 100644
> > --- a/hw/ppc/spapr_cpu_core.c
> > +++ b/hw/ppc/spapr_cpu_core.c
> > @@ -397,3 +397,29 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
> >  };
> >  
> >  DEFINE_TYPES(spapr_cpu_core_type_infos)
> > +
> > +typedef struct ForeachFindIntCArgs {
> > +    const char *intc_type;
> > +    Object *intc;
> > +} ForeachFindIntCArgs;
> > +
> > +static int spapr_cpu_core_find_intc(Object *child, void *opaque)
> > +{
> > +    ForeachFindIntCArgs *args = opaque;
> > +
> > +    if (object_dynamic_cast(child, args->intc_type)) {
> > +        args->intc = child;
> > +    }
> > +
> > +    return args->intc != NULL;
> > +}
> > +
> > +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type)
> > +{
> > +    ForeachFindIntCArgs args = { intc_type, NULL };
> > +
> > +    object_child_foreach(OBJECT(cpu), spapr_cpu_core_find_intc, &args);
> > +    g_assert(args.intc);
> > +
> > +    cpu->intc = args.intc;
> > +}
> > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> > index 0999a2b2d69c..814500f22d34 100644
> > --- a/hw/ppc/spapr_irq.c
> > +++ b/hw/ppc/spapr_irq.c
> > @@ -12,6 +12,7 @@
> >  #include "qemu/error-report.h"
> >  #include "qapi/error.h"
> >  #include "hw/ppc/spapr.h"
> > +#include "hw/ppc/spapr_cpu_core.h"
> >  #include "hw/ppc/spapr_xive.h"
> >  #include "hw/ppc/xics.h"
> >  #include "sysemu/kvm.h"
> > @@ -208,6 +209,15 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
> >      return 0;
> >  }
> >  
> > +static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
> > +{
> > +   CPUState *cs;
> > +
> > +    CPU_FOREACH(cs) {
> > +        spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
> > +    }
> > +}
> > +
> >  #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
> >  #define SPAPR_IRQ_XICS_NR_MSIS     \
> >      (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
> > @@ -225,6 +235,7 @@ sPAPRIrq spapr_irq_xics = {
> >      .dt_populate = spapr_dt_xics,
> >      .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
> >      .post_load   = spapr_irq_post_load_xics,
> > +    .reset       = spapr_irq_reset_xics,
> >  };
> >  
> >  /*
> > @@ -325,6 +336,8 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
> >      CPU_FOREACH(cs) {
> >          PowerPCCPU *cpu = POWERPC_CPU(cs);
> >  
> > +        spapr_cpu_core_set_intc(cpu, TYPE_XIVE_TCTX);
> > +
> >          /* (TCG) Set the OS CAM line of the thread interrupt context. */
> >          spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
> >      }
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 10/12] spapr: enable XIVE MMIOs at reset
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 10/12] spapr: enable XIVE MMIOs " Cédric Le Goater
@ 2018-12-17  6:04   ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2018-12-17  6:04 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 3204 bytes --]

On Tue, Dec 11, 2018 at 11:38:21PM +0100, Cédric Le Goater wrote:
> Depending on the interrupt mode of the machine, enable or disable the
> XIVE MMIOs.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  Changes since v7:
> 
>  - renamed spapr_xive_enable_mmio() to spapr_xive_mmio_set_enabled()
> 
>  include/hw/ppc/spapr_xive.h | 1 +
>  hw/intc/spapr_xive.c        | 9 +++++++++
>  hw/ppc/spapr_irq.c          | 8 ++++++++
>  3 files changed, 18 insertions(+)
> 
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 728735dbcfbe..9b49871bdb1a 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -48,5 +48,6 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr);
>  void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
>                     uint32_t phandle);
>  void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx);
> +void spapr_xive_mmio_set_enabled(sPAPRXive *xive, bool enable);
>  
>  #endif /* PPC_SPAPR_XIVE_H */
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index aaa5865c4080..cd1b2c06f88b 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -179,6 +179,15 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
>      sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
>  }
>  
> +void spapr_xive_mmio_set_enabled(sPAPRXive *xive, bool enable)
> +{
> +    memory_region_set_enabled(&xive->source.esb_mmio, enable);
> +    memory_region_set_enabled(&xive->tm_mmio, enable);
> +
> +    /* Disable the END ESBs until a guest OS makes use of them */
> +    memory_region_set_enabled(&xive->end_source.esb_mmio, false);
> +}
> +
>  /*
>   * When a Virtual Processor is scheduled to run on a HW thread, the
>   * hypervisor pushes its identifier in the OS CAM line. Emulate the
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 814500f22d34..b1319905327f 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -216,6 +216,11 @@ static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
>      CPU_FOREACH(cs) {
>          spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
>      }
> +
> +    /* Deactivate the XIVE MMIOs */
> +    if (spapr->xive) {
> +        spapr_xive_mmio_set_enabled(spapr->xive, false);
> +    }

Rather than having the XICS reset reach across and disable the XIVE
MMIO, I think it would be preferable for the common wrapper to disable
both XICS and XIVE, then have the intc specific hook re-enable the
correct one.

>  }
>  
>  #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
> @@ -341,6 +346,9 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
>          /* (TCG) Set the OS CAM line of the thread interrupt context. */
>          spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
>      }
> +
> +    /* Activate the XIVE MMIOs */
> +    spapr_xive_mmio_set_enabled(spapr->xive, true);
>  }
>  
>  /*

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS Cédric Le Goater
@ 2018-12-17  6:07   ` David Gibson
  2018-12-17 22:38     ` Cédric Le Goater
  0 siblings, 1 reply; 30+ messages in thread
From: David Gibson @ 2018-12-17  6:07 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 9781 bytes --]

On Tue, Dec 11, 2018 at 11:38:22PM +0100, Cédric Le Goater wrote:
> The 'dual' sPAPR IRQ backend supports both interrupt mode, XIVE
> exploitation mode and the legacy compatibility mode (XICS). both modes
> are not supported at the same time.
> 
> The machine starts with the legacy mode and a new interrupt mode can
> then be negotiated by the CAS process. In this case, the new mode is
> activated after a reset to take into account the required changes in
> the machine. These impact the device tree layout, the interrupt
> presenter object and the exposed MMIO regions in the case of XIVE.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  Changes since v7:
> 
>  - usage of 'ic-mode' machine option
> 
>  include/hw/ppc/spapr_irq.h |   1 +
>  hw/ppc/spapr.c             |  10 ++-
>  hw/ppc/spapr_hcall.c       |  11 +++
>  hw/ppc/spapr_irq.c         | 143 +++++++++++++++++++++++++++++++++++++
>  4 files changed, 162 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index b34d5a00381b..29936498dbc8 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -51,6 +51,7 @@ typedef struct sPAPRIrq {
>  extern sPAPRIrq spapr_irq_xics;
>  extern sPAPRIrq spapr_irq_xics_legacy;
>  extern sPAPRIrq spapr_irq_xive;
> +extern sPAPRIrq spapr_irq_dual;
>  
>  void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
>  int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5d985e38a598..97a5e3c9929f 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2636,11 +2636,11 @@ static void spapr_machine_init(MachineState *machine)
>      spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
>  
>      /* advertise XIVE on POWER9 machines */
> -    if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
> +    if (spapr->irq->ov5 & (SPAPR_OV5_XIVE_EXPLOIT | SPAPR_OV5_XIVE_BOTH)) {
>          if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
>                                    0, spapr->max_compat_pvr)) {
>              spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
> -        } else {
> +        } else if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
>              error_report("XIVE-only machines require a POWER9 CPU");
>              exit(1);
>          }
> @@ -3066,6 +3066,8 @@ static char *spapr_get_ic_mode(Object *obj, Error **errp)
>          return g_strdup("xics");
>      } else if (spapr->irq == &spapr_irq_xive) {
>          return g_strdup("xive");
> +    } else if (spapr->irq == &spapr_irq_dual) {
> +        return g_strdup("dual");
>      }
>      g_assert_not_reached();
>  }
> @@ -3079,6 +3081,8 @@ static void spapr_set_ic_mode(Object *obj, const char *value, Error **errp)
>          spapr->irq = &spapr_irq_xics;
>      } else if (strcmp(value, "xive") == 0) {
>          spapr->irq = &spapr_irq_xive;
> +    } else if (strcmp(value, "dual") == 0) {
> +        spapr->irq = &spapr_irq_dual;
>      } else {
>          error_setg(errp, "Bad value for \"ic-mode\" property");
>      }
> @@ -3127,7 +3131,7 @@ static void spapr_instance_init(Object *obj)
>      object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
>                              spapr_set_ic_mode, NULL);
>      object_property_set_description(obj, "ic-mode",
> -                 "Specifies the interrupt controller mode (xics, xive)",
> +                 "Specifies the interrupt controller mode (xics, xive, dual)",
>                   NULL);
>  }
>  
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index ae913d070f50..09386458f267 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -1654,6 +1654,17 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
>              (spapr_h_cas_compose_response(spapr, args[1], args[2],
>                                            ov5_updates) != 0);
>      }
> +
> +    /*
> +     * Generate a machine reset when we have an update of the
> +     * interrupt mode. Only required on the machine supporting both
> +     * mode.
> +     */
> +    if (!spapr->cas_reboot) {
> +        spapr->cas_reboot = spapr_ovec_test(ov5_updates, OV5_XIVE_EXPLOIT)
> +            && spapr->irq->ov5 & SPAPR_OV5_XIVE_BOTH;
> +    }
> +
>      spapr_ovec_cleanup(ov5_updates);
>  
>      if (spapr->cas_reboot) {
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index b1319905327f..31d195b08952 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -375,6 +375,149 @@ sPAPRIrq spapr_irq_xive = {
>      .reset       = spapr_irq_reset_xive,
>  };
>  
> +/*
> + * Dual XIVE and XICS IRQ backend.
> + *
> + * Both interrupt mode, XIVE and XICS, objects are created but the
> + * machine starts in legacy interrupt mode (XICS). It can be changed
> + * by the CAS negotiation process and, in that case, the new mode is
> + * activated after extra machine reset.
> + */
> +
> +/*
> + * Returns the sPAPR IRQ backend negotiated by CAS. XICS is the
> + * default.
> + */
> +static sPAPRIrq *spapr_irq_current(sPAPRMachineState *spapr)
> +{
> +    return spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT) ?
> +        &spapr_irq_xive : &spapr_irq_xics;
> +}
> +
> +static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
> +{
> +    MachineState *machine = MACHINE(spapr);
> +    Error *local_err = NULL;
> +
> +    if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
> +        error_setg(errp, "No KVM support for the 'dual' machine");
> +        return;
> +    }
> +
> +    spapr_irq_xics.init(spapr, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    spapr_irq_xive.init(spapr, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +}
> +
> +static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq, bool lsi,
> +                                Error **errp)
> +{
> +    int ret;
> +    Error *local_err = NULL;
> +
> +    ret = spapr_irq_xive.claim(spapr, irq, lsi, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return ret;
> +    }
> +
> +    ret = spapr_irq_xics.claim(spapr, irq, lsi, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +    }
> +
> +    return ret;
> +}
> +
> +static void spapr_irq_free_dual(sPAPRMachineState *spapr, int irq, int num)
> +{
> +    spapr_irq_xive.free(spapr, irq, num);
> +    spapr_irq_xics.free(spapr, irq, num);
> +}
> +
> +static qemu_irq spapr_qirq_dual(sPAPRMachineState *spapr, int irq)
> +{
> +    return spapr_irq_current(spapr)->qirq(spapr, irq);
> +}

This still makes me really nervous - I'd really prefer to have qirqs
independent of the backend, rather than relying on *every* irq using
device never looking up qirqs in advance.

> +static void spapr_irq_print_info_dual(sPAPRMachineState *spapr, Monitor *mon)
> +{
> +    spapr_irq_current(spapr)->print_info(spapr, mon);
> +}
> +
> +static void spapr_irq_dt_populate_dual(sPAPRMachineState *spapr,
> +                                       uint32_t nr_servers, void *fdt,
> +                                       uint32_t phandle)
> +{
> +    spapr_irq_current(spapr)->dt_populate(spapr, nr_servers, fdt, phandle);
> +}
> +
> +static Object *spapr_irq_cpu_intc_create_dual(sPAPRMachineState *spapr,
> +                                              Object *cpu, Error **errp)
> +{
> +    Error *local_err = NULL;
> +
> +    spapr_irq_xive.cpu_intc_create(spapr, cpu, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return NULL;
> +    }
> +
> +    /* Default to XICS interrupt mode */
> +    return spapr_irq_xics.cpu_intc_create(spapr, cpu, errp);
> +}
> +
> +static int spapr_irq_post_load_dual(sPAPRMachineState *spapr, int version_id)
> +{
> +    /*
> +     * Force a reset of the XIVE backend after migration. The machine
> +     * defaults to XICS at startup.
> +     */
> +    if (spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
> +        spapr_irq_xive.reset(spapr, &error_fatal);
> +    }
> +
> +    return spapr_irq_current(spapr)->post_load(spapr, version_id);
> +}
> +
> +static void spapr_irq_reset_dual(sPAPRMachineState *spapr, Error **errp)
> +{
> +    /*
> +     * Reset the interrupt mode selected by CAS.
> +     */
> +    spapr_irq_current(spapr)->reset(spapr, errp);
> +}
> +
> +/*
> + * Define values in sync with the XIVE and XICS backend
> + */
> +#define SPAPR_IRQ_DUAL_NR_IRQS     0x2000
> +#define SPAPR_IRQ_DUAL_NR_MSIS     (SPAPR_IRQ_DUAL_NR_IRQS - SPAPR_IRQ_MSI)
> +
> +sPAPRIrq spapr_irq_dual = {
> +    .nr_irqs     = SPAPR_IRQ_DUAL_NR_IRQS,
> +    .nr_msis     = SPAPR_IRQ_DUAL_NR_MSIS,
> +    .ov5         = SPAPR_OV5_XIVE_BOTH,
> +
> +    .init        = spapr_irq_init_dual,
> +    .claim       = spapr_irq_claim_dual,
> +    .free        = spapr_irq_free_dual,
> +    .qirq        = spapr_qirq_dual,
> +    .print_info  = spapr_irq_print_info_dual,
> +    .dt_populate = spapr_irq_dt_populate_dual,
> +    .cpu_intc_create = spapr_irq_cpu_intc_create_dual,
> +    .post_load   = spapr_irq_post_load_dual,
> +    .reset       = spapr_irq_reset_dual,
> +};
> +
>  /*
>   * sPAPR IRQ frontend routines for devices
>   */

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 12/12] spapr: change default CPU type to POWER9
  2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 12/12] spapr: change default CPU type to POWER9 Cédric Le Goater
@ 2018-12-17  6:07   ` David Gibson
  2018-12-17 10:47     ` Cédric Le Goater
  0 siblings, 1 reply; 30+ messages in thread
From: David Gibson @ 2018-12-17  6:07 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 1107 bytes --]

On Tue, Dec 11, 2018 at 11:38:23PM +0100, Cédric Le Goater wrote:
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Uh.. this will need something to keep the default at POWER8 for the
older machine types.

> ---
>  hw/ppc/spapr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 97a5e3c9929f..705d3f4057b0 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3938,7 +3938,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      hc->unplug = spapr_machine_device_unplug;
>  
>      smc->dr_lmb_enabled = true;
> -    mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
> +    mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
>      mc->has_hotpluggable_cpus = true;
>      smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED;
>      fwc->get_dev_path = spapr_get_fw_dev_path;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 07/12] spapr: add an extra OV5 field to the sPAPR IRQ backend
  2018-12-17  5:27   ` David Gibson
@ 2018-12-17  8:02     ` Cédric Le Goater
  0 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-17  8:02 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

On 12/17/18 6:27 AM, David Gibson wrote:
> On Tue, Dec 11, 2018 at 11:38:18PM +0100, Cédric Le Goater wrote:
>> The interrupt modes supported by the hypervisor are advertised to the
>> guest with new bits definitions of the option vector 5 of property
>> "ibm,arch-vec-5-platform-support. The byte 23 bits 0-1 of the OV5 are
>> defined as follow :
>>
>>   0b00   PAPR 2.7 and earlier (Legacy systems)
>>   0b01   XIVE Exploitation mode only
>>   0b10   Either available
>>
>> If the client/guest selects the XIVE interrupt mode, it informs the
>> hypervisor by returning the value 0b01 in byte 23 bits 0-1. A 0b00
>> value indicates the use of the XICS interrupt mode (Legacy systems).
>>
>> The sPAPR IRQ backend is extended with these definitions and the
>> values are directly used to populate the "ibm,arch-vec-5-platform-support"
>> property. The interrupt mode is advertised under TCG and under KVM.
>> Although a KVM XIVE device is not yet available, the machine can still
>> operate with kernel_irqchip=off. However, we apply a restriction on
>> the CPU which is required to be a POWER9 when a XIVE interrupt
>> controller is in use.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>
>>  Changes since v7:
>>
>>  - improved commit log
>>  - introduced a check on CPU type.
>>
>>  include/hw/ppc/spapr.h     |  6 ++++++
>>  include/hw/ppc/spapr_irq.h |  1 +
>>  hw/ppc/spapr.c             | 36 ++++++++++++++++++++++++++++++------
>>  hw/ppc/spapr_irq.c         |  3 +++
>>  4 files changed, 40 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 6bf028a02fe2..06765b4e9d79 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -824,5 +824,11 @@ int spapr_caps_post_migration(sPAPRMachineState *spapr);
>>  
>>  void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
>>                            Error **errp);
>> +/*
>> + * XIVE definitions
>> + */
>> +#define SPAPR_OV5_XIVE_LEGACY   0x0
>> +#define SPAPR_OV5_XIVE_EXPLOIT  0x40
>> +#define SPAPR_OV5_XIVE_BOTH     0x80 /* Only to advertise on the platform */
>>  
>>  #endif /* HW_SPAPR_H */
>> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
>> index 63061a009b4c..b34d5a00381b 100644
>> --- a/include/hw/ppc/spapr_irq.h
>> +++ b/include/hw/ppc/spapr_irq.h
>> @@ -33,6 +33,7 @@ void spapr_irq_msi_reset(sPAPRMachineState *spapr);
>>  typedef struct sPAPRIrq {
>>      uint32_t    nr_irqs;
>>      uint32_t    nr_msis;
>> +    uint8_t     ov5;
>>  
>>      void (*init)(sPAPRMachineState *spapr, Error **errp);
>>      int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 57c0066edd56..ff138f224a87 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1095,12 +1095,14 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void *fdt)
>>      spapr_dt_rtas_tokens(fdt, rtas);
>>  }
>>  
>> -/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU features
>> - * that the guest may request and thus the valid values for bytes 24..26 of
>> - * option vector 5: */
>> -static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
>> +/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU
>> + * and the XIVE features that the guest may request and thus the valid
>> + * values for bytes 23..26 of option vector 5: */
>> +static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
>> +                                          int chosen)
>>  {
>>      PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
>> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>>  
>>      char val[2 * 4] = {
>>          23, 0x00, /* Xive mode, filled in below. */
>> @@ -1111,9 +1113,17 @@ static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
>>  
>>      if (!ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
>>                            first_ppc_cpu->compat_pvr)) {
>> -        /* If we're in a pre POWER9 compat mode then the guest should do hash */
>> +        /* If we're in a pre POWER9 compat mode then the guest should do hash
>> +         * and use the legacy interrupt mode
>> +         */
>> +        val[1] = 0x00; /* XICS */
>>          val[3] = 0x00; /* Hash */
>>      } else if (kvm_enabled()) {
>> +        /* If the KVM XIVE device is not available, the machine can
>> +         * still operate with kernel_irqchip=off
>> +         */
>> +        val[1] = smc->irq->ov5;
> 
> 
> I know it means another layer of if, but I'd prefer to rearrange this
> so setting the intc value is outside the if (kvm_enabled())..

That's ok. I expected some discussion on the cpu->intc pointer and we 
have the cross build problem to fix also.

>> +
>>          if (kvmppc_has_cap_mmu_radix() && kvmppc_has_cap_mmu_hash_v3()) {
>>              val[3] = 0x80; /* OV5_MMU_BOTH */
>>          } else if (kvmppc_has_cap_mmu_radix()) {
>> @@ -1122,6 +1132,9 @@ static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
>>              val[3] = 0x00; /* Hash */
>>          }
>>      } else {
>> +        /* In TCG, the interrupt mode is defined by the pseries machine */
>> +        val[1] = smc->irq->ov5;
> 
> ..rather than duplicated in each branch.

ok.

Thanks,

C.


>> +
>>          /* V3 MMU supports both hash and radix in tcg (with dynamic switching) */
>>          val[3] = 0xC0;
>>      }
>> @@ -1189,7 +1202,7 @@ static void spapr_dt_chosen(sPAPRMachineState *spapr, void *fdt)
>>          _FDT(fdt_setprop_string(fdt, chosen, "stdout-path", stdout_path));
>>      }
>>  
>> -    spapr_dt_ov5_platform_support(fdt, chosen);
>> +    spapr_dt_ov5_platform_support(spapr, fdt, chosen);
>>  
>>      g_free(stdout_path);
>>      g_free(bootlist);
>> @@ -2622,6 +2635,17 @@ static void spapr_machine_init(MachineState *machine)
>>      /* advertise support for ibm,dyamic-memory-v2 */
>>      spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
>>  
>> +    /* advertise XIVE on POWER9 machines */
>> +    if (smc->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
>> +        if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
>> +                                  0, spapr->max_compat_pvr)) {
>> +            spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
>> +        } else {
>> +            error_report("XIVE-only machines require a POWER9 CPU");
>> +            exit(1);
>> +        }
>> +    }
>> +
>>      /* init CPUs */
>>      spapr_init_cpus(spapr);
>>  
>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
>> index f835ea939cbf..79f06308995b 100644
>> --- a/hw/ppc/spapr_irq.c
>> +++ b/hw/ppc/spapr_irq.c
>> @@ -216,6 +216,7 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
>>  sPAPRIrq spapr_irq_xics = {
>>      .nr_irqs     = SPAPR_IRQ_XICS_NR_IRQS,
>>      .nr_msis     = SPAPR_IRQ_XICS_NR_MSIS,
>> +    .ov5         = SPAPR_OV5_XIVE_LEGACY,
>>  
>>      .init        = spapr_irq_init_xics,
>>      .claim       = spapr_irq_claim_xics,
>> @@ -342,6 +343,7 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
>>  sPAPRIrq spapr_irq_xive = {
>>      .nr_irqs     = SPAPR_IRQ_XIVE_NR_IRQS,
>>      .nr_msis     = SPAPR_IRQ_XIVE_NR_MSIS,
>> +    .ov5         = SPAPR_OV5_XIVE_EXPLOIT,
>>  
>>      .init        = spapr_irq_init_xive,
>>      .claim       = spapr_irq_claim_xive,
>> @@ -466,6 +468,7 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool align, Error **errp)
>>  sPAPRIrq spapr_irq_xics_legacy = {
>>      .nr_irqs     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
>>      .nr_msis     = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
>> +    .ov5         = SPAPR_OV5_XIVE_LEGACY,
>>  
>>      .init        = spapr_irq_init_xics,
>>      .claim       = spapr_irq_claim_xics,
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9)
  2018-12-17  5:29 ` [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) David Gibson
@ 2018-12-17 10:16   ` Cédric Le Goater
  0 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-17 10:16 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

On 12/17/18 6:29 AM, David Gibson wrote:
> On Tue, Dec 11, 2018 at 11:38:11PM +0100, Cédric Le Goater wrote:
>> Hello,
>>
>> Here is the version 8 of the QEMU models adding support for the XIVE
>> interrupt controller to the sPAPR machine, under TCG only. KVM support
>> will be proposed in an other patchset, along with the KVM XIVE device
>> patchset, so will PowerNV.
>>
>> The most important changes for sPAPR are the introduction of a 'ic-mode'
>> machine option to select the interrupt mode of the machine and the fix
>> for CPU hotplug.
> 
> I've applied 1-6, I have some minor comments on 7, still reviewing the
> remainder.
> 
> We also need to fix that Windows build bug - I'm planning to fold /
> insert that fix into the series so as not to break bisection.

ok. I will keep the Windows build fixes as standalone patches. 

There is another one the hcalls, but, may be, you can fold the fix in 
the hcall patch directly.  
 
Thanks,

C.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 12/12] spapr: change default CPU type to POWER9
  2018-12-17  6:07   ` David Gibson
@ 2018-12-17 10:47     ` Cédric Le Goater
  0 siblings, 0 replies; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-17 10:47 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

On 12/17/18 7:07 AM, David Gibson wrote:
> On Tue, Dec 11, 2018 at 11:38:23PM +0100, Cédric Le Goater wrote:
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> 
> Uh.. this will need something to keep the default at POWER8 for the
> older machine types.

:) Yes. That was an excess of enthusiasm for my part I suppose.

C.

>> ---
>>  hw/ppc/spapr.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 97a5e3c9929f..705d3f4057b0 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -3938,7 +3938,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>      hc->unplug = spapr_machine_device_unplug;
>>  
>>      smc->dr_lmb_enabled = true;
>> -    mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
>> +    mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
>>      mc->has_hotpluggable_cpus = true;
>>      smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED;
>>      fwc->get_dev_path = spapr_get_fw_dev_path;
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS
  2018-12-17  6:07   ` David Gibson
@ 2018-12-17 22:38     ` Cédric Le Goater
  2018-12-19 19:15       ` Cédric Le Goater
  0 siblings, 1 reply; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-17 22:38 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

On 12/17/18 7:07 AM, David Gibson wrote:
> On Tue, Dec 11, 2018 at 11:38:22PM +0100, Cédric Le Goater wrote:
>> The 'dual' sPAPR IRQ backend supports both interrupt mode, XIVE
>> exploitation mode and the legacy compatibility mode (XICS). both modes
>> are not supported at the same time.
>>
>> The machine starts with the legacy mode and a new interrupt mode can
>> then be negotiated by the CAS process. In this case, the new mode is
>> activated after a reset to take into account the required changes in
>> the machine. These impact the device tree layout, the interrupt
>> presenter object and the exposed MMIO regions in the case of XIVE.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>
>>  Changes since v7:
>>
>>  - usage of 'ic-mode' machine option
>>
>>  include/hw/ppc/spapr_irq.h |   1 +
>>  hw/ppc/spapr.c             |  10 ++-
>>  hw/ppc/spapr_hcall.c       |  11 +++
>>  hw/ppc/spapr_irq.c         | 143 +++++++++++++++++++++++++++++++++++++
>>  4 files changed, 162 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
>> index b34d5a00381b..29936498dbc8 100644
>> --- a/include/hw/ppc/spapr_irq.h
>> +++ b/include/hw/ppc/spapr_irq.h
>> @@ -51,6 +51,7 @@ typedef struct sPAPRIrq {
>>  extern sPAPRIrq spapr_irq_xics;
>>  extern sPAPRIrq spapr_irq_xics_legacy;
>>  extern sPAPRIrq spapr_irq_xive;
>> +extern sPAPRIrq spapr_irq_dual;
>>  
>>  void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
>>  int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 5d985e38a598..97a5e3c9929f 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -2636,11 +2636,11 @@ static void spapr_machine_init(MachineState *machine)
>>      spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
>>  
>>      /* advertise XIVE on POWER9 machines */
>> -    if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
>> +    if (spapr->irq->ov5 & (SPAPR_OV5_XIVE_EXPLOIT | SPAPR_OV5_XIVE_BOTH)) {
>>          if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
>>                                    0, spapr->max_compat_pvr)) {
>>              spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
>> -        } else {
>> +        } else if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
>>              error_report("XIVE-only machines require a POWER9 CPU");
>>              exit(1);
>>          }
>> @@ -3066,6 +3066,8 @@ static char *spapr_get_ic_mode(Object *obj, Error **errp)
>>          return g_strdup("xics");
>>      } else if (spapr->irq == &spapr_irq_xive) {
>>          return g_strdup("xive");
>> +    } else if (spapr->irq == &spapr_irq_dual) {
>> +        return g_strdup("dual");
>>      }
>>      g_assert_not_reached();
>>  }
>> @@ -3079,6 +3081,8 @@ static void spapr_set_ic_mode(Object *obj, const char *value, Error **errp)
>>          spapr->irq = &spapr_irq_xics;
>>      } else if (strcmp(value, "xive") == 0) {
>>          spapr->irq = &spapr_irq_xive;
>> +    } else if (strcmp(value, "dual") == 0) {
>> +        spapr->irq = &spapr_irq_dual;
>>      } else {
>>          error_setg(errp, "Bad value for \"ic-mode\" property");
>>      }
>> @@ -3127,7 +3131,7 @@ static void spapr_instance_init(Object *obj)
>>      object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
>>                              spapr_set_ic_mode, NULL);
>>      object_property_set_description(obj, "ic-mode",
>> -                 "Specifies the interrupt controller mode (xics, xive)",
>> +                 "Specifies the interrupt controller mode (xics, xive, dual)",
>>                   NULL);
>>  }
>>  
>> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
>> index ae913d070f50..09386458f267 100644
>> --- a/hw/ppc/spapr_hcall.c
>> +++ b/hw/ppc/spapr_hcall.c
>> @@ -1654,6 +1654,17 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
>>              (spapr_h_cas_compose_response(spapr, args[1], args[2],
>>                                            ov5_updates) != 0);
>>      }
>> +
>> +    /*
>> +     * Generate a machine reset when we have an update of the
>> +     * interrupt mode. Only required on the machine supporting both
>> +     * mode.
>> +     */
>> +    if (!spapr->cas_reboot) {
>> +        spapr->cas_reboot = spapr_ovec_test(ov5_updates, OV5_XIVE_EXPLOIT)
>> +            && spapr->irq->ov5 & SPAPR_OV5_XIVE_BOTH;
>> +    }
>> +
>>      spapr_ovec_cleanup(ov5_updates);
>>  
>>      if (spapr->cas_reboot) {
>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
>> index b1319905327f..31d195b08952 100644
>> --- a/hw/ppc/spapr_irq.c
>> +++ b/hw/ppc/spapr_irq.c
>> @@ -375,6 +375,149 @@ sPAPRIrq spapr_irq_xive = {
>>      .reset       = spapr_irq_reset_xive,
>>  };
>>  
>> +/*
>> + * Dual XIVE and XICS IRQ backend.
>> + *
>> + * Both interrupt mode, XIVE and XICS, objects are created but the
>> + * machine starts in legacy interrupt mode (XICS). It can be changed
>> + * by the CAS negotiation process and, in that case, the new mode is
>> + * activated after extra machine reset.
>> + */
>> +
>> +/*
>> + * Returns the sPAPR IRQ backend negotiated by CAS. XICS is the
>> + * default.
>> + */
>> +static sPAPRIrq *spapr_irq_current(sPAPRMachineState *spapr)
>> +{
>> +    return spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT) ?
>> +        &spapr_irq_xive : &spapr_irq_xics;
>> +}
>> +
>> +static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
>> +{
>> +    MachineState *machine = MACHINE(spapr);
>> +    Error *local_err = NULL;
>> +
>> +    if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
>> +        error_setg(errp, "No KVM support for the 'dual' machine");
>> +        return;
>> +    }
>> +
>> +    spapr_irq_xics.init(spapr, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +        return;
>> +    }
>> +
>> +    spapr_irq_xive.init(spapr, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +        return;
>> +    }
>> +}
>> +
>> +static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq, bool lsi,
>> +                                Error **errp)
>> +{
>> +    int ret;
>> +    Error *local_err = NULL;
>> +
>> +    ret = spapr_irq_xive.claim(spapr, irq, lsi, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +        return ret;
>> +    }
>> +
>> +    ret = spapr_irq_xics.claim(spapr, irq, lsi, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static void spapr_irq_free_dual(sPAPRMachineState *spapr, int irq, int num)
>> +{
>> +    spapr_irq_xive.free(spapr, irq, num);
>> +    spapr_irq_xics.free(spapr, irq, num);
>> +}
>> +
>> +static qemu_irq spapr_qirq_dual(sPAPRMachineState *spapr, int irq)
>> +{
>> +    return spapr_irq_current(spapr)->qirq(spapr, irq);
>> +}
> 
> This still makes me really nervous - I'd really prefer to have qirqs
> independent of the backend, rather than relying on *every* irq using
> device never looking up qirqs in advance.

I will take a look. This is a large rework I won't have time to address 
this year. I have removed the dual machine from v9.

You would move the qirq array at the machine level ?  

Thanks,

C.


> 
>> +static void spapr_irq_print_info_dual(sPAPRMachineState *spapr, Monitor *mon)
>> +{
>> +    spapr_irq_current(spapr)->print_info(spapr, mon);
>> +}
>> +
>> +static void spapr_irq_dt_populate_dual(sPAPRMachineState *spapr,
>> +                                       uint32_t nr_servers, void *fdt,
>> +                                       uint32_t phandle)
>> +{
>> +    spapr_irq_current(spapr)->dt_populate(spapr, nr_servers, fdt, phandle);
>> +}
>> +
>> +static Object *spapr_irq_cpu_intc_create_dual(sPAPRMachineState *spapr,
>> +                                              Object *cpu, Error **errp)
>> +{
>> +    Error *local_err = NULL;
>> +
>> +    spapr_irq_xive.cpu_intc_create(spapr, cpu, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +        return NULL;
>> +    }
>> +
>> +    /* Default to XICS interrupt mode */
>> +    return spapr_irq_xics.cpu_intc_create(spapr, cpu, errp);
>> +}
>> +
>> +static int spapr_irq_post_load_dual(sPAPRMachineState *spapr, int version_id)
>> +{
>> +    /*
>> +     * Force a reset of the XIVE backend after migration. The machine
>> +     * defaults to XICS at startup.
>> +     */
>> +    if (spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT)) {
>> +        spapr_irq_xive.reset(spapr, &error_fatal);
>> +    }
>> +
>> +    return spapr_irq_current(spapr)->post_load(spapr, version_id);
>> +}
>> +
>> +static void spapr_irq_reset_dual(sPAPRMachineState *spapr, Error **errp)
>> +{
>> +    /*
>> +     * Reset the interrupt mode selected by CAS.
>> +     */
>> +    spapr_irq_current(spapr)->reset(spapr, errp);
>> +}
>> +
>> +/*
>> + * Define values in sync with the XIVE and XICS backend
>> + */
>> +#define SPAPR_IRQ_DUAL_NR_IRQS     0x2000
>> +#define SPAPR_IRQ_DUAL_NR_MSIS     (SPAPR_IRQ_DUAL_NR_IRQS - SPAPR_IRQ_MSI)
>> +
>> +sPAPRIrq spapr_irq_dual = {
>> +    .nr_irqs     = SPAPR_IRQ_DUAL_NR_IRQS,
>> +    .nr_msis     = SPAPR_IRQ_DUAL_NR_MSIS,
>> +    .ov5         = SPAPR_OV5_XIVE_BOTH,
>> +
>> +    .init        = spapr_irq_init_dual,
>> +    .claim       = spapr_irq_claim_dual,
>> +    .free        = spapr_irq_free_dual,
>> +    .qirq        = spapr_qirq_dual,
>> +    .print_info  = spapr_irq_print_info_dual,
>> +    .dt_populate = spapr_irq_dt_populate_dual,
>> +    .cpu_intc_create = spapr_irq_cpu_intc_create_dual,
>> +    .post_load   = spapr_irq_post_load_dual,
>> +    .reset       = spapr_irq_reset_dual,
>> +};
>> +
>>  /*
>>   * sPAPR IRQ frontend routines for devices
>>   */
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset
  2018-12-17  6:01     ` David Gibson
@ 2018-12-17 22:41       ` Cédric Le Goater
  2018-12-18  0:00         ` David Gibson
  0 siblings, 1 reply; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-17 22:41 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

On 12/17/18 7:01 AM, David Gibson wrote:
> On Thu, Dec 13, 2018 at 01:52:14PM +0100, Cédric Le Goater wrote:
>> On 12/11/18 11:38 PM, Cédric Le Goater wrote:
>>> Currently, the interrupt presenter of the vCPU is set at realize
>>> time. Setting it at reset will become necessary when the new machine
>>> supporting both interrupt modes is introduced. In this machine, the
>>> interrupt mode is chosen at CAS time and activated after a reset.
>>
>> I think we need a second '->intc' pointer specific to XIVE instead. 
>> How about ->intc_xive ?
> 
> Ah, interesting.  If we're going to need that, we might as well go
> to specific ->icp and ->tctx pointers with the right types.

Hmm, PowerPCCPU is defined under target/ppc and adding the type might 
add some noise. I will check.

> I don't particularly like having those machine specific pointers
> within the cpu structure, but we can look at fixing that later by
> reviving my patches to move various machine specific stuff within the
> cpu into a separate sub-allocation.

ok.

Thanks,

C. 

>>
>> We could just drop this patch and easly get rid of the XiveTCTX object 
>> leak in spapr_unrealize_vcpu().
>>
>> C. 
>>
>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>> ---
>>>
>>>  Changes since v7:
>>>
>>>  - introduced spapr_irq_reset_xics(). 
>>>
>>>  include/hw/ppc/spapr_cpu_core.h |  2 ++
>>>  hw/ppc/spapr_cpu_core.c         | 26 ++++++++++++++++++++++++++
>>>  hw/ppc/spapr_irq.c              | 13 +++++++++++++
>>>  3 files changed, 41 insertions(+)
>>>
>>> diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
>>> index 9e2821e4b31f..fc8ea9021656 100644
>>> --- a/include/hw/ppc/spapr_cpu_core.h
>>> +++ b/include/hw/ppc/spapr_cpu_core.h
>>> @@ -53,4 +53,6 @@ static inline sPAPRCPUState *spapr_cpu_state(PowerPCCPU *cpu)
>>>      return (sPAPRCPUState *)cpu->machine_data;
>>>  }
>>>  
>>> +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type);
>>> +
>>>  #endif
>>> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
>>> index 82666436e9b4..afc5d4f4e739 100644
>>> --- a/hw/ppc/spapr_cpu_core.c
>>> +++ b/hw/ppc/spapr_cpu_core.c
>>> @@ -397,3 +397,29 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
>>>  };
>>>  
>>>  DEFINE_TYPES(spapr_cpu_core_type_infos)
>>> +
>>> +typedef struct ForeachFindIntCArgs {
>>> +    const char *intc_type;
>>> +    Object *intc;
>>> +} ForeachFindIntCArgs;
>>> +
>>> +static int spapr_cpu_core_find_intc(Object *child, void *opaque)
>>> +{
>>> +    ForeachFindIntCArgs *args = opaque;
>>> +
>>> +    if (object_dynamic_cast(child, args->intc_type)) {
>>> +        args->intc = child;
>>> +    }
>>> +
>>> +    return args->intc != NULL;
>>> +}
>>> +
>>> +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type)
>>> +{
>>> +    ForeachFindIntCArgs args = { intc_type, NULL };
>>> +
>>> +    object_child_foreach(OBJECT(cpu), spapr_cpu_core_find_intc, &args);
>>> +    g_assert(args.intc);
>>> +
>>> +    cpu->intc = args.intc;
>>> +}
>>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
>>> index 0999a2b2d69c..814500f22d34 100644
>>> --- a/hw/ppc/spapr_irq.c
>>> +++ b/hw/ppc/spapr_irq.c
>>> @@ -12,6 +12,7 @@
>>>  #include "qemu/error-report.h"
>>>  #include "qapi/error.h"
>>>  #include "hw/ppc/spapr.h"
>>> +#include "hw/ppc/spapr_cpu_core.h"
>>>  #include "hw/ppc/spapr_xive.h"
>>>  #include "hw/ppc/xics.h"
>>>  #include "sysemu/kvm.h"
>>> @@ -208,6 +209,15 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
>>>      return 0;
>>>  }
>>>  
>>> +static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
>>> +{
>>> +   CPUState *cs;
>>> +
>>> +    CPU_FOREACH(cs) {
>>> +        spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
>>> +    }
>>> +}
>>> +
>>>  #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
>>>  #define SPAPR_IRQ_XICS_NR_MSIS     \
>>>      (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
>>> @@ -225,6 +235,7 @@ sPAPRIrq spapr_irq_xics = {
>>>      .dt_populate = spapr_dt_xics,
>>>      .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
>>>      .post_load   = spapr_irq_post_load_xics,
>>> +    .reset       = spapr_irq_reset_xics,
>>>  };
>>>  
>>>  /*
>>> @@ -325,6 +336,8 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
>>>      CPU_FOREACH(cs) {
>>>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>>>  
>>> +        spapr_cpu_core_set_intc(cpu, TYPE_XIVE_TCTX);
>>> +
>>>          /* (TCG) Set the OS CAM line of the thread interrupt context. */
>>>          spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
>>>      }
>>>
>>
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset
  2018-12-17 22:41       ` Cédric Le Goater
@ 2018-12-18  0:00         ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2018-12-18  0:00 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 5401 bytes --]

On Mon, Dec 17, 2018 at 11:41:34PM +0100, Cédric Le Goater wrote:
> On 12/17/18 7:01 AM, David Gibson wrote:
> > On Thu, Dec 13, 2018 at 01:52:14PM +0100, Cédric Le Goater wrote:
> >> On 12/11/18 11:38 PM, Cédric Le Goater wrote:
> >>> Currently, the interrupt presenter of the vCPU is set at realize
> >>> time. Setting it at reset will become necessary when the new machine
> >>> supporting both interrupt modes is introduced. In this machine, the
> >>> interrupt mode is chosen at CAS time and activated after a reset.
> >>
> >> I think we need a second '->intc' pointer specific to XIVE instead. 
> >> How about ->intc_xive ?
> > 
> > Ah, interesting.  If we're going to need that, we might as well go
> > to specific ->icp and ->tctx pointers with the right types.
> 
> Hmm, PowerPCCPU is defined under target/ppc and adding the type might 
> add some noise. I will check.

It's a pointer, so you can just declare the type without a
definition.  That should be ok.

> > I don't particularly like having those machine specific pointers
> > within the cpu structure, but we can look at fixing that later by
> > reviving my patches to move various machine specific stuff within the
> > cpu into a separate sub-allocation.
> 
> ok.
> 
> Thanks,
> 
> C. 
> 
> >>
> >> We could just drop this patch and easly get rid of the XiveTCTX object 
> >> leak in spapr_unrealize_vcpu().
> >>
> >> C. 
> >>
> >>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >>> ---
> >>>
> >>>  Changes since v7:
> >>>
> >>>  - introduced spapr_irq_reset_xics(). 
> >>>
> >>>  include/hw/ppc/spapr_cpu_core.h |  2 ++
> >>>  hw/ppc/spapr_cpu_core.c         | 26 ++++++++++++++++++++++++++
> >>>  hw/ppc/spapr_irq.c              | 13 +++++++++++++
> >>>  3 files changed, 41 insertions(+)
> >>>
> >>> diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
> >>> index 9e2821e4b31f..fc8ea9021656 100644
> >>> --- a/include/hw/ppc/spapr_cpu_core.h
> >>> +++ b/include/hw/ppc/spapr_cpu_core.h
> >>> @@ -53,4 +53,6 @@ static inline sPAPRCPUState *spapr_cpu_state(PowerPCCPU *cpu)
> >>>      return (sPAPRCPUState *)cpu->machine_data;
> >>>  }
> >>>  
> >>> +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type);
> >>> +
> >>>  #endif
> >>> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> >>> index 82666436e9b4..afc5d4f4e739 100644
> >>> --- a/hw/ppc/spapr_cpu_core.c
> >>> +++ b/hw/ppc/spapr_cpu_core.c
> >>> @@ -397,3 +397,29 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
> >>>  };
> >>>  
> >>>  DEFINE_TYPES(spapr_cpu_core_type_infos)
> >>> +
> >>> +typedef struct ForeachFindIntCArgs {
> >>> +    const char *intc_type;
> >>> +    Object *intc;
> >>> +} ForeachFindIntCArgs;
> >>> +
> >>> +static int spapr_cpu_core_find_intc(Object *child, void *opaque)
> >>> +{
> >>> +    ForeachFindIntCArgs *args = opaque;
> >>> +
> >>> +    if (object_dynamic_cast(child, args->intc_type)) {
> >>> +        args->intc = child;
> >>> +    }
> >>> +
> >>> +    return args->intc != NULL;
> >>> +}
> >>> +
> >>> +void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type)
> >>> +{
> >>> +    ForeachFindIntCArgs args = { intc_type, NULL };
> >>> +
> >>> +    object_child_foreach(OBJECT(cpu), spapr_cpu_core_find_intc, &args);
> >>> +    g_assert(args.intc);
> >>> +
> >>> +    cpu->intc = args.intc;
> >>> +}
> >>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> >>> index 0999a2b2d69c..814500f22d34 100644
> >>> --- a/hw/ppc/spapr_irq.c
> >>> +++ b/hw/ppc/spapr_irq.c
> >>> @@ -12,6 +12,7 @@
> >>>  #include "qemu/error-report.h"
> >>>  #include "qapi/error.h"
> >>>  #include "hw/ppc/spapr.h"
> >>> +#include "hw/ppc/spapr_cpu_core.h"
> >>>  #include "hw/ppc/spapr_xive.h"
> >>>  #include "hw/ppc/xics.h"
> >>>  #include "sysemu/kvm.h"
> >>> @@ -208,6 +209,15 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
> >>>      return 0;
> >>>  }
> >>>  
> >>> +static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
> >>> +{
> >>> +   CPUState *cs;
> >>> +
> >>> +    CPU_FOREACH(cs) {
> >>> +        spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
> >>> +    }
> >>> +}
> >>> +
> >>>  #define SPAPR_IRQ_XICS_NR_IRQS     0x1000
> >>>  #define SPAPR_IRQ_XICS_NR_MSIS     \
> >>>      (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
> >>> @@ -225,6 +235,7 @@ sPAPRIrq spapr_irq_xics = {
> >>>      .dt_populate = spapr_dt_xics,
> >>>      .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
> >>>      .post_load   = spapr_irq_post_load_xics,
> >>> +    .reset       = spapr_irq_reset_xics,
> >>>  };
> >>>  
> >>>  /*
> >>> @@ -325,6 +336,8 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
> >>>      CPU_FOREACH(cs) {
> >>>          PowerPCCPU *cpu = POWERPC_CPU(cs);
> >>>  
> >>> +        spapr_cpu_core_set_intc(cpu, TYPE_XIVE_TCTX);
> >>> +
> >>>          /* (TCG) Set the OS CAM line of the thread interrupt context. */
> >>>          spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
> >>>      }
> >>>
> >>
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS
  2018-12-17 22:38     ` Cédric Le Goater
@ 2018-12-19 19:15       ` Cédric Le Goater
  2018-12-21  1:50         ` David Gibson
  0 siblings, 1 reply; 30+ messages in thread
From: Cédric Le Goater @ 2018-12-19 19:15 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[ ... ]

>>> +static qemu_irq spapr_qirq_dual(sPAPRMachineState *spapr, int irq)
>>> +{
>>> +    return spapr_irq_current(spapr)->qirq(spapr, irq);
>>> +}
>>
>> This still makes me really nervous - I'd really prefer to have qirqs
>> independent of the backend, rather than relying on *every* irq using
>> device never looking up qirqs in advance.
> 
> I will take a look. This is a large rework I won't have time to address 
> this year. I have removed the dual machine from v9.
> 
> You would move the qirq array at the machine level ?  

I took a look today and did a few changes : 

 - move the qirq array at the machine level
 - introduced a 'set_irq' method to sPAPR IRQ
 - adapted the 'qirq' method of sPAPR IRQ. We still need to perform some
   checks and to handle the IRQ number offset.

It falls well in place, a part for the ICS source of the PnvPSI model 
which does not have any qirq anymore. For PSI, I am thinking of moving 
the qirq array under PnvPSI model, like I did for the machine. 

Would that be ok ?

I think there are a couple more possible cleanups on the different ICS 
models to do if these changes are acceptable. 

C. 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS
  2018-12-19 19:15       ` Cédric Le Goater
@ 2018-12-21  1:50         ` David Gibson
  0 siblings, 0 replies; 30+ messages in thread
From: David Gibson @ 2018-12-21  1:50 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-ppc, qemu-devel, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 1721 bytes --]

On Wed, Dec 19, 2018 at 08:15:36PM +0100, Cédric Le Goater wrote:
> [ ... ]
> 
> >>> +static qemu_irq spapr_qirq_dual(sPAPRMachineState *spapr, int irq)
> >>> +{
> >>> +    return spapr_irq_current(spapr)->qirq(spapr, irq);
> >>> +}
> >>
> >> This still makes me really nervous - I'd really prefer to have qirqs
> >> independent of the backend, rather than relying on *every* irq using
> >> device never looking up qirqs in advance.
> > 
> > I will take a look. This is a large rework I won't have time to address 
> > this year. I have removed the dual machine from v9.
> > 
> > You would move the qirq array at the machine level ?  
> 
> I took a look today and did a few changes : 
> 
>  - move the qirq array at the machine level
>  - introduced a 'set_irq' method to sPAPR IRQ
>  - adapted the 'qirq' method of sPAPR IRQ. We still need to perform some
>    checks and to handle the IRQ number offset.
> 
> It falls well in place, a part for the ICS source of the PnvPSI model 
> which does not have any qirq anymore. For PSI, I am thinking of moving 
> the qirq array under PnvPSI model, like I did for the machine. 
> 
> Would that be ok ?

That sounds reasonable.  I'd been thinking of having a qirq array at
the machine level which dispatched to other qirq arrays at the ICS or
XiveSource levels, but if you don't need that, that's ok too.
> 
> I think there are a couple more possible cleanups on the different ICS 
> models to do if these changes are acceptable. 
> 
> C. 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2018-12-21  3:11 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-11 22:38 [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 01/12] spapr: introduce a new machine IRQ backend for XIVE Cédric Le Goater
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 02/12] spapr: add hcalls support for the XIVE exploitation interrupt mode Cédric Le Goater
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 03/12] spapr: add device tree support for the XIVE exploitation mode Cédric Le Goater
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 04/12] spapr: allocate the interrupt thread context under the CPU core Cédric Le Goater
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 05/12] spapr: extend the sPAPR IRQ backend for XICS migration Cédric Le Goater
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 06/12] spapr: add a 'reset' method to the sPAPR IRQ backend Cédric Le Goater
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 07/12] spapr: add an extra OV5 field " Cédric Le Goater
2018-12-17  5:27   ` David Gibson
2018-12-17  8:02     ` Cédric Le Goater
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 08/12] spapr: introduce an 'ic-mode' machine option Cédric Le Goater
2018-12-17  5:35   ` David Gibson
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset Cédric Le Goater
2018-12-13 12:52   ` Cédric Le Goater
2018-12-17  6:01     ` David Gibson
2018-12-17 22:41       ` Cédric Le Goater
2018-12-18  0:00         ` David Gibson
2018-12-17  5:37   ` David Gibson
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 10/12] spapr: enable XIVE MMIOs " Cédric Le Goater
2018-12-17  6:04   ` David Gibson
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS Cédric Le Goater
2018-12-17  6:07   ` David Gibson
2018-12-17 22:38     ` Cédric Le Goater
2018-12-19 19:15       ` Cédric Le Goater
2018-12-21  1:50         ` David Gibson
2018-12-11 22:38 ` [Qemu-devel] [PATCH v8 12/12] spapr: change default CPU type to POWER9 Cédric Le Goater
2018-12-17  6:07   ` David Gibson
2018-12-17 10:47     ` Cédric Le Goater
2018-12-17  5:29 ` [Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9) David Gibson
2018-12-17 10:16   ` Cédric Le Goater

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.