All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 13/13] spapr: add KVM support to the 'dual' machine
Date: Thu, 14 Feb 2019 14:29:17 +1100	[thread overview]
Message-ID: <20190214032916.GY1884@umbus.fritz.box> (raw)
In-Reply-To: <f0b93ebb-e8b5-4448-9b57-8da2feee493c@kaod.org>

[-- Attachment #1: Type: text/plain, Size: 9717 bytes --]

On Wed, Feb 13, 2019 at 09:22:46AM +0100, Cédric Le Goater wrote:
> On 2/13/19 2:32 AM, David Gibson wrote:
> > On Tue, Feb 12, 2019 at 08:18:19AM +0100, Cédric Le Goater wrote:
> >> On 2/12/19 2:11 AM, David Gibson wrote:
> >>> On Mon, Jan 07, 2019 at 07:39:46PM +0100, Cédric Le Goater wrote:
> >>>> The interrupt mode is chosen by the CAS negotiation process and
> >>>> activated after a reset to take into account the required changes in
> >>>> the machine. This brings new constraints on how the associated KVM IRQ
> >>>> device is initialized.
> >>>>
> >>>> Currently, each model takes care of the initialization of the KVM
> >>>> device in their realize method but this is not possible anymore as the
> >>>> initialization needs to be done globaly when the interrupt mode is
> >>>> known, i.e. when machine is reseted. It also means that we need a way
> >>>> to delete a KVM device when another mode is chosen.
> >>>>
> >>>> Also, to support migration, the QEMU objects holding the state to
> >>>> transfer should always be available but not necessarily activated.
> >>>>
> >>>> The overall approach of this proposal is to initialize both interrupt
> >>>> mode at the QEMU level and keep the IRQ number space in sync to allow
> >>>> switching from one mode to another. For the KVM side of things, the
> >>>> whole initialization of the KVM device, sources and presenters, is
> >>>> grouped in a single routine. The XICS and XIVE sPAPR IRQ reset
> >>>> handlers are modified accordingly to handle the init and the delete
> >>>> sequences of the KVM device.
> >>>>
> >>>> As KVM is now initialized at reset, we loose the possiblity to
> >>>> fallback to the QEMU emulated mode in case of failure and failures
> >>>> become fatal to the machine.
> >>>>
> >>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >>>> ---
> >>>>  hw/intc/spapr_xive.c     |  8 +---
> >>>>  hw/intc/spapr_xive_kvm.c | 27 ++++++++++++++
> >>>>  hw/intc/xics_kvm.c       | 25 +++++++++++++
> >>>>  hw/intc/xive.c           |  4 --
> >>>>  hw/ppc/spapr_irq.c       | 79 ++++++++++++++++++++++++++++------------
> >>>>  5 files changed, 109 insertions(+), 34 deletions(-)
> >>>>
> >>>> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> >>>> index 21f3c1ef0901..0661aca35900 100644
> >>>> --- a/hw/intc/spapr_xive.c
> >>>> +++ b/hw/intc/spapr_xive.c
> >>>> @@ -330,13 +330,7 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
> >>>>      xive->eat = g_new0(XiveEAS, xive->nr_irqs);
> >>>>      xive->endt = g_new0(XiveEND, xive->nr_ends);
> >>>>  
> >>>> -    if (kvmppc_xive_enabled()) {
> >>>> -        kvmppc_xive_connect(xive, &local_err);
> >>>> -        if (local_err) {
> >>>> -            error_propagate(errp, local_err);
> >>>> -            return;
> >>>> -        }
> >>>> -    } else {
> >>>> +    if (!kvmppc_xive_enabled()) {
> >>>>          /* TIMA initialization */
> >>>>          memory_region_init_io(&xive->tm_mmio, OBJECT(xive), &xive_tm_ops, xive,
> >>>>                                "xive.tima", 4ull << TM_SHIFT);
> >>>> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> >>>> index d35814c1992e..3ebc947f2be7 100644
> >>>> --- a/hw/intc/spapr_xive_kvm.c
> >>>> +++ b/hw/intc/spapr_xive_kvm.c
> >>>> @@ -737,6 +737,15 @@ void kvmppc_xive_connect(sPAPRXive *xive, Error **errp)
> >>>>      Error *local_err = NULL;
> >>>>      size_t esb_len;
> >>>>      size_t tima_len;
> >>>> +    CPUState *cs;
> >>>> +
> >>>> +    /*
> >>>> +     * The KVM XIVE device already in use. This is the case when
> >>>> +     * rebooting XIVE -> XIVE
> >>>
> >>> Can this case actually occur?  Further down you appear to
> >>> unconditionally destroy both KVM devices at reset time.
> >>
> >> I guess you are right. I will check.
> >>
> >>>> +     */
> >>>> +    if (xive->fd != -1) {
> >>>> +        return;
> >>>> +    }
> >>>>  
> >>>>      if (!kvm_enabled() || !kvmppc_has_cap_xive()) {
> >>>>          error_setg(errp, "IRQ_XIVE capability must be present for KVM");
> >>>> @@ -800,6 +809,24 @@ void kvmppc_xive_connect(sPAPRXive *xive, Error **errp)
> >>>>      xive->change = qemu_add_vm_change_state_handler(
> >>>>          kvmppc_xive_change_state_handler, xive);
> >>>>  
> >>>> +    /* Connect the presenters to the initial VCPUs of the machine */
> >>>> +    CPU_FOREACH(cs) {
> >>>> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> >>>> +
> >>>> +        kvmppc_xive_cpu_connect(cpu->tctx, &local_err);
> >>>> +        if (local_err) {
> >>>> +            error_propagate(errp, local_err);
> >>>> +            return;
> >>>> +        }
> >>>> +    }
> >>>> +
> >>>> +    /* Update the KVM sources */
> >>>> +    kvmppc_xive_source_reset(xsrc, &local_err);
> >>>> +    if (local_err) {
> >>>> +            error_propagate(errp, local_err);
> >>>> +            return;
> >>>> +    }
> >>>> +
> >>>>      kvm_kernel_irqchip = true;
> >>>>      kvm_msi_via_irqfd_allowed = true;
> >>>>      kvm_gsi_direct_mapping = true;
> >>>> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> >>>> index 1d21ff217b82..bfc35d71df7f 100644
> >>>> --- a/hw/intc/xics_kvm.c
> >>>> +++ b/hw/intc/xics_kvm.c
> >>>> @@ -448,6 +448,16 @@ static void rtas_dummy(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >>>>  int xics_kvm_init(sPAPRMachineState *spapr, Error **errp)
> >>>>  {
> >>>>      int rc;
> >>>> +    CPUState *cs;
> >>>> +    Error *local_err = NULL;
> >>>> +
> >>>> +    /*
> >>>> +     * The KVM XICS device already in use. This is the case when
> >>>> +     * rebooting XICS -> XICS
> >>>> +     */
> >>>> +    if (kernel_xics_fd != -1) {
> >>>> +        return 0;
> >>>> +    }
> >>>>  
> >>>>      if (!kvm_enabled() || !kvm_check_extension(kvm_state, KVM_CAP_IRQ_XICS)) {
> >>>>          error_setg(errp,
> >>>> @@ -496,6 +506,21 @@ int xics_kvm_init(sPAPRMachineState *spapr, Error **errp)
> >>>>      kvm_msi_via_irqfd_allowed = true;
> >>>>      kvm_gsi_direct_mapping = true;
> >>>>  
> >>>> +    /* Connect the presenters to the initial VCPUs of the machine */
> >>>> +    CPU_FOREACH(cs) {
> >>>> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> >>>> +
> >>>> +        icp_kvm_connect(cpu->icp, &local_err);
> >>>> +        if (local_err) {
> >>>> +            error_propagate(errp, local_err);
> >>>> +            goto fail;
> >>>> +        }
> >>>> +        icp_set_kvm_state(cpu->icp, 1);
> >>>> +    }
> >>>> +
> >>>> +    /* Update the KVM sources */
> >>>> +    ics_set_kvm_state(ICS_KVM(spapr->ics), 1);
> >>>> +
> >>>>      return 0;
> >>>>  
> >>>>  fail:
> >>>> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> >>>> index c5c2fbc3f8bc..c166eab5b210 100644
> >>>> --- a/hw/intc/xive.c
> >>>> +++ b/hw/intc/xive.c
> >>>> @@ -932,10 +932,6 @@ static void xive_source_reset(void *dev)
> >>>>  
> >>>>      /* PQs are initialized to 0b01 (Q=1) which corresponds to "ints off" */
> >>>>      memset(xsrc->status, XIVE_ESB_OFF, xsrc->nr_irqs);
> >>>> -
> >>>> -    if (kvmppc_xive_enabled()) {
> >>>> -        kvmppc_xive_source_reset(xsrc, &error_fatal);
> >>>> -    }
> >>>>  }
> >>>>  
> >>>>  static void xive_source_realize(DeviceState *dev, Error **errp)
> >>>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> >>>> index ba27d9d8e972..5592eec3787b 100644
> >>>> --- a/hw/ppc/spapr_irq.c
> >>>> +++ b/hw/ppc/spapr_irq.c
> >>>> @@ -98,20 +98,14 @@ static void spapr_irq_init_xics(sPAPRMachineState *spapr, Error **errp)
> >>>>      int nr_irqs = spapr->irq->nr_irqs;
> >>>>      Error *local_err = NULL;
> >>>>  
> >>>> -    if (kvm_enabled()) {
> >>>> -        if (machine_kernel_irqchip_allowed(machine) &&
> >>>> -            !xics_kvm_init(spapr, &local_err)) {
> >>>> -            spapr->icp_type = TYPE_KVM_ICP;
> >>>> -            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs,
> >>>> -                                          &local_err);
> >>>> -        }
> >>>> -        if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
> >>>> -            error_prepend(&local_err,
> >>>> -                          "kernel_irqchip requested but unavailable: ");
> >>>> -            goto error;
> >>>
> >>> I don't see anything that replaces the irqchip_required logic, which
> >>> doesn't seem right.
> >>
> >> Yes. We do loose the ability to fall back to the emulated device in case
> >> of failure. It is not impossible to do but it will require more changes
> >> to check what are the KVM capabilities before starting the machine.
> > 
> > Uh... it seems more like it's the other way around.  We'll always fall
> > back to emulated, even if we've explicitly said on the command line
> > that we don't want that.
> 
> Ah yes. The init function might be also broken. 
> 
> XICS mode is a bit more difficult to handle than XIVE because we have 
> different object type for the KVM device and the QEMU emulated device, 
> and with the 'dual' mode, we activate the device at CAS reset time.

Yeah.. we should probably fix that.

> Failures being handled at reset time, should we keep the same logic and  
> abort the machine at reset if the kernel irqchip is required ? 
> 
> But we won't be able to fall back on the QEMU emulated device if KVM 
> XICS fails and if the kernel irqchip is only allowed. It should work for 
> XIVE though.

That's fine.  If we've said that kernel irqchip is required, we
shouldn't fall back to emulation.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2019-02-14  3:36 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-07 18:39 [Qemu-devel] [PATCH 00/13] spapr: add KVM support to the XIVE interrupt mode Cédric Le Goater
2019-01-07 18:39 ` [Qemu-devel] [PATCH 01/13] linux-headers: update to 5.0 Cédric Le Goater
2019-01-07 18:39 ` [Qemu-devel] [PATCH 02/13] spapr/xive: add KVM support Cédric Le Goater
2019-02-06  2:39   ` David Gibson
2019-01-07 18:39 ` [Qemu-devel] [PATCH 03/13] spapr/xive: add state synchronization with KVM Cédric Le Goater
2019-02-06  2:42   ` David Gibson
2019-01-07 18:39 ` [Qemu-devel] [PATCH 04/13] spapr/xive: introduce a VM state change handler Cédric Le Goater
2019-02-06  2:49   ` David Gibson
2019-01-07 18:39 ` [Qemu-devel] [PATCH 05/13] spapr/xive: add migration support for KVM Cédric Le Goater
2019-02-07  3:41   ` David Gibson
2019-01-07 18:39 ` [Qemu-devel] [PATCH 06/13] spapr/xive: fix migration of the XiveTCTX under TCG Cédric Le Goater
2019-02-08  5:36   ` David Gibson
2019-02-08  7:12     ` Cédric Le Goater
2019-02-12  0:22       ` David Gibson
2019-02-12  6:58         ` Cédric Le Goater
2019-01-07 18:39 ` [Qemu-devel] [PATCH 07/13] ppc/xics: introduce a icp_kvm_connect() routine Cédric Le Goater
2019-01-07 18:39 ` [Qemu-devel] [PATCH 08/13] spapr/rtas: modify spapr_rtas_register() to remove RTAS handlers Cédric Le Goater
2019-01-29  5:09   ` Alexey Kardashevskiy
2019-01-29  7:20     ` Cédric Le Goater
2019-01-07 18:39 ` [Qemu-devel] [PATCH 09/13] sysbus: add a sysbus_mmio_unmap() helper Cédric Le Goater
2019-01-07 18:39 ` [Qemu-devel] [PATCH 10/13] spapr: introduce routines to delete the KVM IRQ device Cédric Le Goater
2019-02-12  0:58   ` David Gibson
2019-01-07 18:39 ` [Qemu-devel] [PATCH 11/13] spapr: check for the activation of " Cédric Le Goater
2019-02-12  1:01   ` David Gibson
2019-02-12  7:12     ` Cédric Le Goater
2019-02-13  0:17       ` David Gibson
2019-01-07 18:39 ` [Qemu-devel] [PATCH 12/13] spapr/xics: ignore the lower 4K in the IRQ number space Cédric Le Goater
2019-02-12  1:06   ` David Gibson
2019-02-12  7:05     ` Cédric Le Goater
2019-02-13  1:33       ` David Gibson
2019-02-13  8:03         ` Cédric Le Goater
2019-02-13 11:27           ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2019-02-13 12:11             ` Greg Kurz
2019-01-07 18:39 ` [Qemu-devel] [PATCH 13/13] spapr: add KVM support to the 'dual' machine Cédric Le Goater
2019-02-12  1:11   ` David Gibson
2019-02-12  7:18     ` Cédric Le Goater
2019-02-13  1:32       ` David Gibson
2019-02-13  8:22         ` Cédric Le Goater
2019-02-13 10:07           ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2019-02-14  3:35             ` David Gibson
2019-02-14  7:13               ` Cédric Le Goater
2019-02-14  3:29           ` David Gibson [this message]
2019-02-22 12:36         ` [Qemu-devel] " Cédric Le Goater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190214032916.GY1884@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=benh@kernel.crashing.org \
    --cc=clg@kaod.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.