Re: [Qemu-devel] [RFC ppc-next PATCH 2/6] kvm: hw/kvm is not x86-specific

All of lore.kernel.org
 help / color / mirror / Atom feed

* Re: [Qemu-devel] [RFC ppc-next PATCH 2/6] kvm: hw/kvm is not x86-specific
       [not found] ` <1360823521-32306-3-git-send-email-scottwood@freescale.com>
@ 2013-03-21  8:30   ` Alexander Graf
  0 siblings, 0 replies; 40+ messages in thread
From: Alexander Graf @ 2013-03-21  8:30 UTC (permalink / raw)
  To: Scott Wood
  Cc: Paolo Bonzini, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

If you repost this one with a proper patch description and without RFC, I'd be happy to apply it before the in-kernel MPIC implementation is done.


Alex

On 14.02.2013, at 07:31, Scott Wood wrote:

> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> hw/Makefile.objs      |    1 +
> hw/i386/Makefile.objs |    1 -
> hw/kvm/Makefile.objs  |    2 +-
> 3 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> index 447e32a..46bc395 100644
> --- a/hw/Makefile.objs
> +++ b/hw/Makefile.objs
> @@ -216,4 +216,5 @@ obj-$(CONFIG_LINUX) += vfio_pci.o
> endif
> 
> $(obj)/baum.o: QEMU_CFLAGS += $(SDL_CFLAGS) 
> +obj-$(CONFIG_KVM) += kvm/
> endif
> diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
> index 025803a..370f799 100644
> --- a/hw/i386/Makefile.objs
> +++ b/hw/i386/Makefile.objs
> @@ -10,7 +10,6 @@ obj-y += lpc_ich9.o q35.o pc_q35.o
> obj-$(CONFIG_XEN) += xen_platform.o xen_apic.o
> obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
> obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_msi.o
> -obj-y += kvm/
> obj-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
> obj-y += pc-testdev.o
> 
> diff --git a/hw/kvm/Makefile.objs b/hw/kvm/Makefile.objs
> index f620d7f..6ccb6ed 100644
> --- a/hw/kvm/Makefile.objs
> +++ b/hw/kvm/Makefile.objs
> @@ -1 +1 @@
> -obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o pci-assign.o
> +obj-$(TARGET_I386) += clock.o apic.o i8259.o ioapic.o i8254.o pci-assign.o
> -- 
> 1.7.9.5
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
       [not found] ` <1360823521-32306-4-git-send-email-scottwood@freescale.com>
@ 2013-03-21  8:31   ` Alexander Graf
  2013-03-21 10:53     ` Peter Maydell
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21  8:31 UTC (permalink / raw)
  To: Scott Wood
  Cc: Peter Maydell, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 14.02.2013, at 07:31, Scott Wood wrote:

> This is useful for when a user of the memory region API needs to
> communicate the absolute bus address to something outside QEMU
> (in particular, KVM).
> 
> Signed-off-by: Scott Wood <scottwood@freescale.com>

Peter, how does the VGIC implementation handle this?


Alex

> ---
> include/exec/memory.h |    9 +++++++++
> memory.c              |   38 ++++++++++++++++++++++++++++++++++----
> 2 files changed, 43 insertions(+), 4 deletions(-)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 2322732..b800391 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -892,6 +892,15 @@ void *address_space_map(AddressSpace *as, hwaddr addr,
> void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
>                          int is_write, hwaddr access_len);
> 
> +/* memory_region_to_address: Find the full address of the start of the
> + *      given #MemoryRegion, ignoring aliases.  There is no guarantee
> + *      that the #MemoryRegion is actually visible at this address, if
> + *      there are overlapping regions.
> + *
> + * @mr: #MemoryRegion being queried
> + * @asp: if non-NULL, returns the #AddressSpace @mr is mapped in, if any
> + */
> +hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace **asp);
> 
> #endif
> 
> diff --git a/memory.c b/memory.c
> index cd7d5e0..0099f12 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -453,21 +453,51 @@ const IORangeOps memory_region_iorange_ops = {
>     .destructor = memory_region_iorange_destructor,
> };
> 
> -static AddressSpace *memory_region_to_address_space(MemoryRegion *mr)
> +static AddressSpace *memory_region_root_to_address_space(MemoryRegion *mr)
> {
>     AddressSpace *as;
> 
> -    while (mr->parent) {
> -        mr = mr->parent;
> -    }
>     QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
>         if (mr == as->root) {
>             return as;
>         }
>     }
> +
> +    return NULL;
> +}
> +
> +static AddressSpace *memory_region_to_address_space(MemoryRegion *mr)
> +{
> +    AddressSpace *as;
> +
> +    while (mr->parent) {
> +        mr = mr->parent;
> +    }
> +
> +    as = memory_region_root_to_address_space(mr);
> +    if (as) {
> +        return as;
> +    }
> +
>     abort();
> }
> 
> +hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace **asp)
> +{
> +    hwaddr addr = mr->addr;
> +
> +    while (mr->parent) {
> +        mr = mr->parent;
> +        addr += mr->addr;
> +    }
> +
> +    if (asp) {
> +        *asp = memory_region_root_to_address_space(mr);
> +    }
> +
> +    return addr;
> +}
> +
> /* Render a memory region into the global view.  Ranges in @view obscure
>  * ranges in @mr.
>  */
> -- 
> 1.7.9.5
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 5/6] kvm: export result of irqchip config check
       [not found] ` <1360823521-32306-6-git-send-email-scottwood@freescale.com>
@ 2013-03-21  8:34   ` Alexander Graf
  2013-03-21  8:45     ` Jan Kiszka
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21  8:34 UTC (permalink / raw)
  To: Scott Wood
  Cc: Jan Kiszka, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 14.02.2013, at 07:32, Scott Wood wrote:

> This allows platform code to register in-kernel irqchips that
> don't use the legacy KVM_CAP_IRQCHIP/KVM_CREATE_IRQCHIP interface.
> 
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> include/sysemu/kvm.h |   10 ++++++++++
> kvm-all.c            |   11 +++++++++--
> 2 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index f2d97b5..b9a8701 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -45,6 +45,7 @@ extern bool kvm_async_interrupts_allowed;
> extern bool kvm_irqfds_allowed;
> extern bool kvm_msi_via_irqfd_allowed;
> extern bool kvm_gsi_routing_allowed;
> +extern bool kvm_irqchip_wanted;
> 
> #if defined CONFIG_KVM || !defined NEED_CPU_H
> #define kvm_enabled()           (kvm_allowed)
> @@ -97,6 +98,14 @@ extern bool kvm_gsi_routing_allowed;
>  */
> #define kvm_gsi_routing_enabled() (kvm_gsi_routing_allowed)
> 
> +/**
> + * kvm_irqchip_wanted
> + *
> + * Returns: true if the user requested that an in-kernel IRQ chip be
> + * used, regardless of whether support has been detected.
> + */
> +#define kvm_irqchip_wanted() (kvm_irqchip_wanted)
> +
> #else
> #define kvm_enabled()           (0)
> #define kvm_irqchip_in_kernel() (false)
> @@ -104,6 +113,7 @@ extern bool kvm_gsi_routing_allowed;
> #define kvm_irqfds_enabled() (false)
> #define kvm_msi_via_irqfd_enabled() (false)
> #define kvm_gsi_routing_allowed() (false)
> +#define kvm_irqchip_wanted() (false)
> #endif
> 
> struct kvm_run;
> diff --git a/kvm-all.c b/kvm-all.c
> index 04ec2d5..13a628d 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -109,6 +109,7 @@ bool kvm_async_interrupts_allowed;
> bool kvm_irqfds_allowed;
> bool kvm_msi_via_irqfd_allowed;
> bool kvm_gsi_routing_allowed;
> +bool kvm_irqchip_wanted;
> 
> static const KVMCapabilityInfo kvm_required_capabilites[] = {
>     KVM_CAP_INFO(USER_MEMORY),
> @@ -1205,8 +1206,14 @@ static int kvm_irqchip_create(KVMState *s)
> 
>     if (QTAILQ_EMPTY(&list->head) ||
>         !qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
> -                           "kernel_irqchip", true) ||
> -        !kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
> +                           "kernel_irqchip", true)) {
> +        return 0;
> +    }
> +
> +    kvm_irqchip_wanted = true;
> +
> +    /* Platform code may have a different way of enabling an IRQ chip */
> +    if (!kvm_check_extension(s, KVM_CAP_IRQCHIP)) {

Does x86 have the required checks then to make sure it has the CAP?


Alex

>         return 0;
>     }
> 
> -- 
> 1.7.9.5
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 6/6] kvm/openpic: in-kernel mpic support
       [not found] ` <1360823521-32306-7-git-send-email-scottwood@freescale.com>
@ 2013-03-21  8:41   ` Alexander Graf
  2013-03-21 20:50     ` Scott Wood
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21  8:41 UTC (permalink / raw)
  To: Scott Wood; +Cc: qemu-ppc, qemu-devel


On 14.02.2013, at 07:32, Scott Wood wrote:

> This depends on RFC kernel interfaces proposed at:
> http://patchwork.ozlabs.org/patch/220359/
> http://patchwork.ozlabs.org/patch/220362/
> 
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> hw/kvm/Makefile.objs |    1 +
> hw/kvm/openpic.c     |  295 ++++++++++++++++++++++++++++++++++++++++++++++++++
> hw/ppc/e500.c        |   28 +++--
> 3 files changed, 317 insertions(+), 7 deletions(-)
> create mode 100644 hw/kvm/openpic.c
> 
> diff --git a/hw/kvm/Makefile.objs b/hw/kvm/Makefile.objs
> index 6ccb6ed..8df0fe1 100644
> --- a/hw/kvm/Makefile.objs
> +++ b/hw/kvm/Makefile.objs
> @@ -1 +1,2 @@
> obj-$(TARGET_I386) += clock.o apic.o i8259.o ioapic.o i8254.o pci-assign.o
> +obj-$(TARGET_PPC) += openpic.o
> diff --git a/hw/kvm/openpic.c b/hw/kvm/openpic.c
> new file mode 100644
> index 0000000..aabc4a6
> --- /dev/null
> +++ b/hw/kvm/openpic.c
> @@ -0,0 +1,295 @@
> +/*
> + * KVM in-kernel OpenPIC
> + *
> + * Copyright 2013 Freescale Semiconductor, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "exec/address-spaces.h"
> +#include "hw/hw.h"
> +#include "hw/openpic.h"
> +#include "hw/pci/msi.h"
> +#include "hw/sysbus.h"
> +#include "sysemu/kvm.h"
> +
> +typedef struct KVMOpenPICState {
> +    SysBusDevice busdev;
> +    MemoryRegion mem;
> +    MemoryListener mem_listener;
> +    hwaddr reg_base;
> +    uint32_t kern_id;
> +    uint32_t model;
> +} KVMOpenPICState;
> +
> +static void kvm_openpic_set_irq(void *opaque, int n_IRQ, int level)
> +{
> +    KVMOpenPICState *opp = opaque;
> +    struct kvm_device_attr attr;
> +    uint32_t val32 = level;
> +    int ret;
> +
> +    attr.dev = opp->kern_id;
> +    attr.group = KVM_DEV_MPIC_GRP_IRQ_ACTIVE;
> +    attr.attr = n_IRQ;
> +    attr.addr = (uint64_t)(long)&val32;
> +
> +    ret = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
> +    if (ret < 0) {
> +        fprintf(stderr, "%s: %s %llx\n", __func__, strerror(errno), attr.attr);
> +    }
> +}
> +
> +static void kvm_openpic_reset(DeviceState *d)
> +{
> +#if 0
> +    OpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
> +    int i;
> +
> +    opp->gcr = GCR_RESET;
> +    /* Initialise controller registers */
> +    opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) |
> +               ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) |
> +               (opp->vid << FRR_VID_SHIFT);
> +
> +    opp->pir = 0;
> +    opp->spve = -1 & opp->vector_mask;
> +    opp->tfrr = opp->tfrr_reset;
> +    /* Initialise IRQ sources */
> +    for (i = 0; i < opp->max_irq; i++) {
> +        opp->src[i].ivpr = opp->ivpr_reset;
> +        opp->src[i].idr  = opp->idr_reset;
> +
> +        switch (opp->src[i].type) {
> +        case IRQ_TYPE_NORMAL:
> +            opp->src[i].level = !!(opp->ivpr_reset & IVPR_SENSE_MASK);
> +            break;
> +
> +        case IRQ_TYPE_FSLINT:
> +            opp->src[i].ivpr |= IVPR_POLARITY_MASK;
> +            break;
> +
> +        case IRQ_TYPE_FSLSPECIAL:
> +            break;
> +        }
> +    }
> +    /* Initialise IRQ destinations */
> +    for (i = 0; i < MAX_CPU; i++) {
> +        opp->dst[i].ctpr      = 15;
> +        memset(&opp->dst[i].raised, 0, sizeof(IRQQueue));
> +        opp->dst[i].raised.next = -1;
> +        memset(&opp->dst[i].servicing, 0, sizeof(IRQQueue));
> +        opp->dst[i].servicing.next = -1;
> +    }
> +    /* Initialise timers */
> +    for (i = 0; i < OPENPIC_MAX_TMR; i++) {
> +        opp->timers[i].tccr = 0;
> +        opp->timers[i].tbcr = TBCR_CI;
> +    }
> +    /* Go out of RESET state */
> +    opp->gcr = 0;
> +#endif
> +}
> +
> +static void kvm_openpic_write(void *opaque, hwaddr addr, uint64_t val,
> +                              unsigned size)
> +{
> +    KVMOpenPICState *opp = opaque;
> +    struct kvm_device_attr attr;
> +    uint32_t val32 = val;
> +    int ret;
> +
> +    attr.dev = opp->kern_id;
> +    attr.group = KVM_DEV_MPIC_GRP_REGISTER;
> +    attr.attr = addr;
> +    attr.addr = (uint64_t)(long)&val32;
> +
> +    ret = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
> +    if (ret < 0) {
> +        qemu_log_mask(LOG_UNIMP, "%s: %s %llx\n", __func__,
> +                      strerror(errno), attr.attr);
> +    }
> +}
> +
> +static uint64_t kvm_openpic_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +    KVMOpenPICState *opp = opaque;
> +    struct kvm_device_attr attr;
> +    uint32_t val = 0xdeadbeef;
> +    int ret;
> +
> +    attr.dev = opp->kern_id;
> +    attr.group = KVM_DEV_MPIC_GRP_REGISTER;
> +    attr.attr = addr;
> +    attr.addr = (uint64_t)(long)&val;
> +
> +    ret = kvm_vm_ioctl(kvm_state, KVM_GET_DEVICE_ATTR, &attr);
> +    if (ret < 0) {
> +        qemu_log_mask(LOG_UNIMP, "%s: %s %llx\n", __func__,
> +                      strerror(errno), attr.attr);
> +        return 0;
> +    }
> +
> +    return val;
> +}
> +
> +static const MemoryRegionOps kvm_openpic_mem_ops = {
> +    .write = kvm_openpic_write,
> +    .read  = kvm_openpic_read,
> +    .endianness = DEVICE_BIG_ENDIAN,
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +    },
> +};
> +
> +static void kvm_openpic_update_reg_base(MemoryListener *listener)
> +{
> +    KVMOpenPICState *opp = container_of(listener, KVMOpenPICState,
> +                                        mem_listener);
> +    struct kvm_device_attr attr;
> +    uint64_t reg_base;
> +    AddressSpace *as;
> +    int ret;
> +
> +    reg_base = memory_region_to_address(&opp->mem, &as);
> +    if (!as) {
> +        reg_base = 0;
> +    } else if (as != &address_space_memory) {
> +        abort();
> +    }
> +
> +    if (reg_base == opp->reg_base) {
> +        return;
> +    }
> +
> +    opp->reg_base = reg_base;
> +
> +    attr.dev = opp->kern_id;
> +    attr.group = KVM_DEV_MPIC_GRP_MISC;
> +    attr.attr = KVM_DEV_MPIC_BASE_ADDR;
> +    attr.addr = (uint64_t)(long)&reg_base;
> +
> +    ret = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
> +    if (ret < 0) {
> +        fprintf(stderr, "%s: %s %llx\n", __func__, strerror(errno), reg_base);
> +    }
> +}
> +
> +static int kvm_openpic_init(SysBusDevice *dev)
> +{
> +    KVMOpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
> +    int kvm_openpic_model;
> +
> +    memory_region_init_io(&opp->mem, &kvm_openpic_mem_ops, opp,
> +                          "kvm-openpic", 0x40000);
> +
> +    switch (opp->model) {
> +    case OPENPIC_MODEL_FSL_MPIC_20:
> +        kvm_openpic_model = KVM_DEV_TYPE_FSL_MPIC_20;
> +        break;
> +
> +    case OPENPIC_MODEL_FSL_MPIC_42:
> +        kvm_openpic_model = KVM_DEV_TYPE_FSL_MPIC_42;
> +        break;
> +
> +    default:
> +        return -EINVAL;
> +    }
> +
> +    sysbus_init_mmio(dev, &opp->mem);
> +    qdev_init_gpio_in(&dev->qdev, kvm_openpic_set_irq, OPENPIC_MAX_IRQ);
> +
> +    opp->mem_listener.commit = kvm_openpic_update_reg_base;
> +    memory_listener_register(&opp->mem_listener, &address_space_memory);
> +
> +    msi_supported = true;
> +    return 0;
> +}
> +
> +DeviceState *kvm_openpic_create(BusState *bus, int model)
> +{
> +    KVMState *s = kvm_state;
> +    DeviceState *dev;
> +    struct kvm_create_device cd = {0};
> +    int ret;
> +
> +    if (!kvm_check_extension(s, KVM_CAP_DEVICE_CTRL)) {
> +        return NULL;
> +    }
> +
> +    switch (model) {
> +    case OPENPIC_MODEL_FSL_MPIC_20:
> +        cd.type = KVM_DEV_TYPE_FSL_MPIC_20;
> +        break;
> +
> +    case OPENPIC_MODEL_FSL_MPIC_42:
> +        cd.type = KVM_DEV_TYPE_FSL_MPIC_42;
> +        break;
> +
> +    default:
> +        qemu_log_mask(LOG_UNIMP, "%s: unknown openpic model %d\n",
> +                      __func__, model);
> +        return NULL;
> +    }
> +
> +    ret = kvm_vm_ioctl(s, KVM_CREATE_DEVICE, &cd);
> +    if (ret < 0) {
> +        fprintf(stderr, "%s: can't create device %d: %s\n", __func__, cd.type,
> +                strerror(errno));
> +        return NULL;
> +    }

Can't all the stuff above here just simply go into the qdev init function?

> +
> +    dev = qdev_create(NULL, "kvm-openpic");
> +    qdev_prop_set_uint32(dev, "model", model);
> +    qdev_prop_set_uint32(dev, "kernel-id", cd.id);
> +
> +    return dev;
> +}
> +
> +static Property kvm_openpic_properties[] = {
> +    DEFINE_PROP_UINT32("model", KVMOpenPICState, model,
> +                       OPENPIC_MODEL_FSL_MPIC_20),
> +    DEFINE_PROP_UINT32("kernel-id", KVMOpenPICState, kern_id, 0),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void kvm_openpic_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
> +
> +    k->init = kvm_openpic_init;
> +    dc->props = kvm_openpic_properties;
> +    dc->reset = kvm_openpic_reset;
> +}
> +
> +static const TypeInfo kvm_openpic_info = {
> +    .name          = "kvm-openpic",
> +    .parent        = TYPE_SYS_BUS_DEVICE,
> +    .instance_size = sizeof(KVMOpenPICState),
> +    .class_init    = kvm_openpic_class_init,
> +};
> +
> +static void kvm_openpic_register_types(void)
> +{
> +    type_register_static(&kvm_openpic_info);
> +}
> +
> +type_init(kvm_openpic_register_types)
> diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
> index b7474c0..9cfcc1d 100644
> --- a/hw/ppc/e500.c
> +++ b/hw/ppc/e500.c
> @@ -474,6 +474,7 @@ void ppce500_init(PPCE500Params *params)
>     MemoryRegion *ccsr_addr_space;
>     SysBusDevice *s;
>     PPCE500CCSRState *ccsr;
> +    bool kvm_irqchip = false;
> 
>     /* Setup CPUs */
>     if (params->cpu_model == NULL) {
> @@ -543,16 +544,29 @@ void ppce500_init(PPCE500Params *params)
> 
>     /* MPIC */
>     mpic = g_new(qemu_irq, 256);
> -    dev = qdev_create(NULL, "openpic");
> -    qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
> -    qdev_prop_set_uint32(dev, "model", params->mpic_version);
> +
> +    if (kvm_irqchip_wanted()) {
> +        dev = kvm_openpic_create(NULL, params->mpic_version);

This really should be just a

    dev = qdev_create(NULL, kvm_irqchip_wanted() ? "kvm-openpic" : "openpic");

The logic whether an in-kernel irqchip is available belongs into the default setting of kvm_irqchip_wanted. If the host kvm version can't handle an in-kernel MPIC, it should simply default to false. If it supports one, it defaults to true. Whenever the user defines something explicitly with -machine, that wins.


Alex

> +        if (dev) {
> +            kvm_irqchip = true;
> +        }
> +    }
> +
> +    if (!kvm_irqchip) {
> +        dev = qdev_create(NULL, "openpic");
> +        qdev_prop_set_uint32(dev, "model", params->mpic_version);
> +        qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
> +    }
> +
>     qdev_init_nofail(dev);
>     s = SYS_BUS_DEVICE(dev);
> 
> -    k = 0;
> -    for (i = 0; i < smp_cpus; i++) {
> -        for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
> -            sysbus_connect_irq(s, k++, irqs[i][j]);
> +    if (!kvm_irqchip) {
> +        k = 0;
> +        for (i = 0; i < smp_cpus; i++) {
> +            for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
> +                sysbus_connect_irq(s, k++, irqs[i][j]);
> +            }
>         }
>     }
> 
> -- 
> 1.7.9.5
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 5/6] kvm: export result of irqchip config check
  2013-03-21  8:34   ` [Qemu-devel] [RFC ppc-next PATCH 5/6] kvm: export result of irqchip config check Alexander Graf
@ 2013-03-21  8:45     ` Jan Kiszka
  2013-03-21  8:50       ` Alexander Graf
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Kiszka @ 2013-03-21  8:45 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

On 2013-03-21 09:34, Alexander Graf wrote:
> 
> On 14.02.2013, at 07:32, Scott Wood wrote:
> 
>> This allows platform code to register in-kernel irqchips that
>> don't use the legacy KVM_CAP_IRQCHIP/KVM_CREATE_IRQCHIP interface.
>>
>> Signed-off-by: Scott Wood <scottwood@freescale.com>
>> ---
>> include/sysemu/kvm.h |   10 ++++++++++
>> kvm-all.c            |   11 +++++++++--
>> 2 files changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
>> index f2d97b5..b9a8701 100644
>> --- a/include/sysemu/kvm.h
>> +++ b/include/sysemu/kvm.h
>> @@ -45,6 +45,7 @@ extern bool kvm_async_interrupts_allowed;
>> extern bool kvm_irqfds_allowed;
>> extern bool kvm_msi_via_irqfd_allowed;
>> extern bool kvm_gsi_routing_allowed;
>> +extern bool kvm_irqchip_wanted;
>>
>> #if defined CONFIG_KVM || !defined NEED_CPU_H
>> #define kvm_enabled()           (kvm_allowed)
>> @@ -97,6 +98,14 @@ extern bool kvm_gsi_routing_allowed;
>>  */
>> #define kvm_gsi_routing_enabled() (kvm_gsi_routing_allowed)
>>
>> +/**
>> + * kvm_irqchip_wanted
>> + *
>> + * Returns: true if the user requested that an in-kernel IRQ chip be
>> + * used, regardless of whether support has been detected.
>> + */
>> +#define kvm_irqchip_wanted() (kvm_irqchip_wanted)
>> +
>> #else
>> #define kvm_enabled()           (0)
>> #define kvm_irqchip_in_kernel() (false)
>> @@ -104,6 +113,7 @@ extern bool kvm_gsi_routing_allowed;
>> #define kvm_irqfds_enabled() (false)
>> #define kvm_msi_via_irqfd_enabled() (false)
>> #define kvm_gsi_routing_allowed() (false)
>> +#define kvm_irqchip_wanted() (false)
>> #endif
>>
>> struct kvm_run;
>> diff --git a/kvm-all.c b/kvm-all.c
>> index 04ec2d5..13a628d 100644
>> --- a/kvm-all.c
>> +++ b/kvm-all.c
>> @@ -109,6 +109,7 @@ bool kvm_async_interrupts_allowed;
>> bool kvm_irqfds_allowed;
>> bool kvm_msi_via_irqfd_allowed;
>> bool kvm_gsi_routing_allowed;
>> +bool kvm_irqchip_wanted;
>>
>> static const KVMCapabilityInfo kvm_required_capabilites[] = {
>>     KVM_CAP_INFO(USER_MEMORY),
>> @@ -1205,8 +1206,14 @@ static int kvm_irqchip_create(KVMState *s)
>>
>>     if (QTAILQ_EMPTY(&list->head) ||
>>         !qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
>> -                           "kernel_irqchip", true) ||
>> -        !kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
>> +                           "kernel_irqchip", true)) {
>> +        return 0;
>> +    }
>> +
>> +    kvm_irqchip_wanted = true;
>> +
>> +    /* Platform code may have a different way of enabling an IRQ chip */
>> +    if (!kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
> 
> Does x86 have the required checks then to make sure it has the CAP?

To my understanding, x86 won't evaluate this new flag but continue to
bail out from this service early.

However, invoking something that is called "create" to just end up with
a set flag "wanted" and then do the creation elsewhere is not a very
beautiful design.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 5/6] kvm: export result of irqchip config check
  2013-03-21  8:45     ` Jan Kiszka
@ 2013-03-21  8:50       ` Alexander Graf
  0 siblings, 0 replies; 40+ messages in thread
From: Alexander Graf @ 2013-03-21  8:50 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 21.03.2013, at 09:45, Jan Kiszka wrote:

> On 2013-03-21 09:34, Alexander Graf wrote:
>> 
>> On 14.02.2013, at 07:32, Scott Wood wrote:
>> 
>>> This allows platform code to register in-kernel irqchips that
>>> don't use the legacy KVM_CAP_IRQCHIP/KVM_CREATE_IRQCHIP interface.
>>> 
>>> Signed-off-by: Scott Wood <scottwood@freescale.com>
>>> ---
>>> include/sysemu/kvm.h |   10 ++++++++++
>>> kvm-all.c            |   11 +++++++++--
>>> 2 files changed, 19 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
>>> index f2d97b5..b9a8701 100644
>>> --- a/include/sysemu/kvm.h
>>> +++ b/include/sysemu/kvm.h
>>> @@ -45,6 +45,7 @@ extern bool kvm_async_interrupts_allowed;
>>> extern bool kvm_irqfds_allowed;
>>> extern bool kvm_msi_via_irqfd_allowed;
>>> extern bool kvm_gsi_routing_allowed;
>>> +extern bool kvm_irqchip_wanted;
>>> 
>>> #if defined CONFIG_KVM || !defined NEED_CPU_H
>>> #define kvm_enabled()           (kvm_allowed)
>>> @@ -97,6 +98,14 @@ extern bool kvm_gsi_routing_allowed;
>>> */
>>> #define kvm_gsi_routing_enabled() (kvm_gsi_routing_allowed)
>>> 
>>> +/**
>>> + * kvm_irqchip_wanted
>>> + *
>>> + * Returns: true if the user requested that an in-kernel IRQ chip be
>>> + * used, regardless of whether support has been detected.
>>> + */
>>> +#define kvm_irqchip_wanted() (kvm_irqchip_wanted)
>>> +
>>> #else
>>> #define kvm_enabled()           (0)
>>> #define kvm_irqchip_in_kernel() (false)
>>> @@ -104,6 +113,7 @@ extern bool kvm_gsi_routing_allowed;
>>> #define kvm_irqfds_enabled() (false)
>>> #define kvm_msi_via_irqfd_enabled() (false)
>>> #define kvm_gsi_routing_allowed() (false)
>>> +#define kvm_irqchip_wanted() (false)
>>> #endif
>>> 
>>> struct kvm_run;
>>> diff --git a/kvm-all.c b/kvm-all.c
>>> index 04ec2d5..13a628d 100644
>>> --- a/kvm-all.c
>>> +++ b/kvm-all.c
>>> @@ -109,6 +109,7 @@ bool kvm_async_interrupts_allowed;
>>> bool kvm_irqfds_allowed;
>>> bool kvm_msi_via_irqfd_allowed;
>>> bool kvm_gsi_routing_allowed;
>>> +bool kvm_irqchip_wanted;
>>> 
>>> static const KVMCapabilityInfo kvm_required_capabilites[] = {
>>>    KVM_CAP_INFO(USER_MEMORY),
>>> @@ -1205,8 +1206,14 @@ static int kvm_irqchip_create(KVMState *s)
>>> 
>>>    if (QTAILQ_EMPTY(&list->head) ||
>>>        !qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
>>> -                           "kernel_irqchip", true) ||
>>> -        !kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
>>> +                           "kernel_irqchip", true)) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    kvm_irqchip_wanted = true;
>>> +
>>> +    /* Platform code may have a different way of enabling an IRQ chip */
>>> +    if (!kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
>> 
>> Does x86 have the required checks then to make sure it has the CAP?
> 
> To my understanding, x86 won't evaluate this new flag but continue to
> bail out from this service early.
> 
> However, invoking something that is called "create" to just end up with
> a set flag "wanted" and then do the creation elsewhere is not a very
> beautiful design.

Yeah. This should probably be

  kvm_irqchip_wanted(KVMState *s) {
    if (!kvm_arch_irqchip_available(s))
      return false;
    
    return qemu_opt_get_bool("kernel_irqchip", true);
  }

with kvm_arch_irqchip_available checking for KVM_CAP_IRQCHIP on x86 and the respective device creation capability on ppc.


Alex

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21  8:31   ` [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address() Alexander Graf
@ 2013-03-21 10:53     ` Peter Maydell
  2013-03-21 10:59       ` Alexander Graf
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Maydell @ 2013-03-21 10:53 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

On 21 March 2013 08:31, Alexander Graf <agraf@suse.de> wrote:
> On 14.02.2013, at 07:31, Scott Wood wrote:
>> This is useful for when a user of the memory region API needs to
>> communicate the absolute bus address to something outside QEMU
>> (in particular, KVM).
>>
>> Signed-off-by: Scott Wood <scottwood@freescale.com>
>
> Peter, how does the VGIC implementation handle this?

Check kvm_arm_register_device() in target-arm/kvm.c. Basically
the VGIC device model calls this function to say "tell the kernel
where this MemoryRegion is in the system address space, when it
eventually gets mapped". The code in kvm.c uses the memory system's
Notifier API to get a callback when the region is mapped into
an address space, which it uses to track the offset in the
address space. Finally, we use a machine init notifier so that
just before everything finally starts we can make the KVM ioctls
to say "here is where everything lives".

I think this is a pretty neat way of doing it because it means
neither the interrupt controller device nor the board model
really need to care about the kernel being told where things
are mapped; it's all abstracted out into kvm.c. If your
interrupt controller can be moved around at runtime that's
probably also handlable, but the ARM code just unregisters its
notifiers at machine init because the GIC can't move.

(I think the code assumes the device only gets mapped into
one address space; this could easily be fixed if it's not true
at some point in the future.)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 10:53     ` Peter Maydell
@ 2013-03-21 10:59       ` Alexander Graf
  2013-03-21 11:01         ` Peter Maydell
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21 10:59 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 21.03.2013, at 11:53, Peter Maydell wrote:

> On 21 March 2013 08:31, Alexander Graf <agraf@suse.de> wrote:
>> On 14.02.2013, at 07:31, Scott Wood wrote:
>>> This is useful for when a user of the memory region API needs to
>>> communicate the absolute bus address to something outside QEMU
>>> (in particular, KVM).
>>> 
>>> Signed-off-by: Scott Wood <scottwood@freescale.com>
>> 
>> Peter, how does the VGIC implementation handle this?
> 
> Check kvm_arm_register_device() in target-arm/kvm.c. Basically
> the VGIC device model calls this function to say "tell the kernel
> where this MemoryRegion is in the system address space, when it
> eventually gets mapped". The code in kvm.c uses the memory system's
> Notifier API to get a callback when the region is mapped into
> an address space, which it uses to track the offset in the
> address space. Finally, we use a machine init notifier so that
> just before everything finally starts we can make the KVM ioctls
> to say "here is where everything lives".

Same thing here. The question is how the kvm-vgic code in QEMU finds out where it got mapped to. Scott adds this patch to do this, but I'd assume you have some other way :)


Alex

> 
> I think this is a pretty neat way of doing it because it means
> neither the interrupt controller device nor the board model
> really need to care about the kernel being told where things
> are mapped; it's all abstracted out into kvm.c. If your
> interrupt controller can be moved around at runtime that's
> probably also handlable, but the ARM code just unregisters its
> notifiers at machine init because the GIC can't move.
> 
> (I think the code assumes the device only gets mapped into
> one address space; this could easily be fixed if it's not true
> at some point in the future.)
> 
> thanks
> -- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 10:59       ` Alexander Graf
@ 2013-03-21 11:01         ` Peter Maydell
  2013-03-21 11:05           ` Alexander Graf
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Maydell @ 2013-03-21 11:01 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote:
> On 21.03.2013, at 11:53, Peter Maydell wrote:
>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically
>> the VGIC device model calls this function to say "tell the kernel
>> where this MemoryRegion is in the system address space, when it
>> eventually gets mapped". The code in kvm.c uses the memory system's
>> Notifier API to get a callback when the region is mapped into
>> an address space, which it uses to track the offset in the
>> address space. Finally, we use a machine init notifier so that
>> just before everything finally starts we can make the KVM ioctls
>> to say "here is where everything lives".
>
> Same thing here. The question is how the kvm-vgic code in QEMU
> finds out where it got mapped to. Scott adds this patch to do
> this, but I'd assume you have some other way :)

Hmm? The kvm-vgic code in QEMU doesn't need to know where it
lives. We have to tell the kernel so it can map its bits of
registers in at the right place, that's all.

-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:01         ` Peter Maydell
@ 2013-03-21 11:05           ` Alexander Graf
  2013-03-21 11:09             ` Peter Maydell
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21 11:05 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 21.03.2013, at 12:01, Peter Maydell wrote:

> On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote:
>> On 21.03.2013, at 11:53, Peter Maydell wrote:
>>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically
>>> the VGIC device model calls this function to say "tell the kernel
>>> where this MemoryRegion is in the system address space, when it
>>> eventually gets mapped". The code in kvm.c uses the memory system's
>>> Notifier API to get a callback when the region is mapped into
>>> an address space, which it uses to track the offset in the
>>> address space. Finally, we use a machine init notifier so that
>>> just before everything finally starts we can make the KVM ioctls
>>> to say "here is where everything lives".
>> 
>> Same thing here. The question is how the kvm-vgic code in QEMU
>> finds out where it got mapped to. Scott adds this patch to do
>> this, but I'd assume you have some other way :)
> 
> Hmm? The kvm-vgic code in QEMU doesn't need to know where it
> lives. We have to tell the kernel so it can map its bits of
> registers in at the right place, that's all.

The kvm-vgic code in QEMU needs to tell the kernel, no? For that, it needs to know what to tell the kernel.

This patch adds a function that allows kvm-openpic to fetch its base flat address from the MemoryListener. I was wondering whether either this patch is superfluous or you guys had an awkward MemoryListener handler :)


Alex

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:05           ` Alexander Graf
@ 2013-03-21 11:09             ` Peter Maydell
  2013-03-21 11:14               ` Alexander Graf
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Maydell @ 2013-03-21 11:09 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

On 21 March 2013 11:05, Alexander Graf <agraf@suse.de> wrote:
>
> On 21.03.2013, at 12:01, Peter Maydell wrote:
>
>> On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote:
>>> On 21.03.2013, at 11:53, Peter Maydell wrote:
>>>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically
>>>> the VGIC device model calls this function to say "tell the kernel
>>>> where this MemoryRegion is in the system address space, when it
>>>> eventually gets mapped". The code in kvm.c uses the memory system's
>>>> Notifier API to get a callback when the region is mapped into
>>>> an address space, which it uses to track the offset in the
>>>> address space. Finally, we use a machine init notifier so that
>>>> just before everything finally starts we can make the KVM ioctls
>>>> to say "here is where everything lives".
>>>
>>> Same thing here. The question is how the kvm-vgic code in QEMU
>>> finds out where it got mapped to. Scott adds this patch to do
>>> this, but I'd assume you have some other way :)
>>
>> Hmm? The kvm-vgic code in QEMU doesn't need to know where it
>> lives. We have to tell the kernel so it can map its bits of
>> registers in at the right place, that's all.
>
> The kvm-vgic code in QEMU needs to tell the kernel, no? For
> that, it needs to know what to tell the kernel.

No. As I explained earlier, all the kvm-vgic code needs to do
is call kvm_arm_register_device(). That code in kvm.c then takes
care of telling the kernel. hw/kvm/arm_gic.c itself never knows or
needs to know where it's mapped. This is the whole point of the
mechanism involving notifiers.

> This patch adds a function that allows kvm-openpic to fetch its
> base flat address from the MemoryListener.

This sounds to me like the wrong way to do it -- it's board
models that decide where devices are mapped and the device
code itself shouldn't have to know where it has been mapped.

-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:09             ` Peter Maydell
@ 2013-03-21 11:14               ` Alexander Graf
  2013-03-21 11:22                 ` Peter Maydell
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21 11:14 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 21.03.2013, at 12:09, Peter Maydell wrote:

> On 21 March 2013 11:05, Alexander Graf <agraf@suse.de> wrote:
>> 
>> On 21.03.2013, at 12:01, Peter Maydell wrote:
>> 
>>> On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote:
>>>> On 21.03.2013, at 11:53, Peter Maydell wrote:
>>>>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically
>>>>> the VGIC device model calls this function to say "tell the kernel
>>>>> where this MemoryRegion is in the system address space, when it
>>>>> eventually gets mapped". The code in kvm.c uses the memory system's
>>>>> Notifier API to get a callback when the region is mapped into
>>>>> an address space, which it uses to track the offset in the
>>>>> address space. Finally, we use a machine init notifier so that
>>>>> just before everything finally starts we can make the KVM ioctls
>>>>> to say "here is where everything lives".
>>>> 
>>>> Same thing here. The question is how the kvm-vgic code in QEMU
>>>> finds out where it got mapped to. Scott adds this patch to do
>>>> this, but I'd assume you have some other way :)
>>> 
>>> Hmm? The kvm-vgic code in QEMU doesn't need to know where it
>>> lives. We have to tell the kernel so it can map its bits of
>>> registers in at the right place, that's all.
>> 
>> The kvm-vgic code in QEMU needs to tell the kernel, no? For
>> that, it needs to know what to tell the kernel.
> 
> No. As I explained earlier, all the kvm-vgic code needs to do
> is call kvm_arm_register_device(). That code in kvm.c then takes
> care of telling the kernel. hw/kvm/arm_gic.c itself never knows or
> needs to know where it's mapped. This is the whole point of the
> mechanism involving notifiers.

I fully disagree. Code that talks to the in-kernel device should live in hw/kvm/<device>.c, not in some random target-XXX/kvm.c file. Of course the board defines where the device gets mapped to, but the communication with the in-kernel device bits should really be contained to the device itself.

So this is the function that gets invoked on ARM:

static void kvm_arm_machine_init_done(Notifier *notifier, void *data)
{
    KVMDevice *kd, *tkd;

    memory_listener_unregister(&devlistener);
    QSLIST_FOREACH_SAFE(kd, &kvm_devices_head, entries, tkd) {
        if (kd->kda.addr != -1) {
            if (kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR,
                             &kd->kda) < 0) {
                fprintf(stderr, "KVM_ARM_SET_DEVICE_ADDRESS failed: %s\n",
                        strerror(errno));
                abort();
            }
        }
        g_free(kd);
    }
}

This only goes one level deep, right? So if you ever have to nest the VGIC inside of another MemoryRegion, this will break, right?


Alex

> 
>> This patch adds a function that allows kvm-openpic to fetch its
>> base flat address from the MemoryListener.
> 
> This sounds to me like the wrong way to do it -- it's board
> models that decide where devices are mapped and the device
> code itself shouldn't have to know where it has been mapped.
> 
> -- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:14               ` Alexander Graf
@ 2013-03-21 11:22                 ` Peter Maydell
  2013-03-21 11:29                   ` Alexander Graf
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Maydell @ 2013-03-21 11:22 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

On 21 March 2013 11:14, Alexander Graf <agraf@suse.de> wrote:
>
> On 21.03.2013, at 12:09, Peter Maydell wrote:
>
>> On 21 March 2013 11:05, Alexander Graf <agraf@suse.de> wrote:
>>>
>>> On 21.03.2013, at 12:01, Peter Maydell wrote:
>>>
>>>> On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote:
>>>>> On 21.03.2013, at 11:53, Peter Maydell wrote:
>>>>>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically
>>>>>> the VGIC device model calls this function to say "tell the kernel
>>>>>> where this MemoryRegion is in the system address space, when it
>>>>>> eventually gets mapped". The code in kvm.c uses the memory system's
>>>>>> Notifier API to get a callback when the region is mapped into
>>>>>> an address space, which it uses to track the offset in the
>>>>>> address space. Finally, we use a machine init notifier so that
>>>>>> just before everything finally starts we can make the KVM ioctls
>>>>>> to say "here is where everything lives".
>>>>>
>>>>> Same thing here. The question is how the kvm-vgic code in QEMU
>>>>> finds out where it got mapped to. Scott adds this patch to do
>>>>> this, but I'd assume you have some other way :)
>>>>
>>>> Hmm? The kvm-vgic code in QEMU doesn't need to know where it
>>>> lives. We have to tell the kernel so it can map its bits of
>>>> registers in at the right place, that's all.
>>>
>>> The kvm-vgic code in QEMU needs to tell the kernel, no? For
>>> that, it needs to know what to tell the kernel.
>>
>> No. As I explained earlier, all the kvm-vgic code needs to do
>> is call kvm_arm_register_device(). That code in kvm.c then takes
>> care of telling the kernel. hw/kvm/arm_gic.c itself never knows or
>> needs to know where it's mapped. This is the whole point of the
>> mechanism involving notifiers.
>
> I fully disagree. Code that talks to the in-kernel device should
> live in hw/kvm/<device>.c, not in some random target-XXX/kvm.c file.

The code in kvm.c is entirely generic -- it provides a mechanism
for a device to say "this memory region is kernel ID X and it will
want to know where it lives". The kvm.c code will work for any device
with a memory mapped region, whether it's the GIC or something else.

> Of course the board defines where the device gets mapped to, but
> the communication with the in-kernel device bits should really be
> contained to the device itself.

You're arguing that every device should implement its own set
of notifier functions so it can get called back when its memory
regions are finally mapped, just so it can make a non-device-specific
KVM ioctl? The obvious thing to do is abstract that functionality
out.

> So this is the function that gets invoked on ARM:
>
> static void kvm_arm_machine_init_done(Notifier *notifier, void *data)
> {
>     KVMDevice *kd, *tkd;
>
>     memory_listener_unregister(&devlistener);
>     QSLIST_FOREACH_SAFE(kd, &kvm_devices_head, entries, tkd) {
>         if (kd->kda.addr != -1) {
>             if (kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR,
>                              &kd->kda) < 0) {
>                 fprintf(stderr, "KVM_ARM_SET_DEVICE_ADDRESS failed: %s\n",
>                         strerror(errno));
>                 abort();
>             }
>         }
>         g_free(kd);
>     }
> }
>
> This only goes one level deep, right? So if you ever have to nest the
> VGIC inside of another MemoryRegion, this will break, right?

We already nest the VGIC inside another memory region (the a15mpcore
container), and it works fine. This function is just iterating through
"everything any device asked me to tell the kernel about".

-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:22                 ` Peter Maydell
@ 2013-03-21 11:29                   ` Alexander Graf
  2013-03-21 11:32                     ` Peter Maydell
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21 11:29 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 21.03.2013, at 12:22, Peter Maydell wrote:

> On 21 March 2013 11:14, Alexander Graf <agraf@suse.de> wrote:
>> 
>> On 21.03.2013, at 12:09, Peter Maydell wrote:
>> 
>>> On 21 March 2013 11:05, Alexander Graf <agraf@suse.de> wrote:
>>>> 
>>>> On 21.03.2013, at 12:01, Peter Maydell wrote:
>>>> 
>>>>> On 21 March 2013 10:59, Alexander Graf <agraf@suse.de> wrote:
>>>>>> On 21.03.2013, at 11:53, Peter Maydell wrote:
>>>>>>> Check kvm_arm_register_device() in target-arm/kvm.c. Basically
>>>>>>> the VGIC device model calls this function to say "tell the kernel
>>>>>>> where this MemoryRegion is in the system address space, when it
>>>>>>> eventually gets mapped". The code in kvm.c uses the memory system's
>>>>>>> Notifier API to get a callback when the region is mapped into
>>>>>>> an address space, which it uses to track the offset in the
>>>>>>> address space. Finally, we use a machine init notifier so that
>>>>>>> just before everything finally starts we can make the KVM ioctls
>>>>>>> to say "here is where everything lives".
>>>>>> 
>>>>>> Same thing here. The question is how the kvm-vgic code in QEMU
>>>>>> finds out where it got mapped to. Scott adds this patch to do
>>>>>> this, but I'd assume you have some other way :)
>>>>> 
>>>>> Hmm? The kvm-vgic code in QEMU doesn't need to know where it
>>>>> lives. We have to tell the kernel so it can map its bits of
>>>>> registers in at the right place, that's all.
>>>> 
>>>> The kvm-vgic code in QEMU needs to tell the kernel, no? For
>>>> that, it needs to know what to tell the kernel.
>>> 
>>> No. As I explained earlier, all the kvm-vgic code needs to do
>>> is call kvm_arm_register_device(). That code in kvm.c then takes
>>> care of telling the kernel. hw/kvm/arm_gic.c itself never knows or
>>> needs to know where it's mapped. This is the whole point of the
>>> mechanism involving notifiers.
>> 
>> I fully disagree. Code that talks to the in-kernel device should
>> live in hw/kvm/<device>.c, not in some random target-XXX/kvm.c file.
> 
> The code in kvm.c is entirely generic -- it provides a mechanism
> for a device to say "this memory region is kernel ID X and it will
> want to know where it lives". The kvm.c code will work for any device
> with a memory mapped region, whether it's the GIC or something else.
> 
>> Of course the board defines where the device gets mapped to, but
>> the communication with the in-kernel device bits should really be
>> contained to the device itself.
> 
> You're arguing that every device should implement its own set
> of notifier functions so it can get called back when its memory
> regions are finally mapped, just so it can make a non-device-specific
> KVM ioctl? The obvious thing to do is abstract that functionality
> out.

What I'm arguing is that every device should look as if it was a QEMU device. Devices that happen to live in KVM, should still make a significant effort to expose themselves to the board model as if they were QEMU devices.

So yes, I think the device model should at least register the memory listener, because only the device model knows what its memory regions would map to in KVM's world. Not all devices have a single flat region. Some have more than one.

Whether we have a helper function in (generic) kvm.c that can call a (generic) ioctl to set a device's region X is a different matter. I'd be open to that if it makes sense.

> 
>> So this is the function that gets invoked on ARM:
>> 
>> static void kvm_arm_machine_init_done(Notifier *notifier, void *data)
>> {
>>    KVMDevice *kd, *tkd;
>> 
>>    memory_listener_unregister(&devlistener);
>>    QSLIST_FOREACH_SAFE(kd, &kvm_devices_head, entries, tkd) {
>>        if (kd->kda.addr != -1) {
>>            if (kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR,
>>                             &kd->kda) < 0) {
>>                fprintf(stderr, "KVM_ARM_SET_DEVICE_ADDRESS failed: %s\n",
>>                        strerror(errno));
>>                abort();
>>            }
>>        }
>>        g_free(kd);
>>    }
>> }
>> 
>> This only goes one level deep, right? So if you ever have to nest the
>> VGIC inside of another MemoryRegion, this will break, right?
> 
> We already nest the VGIC inside another memory region (the a15mpcore
> container), and it works fine. This function is just iterating through
> "everything any device asked me to tell the kernel about".

So kda is the real physical offset? I'm having a hard time reading that code :). According to this function:

static void kvm_arm_devlistener_add(MemoryListener *listener,
                                    MemoryRegionSection *section)
{
    KVMDevice *kd;

    QSLIST_FOREACH(kd, &kvm_devices_head, entries) {
        if (section->mr == kd->mr) {
            kd->kda.addr = section->offset_within_address_space;
        }
    }
}

it's only the offset within its parent region, which would mean it's broken, no?


Alex

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:29                   ` Alexander Graf
@ 2013-03-21 11:32                     ` Peter Maydell
  2013-03-21 11:38                       ` Alexander Graf
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Maydell @ 2013-03-21 11:32 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote:
> On 21.03.2013, at 12:22, Peter Maydell wrote:
>> We already nest the VGIC inside another memory region (the a15mpcore
>> container), and it works fine. This function is just iterating through
>> "everything any device asked me to tell the kernel about".
>
> So kda is the real physical offset? I'm having a hard time reading that code :). According to this function:
>
> static void kvm_arm_devlistener_add(MemoryListener *listener,
>                                     MemoryRegionSection *section)
> {
>     KVMDevice *kd;
>
>     QSLIST_FOREACH(kd, &kvm_devices_head, entries) {
>         if (section->mr == kd->mr) {
>             kd->kda.addr = section->offset_within_address_space;
>         }
>     }
> }
>
> it's only the offset within its parent region, which would mean it's broken, no?

Address spaces are not the same thing as memory regions :-)
The only address space involved here is the system address space.
(As I say, we currently assume we only get mapped into one address
space, but that could be fixed if necessary.)

-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:32                     ` Peter Maydell
@ 2013-03-21 11:38                       ` Alexander Graf
  2013-03-21 11:44                         ` Peter Maydell
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21 11:38 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 21.03.2013, at 12:32, Peter Maydell wrote:

> On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote:
>> On 21.03.2013, at 12:22, Peter Maydell wrote:
>>> We already nest the VGIC inside another memory region (the a15mpcore
>>> container), and it works fine. This function is just iterating through
>>> "everything any device asked me to tell the kernel about".
>> 
>> So kda is the real physical offset? I'm having a hard time reading that code :). According to this function:
>> 
>> static void kvm_arm_devlistener_add(MemoryListener *listener,
>>                                    MemoryRegionSection *section)
>> {
>>    KVMDevice *kd;
>> 
>>    QSLIST_FOREACH(kd, &kvm_devices_head, entries) {
>>        if (section->mr == kd->mr) {
>>            kd->kda.addr = section->offset_within_address_space;
>>        }
>>    }
>> }
>> 
>> it's only the offset within its parent region, which would mean it's broken, no?
> 
> Address spaces are not the same thing as memory regions :-)
> The only address space involved here is the system address space.
> (As I say, we currently assume we only get mapped into one address
> space, but that could be fixed if necessary.)

Interesting. Oh well, I'll leave that one to Scott to figure out ;).

So what if I want to write an in-kernel IDE PIO accelerator? Or even better yet: An AHCI accelerator that has one MMIO BAR and another PIO BAR that can be remapped by the guest at any time?

The distinction on whether a region is handled by KVM really needs to be done by the device model.


Alex

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:38                       ` Alexander Graf
@ 2013-03-21 11:44                         ` Peter Maydell
  2013-03-21 11:49                           ` Alexander Graf
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Maydell @ 2013-03-21 11:44 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

On 21 March 2013 11:38, Alexander Graf <agraf@suse.de> wrote:
>
> On 21.03.2013, at 12:32, Peter Maydell wrote:
>
>> On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote:
>>> On 21.03.2013, at 12:22, Peter Maydell wrote:
>>>> We already nest the VGIC inside another memory region (the a15mpcore
>>>> container), and it works fine. This function is just iterating through
>>>> "everything any device asked me to tell the kernel about".
>>>
>>> So kda is the real physical offset? I'm having a hard time reading that code :). According to this function:
>>>
>>> static void kvm_arm_devlistener_add(MemoryListener *listener,
>>>                                    MemoryRegionSection *section)
>>> {
>>>    KVMDevice *kd;
>>>
>>>    QSLIST_FOREACH(kd, &kvm_devices_head, entries) {
>>>        if (section->mr == kd->mr) {
>>>            kd->kda.addr = section->offset_within_address_space;
>>>        }
>>>    }
>>> }
>>>
>>> it's only the offset within its parent region, which would mean it's broken, no?
>>
>> Address spaces are not the same thing as memory regions :-)
>> The only address space involved here is the system address space.
>> (As I say, we currently assume we only get mapped into one address
>> space, but that could be fixed if necessary.)
>
> Interesting. Oh well, I'll leave that one to Scott to figure out ;).
>
> So what if I want to write an in-kernel IDE PIO accelerator?

Have the QEMU end of that device call (your equivalent of)
kvm_arm_register_device(), and provide a 'reserved' mmio region to
its users; the kernel end implements the standard 'tell me where I live'
ioctl; that's it.

> Or even better yet: An AHCI accelerator that has one MMIO BAR and
> another PIO BAR that can be remapped by the guest at any time?

Guest remappable KVM regions would require enhancements, yes (it's
not like we have an existing mechanism for doing that on any
architecture at the moment). The principle of implementing the
mechanics of this in common code still holds, probably even more
so for the increased complexity.

> The distinction on whether a region is handled by KVM really needs
> to be done by the device model.

It is -- the device model is what calls kvm_arm_register_device().
It's just the mechanics of "how do we tell the kernel the right
address for this region at the point when we know it" that are
handled in kvm.c.

-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:44                         ` Peter Maydell
@ 2013-03-21 11:49                           ` Alexander Graf
  2013-03-21 11:51                             ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
  2013-03-21 11:53                             ` [Qemu-devel] " Peter Maydell
  0 siblings, 2 replies; 40+ messages in thread
From: Alexander Graf @ 2013-03-21 11:49 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 21.03.2013, at 12:44, Peter Maydell wrote:

> On 21 March 2013 11:38, Alexander Graf <agraf@suse.de> wrote:
>> 
>> On 21.03.2013, at 12:32, Peter Maydell wrote:
>> 
>>> On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote:
>>>> On 21.03.2013, at 12:22, Peter Maydell wrote:
>>>>> We already nest the VGIC inside another memory region (the a15mpcore
>>>>> container), and it works fine. This function is just iterating through
>>>>> "everything any device asked me to tell the kernel about".
>>>> 
>>>> So kda is the real physical offset? I'm having a hard time reading that code :). According to this function:
>>>> 
>>>> static void kvm_arm_devlistener_add(MemoryListener *listener,
>>>>                                   MemoryRegionSection *section)
>>>> {
>>>>   KVMDevice *kd;
>>>> 
>>>>   QSLIST_FOREACH(kd, &kvm_devices_head, entries) {
>>>>       if (section->mr == kd->mr) {
>>>>           kd->kda.addr = section->offset_within_address_space;
>>>>       }
>>>>   }
>>>> }
>>>> 
>>>> it's only the offset within its parent region, which would mean it's broken, no?
>>> 
>>> Address spaces are not the same thing as memory regions :-)
>>> The only address space involved here is the system address space.
>>> (As I say, we currently assume we only get mapped into one address
>>> space, but that could be fixed if necessary.)
>> 
>> Interesting. Oh well, I'll leave that one to Scott to figure out ;).
>> 
>> So what if I want to write an in-kernel IDE PIO accelerator?
> 
> Have the QEMU end of that device call (your equivalent of)
> kvm_arm_register_device(), and provide a 'reserved' mmio region to
> its users; the kernel end implements the standard 'tell me where I live'
> ioctl; that's it.
> 
>> Or even better yet: An AHCI accelerator that has one MMIO BAR and
>> another PIO BAR that can be remapped by the guest at any time?
> 
> Guest remappable KVM regions would require enhancements, yes (it's
> not like we have an existing mechanism for doing that on any
> architecture at the moment). The principle of implementing the
> mechanics of this in common code still holds, probably even more
> so for the increased complexity.
> 
>> The distinction on whether a region is handled by KVM really needs
>> to be done by the device model.
> 
> It is -- the device model is what calls kvm_arm_register_device().
> It's just the mechanics of "how do we tell the kernel the right
> address for this region at the point when we know it" that are
> handled in kvm.c.

I think I'm slowly grasping what you're aiming at :). Ok, that works. You do actually do the listener in the device model, just that you pass code responsibility over to kvm.c.

That's perfectly valid and sounds like a good model that Scott probably wants to follow as well :).


Alex

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:49                           ` Alexander Graf
@ 2013-03-21 11:51                             ` Alexander Graf
  2013-03-21 22:43                               ` Scott Wood
  2013-03-21 11:53                             ` [Qemu-devel] " Peter Maydell
  1 sibling, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21 11:51 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel


On 21.03.2013, at 12:49, Alexander Graf wrote:

> 
> On 21.03.2013, at 12:44, Peter Maydell wrote:
> 
>> On 21 March 2013 11:38, Alexander Graf <agraf@suse.de> wrote:
>>> 
>>> On 21.03.2013, at 12:32, Peter Maydell wrote:
>>> 
>>>> On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote:
>>>>> On 21.03.2013, at 12:22, Peter Maydell wrote:
>>>>>> We already nest the VGIC inside another memory region (the a15mpcore
>>>>>> container), and it works fine. This function is just iterating through
>>>>>> "everything any device asked me to tell the kernel about".
>>>>> 
>>>>> So kda is the real physical offset? I'm having a hard time reading that code :). According to this function:
>>>>> 
>>>>> static void kvm_arm_devlistener_add(MemoryListener *listener,
>>>>>                                  MemoryRegionSection *section)
>>>>> {
>>>>>  KVMDevice *kd;
>>>>> 
>>>>>  QSLIST_FOREACH(kd, &kvm_devices_head, entries) {
>>>>>      if (section->mr == kd->mr) {
>>>>>          kd->kda.addr = section->offset_within_address_space;
>>>>>      }
>>>>>  }
>>>>> }
>>>>> 
>>>>> it's only the offset within its parent region, which would mean it's broken, no?
>>>> 
>>>> Address spaces are not the same thing as memory regions :-)
>>>> The only address space involved here is the system address space.
>>>> (As I say, we currently assume we only get mapped into one address
>>>> space, but that could be fixed if necessary.)
>>> 
>>> Interesting. Oh well, I'll leave that one to Scott to figure out ;).
>>> 
>>> So what if I want to write an in-kernel IDE PIO accelerator?
>> 
>> Have the QEMU end of that device call (your equivalent of)
>> kvm_arm_register_device(), and provide a 'reserved' mmio region to
>> its users; the kernel end implements the standard 'tell me where I live'
>> ioctl; that's it.
>> 
>>> Or even better yet: An AHCI accelerator that has one MMIO BAR and
>>> another PIO BAR that can be remapped by the guest at any time?
>> 
>> Guest remappable KVM regions would require enhancements, yes (it's
>> not like we have an existing mechanism for doing that on any
>> architecture at the moment). The principle of implementing the
>> mechanics of this in common code still holds, probably even more
>> so for the increased complexity.
>> 
>>> The distinction on whether a region is handled by KVM really needs
>>> to be done by the device model.
>> 
>> It is -- the device model is what calls kvm_arm_register_device().
>> It's just the mechanics of "how do we tell the kernel the right
>> address for this region at the point when we know it" that are
>> handled in kvm.c.
> 
> I think I'm slowly grasping what you're aiming at :). Ok, that works. You do actually do the listener in the device model, just that you pass code responsibility over to kvm.c.
> 
> That's perfectly valid and sounds like a good model that Scott probably wants to follow as well :).

s/follow/evaluate/ :).

The currently proposed device api doesn't have a generic notion of device regions. Regions are a per-device property, because a single device can have multiple regions.

However, maybe with a bit of brainstorming we could come up with a reasonably generic scheme.


Alex

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:49                           ` Alexander Graf
  2013-03-21 11:51                             ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
@ 2013-03-21 11:53                             ` Peter Maydell
  1 sibling, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2013-03-21 11:53 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

On 21 March 2013 11:49, Alexander Graf <agraf@suse.de> wrote:
>
> On 21.03.2013, at 12:44, Peter Maydell wrote:
>> It is -- the device model is what calls kvm_arm_register_device().
>> It's just the mechanics of "how do we tell the kernel the right
>> address for this region at the point when we know it" that are
>> handled in kvm.c.
>
> I think I'm slowly grasping what you're aiming at :). Ok, that
> works. You do actually do the listener in the device model, just
> that you pass code responsibility over to kvm.c.
>
> That's perfectly valid and sounds like a good model that Scott
> probably wants to follow as well :).

Yep. We were actually originally going to make the device ioctl
a generic one, not an ARM one, because there really isn't anything
ARM specific about it. We should probably move the code from
target-arm/kvm.c into kvm-all.c with an arch hook to specify
the ioctl to use (same as irq_set_ioctl) if you want to do the
same approach with PPC.

Re multiple regions: yes, the VGIC has several. We just divide
the u32 ID into two halves, one for a device ID and one for
a region ID for that device.

-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 6/6] kvm/openpic: in-kernel mpic support
  2013-03-21  8:41   ` [Qemu-devel] [RFC ppc-next PATCH 6/6] kvm/openpic: in-kernel mpic support Alexander Graf
@ 2013-03-21 20:50     ` Scott Wood
  2013-03-21 21:29       ` Alexander Graf
  0 siblings, 1 reply; 40+ messages in thread
From: Scott Wood @ 2013-03-21 20:50 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-ppc, qemu-devel

On 03/21/2013 03:41:19 AM, Alexander Graf wrote:
> 
> On 14.02.2013, at 07:32, Scott Wood wrote:
> 
> > +DeviceState *kvm_openpic_create(BusState *bus, int model)
> > +{
> > +    KVMState *s = kvm_state;
> > +    DeviceState *dev;
> > +    struct kvm_create_device cd = {0};
> > +    int ret;
> > +
> > +    if (!kvm_check_extension(s, KVM_CAP_DEVICE_CTRL)) {
> > +        return NULL;
> > +    }
> > +
> > +    switch (model) {
> > +    case OPENPIC_MODEL_FSL_MPIC_20:
> > +        cd.type = KVM_DEV_TYPE_FSL_MPIC_20;
> > +        break;
> > +
> > +    case OPENPIC_MODEL_FSL_MPIC_42:
> > +        cd.type = KVM_DEV_TYPE_FSL_MPIC_42;
> > +        break;
> > +
> > +    default:
> > +        qemu_log_mask(LOG_UNIMP, "%s: unknown openpic model %d\n",
> > +                      __func__, model);
> > +        return NULL;
> > +    }
> > +
> > +    ret = kvm_vm_ioctl(s, KVM_CREATE_DEVICE, &cd);
> > +    if (ret < 0) {
> > +        fprintf(stderr, "%s: can't create device %d: %s\n",  
> __func__, cd.type,
> > +                strerror(errno));
> > +        return NULL;
> > +    }
> 
> Can't all the stuff above here just simply go into the qdev init  
> function?

Not if you want platform code to be able to fall back to a QEMU mpic if  
an in-kernel mpic is unavailable.

> >     /* MPIC */
> >     mpic = g_new(qemu_irq, 256);
> > -    dev = qdev_create(NULL, "openpic");
> > -    qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
> > -    qdev_prop_set_uint32(dev, "model", params->mpic_version);
> > +
> > +    if (kvm_irqchip_wanted()) {
> > +        dev = kvm_openpic_create(NULL, params->mpic_version);
> 
> This really should be just a
> 
>     dev = qdev_create(NULL, kvm_irqchip_wanted() ? "kvm-openpic" :  
> "openpic");
> 
> The logic whether an in-kernel irqchip is available belongs into the  
> default setting of kvm_irqchip_wanted.

That is exactly what I was trying to avoid by introducing  
kvm_irqchip_wanted.  We're no longer testing some vague generic irqchip  
capability, but the presence of a specific type of device (and version  
thereof).  How would the code that sets kvm_irqchip_wanted know what to  
test for?

> If the host kvm version can't handle an in-kernel MPIC, it should  
> simply default to false. If it supports one, it defaults to true.

OK, I misread the existing code and thought that the in-kernel irqchip  
would never be used unless explicitly requested.

> Whenever the user defines something explicitly with -machine, that  
> wins.

Then we'd need kvm_irqchip_wanted to be a tristate -- on, off, or  
unspecified.  At that point it might be better to drop it entirely and  
just open-code the option check.

-Scott

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 6/6] kvm/openpic: in-kernel mpic support
  2013-03-21 20:50     ` Scott Wood
@ 2013-03-21 21:29       ` Alexander Graf
  2013-03-21 21:59         ` Scott Wood
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21 21:29 UTC (permalink / raw)
  To: Scott Wood; +Cc: <qemu-ppc@nongnu.org>, <qemu-devel@nongnu.org>



Am 21.03.2013 um 21:50 schrieb Scott Wood <scottwood@freescale.com>:

> On 03/21/2013 03:41:19 AM, Alexander Graf wrote:
>> On 14.02.2013, at 07:32, Scott Wood wrote:
>> > +DeviceState *kvm_openpic_create(BusState *bus, int model)
>> > +{
>> > +    KVMState *s = kvm_state;
>> > +    DeviceState *dev;
>> > +    struct kvm_create_device cd = {0};
>> > +    int ret;
>> > +
>> > +    if (!kvm_check_extension(s, KVM_CAP_DEVICE_CTRL)) {
>> > +        return NULL;
>> > +    }
>> > +
>> > +    switch (model) {
>> > +    case OPENPIC_MODEL_FSL_MPIC_20:
>> > +        cd.type = KVM_DEV_TYPE_FSL_MPIC_20;
>> > +        break;
>> > +
>> > +    case OPENPIC_MODEL_FSL_MPIC_42:
>> > +        cd.type = KVM_DEV_TYPE_FSL_MPIC_42;
>> > +        break;
>> > +
>> > +    default:
>> > +        qemu_log_mask(LOG_UNIMP, "%s: unknown openpic model %d\n",
>> > +                      __func__, model);
>> > +        return NULL;
>> > +    }
>> > +
>> > +    ret = kvm_vm_ioctl(s, KVM_CREATE_DEVICE, &cd);
>> > +    if (ret < 0) {
>> > +        fprintf(stderr, "%s: can't create device %d: %s\n", __func__, cd.type,
>> > +                strerror(errno));
>> > +        return NULL;
>> > +    }
>> Can't all the stuff above here just simply go into the qdev init function?
> 
> Not if you want platform code to be able to fall back to a QEMU mpic if an in-kernel mpic is unavailable.

Do we want that? We used to have a default like that in qemu-kvm back in the day. That was very confusing, as people started to report that their VMs turned out to be really slow.

I think we should not have fallback code. It makes things easier and more obvious. The default should just depend on the host's capabilities.

> 
>> >     /* MPIC */
>> >     mpic = g_new(qemu_irq, 256);
>> > -    dev = qdev_create(NULL, "openpic");
>> > -    qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
>> > -    qdev_prop_set_uint32(dev, "model", params->mpic_version);
>> > +
>> > +    if (kvm_irqchip_wanted()) {
>> > +        dev = kvm_openpic_create(NULL, params->mpic_version);
>> This really should be just a
>>    dev = qdev_create(NULL, kvm_irqchip_wanted() ? "kvm-openpic" : "openpic");
>> The logic whether an in-kernel irqchip is available belongs into the default setting of kvm_irqchip_wanted.
> 
> That is exactly what I was trying to avoid by introducing kvm_irqchip_wanted.  We're no longer testing some vague generic irqchip capability, but the presence of a specific type of device (and version thereof).  How would the code that sets kvm_irqchip_wanted know what to test for?

Then move the default code into the board file and check for the in-kernel mpic cap.

> 
>> If the host kvm version can't handle an in-kernel MPIC, it should simply default to false. If it supports one, it defaults to true.
> 
> OK, I misread the existing code and thought that the in-kernel irqchip would never be used unless explicitly requested.
> 
>> Whenever the user defines something explicitly with -machine, that wins.
> 
> Then we'd need kvm_irqchip_wanted to be a tristate -- on, off, or unspecified.  At that point it might be better to drop it entirely and just open-code the option check.

Yup :)

Alex

> 
> -Scott

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 6/6] kvm/openpic: in-kernel mpic support
  2013-03-21 21:29       ` Alexander Graf
@ 2013-03-21 21:59         ` Scott Wood
  2013-03-21 23:45           ` Alexander Graf
  0 siblings, 1 reply; 40+ messages in thread
From: Scott Wood @ 2013-03-21 21:59 UTC (permalink / raw)
  To: Alexander Graf; +Cc: <qemu-ppc@nongnu.org>, <qemu-devel@nongnu.org>

On 03/21/2013 04:29:02 PM, Alexander Graf wrote:
> 
> 
> Am 21.03.2013 um 21:50 schrieb Scott Wood <scottwood@freescale.com>:
> 
> > On 03/21/2013 03:41:19 AM, Alexander Graf wrote:
> >> Can't all the stuff above here just simply go into the qdev init  
> function?
> >
> > Not if you want platform code to be able to fall back to a QEMU  
> mpic if an in-kernel mpic is unavailable.
> 
> Do we want that? We used to have a default like that in qemu-kvm back  
> in the day. That was very confusing, as people started to report that  
> their VMs turned out to be really slow.
> 
> I think we should not have fallback code. It makes things easier and  
> more obvious. The default should just depend on the host's  
> capabilities.

I don't follow.  What is the difference between "falling back" and  
"depending on the host's capabilities"?  Either we can create an  
in-kernel MPIC or we can't.  We could use KVM_CREATE_DEVICE_TEST to see  
if the device type is supported separately from actually creating it,  
but I don't see what that would accomplish other than adding more code.

> >> >     /* MPIC */
> >> >     mpic = g_new(qemu_irq, 256);
> >> > -    dev = qdev_create(NULL, "openpic");
> >> > -    qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
> >> > -    qdev_prop_set_uint32(dev, "model", params->mpic_version);
> >> > +
> >> > +    if (kvm_irqchip_wanted()) {
> >> > +        dev = kvm_openpic_create(NULL, params->mpic_version);
> >> This really should be just a
> >>    dev = qdev_create(NULL, kvm_irqchip_wanted() ? "kvm-openpic" :  
> "openpic");
> >> The logic whether an in-kernel irqchip is available belongs into  
> the default setting of kvm_irqchip_wanted.
> >
> > That is exactly what I was trying to avoid by introducing  
> kvm_irqchip_wanted.  We're no longer testing some vague generic  
> irqchip capability, but the presence of a specific type of device  
> (and version thereof).  How would the code that sets  
> kvm_irqchip_wanted know what to test for?
> 
> Then move the default code into the board file and check for the  
> in-kernel mpic cap.

I'm not quite sure what you mean by "the default code" -- if you mean  
the part that makes the decision whether to fall back or error out,  
that's already in board code.

-Scott

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 11:51                             ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
@ 2013-03-21 22:43                               ` Scott Wood
  2013-03-22 13:08                                 ` Peter Maydell
  0 siblings, 1 reply; 40+ messages in thread
From: Scott Wood @ 2013-03-21 22:43 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Peter Maydell, qemu-ppc@nongnu.org list:PowerPC, qemu-devel qemu-devel

On 03/21/2013 06:51:57 AM, Alexander Graf wrote:
> 
> On 21.03.2013, at 12:49, Alexander Graf wrote:
> 
> >
> > On 21.03.2013, at 12:44, Peter Maydell wrote:
> >
> >> On 21 March 2013 11:38, Alexander Graf <agraf@suse.de> wrote:
> >>>
> >>> On 21.03.2013, at 12:32, Peter Maydell wrote:
> >>>
> >>>> On 21 March 2013 11:29, Alexander Graf <agraf@suse.de> wrote:
> >>>>> On 21.03.2013, at 12:22, Peter Maydell wrote:
> >>>>>> We already nest the VGIC inside another memory region (the  
> a15mpcore
> >>>>>> container), and it works fine. This function is just iterating  
> through
> >>>>>> "everything any device asked me to tell the kernel about".
> >>>>>
> >>>>> So kda is the real physical offset? I'm having a hard time  
> reading that code :). According to this function:
> >>>>>
> >>>>> static void kvm_arm_devlistener_add(MemoryListener *listener,
> >>>>>                                  MemoryRegionSection *section)
> >>>>> {
> >>>>>  KVMDevice *kd;
> >>>>>
> >>>>>  QSLIST_FOREACH(kd, &kvm_devices_head, entries) {
> >>>>>      if (section->mr == kd->mr) {
> >>>>>          kd->kda.addr = section->offset_within_address_space;
> >>>>>      }
> >>>>>  }
> >>>>> }

What if the update is to a parent memory region, not to the one  
directly associated with the device?

Or does add() get called for all child regions (recursively) in such  
cases?

> >>> The distinction on whether a region is handled by KVM really needs
> >>> to be done by the device model.
> >>
> >> It is -- the device model is what calls kvm_arm_register_device().
> >> It's just the mechanics of "how do we tell the kernel the right
> >> address for this region at the point when we know it" that are
> >> handled in kvm.c.
> >
> > I think I'm slowly grasping what you're aiming at :). Ok, that  
> works. You do actually do the listener in the device model, just that  
> you pass code responsibility over to kvm.c.
> >
> > That's perfectly valid and sounds like a good model that Scott  
> probably wants to follow as well :).
> 
> s/follow/evaluate/ :).
> 
> The currently proposed device api doesn't have a generic notion of  
> device regions. Regions are a per-device property, because a single  
> device can have multiple regions.
> 
> However, maybe with a bit of brainstorming we could come up with a  
> reasonably generic scheme.

In the kernel API?  Or do you mean a generic scheme within QEMU that  
encodes any reasonably expected mechanism for setting the device adress  
(e.g. assume that it is either a 64-bit attribute, or uses the legacy  
ARM API), or perhaps a callback into device code?

The MPIC's memory listener isn't that much code... I'm not sure there's  
a great need for a central KVM registry.

-Scott

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 6/6] kvm/openpic: in-kernel mpic support
  2013-03-21 21:59         ` Scott Wood
@ 2013-03-21 23:45           ` Alexander Graf
  2013-03-22 22:51             ` Scott Wood
  0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-03-21 23:45 UTC (permalink / raw)
  To: Scott Wood; +Cc: <qemu-ppc@nongnu.org>, <qemu-devel@nongnu.org>


On 21.03.2013, at 22:59, Scott Wood wrote:

> On 03/21/2013 04:29:02 PM, Alexander Graf wrote:
>> Am 21.03.2013 um 21:50 schrieb Scott Wood <scottwood@freescale.com>:
>> > On 03/21/2013 03:41:19 AM, Alexander Graf wrote:
>> >> Can't all the stuff above here just simply go into the qdev init function?
>> >
>> > Not if you want platform code to be able to fall back to a QEMU mpic if an in-kernel mpic is unavailable.
>> Do we want that? We used to have a default like that in qemu-kvm back in the day. That was very confusing, as people started to report that their VMs turned out to be really slow.
>> I think we should not have fallback code. It makes things easier and more obvious. The default should just depend on the host's capabilities.
> 
> I don't follow.  What is the difference between "falling back" and "depending on the host's capabilities"?  Either we can create an in-kernel MPIC or we can't.  We could use KVM_CREATE_DEVICE_TEST to see if the device type is supported separately from actually creating it, but I don't see what that would accomplish other than adding more code.

We usually have settled on a tri-state way to change QEMU behavior for most machine options:

  -machine <opt> is not specified -> best possible behavior in the current system
  -machine <opt>=on -> turn the option on, fail if that doesn't work
  -machine <opt>=off -> turn the option off always

So for the in-kernel irqchip, we should follow the same model. If the -machine option is not passed in, we should try to allocate an in-kernel irqchip if possible. If the kernel advertises that it can do an in-kernel irqchip, but in fact it can't, I would consider that simply a bug we shouldn't worry about.

However, when the user says -machine kernel_irqchip=on, then we should create a kvm based irqchip. And if we can't do that, machine creation should just fail.

> 
>> >> >     /* MPIC */
>> >> >     mpic = g_new(qemu_irq, 256);
>> >> > -    dev = qdev_create(NULL, "openpic");
>> >> > -    qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
>> >> > -    qdev_prop_set_uint32(dev, "model", params->mpic_version);
>> >> > +
>> >> > +    if (kvm_irqchip_wanted()) {
>> >> > +        dev = kvm_openpic_create(NULL, params->mpic_version);
>> >> This really should be just a
>> >>    dev = qdev_create(NULL, kvm_irqchip_wanted() ? "kvm-openpic" : "openpic");
>> >> The logic whether an in-kernel irqchip is available belongs into the default setting of kvm_irqchip_wanted.
>> >
>> > That is exactly what I was trying to avoid by introducing kvm_irqchip_wanted.  We're no longer testing some vague generic irqchip capability, but the presence of a specific type of device (and version thereof).  How would the code that sets kvm_irqchip_wanted know what to test for?
>> Then move the default code into the board file and check for the in-kernel mpic cap.
> 
> I'm not quite sure what you mean by "the default code" -- if you mean the part that makes the decision whether to fall back or error out, that's already in board code.

No, currently that lives mostly in kvm-all.c. I'm talking about the code that checks qemu_opt_get_bool("kernel_irqchip") and decides what to do based on that.


Alex

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-21 22:43                               ` Scott Wood
@ 2013-03-22 13:08                                 ` Peter Maydell
  2013-03-22 22:05                                   ` Scott Wood
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Maydell @ 2013-03-22 13:08 UTC (permalink / raw)
  To: Scott Wood
  Cc: qemu-ppc@nongnu.org list:PowerPC, Alexander Graf, qemu-devel qemu-devel

On 21 March 2013 22:43, Scott Wood <scottwood@freescale.com> wrote:
> What if the update is to a parent memory region, not to the one directly
> associated with the device?
>
> Or does add() get called for all child regions (recursively) in such cases?

The memory API flattens the tree of memory regions down into a flat
view of the address space. These callbacks get called for the
final flattened view (so you'll never see a pure container in the
callback, only leaves). The callbacks happen for every region which
appears in the address space, in linear order. When an update happens
memory.c identifies the changes between the old flat view and the
new one and calls callbacks appropriately. This code isn't the
first use of the memory API listeners, so it's all well-tested code.

>> However, maybe with a bit of brainstorming we could come up with a
>> reasonably generic scheme.

> In the kernel API?  Or do you mean a generic scheme within QEMU that encodes
> any reasonably expected mechanism for setting the device adress (e.g. assume
> that it is either a 64-bit attribute, or uses the legacy ARM API), or
> perhaps a callback into device code?
>
> The MPIC's memory listener isn't that much code... I'm not sure
> there's a great need for a central KVM registry.

Well, nor is the ARM memory listener, but why have two bits of
code doing the same thing when you could have one?

-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-22 13:08                                 ` Peter Maydell
@ 2013-03-22 22:05                                   ` Scott Wood
  2013-03-23 11:24                                     ` Peter Maydell
  0 siblings, 1 reply; 40+ messages in thread
From: Scott Wood @ 2013-03-22 22:05 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-ppc@nongnu.org list:PowerPC, Alexander Graf, qemu-devel qemu-devel

On 03/22/2013 08:08:57 AM, Peter Maydell wrote:
> On 21 March 2013 22:43, Scott Wood <scottwood@freescale.com> wrote:
> > What if the update is to a parent memory region, not to the one  
> directly
> > associated with the device?
> >
> > Or does add() get called for all child regions (recursively) in  
> such cases?
> 
> The memory API flattens the tree of memory regions down into a flat
> view of the address space. These callbacks get called for the
> final flattened view (so you'll never see a pure container in the
> callback, only leaves). The callbacks happen for every region which
> appears in the address space, in linear order. When an update happens
> memory.c identifies the changes between the old flat view and the
> new one and calls callbacks appropriately.

OK, so .add and .del will be sufficient to capture any manipulation  
that would affect whether and where the region we care about is mapped?

> This code isn't the
> first use of the memory API listeners, so it's all well-tested code.

Sure, I'm not suggesting the code doesn't work -- just trying to  
understand how, so I know I'm using it properly.  The implementation is  
a bit opaque (to me at least), and the listener callbacks aren't  
documented the way the normal API functions are.

> >> However, maybe with a bit of brainstorming we could come up with a
> >> reasonably generic scheme.
> 
> > In the kernel API?  Or do you mean a generic scheme within QEMU  
> that encodes
> > any reasonably expected mechanism for setting the device adress  
> (e.g. assume
> > that it is either a 64-bit attribute, or uses the legacy ARM API),  
> or
> > perhaps a callback into device code?
> >
> > The MPIC's memory listener isn't that much code... I'm not sure
> > there's a great need for a central KVM registry.
> 
> Well, nor is the ARM memory listener, but why have two bits of
> code doing the same thing when you could have one?

They're not doing quite the same thing, though, and the effort required  
to unify them is non-zero.  The two main issues are the way that the  
address is communicated to KVM, and the ability to change the mapping  
after the guest starts.

-Scott

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC ppc-next PATCH 6/6] kvm/openpic: in-kernel mpic support
  2013-03-21 23:45           ` Alexander Graf
@ 2013-03-22 22:51             ` Scott Wood
  0 siblings, 0 replies; 40+ messages in thread
From: Scott Wood @ 2013-03-22 22:51 UTC (permalink / raw)
  To: Alexander Graf; +Cc: <qemu-ppc@nongnu.org>, <qemu-devel@nongnu.org>

On 03/21/2013 06:45:31 PM, Alexander Graf wrote:
> 
> On 21.03.2013, at 22:59, Scott Wood wrote:
> 
> > On 03/21/2013 04:29:02 PM, Alexander Graf wrote:
> >> Am 21.03.2013 um 21:50 schrieb Scott Wood  
> <scottwood@freescale.com>:
> >> > On 03/21/2013 03:41:19 AM, Alexander Graf wrote:
> >> >> Can't all the stuff above here just simply go into the qdev  
> init function?
> >> >
> >> > Not if you want platform code to be able to fall back to a QEMU  
> mpic if an in-kernel mpic is unavailable.
> >> Do we want that? We used to have a default like that in qemu-kvm  
> back in the day. That was very confusing, as people started to report  
> that their VMs turned out to be really slow.
> >> I think we should not have fallback code. It makes things easier  
> and more obvious. The default should just depend on the host's  
> capabilities.
> >
> > I don't follow.  What is the difference between "falling back" and  
> "depending on the host's capabilities"?  Either we can create an  
> in-kernel MPIC or we can't.  We could use KVM_CREATE_DEVICE_TEST to  
> see if the device type is supported separately from actually creating  
> it, but I don't see what that would accomplish other than adding more  
> code.

> We usually have settled on a tri-state way to change QEMU behavior  
> for most machine options:
> 
>   -machine <opt> is not specified -> best possible behavior in the  
> current system
>   -machine <opt>=on -> turn the option on, fail if that doesn't work
>   -machine <opt>=off -> turn the option off always
> 
> So for the in-kernel irqchip, we should follow the same model. If the  
> -machine option is not passed in, we should try to allocate an  
> in-kernel irqchip if possible.

That's fine, I just misunderstood what the semantics were supposed to  
be.

> If the kernel advertises that it can do an in-kernel irqchip, but in  
> fact it can't, I would consider that simply a bug we shouldn't worry  
> about.

I'm not worried about it.  I just don't see any benefit to deferring  
the creation of the device at that point, and we can't do the test any  
earlier because we won't know what type of irqchip to inquire about.

> >> > That is exactly what I was trying to avoid by introducing  
> kvm_irqchip_wanted.  We're no longer testing some vague generic  
> irqchip capability, but the presence of a specific type of device  
> (and version thereof).  How would the code that sets  
> kvm_irqchip_wanted know what to test for?
> >> Then move the default code into the board file and check for the  
> in-kernel mpic cap.
> >
> > I'm not quite sure what you mean by "the default code" -- if you  
> mean the part that makes the decision whether to fall back or error  
> out, that's already in board code.
> 
> No, currently that lives mostly in kvm-all.c.

By "currently" I mean in the current state of this patch, not how it's  
currently done for x86.  It's e500.c code that checks whether a "kvm  
openpic" was created.  If it wasn't -- due to the kernel not supporting  
it, or due to the user not wanting it -- it creates a normal openpic  
device.

This would change to error out if a kvm mpic was explicitly requested  
but unable to be created.

> I'm talking about the code that checks  
> qemu_opt_get_bool("kernel_irqchip") and decides what to do based on  
> that.

The only thing that it "decides what to do" for MPIC is record the  
preference for board code's consumption, and I agreed that there's no  
point leaving that there if it's more complicated than a simple bool.

-Scott

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-22 22:05                                   ` Scott Wood
@ 2013-03-23 11:24                                     ` Peter Maydell
  2013-03-25 18:23                                       ` Scott Wood
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Maydell @ 2013-03-23 11:24 UTC (permalink / raw)
  To: Scott Wood
  Cc: qemu-ppc@nongnu.org list:PowerPC, Alexander Graf, qemu-devel qemu-devel

On 22 March 2013 22:05, Scott Wood <scottwood@freescale.com> wrote:
> On 03/22/2013 08:08:57 AM, Peter Maydell wrote:
>> The memory API flattens the tree of memory regions down into a flat
>> view of the address space. These callbacks get called for the
>> final flattened view (so you'll never see a pure container in the
>> callback, only leaves). The callbacks happen for every region which
>> appears in the address space, in linear order. When an update happens
>> memory.c identifies the changes between the old flat view and the
>> new one and calls callbacks appropriately.
>
> OK, so .add and .del will be sufficient to capture any manipulation that
> would affect whether and where the region we care about is mapped?

Yes. Note that if the board (brokenly) maps the region so it is
'hidden' by another region, this manifests as a .del [since it
is no longer accessible]. Also I think if the board maps something
small on top and in the middle of the region you get an add for
each of the partially visible fragments. Personally I'm happy to
not worry about either of these cases on the basis that they would
be board model bugs.

>> This code isn't the
>> first use of the memory API listeners, so it's all well-tested code.
>
>
> Sure, I'm not suggesting the code doesn't work -- just trying to understand
> how, so I know I'm using it properly.  The implementation is a bit opaque
> (to me at least), and the listener callbacks aren't documented the way the
> normal API functions are.

Yeah, it would I guess be good to add doc comments for all the fields
in struct MemoryListener describing the semantics of the callbacks.

>> > The MPIC's memory listener isn't that much code... I'm not sure
>> > there's a great need for a central KVM registry.
>>
>> Well, nor is the ARM memory listener, but why have two bits of
>> code doing the same thing when you could have one?
>
> They're not doing quite the same thing, though, and the effort required to
> unify them is non-zero.  The two main issues are the way that the address is
> communicated to KVM, and the ability to change the mapping after the guest
> starts.

Ah, guest-programmable mappings are a real use case and not a hypothetical?
Do we run into synchronisation issues with making sure that QEMU and
the kernel both agree simultaneously about where the mapping is?
Can the mapping be different between different CPU cores? [let's
hope not :-)] Is the mapping controlled by a register within the
mapping itself, or is there some separate non-moving register which
defines the location of the mappable registers?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address()
  2013-03-23 11:24                                     ` Peter Maydell
@ 2013-03-25 18:23                                       ` Scott Wood
  0 siblings, 0 replies; 40+ messages in thread
From: Scott Wood @ 2013-03-25 18:23 UTC (permalink / raw)
  To: Peter Maydell
  Cc: qemu-ppc@nongnu.org list:PowerPC, Alexander Graf, qemu-devel qemu-devel

On 03/23/2013 06:24:49 AM, Peter Maydell wrote:
> On 22 March 2013 22:05, Scott Wood <scottwood@freescale.com> wrote:
> > On 03/22/2013 08:08:57 AM, Peter Maydell wrote:
> >> The memory API flattens the tree of memory regions down into a flat
> >> view of the address space. These callbacks get called for the
> >> final flattened view (so you'll never see a pure container in the
> >> callback, only leaves). The callbacks happen for every region which
> >> appears in the address space, in linear order. When an update  
> happens
> >> memory.c identifies the changes between the old flat view and the
> >> new one and calls callbacks appropriately.
> >
> > OK, so .add and .del will be sufficient to capture any manipulation  
> that
> > would affect whether and where the region we care about is mapped?
> 
> Yes. Note that if the board (brokenly) maps the region so it is
> 'hidden' by another region, this manifests as a .del [since it
> is no longer accessible]. Also I think if the board maps something
> small on top and in the middle of the region you get an add for
> each of the partially visible fragments. Personally I'm happy to
> not worry about either of these cases

Yeah, if we do check for those cases it should just be to print an  
error.

> on the basis that they would be board model bugs.

In some cases it could be guest code doing something screwy, but if you  
need to support that then turn off the in-kernel pic.

> >> > The MPIC's memory listener isn't that much code... I'm not sure
> >> > there's a great need for a central KVM registry.
> >>
> >> Well, nor is the ARM memory listener, but why have two bits of
> >> code doing the same thing when you could have one?
> >
> > They're not doing quite the same thing, though, and the effort  
> required to
> > unify them is non-zero.  The two main issues are the way that the  
> address is
> > communicated to KVM, and the ability to change the mapping after  
> the guest
> > starts.
> 
> Ah, guest-programmable mappings are a real use case and not a  
> hypothetical?

It's real in terms of how the hardware works.  QEMU doesn't yet  
implement it, but it should, especially with Alex occasionally saying  
he'd like to see QEMU capable of running U-Boot.

> Do we run into synchronisation issues with making sure that QEMU and
> the kernel both agree simultaneously about where the mapping is?

I don't think so -- the guest CPU that is doing the moving is stopped  
for QEMU MMIO, and if the other CPUs try to access it in the meantime,  
it would be undefined even on real hardware whether it happens before  
or after the move takes effect.

> Can the mapping be different between different CPU cores? [let's
> hope not :-)]

Not in a way that's relevant here.  There's some per-cpu magic internal  
to the region, but that's handled within the kernel.

> Is the mapping controlled by a register within the
> mapping itself, or is there some separate non-moving register which
> defines the location of the mappable registers?

There's a separate moving register. :-P

MPIC is combined with a bunch of other devices into a large region  
called CCSR, and that region can be moved as a whole by writing to a  
register at the beginning of CCSR.

-Scott

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Qemu-devel] [RFC PATCH v2 0/6] kvm/openpic: in-kernel irqchip
       [not found] <1360823521-32306-1-git-send-email-scottwood@freescale.com>
                   ` (3 preceding siblings ...)
       [not found] ` <1360823521-32306-7-git-send-email-scottwood@freescale.com>
@ 2013-04-15 23:19 ` Scott Wood
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 1/6] kvm: update linux-headers Scott Wood
                     ` (5 more replies)
  4 siblings, 6 replies; 40+ messages in thread
From: Scott Wood @ 2013-04-15 23:19 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-ppc, qemu-devel

This allows QEMU to use the in-kernel KVM MPIC on some PPC platforms.

Scott Wood (6):
  kvm: update linux-headers
  kvm: use hw/kvm/Makefile.objs consistently for all relevant
    architectures
  memory: add memory_region_to_address()
  openpic: factor out some common defines into openpic.h
  PPC: e500: factor out mpic init code
  kvm/openpic: in-kernel mpic support

 hw/Makefile.objs                |    1 +
 hw/arm/Makefile.objs            |    1 -
 hw/i386/Makefile.objs           |    1 -
 hw/kvm/Makefile.objs            |    8 +-
 hw/kvm/openpic.c                |  259 +++++++++++++++++++++++++++++++++++++++
 hw/openpic.c                    |   40 +++---
 hw/openpic.h                    |   12 ++
 hw/ppc/e500.c                   |  123 +++++++++++++++----
 include/exec/memory.h           |    9 ++
 linux-headers/asm-powerpc/kvm.h |   16 +++
 linux-headers/linux/kvm.h       |   34 +++++
 memory.c                        |   38 +++++-
 12 files changed, 491 insertions(+), 51 deletions(-)
 create mode 100644 hw/kvm/openpic.c

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Qemu-devel] [RFC PATCH v2 1/6] kvm: update linux-headers
  2013-04-15 23:19 ` [Qemu-devel] [RFC PATCH v2 0/6] kvm/openpic: in-kernel irqchip Scott Wood
@ 2013-04-15 23:19   ` Scott Wood
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 2/6] kvm: use hw/kvm/Makefile.objs consistently for all relevant architectures Scott Wood
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 40+ messages in thread
From: Scott Wood @ 2013-04-15 23:19 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Scott Wood, qemu-ppc, qemu-devel

These headers have not yet been merged into Linux -- this is an RFC
patchset.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 linux-headers/asm-powerpc/kvm.h |   16 ++++++++++++++++
 linux-headers/linux/kvm.h       |   34 ++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index 16064d0..36be2fe 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -417,4 +417,20 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_EPCR	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x85)
 #define KVM_REG_PPC_EPR		(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x86)
 
+/* Timer Status Register OR/CLEAR interface */
+#define KVM_REG_PPC_OR_TSR	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x87)
+#define KVM_REG_PPC_CLEAR_TSR	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x88)
+#define KVM_REG_PPC_TCR		(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x89)
+#define KVM_REG_PPC_TSR		(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8a)
+
+/* Debugging: Special instruction for software breakpoint */
+#define KVM_REG_PPC_DEBUG_INST	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8b)
+
+/* Device control API: PPC-specific devices */
+#define KVM_DEV_MPIC_GRP_MISC		1
+#define   KVM_DEV_MPIC_BASE_ADDR	0	/* 64-bit */
+
+#define KVM_DEV_MPIC_GRP_REGISTER	2	/* 32-bit */
+#define KVM_DEV_MPIC_GRP_IRQ_ACTIVE	3	/* 32-bit */
+
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index caca979..6c271c9 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -449,12 +449,15 @@ enum {
 	kvm_ioeventfd_flag_nr_datamatch,
 	kvm_ioeventfd_flag_nr_pio,
 	kvm_ioeventfd_flag_nr_deassign,
+	kvm_ioeventfd_flag_nr_virtio_ccw_notify,
 	kvm_ioeventfd_flag_nr_max,
 };
 
 #define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
 #define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
 #define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)
+#define KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY \
+	(1 << kvm_ioeventfd_flag_nr_virtio_ccw_notify)
 
 #define KVM_IOEVENTFD_VALID_FLAG_MASK  ((1 << kvm_ioeventfd_flag_nr_max) - 1)
 
@@ -665,6 +668,8 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_EPR 86
 #define KVM_CAP_ARM_PSCI 87
 #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
+#define KVM_CAP_DEVICE_CTRL 89
+#define KVM_CAP_IRQ_MPIC 90
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -906,6 +911,35 @@ struct kvm_s390_ucas_mapping {
 #define KVM_ARM_SET_DEVICE_ADDR	  _IOW(KVMIO,  0xab, struct kvm_arm_device_addr)
 
 /*
+ * Device control API, available with KVM_CAP_DEVICE_CTRL
+ */
+#define KVM_CREATE_DEVICE_TEST		1
+
+struct kvm_create_device {
+	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
+	__u32	fd;	/* out: device handle */
+	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
+};
+
+struct kvm_device_attr {
+	__u32	flags;		/* no flags currently defined */
+	__u32	group;		/* device-defined */
+	__u64	attr;		/* group-defined */
+	__u64	addr;		/* userspace address of attr data */
+};
+
+#define KVM_DEV_TYPE_FSL_MPIC_20	1
+#define KVM_DEV_TYPE_FSL_MPIC_42	2
+
+/* ioctl for vm fd */
+#define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
+
+/* ioctls for fds returned by KVM_CREATE_DEVICE */
+#define KVM_SET_DEVICE_ATTR	  _IOW(KVMIO,  0xe1, struct kvm_device_attr)
+#define KVM_GET_DEVICE_ATTR	  _IOW(KVMIO,  0xe2, struct kvm_device_attr)
+#define KVM_HAS_DEVICE_ATTR	  _IOW(KVMIO,  0xe3, struct kvm_device_attr)
+
+/*
  * ioctls for vcpu fds
  */
 #define KVM_RUN                   _IO(KVMIO,   0x80)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [RFC PATCH v2 2/6] kvm: use hw/kvm/Makefile.objs consistently for all relevant architectures
  2013-04-15 23:19 ` [Qemu-devel] [RFC PATCH v2 0/6] kvm/openpic: in-kernel irqchip Scott Wood
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 1/6] kvm: update linux-headers Scott Wood
@ 2013-04-15 23:19   ` Scott Wood
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 3/6] memory: add memory_region_to_address() Scott Wood
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 40+ messages in thread
From: Scott Wood @ 2013-04-15 23:19 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Scott Wood, qemu-ppc, qemu-devel

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
Build tested on ppc and x86, but not arm as I currently lack a suitable
toolchain.

Maybe TARGET_I386 should be set on x86_64, instead of needing to
test TARGET_BASE_ARCH in Makefile.objs?  It seems odd that it's set
for x86_64 in C code, but not in the makefiles.
---
 hw/Makefile.objs      |    1 +
 hw/arm/Makefile.objs  |    1 -
 hw/i386/Makefile.objs |    1 -
 hw/kvm/Makefile.objs  |    7 ++++++-
 4 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index d0b2ecb..3ce4ccd 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -216,4 +216,5 @@ obj-$(CONFIG_KVM) += ivshmem.o
 obj-$(CONFIG_LINUX) += vfio_pci.o
 endif
 
+obj-$(CONFIG_KVM) += kvm/
 endif
diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index f5f7d0e..aebbc86 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -23,7 +23,6 @@ obj-y += bitbang_i2c.o marvell_88w8618_audio.o
 obj-y += framebuffer.o
 obj-y += strongarm.o
 obj-y += imx_serial.o imx_ccm.o imx_timer.o imx_avic.o
-obj-$(CONFIG_KVM) += kvm/arm_gic.o
 
 obj-y := $(addprefix ../,$(obj-y))
 
diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index a78c0b2..5c54054 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -9,7 +9,6 @@ obj-y += lpc_ich9.o q35.o
 obj-$(CONFIG_XEN) += xen_platform.o xen_apic.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_msi.o
-obj-y += kvm/
 obj-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 obj-y += pc-testdev.o
 
diff --git a/hw/kvm/Makefile.objs b/hw/kvm/Makefile.objs
index f620d7f..2a157a6 100644
--- a/hw/kvm/Makefile.objs
+++ b/hw/kvm/Makefile.objs
@@ -1 +1,6 @@
-obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o pci-assign.o
+ifeq ($(TARGET_BASE_ARCH),i386)
+TARGET_BASE_I386=y
+endif
+
+obj-$(TARGET_BASE_I386) += clock.o apic.o i8259.o ioapic.o i8254.o pci-assign.o
+obj-$(TARGET_ARM) += arm_gic.o
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [RFC PATCH v2 3/6] memory: add memory_region_to_address()
  2013-04-15 23:19 ` [Qemu-devel] [RFC PATCH v2 0/6] kvm/openpic: in-kernel irqchip Scott Wood
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 1/6] kvm: update linux-headers Scott Wood
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 2/6] kvm: use hw/kvm/Makefile.objs consistently for all relevant architectures Scott Wood
@ 2013-04-15 23:19   ` Scott Wood
  2013-04-16  8:25     ` Peter Maydell
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 4/6] openpic: factor out some common defines into openpic.h Scott Wood
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 40+ messages in thread
From: Scott Wood @ 2013-04-15 23:19 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Scott Wood, qemu-ppc, qemu-devel

This is useful for when a user of the memory region API needs to
communicate the absolute bus address to something outside QEMU
(in particular, KVM).

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
TODO: Use add/del memory listeners later in the patchset, which would
eliminate the need for this patch.
---
 include/exec/memory.h |    9 +++++++++
 memory.c              |   38 ++++++++++++++++++++++++++++++++++----
 2 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2322732..b800391 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -892,6 +892,15 @@ void *address_space_map(AddressSpace *as, hwaddr addr,
 void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
                          int is_write, hwaddr access_len);
 
+/* memory_region_to_address: Find the full address of the start of the
+ *      given #MemoryRegion, ignoring aliases.  There is no guarantee
+ *      that the #MemoryRegion is actually visible at this address, if
+ *      there are overlapping regions.
+ *
+ * @mr: #MemoryRegion being queried
+ * @asp: if non-NULL, returns the #AddressSpace @mr is mapped in, if any
+ */
+hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace **asp);
 
 #endif
 
diff --git a/memory.c b/memory.c
index 75ca281..5d3b13d 100644
--- a/memory.c
+++ b/memory.c
@@ -453,21 +453,51 @@ const IORangeOps memory_region_iorange_ops = {
     .destructor = memory_region_iorange_destructor,
 };
 
-static AddressSpace *memory_region_to_address_space(MemoryRegion *mr)
+static AddressSpace *memory_region_root_to_address_space(MemoryRegion *mr)
 {
     AddressSpace *as;
 
-    while (mr->parent) {
-        mr = mr->parent;
-    }
     QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
         if (mr == as->root) {
             return as;
         }
     }
+
+    return NULL;
+}
+
+static AddressSpace *memory_region_to_address_space(MemoryRegion *mr)
+{
+    AddressSpace *as;
+
+    while (mr->parent) {
+        mr = mr->parent;
+    }
+
+    as = memory_region_root_to_address_space(mr);
+    if (as) {
+        return as;
+    }
+
     abort();
 }
 
+hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace **asp)
+{
+    hwaddr addr = mr->addr;
+
+    while (mr->parent) {
+        mr = mr->parent;
+        addr += mr->addr;
+    }
+
+    if (asp) {
+        *asp = memory_region_root_to_address_space(mr);
+    }
+
+    return addr;
+}
+
 /* Render a memory region into the global view.  Ranges in @view obscure
  * ranges in @mr.
  */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [RFC PATCH v2 4/6] openpic: factor out some common defines into openpic.h
  2013-04-15 23:19 ` [Qemu-devel] [RFC PATCH v2 0/6] kvm/openpic: in-kernel irqchip Scott Wood
                     ` (2 preceding siblings ...)
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 3/6] memory: add memory_region_to_address() Scott Wood
@ 2013-04-15 23:19   ` Scott Wood
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 5/6] PPC: e500: factor out mpic init code Scott Wood
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 6/6] kvm/openpic: in-kernel mpic support Scott Wood
  5 siblings, 0 replies; 40+ messages in thread
From: Scott Wood @ 2013-04-15 23:19 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Scott Wood, qemu-ppc, qemu-devel

...for use by the KVM in-kernel irqchip stub.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 hw/openpic.c |   40 ++++++++++++++++++----------------------
 hw/openpic.h |   11 +++++++++++
 2 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/hw/openpic.c b/hw/openpic.c
index 03a7075..b655381 100644
--- a/hw/openpic.c
+++ b/hw/openpic.c
@@ -57,11 +57,7 @@ static const int debug_openpic = 0;
     } while (0)
 
 #define MAX_CPU     32
-#define MAX_SRC     256
-#define MAX_TMR     4
-#define MAX_IPI     4
 #define MAX_MSI     8
-#define MAX_IRQ     (MAX_SRC + MAX_IPI + MAX_TMR)
 #define VID         0x03 /* MPIC version ID */
 
 /* OpenPIC capability flags */
@@ -78,7 +74,7 @@ static const int debug_openpic = 0;
 #define OPENPIC_SUMMARY_REG_START   0x3800
 #define OPENPIC_SUMMARY_REG_SIZE    0x800
 #define OPENPIC_SRC_REG_START        0x10000
-#define OPENPIC_SRC_REG_SIZE         (MAX_SRC * 0x20)
+#define OPENPIC_SRC_REG_SIZE         (OPENPIC_MAX_SRC * 0x20)
 #define OPENPIC_CPU_REG_START        0x20000
 #define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
 
@@ -86,8 +82,8 @@ static const int debug_openpic = 0;
 #define RAVEN_MAX_CPU      2
 #define RAVEN_MAX_EXT     48
 #define RAVEN_MAX_IRQ     64
-#define RAVEN_MAX_TMR      MAX_TMR
-#define RAVEN_MAX_IPI      MAX_IPI
+#define RAVEN_MAX_TMR      OPENPIC_MAX_TMR
+#define RAVEN_MAX_IPI      OPENPIC_MAX_IPI
 
 /* Interrupt definitions */
 #define RAVEN_FE_IRQ     (RAVEN_MAX_EXT)     /* Internal functional IRQ */
@@ -209,7 +205,7 @@ typedef struct IRQQueue {
     /* Round up to the nearest 64 IRQs so that the queue length
      * won't change when moving between 32 and 64 bit hosts.
      */
-    unsigned long queue[BITS_TO_LONGS((MAX_IRQ + 63) & ~63)];
+    unsigned long queue[BITS_TO_LONGS((OPENPIC_MAX_IRQ + 63) & ~63)];
     int next;
     int priority;
 } IRQQueue;
@@ -283,7 +279,7 @@ typedef struct OpenPICState {
     uint32_t spve; /* Spurious vector register */
     uint32_t tfrr; /* Timer frequency reporting register */
     /* Source registers */
-    IRQSource src[MAX_IRQ];
+    IRQSource src[OPENPIC_MAX_IRQ];
     /* Local registers per output pin */
     IRQDest dst[MAX_CPU];
     uint32_t nb_cpus;
@@ -291,7 +287,7 @@ typedef struct OpenPICState {
     struct {
         uint32_t tccr;  /* Global timer current count register */
         uint32_t tbcr;  /* Global timer base count register */
-    } timers[MAX_TMR];
+    } timers[OPENPIC_MAX_TMR];
     /* Shared MSI registers */
     struct {
         uint32_t msir;   /* Shared Message Signaled Interrupt Register */
@@ -503,7 +499,7 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
     OpenPICState *opp = opaque;
     IRQSource *src;
 
-    if (n_IRQ >= MAX_IRQ) {
+    if (n_IRQ >= OPENPIC_MAX_IRQ) {
         fprintf(stderr, "%s: IRQ %d out of range\n", __func__, n_IRQ);
         abort();
     }
@@ -576,7 +572,7 @@ static void openpic_reset(DeviceState *d)
         opp->dst[i].servicing.next = -1;
     }
     /* Initialise timers */
-    for (i = 0; i < MAX_TMR; i++) {
+    for (i = 0; i < OPENPIC_MAX_TMR; i++) {
         opp->timers[i].tccr = 0;
         opp->timers[i].tbcr = TBCR_CI;
     }
@@ -1182,7 +1178,7 @@ static uint32_t openpic_iack(OpenPICState *opp, IRQDest *dst, int cpu)
         IRQ_resetbit(&dst->raised, irq);
     }
 
-    if ((irq >= opp->irq_ipi0) &&  (irq < (opp->irq_ipi0 + MAX_IPI))) {
+    if ((irq >= opp->irq_ipi0) &&  (irq < (opp->irq_ipi0 + OPENPIC_MAX_IPI))) {
         src->destmask &= ~(1 << cpu);
         if (src->destmask && !src->level) {
             /* trigger on CPUs that didn't know about it yet */
@@ -1381,7 +1377,7 @@ static void openpic_save(QEMUFile* f, void *opaque)
                         sizeof(opp->dst[i].outputs_active));
     }
 
-    for (i = 0; i < MAX_TMR; i++) {
+    for (i = 0; i < OPENPIC_MAX_TMR; i++) {
         qemu_put_be32s(f, &opp->timers[i].tccr);
         qemu_put_be32s(f, &opp->timers[i].tbcr);
     }
@@ -1440,7 +1436,7 @@ static int openpic_load(QEMUFile* f, void *opaque, int version_id)
                         sizeof(opp->dst[i].outputs_active));
     }
 
-    for (i = 0; i < MAX_TMR; i++) {
+    for (i = 0; i < OPENPIC_MAX_TMR; i++) {
         qemu_get_be32s(f, &opp->timers[i].tccr);
         qemu_get_be32s(f, &opp->timers[i].tbcr);
     }
@@ -1473,7 +1469,7 @@ typedef struct MemReg {
 static void fsl_common_init(OpenPICState *opp)
 {
     int i;
-    int virq = MAX_SRC;
+    int virq = OPENPIC_MAX_SRC;
 
     opp->vid = VID_REVISION_1_2;
     opp->vir = VIR_GENERIC;
@@ -1481,14 +1477,14 @@ static void fsl_common_init(OpenPICState *opp)
     opp->tfrr_reset = 0;
     opp->ivpr_reset = IVPR_MASK_MASK;
     opp->idr_reset = 1 << 0;
-    opp->max_irq = MAX_IRQ;
+    opp->max_irq = OPENPIC_MAX_IRQ;
 
     opp->irq_ipi0 = virq;
-    virq += MAX_IPI;
+    virq += OPENPIC_MAX_IPI;
     opp->irq_tim0 = virq;
-    virq += MAX_TMR;
+    virq += OPENPIC_MAX_TMR;
 
-    assert(virq <= MAX_IRQ);
+    assert(virq <= OPENPIC_MAX_IRQ);
 
     opp->irq_msi = 224;
 
@@ -1498,13 +1494,13 @@ static void fsl_common_init(OpenPICState *opp)
     }
 
     /* Internal interrupts, including message and MSI */
-    for (i = 16; i < MAX_SRC; i++) {
+    for (i = 16; i < OPENPIC_MAX_SRC; i++) {
         opp->src[i].type = IRQ_TYPE_FSLINT;
         opp->src[i].level = true;
     }
 
     /* timers and IPIs */
-    for (i = MAX_SRC; i < virq; i++) {
+    for (i = OPENPIC_MAX_SRC; i < virq; i++) {
         opp->src[i].type = IRQ_TYPE_FSLSPECIAL;
         opp->src[i].level = false;
     }
diff --git a/hw/openpic.h b/hw/openpic.h
index 9dcaf0e..d873bb6 100644
--- a/hw/openpic.h
+++ b/hw/openpic.h
@@ -1,6 +1,9 @@
 #if !defined(__OPENPIC_H__)
 #define __OPENPIC_H__
 
+#include "qemu-common.h"
+#include "hw/qdev.h"
+
 /* OpenPIC have 5 outputs per CPU connected and one IRQ out single output */
 enum {
     OPENPIC_OUTPUT_INT = 0, /* IRQ                       */
@@ -15,4 +18,12 @@ enum {
 #define OPENPIC_MODEL_FSL_MPIC_20 1
 #define OPENPIC_MODEL_FSL_MPIC_42 2
 
+#define OPENPIC_MAX_SRC     256
+#define OPENPIC_MAX_TMR     4
+#define OPENPIC_MAX_IPI     4
+#define OPENPIC_MAX_IRQ     (OPENPIC_MAX_SRC + OPENPIC_MAX_IPI + \
+                             OPENPIC_MAX_TMR)
+
+DeviceState *kvm_openpic_create(BusState *bus, int model);
+
 #endif /* __OPENPIC_H__ */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [RFC PATCH v2 5/6] PPC: e500: factor out mpic init code
  2013-04-15 23:19 ` [Qemu-devel] [RFC PATCH v2 0/6] kvm/openpic: in-kernel irqchip Scott Wood
                     ` (3 preceding siblings ...)
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 4/6] openpic: factor out some common defines into openpic.h Scott Wood
@ 2013-04-15 23:19   ` Scott Wood
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 6/6] kvm/openpic: in-kernel mpic support Scott Wood
  5 siblings, 0 replies; 40+ messages in thread
From: Scott Wood @ 2013-04-15 23:19 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Scott Wood, qemu-ppc, qemu-devel

KVM in-kernel MPIC support is going to expand this even more,
so let's keep it contained.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 hw/ppc/e500.c |   56 ++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 34 insertions(+), 22 deletions(-)

diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index fef9c5d..09dbf7f 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -448,6 +448,38 @@ static void ppce500_cpu_reset(void *opaque)
     mmubooke_create_initial_mapping(env);
 }
 
+static qemu_irq *ppce500_init_mpic(PPCE500Params *params, MemoryRegion *ccsr,
+                                   qemu_irq **irqs)
+{
+    qemu_irq *mpic;
+    DeviceState *dev;
+    SysBusDevice *s;
+    int i, j, k;
+
+    mpic = g_new(qemu_irq, 256);
+    dev = qdev_create(NULL, "openpic");
+    qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
+    qdev_prop_set_uint32(dev, "model", params->mpic_version);
+    qdev_init_nofail(dev);
+    s = SYS_BUS_DEVICE(dev);
+
+    k = 0;
+    for (i = 0; i < smp_cpus; i++) {
+        for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
+            sysbus_connect_irq(s, k++, irqs[i][j]);
+        }
+    }
+
+    for (i = 0; i < 256; i++) {
+        mpic[i] = qdev_get_gpio_in(dev, i);
+    }
+
+    memory_region_add_subregion(ccsr, MPC8544_MPIC_REGS_OFFSET,
+                                s->mmio[0].memory);
+
+    return mpic;
+}
+
 void ppce500_init(PPCE500Params *params)
 {
     MemoryRegion *address_space_mem = get_system_memory();
@@ -463,7 +495,7 @@ void ppce500_init(PPCE500Params *params)
     target_ulong initrd_base = 0;
     target_long initrd_size = 0;
     target_ulong cur_base = 0;
-    int i = 0, j, k;
+    int i;
     unsigned int pci_irq_nrs[4] = {1, 2, 3, 4};
     qemu_irq **irqs, *mpic;
     DeviceState *dev;
@@ -538,27 +570,7 @@ void ppce500_init(PPCE500Params *params)
     memory_region_add_subregion(address_space_mem, MPC8544_CCSRBAR_BASE,
                                 ccsr_addr_space);
 
-    /* MPIC */
-    mpic = g_new(qemu_irq, 256);
-    dev = qdev_create(NULL, "openpic");
-    qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
-    qdev_prop_set_uint32(dev, "model", params->mpic_version);
-    qdev_init_nofail(dev);
-    s = SYS_BUS_DEVICE(dev);
-
-    k = 0;
-    for (i = 0; i < smp_cpus; i++) {
-        for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
-            sysbus_connect_irq(s, k++, irqs[i][j]);
-        }
-    }
-
-    for (i = 0; i < 256; i++) {
-        mpic[i] = qdev_get_gpio_in(dev, i);
-    }
-
-    memory_region_add_subregion(ccsr_addr_space, MPC8544_MPIC_REGS_OFFSET,
-                                s->mmio[0].memory);
+    mpic = ppce500_init_mpic(params, ccsr_addr_space, irqs);
 
     /* Serial */
     if (serial_hds[0]) {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [RFC PATCH v2 6/6] kvm/openpic: in-kernel mpic support
  2013-04-15 23:19 ` [Qemu-devel] [RFC PATCH v2 0/6] kvm/openpic: in-kernel irqchip Scott Wood
                     ` (4 preceding siblings ...)
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 5/6] PPC: e500: factor out mpic init code Scott Wood
@ 2013-04-15 23:19   ` Scott Wood
  5 siblings, 0 replies; 40+ messages in thread
From: Scott Wood @ 2013-04-15 23:19 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Scott Wood, qemu-ppc, qemu-devel

This depends on RFC kernel interfaces proposed at:
http://patchwork.ozlabs.org/patch/236285/
http://patchwork.ozlabs.org/patch/236288/
http://patchwork.ozlabs.org/patch/236290/

TODO: use KVM_IRQ_LINE

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 hw/kvm/Makefile.objs |    1 +
 hw/kvm/openpic.c     |  259 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/openpic.h         |    1 +
 hw/ppc/e500.c        |   77 ++++++++++++++-
 4 files changed, 333 insertions(+), 5 deletions(-)
 create mode 100644 hw/kvm/openpic.c

diff --git a/hw/kvm/Makefile.objs b/hw/kvm/Makefile.objs
index 2a157a6..be95fc1 100644
--- a/hw/kvm/Makefile.objs
+++ b/hw/kvm/Makefile.objs
@@ -4,3 +4,4 @@ endif
 
 obj-$(TARGET_BASE_I386) += clock.o apic.o i8259.o ioapic.o i8254.o pci-assign.o
 obj-$(TARGET_ARM) += arm_gic.o
+obj-$(TARGET_PPC) += openpic.o
diff --git a/hw/kvm/openpic.c b/hw/kvm/openpic.c
new file mode 100644
index 0000000..609e268
--- /dev/null
+++ b/hw/kvm/openpic.c
@@ -0,0 +1,259 @@
+/*
+ * KVM in-kernel OpenPIC
+ *
+ * Copyright 2013 Freescale Semiconductor, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <sys/ioctl.h>
+#include "exec/address-spaces.h"
+#include "hw/hw.h"
+#include "hw/openpic.h"
+#include "hw/pci/msi.h"
+#include "hw/sysbus.h"
+#include "sysemu/kvm.h"
+#include "qemu/log.h"
+
+typedef struct KVMOpenPICState {
+    SysBusDevice busdev;
+    MemoryRegion mem;
+    MemoryListener mem_listener;
+    hwaddr reg_base;
+    uint32_t kern_id;
+    uint32_t model;
+} KVMOpenPICState;
+
+static void kvm_openpic_set_irq(void *opaque, int n_IRQ, int level)
+{
+    KVMOpenPICState *opp = opaque;
+    struct kvm_device_attr attr;
+    uint32_t val32 = level;
+    int ret;
+
+    attr.group = KVM_DEV_MPIC_GRP_IRQ_ACTIVE;
+    attr.attr = n_IRQ;
+    attr.addr = (uint64_t)(long)&val32;
+
+    ret = ioctl(opp->kern_id, KVM_SET_DEVICE_ATTR, &attr);
+    if (ret < 0) {
+        fprintf(stderr, "%s: %s %llx\n", __func__, strerror(errno), attr.attr);
+    }
+}
+
+static void kvm_openpic_reset(DeviceState *d)
+{
+    qemu_log_mask(LOG_UNIMP, "%s: unimplemented\n", __func__);
+}
+
+static void kvm_openpic_write(void *opaque, hwaddr addr, uint64_t val,
+                              unsigned size)
+{
+    KVMOpenPICState *opp = opaque;
+    struct kvm_device_attr attr;
+    uint32_t val32 = val;
+    int ret;
+
+    attr.group = KVM_DEV_MPIC_GRP_REGISTER;
+    attr.attr = addr;
+    attr.addr = (uint64_t)(long)&val32;
+
+    ret = ioctl(opp->kern_id, KVM_SET_DEVICE_ATTR, &attr);
+    if (ret < 0) {
+        qemu_log_mask(LOG_UNIMP, "%s: %s %llx\n", __func__,
+                      strerror(errno), attr.attr);
+    }
+}
+
+static uint64_t kvm_openpic_read(void *opaque, hwaddr addr, unsigned size)
+{
+    KVMOpenPICState *opp = opaque;
+    struct kvm_device_attr attr;
+    uint32_t val = 0xdeadbeef;
+    int ret;
+
+    attr.group = KVM_DEV_MPIC_GRP_REGISTER;
+    attr.attr = addr;
+    attr.addr = (uint64_t)(long)&val;
+
+    ret = ioctl(opp->kern_id, KVM_GET_DEVICE_ATTR, &attr);
+    if (ret < 0) {
+        qemu_log_mask(LOG_UNIMP, "%s: %s %llx\n", __func__,
+                      strerror(errno), attr.attr);
+        return 0;
+    }
+
+    return val;
+}
+
+static const MemoryRegionOps kvm_openpic_mem_ops = {
+    .write = kvm_openpic_write,
+    .read  = kvm_openpic_read,
+    .endianness = DEVICE_BIG_ENDIAN,
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
+
+static void kvm_openpic_update_reg_base(MemoryListener *listener)
+{
+    KVMOpenPICState *opp = container_of(listener, KVMOpenPICState,
+                                        mem_listener);
+    struct kvm_device_attr attr;
+    uint64_t reg_base;
+    AddressSpace *as;
+    int ret;
+
+    reg_base = memory_region_to_address(&opp->mem, &as);
+    if (!as) {
+        reg_base = 0;
+    } else if (as != &address_space_memory) {
+        abort();
+    }
+
+    if (reg_base == opp->reg_base) {
+        return;
+    }
+
+    opp->reg_base = reg_base;
+
+    attr.group = KVM_DEV_MPIC_GRP_MISC;
+    attr.attr = KVM_DEV_MPIC_BASE_ADDR;
+    attr.addr = (uint64_t)(long)&reg_base;
+
+    ret = ioctl(opp->kern_id, KVM_SET_DEVICE_ATTR, &attr);
+    if (ret < 0) {
+        fprintf(stderr, "%s: %s %llx\n", __func__, strerror(errno), reg_base);
+    }
+}
+
+static int kvm_openpic_init(SysBusDevice *dev)
+{
+    KVMOpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
+    int kvm_openpic_model;
+
+    memory_region_init_io(&opp->mem, &kvm_openpic_mem_ops, opp,
+                          "kvm-openpic", 0x40000);
+
+    switch (opp->model) {
+    case OPENPIC_MODEL_FSL_MPIC_20:
+        kvm_openpic_model = KVM_DEV_TYPE_FSL_MPIC_20;
+        break;
+
+    case OPENPIC_MODEL_FSL_MPIC_42:
+        kvm_openpic_model = KVM_DEV_TYPE_FSL_MPIC_42;
+        break;
+
+    default:
+        return -EINVAL;
+    }
+
+    sysbus_init_mmio(dev, &opp->mem);
+    qdev_init_gpio_in(&dev->qdev, kvm_openpic_set_irq, OPENPIC_MAX_IRQ);
+
+    opp->mem_listener.commit = kvm_openpic_update_reg_base;
+    memory_listener_register(&opp->mem_listener, &address_space_memory);
+
+    msi_supported = true;
+    return 0;
+}
+
+int kvm_openpic_connect_vcpu(DeviceState *d, CPUState *cs)
+{
+    KVMOpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
+    struct kvm_enable_cap encap = {};
+
+    encap.cap = KVM_CAP_IRQ_MPIC;
+    encap.args[0] = opp->kern_id;
+    encap.args[1] = cs->cpu_index;
+
+    return kvm_vcpu_ioctl(cs, KVM_ENABLE_CAP, &encap);
+}
+
+DeviceState *kvm_openpic_create(BusState *bus, int model)
+{
+    KVMState *s = kvm_state;
+    DeviceState *dev;
+    struct kvm_create_device cd = {0};
+    int ret;
+
+    if (!kvm_check_extension(s, KVM_CAP_DEVICE_CTRL)) {
+        return NULL;
+    }
+
+    switch (model) {
+    case OPENPIC_MODEL_FSL_MPIC_20:
+        cd.type = KVM_DEV_TYPE_FSL_MPIC_20;
+        break;
+
+    case OPENPIC_MODEL_FSL_MPIC_42:
+        cd.type = KVM_DEV_TYPE_FSL_MPIC_42;
+        break;
+
+    default:
+        qemu_log_mask(LOG_UNIMP, "%s: unknown openpic model %d\n",
+                      __func__, model);
+        return NULL;
+    }
+
+    ret = kvm_vm_ioctl(s, KVM_CREATE_DEVICE, &cd);
+    if (ret < 0) {
+        qemu_log_mask(LOG_UNIMP, "%s: can't create device %d: %s\n",
+                      __func__, cd.type, strerror(errno));
+        return NULL;
+    }
+
+    dev = qdev_create(NULL, "kvm-openpic");
+    qdev_prop_set_uint32(dev, "model", model);
+    qdev_prop_set_uint32(dev, "kernel-id", cd.fd);
+
+    return dev;
+}
+
+static Property kvm_openpic_properties[] = {
+    DEFINE_PROP_UINT32("model", KVMOpenPICState, model,
+                       OPENPIC_MODEL_FSL_MPIC_20),
+    DEFINE_PROP_UINT32("kernel-id", KVMOpenPICState, kern_id, 0),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void kvm_openpic_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
+
+    k->init = kvm_openpic_init;
+    dc->props = kvm_openpic_properties;
+    dc->reset = kvm_openpic_reset;
+}
+
+static const TypeInfo kvm_openpic_info = {
+    .name          = "kvm-openpic",
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(KVMOpenPICState),
+    .class_init    = kvm_openpic_class_init,
+};
+
+static void kvm_openpic_register_types(void)
+{
+    type_register_static(&kvm_openpic_info);
+}
+
+type_init(kvm_openpic_register_types)
diff --git a/hw/openpic.h b/hw/openpic.h
index d873bb6..1a27efa 100644
--- a/hw/openpic.h
+++ b/hw/openpic.h
@@ -25,5 +25,6 @@ enum {
                              OPENPIC_MAX_TMR)
 
 DeviceState *kvm_openpic_create(BusState *bus, int model);
+int kvm_openpic_connect_vcpu(DeviceState *d, CPUState *cs);
 
 #endif /* __OPENPIC_H__ */
diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index 09dbf7f..1697b44 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -448,18 +448,17 @@ static void ppce500_cpu_reset(void *opaque)
     mmubooke_create_initial_mapping(env);
 }
 
-static qemu_irq *ppce500_init_mpic(PPCE500Params *params, MemoryRegion *ccsr,
-                                   qemu_irq **irqs)
+static DeviceState *ppce500_init_mpic_qemu(PPCE500Params *params,
+                                           qemu_irq **irqs)
 {
-    qemu_irq *mpic;
     DeviceState *dev;
     SysBusDevice *s;
     int i, j, k;
 
-    mpic = g_new(qemu_irq, 256);
     dev = qdev_create(NULL, "openpic");
-    qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
     qdev_prop_set_uint32(dev, "model", params->mpic_version);
+    qdev_prop_set_uint32(dev, "nb_cpus", smp_cpus);
+
     qdev_init_nofail(dev);
     s = SYS_BUS_DEVICE(dev);
 
@@ -470,10 +469,78 @@ static qemu_irq *ppce500_init_mpic(PPCE500Params *params, MemoryRegion *ccsr,
         }
     }
 
+    return dev;
+}
+
+static DeviceState *ppce500_init_mpic_kvm(PPCE500Params *params,
+                                          qemu_irq **irqs)
+{
+    DeviceState *dev;
+    CPUPPCState *env;
+    CPUState *cs;
+
+    dev = kvm_openpic_create(NULL, params->mpic_version);
+    if (!dev) {
+        return NULL;
+    }
+
+    qdev_init_nofail(dev);
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        cs = ENV_GET_CPU(env);
+
+        if (kvm_openpic_connect_vcpu(dev, cs)) {
+            fprintf(stderr, "%s: failed to connect vcpu to irqchip\n",
+                    __func__);
+            abort();
+        }
+    }
+
+    return dev;
+}
+
+static qemu_irq *ppce500_init_mpic(PPCE500Params *params, MemoryRegion *ccsr,
+                                   qemu_irq **irqs)
+{
+    QemuOptsList *list;
+    qemu_irq *mpic;
+    DeviceState *dev = NULL;
+    SysBusDevice *s;
+    int i;
+
+    mpic = g_new(qemu_irq, 256);
+
+    if (kvm_enabled()) {
+        bool irqchip_allowed = true, irqchip_required = false;
+
+        list = qemu_find_opts("machine");
+        if (!QTAILQ_EMPTY(&list->head)) {
+            irqchip_allowed = qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
+                                                "kernel_irqchip", true);
+            irqchip_required = qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
+                                                 "kernel_irqchip", false);
+        }
+
+        if (irqchip_allowed) {
+            dev = ppce500_init_mpic_kvm(params, irqs);
+        }
+
+        if (irqchip_required && !dev) {
+            fprintf(stderr, "%s: irqchip requested but unavailable\n",
+                    __func__);
+            abort();
+        }
+    }
+
+    if (!dev) {
+        dev = ppce500_init_mpic_qemu(params, irqs);
+    }
+
     for (i = 0; i < 256; i++) {
         mpic[i] = qdev_get_gpio_in(dev, i);
     }
 
+    s = SYS_BUS_DEVICE(dev);
     memory_region_add_subregion(ccsr, MPC8544_MPIC_REGS_OFFSET,
                                 s->mmio[0].memory);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 3/6] memory: add memory_region_to_address()
  2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 3/6] memory: add memory_region_to_address() Scott Wood
@ 2013-04-16  8:25     ` Peter Maydell
  2013-04-17  0:10       ` Scott Wood
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Maydell @ 2013-04-16  8:25 UTC (permalink / raw)
  To: Scott Wood; +Cc: qemu-ppc, Alexander Graf, qemu-devel

On 16 April 2013 00:19, Scott Wood <scottwood@freescale.com> wrote:
> This is useful for when a user of the memory region API needs to
> communicate the absolute bus address to something outside QEMU
> (in particular, KVM).
>
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> TODO: Use add/del memory listeners later in the patchset, which would
> eliminate the need for this patch.

Yes, please do.

> +/* memory_region_to_address: Find the full address of the start of the
> + *      given #MemoryRegion, ignoring aliases.  There is no guarantee
> + *      that the #MemoryRegion is actually visible at this address, if
> + *      there are overlapping regions.
> + *
> + * @mr: #MemoryRegion being queried
> + * @asp: if non-NULL, returns the #AddressSpace @mr is mapped in, if any
> + */
> +hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace **asp);

A MemoryRegion can appear in more than one AddressSpace (or none at all),
so I don't think this is a very clearly defined API to put in the
memory API itself. (It's ok to make that kind of assumption as a user
of the memory APIs for particular cases, eg in how a memory listener
callback function behaves. But we shouldn't be baking those assumptions
into new API functions.)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 3/6] memory: add memory_region_to_address()
  2013-04-16  8:25     ` Peter Maydell
@ 2013-04-17  0:10       ` Scott Wood
  2013-04-17  9:14         ` Peter Maydell
  0 siblings, 1 reply; 40+ messages in thread
From: Scott Wood @ 2013-04-17  0:10 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-ppc, Alexander Graf, qemu-devel

On 04/16/2013 03:25:50 AM, Peter Maydell wrote:
> On 16 April 2013 00:19, Scott Wood <scottwood@freescale.com> wrote:
> > This is useful for when a user of the memory region API needs to
> > communicate the absolute bus address to something outside QEMU
> > (in particular, KVM).
> >
> > Signed-off-by: Scott Wood <scottwood@freescale.com>
> > ---
> > TODO: Use add/del memory listeners later in the patchset, which  
> would
> > eliminate the need for this patch.
> 
> Yes, please do.

I will.  Alex wanted a respun patchset to work with the latest kernel  
API, and I didn't have a chance to address this yet.  It's still RFC.

> > +/* memory_region_to_address: Find the full address of the start of  
> the
> > + *      given #MemoryRegion, ignoring aliases.  There is no  
> guarantee
> > + *      that the #MemoryRegion is actually visible at this  
> address, if
> > + *      there are overlapping regions.
> > + *
> > + * @mr: #MemoryRegion being queried
> > + * @asp: if non-NULL, returns the #AddressSpace @mr is mapped in,  
> if any
> > + */
> > +hwaddr memory_region_to_address(MemoryRegion *mr, AddressSpace  
> **asp);
> 
> A MemoryRegion can appear in more than one AddressSpace (or none at  
> all),
> so I don't think this is a very clearly defined API to put in the
> memory API itself. (It's ok to make that kind of assumption as a user
> of the memory APIs for particular cases, eg in how a memory listener
> callback function behaves. But we shouldn't be baking those  
> assumptions
> into new API functions.)

I'm not sure how it's different from memory_region_to_address_space in  
that regard, other than its ability to gracefully handle the "or none  
at all" case.

How does it work to be in more than one AddressSpace, if there's a  
parent chain?  More than one "as" with the same "as->root"?  Or are you  
referring to aliases?

-Scott

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v2 3/6] memory: add memory_region_to_address()
  2013-04-17  0:10       ` Scott Wood
@ 2013-04-17  9:14         ` Peter Maydell
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2013-04-17  9:14 UTC (permalink / raw)
  To: Scott Wood; +Cc: qemu-ppc, Alexander Graf, qemu-devel

On 17 April 2013 01:10, Scott Wood <scottwood@freescale.com> wrote:
> On 04/16/2013 03:25:50 AM, Peter Maydell wrote:
>> A MemoryRegion can appear in more than one AddressSpace (or none at all),
>> so I don't think this is a very clearly defined API to put in the
>> memory API itself. (It's ok to make that kind of assumption as a user
>> of the memory APIs for particular cases, eg in how a memory listener
>> callback function behaves. But we shouldn't be baking those assumptions
>> into new API functions.)

> I'm not sure how it's different from memory_region_to_address_space in that
> regard, other than its ability to gracefully handle the "or none at all"
> case.

memory_region_to_address_space() is an internal function used by
a couple of functions -- it's passed "a top level (ie parentless)
region", not an arbitrary region [semantically, anyway].

> How does it work to be in more than one AddressSpace, if there's a parent
> chain?  More than one "as" with the same "as->root"?  Or are you referring
> to aliases?

Yes, I had aliases in mind. Ignoring aliases isn't really very
doable since you can't control whether somebody else has taken
an alias of you.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2013-04-17  9:15 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1360823521-32306-1-git-send-email-scottwood@freescale.com>
     [not found] ` <1360823521-32306-3-git-send-email-scottwood@freescale.com>
2013-03-21  8:30   ` [Qemu-devel] [RFC ppc-next PATCH 2/6] kvm: hw/kvm is not x86-specific Alexander Graf
     [not found] ` <1360823521-32306-4-git-send-email-scottwood@freescale.com>
2013-03-21  8:31   ` [Qemu-devel] [RFC ppc-next PATCH 3/6] memory: add memory_region_to_address() Alexander Graf
2013-03-21 10:53     ` Peter Maydell
2013-03-21 10:59       ` Alexander Graf
2013-03-21 11:01         ` Peter Maydell
2013-03-21 11:05           ` Alexander Graf
2013-03-21 11:09             ` Peter Maydell
2013-03-21 11:14               ` Alexander Graf
2013-03-21 11:22                 ` Peter Maydell
2013-03-21 11:29                   ` Alexander Graf
2013-03-21 11:32                     ` Peter Maydell
2013-03-21 11:38                       ` Alexander Graf
2013-03-21 11:44                         ` Peter Maydell
2013-03-21 11:49                           ` Alexander Graf
2013-03-21 11:51                             ` [Qemu-devel] [Qemu-ppc] " Alexander Graf
2013-03-21 22:43                               ` Scott Wood
2013-03-22 13:08                                 ` Peter Maydell
2013-03-22 22:05                                   ` Scott Wood
2013-03-23 11:24                                     ` Peter Maydell
2013-03-25 18:23                                       ` Scott Wood
2013-03-21 11:53                             ` [Qemu-devel] " Peter Maydell
     [not found] ` <1360823521-32306-6-git-send-email-scottwood@freescale.com>
2013-03-21  8:34   ` [Qemu-devel] [RFC ppc-next PATCH 5/6] kvm: export result of irqchip config check Alexander Graf
2013-03-21  8:45     ` Jan Kiszka
2013-03-21  8:50       ` Alexander Graf
     [not found] ` <1360823521-32306-7-git-send-email-scottwood@freescale.com>
2013-03-21  8:41   ` [Qemu-devel] [RFC ppc-next PATCH 6/6] kvm/openpic: in-kernel mpic support Alexander Graf
2013-03-21 20:50     ` Scott Wood
2013-03-21 21:29       ` Alexander Graf
2013-03-21 21:59         ` Scott Wood
2013-03-21 23:45           ` Alexander Graf
2013-03-22 22:51             ` Scott Wood
2013-04-15 23:19 ` [Qemu-devel] [RFC PATCH v2 0/6] kvm/openpic: in-kernel irqchip Scott Wood
2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 1/6] kvm: update linux-headers Scott Wood
2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 2/6] kvm: use hw/kvm/Makefile.objs consistently for all relevant architectures Scott Wood
2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 3/6] memory: add memory_region_to_address() Scott Wood
2013-04-16  8:25     ` Peter Maydell
2013-04-17  0:10       ` Scott Wood
2013-04-17  9:14         ` Peter Maydell
2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 4/6] openpic: factor out some common defines into openpic.h Scott Wood
2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 5/6] PPC: e500: factor out mpic init code Scott Wood
2013-04-15 23:19   ` [Qemu-devel] [RFC PATCH v2 6/6] kvm/openpic: in-kernel mpic support Scott Wood

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.