From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: qemu-devel@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>,
	"Paul Durrant" <paul@xen.org>,
	"Joao Martins" <joao.m.martins@oracle.com>,
	"Ankur Arora" <ankur.a.arora@oracle.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Thomas Huth" <thuth@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Juan Quintela" <quintela@redhat.com>,
	"Claudio Fontana" <cfontana@suse.de>,
	"Julien Grall" <julien@xen.org>
Subject: Re: [RFC PATCH v5 14/52] hw/xen: Add xen_overlay device for emulating shared xenheap pages
Date: Tue, 3 Jan 2023 17:54:26 +0000
Message-ID: <Y7Rr0r+qb0F2k+N3@work-vm>
In-Reply-To: <20221230121235.1282915-15-dwmw2@infradead.org>

* David Woodhouse (dwmw2@infradead.org) wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> For the shared info page and for grant tables, Xen shares its own pages
> from the "Xen heap" to the guest. The guest requests that a given page
> from a certain address space (XENMAPSPACE_shared_info, etc.) be mapped
> to a given GPA using the XENMEM_add_to_physmap hypercall.
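
(A quick illustration for anyone reading along without the Xen public
headers to hand: the guest-side request looks roughly like the snippet
below, using struct xen_add_to_physmap from the standard-headers/xen/memory.h
this patch already includes. Treat it as a sketch rather than the
authoritative ABI; 'gpa' is simply whatever address the guest picked.)

    struct xen_add_to_physmap xatp = {
        .domid = DOMID_SELF,
        .space = XENMAPSPACE_shared_info,
        .idx   = 0,                         /* only one shared info page */
        .gpfn  = gpa >> XEN_PAGE_SHIFT,     /* frame the guest wants it at */
    };
    /* gpa: guest physical address chosen by the guest for the page */
    rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
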
> 
> To support that in qemu when *emulating* Xen, create a memory region
> (migratable) and allow it to be mapped as an overlay when requested.
> 
> Xen theoretically allows the same page to be mapped multiple times
> into the guest, but that's hard to track and reinstate over migration,
> so we automatically *unmap* any previous mapping when creating a new
> one. This approach has been used in production with.... a non-trivial
> number of guests expecting true Xen, without any problems yet being
> noticed.
> 
> This adds just the shared info page for now. The grant tables will be
> a larger region, and will need to be overlaid one page at a time. I
> think that means I need to create separate aliases for each page of
> the overall grant_frames region, so that they can be mapped individually.
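
(Not something for this patch, but for the grant frames I'd expect that
to end up as one alias MemoryRegion per frame, something along these
lines -- XEN_MAX_GRANT_FRAMES, gnt_frames and gnt_aliases are made-up
names here, just to show the shape of it:)

    /* Hypothetical sketch: one alias per grant frame, so each frame can
     * be placed (and later moved) in the guest physmap independently. */
    for (int i = 0; i < XEN_MAX_GRANT_FRAMES; i++) {
        memory_region_init_alias(&s->gnt_aliases[i], OBJECT(dev),
                                 "xen:grant_frame", &s->gnt_frames,
                                 i * XEN_PAGE_SIZE, XEN_PAGE_SIZE);
    }
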
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  hw/i386/kvm/meson.build   |   1 +
>  hw/i386/kvm/xen_overlay.c | 200 ++++++++++++++++++++++++++++++++++++++
>  hw/i386/kvm/xen_overlay.h |  20 ++++
>  include/sysemu/kvm_xen.h  |   4 +
>  4 files changed, 225 insertions(+)
>  create mode 100644 hw/i386/kvm/xen_overlay.c
>  create mode 100644 hw/i386/kvm/xen_overlay.h
> 
> diff --git a/hw/i386/kvm/meson.build b/hw/i386/kvm/meson.build
> index 95467f1ded..6165cbf019 100644
> --- a/hw/i386/kvm/meson.build
> +++ b/hw/i386/kvm/meson.build
> @@ -4,5 +4,6 @@ i386_kvm_ss.add(when: 'CONFIG_APIC', if_true: files('apic.c'))
>  i386_kvm_ss.add(when: 'CONFIG_I8254', if_true: files('i8254.c'))
>  i386_kvm_ss.add(when: 'CONFIG_I8259', if_true: files('i8259.c'))
>  i386_kvm_ss.add(when: 'CONFIG_IOAPIC', if_true: files('ioapic.c'))
> +i386_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files('xen_overlay.c'))
>  
>  i386_ss.add_all(when: 'CONFIG_KVM', if_true: i386_kvm_ss)
> diff --git a/hw/i386/kvm/xen_overlay.c b/hw/i386/kvm/xen_overlay.c
> new file mode 100644
> index 0000000000..331dea6b8b
> --- /dev/null
> +++ b/hw/i386/kvm/xen_overlay.c
> @@ -0,0 +1,200 @@
> +/*
> + * QEMU Xen emulation: Shared/overlay pages support
> + *
> + * Copyright © 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved.
> + *
> + * Authors: David Woodhouse <dwmw2@infradead.org>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/host-utils.h"
> +#include "qemu/module.h"
> +#include "qemu/main-loop.h"
> +#include "qapi/error.h"
> +#include "qom/object.h"
> +#include "exec/target_page.h"
> +#include "exec/address-spaces.h"
> +#include "migration/vmstate.h"
> +
> +#include "hw/sysbus.h"
> +#include "hw/xen/xen.h"
> +#include "xen_overlay.h"
> +
> +#include "sysemu/kvm.h"
> +#include "sysemu/kvm_xen.h"
> +#include <linux/kvm.h>
> +
> +#include "standard-headers/xen/memory.h"
> +
> +
> +#define TYPE_XEN_OVERLAY "xen-overlay"
> +OBJECT_DECLARE_SIMPLE_TYPE(XenOverlayState, XEN_OVERLAY)
> +
> +#define XEN_PAGE_SHIFT 12
> +#define XEN_PAGE_SIZE (1ULL << XEN_PAGE_SHIFT)
> +
> +struct XenOverlayState {
> +    /*< private >*/
> +    SysBusDevice busdev;
> +    /*< public >*/
> +
> +    MemoryRegion shinfo_mem;
> +    void *shinfo_ptr;
> +    uint64_t shinfo_gpa;
> +};
> +
> +struct XenOverlayState *xen_overlay_singleton;
> +
> +static void xen_overlay_map_page_locked(MemoryRegion *page, uint64_t gpa)
> +{
> +    /*
> +     * Xen allows guests to map the same page as many times as it likes
> +     * into guest physical frames. We don't, because it would be hard
> +     * to track and restore them all. One mapping of each page is
> +     * perfectly sufficient for all known guests... and we've tested
> +     * that theory on a few now in other implementations. dwmw2.
> +     */
> +    if (memory_region_is_mapped(page)) {
> +        if (gpa == INVALID_GPA) {
> +            memory_region_del_subregion(get_system_memory(), page);
> +        } else {
> +            /* Just move it */
> +            memory_region_set_address(page, gpa);
> +        }
> +    } else if (gpa != INVALID_GPA) {
> +        memory_region_add_subregion_overlap(get_system_memory(), gpa, page, 0);
> +    }
> +}
> +
> +/* KVM is the only existing back end for now. Let's not overengineer it yet. */
> +static int xen_overlay_set_be_shinfo(uint64_t gfn)
> +{
> +    struct kvm_xen_hvm_attr xa = {
> +        .type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
> +        .u.shared_info.gfn = gfn,
> +    };
> +
> +    return kvm_vm_ioctl(kvm_state, KVM_XEN_HVM_SET_ATTR, &xa);
> +}
> +
> +
> +static void xen_overlay_realize(DeviceState *dev, Error **errp)
> +{
> +    XenOverlayState *s = XEN_OVERLAY(dev);
> +
> +    if (xen_mode != XEN_EMULATE) {
> +        error_setg(errp, "Xen overlay page support is for Xen emulation");
> +        return;
> +    }
> +
> +    memory_region_init_ram(&s->shinfo_mem, OBJECT(dev), "xen:shared_info",
> +                           XEN_PAGE_SIZE, &error_abort);
> +    memory_region_set_enabled(&s->shinfo_mem, true);
> +
> +    s->shinfo_ptr = memory_region_get_ram_ptr(&s->shinfo_mem);
> +    s->shinfo_gpa = INVALID_GPA;
> +    memset(s->shinfo_ptr, 0, XEN_PAGE_SIZE);
> +}
> +
> +static int xen_overlay_post_load(void *opaque, int version_id)
> +{
> +    XenOverlayState *s = opaque;
> +
> +    if (s->shinfo_gpa != INVALID_GPA) {
> +        xen_overlay_map_page_locked(&s->shinfo_mem, s->shinfo_gpa);
> +        xen_overlay_set_be_shinfo(s->shinfo_gpa >> XEN_PAGE_SHIFT);
> +    }
> +
> +    return 0;
> +}
> +
> +static bool xen_overlay_is_needed(void *opaque)
> +{
> +    return xen_mode == XEN_EMULATE;
> +}
> +
> +static const VMStateDescription xen_overlay_vmstate = {
> +    .name = "xen_overlay",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = xen_overlay_is_needed,
> +    .post_load = xen_overlay_post_load,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT64(shinfo_gpa, XenOverlayState),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
> +static void xen_overlay_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    dc->realize = xen_overlay_realize;
> +    dc->vmsd = &xen_overlay_vmstate;

That looks OK from a migration point of view.

> +}
> +
> +static const TypeInfo xen_overlay_info = {
> +    .name          = TYPE_XEN_OVERLAY,
> +    .parent        = TYPE_SYS_BUS_DEVICE,
> +    .instance_size = sizeof(XenOverlayState),
> +    .class_init    = xen_overlay_class_init,
> +};
> +
> +void xen_overlay_create(void)
> +{
> +    xen_overlay_singleton = XEN_OVERLAY(sysbus_create_simple(TYPE_XEN_OVERLAY,
> +                                                             -1, NULL));
> +}
> +
> +static void xen_overlay_register_types(void)
> +{
> +    type_register_static(&xen_overlay_info);
> +}
> +
> +type_init(xen_overlay_register_types)
> +
> +int xen_overlay_map_shinfo_page(uint64_t gpa)
> +{
> +    XenOverlayState *s = xen_overlay_singleton;
> +    int ret;
> +
> +    if (!s) {
> +        return -ENOENT;
> +    }
> +
> +    qemu_mutex_lock_iothread();
> +    if (s->shinfo_gpa) {
> +            /* If removing shinfo page, turn the kernel magic off first */

Odd indent?

Dave

> +        ret = xen_overlay_set_be_shinfo(INVALID_GFN);
> +        if (ret) {
> +            goto out;
> +        }
> +    }
> +
> +    xen_overlay_map_page_locked(&s->shinfo_mem, gpa);
> +    if (gpa != INVALID_GPA) {
> +        ret = xen_overlay_set_be_shinfo(gpa >> XEN_PAGE_SHIFT);
> +        if (ret) {
> +            goto out;
> +        }
> +    }
> +    s->shinfo_gpa = gpa;
> + out:
> +    qemu_mutex_unlock_iothread();
> +
> +    return ret;
> +}
> +
> +void *xen_overlay_get_shinfo_ptr(void)
> +{
> +    XenOverlayState *s = xen_overlay_singleton;
> +
> +    if (!s) {
> +        return NULL;
> +    }
> +
> +    return s->shinfo_ptr;
> +}
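
(As an aside, for reviewers who haven't reached the later patches: I
assume the XENMEM_add_to_physmap handling in the memory_op patch ends up
driving this device roughly like so -- the function name below is purely
illustrative:)

    /* Illustrative only: how the hypercall side would be expected to
     * call into the overlay device for XENMAPSPACE_shared_info. */
    static int add_to_physmap_shared_info(uint64_t idx, uint64_t gpfn)
    {
        if (idx != 0) {
            return -EINVAL;    /* there is only one shared info page */
        }
        return xen_overlay_map_shinfo_page(gpfn << XEN_PAGE_SHIFT);
    }
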
> diff --git a/hw/i386/kvm/xen_overlay.h b/hw/i386/kvm/xen_overlay.h
> new file mode 100644
> index 0000000000..00cff05bb0
> --- /dev/null
> +++ b/hw/i386/kvm/xen_overlay.h
> @@ -0,0 +1,20 @@
> +/*
> + * QEMU Xen emulation: Shared/overlay pages support
> + *
> + * Copyright © 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved.
> + *
> + * Authors: David Woodhouse <dwmw2@infradead.org>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef QEMU_XEN_OVERLAY_H
> +#define QEMU_XEN_OVERLAY_H
> +
> +void xen_overlay_create(void);
> +
> +int xen_overlay_map_shinfo_page(uint64_t gpa);
> +void *xen_overlay_get_shinfo_ptr(void);
> +
> +#endif /* QEMU_XEN_OVERLAY_H */
> diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h
> index 296533f2d5..3e43cd7843 100644
> --- a/include/sysemu/kvm_xen.h
> +++ b/include/sysemu/kvm_xen.h
> @@ -12,6 +12,10 @@
>  #ifndef QEMU_SYSEMU_KVM_XEN_H
>  #define QEMU_SYSEMU_KVM_XEN_H
>  
> +/* The KVM API uses these to indicate "no GPA" or "no GFN" */
> +#define INVALID_GPA UINT64_MAX
> +#define INVALID_GFN UINT64_MAX
> +
>  uint32_t kvm_xen_get_caps(void);
>  
>  #define kvm_xen_has_cap(cap) (!!(kvm_xen_get_caps() &           \
> -- 
> 2.35.3
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


