All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zheng Chuan <zhengchuan@huawei.com>
To: Steve Sistare <steven.sistare@oracle.com>, <qemu-devel@nongnu.org>
Cc: "Jason Zeng" <jason.zeng@linux.intel.com>,
	"Juan Quintela" <quintela@redhat.com>,
	"Eric Blake" <eblake@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Daniel P. Berrange" <berrange@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>
Subject: Re: [PATCH V5 16/25] vfio-pci: cpr part 1
Date: Wed, 28 Jul 2021 12:56:26 +0800	[thread overview]
Message-ID: <50a6096e-f49d-4dad-970a-0f78ba601259@huawei.com> (raw)
In-Reply-To: <1625678434-240960-17-git-send-email-steven.sistare@oracle.com>

Hi

On 2021/7/8 1:20, Steve Sistare wrote:
> Enable vfio-pci devices to be saved and restored across an exec restart
> of qemu.
> 
> At vfio creation time, save the value of vfio container, group, and device
> descriptors in the environment.
> 
> In cprsave and cprexec, suspend the use of virtual addresses in DMA
> mappings with VFIO_DMA_UNMAP_FLAG_VADDR, because guest ram will be remapped
> at a different VA after exec.  DMA to already-mapped pages continues.  Save
> the msi message area as part of vfio-pci vmstate, save the interrupt and
> notifier eventfd's in the environment, and clear the close-on-exec flag
> for the vfio descriptors.  The flag is not cleared earlier because the
> descriptors should not persist across miscellaneous fork and exec calls
> that may be performed during normal operation.
> 
> On qemu restart, vfio_realize() finds the descriptor env vars, uses
> the descriptors, and notes that the device is being reused.  Device and
> iommu state is already configured, so operations in vfio_realize that
> would modify the configuration are skipped for a reused device, including
> vfio ioctl's and writes to PCI configuration space.  The result is that
> vfio_realize constructs qemu data structures that reflect the current
> state of the device.  However, the reconstruction is not complete until
> cprload is called. cprload loads the msi data and finds eventfds in the
> environment.  It rebuilds vector data structures and attaches the
> interrupts to the new KVM instance.  cprload then walks the flattened
> ranges of the vfio_address_spaces and calls VFIO_DMA_MAP_FLAG_VADDR to
> inform the kernel of the new VA's.  Lastly, it starts the VM and suppresses
> vfio device reset.
> 
> This functionality is delivered by 2 patches for clarity.  Part 2 adds
> eventfd and vector support.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  MAINTAINERS                   |   1 +
>  hw/pci/pci.c                  |   4 ++
>  hw/vfio/common.c              |  69 +++++++++++++++++--
>  hw/vfio/cpr.c                 | 154 ++++++++++++++++++++++++++++++++++++++++++
>  hw/vfio/meson.build           |   1 +
>  hw/vfio/pci.c                 |  66 +++++++++++++++++-
>  hw/vfio/trace-events          |   1 +
>  include/hw/pci/pci.h          |   1 +
>  include/hw/vfio/vfio-common.h |   5 ++
>  include/migration/cpr.h       |   3 +
>  linux-headers/linux/vfio.h    |   6 ++
>  migration/cpr.c               |  20 ++++++
>  12 files changed, 323 insertions(+), 8 deletions(-)
>  create mode 100644 hw/vfio/cpr.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8647a97..58479db 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2862,6 +2862,7 @@ CPR
>  M: Steve Sistare <steven.sistare@oracle.com>
>  M: Mark Kanda <mark.kanda@oracle.com>
>  S: Maintained
> +F: hw/vfio/cpr.c
>  F: include/migration/cpr.h
>  F: migration/cpr.c
>  F: qapi/cpr.json
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 2590898..fa4a439 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -307,6 +307,10 @@ static void pci_do_device_reset(PCIDevice *dev)
>  {
>      int r;
>  
> +    if (dev->reused) {
> +        return;
> +    }
> +
>      pci_device_deassert_intx(dev);
>      assert(dev->irq_state == 0);
>  
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 9220e64..40c882f 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -31,6 +31,7 @@
>  #include "exec/memory.h"
>  #include "exec/ram_addr.h"
>  #include "hw/hw.h"
> +#include "qemu/env.h"
>  #include "qemu/error-report.h"
>  #include "qemu/main-loop.h"
>  #include "qemu/range.h"
> @@ -440,6 +441,10 @@ static int vfio_dma_unmap(VFIOContainer *container,
>          return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
>      }
>  
> +    if (container->reused) {
> +        return 0;
> +    }
> +
>      while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
>          /*
>           * The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
> @@ -463,6 +468,11 @@ static int vfio_dma_unmap(VFIOContainer *container,
>          return -errno;
>      }
>  
> +    if (unmap.size != size) {
> +        warn_report("VFIO_UNMAP_DMA(0x%lx, 0x%lx) only unmaps 0x%llx",
> +                     iova, size, unmap.size);
> +    }
> +
>      return 0;
>  }
>  
> @@ -477,6 +487,10 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>          .size = size,
>      };
>  
> +    if (container->reused) {
> +        return 0;
> +    }
> +
>      if (!readonly) {
>          map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
>      }
> @@ -1603,6 +1617,10 @@ static int vfio_init_container(VFIOContainer *container, int group_fd,
>      if (iommu_type < 0) {
>          return iommu_type;
>      }
> +    if (container->reused) {
> +        container->iommu_type = iommu_type;
> +        return 0;
> +    }
>  
>      ret = ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, &container->fd);
>      if (ret) {
> @@ -1703,6 +1721,8 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>  {
>      VFIOContainer *container;
>      int ret, fd;
> +    bool reused;
> +    char name[40];
>      VFIOAddressSpace *space;
>  
>      space = vfio_get_address_space(as);
> @@ -1739,16 +1759,31 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>          return ret;
>      }
>  
> +    snprintf(name, sizeof(name), "vfio_container_for_group_%d", group->groupid);
> +    fd = getenv_fd(name);
> +    reused = (fd >= 0);
> +
>      QLIST_FOREACH(container, &space->containers, next) {
> -        if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
> -            group->container = container;
> -            QLIST_INSERT_HEAD(&container->group_list, group, container_next);
> +        if (container->fd == fd ||
> +            !ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
> +            break;
> +        }
> +    }
> +
> +    if (container) {
> +        group->container = container;
> +        QLIST_INSERT_HEAD(&container->group_list, group, container_next);
> +        if (!reused) {
>              vfio_kvm_device_add_group(group);
> -            return 0;
> +            setenv_fd(name, container->fd);
>          }
> +        return 0;
> +    }
> +
> +    if (!reused) {
> +        fd = qemu_open_old("/dev/vfio/vfio", O_RDWR);
>      }
>  
> -    fd = qemu_open_old("/dev/vfio/vfio", O_RDWR);
>      if (fd < 0) {
>          error_setg_errno(errp, errno, "failed to open /dev/vfio/vfio");
>          ret = -errno;
> @@ -1766,6 +1801,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>      container = g_malloc0(sizeof(*container));
>      container->space = space;
>      container->fd = fd;
> +    container->reused = reused;
>      container->error = NULL;
>      container->dirty_pages_supported = false;
>      QLIST_INIT(&container->giommu_list);
> @@ -1893,6 +1929,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>      }
>  
>      container->initialized = true;
> +    setenv_fd(name, fd);
>  
>      return 0;
>  listener_release_exit:
> @@ -1920,6 +1957,7 @@ static void vfio_disconnect_container(VFIOGroup *group)
>  
>      QLIST_REMOVE(group, container_next);
>      group->container = NULL;
> +    unsetenv_fdv("vfio_container_for_group_%d", group->groupid);
>  
>      /*
>       * Explicitly release the listener first before unset container,
> @@ -1978,7 +2016,12 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp)
>      group = g_malloc0(sizeof(*group));
>  
>      snprintf(path, sizeof(path), "/dev/vfio/%d", groupid);
> -    group->fd = qemu_open_old(path, O_RDWR);
> +
> +    group->fd = getenv_fd(path);
> +    if (group->fd < 0) {
> +        group->fd = qemu_open_old(path, O_RDWR);
> +    }
> +
>      if (group->fd < 0) {
>          error_setg_errno(errp, errno, "failed to open %s", path);
>          goto free_group_exit;
> @@ -2012,6 +2055,8 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp)
>  
>      QLIST_INSERT_HEAD(&vfio_group_list, group, next);
>  
> +    setenv_fd(path, group->fd);
> +
>      return group;
>  
>  close_fd_exit:
> @@ -2036,6 +2081,7 @@ void vfio_put_group(VFIOGroup *group)
>      vfio_disconnect_container(group);
>      QLIST_REMOVE(group, next);
>      trace_vfio_put_group(group->fd);
> +    unsetenv_fdv("/dev/vfio/%d", group->groupid);
>      close(group->fd);
>      g_free(group);
>  
> @@ -2049,8 +2095,14 @@ int vfio_get_device(VFIOGroup *group, const char *name,
>  {
>      struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
>      int ret, fd;
> +    bool reused;
> +
> +    fd = getenv_fd(name);
> +    reused = (fd >= 0);
> +    if (!reused) {
> +        fd = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
> +    }
>  
> -    fd = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
>      if (fd < 0) {
>          error_setg_errno(errp, errno, "error getting device from group %d",
>                           group->groupid);
> @@ -2095,6 +2147,8 @@ int vfio_get_device(VFIOGroup *group, const char *name,
>      vbasedev->num_irqs = dev_info.num_irqs;
>      vbasedev->num_regions = dev_info.num_regions;
>      vbasedev->flags = dev_info.flags;
> +    vbasedev->reused = reused;
> +    setenv_fd(name, fd);
>  
>      trace_vfio_get_device(name, dev_info.flags, dev_info.num_regions,
>                            dev_info.num_irqs);
> @@ -2111,6 +2165,7 @@ void vfio_put_base_device(VFIODevice *vbasedev)
>      QLIST_REMOVE(vbasedev, next);
>      vbasedev->group = NULL;
>      trace_vfio_put_base_device(vbasedev->fd);
> +    unsetenv_fd(vbasedev->name);
>      close(vbasedev->fd);
>  }
>  
> diff --git a/hw/vfio/cpr.c b/hw/vfio/cpr.c
> new file mode 100644
> index 0000000..28f8a76
> --- /dev/null
> +++ b/hw/vfio/cpr.c
> @@ -0,0 +1,154 @@
> +/*
> + * Copyright (c) 2021 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include <sys/ioctl.h>
> +#include <linux/vfio.h>
> +#include "hw/vfio/vfio-common.h"
> +#include "sysemu/kvm.h"
> +#include "qapi/error.h"
> +#include "trace.h"
> +
> +static int
> +vfio_dma_unmap_vaddr_all(VFIOContainer *container, Error **errp)
> +{
> +    struct vfio_iommu_type1_dma_unmap unmap = {
> +        .argsz = sizeof(unmap),
> +        .flags = VFIO_DMA_UNMAP_FLAG_VADDR | VFIO_DMA_UNMAP_FLAG_ALL,
> +        .iova = 0,
> +        .size = 0,
> +    };
> +    if (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
> +        error_setg_errno(errp, errno, "vfio_dma_unmap_vaddr_all");
> +        return -errno;
> +    }
> +    return 0;
> +}
> +
> +static int vfio_dma_map_vaddr(VFIOContainer *container, hwaddr iova,
> +                              ram_addr_t size, void *vaddr,
> +                              Error **errp)
> +{
> +    struct vfio_iommu_type1_dma_map map = {
> +        .argsz = sizeof(map),
> +        .flags = VFIO_DMA_MAP_FLAG_VADDR,
> +        .vaddr = (__u64)(uintptr_t)vaddr,
> +        .iova = iova,
> +        .size = size,
> +    };
> +    if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map)) {
> +        error_setg_errno(errp, errno,
> +                         "vfio_dma_map_vaddr(iova %lu, size %ld, va %p)",
> +                         iova, size, vaddr);
> +        return -errno;
> +    }
> +    return 0;
> +}
> +
> +static int
> +vfio_region_remap(MemoryRegionSection *section, void *handle, Error **errp)
> +{
> +    MemoryRegion *mr = section->mr;
> +    VFIOContainer *container = handle;
> +    const char *name = memory_region_name(mr);
> +    ram_addr_t size = int128_get64(section->size);
> +    hwaddr offset, iova, roundup;
> +    void *vaddr;
> +
> +    if (vfio_listener_skipped_section(section) || memory_region_is_iommu(mr)) {
> +        return 0;
> +    }
> +
> +    offset = section->offset_within_address_space;
> +    iova = TARGET_PAGE_ALIGN(offset);
> +    roundup = iova - offset;
> +    size = (size - roundup) & TARGET_PAGE_MASK;
> +    vaddr = memory_region_get_ram_ptr(mr) +
> +            section->offset_within_region + roundup;
> +
> +    trace_vfio_region_remap(name, container->fd, iova, iova + size - 1, vaddr);
> +    return vfio_dma_map_vaddr(container, iova, size, vaddr, errp);
> +}
> +
> +bool vfio_cpr_capable(VFIOContainer *container, Error **errp)
> +{
> +    if (!ioctl(container->fd, VFIO_CHECK_EXTENSION, VFIO_UPDATE_VADDR) ||
> +        !ioctl(container->fd, VFIO_CHECK_EXTENSION, VFIO_UNMAP_ALL)) {
> +        error_setg(errp, "VFIO container does not support VFIO_UPDATE_VADDR "
> +                         "or VFIO_UNMAP_ALL");
> +        return false;
> +    } else {
> +        return true;
> +    }
> +}
> +
> +int vfio_cprsave(Error **errp)
> +{
> +    VFIOAddressSpace *space, *last_space;
> +    VFIOContainer *container, *last_container;
> +
> +    QLIST_FOREACH(space, &vfio_address_spaces, list) {
> +        QLIST_FOREACH(container, &space->containers, next) {
> +            if (!vfio_cpr_capable(container, errp)) {
> +                return 1;
> +            }
> +        }
> +    }
> +
> +    QLIST_FOREACH(space, &vfio_address_spaces, list) {
> +        QLIST_FOREACH(container, &space->containers, next) {
> +            if (vfio_dma_unmap_vaddr_all(container, errp)) {
> +                goto unwind;
> +            }
> +        }
> +    }
> +    return 0;
> +
> +unwind:
> +    last_space = space;
> +    last_container = container;
> +    QLIST_FOREACH(space, &vfio_address_spaces, list) {
> +        QLIST_FOREACH(container, &space->containers, next) {
> +            Error *err;
> +
> +            if (space == last_space && container == last_container) {
> +                break;
> +            }
> +            if (as_flat_walk(space->as, vfio_region_remap, container, &err)) {
> +                error_prepend(errp, "%s", error_get_pretty(err));
> +                error_free(err);
> +            }
> +        }
> +    }
> +    return 1;
> +}
> +
> +int vfio_cprload(Error **errp)
> +{
> +    VFIOAddressSpace *space;
> +    VFIOContainer *container;
> +    VFIOGroup *group;
> +    VFIODevice *vbasedev;
> +
> +    QLIST_FOREACH(space, &vfio_address_spaces, list) {
> +        QLIST_FOREACH(container, &space->containers, next) {
> +            if (!vfio_cpr_capable(container, errp)) {
> +                return 1;
> +            }
> +            container->reused = false;
> +            if (as_flat_walk(space->as, vfio_region_remap, container, errp)) {
> +                return 1;
> +            }
> +        }
> +    }
> +    QLIST_FOREACH(group, &vfio_group_list, next) {
> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
> +            vbasedev->reused = false;
> +        }
> +    }
> +    return 0;
> +}
> diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build
> index da9af29..e247b2b 100644
> --- a/hw/vfio/meson.build
> +++ b/hw/vfio/meson.build
> @@ -5,6 +5,7 @@ vfio_ss.add(files(
>    'migration.c',
>  ))
>  vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files(
> +  'cpr.c',
>    'display.c',
>    'pci-quirks.c',
>    'pci.c',
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 9fc12bc..0f5c542 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -29,6 +29,8 @@
>  #include "hw/qdev-properties.h"
>  #include "hw/qdev-properties-system.h"
>  #include "migration/vmstate.h"
> +#include "migration/cpr.h"
> +#include "qemu/env.h"
>  #include "qemu/error-report.h"
>  #include "qemu/main-loop.h"
>  #include "qemu/module.h"
> @@ -1656,6 +1658,7 @@ static void vfio_bars_prepare(VFIOPCIDevice *vdev)
>  static void vfio_bar_register(VFIOPCIDevice *vdev, int nr)
>  {
>      VFIOBAR *bar = &vdev->bars[nr];
> +    PCIDevice *pdev = &vdev->pdev;
>      char *name;
>  
>      if (!bar->size) {
> @@ -1676,7 +1679,7 @@ static void vfio_bar_register(VFIOPCIDevice *vdev, int nr)
>          }
>      }
>  
> -    pci_register_bar(&vdev->pdev, nr, bar->type, bar->mr);
> +    pci_register_bar(pdev, nr, bar->type, bar->mr);
>  }
>  
>  static void vfio_bars_register(VFIOPCIDevice *vdev)
> @@ -2888,6 +2891,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>          vfio_put_group(group);
>          goto error;
>      }
> +    pdev->reused = vdev->vbasedev.reused;
>  
>      vfio_populate_device(vdev, &err);
>      if (err) {
> @@ -3157,6 +3161,10 @@ static void vfio_pci_reset(DeviceState *dev)
>  {
>      VFIOPCIDevice *vdev = VFIO_PCI(dev);
>  
> +    if (vdev->pdev.reused) {
> +        return;
> +    }
> +
>      trace_vfio_pci_reset(vdev->vbasedev.name);
>  
>      vfio_pci_pre_reset(vdev);
> @@ -3264,6 +3272,61 @@ static Property vfio_pci_dev_properties[] = {
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> +static void vfio_merge_config(VFIOPCIDevice *vdev)
> +{
> +    PCIDevice *pdev = &vdev->pdev;
> +    int size = MIN(pci_config_size(pdev), vdev->config_size);
> +    uint8_t *phys_config = g_malloc(size);
> +    uint32_t mask;
> +    int ret, i;
> +
> +    ret = pread(vdev->vbasedev.fd, phys_config, size, vdev->config_offset);
> +    if (ret < size) {
> +        ret = ret < 0 ? errno : EFAULT;
> +        error_report("failed to read device config space: %s", strerror(ret));
> +        return;
> +    }
> +
> +    for (i = 0; i < size; i++) {
> +        mask = vdev->emulated_config_bits[i];
> +        pdev->config[i] = (pdev->config[i] & mask) | (phys_config[i] & ~mask);
> +    }
> +
> +    g_free(phys_config);
> +}
> +
> +static int vfio_pci_post_load(void *opaque, int version_id)
> +{
> +    VFIOPCIDevice *vdev = opaque;
> +    PCIDevice *pdev = &vdev->pdev;
> +    bool enabled;
> +
> +    vfio_merge_config(vdev);
> +
> +    pdev->reused = false;
> +    enabled = pci_get_word(pdev->config + PCI_COMMAND) & PCI_COMMAND_MASTER;
> +    memory_region_set_enabled(&pdev->bus_master_enable_region, enabled);
> +
> +    return 0;
> +}
> +
> +static bool vfio_pci_needed(void *opaque)
> +{
> +    return cpr_mode() == CPR_MODE_RESTART;
> +}
> +
> +static const VMStateDescription vfio_pci_vmstate = {
> +    .name = "vfio-pci",
> +    .unmigratable = 1,
> +    .version_id = 0,
> +    .minimum_version_id = 0,
> +    .post_load = vfio_pci_post_load,
> +    .needed = vfio_pci_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>  static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
>  {
>      DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -3271,6 +3334,7 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
>  
>      dc->reset = vfio_pci_reset;
>      device_class_set_props(dc, vfio_pci_dev_properties);
> +    dc->vmsd = &vfio_pci_vmstate;
>      dc->desc = "VFIO-based PCI device assignment";
>      set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>      pdc->realize = vfio_realize;
> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> index 0ef1b5f..63dd0fe 100644
> --- a/hw/vfio/trace-events
> +++ b/hw/vfio/trace-events
> @@ -118,6 +118,7 @@ vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Devic
>  vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]"
>  vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%0x8"
>  vfio_dma_unmap_overflow_workaround(void) ""
> +vfio_region_remap(const char *name, int fd, uint64_t iova_start, uint64_t iova_end, void *vaddr) "%s fd %d 0x%"PRIx64" - 0x%"PRIx64" [%p]"
>  
>  # platform.c
>  vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index bef3e49..add7f46 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -360,6 +360,7 @@ struct PCIDevice {
>      /* ID of standby device in net_failover pair */
>      char *failover_pair_id;
>      uint32_t acpi_index;
> +    bool reused;
>  };
>  
>  void pci_register_bar(PCIDevice *pci_dev, int region_num,
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 00acb85..b46d850 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -85,6 +85,7 @@ typedef struct VFIOContainer {
>      Error *error;
>      bool initialized;
>      bool dirty_pages_supported;
> +    bool reused;
>      uint64_t dirty_pgsizes;
>      uint64_t max_dirty_bitmap_size;
>      unsigned long pgsizes;
> @@ -124,6 +125,7 @@ typedef struct VFIODevice {
>      bool no_mmap;
>      bool ram_block_discard_allowed;
>      bool enable_migration;
> +    bool reused;
>      VFIODeviceOps *ops;
>      unsigned int num_irqs;
>      unsigned int num_regions;
> @@ -200,6 +202,9 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp);
>  void vfio_put_group(VFIOGroup *group);
>  int vfio_get_device(VFIOGroup *group, const char *name,
>                      VFIODevice *vbasedev, Error **errp);
> +int vfio_cprsave(Error **errp);
> +int vfio_cprload(Error **errp);
> +bool vfio_cpr_capable(VFIOContainer *container, Error **errp);
>  
>  extern const MemoryRegionOps vfio_region_ops;
>  typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
> diff --git a/include/migration/cpr.h b/include/migration/cpr.h
> index bffee19..1ea5046 100644
> --- a/include/migration/cpr.h
> +++ b/include/migration/cpr.h
> @@ -10,6 +10,9 @@
>  
>  #include "qapi/qapi-types-cpr.h"
>  
> +#define CPR_MODE_NONE ((CprMode)(-1))
> +
> +CprMode cpr_mode(void);
>  void cprsave(const char *file, CprMode mode, Error **errp);
>  void cprexec(strList *args, Error **errp);
>  void cprload(const char *file, Error **errp);
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index e680594..48a02c0 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -52,6 +52,12 @@
>  /* Supports the vaddr flag for DMA map and unmap */
>  #define VFIO_UPDATE_VADDR		10
>  
> +/* Supports VFIO_DMA_UNMAP_FLAG_ALL */
> +#define VFIO_UNMAP_ALL                        9
> +
> +/* Supports VFIO DMA map and unmap with the VADDR flag */
> +#define VFIO_UPDATE_VADDR              10
> +
>  /*
>   * The IOCTL interface is designed for extensibility by embedding the
>   * structure length (argsz) and flags into structures passed between
> diff --git a/migration/cpr.c b/migration/cpr.c
> index fb57dec..578466c 100644
> --- a/migration/cpr.c
> +++ b/migration/cpr.c
> @@ -31,6 +31,13 @@
>  #include "hw/virtio/vhost.h"
>  #include "qemu/env.h"
>  
> +static CprMode cpr_active_mode = CPR_MODE_NONE;
> +
> +CprMode cpr_mode(void)
> +{
> +    return cpr_active_mode;
> +}
> +
>  QEMUFile *qf_file_open(const char *path, int flags, int mode,
>                                const char *name, Error **errp)
>  {
> @@ -92,6 +99,7 @@ void cprsave(const char *file, CprMode mode, Error **errp)
>      }
>      vm_stop(RUN_STATE_SAVE_VM);
>  
> +    cpr_active_mode = mode;
>      ret = qemu_save_device_state(f);
>      qemu_fclose(f);
>      if (ret < 0) {
> @@ -105,6 +113,7 @@ err:
>      if (saved_vm_running) {
>          vm_start();
>      }
> +    cpr_active_mode = CPR_MODE_NONE;
>  done:
>      return;
>  }
> @@ -125,6 +134,13 @@ void cprexec(strList *args, Error **errp)
>          error_setg(errp, "runstate is not save-vm");
>          return;
>      }
> +    if (cpr_active_mode != CPR_MODE_RESTART) {
> +        error_setg(errp, "cprexec requires cprsave with restart mode");
> +        return;
> +    }
> +    if (vfio_cprsave(errp)) {
> +        return;
> +    }
>      walkenv(FD_PREFIX, preserve_fd, 0);
>      qemu_system_exec_request(args);
>  }
> @@ -158,6 +174,10 @@ void cprload(const char *file, Error **errp)
>          return;
>      }
>  
> +    if (vfio_cprload(errp)) {
> +        return;
> +    }
> +
It will compile failed in some targets without vfio support such as m68k.
Maybe CONFIG_VFIO should be added for vfio_{save, load}.

>      state = global_state_get_runstate();
>      if (state == RUN_STATE_RUNNING) {
>          vm_start();
> 

-- 
Regards.
Chuan


  parent reply	other threads:[~2021-07-28  4:57 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-07 17:20 [PATCH V5 00/25] Live Update Steve Sistare
2021-07-07 17:20 ` [PATCH V5 01/25] qemu_ram_volatile Steve Sistare
2021-07-08 12:01   ` Marc-André Lureau
2021-07-12 17:06     ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 02/25] cpr: reboot mode Steve Sistare
2021-07-08 12:25   ` Marc-André Lureau
2021-07-12 17:07     ` Steven Sistare
2021-08-04 15:48   ` Eric Blake
2021-07-07 17:20 ` [PATCH V5 03/25] cpr: QMP interfaces for reboot Steve Sistare
2021-07-08 13:27   ` Marc-André Lureau
2021-07-12 17:07     ` Steven Sistare
2021-08-04 15:48   ` Eric Blake
2021-08-04 20:27     ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 04/25] cpr: HMP " Steve Sistare
2021-07-28  4:55   ` Zheng Chuan
2021-07-07 17:20 ` [PATCH V5 05/25] as_flat_walk Steve Sistare
2021-07-08 13:49   ` Marc-André Lureau
2021-07-12 17:07     ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 06/25] oslib: qemu_clr_cloexec Steve Sistare
2021-07-08 13:58   ` Marc-André Lureau
2021-07-12 17:07     ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 07/25] machine: memfd-alloc option Steve Sistare
2021-07-08 14:20   ` Marc-André Lureau
2021-07-12 17:07     ` Steven Sistare
2021-07-12 17:45       ` Marc-André Lureau
2021-07-07 17:20 ` [PATCH V5 08/25] vl: add helper to request re-exec Steve Sistare
2021-07-08 14:31   ` Marc-André Lureau
2021-07-12 17:07     ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 09/25] string to strList Steve Sistare
2021-07-08 14:37   ` Marc-André Lureau
2021-07-07 17:20 ` [PATCH V5 10/25] util: env var helpers Steve Sistare
2021-07-08 15:10   ` Marc-André Lureau
2021-07-12 19:19     ` Steven Sistare
2021-07-12 19:36       ` Marc-André Lureau
2021-07-13 16:15         ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 11/25] cpr: restart mode Steve Sistare
2021-07-08 15:43   ` Marc-André Lureau
2021-07-08 15:54     ` Marc-André Lureau
2021-07-12 19:19       ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 12/25] cpr: QMP interfaces for restart Steve Sistare
2021-07-08 15:49   ` Marc-André Lureau
2021-07-12 19:19     ` Steven Sistare
2021-08-04 16:00   ` Eric Blake
2021-08-04 20:22     ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 13/25] cpr: HMP " Steve Sistare
2021-07-28  4:56   ` Zheng Chuan
2021-07-07 17:20 ` [PATCH V5 14/25] pci: export functions for cpr Steve Sistare
2021-07-07 17:20 ` [PATCH V5 15/25] vfio-pci: refactor " Steve Sistare
2021-07-07 17:20 ` [PATCH V5 16/25] vfio-pci: cpr part 1 Steve Sistare
2021-07-16 17:45   ` Alex Williamson
2021-07-19 17:43     ` Steven Sistare
2021-07-28  4:56   ` Zheng Chuan [this message]
2021-07-30 12:50     ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 17/25] vfio-pci: cpr part 2 Steve Sistare
2021-07-16 20:51   ` Alex Williamson
2021-07-19 17:44     ` Steven Sistare
2021-07-19 18:10       ` Alex Williamson
2021-07-19 18:38         ` Steven Sistare
2021-07-28  4:56           ` Zheng Chuan
2021-07-30 12:52             ` Steven Sistare
2021-07-31  6:07               ` Zheng Chuan
2021-07-07 17:20 ` [PATCH V5 18/25] vhost: reset vhost devices upon cprsave Steve Sistare
2021-07-07 17:20 ` [PATCH V5 19/25] hostmem-memfd: cpr support Steve Sistare
2021-07-07 17:20 ` [PATCH V5 20/25] chardev: cpr framework Steve Sistare
2021-07-08 16:03   ` Marc-André Lureau
2021-07-12 19:20     ` Steven Sistare
2021-07-12 19:49       ` Marc-André Lureau
2021-07-13 14:34         ` Steven Sistare
2021-07-07 17:20 ` [PATCH V5 21/25] chardev: cpr for simple devices Steve Sistare
2021-07-07 17:20 ` [PATCH V5 22/25] chardev: cpr for pty Steve Sistare
2021-07-07 17:20 ` [PATCH V5 23/25] chardev: cpr for sockets Steve Sistare
2021-07-29  4:04   ` Zheng Chuan
2021-07-07 17:20 ` [PATCH V5 24/25] cpr: only-cpr-capable option Steve Sistare
2021-07-07 17:20 ` [PATCH V5 25/25] simplify savevm Steve Sistare

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50a6096e-f49d-4dad-970a-0f78ba601259@huawei.com \
    --to=zhengchuan@huawei.com \
    --cc=alex.bennee@linaro.org \
    --cc=alex.williamson@redhat.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=jason.zeng@linux.intel.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=steven.sistare@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.