All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device
@ 2022-05-24  6:18 Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 01/13] vfio/migration: put together checks of migration initialization conditions Lei Rao
                   ` (13 more replies)
  0 siblings, 14 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Migration of a VFIO passthrough device can be supported by using a device 
specific kernel driver to save/restore the device state thru device specific 
interfaces. But this approach doesn't work for devices that lack a state 
migration interface, e.g. NVMe.

On the other hand, Infrastructure Process Unit (IPU) or Data Processing Unit 
(DPU) vendors may choose to implement an out-of-band interface from the SoC to 
help manage the state of such non-migratable devices e.g. via gRPC or JSON-RPC 
protocols.

This RFC attempts to support such out-of-band migration interface by introducing
the concept of migration backends in vfio. The existing logic around vfio 
migration uAPI is now called the 'local' backend while a new 'out-of-band' 
backend is further introduced allowing vfio to redirect VMState ops to an 
external plugin.

Currently, the backend migration Ops is defined close to SaveVMHandlers. We also
considered whether there is value of abstracting it in a lower level e.g. close 
to vfio migration uAPI but no clear conclusion. Hence this is one part which 
we'd like to hear suggestions.

This proposal adopts a plugin mechanism (an example can be found in [1]) given 
that IPU/DPU vendors usually implement proprietary migration interfaces without
a standard. But we are also open if an alternative option makes better sense,
e.g. via loadable modules (with Qemu supporting gRPC or JSON-RPC support) or an
IPC mechanism similar to vhost-user.

The following graph describes the overall component relationship:

 +----------------------------------------------------+
 | QEMU                                               |
 | +------------------------------------------------+ |
 | |        VFIO Live Migration Framework           | |
 | |    +--------------------------------------+    | |
 | |    |         VFIOMigrationOps             |    | |
 | |    +-------^---------------------^--------+    | |
 | |            |                     |             | |
 | |    +-------v-------+     +-------v--------+    | |
 | |    | LM Backend Via|     | LM Backend Via |    | |
 | |    |   Device Fd   |     |    Plugins     |    | |
 | |    +-------^-------+     |     +----------+    | |
 | |            |             |     |Plugin Ops+----+-+------------+
 | |            |             +-----+----------+    | |            |
 | |            |                                   | |  +---------v----------+
 | +------------+-----------------------------------+ |  |  Vendor Specific   |
 |              |                                     |  |    Plugins(.so)    |
 +--------------+-------------------------------------+  +----------+---------+
  UserSpace     |                                                   |
----------------+---------------------------------------------      |
  Kernel        |                                                   |
                |                                                   |
     +----------v----------------------+                            |
     |        Kernel VFIO Driver       |                            |
     |    +-------------------------+  |                            |
     |    |                         |  |                            | Network
     |    | Vendor-Specific Driver  |  |                            |
     |    |                         |  |                            |
     |    +----------^--------------+  |                            |
     |               |                 |                            |
     +---------------+-----------------+                            |
                     |                                              |
                     |                                              |
---------------------+-----------------------------------------     |
  Hardware           |                                              |
                     |            +-----+-----+-----+----+-----+    |
          +----------v------+     | VF0 | VF1 | VF2 | ...| VFn |    |
          |   Traditional   |     +-----+-----+-----+----+-----+    |
          |  PCIe Devices   |     |                            |    |
          +-----------------+     |   +--------+------------+  |    |
                                  |   |        |   Agent    |<-+----+
                                  |   |        +------------+  |
                                  |   |                     |  |
                                  |   | SOC                 |  |
                                  |   +---------------------+  |
                                  | IPU                        |
                                  +----------------------------+

Two command-line parameters (x-plugin-path and x-plugin-arg) are introduced to 
enable the out-of-band backend. If specified, vfio will attempt to use the 
out-of-band backend.

The following is an example of VFIO command-line parameters for OOB-Approach:

  -device vfio-pci,id=$ID,host=$bdf,x-enable-migration,x-plugin-path=$plugin_path,x-plugin-arg=$plugin_arg

[1] https://github.com/raolei-intel/vfio-lm-plugin-example.git

Lei Rao (13):
  vfio/migration: put together checks of migration initialization
    conditions
  vfio/migration: move migration struct allocation out of
    vfio_migration_init
  vfio/migration: move vfio_get_dev_region_info out of
    vfio_migration_probe
  vfio/migration: Separated functions that relate to the In-Band
    approach
  vfio/migration: rename functions that relate to the In-Band approach
  vfio/migration: introduce VFIOMigrationOps layer in VFIO live
    migration framework
  vfio/migration: move the statistics of bytes_transferred to generic
    VFIO migration layer
  vfio/migration: split migration handler registering from
    vfio_migration_init
  vfio/migration: move the functions of In-Band approach to a new file
  vfio/pci: introduce command-line parameters to specify migration
    method
  vfio/migration: add a plugin layer to support out-of-band live
    migration
  vfio/migration: add some trace-events for vfio migration plugin
  vfio/migration: make the region and plugin member of struct
    VFIOMigration to be a union

 docs/devel/vfio-migration-plugin.rst    | 165 +++++++
 hw/vfio/meson.build                     |   2 +
 hw/vfio/migration-local.c               | 456 +++++++++++++++++++
 hw/vfio/migration-plugin.c              | 266 +++++++++++
 hw/vfio/migration.c                     | 577 ++++++------------------
 hw/vfio/pci.c                           |   2 +
 hw/vfio/trace-events                    |   9 +-
 include/hw/vfio/vfio-common.h           |  37 +-
 include/hw/vfio/vfio-migration-plugin.h |  21 +
 9 files changed, 1096 insertions(+), 439 deletions(-)
 create mode 100644 docs/devel/vfio-migration-plugin.rst
 create mode 100644 hw/vfio/migration-local.c
 create mode 100644 hw/vfio/migration-plugin.c
 create mode 100644 include/hw/vfio/vfio-migration-plugin.h

-- 
2.32.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH 01/13] vfio/migration: put together checks of migration initialization conditions
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 02/13] vfio/migration: move migration struct allocation out of vfio_migration_init Lei Rao
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Current VFIO live migration initialization code is tightly coupled with
local migration region handling. It is necessary to decouple it to
facilitate the introduction of a generic VFIO live migration framework so
that other approaches can be possible besides the In-Band approach.

This patch puts various checks of migration initialization conditions into
one function vfio_migration_check().

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/vfio/migration.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index a6ad1f8945..770f535e81 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -787,6 +787,21 @@ static void vfio_migration_exit(VFIODevice *vbasedev)
     vbasedev->migration = NULL;
 }
 
+static int vfio_migration_check(VFIODevice *vbasedev)
+{
+    VFIOContainer *container = vbasedev->group->container;
+
+    if (!vbasedev->enable_migration || !container->dirty_pages_supported) {
+        return -EINVAL;
+    }
+
+    if (!vbasedev->ops->vfio_get_object) {
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 static int vfio_migration_init(VFIODevice *vbasedev,
                                struct vfio_region_info *info)
 {
@@ -796,10 +811,6 @@ static int vfio_migration_init(VFIODevice *vbasedev,
     char id[256] = "";
     g_autofree char *path = NULL, *oid = NULL;
 
-    if (!vbasedev->ops->vfio_get_object) {
-        return -EINVAL;
-    }
-
     obj = vbasedev->ops->vfio_get_object(vbasedev);
     if (!obj) {
         return -EINVAL;
@@ -857,11 +868,11 @@ int64_t vfio_mig_bytes_transferred(void)
 
 int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
 {
-    VFIOContainer *container = vbasedev->group->container;
     struct vfio_region_info *info = NULL;
     int ret = -ENOTSUP;
 
-    if (!vbasedev->enable_migration || !container->dirty_pages_supported) {
+    ret = vfio_migration_check(vbasedev);
+    if (ret) {
         goto add_blocker;
     }
 
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 02/13] vfio/migration: move migration struct allocation out of vfio_migration_init
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 01/13] vfio/migration: put together checks of migration initialization conditions Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 03/13] vfio/migration: move vfio_get_dev_region_info out of vfio_migration_probe Lei Rao
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Migration struct is a common data structure. Memory allocation of migration
struct is not unique to In-Band approach. So, move it from vfio_migration_init()
to vfio_migration_probe().

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 hw/vfio/migration.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 770f535e81..11ce87bb1a 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -807,17 +807,15 @@ static int vfio_migration_init(VFIODevice *vbasedev,
 {
     int ret;
     Object *obj;
-    VFIOMigration *migration;
     char id[256] = "";
     g_autofree char *path = NULL, *oid = NULL;
+    VFIOMigration *migration = vbasedev->migration;
 
     obj = vbasedev->ops->vfio_get_object(vbasedev);
     if (!obj) {
         return -EINVAL;
     }
 
-    vbasedev->migration = g_new0(VFIOMigration, 1);
-
     ret = vfio_region_setup(obj, vbasedev, &vbasedev->migration->region,
                             info->index, "migration");
     if (ret) {
@@ -833,9 +831,6 @@ static int vfio_migration_init(VFIODevice *vbasedev,
         goto err;
     }
 
-    migration = vbasedev->migration;
-    migration->vbasedev = vbasedev;
-
     oid = vmstate_if_get_id(VMSTATE_IF(DEVICE(obj)));
     if (oid) {
         path = g_strdup_printf("%s/vfio", oid);
@@ -876,6 +871,9 @@ int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
         goto add_blocker;
     }
 
+    vbasedev->migration = g_new0(VFIOMigration, 1);
+    vbasedev->migration->vbasedev = vbasedev;
+
     ret = vfio_get_dev_region_info(vbasedev,
                                    VFIO_REGION_TYPE_MIGRATION_DEPRECATED,
                                    VFIO_REGION_SUBTYPE_MIGRATION_DEPRECATED,
@@ -903,6 +901,8 @@ add_blocker:
         error_free(vbasedev->migration_blocker);
         vbasedev->migration_blocker = NULL;
     }
+    g_free(vbasedev->migration);
+    vbasedev->migration = NULL;
     return ret;
 }
 
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 03/13] vfio/migration: move vfio_get_dev_region_info out of vfio_migration_probe
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 01/13] vfio/migration: put together checks of migration initialization conditions Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 02/13] vfio/migration: move migration struct allocation out of vfio_migration_init Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 04/13] vfio/migration: Separated functions that relate to the In-Band approach Lei Rao
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

vfio_get_dev_region_info() in vfio_migration_probe() is a specific operation of
In-Band approach.  So, it's better to put it in vfio_migration_init() because
most of the setup of In-Band approach are handled there. The vfio_migration_init
will be rename to vfio_migration_probe_local().

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 hw/vfio/migration.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 11ce87bb1a..e61c19171a 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -802,12 +802,12 @@ static int vfio_migration_check(VFIODevice *vbasedev)
     return 0;
 }
 
-static int vfio_migration_init(VFIODevice *vbasedev,
-                               struct vfio_region_info *info)
+static int vfio_migration_init(VFIODevice *vbasedev)
 {
     int ret;
     Object *obj;
     char id[256] = "";
+    struct vfio_region_info *info = NULL;
     g_autofree char *path = NULL, *oid = NULL;
     VFIOMigration *migration = vbasedev->migration;
 
@@ -816,6 +816,14 @@ static int vfio_migration_init(VFIODevice *vbasedev,
         return -EINVAL;
     }
 
+    ret = vfio_get_dev_region_info(vbasedev,
+                                   VFIO_REGION_TYPE_MIGRATION_DEPRECATED,
+                                   VFIO_REGION_SUBTYPE_MIGRATION_DEPRECATED,
+                                   &info);
+    if (ret) {
+        return -EINVAL;
+    }
+
     ret = vfio_region_setup(obj, vbasedev, &vbasedev->migration->region,
                             info->index, "migration");
     if (ret) {
@@ -847,10 +855,14 @@ static int vfio_migration_init(VFIODevice *vbasedev,
                                                            vbasedev);
     migration->migration_state.notify = vfio_migration_state_notifier;
     add_migration_state_change_notifier(&migration->migration_state);
+
+    trace_vfio_migration_probe(vbasedev->name, info->index);
+    g_free(info);
     return 0;
 
 err:
     vfio_migration_exit(vbasedev);
+    g_free(info);
     return ret;
 }
 
@@ -863,7 +875,6 @@ int64_t vfio_mig_bytes_transferred(void)
 
 int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
 {
-    struct vfio_region_info *info = NULL;
     int ret = -ENOTSUP;
 
     ret = vfio_migration_check(vbasedev);
@@ -874,27 +885,16 @@ int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
     vbasedev->migration = g_new0(VFIOMigration, 1);
     vbasedev->migration->vbasedev = vbasedev;
 
-    ret = vfio_get_dev_region_info(vbasedev,
-                                   VFIO_REGION_TYPE_MIGRATION_DEPRECATED,
-                                   VFIO_REGION_SUBTYPE_MIGRATION_DEPRECATED,
-                                   &info);
-    if (ret) {
-        goto add_blocker;
-    }
-
-    ret = vfio_migration_init(vbasedev, info);
+    ret = vfio_migration_init(vbasedev);
     if (ret) {
         goto add_blocker;
     }
 
-    trace_vfio_migration_probe(vbasedev->name, info->index);
-    g_free(info);
     return 0;
 
 add_blocker:
     error_setg(&vbasedev->migration_blocker,
                "VFIO device doesn't support migration");
-    g_free(info);
 
     ret = migrate_add_blocker(vbasedev->migration_blocker, errp);
     if (ret < 0) {
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 04/13] vfio/migration: Separated functions that relate to the In-Band approach
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (2 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 03/13] vfio/migration: move vfio_get_dev_region_info out of vfio_migration_probe Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 05/13] vfio/migration: rename " Lei Rao
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Split functions of In-Band approach from common function, to prepare for
the introduction of generic VFIO live migration layer and another Sub-Ops.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/vfio/migration.c | 64 ++++++++++++++++++++++++++++++++++-----------
 1 file changed, 49 insertions(+), 15 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index e61c19171a..c2df2caae6 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -394,7 +394,7 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque)
     return qemu_file_get_error(f);
 }
 
-static void vfio_migration_cleanup(VFIODevice *vbasedev)
+static void vfio_migration_cleanup_local(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
 
@@ -403,17 +403,17 @@ static void vfio_migration_cleanup(VFIODevice *vbasedev)
     }
 }
 
+static void vfio_migration_cleanup(VFIODevice *vbasedev)
+{
+    vfio_migration_cleanup_local(vbasedev);
+}
+
 /* ---------------------------------------------------------------------- */
 
-static int vfio_save_setup(QEMUFile *f, void *opaque)
+static int vfio_migration_save_setup_local(VFIODevice *vbasedev)
 {
-    VFIODevice *vbasedev = opaque;
     VFIOMigration *migration = vbasedev->migration;
-    int ret;
-
-    trace_vfio_save_setup(vbasedev->name);
-
-    qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
+    int ret = -1;
 
     if (migration->region.mmaps) {
         /*
@@ -430,6 +430,24 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
             error_report("%s: Falling back to slow path", vbasedev->name);
         }
     }
+    return ret;
+}
+
+static int vfio_save_setup(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    int ret;
+
+    trace_vfio_save_setup(vbasedev->name);
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
+
+    ret = vfio_migration_save_setup_local(vbasedev);
+    if (ret) {
+        error_report("%s: Failed to vfio lm save setup:%s",
+                     vbasedev->name, strerror(-ret));
+        return ret;
+    }
 
     ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_MASK,
                                    VFIO_DEVICE_STATE_V1_SAVING);
@@ -592,11 +610,10 @@ static void vfio_save_state(QEMUFile *f, void *opaque)
     }
 }
 
-static int vfio_load_setup(QEMUFile *f, void *opaque)
+static int vfio_migration_load_setup_local(VFIODevice *vbasedev)
 {
-    VFIODevice *vbasedev = opaque;
     VFIOMigration *migration = vbasedev->migration;
-    int ret = 0;
+    int ret = -1;
 
     if (migration->region.mmaps) {
         ret = vfio_region_mmap(&migration->region);
@@ -607,14 +624,26 @@ static int vfio_load_setup(QEMUFile *f, void *opaque)
             error_report("%s: Falling back to slow path", vbasedev->name);
         }
     }
+    return ret;
+}
+
+static int vfio_load_setup(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    int ret = 0;
+
+    ret = vfio_migration_load_setup_local(vbasedev);
+    if (ret < 0) {
+        error_report("%s: Failed to migration load setup", vbasedev->name);
+        return ret;
+    }
 
     ret = vfio_migration_set_state(vbasedev, ~VFIO_DEVICE_STATE_MASK,
                                    VFIO_DEVICE_STATE_V1_RESUMING);
     if (ret) {
         error_report("%s: Failed to set state RESUMING", vbasedev->name);
-        if (migration->region.mmaps) {
-            vfio_region_unmap(&migration->region);
-        }
+        vfio_migration_cleanup(vbasedev);
+        return ret;
     }
     return ret;
 }
@@ -777,12 +806,17 @@ static void vfio_migration_state_notifier(Notifier *notifier, void *data)
     }
 }
 
-static void vfio_migration_exit(VFIODevice *vbasedev)
+static void vfio_migration_exit_local(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
 
     vfio_region_exit(&migration->region);
     vfio_region_finalize(&migration->region);
+}
+
+static void vfio_migration_exit(VFIODevice *vbasedev)
+{
+    vfio_migration_exit_local(vbasedev);
     g_free(vbasedev->migration);
     vbasedev->migration = NULL;
 }
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 05/13] vfio/migration: rename functions that relate to the In-Band approach
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (3 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 04/13] vfio/migration: Separated functions that relate to the In-Band approach Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 06/13] vfio/migration: introduce VFIOMigrationOps layer in VFIO live migration framework Lei Rao
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Rename some functions that are related to the In-Band approach to facilitate
introducing generic vfio live migration layer.

Rename vfio_migration_set_state to vfio_migration_set_state_local,
vfio_save_buffer to vfio_migration_save_buffer_local,
vfio_load_buffer to vfio_migration_load_buffer_local,
vfio_update_pending to vfio_migration_update_pending_local,
vfio_migration_init to vfio_migration_probe_local,
vfio_migration_exit to vfio_migration_exit_local.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/vfio/migration.c  | 74 +++++++++++++++++++++++---------------------
 hw/vfio/trace-events |  6 ++--
 2 files changed, 42 insertions(+), 38 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index c2df2caae6..04360e1f17 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -107,8 +107,8 @@ static int vfio_mig_rw(VFIODevice *vbasedev, __u8 *buf, size_t count,
  * an error is returned.
  */
 
-static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t mask,
-                                    uint32_t value)
+static int vfio_migration_set_state_local(VFIODevice *vbasedev, uint32_t mask,
+                                          uint32_t value)
 {
     VFIOMigration *migration = vbasedev->migration;
     VFIORegion *region = &migration->region;
@@ -193,7 +193,8 @@ static void *get_data_section_size(VFIORegion *region, uint64_t data_offset,
     return ptr;
 }
 
-static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev, uint64_t *size)
+static int vfio_migration_save_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
+                                            uint64_t *size)
 {
     VFIOMigration *migration = vbasedev->migration;
     VFIORegion *region = &migration->region;
@@ -212,8 +213,8 @@ static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev, uint64_t *size)
         return ret;
     }
 
-    trace_vfio_save_buffer(vbasedev->name, data_offset, data_size,
-                           migration->pending_bytes);
+    trace_vfio_save_buffer_local(vbasedev->name, data_offset, data_size,
+                                 migration->pending_bytes);
 
     qemu_put_be64(f, data_size);
     sz = data_size;
@@ -260,8 +261,8 @@ static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev, uint64_t *size)
     return ret;
 }
 
-static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
-                            uint64_t data_size)
+static int vfio_migration_load_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
+                                            uint64_t data_size)
 {
     VFIORegion *region = &vbasedev->migration->region;
     uint64_t data_offset = 0, size, report_size;
@@ -288,7 +289,8 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
             data_size = 0;
         }
 
-        trace_vfio_load_state_device_data(vbasedev->name, data_offset, size);
+        trace_vfio_load_state_device_data_local(vbasedev->name, data_offset,
+                                                size);
 
         while (size) {
             void *buf;
@@ -331,7 +333,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
     return 0;
 }
 
-static int vfio_update_pending(VFIODevice *vbasedev)
+static int vfio_migration_update_pending_local(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
     VFIORegion *region = &migration->region;
@@ -449,8 +451,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_MASK,
-                                   VFIO_DEVICE_STATE_V1_SAVING);
+    ret = vfio_migration_set_state_local(vbasedev, VFIO_DEVICE_STATE_MASK,
+                                         VFIO_DEVICE_STATE_V1_SAVING);
     if (ret) {
         error_report("%s: Failed to set state SAVING", vbasedev->name);
         return ret;
@@ -484,7 +486,7 @@ static void vfio_save_pending(QEMUFile *f, void *opaque,
     VFIOMigration *migration = vbasedev->migration;
     int ret;
 
-    ret = vfio_update_pending(vbasedev);
+    ret = vfio_migration_update_pending_local(vbasedev);
     if (ret) {
         return;
     }
@@ -505,7 +507,7 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
     qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
 
     if (migration->pending_bytes == 0) {
-        ret = vfio_update_pending(vbasedev);
+        ret = vfio_migration_update_pending_local(vbasedev);
         if (ret) {
             return ret;
         }
@@ -518,10 +520,10 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
         }
     }
 
-    ret = vfio_save_buffer(f, vbasedev, &data_size);
+    ret = vfio_migration_save_buffer_local(f, vbasedev, &data_size);
     if (ret) {
-        error_report("%s: vfio_save_buffer failed %s", vbasedev->name,
-                     strerror(errno));
+        error_report("%s: vfio_miragion_save_buffer_local failed %s",
+                     vbasedev->name, strerror(errno));
         return ret;
     }
 
@@ -534,8 +536,8 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
 
     /*
      * Reset pending_bytes as .save_live_pending is not called during savevm or
-     * snapshot case, in such case vfio_update_pending() at the start of this
-     * function updates pending_bytes.
+     * snapshot case, in such case vfio_migration_update_pending_local() at the
+     * start of this function updates pending_bytes.
      */
     migration->pending_bytes = 0;
     trace_vfio_save_iterate(vbasedev->name, data_size);
@@ -549,22 +551,23 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
     uint64_t data_size;
     int ret;
 
-    ret = vfio_migration_set_state(vbasedev, ~VFIO_DEVICE_STATE_V1_RUNNING,
-                                   VFIO_DEVICE_STATE_V1_SAVING);
+    ret = vfio_migration_set_state_local(vbasedev,
+                                         ~VFIO_DEVICE_STATE_V1_RUNNING,
+                                         VFIO_DEVICE_STATE_V1_SAVING);
     if (ret) {
         error_report("%s: Failed to set state STOP and SAVING",
                      vbasedev->name);
         return ret;
     }
 
-    ret = vfio_update_pending(vbasedev);
+    ret = vfio_migration_update_pending_local(vbasedev);
     if (ret) {
         return ret;
     }
 
     while (migration->pending_bytes > 0) {
         qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
-        ret = vfio_save_buffer(f, vbasedev, &data_size);
+        ret = vfio_migration_save_buffer_local(f, vbasedev, &data_size);
         if (ret < 0) {
             error_report("%s: Failed to save buffer", vbasedev->name);
             return ret;
@@ -574,7 +577,7 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
             break;
         }
 
-        ret = vfio_update_pending(vbasedev);
+        ret = vfio_migration_update_pending_local(vbasedev);
         if (ret) {
             return ret;
         }
@@ -587,7 +590,8 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    ret = vfio_migration_set_state(vbasedev, ~VFIO_DEVICE_STATE_V1_SAVING, 0);
+    ret = vfio_migration_set_state_local(vbasedev, ~VFIO_DEVICE_STATE_V1_SAVING,
+                                         0);
     if (ret) {
         error_report("%s: Failed to set state STOPPED", vbasedev->name);
         return ret;
@@ -638,7 +642,7 @@ static int vfio_load_setup(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    ret = vfio_migration_set_state(vbasedev, ~VFIO_DEVICE_STATE_MASK,
+    ret = vfio_migration_set_state_local(vbasedev, ~VFIO_DEVICE_STATE_MASK,
                                    VFIO_DEVICE_STATE_V1_RESUMING);
     if (ret) {
         error_report("%s: Failed to set state RESUMING", vbasedev->name);
@@ -690,7 +694,7 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
             uint64_t data_size = qemu_get_be64(f);
 
             if (data_size) {
-                ret = vfio_load_buffer(f, vbasedev, data_size);
+                ret = vfio_migration_load_buffer_local(f, vbasedev, data_size);
                 if (ret < 0) {
                     return ret;
                 }
@@ -765,7 +769,7 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
         }
     }
 
-    ret = vfio_migration_set_state(vbasedev, mask, value);
+    ret = vfio_migration_set_state_local(vbasedev, mask, value);
     if (ret) {
         /*
          * Migration should be aborted in this case, but vm_state_notify()
@@ -796,10 +800,10 @@ static void vfio_migration_state_notifier(Notifier *notifier, void *data)
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_FAILED:
         bytes_transferred = 0;
-        ret = vfio_migration_set_state(vbasedev,
-                                       ~(VFIO_DEVICE_STATE_V1_SAVING |
-                                         VFIO_DEVICE_STATE_V1_RESUMING),
-                                       VFIO_DEVICE_STATE_V1_RUNNING);
+        ret = vfio_migration_set_state_local(vbasedev,
+                                             ~(VFIO_DEVICE_STATE_V1_SAVING |
+                                             VFIO_DEVICE_STATE_V1_RESUMING),
+                                             VFIO_DEVICE_STATE_V1_RUNNING);
         if (ret) {
             error_report("%s: Failed to set state RUNNING", vbasedev->name);
         }
@@ -836,7 +840,7 @@ static int vfio_migration_check(VFIODevice *vbasedev)
     return 0;
 }
 
-static int vfio_migration_init(VFIODevice *vbasedev)
+static int vfio_migration_probe_local(VFIODevice *vbasedev)
 {
     int ret;
     Object *obj;
@@ -890,12 +894,12 @@ static int vfio_migration_init(VFIODevice *vbasedev)
     migration->migration_state.notify = vfio_migration_state_notifier;
     add_migration_state_change_notifier(&migration->migration_state);
 
-    trace_vfio_migration_probe(vbasedev->name, info->index);
+    trace_vfio_migration_probe_local(vbasedev->name, info->index);
     g_free(info);
     return 0;
 
 err:
-    vfio_migration_exit(vbasedev);
+    vfio_migration_exit_local(vbasedev);
     g_free(info);
     return ret;
 }
@@ -919,7 +923,7 @@ int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
     vbasedev->migration = g_new0(VFIOMigration, 1);
     vbasedev->migration->vbasedev = vbasedev;
 
-    ret = vfio_migration_init(vbasedev);
+    ret = vfio_migration_probe_local(vbasedev);
     if (ret) {
         goto add_blocker;
     }
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 582882db91..ca85edeb11 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -148,13 +148,13 @@ vfio_display_edid_update(uint32_t prefx, uint32_t prefy) "%ux%u"
 vfio_display_edid_write_error(void) ""
 
 # migration.c
-vfio_migration_probe(const char *name, uint32_t index) " (%s) Region %d"
+vfio_migration_probe_local(const char *name, uint32_t index) " (%s) Region %d"
 vfio_migration_set_state(const char *name, uint32_t state) " (%s) state %d"
 vfio_vmstate_change(const char *name, int running, const char *reason, uint32_t dev_state) " (%s) running %d reason %s device state %d"
 vfio_migration_state_notifier(const char *name, const char *state) " (%s) state %s"
 vfio_save_setup(const char *name) " (%s)"
 vfio_save_cleanup(const char *name) " (%s)"
-vfio_save_buffer(const char *name, uint64_t data_offset, uint64_t data_size, uint64_t pending) " (%s) Offset 0x%"PRIx64" size 0x%"PRIx64" pending 0x%"PRIx64
+vfio_save_buffer_local(const char *name, uint64_t data_offset, uint64_t data_size, uint64_t pending) " (%s) Offset 0x%"PRIx64" size 0x%"PRIx64" pending 0x%"PRIx64
 vfio_update_pending(const char *name, uint64_t pending) " (%s) pending 0x%"PRIx64
 vfio_save_device_config_state(const char *name) " (%s)"
 vfio_save_pending(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t compatible) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" compatible 0x%"PRIx64
@@ -162,7 +162,7 @@ vfio_save_iterate(const char *name, int data_size) " (%s) data_size %d"
 vfio_save_complete_precopy(const char *name) " (%s)"
 vfio_load_device_config_state(const char *name) " (%s)"
 vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64
-vfio_load_state_device_data(const char *name, uint64_t data_offset, uint64_t data_size) " (%s) Offset 0x%"PRIx64" size 0x%"PRIx64
+vfio_load_state_device_data_local(const char *name, uint64_t data_offset, uint64_t data_size) " (%s) Offset 0x%"PRIx64" size 0x%"PRIx64
 vfio_load_cleanup(const char *name) " (%s)"
 vfio_get_dirty_bitmap(int fd, uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t start) "container fd=%d, iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64
 vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" - 0x%"PRIx64
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 06/13] vfio/migration: introduce VFIOMigrationOps layer in VFIO live migration framework
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (4 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 05/13] vfio/migration: rename " Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 07/13] vfio/migration: move the statistics of bytes_transferred to generic VFIO migration layer Lei Rao
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Add an abstraction layer, VFIOMigrationOps, to the VFIO live migration
framework. Also adapt the In-Band approach to this abstraction layer by defining
its own VFIOMigrationOps callbacks.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/vfio/migration.c           | 203 +++++++++++++++++++++-------------
 include/hw/vfio/vfio-common.h |  14 +++
 2 files changed, 142 insertions(+), 75 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 04360e1f17..4736af90e7 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -407,7 +407,11 @@ static void vfio_migration_cleanup_local(VFIODevice *vbasedev)
 
 static void vfio_migration_cleanup(VFIODevice *vbasedev)
 {
-    vfio_migration_cleanup_local(vbasedev);
+    VFIOMigration *migration = vbasedev->migration;
+
+    if (migration->ops->cleanup) {
+        migration->ops->cleanup(vbasedev);
+    }
 }
 
 /* ---------------------------------------------------------------------- */
@@ -438,24 +442,29 @@ static int vfio_migration_save_setup_local(VFIODevice *vbasedev)
 static int vfio_save_setup(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
     int ret;
 
     trace_vfio_save_setup(vbasedev->name);
 
     qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
 
-    ret = vfio_migration_save_setup_local(vbasedev);
-    if (ret) {
-        error_report("%s: Failed to vfio lm save setup:%s",
-                     vbasedev->name, strerror(-ret));
-        return ret;
+    if (migration->ops->save_setup) {
+        ret = migration->ops->save_setup(vbasedev);
+        if (ret) {
+            error_report("%s: Failed to vfio lm save setup:%s",
+                         vbasedev->name, strerror(-ret));
+            return ret;
+        }
     }
 
-    ret = vfio_migration_set_state_local(vbasedev, VFIO_DEVICE_STATE_MASK,
-                                         VFIO_DEVICE_STATE_V1_SAVING);
-    if (ret) {
-        error_report("%s: Failed to set state SAVING", vbasedev->name);
-        return ret;
+    if (migration->ops->set_state) {
+        ret = migration->ops->set_state(vbasedev, VFIO_DEVICE_STATE_MASK,
+                                        VFIO_DEVICE_STATE_V1_SAVING);
+        if (ret) {
+            error_report("%s: Failed to set state SAVING", vbasedev->name);
+            return ret;
+        }
     }
 
     qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
@@ -486,9 +495,11 @@ static void vfio_save_pending(QEMUFile *f, void *opaque,
     VFIOMigration *migration = vbasedev->migration;
     int ret;
 
-    ret = vfio_migration_update_pending_local(vbasedev);
-    if (ret) {
-        return;
+    if (migration->ops->update_pending) {
+        ret = migration->ops->update_pending(vbasedev);
+        if (ret) {
+            return;
+        }
     }
 
     *res_precopy_only += migration->pending_bytes;
@@ -507,9 +518,11 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
     qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
 
     if (migration->pending_bytes == 0) {
-        ret = vfio_migration_update_pending_local(vbasedev);
-        if (ret) {
-            return ret;
+        if (migration->ops->update_pending) {
+            ret = migration->ops->update_pending(vbasedev);
+            if (ret) {
+                return ret;
+            }
         }
 
         if (migration->pending_bytes == 0) {
@@ -520,11 +533,13 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
         }
     }
 
-    ret = vfio_migration_save_buffer_local(f, vbasedev, &data_size);
-    if (ret) {
-        error_report("%s: vfio_miragion_save_buffer_local failed %s",
-                     vbasedev->name, strerror(errno));
-        return ret;
+    if (migration->ops->save_buffer) {
+        ret = migration->ops->save_buffer(f, vbasedev, &data_size);
+        if (ret) {
+            error_report("%s: vfio_miragion_save_buffer_local failed %s",
+                         vbasedev->name, strerror(errno));
+            return ret;
+        }
     }
 
     qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
@@ -551,35 +566,43 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
     uint64_t data_size;
     int ret;
 
-    ret = vfio_migration_set_state_local(vbasedev,
-                                         ~VFIO_DEVICE_STATE_V1_RUNNING,
-                                         VFIO_DEVICE_STATE_V1_SAVING);
-    if (ret) {
-        error_report("%s: Failed to set state STOP and SAVING",
-                     vbasedev->name);
-        return ret;
+    if (migration->ops->set_state) {
+        ret = migration->ops->set_state(vbasedev,
+                                        ~VFIO_DEVICE_STATE_V1_RUNNING,
+                                        VFIO_DEVICE_STATE_V1_SAVING);
+        if (ret) {
+            error_report("%s: Failed to set state STOP and SAVING",
+                         vbasedev->name);
+            return ret;
+        }
     }
 
-    ret = vfio_migration_update_pending_local(vbasedev);
-    if (ret) {
-        return ret;
+    if (migration->ops->update_pending) {
+        ret = migration->ops->update_pending(vbasedev);
+        if (ret) {
+            return ret;
+        }
     }
 
     while (migration->pending_bytes > 0) {
         qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
-        ret = vfio_migration_save_buffer_local(f, vbasedev, &data_size);
-        if (ret < 0) {
-            error_report("%s: Failed to save buffer", vbasedev->name);
-            return ret;
+        if (migration->ops->save_buffer) {
+            ret = migration->ops->save_buffer(f, vbasedev, &data_size);
+            if (ret < 0) {
+                error_report("%s: Failed to save buffer", vbasedev->name);
+                return ret;
+            }
         }
 
         if (data_size == 0) {
             break;
         }
 
-        ret = vfio_migration_update_pending_local(vbasedev);
-        if (ret) {
-            return ret;
+        if (migration->ops->update_pending) {
+            ret = migration->ops->update_pending(vbasedev);
+            if (ret) {
+                return ret;
+            }
         }
     }
 
@@ -590,11 +613,13 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    ret = vfio_migration_set_state_local(vbasedev, ~VFIO_DEVICE_STATE_V1_SAVING,
-                                         0);
-    if (ret) {
-        error_report("%s: Failed to set state STOPPED", vbasedev->name);
-        return ret;
+    if (migration->ops->set_state) {
+        ret = migration->ops->set_state(vbasedev, ~VFIO_DEVICE_STATE_V1_SAVING,
+                                        0);
+        if (ret) {
+            error_report("%s: Failed to set state STOPPED", vbasedev->name);
+            return ret;
+        }
     }
 
     trace_vfio_save_complete_precopy(vbasedev->name);
@@ -634,20 +659,25 @@ static int vfio_migration_load_setup_local(VFIODevice *vbasedev)
 static int vfio_load_setup(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
     int ret = 0;
 
-    ret = vfio_migration_load_setup_local(vbasedev);
-    if (ret < 0) {
-        error_report("%s: Failed to migration load setup", vbasedev->name);
-        return ret;
+    if (migration->ops->load_setup) {
+        ret = migration->ops->load_setup(vbasedev);
+        if (ret < 0) {
+            error_report("%s: Failed to migration load setup", vbasedev->name);
+            return ret;
+        }
     }
 
-    ret = vfio_migration_set_state_local(vbasedev, ~VFIO_DEVICE_STATE_MASK,
-                                   VFIO_DEVICE_STATE_V1_RESUMING);
-    if (ret) {
-        error_report("%s: Failed to set state RESUMING", vbasedev->name);
-        vfio_migration_cleanup(vbasedev);
-        return ret;
+    if (migration->ops->set_state) {
+        ret = migration->ops->set_state(vbasedev, ~VFIO_DEVICE_STATE_MASK,
+                                        VFIO_DEVICE_STATE_V1_RESUMING);
+        if (ret) {
+            error_report("%s: Failed to set state RESUMING", vbasedev->name);
+            vfio_migration_cleanup(vbasedev);
+            return ret;
+        }
     }
     return ret;
 }
@@ -692,11 +722,14 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
         case VFIO_MIG_FLAG_DEV_DATA_STATE:
         {
             uint64_t data_size = qemu_get_be64(f);
+            VFIOMigration *migration = vbasedev->migration;
 
             if (data_size) {
-                ret = vfio_migration_load_buffer_local(f, vbasedev, data_size);
-                if (ret < 0) {
-                    return ret;
+                if (migration->ops->load_buffer) {
+                    ret = migration->ops->load_buffer(f, vbasedev, data_size);
+                    if (ret < 0) {
+                        return ret;
+                    }
                 }
             }
             break;
@@ -736,7 +769,7 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
     uint32_t value, mask;
     int ret;
 
-    if (vbasedev->migration->vm_running == running) {
+    if (migration->vm_running == running) {
         return;
     }
 
@@ -769,17 +802,19 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
         }
     }
 
-    ret = vfio_migration_set_state_local(vbasedev, mask, value);
-    if (ret) {
-        /*
-         * Migration should be aborted in this case, but vm_state_notify()
-         * currently does not support reporting failures.
-         */
-        error_report("%s: Failed to set device state 0x%x", vbasedev->name,
-                     (migration->device_state & mask) | value);
-        qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+    if (migration->ops->set_state) {
+        ret = migration->ops->set_state(vbasedev, mask, value);
+        if (ret) {
+            /*
+             * Migration should be aborted in this case, but vm_state_notify()
+             * currently does not support reporting failures.
+             */
+            error_report("%s: Failed to set device state 0x%x", vbasedev->name,
+                         (migration->device_state & mask) | value);
+            qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+        }
     }
-    vbasedev->migration->vm_running = running;
+    migration->vm_running = running;
     trace_vfio_vmstate_change(vbasedev->name, running, RunState_str(state),
             (migration->device_state & mask) | value);
 }
@@ -800,12 +835,14 @@ static void vfio_migration_state_notifier(Notifier *notifier, void *data)
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_FAILED:
         bytes_transferred = 0;
-        ret = vfio_migration_set_state_local(vbasedev,
-                                             ~(VFIO_DEVICE_STATE_V1_SAVING |
-                                             VFIO_DEVICE_STATE_V1_RESUMING),
-                                             VFIO_DEVICE_STATE_V1_RUNNING);
-        if (ret) {
-            error_report("%s: Failed to set state RUNNING", vbasedev->name);
+        if (migration->ops->set_state) {
+            ret = migration->ops->set_state(vbasedev,
+                                            ~(VFIO_DEVICE_STATE_V1_SAVING |
+                                            VFIO_DEVICE_STATE_V1_RESUMING),
+                                            VFIO_DEVICE_STATE_V1_RUNNING);
+            if (ret) {
+                error_report("%s: Failed to set state RUNNING", vbasedev->name);
+            }
         }
     }
 }
@@ -820,7 +857,11 @@ static void vfio_migration_exit_local(VFIODevice *vbasedev)
 
 static void vfio_migration_exit(VFIODevice *vbasedev)
 {
-    vfio_migration_exit_local(vbasedev);
+    VFIOMigration *migration = vbasedev->migration;
+
+    if (migration->ops->exit) {
+        migration->ops->exit(vbasedev);
+    }
     g_free(vbasedev->migration);
     vbasedev->migration = NULL;
 }
@@ -840,6 +881,17 @@ static int vfio_migration_check(VFIODevice *vbasedev)
     return 0;
 }
 
+static VFIOMigrationOps vfio_local_method = {
+    .save_setup = vfio_migration_save_setup_local,
+    .load_setup = vfio_migration_load_setup_local,
+    .update_pending = vfio_migration_update_pending_local,
+    .save_buffer = vfio_migration_save_buffer_local,
+    .load_buffer = vfio_migration_load_buffer_local,
+    .set_state = vfio_migration_set_state_local,
+    .cleanup = vfio_migration_cleanup_local,
+    .exit = vfio_migration_exit_local,
+};
+
 static int vfio_migration_probe_local(VFIODevice *vbasedev)
 {
     int ret;
@@ -895,6 +947,7 @@ static int vfio_migration_probe_local(VFIODevice *vbasedev)
     add_migration_state_change_notifier(&migration->migration_state);
 
     trace_vfio_migration_probe_local(vbasedev->name, info->index);
+    migration->ops = &vfio_local_method;
     g_free(info);
     return 0;
 
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index e573f5a9f1..8ef85a871c 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -58,10 +58,13 @@ typedef struct VFIORegion {
     uint8_t nr; /* cache the region number for debug */
 } VFIORegion;
 
+typedef struct VFIOMigrationOps VFIOMigrationOps;
+
 typedef struct VFIOMigration {
     struct VFIODevice *vbasedev;
     VMChangeStateEntry *vm_state;
     VFIORegion region;
+    VFIOMigrationOps *ops;
     uint32_t device_state;
     int vm_running;
     Notifier migration_state;
@@ -154,6 +157,17 @@ struct VFIODeviceOps {
     int (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f);
 };
 
+typedef struct VFIOMigrationOps {
+    int (*save_setup)(VFIODevice *vbasedev);
+    int (*load_setup)(VFIODevice *vbasedev);
+    int (*update_pending)(VFIODevice *vbasedev);
+    int (*save_buffer)(QEMUFile *f, VFIODevice *vbasedev, uint64_t *size);
+    int (*load_buffer)(QEMUFile *f, VFIODevice *vbasedev, uint64_t data_size);
+    int (*set_state)(VFIODevice *vbasedev, uint32_t mask, uint32_t value);
+    void (*cleanup)(VFIODevice *vbasedev);
+    void (*exit)(VFIODevice *vbasedev);
+} VFIOMigrationOps;
+
 typedef struct VFIOGroup {
     int fd;
     int groupid;
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 07/13] vfio/migration: move the statistics of bytes_transferred to generic VFIO migration layer
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (5 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 06/13] vfio/migration: introduce VFIOMigrationOps layer in VFIO live migration framework Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 08/13] vfio/migration: split migration handler registering from vfio_migration_init Lei Rao
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

The statistics of bytes transferred conceptually belong to The VFIO live
migration framework, and should not belong to any specific implementation
such In-Band approach, so move it out from vfio_migration_region_save_buffer(),
which makes it easier to add other implementations.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/vfio/migration.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 4736af90e7..c114fab3a2 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -257,7 +257,6 @@ static int vfio_migration_save_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
         *size = data_size;
     }
 
-    bytes_transferred += data_size;
     return ret;
 }
 
@@ -540,6 +539,7 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
                          vbasedev->name, strerror(errno));
             return ret;
         }
+        bytes_transferred += data_size;
     }
 
     qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
@@ -592,6 +592,7 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
                 error_report("%s: Failed to save buffer", vbasedev->name);
                 return ret;
             }
+            bytes_transferred += data_size;
         }
 
         if (data_size == 0) {
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 08/13] vfio/migration: split migration handler registering from vfio_migration_init
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (6 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 07/13] vfio/migration: move the statistics of bytes_transferred to generic VFIO migration layer Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 09/13] vfio/migration: move the functions of In-Band approach to a new file Lei Rao
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

vfio_migration_init() is mainly related to initialization of In-Band approach.
Migration handler registering may also be used by other approaches. so split it
from vfio_migration_init() and move it to vfio_migration_probe().

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/vfio/migration.c | 56 ++++++++++++++++++++++++++++++---------------
 1 file changed, 37 insertions(+), 19 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index c114fab3a2..0c67ed85f3 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -882,6 +882,38 @@ static int vfio_migration_check(VFIODevice *vbasedev)
     return 0;
 }
 
+static int vfio_migration_register_handlers(VFIODevice *vbasedev)
+{
+    Object *obj;
+    char id[256] = "";
+    g_autofree char *path = NULL, *oid = NULL;
+    VFIOMigration *migration = vbasedev->migration;
+
+    obj = vbasedev->ops->vfio_get_object(vbasedev);
+    if (!obj) {
+        return -EINVAL;
+    }
+
+    oid = vmstate_if_get_id(VMSTATE_IF(DEVICE(obj)));
+    if (oid) {
+        path = g_strdup_printf("%s/vfio", oid);
+    } else {
+        path = g_strdup("vfio");
+    }
+    strpadcpy(id, sizeof(id), path, '\0');
+
+    register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
+                         vbasedev);
+
+    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
+                                                           vfio_vmstate_change,
+                                                           vbasedev);
+    migration->migration_state.notify = vfio_migration_state_notifier;
+    add_migration_state_change_notifier(&migration->migration_state);
+
+    return 0;
+}
+
 static VFIOMigrationOps vfio_local_method = {
     .save_setup = vfio_migration_save_setup_local,
     .load_setup = vfio_migration_load_setup_local,
@@ -897,9 +929,7 @@ static int vfio_migration_probe_local(VFIODevice *vbasedev)
 {
     int ret;
     Object *obj;
-    char id[256] = "";
     struct vfio_region_info *info = NULL;
-    g_autofree char *path = NULL, *oid = NULL;
     VFIOMigration *migration = vbasedev->migration;
 
     obj = vbasedev->ops->vfio_get_object(vbasedev);
@@ -930,23 +960,6 @@ static int vfio_migration_probe_local(VFIODevice *vbasedev)
         goto err;
     }
 
-    oid = vmstate_if_get_id(VMSTATE_IF(DEVICE(obj)));
-    if (oid) {
-        path = g_strdup_printf("%s/vfio", oid);
-    } else {
-        path = g_strdup("vfio");
-    }
-    strpadcpy(id, sizeof(id), path, '\0');
-
-    register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
-                         vbasedev);
-
-    migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
-                                                           vfio_vmstate_change,
-                                                           vbasedev);
-    migration->migration_state.notify = vfio_migration_state_notifier;
-    add_migration_state_change_notifier(&migration->migration_state);
-
     trace_vfio_migration_probe_local(vbasedev->name, info->index);
     migration->ops = &vfio_local_method;
     g_free(info);
@@ -982,6 +995,11 @@ int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
         goto add_blocker;
     }
 
+    ret = vfio_migration_register_handlers(vbasedev);
+    if (ret) {
+        goto add_blocker;
+    }
+
     return 0;
 
 add_blocker:
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 09/13] vfio/migration: move the functions of In-Band approach to a new file
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (7 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 08/13] vfio/migration: split migration handler registering from vfio_migration_init Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 10/13] vfio/pci: introduce command-line parameters to specify migration method Lei Rao
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Moving the functions of In-Band approach into a new file to match with the new
abstraction layer of migration ops.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/vfio/meson.build           |   1 +
 hw/vfio/migration-local.c     | 453 ++++++++++++++++++++++++++++++++++
 hw/vfio/migration.c           | 421 -------------------------------
 include/hw/vfio/vfio-common.h |   1 +
 4 files changed, 455 insertions(+), 421 deletions(-)
 create mode 100644 hw/vfio/migration-local.c

diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build
index da9af297a0..5a72b8c349 100644
--- a/hw/vfio/meson.build
+++ b/hw/vfio/meson.build
@@ -3,6 +3,7 @@ vfio_ss.add(files(
   'common.c',
   'spapr.c',
   'migration.c',
+  'migration-local.c',
 ))
 vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files(
   'display.c',
diff --git a/hw/vfio/migration-local.c b/hw/vfio/migration-local.c
new file mode 100644
index 0000000000..46c8baed50
--- /dev/null
+++ b/hw/vfio/migration-local.c
@@ -0,0 +1,453 @@
+/*
+ * QEMU VFIO Migration Support
+ *
+ * Copyright NVIDIA, Inc. 2020
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/main-loop.h"
+#include "qemu/cutils.h"
+#include <linux/vfio.h>
+#include <sys/ioctl.h>
+
+#include "sysemu/runstate.h"
+#include "hw/vfio/vfio-common.h"
+#include "migration/migration.h"
+#include "migration/vmstate.h"
+#include "migration/qemu-file.h"
+#include "migration/register.h"
+#include "migration/blocker.h"
+#include "migration/misc.h"
+#include "qapi/error.h"
+#include "exec/ramlist.h"
+#include "exec/ram_addr.h"
+#include "pci.h"
+#include "trace.h"
+#include "hw/hw.h"
+#include "ui/console.h"
+
+static inline int vfio_mig_access(VFIODevice *vbasedev, void *val, int count,
+                                  off_t off, bool iswrite)
+{
+    int ret;
+
+    ret = iswrite ? pwrite(vbasedev->fd, val, count, off) :
+                    pread(vbasedev->fd, val, count, off);
+    if (ret < count) {
+        error_report("vfio_mig_%s %d byte %s: failed at offset 0x%"
+                     HWADDR_PRIx", err: %s", iswrite ? "write" : "read", count,
+                     vbasedev->name, off, strerror(errno));
+        return (ret < 0) ? ret : -EINVAL;
+    }
+    return 0;
+}
+
+static int vfio_mig_rw(VFIODevice *vbasedev, __u8 *buf, size_t count,
+                       off_t off, bool iswrite)
+{
+    int ret, done = 0;
+    __u8 *tbuf = buf;
+
+    while (count) {
+        int bytes = 0;
+
+        if (count >= 8 && !(off % 8)) {
+            bytes = 8;
+        } else if (count >= 4 && !(off % 4)) {
+            bytes = 4;
+        } else if (count >= 2 && !(off % 2)) {
+            bytes = 2;
+        } else {
+            bytes = 1;
+        }
+
+        ret = vfio_mig_access(vbasedev, tbuf, bytes, off, iswrite);
+        if (ret) {
+            return ret;
+        }
+
+        count -= bytes;
+        done += bytes;
+        off += bytes;
+        tbuf += bytes;
+    }
+    return done;
+}
+
+#define vfio_mig_read(f, v, c, o)       vfio_mig_rw(f, (__u8 *)v, c, o, false)
+#define vfio_mig_write(f, v, c, o)      vfio_mig_rw(f, (__u8 *)v, c, o, true)
+
+#define VFIO_MIG_STRUCT_OFFSET(f)       \
+                                 offsetof(struct vfio_device_migration_info, f)
+/*
+ * Change the device_state register for device @vbasedev. Bits set in @mask
+ * are preserved, bits set in @value are set, and bits not set in either @mask
+ * or @value are cleared in device_state. If the register cannot be accessed,
+ * the resulting state would be invalid, or the device enters an error state,
+ * an error is returned.
+ */
+
+static int vfio_migration_set_state_local(VFIODevice *vbasedev, uint32_t mask,
+                                          uint32_t value)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region;
+    off_t dev_state_off = region->fd_offset +
+                          VFIO_MIG_STRUCT_OFFSET(device_state);
+    uint32_t device_state;
+    int ret;
+
+    ret = vfio_mig_read(vbasedev, &device_state, sizeof(device_state),
+                        dev_state_off);
+    if (ret < 0) {
+        return ret;
+    }
+
+    device_state = (device_state & mask) | value;
+
+    if (!VFIO_DEVICE_STATE_VALID(device_state)) {
+        return -EINVAL;
+    }
+
+    ret = vfio_mig_write(vbasedev, &device_state, sizeof(device_state),
+                         dev_state_off);
+    if (ret < 0) {
+        int rret;
+
+        rret = vfio_mig_read(vbasedev, &device_state, sizeof(device_state),
+                             dev_state_off);
+
+        if ((rret < 0) || (VFIO_DEVICE_STATE_IS_ERROR(device_state))) {
+            hw_error("%s: Device in error state 0x%x", vbasedev->name,
+                     device_state);
+            return rret ? rret : -EIO;
+        }
+        return ret;
+    }
+
+    migration->device_state = device_state;
+    trace_vfio_migration_set_state(vbasedev->name, device_state);
+    return 0;
+}
+
+static void *get_data_section_size(VFIORegion *region, uint64_t data_offset,
+                                   uint64_t data_size, uint64_t *size)
+{
+    void *ptr = NULL;
+    uint64_t limit = 0;
+    int i;
+
+    if (!region->mmaps) {
+        if (size) {
+            *size = MIN(data_size, region->size - data_offset);
+        }
+        return ptr;
+    }
+
+    for (i = 0; i < region->nr_mmaps; i++) {
+        VFIOMmap *map = region->mmaps + i;
+
+        if ((data_offset >= map->offset) &&
+            (data_offset < map->offset + map->size)) {
+
+            /* check if data_offset is within sparse mmap areas */
+            ptr = map->mmap + data_offset - map->offset;
+            if (size) {
+                *size = MIN(data_size, map->offset + map->size - data_offset);
+            }
+            break;
+        } else if ((data_offset < map->offset) &&
+                   (!limit || limit > map->offset)) {
+            /*
+             * data_offset is not within sparse mmap areas, find size of
+             * non-mapped area. Check through all list since region->mmaps list
+             * is not sorted.
+             */
+            limit = map->offset;
+        }
+    }
+
+    if (!ptr && size) {
+        *size = limit ? MIN(data_size, limit - data_offset) : data_size;
+    }
+    return ptr;
+}
+
+static int vfio_migration_save_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
+                                            uint64_t *size)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region;
+    uint64_t data_offset = 0, data_size = 0, sz;
+    int ret;
+
+    ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset),
+                      region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset));
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = vfio_mig_read(vbasedev, &data_size, sizeof(data_size),
+                        region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size));
+    if (ret < 0) {
+        return ret;
+    }
+
+    trace_vfio_save_buffer_local(vbasedev->name, data_offset, data_size,
+                                 migration->pending_bytes);
+
+    qemu_put_be64(f, data_size);
+    sz = data_size;
+
+    while (sz) {
+        void *buf;
+        uint64_t sec_size;
+        bool buf_allocated = false;
+
+        buf = get_data_section_size(region, data_offset, sz, &sec_size);
+
+        if (!buf) {
+            buf = g_try_malloc(sec_size);
+            if (!buf) {
+                error_report("%s: Error allocating buffer ", __func__);
+                return -ENOMEM;
+            }
+            buf_allocated = true;
+
+            ret = vfio_mig_read(vbasedev, buf, sec_size,
+                                region->fd_offset + data_offset);
+            if (ret < 0) {
+                g_free(buf);
+                return ret;
+            }
+        }
+
+        qemu_put_buffer(f, buf, sec_size);
+
+        if (buf_allocated) {
+            g_free(buf);
+        }
+        sz -= sec_size;
+        data_offset += sec_size;
+    }
+
+    ret = qemu_file_get_error(f);
+
+    if (!ret && size) {
+        *size = data_size;
+    }
+
+    return ret;
+}
+
+static int vfio_migration_load_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
+                                            uint64_t data_size)
+{
+    VFIORegion *region = &vbasedev->migration->region;
+    uint64_t data_offset = 0, size, report_size;
+    int ret;
+
+    do {
+        ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset),
+                      region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset));
+        if (ret < 0) {
+            return ret;
+        }
+
+        if (data_offset + data_size > region->size) {
+            /*
+             * If data_size is greater than the data section of migration region
+             * then iterate the write buffer operation. This case can occur if
+             * size of migration region at destination is smaller than size of
+             * migration region at source.
+             */
+            report_size = size = region->size - data_offset;
+            data_size -= size;
+        } else {
+            report_size = size = data_size;
+            data_size = 0;
+        }
+
+        trace_vfio_load_state_device_data_local(vbasedev->name, data_offset,
+                                                size);
+
+        while (size) {
+            void *buf;
+            uint64_t sec_size;
+            bool buf_alloc = false;
+
+            buf = get_data_section_size(region, data_offset, size, &sec_size);
+
+            if (!buf) {
+                buf = g_try_malloc(sec_size);
+                if (!buf) {
+                    error_report("%s: Error allocating buffer ", __func__);
+                    return -ENOMEM;
+                }
+                buf_alloc = true;
+            }
+
+            qemu_get_buffer(f, buf, sec_size);
+
+            if (buf_alloc) {
+                ret = vfio_mig_write(vbasedev, buf, sec_size,
+                        region->fd_offset + data_offset);
+                g_free(buf);
+
+                if (ret < 0) {
+                    return ret;
+                }
+            }
+            size -= sec_size;
+            data_offset += sec_size;
+        }
+
+        ret = vfio_mig_write(vbasedev, &report_size, sizeof(report_size),
+                        region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size));
+        if (ret < 0) {
+            return ret;
+        }
+    } while (data_size);
+
+    return 0;
+}
+
+static int vfio_migration_update_pending_local(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region;
+    uint64_t pending_bytes = 0;
+    int ret;
+
+    ret = vfio_mig_read(vbasedev, &pending_bytes, sizeof(pending_bytes),
+                    region->fd_offset + VFIO_MIG_STRUCT_OFFSET(pending_bytes));
+    if (ret < 0) {
+        migration->pending_bytes = 0;
+        return ret;
+    }
+
+    migration->pending_bytes = pending_bytes;
+    trace_vfio_update_pending(vbasedev->name, pending_bytes);
+    return 0;
+}
+
+static void vfio_migration_cleanup_local(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+
+    if (migration->region.mmaps) {
+        vfio_region_unmap(&migration->region);
+    }
+}
+
+static int vfio_migration_save_setup_local(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    int ret = -1;
+
+    if (migration->region.mmaps) {
+        /*
+         * Calling vfio_region_mmap() from migration thread. Memory API called
+         * from this function require locking the iothread when called from
+         * outside the main loop thread.
+         */
+        qemu_mutex_lock_iothread();
+        ret = vfio_region_mmap(&migration->region);
+        qemu_mutex_unlock_iothread();
+        if (ret) {
+            error_report("%s: Failed to mmap VFIO migration region: %s",
+                         vbasedev->name, strerror(-ret));
+            error_report("%s: Falling back to slow path", vbasedev->name);
+        }
+    }
+    return ret;
+}
+
+static int vfio_migration_load_setup_local(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    int ret = -1;
+
+    if (migration->region.mmaps) {
+        ret = vfio_region_mmap(&migration->region);
+        if (ret) {
+            error_report("%s: Failed to mmap VFIO migration region %d: %s",
+                         vbasedev->name, migration->region.nr,
+                         strerror(-ret));
+            error_report("%s: Falling back to slow path", vbasedev->name);
+        }
+    }
+    return ret;
+}
+
+static void vfio_migration_exit_local(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+
+    vfio_region_exit(&migration->region);
+    vfio_region_finalize(&migration->region);
+}
+
+static VFIOMigrationOps vfio_local_method = {
+    .save_setup = vfio_migration_save_setup_local,
+    .load_setup = vfio_migration_load_setup_local,
+    .update_pending = vfio_migration_update_pending_local,
+    .save_buffer = vfio_migration_save_buffer_local,
+    .load_buffer = vfio_migration_load_buffer_local,
+    .set_state = vfio_migration_set_state_local,
+    .cleanup = vfio_migration_cleanup_local,
+    .exit = vfio_migration_exit_local,
+};
+
+int vfio_migration_probe_local(VFIODevice *vbasedev)
+{
+    int ret;
+    Object *obj;
+    struct vfio_region_info *info = NULL;
+    VFIOMigration *migration = vbasedev->migration;
+
+    obj = vbasedev->ops->vfio_get_object(vbasedev);
+    if (!obj) {
+        return -EINVAL;
+    }
+
+    ret = vfio_get_dev_region_info(vbasedev,
+                                   VFIO_REGION_TYPE_MIGRATION_DEPRECATED,
+                                   VFIO_REGION_SUBTYPE_MIGRATION_DEPRECATED,
+                                   &info);
+    if (ret) {
+        return -EINVAL;
+    }
+
+    ret = vfio_region_setup(obj, vbasedev, &vbasedev->migration->region,
+                            info->index, "migration");
+    if (ret) {
+        error_report("%s: Failed to setup VFIO migration region %d: %s",
+                     vbasedev->name, info->index, strerror(-ret));
+        goto err;
+    }
+
+    if (!vbasedev->migration->region.size) {
+        error_report("%s: Invalid zero-sized VFIO migration region %d",
+                     vbasedev->name, info->index);
+        ret = -EINVAL;
+        goto err;
+    }
+
+    trace_vfio_migration_probe_local(vbasedev->name, info->index);
+    migration->ops = &vfio_local_method;
+    g_free(info);
+    return 0;
+
+err:
+    vfio_migration_exit_local(vbasedev);
+    g_free(info);
+    return ret;
+}
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 0c67ed85f3..bb62e1ca0e 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -46,311 +46,6 @@
 
 static int64_t bytes_transferred;
 
-static inline int vfio_mig_access(VFIODevice *vbasedev, void *val, int count,
-                                  off_t off, bool iswrite)
-{
-    int ret;
-
-    ret = iswrite ? pwrite(vbasedev->fd, val, count, off) :
-                    pread(vbasedev->fd, val, count, off);
-    if (ret < count) {
-        error_report("vfio_mig_%s %d byte %s: failed at offset 0x%"
-                     HWADDR_PRIx", err: %s", iswrite ? "write" : "read", count,
-                     vbasedev->name, off, strerror(errno));
-        return (ret < 0) ? ret : -EINVAL;
-    }
-    return 0;
-}
-
-static int vfio_mig_rw(VFIODevice *vbasedev, __u8 *buf, size_t count,
-                       off_t off, bool iswrite)
-{
-    int ret, done = 0;
-    __u8 *tbuf = buf;
-
-    while (count) {
-        int bytes = 0;
-
-        if (count >= 8 && !(off % 8)) {
-            bytes = 8;
-        } else if (count >= 4 && !(off % 4)) {
-            bytes = 4;
-        } else if (count >= 2 && !(off % 2)) {
-            bytes = 2;
-        } else {
-            bytes = 1;
-        }
-
-        ret = vfio_mig_access(vbasedev, tbuf, bytes, off, iswrite);
-        if (ret) {
-            return ret;
-        }
-
-        count -= bytes;
-        done += bytes;
-        off += bytes;
-        tbuf += bytes;
-    }
-    return done;
-}
-
-#define vfio_mig_read(f, v, c, o)       vfio_mig_rw(f, (__u8 *)v, c, o, false)
-#define vfio_mig_write(f, v, c, o)      vfio_mig_rw(f, (__u8 *)v, c, o, true)
-
-#define VFIO_MIG_STRUCT_OFFSET(f)       \
-                                 offsetof(struct vfio_device_migration_info, f)
-/*
- * Change the device_state register for device @vbasedev. Bits set in @mask
- * are preserved, bits set in @value are set, and bits not set in either @mask
- * or @value are cleared in device_state. If the register cannot be accessed,
- * the resulting state would be invalid, or the device enters an error state,
- * an error is returned.
- */
-
-static int vfio_migration_set_state_local(VFIODevice *vbasedev, uint32_t mask,
-                                          uint32_t value)
-{
-    VFIOMigration *migration = vbasedev->migration;
-    VFIORegion *region = &migration->region;
-    off_t dev_state_off = region->fd_offset +
-                          VFIO_MIG_STRUCT_OFFSET(device_state);
-    uint32_t device_state;
-    int ret;
-
-    ret = vfio_mig_read(vbasedev, &device_state, sizeof(device_state),
-                        dev_state_off);
-    if (ret < 0) {
-        return ret;
-    }
-
-    device_state = (device_state & mask) | value;
-
-    if (!VFIO_DEVICE_STATE_VALID(device_state)) {
-        return -EINVAL;
-    }
-
-    ret = vfio_mig_write(vbasedev, &device_state, sizeof(device_state),
-                         dev_state_off);
-    if (ret < 0) {
-        int rret;
-
-        rret = vfio_mig_read(vbasedev, &device_state, sizeof(device_state),
-                             dev_state_off);
-
-        if ((rret < 0) || (VFIO_DEVICE_STATE_IS_ERROR(device_state))) {
-            hw_error("%s: Device in error state 0x%x", vbasedev->name,
-                     device_state);
-            return rret ? rret : -EIO;
-        }
-        return ret;
-    }
-
-    migration->device_state = device_state;
-    trace_vfio_migration_set_state(vbasedev->name, device_state);
-    return 0;
-}
-
-static void *get_data_section_size(VFIORegion *region, uint64_t data_offset,
-                                   uint64_t data_size, uint64_t *size)
-{
-    void *ptr = NULL;
-    uint64_t limit = 0;
-    int i;
-
-    if (!region->mmaps) {
-        if (size) {
-            *size = MIN(data_size, region->size - data_offset);
-        }
-        return ptr;
-    }
-
-    for (i = 0; i < region->nr_mmaps; i++) {
-        VFIOMmap *map = region->mmaps + i;
-
-        if ((data_offset >= map->offset) &&
-            (data_offset < map->offset + map->size)) {
-
-            /* check if data_offset is within sparse mmap areas */
-            ptr = map->mmap + data_offset - map->offset;
-            if (size) {
-                *size = MIN(data_size, map->offset + map->size - data_offset);
-            }
-            break;
-        } else if ((data_offset < map->offset) &&
-                   (!limit || limit > map->offset)) {
-            /*
-             * data_offset is not within sparse mmap areas, find size of
-             * non-mapped area. Check through all list since region->mmaps list
-             * is not sorted.
-             */
-            limit = map->offset;
-        }
-    }
-
-    if (!ptr && size) {
-        *size = limit ? MIN(data_size, limit - data_offset) : data_size;
-    }
-    return ptr;
-}
-
-static int vfio_migration_save_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
-                                            uint64_t *size)
-{
-    VFIOMigration *migration = vbasedev->migration;
-    VFIORegion *region = &migration->region;
-    uint64_t data_offset = 0, data_size = 0, sz;
-    int ret;
-
-    ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset),
-                      region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset));
-    if (ret < 0) {
-        return ret;
-    }
-
-    ret = vfio_mig_read(vbasedev, &data_size, sizeof(data_size),
-                        region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size));
-    if (ret < 0) {
-        return ret;
-    }
-
-    trace_vfio_save_buffer_local(vbasedev->name, data_offset, data_size,
-                                 migration->pending_bytes);
-
-    qemu_put_be64(f, data_size);
-    sz = data_size;
-
-    while (sz) {
-        void *buf;
-        uint64_t sec_size;
-        bool buf_allocated = false;
-
-        buf = get_data_section_size(region, data_offset, sz, &sec_size);
-
-        if (!buf) {
-            buf = g_try_malloc(sec_size);
-            if (!buf) {
-                error_report("%s: Error allocating buffer ", __func__);
-                return -ENOMEM;
-            }
-            buf_allocated = true;
-
-            ret = vfio_mig_read(vbasedev, buf, sec_size,
-                                region->fd_offset + data_offset);
-            if (ret < 0) {
-                g_free(buf);
-                return ret;
-            }
-        }
-
-        qemu_put_buffer(f, buf, sec_size);
-
-        if (buf_allocated) {
-            g_free(buf);
-        }
-        sz -= sec_size;
-        data_offset += sec_size;
-    }
-
-    ret = qemu_file_get_error(f);
-
-    if (!ret && size) {
-        *size = data_size;
-    }
-
-    return ret;
-}
-
-static int vfio_migration_load_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
-                                            uint64_t data_size)
-{
-    VFIORegion *region = &vbasedev->migration->region;
-    uint64_t data_offset = 0, size, report_size;
-    int ret;
-
-    do {
-        ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset),
-                      region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset));
-        if (ret < 0) {
-            return ret;
-        }
-
-        if (data_offset + data_size > region->size) {
-            /*
-             * If data_size is greater than the data section of migration region
-             * then iterate the write buffer operation. This case can occur if
-             * size of migration region at destination is smaller than size of
-             * migration region at source.
-             */
-            report_size = size = region->size - data_offset;
-            data_size -= size;
-        } else {
-            report_size = size = data_size;
-            data_size = 0;
-        }
-
-        trace_vfio_load_state_device_data_local(vbasedev->name, data_offset,
-                                                size);
-
-        while (size) {
-            void *buf;
-            uint64_t sec_size;
-            bool buf_alloc = false;
-
-            buf = get_data_section_size(region, data_offset, size, &sec_size);
-
-            if (!buf) {
-                buf = g_try_malloc(sec_size);
-                if (!buf) {
-                    error_report("%s: Error allocating buffer ", __func__);
-                    return -ENOMEM;
-                }
-                buf_alloc = true;
-            }
-
-            qemu_get_buffer(f, buf, sec_size);
-
-            if (buf_alloc) {
-                ret = vfio_mig_write(vbasedev, buf, sec_size,
-                        region->fd_offset + data_offset);
-                g_free(buf);
-
-                if (ret < 0) {
-                    return ret;
-                }
-            }
-            size -= sec_size;
-            data_offset += sec_size;
-        }
-
-        ret = vfio_mig_write(vbasedev, &report_size, sizeof(report_size),
-                        region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size));
-        if (ret < 0) {
-            return ret;
-        }
-    } while (data_size);
-
-    return 0;
-}
-
-static int vfio_migration_update_pending_local(VFIODevice *vbasedev)
-{
-    VFIOMigration *migration = vbasedev->migration;
-    VFIORegion *region = &migration->region;
-    uint64_t pending_bytes = 0;
-    int ret;
-
-    ret = vfio_mig_read(vbasedev, &pending_bytes, sizeof(pending_bytes),
-                    region->fd_offset + VFIO_MIG_STRUCT_OFFSET(pending_bytes));
-    if (ret < 0) {
-        migration->pending_bytes = 0;
-        return ret;
-    }
-
-    migration->pending_bytes = pending_bytes;
-    trace_vfio_update_pending(vbasedev->name, pending_bytes);
-    return 0;
-}
-
 static int vfio_save_device_config_state(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
@@ -395,15 +90,6 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque)
     return qemu_file_get_error(f);
 }
 
-static void vfio_migration_cleanup_local(VFIODevice *vbasedev)
-{
-    VFIOMigration *migration = vbasedev->migration;
-
-    if (migration->region.mmaps) {
-        vfio_region_unmap(&migration->region);
-    }
-}
-
 static void vfio_migration_cleanup(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
@@ -413,31 +99,6 @@ static void vfio_migration_cleanup(VFIODevice *vbasedev)
     }
 }
 
-/* ---------------------------------------------------------------------- */
-
-static int vfio_migration_save_setup_local(VFIODevice *vbasedev)
-{
-    VFIOMigration *migration = vbasedev->migration;
-    int ret = -1;
-
-    if (migration->region.mmaps) {
-        /*
-         * Calling vfio_region_mmap() from migration thread. Memory API called
-         * from this function require locking the iothread when called from
-         * outside the main loop thread.
-         */
-        qemu_mutex_lock_iothread();
-        ret = vfio_region_mmap(&migration->region);
-        qemu_mutex_unlock_iothread();
-        if (ret) {
-            error_report("%s: Failed to mmap VFIO migration region: %s",
-                         vbasedev->name, strerror(-ret));
-            error_report("%s: Falling back to slow path", vbasedev->name);
-        }
-    }
-    return ret;
-}
-
 static int vfio_save_setup(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
@@ -640,23 +301,6 @@ static void vfio_save_state(QEMUFile *f, void *opaque)
     }
 }
 
-static int vfio_migration_load_setup_local(VFIODevice *vbasedev)
-{
-    VFIOMigration *migration = vbasedev->migration;
-    int ret = -1;
-
-    if (migration->region.mmaps) {
-        ret = vfio_region_mmap(&migration->region);
-        if (ret) {
-            error_report("%s: Failed to mmap VFIO migration region %d: %s",
-                         vbasedev->name, migration->region.nr,
-                         strerror(-ret));
-            error_report("%s: Falling back to slow path", vbasedev->name);
-        }
-    }
-    return ret;
-}
-
 static int vfio_load_setup(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
@@ -848,14 +492,6 @@ static void vfio_migration_state_notifier(Notifier *notifier, void *data)
     }
 }
 
-static void vfio_migration_exit_local(VFIODevice *vbasedev)
-{
-    VFIOMigration *migration = vbasedev->migration;
-
-    vfio_region_exit(&migration->region);
-    vfio_region_finalize(&migration->region);
-}
-
 static void vfio_migration_exit(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
@@ -914,63 +550,6 @@ static int vfio_migration_register_handlers(VFIODevice *vbasedev)
     return 0;
 }
 
-static VFIOMigrationOps vfio_local_method = {
-    .save_setup = vfio_migration_save_setup_local,
-    .load_setup = vfio_migration_load_setup_local,
-    .update_pending = vfio_migration_update_pending_local,
-    .save_buffer = vfio_migration_save_buffer_local,
-    .load_buffer = vfio_migration_load_buffer_local,
-    .set_state = vfio_migration_set_state_local,
-    .cleanup = vfio_migration_cleanup_local,
-    .exit = vfio_migration_exit_local,
-};
-
-static int vfio_migration_probe_local(VFIODevice *vbasedev)
-{
-    int ret;
-    Object *obj;
-    struct vfio_region_info *info = NULL;
-    VFIOMigration *migration = vbasedev->migration;
-
-    obj = vbasedev->ops->vfio_get_object(vbasedev);
-    if (!obj) {
-        return -EINVAL;
-    }
-
-    ret = vfio_get_dev_region_info(vbasedev,
-                                   VFIO_REGION_TYPE_MIGRATION_DEPRECATED,
-                                   VFIO_REGION_SUBTYPE_MIGRATION_DEPRECATED,
-                                   &info);
-    if (ret) {
-        return -EINVAL;
-    }
-
-    ret = vfio_region_setup(obj, vbasedev, &vbasedev->migration->region,
-                            info->index, "migration");
-    if (ret) {
-        error_report("%s: Failed to setup VFIO migration region %d: %s",
-                     vbasedev->name, info->index, strerror(-ret));
-        goto err;
-    }
-
-    if (!vbasedev->migration->region.size) {
-        error_report("%s: Invalid zero-sized VFIO migration region %d",
-                     vbasedev->name, info->index);
-        ret = -EINVAL;
-        goto err;
-    }
-
-    trace_vfio_migration_probe_local(vbasedev->name, info->index);
-    migration->ops = &vfio_local_method;
-    g_free(info);
-    return 0;
-
-err:
-    vfio_migration_exit_local(vbasedev);
-    g_free(info);
-    return ret;
-}
-
 /* ---------------------------------------------------------------------- */
 
 int64_t vfio_mig_bytes_transferred(void)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 8ef85a871c..be8adf890f 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -256,6 +256,7 @@ int vfio_spapr_remove_window(VFIOContainer *container,
                              hwaddr offset_within_address_space);
 
 int vfio_migration_probe(VFIODevice *vbasedev, Error **errp);
+int vfio_migration_probe_local(VFIODevice *vbasedev);
 void vfio_migration_finalize(VFIODevice *vbasedev);
 
 #endif /* HW_VFIO_VFIO_COMMON_H */
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 10/13] vfio/pci: introduce command-line parameters to specify migration method
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (8 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 09/13] vfio/migration: move the functions of In-Band approach to a new file Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 11/13] vfio/migration: add a plugin layer to support out-of-band live migration Lei Rao
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Add command-line parameters (x-plugin-path and x-plugin-arg) of migration plugin
for VFIO PCI functions. x-plugin-path indicates the path of a dynamic load
library and x-plugin-arg is the necessary parameter to load and use it.
A typical example is, if the plugin communicates with the agent running on
IPU/DPU backend SOC through network, the argument should be the IP and Port of
agent. The usage as follows:

-device vfio-pci,id=$ID,host=$bdf,x-enable-migration,\
x-plugin-path=$plugin_path,x-plugin-arg=<IP:Port>

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/vfio/pci.c                 | 2 ++
 include/hw/vfio/vfio-common.h | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 939dcc3d4a..1553ba7116 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3309,6 +3309,8 @@ static Property vfio_pci_dev_properties[] = {
                                    qdev_prop_nv_gpudirect_clique, uint8_t),
     DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo,
                                 OFF_AUTOPCIBAR_OFF),
+    DEFINE_PROP_STRING("x-plugin-path", VFIOPCIDevice, vbasedev.desc.path),
+    DEFINE_PROP_STRING("x-plugin-arg", VFIOPCIDevice, vbasedev.desc.arg),
     /*
      * TODO - support passed fds... is this necessary?
      * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index be8adf890f..45d6d75284 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -58,6 +58,11 @@ typedef struct VFIORegion {
     uint8_t nr; /* cache the region number for debug */
 } VFIORegion;
 
+struct vfio_migration_plugin_desc {
+    char *path;
+    char *arg;
+};
+
 typedef struct VFIOMigrationOps VFIOMigrationOps;
 
 typedef struct VFIOMigration {
@@ -144,6 +149,7 @@ typedef struct VFIODevice {
     unsigned int num_regions;
     unsigned int flags;
     VFIOMigration *migration;
+    struct vfio_migration_plugin_desc desc;
     Error *migration_blocker;
     OnOffAuto pre_copy_dirty_page_tracking;
 } VFIODevice;
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 11/13] vfio/migration: add a plugin layer to support out-of-band live migration
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (9 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 10/13] vfio/pci: introduce command-line parameters to specify migration method Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 12/13] vfio/migration: add some trace-events for vfio migration plugin Lei Rao
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Introduce a plugin mechanism under VFIOMigrationOps layer. Each vendor can
provide a dynamic load library that implements the communication driver to
talk with IPU/DPU backend agent for saving and restoring device state during
live migration.

There are three interfaces between QEMU VFIO and a migration plugin:

    - VFIOLMPluginGetVersion:
        This is a function type. Plugin must expose a function symbol named
        "vfio_lm_get_plugin_version" with this function type to return the
        interface version supported by the plugin.
    - VFIOLMPluginGetOps:
        This is a function type. Plugin must expose a function symbol named
        "vfio_lm_get_plugin_ops" with this function type to return a pointer to
        VFIOMigrationPluginOps struct.
    - VFIOMigrationPluginOps:
        This is a struct type containing a set of callbacks that plugin
        exposes. The callbacks will be invoked by QEMU VFIO during live
        migration for saving and restoring device states.

The interfaces are defined in include/hw/vfio/vfio-migration-plugin.h.

When QEMU loads a migration plugin, it will first find and invoke function
symbol named "vfio_lm_get_plugin_version" to check the interface version that
plugin supports. And then find and invoke function symbol named
"vfio_lm_get_plugin_ops" to get vendor device specific VFIOMigrationPluginOps
which will be used for saving/restoring device states during live migration.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 docs/devel/vfio-migration-plugin.rst    | 165 +++++++++++++++
 hw/vfio/meson.build                     |   1 +
 hw/vfio/migration-plugin.c              | 262 ++++++++++++++++++++++++
 hw/vfio/migration.c                     |  13 +-
 include/hw/vfio/vfio-common.h           |  12 ++
 include/hw/vfio/vfio-migration-plugin.h |  21 ++
 6 files changed, 471 insertions(+), 3 deletions(-)
 create mode 100644 docs/devel/vfio-migration-plugin.rst
 create mode 100644 hw/vfio/migration-plugin.c
 create mode 100644 include/hw/vfio/vfio-migration-plugin.h

diff --git a/docs/devel/vfio-migration-plugin.rst b/docs/devel/vfio-migration-plugin.rst
new file mode 100644
index 0000000000..800d1bac0a
--- /dev/null
+++ b/docs/devel/vfio-migration-plugin.rst
@@ -0,0 +1,165 @@
+============================
+VFIO Device Migration Plugins
+============================
+
+Contents:
+=========
+* Introduction
+* Usage
+* Plugin based VFIO Live Migration Flow
+* Interface Description between QEMU and Plugins
+
+Introduction:
+============
+
+Plugin based VFIO live migration is an extension to VFIO live migration
+mechanism, which is described in ``docs/devel/vfio-migration.rst``. It provides
+an out-of-band migration solution for PCIe functions exposed by Infrastructure
+Processing Units (IPU) and Data Processing Units (DPU).
+
+IPU/DPU usually has an SoC in the backend where a Linux system usually runs
+out-of-band agents to provision and configure the interfaces and communicate
+with a host management stack such as gRPC or JSON-RPC. Plugin based VFIO live
+migration leverage the agents in the Soc to save/restore PCIe device states.
+
+This is a new feature for VFIO live migration and it allows device vendors to
+develop out-of-tree plugins that can be dynamically loaded into a running QEMU
+process during VFIO passthrough devices live migration.
+
+This document describes the interfaces between QEMU VFIO live migration
+framework and the plugins.
+
+Usage:
+======
+
+An example to use VFIO migration plugin is as the following command line:
+
+-device vfio-pci-emu,x-enable-migration=on,x-plugin-path=$plugin_path,x-plugin-arg=$plugin_arg
+
+Where,
+
+- the 'x-enable-migration' controls whether the VFIO device supports live
+  migration (Not supported by default).
+
+- 'x-plugin-path' indicates the path of the plugin on the host.
+
+- 'x-plugin-arg' is a parameter required by QEMU to load and use the out-of-tree
+  plugin, if the plugin communicates with the backend on IPU/DPU by network,
+  this parameter should be <IP: Port>.
+
+Plugin based VFIO Live Migration Flow:
+======================================
+
+The following ASCII graph describes the overall component relationship:
+
+ +----------------------------------------------------+
+ | QEMU                                               |
+ | +------------------------------------------------+ |
+ | |        VFIO Live Migration Framework           | |
+ | |    +--------------------------------------+    | |
+ | |    |         VFIOMigrationOps             |    | |
+ | |    +-------^---------------------^--------+    | |
+ | |            |                     |             | |
+ | |    +-------v-------+     +-------v--------+    | |
+ | |    | VFIO LM Based |     | VFIO LM Based  |    | |
+ | |    |On Local Region|     |   On Plugin    |    | |
+ | |    +-------^-------+     |     +----------+    | |
+ | |            |             |     |Plugin Ops+----+-+------------+
+ | |            |             +-----+----------+    | |            |
+ | |            |                                   | |  +---------v----------+
+ | +------------+-----------------------------------+ |  |  Vendor Specific   |
+ |              |                                     |  |    Plugins(.so)    |
+ +--------------+-------------------------------------+  +----------+---------+
+  UserSpace     |                                                   |
+----------------+---------------------------------------------      |
+  Kernel        |                                                   |
+                |                                                   |
+     +----------v----------------------+                            |
+     |        Kernel VFIO Driver       |                            |
+     |    +-------------------------+  |                            |
+     |    |                         |  |                            | Network
+     |    | Vendor-Specific Driver  |  |                            |
+     |    |                         |  |                            |
+     |    +----------^--------------+  |                            |
+     |               |                 |                            |
+     +---------------+-----------------+                            |
+                     |                                              |
+                     |                                              |
+---------------------+-----------------------------------------     |
+  Hardware           |                                              |
+                     |            +-----+-----+-----+----+-----+    |
+          +----------v------+     | VF0 | VF1 | VF2 | ...| VFn |    |
+          |   Traditional   |     +-----+-----+-----+----+-----+    |
+          |  PCIe Devices   |     |                            |    |
+          +-----------------+     |   +--------+------------+  |    |
+                                  |   |        |   Agent    |<-+----+
+                                  |   |        +------------+  |
+                                  |   |                     |  |
+                                  |   | SOC                 |  |
+                                  |   +---------------------+  |
+                                  | IPU/DPU                    |
+                                  +----------------------------+
+
+Two QEMU command line options (x-plugin-path and x-plugin-arg) are introduced to
+specify the corresponding plugin and its parameters for a passthrough device.
+If they are specified, the plugin will be loaded in vfio_migration_probe(),
+which will check the plugin version and get the pointer to the plugin's
+VFIOMigrationPluginOps. If any failure during the probing, the plugin will not
+be loaded, and this PCIe device will be marked as no supporting of live
+migration.
+
+When live migration happens, VFIO live migration framework will invoke the
+callbacks defined in VFIOMigrationPluginOps to save/restore the device states,
+as described in the following section.
+
+Interface Description between QEMU and Plugins:
+=============================================
+
+The interfaces between QEMU VFIO live migration framework and vendor-specific
+plugin are defined as follows:
+
+    - VFIOLMPluginGetVersion:
+        This is a function type. Plugins must expose a function symbol named
+        ``vfio_lm_get_plugin_version`` with this function type to return the
+        interface version supported by the plugin.
+    - VFIOLMPluginGetOps:
+        This is a function type. Plugins must expose a function symbol named
+        ``vfio_lm_get_plugin_ops`` with this function type to return a pointer
+        to VFIOMigrationPluginOps struct.
+    - VFIOMigrationPluginOps:
+        This is a struct type containing a set of callbacks that plugin
+        exposes. The callbacks will be invoked by QEMU VFIO during live
+        migration for saving and restoring device states.
+
+The interfaces are defined in include/hw/vfio/vfio-migration-plugin.h.
+
+When QEMU loads a migration plugin, it will first find and invoke a function
+symbol named ``vfio_lm_get_plugin_version`` to check the interface version that
+plugin supports. The core code will refuse to load a plugin if it doesn't export
+the symbol or the version doesn't match the one QEMU supports.
+
+Then QEMU finds and invokes function symbol named ``vfio_lm_get_plugin_ops`` to
+get vendor device-specific VFIOMigrationPluginOps which will be used for
+saving/restoring device states.
+
+VFIOMigrationPluginOps is defined as follows:
+
+typedef struct VFIOMigrationPluginOps {
+    void *(*init)(char *devid, char *arg);
+    int (*save)(void *handle, uint8_t *state, uint64_t len);
+    int (*load)(void *handle, uint8_t *state, uint64_t len);
+    int (*update_pending)(void *handle, uint64_t *pending_bytes);
+    int (*set_state)(void *handle, uint32_t value);
+    int (*get_state)(void *handle, uint32_t *value);
+    int (*cleanup)(void *handle);
+} VFIOMigrationPluginOps;
+
+Here:
+    - init(): set the PCIe device BDF and args, and get the plugin handle.
+    - save(): save the VFIO passthrough device states on the source.
+    - load(): restore the VFIO passthrough device states on the destination.
+    - set_state(): set the PCIe device states including SAVING, RUNNING,
+                   STOP, and RESUMING.
+    - get_state(): get the PCIe device states.
+    - update_pending(): get the remaining bytes during data transfer.
+    - cleanup(): unload the plugin and release some resources.
diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build
index 5a72b8c349..592d56536e 100644
--- a/hw/vfio/meson.build
+++ b/hw/vfio/meson.build
@@ -4,6 +4,7 @@ vfio_ss.add(files(
   'spapr.c',
   'migration.c',
   'migration-local.c',
+  'migration-plugin.c',
 ))
 vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files(
   'display.c',
diff --git a/hw/vfio/migration-plugin.c b/hw/vfio/migration-plugin.c
new file mode 100644
index 0000000000..63124e1571
--- /dev/null
+++ b/hw/vfio/migration-plugin.c
@@ -0,0 +1,262 @@
+/*
+ * QEMU VFIO Migration Support
+ *
+ * Copyright Intel Corporation, 2022
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/cutils.h"
+
+#include "hw/vfio/vfio-common.h"
+#include "migration/qemu-file.h"
+#include "qapi/error.h"
+#include "hw/vfio/vfio-migration-plugin.h"
+#include "sysemu/sysemu.h"
+
+#define CHUNK_SIZE (1024 * 1024)
+
+static int vfio_migration_load_plugin(VFIODevice *vbasedev)
+{
+    char *path = vbasedev->desc.path;
+    VFIOMigration *migration = vbasedev->migration;
+    VFIOMigrationPlugin *plugin = NULL;
+    VFIOLMPluginGetVersion vfio_lm_get_plugin_version = NULL;
+    VFIOLMPluginGetOps vfio_lm_get_plugin_ops = NULL;
+
+    plugin = g_malloc0(sizeof(VFIOMigrationPlugin));
+    if (!plugin) {
+        error_report("%s: Error allocating buffer", __func__);
+        return -ENOMEM;
+    }
+
+    plugin->module = g_module_open(path, G_MODULE_BIND_LOCAL);
+    if (!plugin->module) {
+        error_report("Failed to load VFIO migration plugin:%s", path);
+        g_free(plugin);
+        return -1;
+    }
+
+    if (!g_module_symbol(plugin->module, "vfio_lm_get_plugin_version",
+                        (void *)&vfio_lm_get_plugin_version)) {
+        error_report("Failed to load plugin ops %s: %s", path,
+                    g_module_error());
+        goto err;
+    }
+
+    if (vfio_lm_get_plugin_version() != VFIO_LM_PLUGIN_API_VERSION) {
+        error_report("Invalid VFIO Plugin API Version %s : %s", path,
+                     g_module_error());
+        goto err;
+    }
+
+    if (!g_module_symbol(plugin->module, "vfio_lm_get_plugin_ops",
+                        (void *)&vfio_lm_get_plugin_ops)) {
+        error_report("Failed to load plugin ops %s: %s", path,
+                     g_module_error());
+        goto err;
+    }
+
+    plugin->ops = vfio_lm_get_plugin_ops();
+    if (!plugin->ops) {
+        error_report("Failed to Get Plugin Ops: %s", path);
+        goto err;
+    }
+
+    migration->plugin = plugin;
+
+    return 0;
+
+err:
+    g_module_close(plugin->module);
+    g_free(plugin);
+    plugin = NULL;
+    return -1;
+}
+
+static int vfio_migration_save_load_setup_plugin(VFIODevice *vbasedev)
+{
+    char *arg = vbasedev->desc.arg;
+    VFIOMigrationPlugin *plugin = vbasedev->migration->plugin;
+
+    /* The name is BDF for PCIe device */
+    plugin->handle = plugin->ops->init(vbasedev->name, arg);
+    if (!plugin->handle) {
+        error_report("Failed to init: %s", vbasedev->desc.path);
+        return -1;
+    }
+
+    return 0;
+}
+
+static void vfio_migration_cleanup_plugin(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIOMigrationPlugin *plugin = migration->plugin;
+
+    if (plugin->ops->cleanup) {
+        plugin->ops->cleanup(plugin->handle);
+        plugin->handle = NULL;
+    }
+
+    if (migration->plugin->module) {
+        g_module_close(migration->plugin->module);
+        migration->plugin->module = NULL;
+    }
+
+    g_free(migration->plugin);
+    migration->plugin = NULL;
+}
+
+static int vfio_migration_update_pending_plugin(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIOMigrationPlugin *plugin = migration->plugin;
+    uint64_t pending_bytes = 0;
+    int ret = -1;
+
+    ret = plugin->ops->update_pending(plugin->handle, &pending_bytes);
+    if (ret) {
+        migration->pending_bytes = 0;
+        error_report("%s: Failed to get pending size", __func__);
+        return ret;
+    }
+    migration->pending_bytes = pending_bytes;
+
+    return 0;
+}
+
+static int vfio_migration_set_state_plugin(VFIODevice *vbasedev, uint32_t mask,
+                                           uint32_t value)
+{
+    int ret = -1;
+    uint32_t device_state = 0;
+    VFIOMigrationPlugin *plugin = vbasedev->migration->plugin;
+
+    ret = plugin->ops->get_state(plugin->handle, &device_state);
+    if (ret) {
+        error_report("%s: Get device state error", vbasedev->name);
+        return ret;
+    }
+
+    device_state = (device_state & mask) | value;
+
+    if (!VFIO_DEVICE_STATE_VALID(device_state)) {
+        return -EINVAL;
+    }
+
+    ret = plugin->ops->set_state(plugin->handle, device_state);
+    if (ret) {
+        error_report("%s: Device in error state 0x%x", vbasedev->name,
+                     value);
+        return ret;
+    }
+
+    vbasedev->migration->device_state = device_state;
+
+    return 0;
+}
+
+static int vfio_migration_save_buffer_plugin(QEMUFile *f, VFIODevice *vbasedev,
+                                   uint64_t *size)
+{
+    int ret = 0;
+    VFIOMigrationPlugin *plugin = vbasedev->migration->plugin;
+    uint64_t data_size, tmp_size;
+
+    ret = plugin->ops->update_pending(plugin->handle, &data_size);
+    if (ret < 0) {
+        error_report("%s: Failed to get pending size", __func__);
+        return ret;
+    }
+
+    qemu_put_be64(f, data_size);
+    tmp_size = data_size;
+
+    while (tmp_size) {
+        uint64_t sz = tmp_size <= CHUNK_SIZE ? tmp_size : CHUNK_SIZE;
+        void *buf = g_try_malloc(sz);
+
+        if (!buf) {
+            error_report("%s: Error allocating buffer", __func__);
+            return -ENOMEM;
+        }
+
+        ret = plugin->ops->save(plugin->handle, buf, sz);
+        if (ret) {
+            error_report("%s:Failed saving device state", __func__);
+            g_free(buf);
+            return ret;
+        }
+
+        qemu_put_buffer(f, buf, sz);
+        g_free(buf);
+        tmp_size -= sz;
+    }
+
+    ret = qemu_file_get_error(f);
+    if (!ret && size) {
+        *size = data_size;
+    }
+
+    return ret;
+}
+
+static int vfio_migration_load_buffer_plugin(QEMUFile *f, VFIODevice *vbasedev,
+                            uint64_t data_size)
+{
+    int ret = 0;
+    VFIOMigrationPlugin *plugin = vbasedev->migration->plugin;
+
+    while (data_size) {
+        uint64_t sz = data_size <= CHUNK_SIZE ? data_size : CHUNK_SIZE;
+        void *buf = g_try_malloc(sz);
+
+        if (!buf) {
+            error_report("%s: Error allocating buffer", __func__);
+            return -ENOMEM;
+        }
+
+        qemu_get_buffer(f, buf, sz);
+        ret = plugin->ops->load(plugin->handle, buf, sz);
+        g_free(buf);
+        if (ret < 0) {
+            error_report("%s: Error loading device state", vbasedev->name);
+            return ret;
+        }
+
+        data_size -= sz;
+    }
+
+    return ret;
+}
+
+static VFIOMigrationOps vfio_plugin_method = {
+    .save_setup = vfio_migration_save_load_setup_plugin,
+    .load_setup = vfio_migration_save_load_setup_plugin,
+    .update_pending = vfio_migration_update_pending_plugin,
+    .save_buffer = vfio_migration_save_buffer_plugin,
+    .load_buffer = vfio_migration_load_buffer_plugin,
+    .set_state = vfio_migration_set_state_plugin,
+    .cleanup = vfio_migration_cleanup_plugin
+};
+
+int vfio_migration_probe_plugin(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+
+    if (vfio_migration_load_plugin(vbasedev)) {
+        error_report("vfio migration plugin probe failed");
+        return -1;
+    }
+
+    migration->ops = &vfio_plugin_method;
+
+    return 0;
+}
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index bb62e1ca0e..24a3126a56 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -569,9 +569,16 @@ int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
     vbasedev->migration = g_new0(VFIOMigration, 1);
     vbasedev->migration->vbasedev = vbasedev;
 
-    ret = vfio_migration_probe_local(vbasedev);
-    if (ret) {
-        goto add_blocker;
+    if (vbasedev->desc.arg != NULL && vbasedev->desc.path != NULL) {
+        ret = vfio_migration_probe_plugin(vbasedev);
+        if (ret) {
+            goto add_blocker;
+        }
+    } else {
+        ret = vfio_migration_probe_local(vbasedev);
+        if (ret) {
+            goto add_blocker;
+        }
     }
 
     ret = vfio_migration_register_handlers(vbasedev);
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 45d6d75284..2ea016a894 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -22,6 +22,7 @@
 #define HW_VFIO_VFIO_COMMON_H
 
 #include "exec/memory.h"
+#include "qemu/iov.h"
 #include "qemu/queue.h"
 #include "qemu/notify.h"
 #include "ui/console.h"
@@ -30,6 +31,9 @@
 #include <linux/vfio.h>
 #endif
 #include "sysemu/sysemu.h"
+#include "vfio-migration-plugin.h"
+#include <gmodule.h>
+#include <glib.h>
 
 #define VFIO_MSG_PREFIX "vfio %s: "
 
@@ -58,6 +62,12 @@ typedef struct VFIORegion {
     uint8_t nr; /* cache the region number for debug */
 } VFIORegion;
 
+typedef struct VFIOMigrationPlugin {
+    GModule *module;
+    VFIOMigrationPluginOps *ops;
+    void *handle;
+} VFIOMigrationPlugin;
+
 struct vfio_migration_plugin_desc {
     char *path;
     char *arg;
@@ -70,6 +80,7 @@ typedef struct VFIOMigration {
     VMChangeStateEntry *vm_state;
     VFIORegion region;
     VFIOMigrationOps *ops;
+    VFIOMigrationPlugin *plugin;
     uint32_t device_state;
     int vm_running;
     Notifier migration_state;
@@ -263,6 +274,7 @@ int vfio_spapr_remove_window(VFIOContainer *container,
 
 int vfio_migration_probe(VFIODevice *vbasedev, Error **errp);
 int vfio_migration_probe_local(VFIODevice *vbasedev);
+int vfio_migration_probe_plugin(VFIODevice *vbasedev);
 void vfio_migration_finalize(VFIODevice *vbasedev);
 
 #endif /* HW_VFIO_VFIO_COMMON_H */
diff --git a/include/hw/vfio/vfio-migration-plugin.h b/include/hw/vfio/vfio-migration-plugin.h
new file mode 100644
index 0000000000..02f6cc4608
--- /dev/null
+++ b/include/hw/vfio/vfio-migration-plugin.h
@@ -0,0 +1,21 @@
+#ifndef HW_VFIO_PLUGIN_MIGRATION_H
+#define HW_VFIO_PLUGIN_MIGRATION_H
+
+#include <stdint.h>
+
+#define VFIO_LM_PLUGIN_API_VERSION  0
+
+typedef struct VFIOMigrationPluginOps {
+    void *(*init)(char *devid, char *arg);
+    int (*save)(void *handle, uint8_t *state, uint64_t len);
+    int (*load)(void *handle, uint8_t *state, uint64_t len);
+    int (*update_pending)(void *handle, uint64_t *pending_bytes);
+    int (*set_state)(void *handle, uint32_t value);
+    int (*get_state)(void *handle, uint32_t *value);
+    int (*cleanup)(void *handle);
+} VFIOMigrationPluginOps;
+
+typedef int (*VFIOLMPluginGetVersion)(void);
+typedef VFIOMigrationPluginOps* (*VFIOLMPluginGetOps)(void);
+
+#endif
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 12/13] vfio/migration: add some trace-events for vfio migration plugin
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (10 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 11/13] vfio/migration: add a plugin layer to support out-of-band live migration Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-24  6:18 ` [RFC PATCH 13/13] vfio/migration: make the region and plugin member of struct VFIOMigration to be a union Lei Rao
  2022-05-26 18:44 ` [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Alex Williamson
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao, Eddie Dong

Add some trace-events including trace_vfio_migration_plugin_probe
trace_vfio_plugin_save_buffer, trace_vfio_plugin_load_state_device_data
trace_vfio_update_pending, and trace_vfio_migration_set_state to make
debugging easier.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
---
 hw/vfio/migration-plugin.c | 10 +++++++---
 hw/vfio/trace-events       |  3 +++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/migration-plugin.c b/hw/vfio/migration-plugin.c
index 63124e1571..c545cbe334 100644
--- a/hw/vfio/migration-plugin.c
+++ b/hw/vfio/migration-plugin.c
@@ -19,6 +19,7 @@
 #include "qapi/error.h"
 #include "hw/vfio/vfio-migration-plugin.h"
 #include "sysemu/sysemu.h"
+#include "trace.h"
 
 #define CHUNK_SIZE (1024 * 1024)
 
@@ -128,7 +129,7 @@ static int vfio_migration_update_pending_plugin(VFIODevice *vbasedev)
         return ret;
     }
     migration->pending_bytes = pending_bytes;
-
+    trace_vfio_update_pending(vbasedev->name, pending_bytes);
     return 0;
 }
 
@@ -159,7 +160,7 @@ static int vfio_migration_set_state_plugin(VFIODevice *vbasedev, uint32_t mask,
     }
 
     vbasedev->migration->device_state = device_state;
-
+    trace_vfio_migration_set_state(vbasedev->name, device_state);
     return 0;
 }
 
@@ -179,6 +180,7 @@ static int vfio_migration_save_buffer_plugin(QEMUFile *f, VFIODevice *vbasedev,
     qemu_put_be64(f, data_size);
     tmp_size = data_size;
 
+    trace_vfio_save_buffer_plugin(vbasedev->name, data_size);
     while (tmp_size) {
         uint64_t sz = tmp_size <= CHUNK_SIZE ? tmp_size : CHUNK_SIZE;
         void *buf = g_try_malloc(sz);
@@ -214,6 +216,7 @@ static int vfio_migration_load_buffer_plugin(QEMUFile *f, VFIODevice *vbasedev,
     int ret = 0;
     VFIOMigrationPlugin *plugin = vbasedev->migration->plugin;
 
+    trace_vfio_load_state_device_data_plugin(vbasedev->name, data_size);
     while (data_size) {
         uint64_t sz = data_size <= CHUNK_SIZE ? data_size : CHUNK_SIZE;
         void *buf = g_try_malloc(sz);
@@ -257,6 +260,7 @@ int vfio_migration_probe_plugin(VFIODevice *vbasedev)
     }
 
     migration->ops = &vfio_plugin_method;
-
+    trace_vfio_migration_probe_plugin(vbasedev->name, vbasedev->desc.path,
+                                      vbasedev->desc.arg);
     return 0;
 }
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index ca85edeb11..6c2cba29fd 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -149,12 +149,14 @@ vfio_display_edid_write_error(void) ""
 
 # migration.c
 vfio_migration_probe_local(const char *name, uint32_t index) " (%s) Region %d"
+vfio_migration_probe_plugin(const char *name, const char *path, const char *arg) " (%s) Plugin path=%s arg=%s"
 vfio_migration_set_state(const char *name, uint32_t state) " (%s) state %d"
 vfio_vmstate_change(const char *name, int running, const char *reason, uint32_t dev_state) " (%s) running %d reason %s device state %d"
 vfio_migration_state_notifier(const char *name, const char *state) " (%s) state %s"
 vfio_save_setup(const char *name) " (%s)"
 vfio_save_cleanup(const char *name) " (%s)"
 vfio_save_buffer_local(const char *name, uint64_t data_offset, uint64_t data_size, uint64_t pending) " (%s) Offset 0x%"PRIx64" size 0x%"PRIx64" pending 0x%"PRIx64
+vfio_save_buffer_plugin(const char *name, uint64_t data_size) " (%s) data size 0x%"PRIx64
 vfio_update_pending(const char *name, uint64_t pending) " (%s) pending 0x%"PRIx64
 vfio_save_device_config_state(const char *name) " (%s)"
 vfio_save_pending(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t compatible) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" compatible 0x%"PRIx64
@@ -163,6 +165,7 @@ vfio_save_complete_precopy(const char *name) " (%s)"
 vfio_load_device_config_state(const char *name) " (%s)"
 vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64
 vfio_load_state_device_data_local(const char *name, uint64_t data_offset, uint64_t data_size) " (%s) Offset 0x%"PRIx64" size 0x%"PRIx64
+vfio_load_state_device_data_plugin(const char *name, uint64_t data_size) " (%s) data size 0x%"PRIx64
 vfio_load_cleanup(const char *name) " (%s)"
 vfio_get_dirty_bitmap(int fd, uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t start) "container fd=%d, iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64
 vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" - 0x%"PRIx64
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 13/13] vfio/migration: make the region and plugin member of struct VFIOMigration to be a union
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (11 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 12/13] vfio/migration: add some trace-events for vfio migration plugin Lei Rao
@ 2022-05-24  6:18 ` Lei Rao
  2022-05-26 18:44 ` [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Alex Williamson
  13 siblings, 0 replies; 19+ messages in thread
From: Lei Rao @ 2022-05-24  6:18 UTC (permalink / raw)
  To: alex.williamson, kevin.tian, eddie.dong, jason.zeng, quintela,
	dgilbert, yadong.li, yi.l.liu
  Cc: qemu-devel, Lei Rao

Since a VFIO device either uses In-Band or Out-of-Band live migration. So, the
region and plugin in VFIOMigration can be put into a union.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/vfio/migration-local.c     | 33 ++++++++++++++++++---------------
 include/hw/vfio/vfio-common.h |  6 ++++--
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/hw/vfio/migration-local.c b/hw/vfio/migration-local.c
index 46c8baed50..13d1abee5a 100644
--- a/hw/vfio/migration-local.c
+++ b/hw/vfio/migration-local.c
@@ -98,7 +98,7 @@ static int vfio_migration_set_state_local(VFIODevice *vbasedev, uint32_t mask,
                                           uint32_t value)
 {
     VFIOMigration *migration = vbasedev->migration;
-    VFIORegion *region = &migration->region;
+    VFIORegion *region = migration->region;
     off_t dev_state_off = region->fd_offset +
                           VFIO_MIG_STRUCT_OFFSET(device_state);
     uint32_t device_state;
@@ -184,7 +184,7 @@ static int vfio_migration_save_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
                                             uint64_t *size)
 {
     VFIOMigration *migration = vbasedev->migration;
-    VFIORegion *region = &migration->region;
+    VFIORegion *region = migration->region;
     uint64_t data_offset = 0, data_size = 0, sz;
     int ret;
 
@@ -250,7 +250,7 @@ static int vfio_migration_save_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
 static int vfio_migration_load_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
                                             uint64_t data_size)
 {
-    VFIORegion *region = &vbasedev->migration->region;
+    VFIORegion *region = vbasedev->migration->region;
     uint64_t data_offset = 0, size, report_size;
     int ret;
 
@@ -322,7 +322,7 @@ static int vfio_migration_load_buffer_local(QEMUFile *f, VFIODevice *vbasedev,
 static int vfio_migration_update_pending_local(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
-    VFIORegion *region = &migration->region;
+    VFIORegion *region = migration->region;
     uint64_t pending_bytes = 0;
     int ret;
 
@@ -342,8 +342,8 @@ static void vfio_migration_cleanup_local(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
 
-    if (migration->region.mmaps) {
-        vfio_region_unmap(&migration->region);
+    if (migration->region->mmaps) {
+        vfio_region_unmap(migration->region);
     }
 }
 
@@ -352,14 +352,14 @@ static int vfio_migration_save_setup_local(VFIODevice *vbasedev)
     VFIOMigration *migration = vbasedev->migration;
     int ret = -1;
 
-    if (migration->region.mmaps) {
+    if (migration->region->mmaps) {
         /*
          * Calling vfio_region_mmap() from migration thread. Memory API called
          * from this function require locking the iothread when called from
          * outside the main loop thread.
          */
         qemu_mutex_lock_iothread();
-        ret = vfio_region_mmap(&migration->region);
+        ret = vfio_region_mmap(migration->region);
         qemu_mutex_unlock_iothread();
         if (ret) {
             error_report("%s: Failed to mmap VFIO migration region: %s",
@@ -375,11 +375,11 @@ static int vfio_migration_load_setup_local(VFIODevice *vbasedev)
     VFIOMigration *migration = vbasedev->migration;
     int ret = -1;
 
-    if (migration->region.mmaps) {
-        ret = vfio_region_mmap(&migration->region);
+    if (migration->region->mmaps) {
+        ret = vfio_region_mmap(migration->region);
         if (ret) {
             error_report("%s: Failed to mmap VFIO migration region %d: %s",
-                         vbasedev->name, migration->region.nr,
+                         vbasedev->name, migration->region->nr,
                          strerror(-ret));
             error_report("%s: Falling back to slow path", vbasedev->name);
         }
@@ -391,8 +391,10 @@ static void vfio_migration_exit_local(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
 
-    vfio_region_exit(&migration->region);
-    vfio_region_finalize(&migration->region);
+    vfio_region_exit(migration->region);
+    vfio_region_finalize(migration->region);
+    g_free(migration->region);
+    migration->region = NULL;
 }
 
 static VFIOMigrationOps vfio_local_method = {
@@ -426,7 +428,8 @@ int vfio_migration_probe_local(VFIODevice *vbasedev)
         return -EINVAL;
     }
 
-    ret = vfio_region_setup(obj, vbasedev, &vbasedev->migration->region,
+    migration->region = g_new0(VFIORegion, 1);
+    ret = vfio_region_setup(obj, vbasedev, vbasedev->migration->region,
                             info->index, "migration");
     if (ret) {
         error_report("%s: Failed to setup VFIO migration region %d: %s",
@@ -434,7 +437,7 @@ int vfio_migration_probe_local(VFIODevice *vbasedev)
         goto err;
     }
 
-    if (!vbasedev->migration->region.size) {
+    if (!vbasedev->migration->region->size) {
         error_report("%s: Invalid zero-sized VFIO migration region %d",
                      vbasedev->name, info->index);
         ret = -EINVAL;
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 2ea016a894..bded2b4908 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -78,9 +78,11 @@ typedef struct VFIOMigrationOps VFIOMigrationOps;
 typedef struct VFIOMigration {
     struct VFIODevice *vbasedev;
     VMChangeStateEntry *vm_state;
-    VFIORegion region;
     VFIOMigrationOps *ops;
-    VFIOMigrationPlugin *plugin;
+    union {
+        VFIORegion *region;
+        VFIOMigrationPlugin *plugin;
+    };
     uint32_t device_state;
     int vm_running;
     Notifier migration_state;
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device
  2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
                   ` (12 preceding siblings ...)
  2022-05-24  6:18 ` [RFC PATCH 13/13] vfio/migration: make the region and plugin member of struct VFIOMigration to be a union Lei Rao
@ 2022-05-26 18:44 ` Alex Williamson
  2022-06-01 17:09   ` Dong, Eddie
  13 siblings, 1 reply; 19+ messages in thread
From: Alex Williamson @ 2022-05-26 18:44 UTC (permalink / raw)
  To: Lei Rao
  Cc: kevin.tian, eddie.dong, jason.zeng, quintela, dgilbert,
	yadong.li, yi.l.liu, qemu-devel

On Tue, 24 May 2022 14:18:35 +0800
Lei Rao <lei.rao@intel.com> wrote:

> Migration of a VFIO passthrough device can be supported by using a device 
> specific kernel driver to save/restore the device state thru device specific 
> interfaces. But this approach doesn't work for devices that lack a state 
> migration interface, e.g. NVMe.
> 
> On the other hand, Infrastructure Process Unit (IPU) or Data Processing Unit 
> (DPU) vendors may choose to implement an out-of-band interface from the SoC to 
> help manage the state of such non-migratable devices e.g. via gRPC or JSON-RPC 
> protocols.
> 
> This RFC attempts to support such out-of-band migration interface by introducing
> the concept of migration backends in vfio. The existing logic around vfio 
> migration uAPI is now called the 'local' backend while a new 'out-of-band' 
> backend is further introduced allowing vfio to redirect VMState ops to an 
> external plugin.
> 
> Currently, the backend migration Ops is defined close to SaveVMHandlers. We also
> considered whether there is value of abstracting it in a lower level e.g. close 
> to vfio migration uAPI but no clear conclusion. Hence this is one part which 
> we'd like to hear suggestions.
> 
> This proposal adopts a plugin mechanism (an example can be found in [1]) given 
> that IPU/DPU vendors usually implement proprietary migration interfaces without
> a standard. But we are also open if an alternative option makes better sense,
> e.g. via loadable modules (with Qemu supporting gRPC or JSON-RPC support) or an
> IPC mechanism similar to vhost-user.

AFAIU, QEMU is not interested in supporting plugin modules, especially
proprietary ones.  I don't see that a case has really been made that
this cannot be done in-band, through a vfio-pci variant driver,
possibly making use of proprietary interfaces to a userspace agent if
necessary (though please don't ask such to be accepted in-tree for the
kernel either).  A vfio-user device server might also host such
proprietary, device specific agents while supporting the standard,
in-band migration interface.  Thanks,

Alex

> 
> The following graph describes the overall component relationship:
> 
>  +----------------------------------------------------+
>  | QEMU                                               |
>  | +------------------------------------------------+ |
>  | |        VFIO Live Migration Framework           | |
>  | |    +--------------------------------------+    | |
>  | |    |         VFIOMigrationOps             |    | |
>  | |    +-------^---------------------^--------+    | |
>  | |            |                     |             | |
>  | |    +-------v-------+     +-------v--------+    | |
>  | |    | LM Backend Via|     | LM Backend Via |    | |
>  | |    |   Device Fd   |     |    Plugins     |    | |
>  | |    +-------^-------+     |     +----------+    | |
>  | |            |             |     |Plugin Ops+----+-+------------+
>  | |            |             +-----+----------+    | |            |
>  | |            |                                   | |  +---------v----------+
>  | +------------+-----------------------------------+ |  |  Vendor Specific   |
>  |              |                                     |  |    Plugins(.so)    |
>  +--------------+-------------------------------------+  +----------+---------+
>   UserSpace     |                                                   |
> ----------------+---------------------------------------------      |
>   Kernel        |                                                   |
>                 |                                                   |
>      +----------v----------------------+                            |
>      |        Kernel VFIO Driver       |                            |
>      |    +-------------------------+  |                            |
>      |    |                         |  |                            | Network
>      |    | Vendor-Specific Driver  |  |                            |
>      |    |                         |  |                            |
>      |    +----------^--------------+  |                            |
>      |               |                 |                            |
>      +---------------+-----------------+                            |
>                      |                                              |
>                      |                                              |
> ---------------------+-----------------------------------------     |
>   Hardware           |                                              |
>                      |            +-----+-----+-----+----+-----+    |
>           +----------v------+     | VF0 | VF1 | VF2 | ...| VFn |    |
>           |   Traditional   |     +-----+-----+-----+----+-----+    |
>           |  PCIe Devices   |     |                            |    |
>           +-----------------+     |   +--------+------------+  |    |
>                                   |   |        |   Agent    |<-+----+
>                                   |   |        +------------+  |
>                                   |   |                     |  |
>                                   |   | SOC                 |  |
>                                   |   +---------------------+  |
>                                   | IPU                        |
>                                   +----------------------------+
> 
> Two command-line parameters (x-plugin-path and x-plugin-arg) are introduced to 
> enable the out-of-band backend. If specified, vfio will attempt to use the 
> out-of-band backend.
> 
> The following is an example of VFIO command-line parameters for OOB-Approach:
> 
>   -device vfio-pci,id=$ID,host=$bdf,x-enable-migration,x-plugin-path=$plugin_path,x-plugin-arg=$plugin_arg
> 
> [1] https://github.com/raolei-intel/vfio-lm-plugin-example.git
> 
> Lei Rao (13):
>   vfio/migration: put together checks of migration initialization
>     conditions
>   vfio/migration: move migration struct allocation out of
>     vfio_migration_init
>   vfio/migration: move vfio_get_dev_region_info out of
>     vfio_migration_probe
>   vfio/migration: Separated functions that relate to the In-Band
>     approach
>   vfio/migration: rename functions that relate to the In-Band approach
>   vfio/migration: introduce VFIOMigrationOps layer in VFIO live
>     migration framework
>   vfio/migration: move the statistics of bytes_transferred to generic
>     VFIO migration layer
>   vfio/migration: split migration handler registering from
>     vfio_migration_init
>   vfio/migration: move the functions of In-Band approach to a new file
>   vfio/pci: introduce command-line parameters to specify migration
>     method
>   vfio/migration: add a plugin layer to support out-of-band live
>     migration
>   vfio/migration: add some trace-events for vfio migration plugin
>   vfio/migration: make the region and plugin member of struct
>     VFIOMigration to be a union
> 
>  docs/devel/vfio-migration-plugin.rst    | 165 +++++++
>  hw/vfio/meson.build                     |   2 +
>  hw/vfio/migration-local.c               | 456 +++++++++++++++++++
>  hw/vfio/migration-plugin.c              | 266 +++++++++++
>  hw/vfio/migration.c                     | 577 ++++++------------------
>  hw/vfio/pci.c                           |   2 +
>  hw/vfio/trace-events                    |   9 +-
>  include/hw/vfio/vfio-common.h           |  37 +-
>  include/hw/vfio/vfio-migration-plugin.h |  21 +
>  9 files changed, 1096 insertions(+), 439 deletions(-)
>  create mode 100644 docs/devel/vfio-migration-plugin.rst
>  create mode 100644 hw/vfio/migration-local.c
>  create mode 100644 hw/vfio/migration-plugin.c
>  create mode 100644 include/hw/vfio/vfio-migration-plugin.h
> 



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device
  2022-05-26 18:44 ` [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Alex Williamson
@ 2022-06-01 17:09   ` Dong, Eddie
  2022-06-01 18:01     ` Alex Williamson
  0 siblings, 1 reply; 19+ messages in thread
From: Dong, Eddie @ 2022-06-01 17:09 UTC (permalink / raw)
  To: Alex Williamson, Rao, Lei
  Cc: Tian, Kevin, Zeng, Jason, quintela, dgilbert, Li, Yadong, Liu,
	Yi L, qemu-devel



> -----Original Message-----
> From: Qemu-devel <qemu-devel-
> bounces+eddie.dong=intel.com@nongnu.org> On Behalf Of Alex Williamson
> Sent: Thursday, May 26, 2022 11:44 AM
> To: Rao, Lei <Lei.Rao@intel.com>
> Cc: Tian, Kevin <kevin.tian@intel.com>; Dong, Eddie
> <eddie.dong@intel.com>; Zeng, Jason <jason.zeng@intel.com>;
> quintela@redhat.com; dgilbert@redhat.com; Li, Yadong
> <yadong.li@intel.com>; Liu, Yi L <yi.l.liu@intel.com>; qemu-
> devel@nongnu.org
> Subject: Re: [RFC PATCH 00/13] Add a plugin to support out-of-band live
> migration for VFIO pass-through device
> 
> On Tue, 24 May 2022 14:18:35 +0800
> Lei Rao <lei.rao@intel.com> wrote:
> 
> > Migration of a VFIO passthrough device can be supported by using a
> > device specific kernel driver to save/restore the device state thru
> > device specific interfaces. But this approach doesn't work for devices
> > that lack a state migration interface, e.g. NVMe.
> >
> > On the other hand, Infrastructure Process Unit (IPU) or Data
> > Processing Unit
> > (DPU) vendors may choose to implement an out-of-band interface from
> > the SoC to help manage the state of such non-migratable devices e.g.
> > via gRPC or JSON-RPC protocols.
> >
> > This RFC attempts to support such out-of-band migration interface by
> > introducing the concept of migration backends in vfio. The existing
> > logic around vfio migration uAPI is now called the 'local' backend while a
> new 'out-of-band'
> > backend is further introduced allowing vfio to redirect VMState ops to
> > an external plugin.
> >
> > Currently, the backend migration Ops is defined close to
> > SaveVMHandlers. We also considered whether there is value of
> > abstracting it in a lower level e.g. close to vfio migration uAPI but
> > no clear conclusion. Hence this is one part which we'd like to hear
> suggestions.
> >
> > This proposal adopts a plugin mechanism (an example can be found in
> > [1]) given that IPU/DPU vendors usually implement proprietary
> > migration interfaces without a standard. But we are also open if an
> > alternative option makes better sense, e.g. via loadable modules (with
> > Qemu supporting gRPC or JSON-RPC support) or an IPC mechanism similar
> to vhost-user.
> 
> AFAIU, QEMU is not interested in supporting plugin modules, especially
> proprietary ones.  I don't see that a case has really been made that this
> cannot be done in-band, through a vfio-pci variant driver, possibly making
> use of proprietary interfaces to a userspace agent if necessary (though
> please don't ask such to be accepted in-tree for the kernel either).  A vfio-
> user device server might also host such proprietary, device specific agents
> while supporting the standard, in-band migration interface.  Thanks,
> 

Thanks Alex. Removing plug-in module is not a problem.

Do you mean to implement the migration and protocol handling inside Qemu-client (probably vfio-client after the VFIO-user is merged)? Or to build as part of libvfio-user? We can also build it as a separate process of Qemu-server as part of Multi-Process Qemu.

In here, the protocol between host CPU and SoC is based on gRPC, which support Rust code in client (Host CPU side here) more than C/C++. Do you have any suggestion to better support Rust code with Qemu C/C++ code? 


Thx Eddie




> Alex
> 
> >
> > The following graph describes the overall component relationship:
> >
> >  +----------------------------------------------------+
> >  | QEMU                                               |
> >  | +------------------------------------------------+ |
> >  | |        VFIO Live Migration Framework           | |
> >  | |    +--------------------------------------+    | |
> >  | |    |         VFIOMigrationOps             |    | |
> >  | |    +-------^---------------------^--------+    | |
> >  | |            |                     |             | |
> >  | |    +-------v-------+     +-------v--------+    | |
> >  | |    | LM Backend Via|     | LM Backend Via |    | |
> >  | |    |   Device Fd   |     |    Plugins     |    | |
> >  | |    +-------^-------+     |     +----------+    | |
> >  | |            |             |     |Plugin Ops+----+-+------------+
> >  | |            |             +-----+----------+    | |            |
> >  | |            |                                   | |  +---------v----------+
> >  | +------------+-----------------------------------+ |  |  Vendor Specific   |
> >  |              |                                     |  |    Plugins(.so)    |
> >  +--------------+-------------------------------------+  +----------+---------+
> >   UserSpace     |                                                   |
> > ----------------+---------------------------------------------      |
> >   Kernel        |                                                   |
> >                 |                                                   |
> >      +----------v----------------------+                            |
> >      |        Kernel VFIO Driver       |                            |
> >      |    +-------------------------+  |                            |
> >      |    |                         |  |                            | Network
> >      |    | Vendor-Specific Driver  |  |                            |
> >      |    |                         |  |                            |
> >      |    +----------^--------------+  |                            |
> >      |               |                 |                            |
> >      +---------------+-----------------+                            |
> >                      |                                              |
> >                      |                                              |
> > ---------------------+-----------------------------------------     |
> >   Hardware           |                                              |
> >                      |            +-----+-----+-----+----+-----+    |
> >           +----------v------+     | VF0 | VF1 | VF2 | ...| VFn |    |
> >           |   Traditional   |     +-----+-----+-----+----+-----+    |
> >           |  PCIe Devices   |     |                            |    |
> >           +-----------------+     |   +--------+------------+  |    |
> >                                   |   |        |   Agent    |<-+----+
> >                                   |   |        +------------+  |
> >                                   |   |                     |  |
> >                                   |   | SOC                 |  |
> >                                   |   +---------------------+  |
> >                                   | IPU                        |
> >                                   +----------------------------+
> >
> > Two command-line parameters (x-plugin-path and x-plugin-arg) are
> > introduced to enable the out-of-band backend. If specified, vfio will
> > attempt to use the out-of-band backend.
> >
> > The following is an example of VFIO command-line parameters for OOB-
> Approach:
> >
> >   -device
> > vfio-pci,id=$ID,host=$bdf,x-enable-migration,x-plugin-path=$plugin_pat
> > h,x-plugin-arg=$plugin_arg
> >
> > [1] https://github.com/raolei-intel/vfio-lm-plugin-example.git
> >
> > Lei Rao (13):
> >   vfio/migration: put together checks of migration initialization
> >     conditions
> >   vfio/migration: move migration struct allocation out of
> >     vfio_migration_init
> >   vfio/migration: move vfio_get_dev_region_info out of
> >     vfio_migration_probe
> >   vfio/migration: Separated functions that relate to the In-Band
> >     approach
> >   vfio/migration: rename functions that relate to the In-Band approach
> >   vfio/migration: introduce VFIOMigrationOps layer in VFIO live
> >     migration framework
> >   vfio/migration: move the statistics of bytes_transferred to generic
> >     VFIO migration layer
> >   vfio/migration: split migration handler registering from
> >     vfio_migration_init
> >   vfio/migration: move the functions of In-Band approach to a new file
> >   vfio/pci: introduce command-line parameters to specify migration
> >     method
> >   vfio/migration: add a plugin layer to support out-of-band live
> >     migration
> >   vfio/migration: add some trace-events for vfio migration plugin
> >   vfio/migration: make the region and plugin member of struct
> >     VFIOMigration to be a union
> >
> >  docs/devel/vfio-migration-plugin.rst    | 165 +++++++
> >  hw/vfio/meson.build                     |   2 +
> >  hw/vfio/migration-local.c               | 456 +++++++++++++++++++
> >  hw/vfio/migration-plugin.c              | 266 +++++++++++
> >  hw/vfio/migration.c                     | 577 ++++++------------------
> >  hw/vfio/pci.c                           |   2 +
> >  hw/vfio/trace-events                    |   9 +-
> >  include/hw/vfio/vfio-common.h           |  37 +-
> >  include/hw/vfio/vfio-migration-plugin.h |  21 +
> >  9 files changed, 1096 insertions(+), 439 deletions(-)  create mode
> > 100644 docs/devel/vfio-migration-plugin.rst
> >  create mode 100644 hw/vfio/migration-local.c  create mode 100644
> > hw/vfio/migration-plugin.c  create mode 100644
> > include/hw/vfio/vfio-migration-plugin.h
> >
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device
  2022-06-01 17:09   ` Dong, Eddie
@ 2022-06-01 18:01     ` Alex Williamson
  2022-06-02  4:24       ` Tian, Kevin
  2022-06-14 21:10       ` Dong, Eddie
  0 siblings, 2 replies; 19+ messages in thread
From: Alex Williamson @ 2022-06-01 18:01 UTC (permalink / raw)
  To: Dong, Eddie
  Cc: Rao, Lei, Tian, Kevin, Zeng, Jason, quintela, dgilbert, Li,
	Yadong, Liu, Yi L, qemu-devel

On Wed, 1 Jun 2022 17:09:25 +0000
"Dong, Eddie" <eddie.dong@intel.com> wrote:

> > -----Original Message-----
> > From: Qemu-devel <qemu-devel-  
> > bounces+eddie.dong=intel.com@nongnu.org> On Behalf Of Alex Williamson  
> > On Tue, 24 May 2022 14:18:35 +0800
> > Lei Rao <lei.rao@intel.com> wrote:
> > > This proposal adopts a plugin mechanism (an example can be found in
> > > [1]) given that IPU/DPU vendors usually implement proprietary
> > > migration interfaces without a standard. But we are also open if an
> > > alternative option makes better sense, e.g. via loadable modules (with
> > > Qemu supporting gRPC or JSON-RPC support) or an IPC mechanism similar  
> > to vhost-user.
> > 
> > AFAIU, QEMU is not interested in supporting plugin modules, especially
> > proprietary ones.  I don't see that a case has really been made that this
> > cannot be done in-band, through a vfio-pci variant driver, possibly making
> > use of proprietary interfaces to a userspace agent if necessary (though
> > please don't ask such to be accepted in-tree for the kernel either).  A vfio-
> > user device server might also host such proprietary, device specific agents
> > while supporting the standard, in-band migration interface.  Thanks,
> >   
> 
> Thanks Alex. Removing plug-in module is not a problem.
> 
> Do you mean to implement the migration and protocol handling inside
> Qemu-client (probably vfio-client after the VFIO-user is merged)? Or
> to build as part of libvfio-user? We can also build it as a separate
> process of Qemu-server as part of Multi-Process Qemu.

AIUI, the QEMU "client" in a vfio-user configuration is simply QEMU
itself.  The vfio-user "server" provides the actual device
implementation, which could support different license models, depending
on what libraries or existing code is incorporated to implement that
server.  The QEMU remote machine type is simply a QEMU-based
implementation of a vfio-user server.  The vfio-user server is analogous
to a vfio-pci variant driver in the kernel/ioctl interface model.  The
vfio-user client should be device agnostic, possibly with similar
exceptions we have today via device specific quirk handling for the
vfio kernel interface.

> In here, the protocol between host CPU and SoC is based on gRPC,
> which support Rust code in client (Host CPU side here) more than
> C/C++. Do you have any suggestion to better support Rust code with
> Qemu C/C++ code? 

I'm not qualified to provide suggestions regarding Rust code
integration with QEMU, but I think that's only required if the device
specific migration support is on the "client".  As above, I don't think
that's the correct model, the vfio migration protocol is meant to
support any device specific requirements on the device end of the
connection, ie. the "server" end for vfio-user, which can be an
entirely separate, non-QEMU based process.  I think there are also ways
to write kernel drivers in Rust, so possibly a kernel interface
vfio-pci variant driver could also work.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device
  2022-06-01 18:01     ` Alex Williamson
@ 2022-06-02  4:24       ` Tian, Kevin
  2022-06-14 21:10       ` Dong, Eddie
  1 sibling, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2022-06-02  4:24 UTC (permalink / raw)
  To: Alex Williamson, Dong, Eddie
  Cc: Rao, Lei, Zeng, Jason, quintela, dgilbert, Li, Yadong, Liu, Yi L,
	qemu-devel

Hi, Alex,

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Thursday, June 2, 2022 2:01 AM
> 
> On Wed, 1 Jun 2022 17:09:25 +0000
> "Dong, Eddie" <eddie.dong@intel.com> wrote:
> 
> > > -----Original Message-----
> > > From: Qemu-devel <qemu-devel-
> > > bounces+eddie.dong=intel.com@nongnu.org> On Behalf Of Alex
> Williamson
> > > On Tue, 24 May 2022 14:18:35 +0800
> > > Lei Rao <lei.rao@intel.com> wrote:
> > > > This proposal adopts a plugin mechanism (an example can be found in
> > > > [1]) given that IPU/DPU vendors usually implement proprietary
> > > > migration interfaces without a standard. But we are also open if an
> > > > alternative option makes better sense, e.g. via loadable modules (with
> > > > Qemu supporting gRPC or JSON-RPC support) or an IPC mechanism
> similar
> > > to vhost-user.
> > >
> > > AFAIU, QEMU is not interested in supporting plugin modules, especially
> > > proprietary ones.  I don't see that a case has really been made that this
> > > cannot be done in-band, through a vfio-pci variant driver, possibly
> making
> > > use of proprietary interfaces to a userspace agent if necessary (though
> > > please don't ask such to be accepted in-tree for the kernel either).  A vfio-
> > > user device server might also host such proprietary, device specific agents
> > > while supporting the standard, in-band migration interface.  Thanks,
> > >
> >
> > Thanks Alex. Removing plug-in module is not a problem.
> >
> > Do you mean to implement the migration and protocol handling inside
> > Qemu-client (probably vfio-client after the VFIO-user is merged)? Or
> > to build as part of libvfio-user? We can also build it as a separate
> > process of Qemu-server as part of Multi-Process Qemu.
> 
> AIUI, the QEMU "client" in a vfio-user configuration is simply QEMU
> itself.  The vfio-user "server" provides the actual device
> implementation, which could support different license models, depending
> on what libraries or existing code is incorporated to implement that
> server.  The QEMU remote machine type is simply a QEMU-based
> implementation of a vfio-user server.  The vfio-user server is analogous
> to a vfio-pci variant driver in the kernel/ioctl interface model.  The
> vfio-user client should be device agnostic, possibly with similar
> exceptions we have today via device specific quirk handling for the
> vfio kernel interface.
> 

Sounds like vfio-user is currently defined around virtual device
oriented usages, e.g.:

  - Client accesses virtual pci regions via VFIO_USER_REGION_READ/WRITE
    or mmap if the server passes a shmem fd for a given region in
    VFIO_USER_DEVICE_GET_REGION_INFO;

  - Client maps DMA memory to server via VFIO_USER_DMA_MAP/UNMAP;

  - Server access client memory via VFIO_USER_DMA_READ/WRITE or
    mmap if client passes a fd for the DMA region when doing DMA_MAP;

  - Server delivers virtual interrupt to client via IPC e.g. eventfd;

But in this usage it is still expected to allow Qemu directly access the
physical device for performance. Turning to vfio-user suggests that it
may need to support a model where from kernel p.o.v the physical
device is assigned to vfio-user server but in the end vfio-user client
actually operates the device (or, 'partially'). 

'partially' comes from that mmap must be done by Qemu otherwise
we lose all the performance merit of passthrough while DMA map
and interrupt delivery may be still routed via server (for DMA map
this requires that the server can mmap the client's memory). Such
indirect routes does bring some overhead but it's less obvious than
losing mmap.

I'm not sure how it works w/o additional kernel-side cooperation
though. Currently there is only one fd to represent the entire 
kernel vfio device then how could the server delegate pci regions
to another process (client) for mmap?

If the server just passes the vfio device fd to the client then it's
not vfio-user any more.

Did I misunderstand your idea?

Thanks
Kevin


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device
  2022-06-01 18:01     ` Alex Williamson
  2022-06-02  4:24       ` Tian, Kevin
@ 2022-06-14 21:10       ` Dong, Eddie
  1 sibling, 0 replies; 19+ messages in thread
From: Dong, Eddie @ 2022-06-14 21:10 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Rao, Lei, Tian, Kevin, Zeng, Jason, quintela, dgilbert, Li,
	Yadong, Liu, Yi L, qemu-devel



> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, June 1, 2022 11:01 AM
> To: Dong, Eddie <eddie.dong@intel.com>
> Cc: Rao, Lei <lei.rao@intel.com>; Tian, Kevin <kevin.tian@intel.com>; Zeng,
> Jason <jason.zeng@intel.com>; quintela@redhat.com; dgilbert@redhat.com;
> Li, Yadong <yadong.li@intel.com>; Liu, Yi L <yi.l.liu@intel.com>; qemu-
> devel@nongnu.org
> Subject: Re: [RFC PATCH 00/13] Add a plugin to support out-of-band live
> migration for VFIO pass-through device
> 
> On Wed, 1 Jun 2022 17:09:25 +0000
> "Dong, Eddie" <eddie.dong@intel.com> wrote:
> 
> > > -----Original Message-----
> > > From: Qemu-devel <qemu-devel-
> > > bounces+eddie.dong=intel.com@nongnu.org> On Behalf Of Alex
> > > bounces+Williamson
> > > On Tue, 24 May 2022 14:18:35 +0800
> > > Lei Rao <lei.rao@intel.com> wrote:
> > > > This proposal adopts a plugin mechanism (an example can be found
> > > > in
> > > > [1]) given that IPU/DPU vendors usually implement proprietary
> > > > migration interfaces without a standard. But we are also open if
> > > > an alternative option makes better sense, e.g. via loadable
> > > > modules (with Qemu supporting gRPC or JSON-RPC support) or an IPC
> > > > mechanism similar
> > > to vhost-user.
> > >
> > > AFAIU, QEMU is not interested in supporting plugin modules,
> > > especially proprietary ones.  I don't see that a case has really
> > > been made that this cannot be done in-band, through a vfio-pci
> > > variant driver, possibly making use of proprietary interfaces to a
> > > userspace agent if necessary (though please don't ask such to be
> > > accepted in-tree for the kernel either).  A vfio- user device server
> > > might also host such proprietary, device specific agents while
> > > supporting the standard, in-band migration interface.  Thanks,
> > >
> >
> > Thanks Alex. Removing plug-in module is not a problem.
> >
> > Do you mean to implement the migration and protocol handling inside
> > Qemu-client (probably vfio-client after the VFIO-user is merged)? Or
> > to build as part of libvfio-user? We can also build it as a separate
> > process of Qemu-server as part of Multi-Process Qemu.
> 
> AIUI, the QEMU "client" in a vfio-user configuration is simply QEMU itself.
> The vfio-user "server" provides the actual device implementation, which
> could support different license models, depending on what libraries or
> existing code is incorporated to implement that server.  The QEMU remote
> machine type is simply a QEMU-based implementation of a vfio-user server.
> The vfio-user server is analogous to a vfio-pci variant driver in the
> kernel/ioctl interface model.  The vfio-user client should be device agnostic,
> possibly with similar exceptions we have today via device specific quirk
> handling for the vfio kernel interface.
> 
> > In here, the protocol between host CPU and SoC is based on gRPC, which
> > support Rust code in client (Host CPU side here) more than C/C++. Do
> > you have any suggestion to better support Rust code with Qemu C/C++
> > code?
> 
> I'm not qualified to provide suggestions regarding Rust code integration with
> QEMU, but I think that's only required if the device specific migration
> support is on the "client".  As above, I don't think that's the correct model,
> the vfio migration protocol is meant to support any device specific
> requirements on the device end of the connection, ie. the "server" end for
> vfio-user, which can be an entirely separate, non-QEMU based process.  I
> think there are also ways to write kernel drivers in Rust, so possibly a kernel
> interface vfio-pci variant driver could also work.  Thanks,
> 


Alex:
	Thanks for your suggestion. Yes, agree Qemu (client) is, by nature, neutral to physical device knowledge.
	With more thinking, it seems that:
	1: Solution to have a separate kernel driver:   
		The way the host CPU talking with the SoC chip of the device is going thru TCP/IP network, plus high level protocol (gRPC or Json..). This is going to be very complicated and might be hard to be accepted by community.

	2: Implement as a full qemu-server device model.
		This way works if we implement a full device model in vfio-user, but given that the device (NVME for now) works in VFIO passthru mode for performance, the issue Kevin Tian raised in another email is a real concern too.

	3: Implement partial (or supplemental) feature in Qemu-server device model.
		This solution defines a Qemu/VFIO migration interface between the client and server for migration.  Client migration-proxy uses hardware transparent interface to talk with the remote-migration server. The remote server may manage device-specific (protocol specific to be more precisely) components to talk with different hardware backend. In this case, we rely on today's VFIO module to manage the device and manipulate the device thru kernel driver. During migration, it will use the migration-proxy to get/set state etc...   
		Users can configure the additional Qemu command line parameter to choose a remote migration-proxy for a VFIO device. 


	Can you suggest?

Thx Eddie


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-06-14 21:12 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-24  6:18 [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Lei Rao
2022-05-24  6:18 ` [RFC PATCH 01/13] vfio/migration: put together checks of migration initialization conditions Lei Rao
2022-05-24  6:18 ` [RFC PATCH 02/13] vfio/migration: move migration struct allocation out of vfio_migration_init Lei Rao
2022-05-24  6:18 ` [RFC PATCH 03/13] vfio/migration: move vfio_get_dev_region_info out of vfio_migration_probe Lei Rao
2022-05-24  6:18 ` [RFC PATCH 04/13] vfio/migration: Separated functions that relate to the In-Band approach Lei Rao
2022-05-24  6:18 ` [RFC PATCH 05/13] vfio/migration: rename " Lei Rao
2022-05-24  6:18 ` [RFC PATCH 06/13] vfio/migration: introduce VFIOMigrationOps layer in VFIO live migration framework Lei Rao
2022-05-24  6:18 ` [RFC PATCH 07/13] vfio/migration: move the statistics of bytes_transferred to generic VFIO migration layer Lei Rao
2022-05-24  6:18 ` [RFC PATCH 08/13] vfio/migration: split migration handler registering from vfio_migration_init Lei Rao
2022-05-24  6:18 ` [RFC PATCH 09/13] vfio/migration: move the functions of In-Band approach to a new file Lei Rao
2022-05-24  6:18 ` [RFC PATCH 10/13] vfio/pci: introduce command-line parameters to specify migration method Lei Rao
2022-05-24  6:18 ` [RFC PATCH 11/13] vfio/migration: add a plugin layer to support out-of-band live migration Lei Rao
2022-05-24  6:18 ` [RFC PATCH 12/13] vfio/migration: add some trace-events for vfio migration plugin Lei Rao
2022-05-24  6:18 ` [RFC PATCH 13/13] vfio/migration: make the region and plugin member of struct VFIOMigration to be a union Lei Rao
2022-05-26 18:44 ` [RFC PATCH 00/13] Add a plugin to support out-of-band live migration for VFIO pass-through device Alex Williamson
2022-06-01 17:09   ` Dong, Eddie
2022-06-01 18:01     ` Alex Williamson
2022-06-02  4:24       ` Tian, Kevin
2022-06-14 21:10       ` Dong, Eddie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.