* [RFC PATCH v3 0/4] Add migration support for VFIO PCI devices in SMMUv3 nested mode
@ 2021-05-11  2:08 Kunkun Jiang
  2021-05-11  2:08 ` [RFC PATCH v3 1/4] vfio: Introduce helpers to mark dirty pages of a RAM section Kunkun Jiang
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Kunkun Jiang @ 2021-05-11  2:08 UTC (permalink / raw)
  To: Eric Auger, Peter Maydell, Alex Williamson, open list:ARM SMMU,
	open list:All patches CC here
  Cc: kevin.tian, Kunkun Jiang, Kirti Wankhede, Zenghui Yu,
	wanghaibin.wang, Keqian Zhu

Hi all,

Since SMMUv3 nested translation (two-stage) support has been introduced by
Eric's series, we need to pay attention to the migration of VFIO PCI devices
in SMMUv3 nested stage mode. At present, this is not yet supported in QEMU.
There are two problems in the existing framework.

First, the current way of getting dirty pages does not work in nested stage
mode. Thanks to the "Caching Mode", VT-d can map RAM through a single host
stage (giova->hpa), so "vfio_listener_log_sync" gets dirty pages by passing
the "giova" to the kernel for the RAM sections mapped behind the IOMMU memory
region. In nested stage mode, we set up stage 2 (gpa->hpa) and stage 1
(giova->gpa) separately, so dirty pages cannot be obtained this way.
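
To make the difference concrete, here is a rough sketch of which address has
to be handed to the kernel in each case. This is only an illustration for the
cover letter, not code from the series; the helper name and the explicit
giova/gpa parameters are made up:

static int sync_section_dirty_bitmap_sketch(VFIOContainer *container,
                                            MemoryRegionSection *section,
                                            hwaddr giova, hwaddr gpa)
{
    ram_addr_t ram_addr = memory_region_get_ram_addr(section->mr) +
                          section->offset_within_region;

    if (container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
        /*
         * Nested: only stage 2 (gpa->hpa) is mapped in the container, so
         * the dirty bitmap has to be queried with the gpa of the RAM
         * section (its offset_within_address_space).
         */
        return vfio_get_dirty_bitmap(container, REAL_HOST_PAGE_ALIGN(gpa),
                                     int128_get64(section->size), ram_addr);
    }

    /*
     * Single stage (e.g. VT-d Caching Mode): giova->hpa is mapped in the
     * container, so the giova from the IOMMU notifier can be passed as is.
     */
    return vfio_get_dirty_bitmap(container, REAL_HOST_PAGE_ALIGN(giova),
                                 int128_get64(section->size), ram_addr);
}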

Second, we also need to pass the stage 1 configurations to the destination
host after the migration. In Eric's series, the stage 1 configuration is
passed to the host on each STE update, for devices that have the PASID PciOps
set, and is applied at the physical level. This physical-level state is not
part of the migration stream, so we have to re-pass the stage 1
configurations to the destination host after the migration.
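
For reference, the per-STE-update flow in Eric's series looks roughly like the
abridged sketch below. The helper name is made up for this cover letter, the
SMMUEventInfo initialisation is abridged, and most iommu_config fields are
left out; the point is that pci_device_set_pasid_table() only updates physical
SMMU state, which is exactly what the migration stream does not carry:

static int push_stage1_config_sketch(SMMUState *bs, uint32_t sid)
{
    IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
    SMMUEventInfo event = {.type = SMMU_EVT_NONE, .sid = sid};
    IOMMUConfig iommu_config = {};
    SMMUTransCfg *cfg;
    SMMUDevice *sdev;

    if (!mr) {
        return 0;
    }
    sdev = container_of(mr, SMMUDevice, iommu);

    /* Only devices that registered the PASID PciOps are concerned. */
    if (!pci_device_is_pasid_ops_set(sdev->bus, sdev->devfn)) {
        return 0;
    }

    cfg = smmuv3_get_config(sdev, &event);
    if (!cfg) {
        return 0;
    }

    iommu_config.pasid_cfg.argsz = sizeof(struct iommu_pasid_table_config);
    /* ... format, base_ptr (guest CD table base), etc. filled from *cfg ... */

    /*
     * Applied to the physical SMMU only; nothing of this reaches the
     * migration stream, hence the need to replay it on the destination.
     */
    return pci_device_set_pasid_table(sdev->bus, sdev->devfn, &iommu_config);
}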

This patch set includes the following patches:
Patch 1-2:
- Refactor vfio_listener_log_sync and add a new helper to get dirty
pages in nested mode

Patch 3:
- Add a global_log_start/stop interface to vfio_memory_prereg_listener for
nested mode

Patch 4:
- Add a post_load function to vmstate_smmuv3 to pass the stage 1
configurations to the destination host after the migration

Best regards,
Kunkun Jiang

History:

v2 -> v3:
- Rebase to v9 of Eric's series 'vSMMUv3/pSMMUv3 2 stage VFIO integration'[1]
- Delete smmuv3_manual_set_pci_device_pasid_table() and reuse
  smmuv3_notify_config_change() [Eric]

v1 -> v2:
- Add global_log_start/stop interface in vfio_memory_prereg_listener
- Add support for re-passing stage 1 configs with multiple CDs to the host

[1] [RFC v9 00/29] vSMMUv3/pSMMUv3 2 stage VFIO integration
https://lore.kernel.org/qemu-devel/20210411120912.15770-1-eric.auger@redhat.com/

Kunkun Jiang (4):
  vfio: Introduce helpers to mark dirty pages of a RAM section
  vfio: Add vfio_prereg_listener_log_sync in nested stage
  vfio: Add vfio_prereg_listener_global_log_start/stop in nested stage
  hw/arm/smmuv3: Post-load stage 1 configurations to the host

 hw/arm/smmuv3.c  | 33 ++++++++++++++++++----
 hw/vfio/common.c | 73 ++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 93 insertions(+), 13 deletions(-)

-- 
2.23.0




* [RFC PATCH v3 1/4] vfio: Introduce helpers to mark dirty pages of a RAM section
  2021-05-11  2:08 [RFC PATCH v3 0/4] Add migration support for VFIO PCI devices in SMMUv3 nested mode Kunkun Jiang
@ 2021-05-11  2:08 ` Kunkun Jiang
  2021-05-11  2:08 ` [RFC PATCH v3 2/4] vfio: Add vfio_prereg_listener_log_sync in nested stage Kunkun Jiang
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Kunkun Jiang @ 2021-05-11  2:08 UTC (permalink / raw)
  To: Eric Auger, Peter Maydell, Alex Williamson, open list:ARM SMMU,
	open list:All patches CC here
  Cc: kevin.tian, Kunkun Jiang, Kirti Wankhede, Zenghui Yu,
	wanghaibin.wang, Keqian Zhu

Extract part of the code from vfio_sync_dirty_bitmap into a
new helper, which marks the dirty pages of a RAM section.
This helper will also be called in the nested stage case.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 hw/vfio/common.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index dc8372c772..9fb8d44a6d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1271,6 +1271,19 @@ err_out:
     return ret;
 }
 
+static int vfio_dma_sync_ram_section_dirty_bitmap(VFIOContainer *container,
+                                                  MemoryRegionSection *section)
+{
+    ram_addr_t ram_addr;
+
+    ram_addr = memory_region_get_ram_addr(section->mr) +
+               section->offset_within_region;
+
+    return vfio_get_dirty_bitmap(container,
+                    REAL_HOST_PAGE_ALIGN(section->offset_within_address_space),
+                    int128_get64(section->size), ram_addr);
+}
+
 typedef struct {
     IOMMUNotifier n;
     VFIOGuestIOMMU *giommu;
@@ -1312,8 +1325,6 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
 static int vfio_sync_dirty_bitmap(VFIOContainer *container,
                                   MemoryRegionSection *section)
 {
-    ram_addr_t ram_addr;
-
     if (memory_region_is_iommu(section->mr)) {
         VFIOGuestIOMMU *giommu;
 
@@ -1342,12 +1353,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container,
         return 0;
     }
 
-    ram_addr = memory_region_get_ram_addr(section->mr) +
-               section->offset_within_region;
-
-    return vfio_get_dirty_bitmap(container,
-                   REAL_HOST_PAGE_ALIGN(section->offset_within_address_space),
-                   int128_get64(section->size), ram_addr);
+    return vfio_dma_sync_ram_section_dirty_bitmap(container, section);
 }
 
 static void vfio_listener_log_sync(MemoryListener *listener,
-- 
2.23.0




* [RFC PATCH v3 2/4] vfio: Add vfio_prereg_listener_log_sync in nested stage
  2021-05-11  2:08 [RFC PATCH v3 0/4] Add migration support for VFIO PCI devices in SMMUv3 nested mode Kunkun Jiang
  2021-05-11  2:08 ` [RFC PATCH v3 1/4] vfio: Introduce helpers to mark dirty pages of a RAM section Kunkun Jiang
@ 2021-05-11  2:08 ` Kunkun Jiang
  2021-05-11  2:08 ` [RFC PATCH v3 3/4] vfio: Add vfio_prereg_listener_global_log_start/stop " Kunkun Jiang
  2021-05-11  2:08 ` [RFC PATCH v3 4/4] hw/arm/smmuv3: Post-load stage 1 configurations to the host Kunkun Jiang
  3 siblings, 0 replies; 5+ messages in thread
From: Kunkun Jiang @ 2021-05-11  2:08 UTC (permalink / raw)
  To: Eric Auger, Peter Maydell, Alex Williamson, open list:ARM SMMU,
	open list:All patches CC here
  Cc: kevin.tian, Kunkun Jiang, Kirti Wankhede, Zenghui Yu,
	wanghaibin.wang, Keqian Zhu

In nested mode, we set up stage 2 (gpa->hpa) and stage 1
(giova->gpa) separately, through vfio_prereg_listener_region_add()
and vfio_listener_region_add() respectively. So when marking dirty
pages we only need to look at the stage 2 mappings.

The legacy vfio_listener_log_sync cannot be used in the nested
stage. This patch adds vfio_prereg_listener_log_sync to mark dirty
pages in nested mode.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 hw/vfio/common.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 9fb8d44a6d..149e535a75 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1284,6 +1284,22 @@ static int vfio_dma_sync_ram_section_dirty_bitmap(VFIOContainer *container,
                     int128_get64(section->size), ram_addr);
 }
 
+static void vfio_prereg_listener_log_sync(MemoryListener *listener,
+                                          MemoryRegionSection *section)
+{
+    VFIOContainer *container =
+        container_of(listener, VFIOContainer, prereg_listener);
+
+    if (!memory_region_is_ram(section->mr) ||
+        !container->dirty_pages_supported) {
+        return;
+    }
+
+    if (vfio_devices_all_dirty_tracking(container)) {
+        vfio_dma_sync_ram_section_dirty_bitmap(container, section);
+    }
+}
+
 typedef struct {
     IOMMUNotifier n;
     VFIOGuestIOMMU *giommu;
@@ -1328,6 +1344,16 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container,
     if (memory_region_is_iommu(section->mr)) {
         VFIOGuestIOMMU *giommu;
 
+        /*
+         * In nested mode, stage 2 (gpa->hpa) and stage 1 (giova->gpa) are
+         * set up separately. It is inappropriate to pass 'giova' to kernel
+         * to get dirty pages. We only need to focus on stage 2 mapping when
+         * marking dirty pages.
+         */
+        if (container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
+            return 0;
+        }
+
         QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
             if (MEMORY_REGION(giommu->iommu) == section->mr &&
                 giommu->n.start == section->offset_within_region) {
@@ -1382,6 +1408,7 @@ static const MemoryListener vfio_memory_listener = {
 static MemoryListener vfio_memory_prereg_listener = {
     .region_add = vfio_prereg_listener_region_add,
     .region_del = vfio_prereg_listener_region_del,
+    .log_sync = vfio_prereg_listener_log_sync,
 };
 
 static void vfio_listener_release(VFIOContainer *container)
-- 
2.23.0




* [RFC PATCH v3 3/4] vfio: Add vfio_prereg_listener_global_log_start/stop in nested stage
  2021-05-11  2:08 [RFC PATCH v3 0/4] Add migration support for VFIO PCI devices in SMMUv3 nested mode Kunkun Jiang
  2021-05-11  2:08 ` [RFC PATCH v3 1/4] vfio: Introduce helpers to mark dirty pages of a RAM section Kunkun Jiang
  2021-05-11  2:08 ` [RFC PATCH v3 2/4] vfio: Add vfio_prereg_listener_log_sync in nested stage Kunkun Jiang
@ 2021-05-11  2:08 ` Kunkun Jiang
  2021-05-11  2:08 ` [RFC PATCH v3 4/4] hw/arm/smmuv3: Post-load stage 1 configurations to the host Kunkun Jiang
  3 siblings, 0 replies; 5+ messages in thread
From: Kunkun Jiang @ 2021-05-11  2:08 UTC (permalink / raw)
  To: Eric Auger, Peter Maydell, Alex Williamson, open list:ARM SMMU,
	open list:All patches CC here
  Cc: kevin.tian, Kunkun Jiang, Kirti Wankhede, Zenghui Yu,
	wanghaibin.wang, Keqian Zhu

In nested mode, we set up stage 2 and stage 1 separately. In my
opinion, vfio_memory_prereg_listener is used for stage 2 and
vfio_memory_listener is used for stage 1. So it feels odd to use
the global_log_start/stop callbacks of vfio_memory_listener to switch
dirty tracking, even though doing so does not cause any errors. Adding
the global_log_start/stop callbacks to vfio_memory_prereg_listener
keeps stage 2 separate from stage 1.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 hw/vfio/common.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 149e535a75..6e004fc00f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1209,6 +1209,17 @@ static void vfio_listener_log_global_start(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
 
+    /* For nested mode, vfio_prereg_listener is used to start dirty tracking */
+    if (container->iommu_type != VFIO_TYPE1_NESTING_IOMMU) {
+        vfio_set_dirty_page_tracking(container, true);
+    }
+}
+
+static void vfio_prereg_listener_log_global_start(MemoryListener *listener)
+{
+    VFIOContainer *container =
+        container_of(listener, VFIOContainer, prereg_listener);
+
     vfio_set_dirty_page_tracking(container, true);
 }
 
@@ -1216,6 +1227,17 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
 
+    /* For nested mode, vfio_prereg_listener is used to stop dirty tracking */
+    if (container->iommu_type != VFIO_TYPE1_NESTING_IOMMU) {
+        vfio_set_dirty_page_tracking(container, false);
+    }
+}
+
+static void vfio_prereg_listener_log_global_stop(MemoryListener *listener)
+{
+    VFIOContainer *container =
+        container_of(listener, VFIOContainer, prereg_listener);
+
     vfio_set_dirty_page_tracking(container, false);
 }
 
@@ -1408,6 +1430,8 @@ static const MemoryListener vfio_memory_listener = {
 static MemoryListener vfio_memory_prereg_listener = {
     .region_add = vfio_prereg_listener_region_add,
     .region_del = vfio_prereg_listener_region_del,
+    .log_global_start = vfio_prereg_listener_log_global_start,
+    .log_global_stop = vfio_prereg_listener_log_global_stop,
     .log_sync = vfio_prereg_listener_log_sync,
 };
 
-- 
2.23.0




* [RFC PATCH v3 4/4] hw/arm/smmuv3: Post-load stage 1 configurations to the host
  2021-05-11  2:08 [RFC PATCH v3 0/4] Add migration support for VFIO PCI devices in SMMUv3 nested mode Kunkun Jiang
                   ` (2 preceding siblings ...)
  2021-05-11  2:08 ` [RFC PATCH v3 3/4] vfio: Add vfio_prereg_listener_global_log_start/stop " Kunkun Jiang
@ 2021-05-11  2:08 ` Kunkun Jiang
  3 siblings, 0 replies; 5+ messages in thread
From: Kunkun Jiang @ 2021-05-11  2:08 UTC (permalink / raw)
  To: Eric Auger, Peter Maydell, Alex Williamson, open list:ARM SMMU,
	open list:All patches CC here
  Cc: kevin.tian, Kunkun Jiang, Kirti Wankhede, Zenghui Yu,
	wanghaibin.wang, Keqian Zhu

In nested mode, we call the set_pasid_table() callback on each
STE update to pass the guest stage 1 configuration to the host,
where it is applied at the physical level.

In the case of live migration, we need to call set_pasid_table()
manually, after the state has been loaded, to pass the guest
stage 1 configurations to the destination host. If this operation
fails, the migration fails.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 hw/arm/smmuv3.c | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index ca690513e6..ac1de572f3 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -929,7 +929,7 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
     }
 }
 
-static void smmuv3_notify_config_change(SMMUState *bs, uint32_t sid)
+static int smmuv3_notify_config_change(SMMUState *bs, uint32_t sid)
 {
 #ifdef __linux__
     IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
@@ -938,9 +938,10 @@ static void smmuv3_notify_config_change(SMMUState *bs, uint32_t sid)
     IOMMUConfig iommu_config = {};
     SMMUTransCfg *cfg;
     SMMUDevice *sdev;
+    int ret;
 
     if (!mr) {
-        return;
+        return 0;
     }
 
     sdev = container_of(mr, SMMUDevice, iommu);
@@ -949,13 +950,13 @@ static void smmuv3_notify_config_change(SMMUState *bs, uint32_t sid)
     smmuv3_flush_config(sdev);
 
     if (!pci_device_is_pasid_ops_set(sdev->bus, sdev->devfn)) {
-        return;
+        return 0;
     }
 
     cfg = smmuv3_get_config(sdev, &event);
 
     if (!cfg) {
-        return;
+        return 0;
     }
 
     iommu_config.pasid_cfg.argsz = sizeof(struct iommu_pasid_table_config);
@@ -977,10 +978,13 @@ static void smmuv3_notify_config_change(SMMUState *bs, uint32_t sid)
                                       iommu_config.pasid_cfg.config,
                                       iommu_config.pasid_cfg.base_ptr);
 
-    if (pci_device_set_pasid_table(sdev->bus, sdev->devfn, &iommu_config)) {
+    ret = pci_device_set_pasid_table(sdev->bus, sdev->devfn, &iommu_config);
+    if (ret) {
         error_report("Failed to pass PASID table to host for iommu mr %s (%m)",
                      mr->parent_obj.name);
     }
+
+    return ret;
 #endif
 }
 
@@ -1545,6 +1549,24 @@ static void smmu_realize(DeviceState *d, Error **errp)
     smmu_init_irq(s, dev);
 }
 
+static int smmuv3_post_load(void *opaque, int version_id)
+{
+    SMMUv3State *s3 = opaque;
+    SMMUState *s = &(s3->smmu_state);
+    SMMUDevice *sdev;
+    int ret = 0;
+
+    QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
+        uint32_t sid = smmu_get_sid(sdev);
+        ret = smmuv3_notify_config_change(s, sid);
+        if (ret) {
+            break;
+        }
+    }
+
+    return ret;
+}
+
 static const VMStateDescription vmstate_smmuv3_queue = {
     .name = "smmuv3_queue",
     .version_id = 1,
@@ -1563,6 +1585,7 @@ static const VMStateDescription vmstate_smmuv3 = {
     .version_id = 1,
     .minimum_version_id = 1,
     .priority = MIG_PRI_IOMMU,
+    .post_load = smmuv3_post_load,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(features, SMMUv3State),
         VMSTATE_UINT8(sid_size, SMMUv3State),
-- 
2.23.0



