* [RFC PATCH v2 1/4] vfio: Introduce helpers to mark dirty pages of a RAM section
2021-03-31 10:12 [RFC PATCH v2 0/4] Add migration support for VFIO PCI devices in SMMUv3 nested stage mode Kunkun Jiang
@ 2021-03-31 10:12 ` Kunkun Jiang
2021-03-31 10:12 ` [RFC PATCH v2 2/4] vfio: Add vfio_prereg_listener_log_sync in nested stage Kunkun Jiang
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Kunkun Jiang @ 2021-03-31 10:12 UTC (permalink / raw)
To: Eric Auger, Peter Maydell, Alex Williamson, kevin.tian, Yi Sun,
open list:ARM SMMU, open list:All patches CC here
Cc: Liu Yi L, shameerali.kolothum.thodi, Kirti Wankhede, Zenghui Yu,
wanghaibin.wang, Keqian Zhu
Extract part of the code from vfio_sync_dirty_bitmap into a
new helper, which marks the dirty pages of a RAM section.
This helper will be called in nested stage mode.
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
hw/vfio/common.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7da1e95b43..3117979307 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1271,6 +1271,19 @@ err_out:
return ret;
}
+static int vfio_dma_sync_ram_section_dirty_bitmap(VFIOContainer *container,
+ MemoryRegionSection *section)
+{
+ ram_addr_t ram_addr;
+
+ ram_addr = memory_region_get_ram_addr(section->mr) +
+ section->offset_within_region;
+
+ return vfio_get_dirty_bitmap(container,
+ REAL_HOST_PAGE_ALIGN(section->offset_within_address_space),
+ int128_get64(section->size), ram_addr);
+}
+
typedef struct {
IOMMUNotifier n;
VFIOGuestIOMMU *giommu;
@@ -1312,8 +1325,6 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
static int vfio_sync_dirty_bitmap(VFIOContainer *container,
MemoryRegionSection *section)
{
- ram_addr_t ram_addr;
-
if (memory_region_is_iommu(section->mr)) {
VFIOGuestIOMMU *giommu;
@@ -1342,12 +1353,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container,
return 0;
}
- ram_addr = memory_region_get_ram_addr(section->mr) +
- section->offset_within_region;
-
- return vfio_get_dirty_bitmap(container,
- REAL_HOST_PAGE_ALIGN(section->offset_within_address_space),
- int128_get64(section->size), ram_addr);
+ return vfio_dma_sync_ram_section_dirty_bitmap(container, section);
}
static void vfio_listener_log_sync(MemoryListener *listener,
--
2.23.0
* [RFC PATCH v2 2/4] vfio: Add vfio_prereg_listener_log_sync in nested stage
From: Kunkun Jiang @ 2021-03-31 10:12 UTC (permalink / raw)
To: Eric Auger, Peter Maydell, Alex Williamson, kevin.tian, Yi Sun,
open list:ARM SMMU, open list:All patches CC here
Cc: Liu Yi L, shameerali.kolothum.thodi, Kirti Wankhede, Zenghui Yu,
wanghaibin.wang, Keqian Zhu
On Intel, the DMA is mapped through the host single stage thanks to
"Caching Mode". In nested mode, instead, we set up stage 2 and stage 1
separately, as there is no "Caching Mode".
The legacy vfio_listener_log_sync cannot be used in the nested stage,
as we do not need to pay close attention to the stage 1 mapping there.
This patch adds vfio_prereg_listener_log_sync to mark dirty pages in
nested mode.
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
hw/vfio/common.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 3117979307..86722814d4 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1284,6 +1284,22 @@ static int vfio_dma_sync_ram_section_dirty_bitmap(VFIOContainer *container,
int128_get64(section->size), ram_addr);
}
+static void vfio_prereg_listener_log_sync(MemoryListener *listener,
+ MemoryRegionSection *section)
+{
+ VFIOContainer *container =
+ container_of(listener, VFIOContainer, prereg_listener);
+
+ if (!memory_region_is_ram(section->mr) ||
+ !container->dirty_pages_supported) {
+ return;
+ }
+
+ if (vfio_devices_all_dirty_tracking(container)) {
+ vfio_dma_sync_ram_section_dirty_bitmap(container, section);
+ }
+}
+
typedef struct {
IOMMUNotifier n;
VFIOGuestIOMMU *giommu;
@@ -1328,6 +1344,16 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container,
if (memory_region_is_iommu(section->mr)) {
VFIOGuestIOMMU *giommu;
+ /*
+ * In nested mode, stage 2 (gpa->hpa) and stage 1 (giova->gpa) are
+ * set up separately. It is inappropriate to pass 'giova' to the kernel
+ * to get dirty pages. We only need to focus on stage 2 mapping when
+ * marking dirty pages.
+ */
+ if (container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
+ return 0;
+ }
+
QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
if (MEMORY_REGION(giommu->iommu) == section->mr &&
giommu->n.start == section->offset_within_region) {
@@ -1382,6 +1408,7 @@ static const MemoryListener vfio_memory_listener = {
static MemoryListener vfio_memory_prereg_listener = {
.region_add = vfio_prereg_listener_region_add,
.region_del = vfio_prereg_listener_region_del,
+ .log_sync = vfio_prereg_listener_log_sync,
};
static void vfio_listener_release(VFIOContainer *container)
--
2.23.0
* [RFC PATCH v2 3/4] vfio: Add vfio_prereg_listener_global_log_start/stop in nested stage
From: Kunkun Jiang @ 2021-03-31 10:12 UTC (permalink / raw)
To: Eric Auger, Peter Maydell, Alex Williamson, kevin.tian, Yi Sun,
open list:ARM SMMU, open list:All patches CC here
Cc: Liu Yi L, shameerali.kolothum.thodi, Kirti Wankhede, Zenghui Yu,
wanghaibin.wang, Keqian Zhu
In nested mode, we set up stage 2 and stage 1 separately. In my
opinion, vfio_memory_prereg_listener is used for stage 2 and
vfio_memory_listener is used for stage 1. So it feels odd to switch
dirty tracking through the global_log_start/stop callbacks of
vfio_memory_listener, although this causes no errors. Adding the
global_log_start/stop callbacks to vfio_memory_prereg_listener keeps
stage 2 separate from stage 1.
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
hw/vfio/common.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 86722814d4..efea252e46 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1209,6 +1209,16 @@ static void vfio_listener_log_global_start(MemoryListener *listener)
{
VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+ if (container->iommu_type != VFIO_TYPE1_NESTING_IOMMU) {
+ vfio_set_dirty_page_tracking(container, true);
+ }
+}
+
+static void vfio_prereg_listener_log_global_start(MemoryListener *listener)
+{
+ VFIOContainer *container =
+ container_of(listener, VFIOContainer, prereg_listener);
+
vfio_set_dirty_page_tracking(container, true);
}
@@ -1216,6 +1226,16 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
{
VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+ if (container->iommu_type != VFIO_TYPE1_NESTING_IOMMU) {
+ vfio_set_dirty_page_tracking(container, false);
+ }
+}
+
+static void vfio_prereg_listener_log_global_stop(MemoryListener *listener)
+{
+ VFIOContainer *container =
+ container_of(listener, VFIOContainer, prereg_listener);
+
vfio_set_dirty_page_tracking(container, false);
}
@@ -1408,6 +1428,8 @@ static const MemoryListener vfio_memory_listener = {
static MemoryListener vfio_memory_prereg_listener = {
.region_add = vfio_prereg_listener_region_add,
.region_del = vfio_prereg_listener_region_del,
+ .log_global_start = vfio_prereg_listener_log_global_start,
+ .log_global_stop = vfio_prereg_listener_log_global_stop,
.log_sync = vfio_prereg_listener_log_sync,
};
--
2.23.0
* [RFC PATCH v2 4/4] hw/arm/smmuv3: Post-load stage 1 configurations to the host
From: Kunkun Jiang @ 2021-03-31 10:12 UTC (permalink / raw)
To: Eric Auger, Peter Maydell, Alex Williamson, kevin.tian, Yi Sun,
open list:ARM SMMU, open list:All patches CC here
Cc: Liu Yi L, shameerali.kolothum.thodi, Kirti Wankhede, Zenghui Yu,
wanghaibin.wang, Keqian Zhu
In nested mode, we call the set_pasid_table() callback on each STE
update to pass the guest stage 1 configuration to the host and
apply it at the physical level.
In the case of live migration, we need to manually call
set_pasid_table() to load the guest stage 1 configurations into the
host. If this operation fails, the migration fails.
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
hw/arm/smmuv3.c | 62 +++++++++++++++++++++++++++++++++++++++++++++
hw/arm/trace-events | 1 +
2 files changed, 63 insertions(+)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 55aa6ad874..4d28ca3777 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1534,6 +1534,67 @@ static void smmu_realize(DeviceState *d, Error **errp)
smmu_init_irq(s, dev);
}
+static int smmuv3_manual_set_pci_device_pasid_table(SMMUDevice *sdev)
+{
+#ifdef __linux__
+ IOMMUMemoryRegion *mr = &(sdev->iommu);
+ int sid = smmu_get_sid(sdev);
+ SMMUEventInfo event = {.type = SMMU_EVT_NONE, .sid = sid,
+ .inval_ste_allowed = true};
+ IOMMUConfig iommu_config = {};
+ SMMUTransCfg *cfg;
+ int ret = -1;
+
+ cfg = smmuv3_get_config(sdev, &event);
+ if (!cfg) {
+ return ret;
+ }
+
+ iommu_config.pasid_cfg.argsz = sizeof(struct iommu_pasid_table_config);
+ iommu_config.pasid_cfg.version = PASID_TABLE_CFG_VERSION_1;
+ iommu_config.pasid_cfg.format = IOMMU_PASID_FORMAT_SMMUV3;
+ iommu_config.pasid_cfg.base_ptr = cfg->s1ctxptr;
+ iommu_config.pasid_cfg.pasid_bits = cfg->s1cdmax;
+ iommu_config.pasid_cfg.vendor_data.smmuv3.version = PASID_TABLE_SMMUV3_CFG_VERSION_1;
+ iommu_config.pasid_cfg.vendor_data.smmuv3.s1fmt = cfg->s1fmt;
+ iommu_config.pasid_cfg.vendor_data.smmuv3.s1dss = cfg->s1dss;
+
+ if (cfg->disabled || cfg->bypassed) {
+ iommu_config.pasid_cfg.config = IOMMU_PASID_CONFIG_BYPASS;
+ } else if (cfg->aborted) {
+ iommu_config.pasid_cfg.config = IOMMU_PASID_CONFIG_ABORT;
+ } else {
+ iommu_config.pasid_cfg.config = IOMMU_PASID_CONFIG_TRANSLATE;
+ }
+
+ ret = pci_device_set_pasid_table(sdev->bus, sdev->devfn, &iommu_config);
+ if (ret) {
+ error_report("Failed to pass PASID table to host for iommu mr %s (%m)",
+ mr->parent_obj.name);
+ }
+
+ return ret;
+#else
+ return -1;
+#endif
+}
+
+static int smmuv3_post_load(void *opaque, int version_id)
+{
+ SMMUv3State *s3 = opaque;
+ SMMUState *s = &(s3->smmu_state);
+ SMMUDevice *sdev;
+ int ret = 0;
+
+ QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
+ trace_smmuv3_post_load_sdev(sdev->devfn, sdev->iommu.parent_obj.name);
+ ret = smmuv3_manual_set_pci_device_pasid_table(sdev);
+ if (ret) {
+ break;
+ }
+ }
+
+ return ret;
+}
+
static const VMStateDescription vmstate_smmuv3_queue = {
.name = "smmuv3_queue",
.version_id = 1,
@@ -1552,6 +1613,7 @@ static const VMStateDescription vmstate_smmuv3 = {
.version_id = 1,
.minimum_version_id = 1,
.priority = MIG_PRI_IOMMU,
+ .post_load = smmuv3_post_load,
.fields = (VMStateField[]) {
VMSTATE_UINT32(features, SMMUv3State),
VMSTATE_UINT8(sid_size, SMMUv3State),
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index b0b0030d24..2f093286ec 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -54,4 +54,5 @@ smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s
smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
smmuv3_inv_notifiers_iova(const char *name, uint16_t asid, uint64_t iova, uint8_t tg, uint64_t num_pages) "iommu mr=%s asid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64
smmuv3_notify_config_change(const char *name, uint8_t config, uint64_t s1ctxptr) "iommu mr=%s config=%d s1ctxptr=0x%"PRIx64
+smmuv3_post_load_sdev(int devfn, const char *name) "sdev devfn=%d iommu mr=%s"
--
2.23.0
* Re: [RFC PATCH v2 0/4] Add migration support for VFIO PCI devices in SMMUv3 nested stage mode
From: Kunkun Jiang @ 2021-05-11 2:23 UTC (permalink / raw)
To: Eric Auger, Peter Maydell, Alex Williamson, kevin.tian, Yi Sun,
open list:ARM SMMU, open list:All patches CC here
Cc: Liu Yi L, shameerali.kolothum.thodi, Kirti Wankhede, Zenghui Yu,
wanghaibin.wang, Keqian Zhu
Hi all,
This series has been updated to v3.[1]
Any comments and reviews are welcome.
Thanks,
Kunkun Jiang
[1] [RFC PATCH v3 0/4] Add migration support for VFIO PCI devices in
SMMUv3 nested mode
https://lore.kernel.org/qemu-devel/20210511020816.2905-1-jiangkunkun@huawei.com/
On 2021/3/31 18:12, Kunkun Jiang wrote:
> Hi all,
>
> Since SMMUv3 nested translation stages[1] have been introduced by Eric, we
> need to pay attention to the migration of VFIO PCI devices in SMMUv3 nested stage
> mode. At present, it is not yet supported in QEMU. There are two problems in the
> existing framework.
>
> First, the current way to get dirty pages is not applicable to nested stage mode.
> Because of the "Caching Mode", VTD can map the RAM through the host single stage
> (giova->hpa). "vfio_listener_log_sync" gets dirty pages by passing "giova"
> to the kernel for the RAM block section of a mapped MMIO region. In nested stage
> mode, we set up stage 2 (gpa->hpa) and stage 1 (giova->gpa) separately. So
> the current way of getting dirty pages does not work in nested stage mode.
>
> Second, we also need to pass the stage 1 configurations to the destination host
> after the migration. In Eric's patch, the stage 1 configuration is passed to the
> host on each STE update for devices that set the PASID PciOps. The configuration
> is applied at the physical level, but that physical-level data is not sent to the
> destination host. So we have to pass the stage 1 configurations to the destination
> host after the migration.
>
> Best Regards,
> Kunkun Jiang
>
> [1] [RFC,v8,00/28] vSMMUv3/pSMMUv3 2 stage VFIO integration
> http://patchwork.ozlabs.org/project/qemu-devel/cover/20210225105233.650545-1-eric.auger@redhat.com/
>
> This Patch set includes patches as below:
> Patch 1-2:
> - Refactored vfio_listener_log_sync and added a new function to get dirty pages
> in nested stage mode.
>
> Patch 3:
> - Added global_log_start/stop interface in vfio_memory_prereg_listener
>
> Patch 4:
> - Added the post_load function to vmstate_smmuv3 for passing stage 1 configuration
> to the destination host after the migration.
>
> History:
>
> v1 -> v2:
> - Add global_log_start/stop interface in vfio_memory_prereg_listener
> - Add support for repass stage 1 configs with multiple CDs to the host
>
> Kunkun Jiang (4):
> vfio: Introduce helpers to mark dirty pages of a RAM section
> vfio: Add vfio_prereg_listener_log_sync in nested stage
> vfio: Add vfio_prereg_listener_global_log_start/stop in nested stage
> hw/arm/smmuv3: Post-load stage 1 configurations to the host
>
> hw/arm/smmuv3.c | 62 +++++++++++++++++++++++++++++++++++++++
> hw/arm/trace-events | 1 +
> hw/vfio/common.c | 71 ++++++++++++++++++++++++++++++++++++++++-----
> 3 files changed, 126 insertions(+), 8 deletions(-)
>