* [PATCH v9 0/2] vhost-vdpa: add support for vIOMMU
@ 2022-10-31  3:10 Cindy Lu
  2022-10-31  3:10 ` [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c Cindy Lu
  2022-10-31  3:10 ` [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU Cindy Lu
  0 siblings, 2 replies; 15+ messages in thread
From: Cindy Lu @ 2022-10-31  3:10 UTC (permalink / raw)
  To: lulu, alex.williamson, jasowang, mst, pbonzini, peterx, david,
	f4bug, sgarzare
  Cc: qemu-devel

These patches add vIOMMU support to vhost-vdpa devices.

Changes in v3:
1. Move the function vfio_get_xlat_addr to memory.c.
2. Use the existing memory listener; when the MR is an
   IOMMU MR, call iommu_region_add/iommu_region_del.

Changes in v4:
1. Make the comments in vfio_get_xlat_addr more general.

Changes in v5:
1. Address the comments on the previous version.
2. Add a new argument to vfio_get_xlat_addr that indicates whether
   the memory is backed by a discard manager, so that each device can
   issue its own warning.

Changes in v6:
Move the error_report for unpopulated discarded memory back to
memory_get_xlat_addr.

Changes in v7:
Reorganize the error messages to avoid duplicate information.

Changes in v8:
Reorganize the code following the comments on the previous version.

Changes in v9:
Reorganize the code following the review comments.

Cindy Lu (2):
  vfio: move implement of vfio_get_xlat_addr() to memory.c
  vhost-vdpa: add support for vIOMMU

 hw/vfio/common.c               |  66 ++----------------
 hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
 include/exec/memory.h          |   4 ++
 include/hw/virtio/vhost-vdpa.h |  10 +++
 softmmu/memory.c               |  72 +++++++++++++++++++
 5 files changed, 203 insertions(+), 72 deletions(-)

-- 
2.34.3




* [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c
  2022-10-31  3:10 [PATCH v9 0/2] vhost-vdpa: add support for vIOMMU Cindy Lu
@ 2022-10-31  3:10 ` Cindy Lu
  2022-10-31  3:20   ` Alex Williamson
  2022-10-31  6:38   ` Michael S. Tsirkin
  2022-10-31  3:10 ` [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU Cindy Lu
  1 sibling, 2 replies; 15+ messages in thread
From: Cindy Lu @ 2022-10-31  3:10 UTC (permalink / raw)
  To: lulu, alex.williamson, jasowang, mst, pbonzini, peterx, david,
	f4bug, sgarzare
  Cc: qemu-devel

- Move the implementation of vfio_get_xlat_addr() to softmmu/memory.c
  and rename it to memory_get_xlat_addr(), so that other devices,
  such as vDPA devices, can use it.
- Keep vfio_get_xlat_addr() in hw/vfio/common.c as a wrapper that checks
  whether the memory is backed by a discard manager, so that each device
  can issue its own warning.
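
For illustration only, the split between a generic helper and a
device-specific wrapper can be sketched in standalone C as below. This is
not the QEMU code: struct mock_region, mock_get_xlat() and
mock_vfio_get_xlat() are hypothetical stand-ins that model only the
control flow (an optional out-parameter reporting the discard manager,
and a wrapper that adds its own warning).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the region a translation resolves to. */
struct mock_region {
    bool is_ram;              /* target is RAM */
    bool has_discard_manager; /* region is backed by a discard manager */
    bool populated;           /* requested range is currently populated */
};

/*
 * Generic helper: validates the target and reports, through an optional
 * out-parameter, whether a discard manager backs the region.
 */
static bool mock_get_xlat(const struct mock_region *mr,
                          bool *has_discard_manager)
{
    assert(mr != NULL);

    if (has_discard_manager) {
        *has_discard_manager = false;
    }
    if (!mr->is_ram) {
        return false;
    }
    if (mr->has_discard_manager) {
        if (has_discard_manager) {
            *has_discard_manager = true;
        }
        if (!mr->populated) {
            /* Mapping discarded memory is refused for every caller. */
            return false;
        }
    }
    return true;
}

/* Counts warnings so the wrapper's behaviour can be observed. */
static int mock_warnings;

/* Device-specific wrapper: adds its own warning on top of the helper. */
static bool mock_vfio_get_xlat(const struct mock_region *mr)
{
    bool has_dm;
    bool ret = mock_get_xlat(mr, &has_dm);

    if (ret && has_dm) {
        mock_warnings++;
        fprintf(stderr, "warning: region is backed by a discard manager\n");
    }
    return ret;
}
```

A caller that does not care about the discard manager, as the vDPA
notifier in patch 2/2 does, simply passes NULL for the out-parameter.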

Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 hw/vfio/common.c      | 66 +++------------------------------------
 include/exec/memory.h |  4 +++
 softmmu/memory.c      | 72 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 81 insertions(+), 61 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ace9562a9b..6bc02b32c8 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -578,45 +578,11 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
 static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
                                ram_addr_t *ram_addr, bool *read_only)
 {
-    MemoryRegion *mr;
-    hwaddr xlat;
-    hwaddr len = iotlb->addr_mask + 1;
-    bool writable = iotlb->perm & IOMMU_WO;
-
-    /*
-     * The IOMMU TLB entry we have just covers translation through
-     * this IOMMU to its immediate target.  We need to translate
-     * it the rest of the way through to memory.
-     */
-    mr = address_space_translate(&address_space_memory,
-                                 iotlb->translated_addr,
-                                 &xlat, &len, writable,
-                                 MEMTXATTRS_UNSPECIFIED);
-    if (!memory_region_is_ram(mr)) {
-        error_report("iommu map to non memory area %"HWADDR_PRIx"",
-                     xlat);
-        return false;
-    } else if (memory_region_has_ram_discard_manager(mr)) {
-        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
-        MemoryRegionSection tmp = {
-            .mr = mr,
-            .offset_within_region = xlat,
-            .size = int128_make64(len),
-        };
-
-        /*
-         * Malicious VMs can map memory into the IOMMU, which is expected
-         * to remain discarded. vfio will pin all pages, populating memory.
-         * Disallow that. vmstate priorities make sure any RamDiscardManager
-         * were already restored before IOMMUs are restored.
-         */
-        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
-            error_report("iommu map to discarded memory (e.g., unplugged via"
-                         " virtio-mem): %"HWADDR_PRIx"",
-                         iotlb->translated_addr);
-            return false;
-        }
+    bool ret, mr_has_discard_manager;
 
+    ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
+                               &mr_has_discard_manager);
+    if (ret && mr_has_discard_manager) {
         /*
          * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
          * pages will remain pinned inside vfio until unmapped, resulting in a
@@ -635,29 +601,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
                          " intended via an IOMMU. It's possible to mitigate "
                          " by setting/adjusting RLIMIT_MEMLOCK.");
     }
-
-    /*
-     * Translation truncates length to the IOMMU page size,
-     * check that it did not truncate too much.
-     */
-    if (len & iotlb->addr_mask) {
-        error_report("iommu has granularity incompatible with target AS");
-        return false;
-    }
-
-    if (vaddr) {
-        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
-    }
-
-    if (ram_addr) {
-        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
-    }
-
-    if (read_only) {
-        *read_only = !writable || mr->readonly;
-    }
-
-    return true;
+    return ret;
 }
 
 static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index bfb1de8eea..d1e79c39dc 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -713,6 +713,10 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
 void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
                                              RamDiscardListener *rdl);
 
+bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
+                          ram_addr_t *ram_addr, bool *read_only,
+                          bool *mr_has_discard_manager);
+
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 7ba2048836..bc0be3f62c 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -33,6 +33,7 @@
 #include "qemu/accel.h"
 #include "hw/boards.h"
 #include "migration/vmstate.h"
+#include "exec/address-spaces.h"
 
 //#define DEBUG_UNASSIGNED
 
@@ -2121,6 +2122,77 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
     rdmc->unregister_listener(rdm, rdl);
 }
 
+/* Called with rcu_read_lock held.  */
+bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
+                          ram_addr_t *ram_addr, bool *read_only,
+                          bool *mr_has_discard_manager)
+{
+    MemoryRegion *mr;
+    hwaddr xlat;
+    hwaddr len = iotlb->addr_mask + 1;
+    bool writable = iotlb->perm & IOMMU_WO;
+
+    if (mr_has_discard_manager) {
+        *mr_has_discard_manager = false;
+    }
+    /*
+     * The IOMMU TLB entry we have just covers translation through
+     * this IOMMU to its immediate target.  We need to translate
+     * it the rest of the way through to memory.
+     */
+    mr = address_space_translate(&address_space_memory, iotlb->translated_addr,
+                                 &xlat, &len, writable, MEMTXATTRS_UNSPECIFIED);
+    if (!memory_region_is_ram(mr)) {
+        error_report("iommu map to non memory area %" HWADDR_PRIx "", xlat);
+        return false;
+    } else if (memory_region_has_ram_discard_manager(mr)) {
+        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
+        MemoryRegionSection tmp = {
+            .mr = mr,
+            .offset_within_region = xlat,
+            .size = int128_make64(len),
+        };
+        if (mr_has_discard_manager) {
+            *mr_has_discard_manager = true;
+        }
+        /*
+         * Malicious VMs can map memory into the IOMMU, which is expected
+         * to remain discarded. vfio will pin all pages, populating memory.
+         * Disallow that. vmstate priorities make sure any RamDiscardManager
+         * were already restored before IOMMUs are restored.
+         */
+        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
+            error_report("iommu map to discarded memory (e.g., unplugged via"
+                         " virtio-mem): %" HWADDR_PRIx "",
+                         iotlb->translated_addr);
+            return false;
+        }
+    }
+
+    /*
+     * Translation truncates length to the IOMMU page size,
+     * check that it did not truncate too much.
+     */
+    if (len & iotlb->addr_mask) {
+        error_report("iommu has granularity incompatible with target AS");
+        return false;
+    }
+
+    if (vaddr) {
+        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
+    }
+
+    if (ram_addr) {
+        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
+    }
+
+    if (read_only) {
+        *read_only = !writable || mr->readonly;
+    }
+
+    return true;
+}
+
 void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
 {
     uint8_t mask = 1 << client;
-- 
2.34.3




* [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU
  2022-10-31  3:10 [PATCH v9 0/2] vhost-vdpa: add support for vIOMMU Cindy Lu
  2022-10-31  3:10 ` [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c Cindy Lu
@ 2022-10-31  3:10 ` Cindy Lu
  2022-10-31  7:04   ` Michael S. Tsirkin
  1 sibling, 1 reply; 15+ messages in thread
From: Cindy Lu @ 2022-10-31  3:10 UTC (permalink / raw)
  To: lulu, alex.williamson, jasowang, mst, pbonzini, peterx, david,
	f4bug, sgarzare
  Cc: qemu-devel

Add support for vIOMMU by adding new functions that handle IOMMU MRs:
- In iommu_region_add, register a dedicated IOMMU notifier and
  store all notifiers in a list.
- In iommu_region_del, find the matching IOMMU notifier and remove it
  from the list.
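
As a rough sketch of the bookkeeping described above (not the QEMU
implementation, which uses QLIST and IOMMUNotifier), the add/del pairing
can be modelled with a plain singly linked list. All names here
(struct mock_notifier, mock_region_add(), mock_region_del(),
mock_list_len()) are hypothetical. An entry is matched on deletion by
the region it belongs to and the start offset of the section, mirroring
the comparison done in vhost_vdpa_iommu_region_del().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one registered IOMMU notifier. */
struct mock_notifier {
    const void *region;         /* identifies the IOMMU memory region */
    uint64_t start;             /* section offset within that region */
    struct mock_notifier *next;
};

/* Register: allocate an entry and push it onto the head of the list. */
static struct mock_notifier *mock_region_add(struct mock_notifier **head,
                                             const void *region,
                                             uint64_t start)
{
    struct mock_notifier *n;

    assert(head != NULL);
    n = calloc(1, sizeof(*n));
    if (!n) {
        return NULL;
    }
    n->region = region;
    n->start = start;
    n->next = *head;
    *head = n;
    return n;
}

/*
 * Unregister: find the entry matching both region and start, unlink it
 * and free it. Returns 1 if an entry was removed, 0 otherwise.
 */
static int mock_region_del(struct mock_notifier **head,
                           const void *region, uint64_t start)
{
    struct mock_notifier **pp;

    assert(head != NULL);
    for (pp = head; *pp; pp = &(*pp)->next) {
        struct mock_notifier *n = *pp;

        if (n->region == region && n->start == start) {
            *pp = n->next;
            free(n);
            return 1;
        }
    }
    return 0;
}

/* Number of notifiers currently tracked. */
static size_t mock_list_len(const struct mock_notifier *head)
{
    size_t len = 0;

    for (; head; head = head->next) {
        len++;
    }
    return len;
}
```

Matching on both fields matters because one IOMMU region may be split
into several sections, each with its own notifier.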

Verified with the vp_vdpa and vdpa_sim_net drivers.

Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
 include/hw/virtio/vhost-vdpa.h |  10 +++
 2 files changed, 122 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 3ff9ce3501..dcfaaccfa9 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -26,6 +26,7 @@
 #include "cpu.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "hw/virtio/virtio-access.h"
 
 /*
  * Return one past the end of the end of section. Be careful with uint64_t
@@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
                                                 uint64_t iova_min,
                                                 uint64_t iova_max)
 {
-    Int128 llend;
 
     if ((!memory_region_is_ram(section->mr) &&
          !memory_region_is_iommu(section->mr)) ||
@@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
         return true;
     }
 
-    llend = vhost_vdpa_section_end(section);
-    if (int128_gt(llend, int128_make64(iova_max))) {
-        error_report("RAM section out of device range (max=0x%" PRIx64
-                     ", end addr=0x%" PRIx64 ")",
-                     iova_max, int128_get64(llend));
-        return true;
-    }
-
     return false;
 }
 
@@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
     v->iotlb_batch_begin_sent = false;
 }
 
+static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
+{
+    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
+
+    hwaddr iova = iotlb->iova + iommu->iommu_offset;
+    struct vhost_vdpa *v = iommu->dev;
+    void *vaddr;
+    int ret;
+
+    if (iotlb->target_as != &address_space_memory) {
+        error_report("Wrong target AS \"%s\", only system memory is allowed",
+                     iotlb->target_as->name ? iotlb->target_as->name : "none");
+        return;
+    }
+    RCU_READ_LOCK_GUARD();
+    vhost_vdpa_iotlb_batch_begin_once(v);
+
+    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
+        bool read_only;
+
+        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
+            return;
+        }
+        ret =
+            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
+        if (ret) {
+            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
+                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
+                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
+        }
+    } else {
+        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
+        if (ret) {
+            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
+                         "0x%" HWADDR_PRIx ") = %d (%m)",
+                         v, iova, iotlb->addr_mask + 1, ret);
+        }
+    }
+}
+
+static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
+                                        MemoryRegionSection *section)
+{
+    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
+
+    struct vdpa_iommu *iommu;
+    Int128 end;
+    int iommu_idx;
+    IOMMUMemoryRegion *iommu_mr;
+    int ret;
+
+    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
+
+    iommu = g_malloc0(sizeof(*iommu));
+    end =  int128_add(int128_make64(section->offset_within_region),
+            section->size);
+    end = int128_sub(end, int128_one());
+    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
+            MEMTXATTRS_UNSPECIFIED);
+
+    iommu->iommu_mr = iommu_mr;
+
+    iommu_notifier_init(
+        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
+        section->offset_within_region, int128_get64(end), iommu_idx);
+    iommu->iommu_offset =
+        section->offset_within_address_space - section->offset_within_region;
+    iommu->dev = v;
+
+    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
+    if (ret) {
+        g_free(iommu);
+        return;
+    }
+
+    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
+    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
+
+    return;
+}
+
+static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
+                                        MemoryRegionSection *section)
+{
+    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
+
+    struct vdpa_iommu *iommu;
+
+    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
+    {
+        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
+            iommu->n.start == section->offset_within_region) {
+            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
+            QLIST_REMOVE(iommu, iommu_next);
+            g_free(iommu);
+            break;
+        }
+    }
+}
+
 static void vhost_vdpa_listener_region_add(MemoryListener *listener,
                                            MemoryRegionSection *section)
 {
@@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
                                             v->iova_range.last)) {
         return;
     }
+    if (memory_region_is_iommu(section->mr)) {
+        vhost_vdpa_iommu_region_add(listener, section);
+        return;
+    }
 
     if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
                  (section->offset_within_region & ~TARGET_PAGE_MASK))) {
@@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
                                             v->iova_range.last)) {
         return;
     }
+    if (memory_region_is_iommu(section->mr)) {
+        vhost_vdpa_iommu_region_del(listener, section);
+        return;
+    }
 
     if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
                  (section->offset_within_region & ~TARGET_PAGE_MASK))) {
@@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
     .region_del = vhost_vdpa_listener_region_del,
 };
 
+
 static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
                              void *arg)
 {
@@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
     v = dev->opaque;
     trace_vhost_vdpa_cleanup(dev, v);
     vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
-    memory_listener_unregister(&v->listener);
     vhost_vdpa_svq_cleanup(dev);
 
     dev->opaque = NULL;
@@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
     }
 
     if (started) {
-        memory_listener_register(&v->listener, &address_space_memory);
+        memory_listener_register(&v->listener, dev->vdev->dma_as);
+
         return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
     } else {
         vhost_vdpa_reset_device(dev);
diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index d10a89303e..64a46e37cb 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
     void *shadow_vq_ops_opaque;
     struct vhost_dev *dev;
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
+    QLIST_HEAD(, vdpa_iommu) iommu_list;
+    IOMMUNotifier n;
 } VhostVDPA;
 
+struct vdpa_iommu {
+    struct vhost_vdpa *dev;
+    IOMMUMemoryRegion *iommu_mr;
+    hwaddr iommu_offset;
+    IOMMUNotifier n;
+    QLIST_ENTRY(vdpa_iommu) iommu_next;
+};
+
 int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
                        void *vaddr, bool readonly);
 int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
-- 
2.34.3




* Re: [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c
  2022-10-31  3:10 ` [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c Cindy Lu
@ 2022-10-31  3:20   ` Alex Williamson
  2022-10-31  6:38   ` Michael S. Tsirkin
  1 sibling, 0 replies; 15+ messages in thread
From: Alex Williamson @ 2022-10-31  3:20 UTC (permalink / raw)
  To: Cindy Lu
  Cc: jasowang, mst, pbonzini, peterx, david, f4bug, sgarzare, qemu-devel

On Mon, 31 Oct 2022 11:10:19 +0800
Cindy Lu <lulu@redhat.com> wrote:

> - Move the implement vfio_get_xlat_addr to softmmu/memory.c, and
>   change the name to memory_get_xlat_addr(). So we can use this
>   function on other devices, such as vDPA device.
> - Add a new function vfio_get_xlat_addr in vfio/common.c, and it will check
>   whether the memory is backed by a discard manager. then device can
>   have its own warning.
> 
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  hw/vfio/common.c      | 66 +++------------------------------------
>  include/exec/memory.h |  4 +++
>  softmmu/memory.c      | 72 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 81 insertions(+), 61 deletions(-)

Acked-by: Alex Williamson <alex.williamson@redhat.com>




* Re: [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c
  2022-10-31  3:10 ` [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c Cindy Lu
  2022-10-31  3:20   ` Alex Williamson
@ 2022-10-31  6:38   ` Michael S. Tsirkin
  2022-10-31  6:44     ` Cindy Lu
  1 sibling, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2022-10-31  6:38 UTC (permalink / raw)
  To: Cindy Lu
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, Oct 31, 2022 at 11:10:19AM +0800, Cindy Lu wrote:
> - Move the implement vfio_get_xlat_addr to softmmu/memory.c, and
>   change the name to memory_get_xlat_addr(). So we can use this
>   function on other devices, such as vDPA device.
> - Add a new function vfio_get_xlat_addr in vfio/common.c, and it will check
>   whether the memory is backed by a discard manager. then device can
>   have its own warning.
> 
> Signed-off-by: Cindy Lu <lulu@redhat.com>

Could you rebase on top of my tree? The config interrupt support conflicts.

> ---
>  hw/vfio/common.c      | 66 +++------------------------------------
>  include/exec/memory.h |  4 +++
>  softmmu/memory.c      | 72 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 81 insertions(+), 61 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ace9562a9b..6bc02b32c8 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -578,45 +578,11 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
>  static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
>                                 ram_addr_t *ram_addr, bool *read_only)
>  {
> -    MemoryRegion *mr;
> -    hwaddr xlat;
> -    hwaddr len = iotlb->addr_mask + 1;
> -    bool writable = iotlb->perm & IOMMU_WO;
> -
> -    /*
> -     * The IOMMU TLB entry we have just covers translation through
> -     * this IOMMU to its immediate target.  We need to translate
> -     * it the rest of the way through to memory.
> -     */
> -    mr = address_space_translate(&address_space_memory,
> -                                 iotlb->translated_addr,
> -                                 &xlat, &len, writable,
> -                                 MEMTXATTRS_UNSPECIFIED);
> -    if (!memory_region_is_ram(mr)) {
> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> -                     xlat);
> -        return false;
> -    } else if (memory_region_has_ram_discard_manager(mr)) {
> -        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> -        MemoryRegionSection tmp = {
> -            .mr = mr,
> -            .offset_within_region = xlat,
> -            .size = int128_make64(len),
> -        };
> -
> -        /*
> -         * Malicious VMs can map memory into the IOMMU, which is expected
> -         * to remain discarded. vfio will pin all pages, populating memory.
> -         * Disallow that. vmstate priorities make sure any RamDiscardManager
> -         * were already restored before IOMMUs are restored.
> -         */
> -        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> -            error_report("iommu map to discarded memory (e.g., unplugged via"
> -                         " virtio-mem): %"HWADDR_PRIx"",
> -                         iotlb->translated_addr);
> -            return false;
> -        }
> +    bool ret, mr_has_discard_manager;
>  
> +    ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
> +                               &mr_has_discard_manager);
> +    if (ret && mr_has_discard_manager) {
>          /*
>           * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
>           * pages will remain pinned inside vfio until unmapped, resulting in a
> @@ -635,29 +601,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
>                           " intended via an IOMMU. It's possible to mitigate "
>                           " by setting/adjusting RLIMIT_MEMLOCK.");
>      }
> -
> -    /*
> -     * Translation truncates length to the IOMMU page size,
> -     * check that it did not truncate too much.
> -     */
> -    if (len & iotlb->addr_mask) {
> -        error_report("iommu has granularity incompatible with target AS");
> -        return false;
> -    }
> -
> -    if (vaddr) {
> -        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> -    }
> -
> -    if (ram_addr) {
> -        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> -    }
> -
> -    if (read_only) {
> -        *read_only = !writable || mr->readonly;
> -    }
> -
> -    return true;
> +    return ret;
>  }
>  
>  static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index bfb1de8eea..d1e79c39dc 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -713,6 +713,10 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
>  void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
>                                               RamDiscardListener *rdl);
>  
> +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> +                          ram_addr_t *ram_addr, bool *read_only,
> +                          bool *mr_has_discard_manager);
> +
>  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
>  typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
>  
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 7ba2048836..bc0be3f62c 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -33,6 +33,7 @@
>  #include "qemu/accel.h"
>  #include "hw/boards.h"
>  #include "migration/vmstate.h"
> +#include "exec/address-spaces.h"
>  
>  //#define DEBUG_UNASSIGNED
>  
> @@ -2121,6 +2122,77 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
>      rdmc->unregister_listener(rdm, rdl);
>  }
>  
> +/* Called with rcu_read_lock held.  */
> +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> +                          ram_addr_t *ram_addr, bool *read_only,
> +                          bool *mr_has_discard_manager)
> +{
> +    MemoryRegion *mr;
> +    hwaddr xlat;
> +    hwaddr len = iotlb->addr_mask + 1;
> +    bool writable = iotlb->perm & IOMMU_WO;
> +
> +    if (mr_has_discard_manager) {
> +        *mr_has_discard_manager = false;
> +    }
> +    /*
> +     * The IOMMU TLB entry we have just covers translation through
> +     * this IOMMU to its immediate target.  We need to translate
> +     * it the rest of the way through to memory.
> +     */
> +    mr = address_space_translate(&address_space_memory, iotlb->translated_addr,
> +                                 &xlat, &len, writable, MEMTXATTRS_UNSPECIFIED);
> +    if (!memory_region_is_ram(mr)) {
> +        error_report("iommu map to non memory area %" HWADDR_PRIx "", xlat);
> +        return false;
> +    } else if (memory_region_has_ram_discard_manager(mr)) {
> +        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> +        MemoryRegionSection tmp = {
> +            .mr = mr,
> +            .offset_within_region = xlat,
> +            .size = int128_make64(len),
> +        };
> +        if (mr_has_discard_manager) {
> +            *mr_has_discard_manager = true;
> +        }
> +        /*
> +         * Malicious VMs can map memory into the IOMMU, which is expected
> +         * to remain discarded. vfio will pin all pages, populating memory.
> +         * Disallow that. vmstate priorities make sure any RamDiscardManager
> +         * were already restored before IOMMUs are restored.
> +         */
> +        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> +            error_report("iommu map to discarded memory (e.g., unplugged via"
> +                         " virtio-mem): %" HWADDR_PRIx "",
> +                         iotlb->translated_addr);
> +            return false;
> +        }
> +    }
> +
> +    /*
> +     * Translation truncates length to the IOMMU page size,
> +     * check that it did not truncate too much.
> +     */
> +    if (len & iotlb->addr_mask) {
> +        error_report("iommu has granularity incompatible with target AS");
> +        return false;
> +    }
> +
> +    if (vaddr) {
> +        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> +    }
> +
> +    if (ram_addr) {
> +        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> +    }
> +
> +    if (read_only) {
> +        *read_only = !writable || mr->readonly;
> +    }
> +
> +    return true;
> +}
> +
>  void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
>  {
>      uint8_t mask = 1 << client;
> -- 
> 2.34.3

On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> Add support for vIOMMU. add the new function to deal with iommu MR.
> - during iommu_region_add register a specific IOMMU notifier,
>  and store all notifiers in a list.
> - during iommu_region_del, compare and delete the IOMMU notifier from the list
> 
> Verified in vp_vdpa and vdpa_sim_net driver
> 
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
>  include/hw/virtio/vhost-vdpa.h |  10 +++
>  2 files changed, 122 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 3ff9ce3501..dcfaaccfa9 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -26,6 +26,7 @@
>  #include "cpu.h"
>  #include "trace.h"
>  #include "qapi/error.h"
> +#include "hw/virtio/virtio-access.h"
>  
>  /*
>   * Return one past the end of the end of section. Be careful with uint64_t
> @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
>                                                  uint64_t iova_min,
>                                                  uint64_t iova_max)
>  {
> -    Int128 llend;
>  
>      if ((!memory_region_is_ram(section->mr) &&
>           !memory_region_is_iommu(section->mr)) ||
> @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
>          return true;
>      }
>  
> -    llend = vhost_vdpa_section_end(section);
> -    if (int128_gt(llend, int128_make64(iova_max))) {
> -        error_report("RAM section out of device range (max=0x%" PRIx64
> -                     ", end addr=0x%" PRIx64 ")",
> -                     iova_max, int128_get64(llend));
> -        return true;
> -    }
> -
>      return false;
>  }
>  
> @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
>      v->iotlb_batch_begin_sent = false;
>  }
>  
> +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> +{
> +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> +
> +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> +    struct vhost_vdpa *v = iommu->dev;
> +    void *vaddr;
> +    int ret;
> +
> +    if (iotlb->target_as != &address_space_memory) {
> +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> +        return;
> +    }
> +    RCU_READ_LOCK_GUARD();
> +    vhost_vdpa_iotlb_batch_begin_once(v);
> +
> +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> +        bool read_only;
> +
> +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> +            return;
> +        }
> +        ret =
> +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> +        if (ret) {
> +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> +        }
> +    } else {
> +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> +        if (ret) {
> +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> +                         v, iova, iotlb->addr_mask + 1, ret);
> +        }
> +    }
> +}
> +
> +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> +                                        MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +
> +    struct vdpa_iommu *iommu;
> +    Int128 end;
> +    int iommu_idx;
> +    IOMMUMemoryRegion *iommu_mr;
> +    int ret;
> +
> +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> +
> +    iommu = g_malloc0(sizeof(*iommu));
> +    end =  int128_add(int128_make64(section->offset_within_region),
> +            section->size);
> +    end = int128_sub(end, int128_one());
> +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> +            MEMTXATTRS_UNSPECIFIED);
> +
> +    iommu->iommu_mr = iommu_mr;
> +
> +    iommu_notifier_init(
> +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> +        section->offset_within_region, int128_get64(end), iommu_idx);
> +    iommu->iommu_offset =
> +        section->offset_within_address_space - section->offset_within_region;
> +    iommu->dev = v;
> +
> +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> +    if (ret) {
> +        g_free(iommu);
> +        return;
> +    }
> +
> +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> +
> +    return;
> +}
> +
> +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> +                                        MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +
> +    struct vdpa_iommu *iommu;
> +
> +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> +    {
> +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> +            iommu->n.start == section->offset_within_region) {
> +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> +            QLIST_REMOVE(iommu, iommu_next);
> +            g_free(iommu);
> +            break;
> +        }
> +    }
> +}
> +
>  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>                                             MemoryRegionSection *section)
>  {
> @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>                                              v->iova_range.last)) {
>          return;
>      }
> +    if (memory_region_is_iommu(section->mr)) {
> +        vhost_vdpa_iommu_region_add(listener, section);
> +        return;
> +    }
>  
>      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
>                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>                                              v->iova_range.last)) {
>          return;
>      }
> +    if (memory_region_is_iommu(section->mr)) {
> +        vhost_vdpa_iommu_region_del(listener, section);
> +        return;
> +    }
>  
>      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
>                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
>      .region_del = vhost_vdpa_listener_region_del,
>  };
>  
> +
>  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
>                               void *arg)
>  {
> @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
>      v = dev->opaque;
>      trace_vhost_vdpa_cleanup(dev, v);
>      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> -    memory_listener_unregister(&v->listener);
>      vhost_vdpa_svq_cleanup(dev);
>  
>      dev->opaque = NULL;
> @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>      }
>  
>      if (started) {
> -        memory_listener_register(&v->listener, &address_space_memory);
> +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> +
>          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
>      } else {
>          vhost_vdpa_reset_device(dev);
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index d10a89303e..64a46e37cb 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
>      void *shadow_vq_ops_opaque;
>      struct vhost_dev *dev;
>      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> +    IOMMUNotifier n;
>  } VhostVDPA;
>  
> +struct vdpa_iommu {
> +    struct vhost_vdpa *dev;
> +    IOMMUMemoryRegion *iommu_mr;
> +    hwaddr iommu_offset;
> +    IOMMUNotifier n;
> +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> +};
> +
>  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
>                         void *vaddr, bool readonly);
>  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> -- 
> 2.34.3
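
For readers following the quoted vhost_vdpa_iommu_map_notify(), here is a minimal standalone sketch of how one IOTLB event turns into a map or unmap request. The sketch_* names and the reduced struct are hypothetical stand-ins, not QEMU types; only the arithmetic (device IOVA = entry IOVA + section offset, size = addr_mask + 1, map when any of the RW permission bits is set) is taken from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring IOMMU_NONE/RO/WO/RW bit values. */
enum sketch_perm { SK_NONE = 0, SK_RO = 1, SK_WO = 2, SK_RW = 3 };

/* Hypothetical, reduced view of an IOTLB entry. */
struct sketch_iotlb {
    uint64_t iova;        /* address inside the IOMMU region */
    uint64_t addr_mask;   /* page size minus one, e.g. 0xfff for 4 KiB */
    enum sketch_perm perm;
};

/* What the notifier would ask the device to do. */
struct sketch_request {
    bool map;             /* true: map, false: unmap */
    uint64_t iova;        /* IOVA as seen by the device */
    uint64_t size;        /* length of the range in bytes */
};

static struct sketch_request sketch_decode(const struct sketch_iotlb *e,
                                           uint64_t iommu_offset)
{
    struct sketch_request r;

    /* The mask is expected to have the form 2^n - 1. */
    assert((e->addr_mask & (e->addr_mask + 1)) == 0);

    /* Any read or write permission means "map"; none means "unmap". */
    r.map = (e->perm & SK_RW) != SK_NONE;
    /* Shift from region-relative to address-space-relative IOVA. */
    r.iova = e->iova + iommu_offset;
    /* addr_mask is size - 1, so add one to get the length. */
    r.size = e->addr_mask + 1;
    return r;
}
```

In the patch itself, iommu_offset is section->offset_within_address_space minus section->offset_within_region, and the map path additionally needs the host virtual address from memory_get_xlat_addr(), which this sketch leaves out.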




* Re: [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c
  2022-10-31  6:38   ` Michael S. Tsirkin
@ 2022-10-31  6:44     ` Cindy Lu
  2022-10-31  6:54       ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Cindy Lu @ 2022-10-31  6:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, 31 Oct 2022 at 14:38, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Oct 31, 2022 at 11:10:19AM +0800, Cindy Lu wrote:
> > - Move the implement vfio_get_xlat_addr to softmmu/memory.c, and
> >   change the name to memory_get_xlat_addr(). So we can use this
> >   function on other devices, such as vDPA device.
> > - Add a new function vfio_get_xlat_addr in vfio/common.c, and it will check
> >   whether the memory is backed by a discard manager. then device can
> >   have its own warning.
> >
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
>
> Could you rebase on top of my tree (config interrupt support conflicts).
>
Hi Michael,
Sure, will do. However, I found a crash in config interrupt while testing
vhost-user. Should I post a new version for it, or send a follow-up patch
later?
Thanks
Cindy
> > ---
> >  hw/vfio/common.c      | 66 +++------------------------------------
> >  include/exec/memory.h |  4 +++
> >  softmmu/memory.c      | 72 +++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 81 insertions(+), 61 deletions(-)
> >
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index ace9562a9b..6bc02b32c8 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -578,45 +578,11 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
> >  static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> >                                 ram_addr_t *ram_addr, bool *read_only)
> >  {
> > -    MemoryRegion *mr;
> > -    hwaddr xlat;
> > -    hwaddr len = iotlb->addr_mask + 1;
> > -    bool writable = iotlb->perm & IOMMU_WO;
> > -
> > -    /*
> > -     * The IOMMU TLB entry we have just covers translation through
> > -     * this IOMMU to its immediate target.  We need to translate
> > -     * it the rest of the way through to memory.
> > -     */
> > -    mr = address_space_translate(&address_space_memory,
> > -                                 iotlb->translated_addr,
> > -                                 &xlat, &len, writable,
> > -                                 MEMTXATTRS_UNSPECIFIED);
> > -    if (!memory_region_is_ram(mr)) {
> > -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> > -                     xlat);
> > -        return false;
> > -    } else if (memory_region_has_ram_discard_manager(mr)) {
> > -        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> > -        MemoryRegionSection tmp = {
> > -            .mr = mr,
> > -            .offset_within_region = xlat,
> > -            .size = int128_make64(len),
> > -        };
> > -
> > -        /*
> > -         * Malicious VMs can map memory into the IOMMU, which is expected
> > -         * to remain discarded. vfio will pin all pages, populating memory.
> > -         * Disallow that. vmstate priorities make sure any RamDiscardManager
> > -         * were already restored before IOMMUs are restored.
> > -         */
> > -        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> > -            error_report("iommu map to discarded memory (e.g., unplugged via"
> > -                         " virtio-mem): %"HWADDR_PRIx"",
> > -                         iotlb->translated_addr);
> > -            return false;
> > -        }
> > +    bool ret, mr_has_discard_manager;
> >
> > +    ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
> > +                               &mr_has_discard_manager);
> > +    if (ret && mr_has_discard_manager) {
> >          /*
> >           * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
> >           * pages will remain pinned inside vfio until unmapped, resulting in a
> > @@ -635,29 +601,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> >                           " intended via an IOMMU. It's possible to mitigate "
> >                           " by setting/adjusting RLIMIT_MEMLOCK.");
> >      }
> > -
> > -    /*
> > -     * Translation truncates length to the IOMMU page size,
> > -     * check that it did not truncate too much.
> > -     */
> > -    if (len & iotlb->addr_mask) {
> > -        error_report("iommu has granularity incompatible with target AS");
> > -        return false;
> > -    }
> > -
> > -    if (vaddr) {
> > -        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> > -    }
> > -
> > -    if (ram_addr) {
> > -        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> > -    }
> > -
> > -    if (read_only) {
> > -        *read_only = !writable || mr->readonly;
> > -    }
> > -
> > -    return true;
> > +    return ret;
> >  }
> >
> >  static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index bfb1de8eea..d1e79c39dc 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -713,6 +713,10 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
> >  void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
> >                                               RamDiscardListener *rdl);
> >
> > +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > +                          ram_addr_t *ram_addr, bool *read_only,
> > +                          bool *mr_has_discard_manager);
> > +
> >  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> >  typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
> >
> > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > index 7ba2048836..bc0be3f62c 100644
> > --- a/softmmu/memory.c
> > +++ b/softmmu/memory.c
> > @@ -33,6 +33,7 @@
> >  #include "qemu/accel.h"
> >  #include "hw/boards.h"
> >  #include "migration/vmstate.h"
> > +#include "exec/address-spaces.h"
> >
> >  //#define DEBUG_UNASSIGNED
> >
> > @@ -2121,6 +2122,77 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
> >      rdmc->unregister_listener(rdm, rdl);
> >  }
> >
> > +/* Called with rcu_read_lock held.  */
> > +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > +                          ram_addr_t *ram_addr, bool *read_only,
> > +                          bool *mr_has_discard_manager)
> > +{
> > +    MemoryRegion *mr;
> > +    hwaddr xlat;
> > +    hwaddr len = iotlb->addr_mask + 1;
> > +    bool writable = iotlb->perm & IOMMU_WO;
> > +
> > +    if (mr_has_discard_manager) {
> > +        *mr_has_discard_manager = false;
> > +    }
> > +    /*
> > +     * The IOMMU TLB entry we have just covers translation through
> > +     * this IOMMU to its immediate target.  We need to translate
> > +     * it the rest of the way through to memory.
> > +     */
> > +    mr = address_space_translate(&address_space_memory, iotlb->translated_addr,
> > +                                 &xlat, &len, writable, MEMTXATTRS_UNSPECIFIED);
> > +    if (!memory_region_is_ram(mr)) {
> > +        error_report("iommu map to non memory area %" HWADDR_PRIx "", xlat);
> > +        return false;
> > +    } else if (memory_region_has_ram_discard_manager(mr)) {
> > +        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> > +        MemoryRegionSection tmp = {
> > +            .mr = mr,
> > +            .offset_within_region = xlat,
> > +            .size = int128_make64(len),
> > +        };
> > +        if (mr_has_discard_manager) {
> > +            *mr_has_discard_manager = true;
> > +        }
> > +        /*
> > +         * Malicious VMs can map memory into the IOMMU, which is expected
> > +         * to remain discarded. vfio will pin all pages, populating memory.
> > +         * Disallow that. vmstate priorities make sure any RamDiscardManager
> > +         * were already restored before IOMMUs are restored.
> > +         */
> > +        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> > +            error_report("iommu map to discarded memory (e.g., unplugged via"
> > +                         " virtio-mem): %" HWADDR_PRIx "",
> > +                         iotlb->translated_addr);
> > +            return false;
> > +        }
> > +    }
> > +
> > +    /*
> > +     * Translation truncates length to the IOMMU page size,
> > +     * check that it did not truncate too much.
> > +     */
> > +    if (len & iotlb->addr_mask) {
> > +        error_report("iommu has granularity incompatible with target AS");
> > +        return false;
> > +    }
> > +
> > +    if (vaddr) {
> > +        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> > +    }
> > +
> > +    if (ram_addr) {
> > +        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> > +    }
> > +
> > +    if (read_only) {
> > +        *read_only = !writable || mr->readonly;
> > +    }
> > +
> > +    return true;
> > +}
> > +
> >  void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
> >  {
> >      uint8_t mask = 1 << client;
> > --
> > 2.34.3
>
> On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> > Add support for vIOMMU. add the new function to deal with iommu MR.
> > - during iommu_region_add register a specific IOMMU notifier,
> >  and store all notifiers in a list.
> > - during iommu_region_del, compare and delete the IOMMU notifier from the list
> >
> > Verified in vp_vdpa and vdpa_sim_net driver
> >
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > ---
> >  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
> >  include/hw/virtio/vhost-vdpa.h |  10 +++
> >  2 files changed, 122 insertions(+), 11 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 3ff9ce3501..dcfaaccfa9 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -26,6 +26,7 @@
> >  #include "cpu.h"
> >  #include "trace.h"
> >  #include "qapi/error.h"
> > +#include "hw/virtio/virtio-access.h"
> >
> >  /*
> >   * Return one past the end of the end of section. Be careful with uint64_t
> > @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> >                                                  uint64_t iova_min,
> >                                                  uint64_t iova_max)
> >  {
> > -    Int128 llend;
> >
> >      if ((!memory_region_is_ram(section->mr) &&
> >           !memory_region_is_iommu(section->mr)) ||
> > @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> >          return true;
> >      }
> >
> > -    llend = vhost_vdpa_section_end(section);
> > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > -                     ", end addr=0x%" PRIx64 ")",
> > -                     iova_max, int128_get64(llend));
> > -        return true;
> > -    }
> > -
> >      return false;
> >  }
> >
> > @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> >      v->iotlb_batch_begin_sent = false;
> >  }
> >
> > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > +{
> > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > +
> > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > +    struct vhost_vdpa *v = iommu->dev;
> > +    void *vaddr;
> > +    int ret;
> > +
> > +    if (iotlb->target_as != &address_space_memory) {
> > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > +        return;
> > +    }
> > +    RCU_READ_LOCK_GUARD();
> > +    vhost_vdpa_iotlb_batch_begin_once(v);
> > +
> > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > +        bool read_only;
> > +
> > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > +            return;
> > +        }
> > +        ret =
> > +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> > +        if (ret) {
> > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > +        }
> > +    } else {
> > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > +        if (ret) {
> > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > +                         v, iova, iotlb->addr_mask + 1, ret);
> > +        }
> > +    }
> > +}
> > +
> > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > +                                        MemoryRegionSection *section)
> > +{
> > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > +
> > +    struct vdpa_iommu *iommu;
> > +    Int128 end;
> > +    int iommu_idx;
> > +    IOMMUMemoryRegion *iommu_mr;
> > +    int ret;
> > +
> > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > +
> > +    iommu = g_malloc0(sizeof(*iommu));
> > +    end =  int128_add(int128_make64(section->offset_within_region),
> > +            section->size);
> > +    end = int128_sub(end, int128_one());
> > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > +            MEMTXATTRS_UNSPECIFIED);
> > +
> > +    iommu->iommu_mr = iommu_mr;
> > +
> > +    iommu_notifier_init(
> > +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > +    iommu->iommu_offset =
> > +        section->offset_within_address_space - section->offset_within_region;
> > +    iommu->dev = v;
> > +
> > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > +    if (ret) {
> > +        g_free(iommu);
> > +        return;
> > +    }
> > +
> > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > +
> > +    return;
> > +}
> > +
> > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > +                                        MemoryRegionSection *section)
> > +{
> > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > +
> > +    struct vdpa_iommu *iommu;
> > +
> > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > +    {
> > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > +            iommu->n.start == section->offset_within_region) {
> > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > +            QLIST_REMOVE(iommu, iommu_next);
> > +            g_free(iommu);
> > +            break;
> > +        }
> > +    }
> > +}
> > +
> >  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >                                             MemoryRegionSection *section)
> >  {
> > @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >                                              v->iova_range.last)) {
> >          return;
> >      }
> > +    if (memory_region_is_iommu(section->mr)) {
> > +        vhost_vdpa_iommu_region_add(listener, section);
> > +        return;
> > +    }
> >
> >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >                                              v->iova_range.last)) {
> >          return;
> >      }
> > +    if (memory_region_is_iommu(section->mr)) {
> > +        vhost_vdpa_iommu_region_del(listener, section);
> > +        return;
> > +    }
> >
> >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> >      .region_del = vhost_vdpa_listener_region_del,
> >  };
> >
> > +
> >  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> >                               void *arg)
> >  {
> > @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> >      v = dev->opaque;
> >      trace_vhost_vdpa_cleanup(dev, v);
> >      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> > -    memory_listener_unregister(&v->listener);
> >      vhost_vdpa_svq_cleanup(dev);
> >
> >      dev->opaque = NULL;
> > @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >      }
> >
> >      if (started) {
> > -        memory_listener_register(&v->listener, &address_space_memory);
> > +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> > +
> >          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> >      } else {
> >          vhost_vdpa_reset_device(dev);
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index d10a89303e..64a46e37cb 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
> >      void *shadow_vq_ops_opaque;
> >      struct vhost_dev *dev;
> >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> > +    IOMMUNotifier n;
> >  } VhostVDPA;
> >
> > +struct vdpa_iommu {
> > +    struct vhost_vdpa *dev;
> > +    IOMMUMemoryRegion *iommu_mr;
> > +    hwaddr iommu_offset;
> > +    IOMMUNotifier n;
> > +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> > +};
> > +
> >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> >                         void *vaddr, bool readonly);
> >  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > --
> > 2.34.3
>
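
As a side note on the quoted memory_get_xlat_addr(): the two small checks near its end can be tried in isolation. The sketch below is a hedged illustration only; sketch_len_ok() and sketch_read_only() are hypothetical helper names, not QEMU functions. It assumes, as in the quoted code, that len starts at addr_mask + 1 and can only be shortened by the translation step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Granularity check: len starts as addr_mask + 1 (one IOMMU page).
 * If translation shortened it, some bit below the page size is set,
 * so (len & addr_mask) becomes non-zero and the entry is rejected.
 */
static bool sketch_len_ok(uint64_t len, uint64_t addr_mask)
{
    /* Precondition of this sketch: 0 < len <= addr_mask + 1. */
    assert(len != 0 && len - 1 <= addr_mask);
    return (len & addr_mask) == 0;
}

/*
 * A mapping is read-only when the IOTLB entry lacks write permission
 * or when the backing memory region itself is read-only.
 */
static bool sketch_read_only(bool entry_writable, bool region_readonly)
{
    return !entry_writable || region_readonly;
}
```

For example, with a 4 KiB IOMMU page (addr_mask 0xfff), a translated length of 0x1000 passes, while 0x800 fails and corresponds to the "iommu has granularity incompatible with target AS" error path.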




* Re: [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c
  2022-10-31  6:44     ` Cindy Lu
@ 2022-10-31  6:54       ` Michael S. Tsirkin
  2022-10-31  6:56         ` Cindy Lu
  0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2022-10-31  6:54 UTC (permalink / raw)
  To: Cindy Lu
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, Oct 31, 2022 at 02:44:11PM +0800, Cindy Lu wrote:
> On Mon, 31 Oct 2022 at 14:38, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Oct 31, 2022 at 11:10:19AM +0800, Cindy Lu wrote:
> > > - Move the implement vfio_get_xlat_addr to softmmu/memory.c, and
> > >   change the name to memory_get_xlat_addr(). So we can use this
> > >   function on other devices, such as vDPA device.
> > > - Add a new function vfio_get_xlat_addr in vfio/common.c, and it will check
> > >   whether the memory is backed by a discard manager. then device can
> > >   have its own warning.
> > >
> > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> >
> > Could you rebase on top of my tree (config interrupt support conflicts).
> >
> Hi Michael,
> Sure, will do. However, I found a crash in config interrupt while testing
> vhost-user. Should I post a new version for it, or send a follow-up patch
> later?
> Thanks
> Cindy

New version, I will drop this one. So do you want this one picked up and
config interrupt on top?

> > > ---
> > >  hw/vfio/common.c      | 66 +++------------------------------------
> > >  include/exec/memory.h |  4 +++
> > >  softmmu/memory.c      | 72 +++++++++++++++++++++++++++++++++++++++++++
> > >  3 files changed, 81 insertions(+), 61 deletions(-)
> > >
> > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > > index ace9562a9b..6bc02b32c8 100644
> > > --- a/hw/vfio/common.c
> > > +++ b/hw/vfio/common.c
> > > @@ -578,45 +578,11 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
> > >  static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > >                                 ram_addr_t *ram_addr, bool *read_only)
> > >  {
> > > -    MemoryRegion *mr;
> > > -    hwaddr xlat;
> > > -    hwaddr len = iotlb->addr_mask + 1;
> > > -    bool writable = iotlb->perm & IOMMU_WO;
> > > -
> > > -    /*
> > > -     * The IOMMU TLB entry we have just covers translation through
> > > -     * this IOMMU to its immediate target.  We need to translate
> > > -     * it the rest of the way through to memory.
> > > -     */
> > > -    mr = address_space_translate(&address_space_memory,
> > > -                                 iotlb->translated_addr,
> > > -                                 &xlat, &len, writable,
> > > -                                 MEMTXATTRS_UNSPECIFIED);
> > > -    if (!memory_region_is_ram(mr)) {
> > > -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> > > -                     xlat);
> > > -        return false;
> > > -    } else if (memory_region_has_ram_discard_manager(mr)) {
> > > -        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> > > -        MemoryRegionSection tmp = {
> > > -            .mr = mr,
> > > -            .offset_within_region = xlat,
> > > -            .size = int128_make64(len),
> > > -        };
> > > -
> > > -        /*
> > > -         * Malicious VMs can map memory into the IOMMU, which is expected
> > > -         * to remain discarded. vfio will pin all pages, populating memory.
> > > -         * Disallow that. vmstate priorities make sure any RamDiscardManager
> > > -         * were already restored before IOMMUs are restored.
> > > -         */
> > > -        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> > > -            error_report("iommu map to discarded memory (e.g., unplugged via"
> > > -                         " virtio-mem): %"HWADDR_PRIx"",
> > > -                         iotlb->translated_addr);
> > > -            return false;
> > > -        }
> > > +    bool ret, mr_has_discard_manager;
> > >
> > > +    ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
> > > +                               &mr_has_discard_manager);
> > > +    if (ret && mr_has_discard_manager) {
> > >          /*
> > >           * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
> > >           * pages will remain pinned inside vfio until unmapped, resulting in a
> > > @@ -635,29 +601,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > >                           " intended via an IOMMU. It's possible to mitigate "
> > >                           " by setting/adjusting RLIMIT_MEMLOCK.");
> > >      }
> > > -
> > > -    /*
> > > -     * Translation truncates length to the IOMMU page size,
> > > -     * check that it did not truncate too much.
> > > -     */
> > > -    if (len & iotlb->addr_mask) {
> > > -        error_report("iommu has granularity incompatible with target AS");
> > > -        return false;
> > > -    }
> > > -
> > > -    if (vaddr) {
> > > -        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> > > -    }
> > > -
> > > -    if (ram_addr) {
> > > -        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> > > -    }
> > > -
> > > -    if (read_only) {
> > > -        *read_only = !writable || mr->readonly;
> > > -    }
> > > -
> > > -    return true;
> > > +    return ret;
> > >  }
> > >
> > >  static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > > index bfb1de8eea..d1e79c39dc 100644
> > > --- a/include/exec/memory.h
> > > +++ b/include/exec/memory.h
> > > @@ -713,6 +713,10 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
> > >  void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
> > >                                               RamDiscardListener *rdl);
> > >
> > > +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > +                          ram_addr_t *ram_addr, bool *read_only,
> > > +                          bool *mr_has_discard_manager);
> > > +
> > >  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> > >  typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
> > >
> > > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > > index 7ba2048836..bc0be3f62c 100644
> > > --- a/softmmu/memory.c
> > > +++ b/softmmu/memory.c
> > > @@ -33,6 +33,7 @@
> > >  #include "qemu/accel.h"
> > >  #include "hw/boards.h"
> > >  #include "migration/vmstate.h"
> > > +#include "exec/address-spaces.h"
> > >
> > >  //#define DEBUG_UNASSIGNED
> > >
> > > @@ -2121,6 +2122,77 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
> > >      rdmc->unregister_listener(rdm, rdl);
> > >  }
> > >
> > > +/* Called with rcu_read_lock held.  */
> > > +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > +                          ram_addr_t *ram_addr, bool *read_only,
> > > +                          bool *mr_has_discard_manager)
> > > +{
> > > +    MemoryRegion *mr;
> > > +    hwaddr xlat;
> > > +    hwaddr len = iotlb->addr_mask + 1;
> > > +    bool writable = iotlb->perm & IOMMU_WO;
> > > +
> > > +    if (mr_has_discard_manager) {
> > > +        *mr_has_discard_manager = false;
> > > +    }
> > > +    /*
> > > +     * The IOMMU TLB entry we have just covers translation through
> > > +     * this IOMMU to its immediate target.  We need to translate
> > > +     * it the rest of the way through to memory.
> > > +     */
> > > +    mr = address_space_translate(&address_space_memory, iotlb->translated_addr,
> > > +                                 &xlat, &len, writable, MEMTXATTRS_UNSPECIFIED);
> > > +    if (!memory_region_is_ram(mr)) {
> > > +        error_report("iommu map to non memory area %" HWADDR_PRIx "", xlat);
> > > +        return false;
> > > +    } else if (memory_region_has_ram_discard_manager(mr)) {
> > > +        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> > > +        MemoryRegionSection tmp = {
> > > +            .mr = mr,
> > > +            .offset_within_region = xlat,
> > > +            .size = int128_make64(len),
> > > +        };
> > > +        if (mr_has_discard_manager) {
> > > +            *mr_has_discard_manager = true;
> > > +        }
> > > +        /*
> > > +         * Malicious VMs can map memory into the IOMMU, which is expected
> > > +         * to remain discarded. vfio will pin all pages, populating memory.
> > > +         * Disallow that. vmstate priorities make sure any RamDiscardManager
> > > +         * were already restored before IOMMUs are restored.
> > > +         */
> > > +        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> > > +            error_report("iommu map to discarded memory (e.g., unplugged via"
> > > +                         " virtio-mem): %" HWADDR_PRIx "",
> > > +                         iotlb->translated_addr);
> > > +            return false;
> > > +        }
> > > +    }
> > > +
> > > +    /*
> > > +     * Translation truncates length to the IOMMU page size,
> > > +     * check that it did not truncate too much.
> > > +     */
> > > +    if (len & iotlb->addr_mask) {
> > > +        error_report("iommu has granularity incompatible with target AS");
> > > +        return false;
> > > +    }
> > > +
> > > +    if (vaddr) {
> > > +        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> > > +    }
> > > +
> > > +    if (ram_addr) {
> > > +        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> > > +    }
> > > +
> > > +    if (read_only) {
> > > +        *read_only = !writable || mr->readonly;
> > > +    }
> > > +
> > > +    return true;
> > > +}
> > > +
> > >  void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
> > >  {
> > >      uint8_t mask = 1 << client;
> > > --
> > > 2.34.3
> >
> > On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> > > > Add support for vIOMMU. Add new functions to handle IOMMU memory regions:
> > > > - During iommu_region_add, register a specific IOMMU notifier
> > > >   and store all notifiers in a list.
> > > > - During iommu_region_del, find the matching IOMMU notifier and delete it
> > > >   from the list.
> > > >
> > > > Verified with the vp_vdpa and vdpa_sim_net drivers.
> > >
> > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > ---
> > >  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
> > >  include/hw/virtio/vhost-vdpa.h |  10 +++
> > >  2 files changed, 122 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > index 3ff9ce3501..dcfaaccfa9 100644
> > > --- a/hw/virtio/vhost-vdpa.c
> > > +++ b/hw/virtio/vhost-vdpa.c
> > > @@ -26,6 +26,7 @@
> > >  #include "cpu.h"
> > >  #include "trace.h"
> > >  #include "qapi/error.h"
> > > +#include "hw/virtio/virtio-access.h"
> > >
> > >  /*
> > >   * Return one past the end of the end of section. Be careful with uint64_t
> > > @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > >                                                  uint64_t iova_min,
> > >                                                  uint64_t iova_max)
> > >  {
> > > -    Int128 llend;
> > >
> > >      if ((!memory_region_is_ram(section->mr) &&
> > >           !memory_region_is_iommu(section->mr)) ||
> > > @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > >          return true;
> > >      }
> > >
> > > -    llend = vhost_vdpa_section_end(section);
> > > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > > -                     ", end addr=0x%" PRIx64 ")",
> > > -                     iova_max, int128_get64(llend));
> > > -        return true;
> > > -    }
> > > -
> > >      return false;
> > >  }
> > >
> > > @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> > >      v->iotlb_batch_begin_sent = false;
> > >  }
> > >
> > > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > +{
> > > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > > +
> > > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > > +    struct vhost_vdpa *v = iommu->dev;
> > > +    void *vaddr;
> > > +    int ret;
> > > +
> > > +    if (iotlb->target_as != &address_space_memory) {
> > > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > > +        return;
> > > +    }
> > > +    RCU_READ_LOCK_GUARD();
> > > +    vhost_vdpa_iotlb_batch_begin_once(v);
> > > +
> > > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > > +        bool read_only;
> > > +
> > > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > > +            return;
> > > +        }
> > > +        ret =
> > > +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> > > +        if (ret) {
> > > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > > +        }
> > > +    } else {
> > > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > > +        if (ret) {
> > > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > > +                         v, iova, iotlb->addr_mask + 1, ret);
> > > +        }
> > > +    }
> > > +}
> > > +
> > > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > > +                                        MemoryRegionSection *section)
> > > +{
> > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > +
> > > +    struct vdpa_iommu *iommu;
> > > +    Int128 end;
> > > +    int iommu_idx;
> > > +    IOMMUMemoryRegion *iommu_mr;
> > > +    int ret;
> > > +
> > > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > > +
> > > +    iommu = g_malloc0(sizeof(*iommu));
> > > +    end =  int128_add(int128_make64(section->offset_within_region),
> > > +            section->size);
> > > +    end = int128_sub(end, int128_one());
> > > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > > +            MEMTXATTRS_UNSPECIFIED);
> > > +
> > > +    iommu->iommu_mr = iommu_mr;
> > > +
> > > +    iommu_notifier_init(
> > > +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> > > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > > +    iommu->iommu_offset =
> > > +        section->offset_within_address_space - section->offset_within_region;
> > > +    iommu->dev = v;
> > > +
> > > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > > +    if (ret) {
> > > +        g_free(iommu);
> > > +        return;
> > > +    }
> > > +
> > > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > > +
> > > +    return;
> > > +}
> > > +
> > > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > > +                                        MemoryRegionSection *section)
> > > +{
> > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > +
> > > +    struct vdpa_iommu *iommu;
> > > +
> > > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > > +    {
> > > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > > +            iommu->n.start == section->offset_within_region) {
> > > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > > +            QLIST_REMOVE(iommu, iommu_next);
> > > +            g_free(iommu);
> > > +            break;
> > > +        }
> > > +    }
> > > +}
> > > +
> > >  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > >                                             MemoryRegionSection *section)
> > >  {
> > > @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > >                                              v->iova_range.last)) {
> > >          return;
> > >      }
> > > +    if (memory_region_is_iommu(section->mr)) {
> > > +        vhost_vdpa_iommu_region_add(listener, section);
> > > +        return;
> > > +    }
> > >
> > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > >                                              v->iova_range.last)) {
> > >          return;
> > >      }
> > > +    if (memory_region_is_iommu(section->mr)) {
> > > +        vhost_vdpa_iommu_region_del(listener, section);
> > > +        return;
> > > +    }
> > >
> > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > >      .region_del = vhost_vdpa_listener_region_del,
> > >  };
> > >
> > > +
> > >  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > >                               void *arg)
> > >  {
> > > @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> > >      v = dev->opaque;
> > >      trace_vhost_vdpa_cleanup(dev, v);
> > >      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> > > -    memory_listener_unregister(&v->listener);
> > >      vhost_vdpa_svq_cleanup(dev);
> > >
> > >      dev->opaque = NULL;
> > > @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > >      }
> > >
> > >      if (started) {
> > > -        memory_listener_register(&v->listener, &address_space_memory);
> > > +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> > > +
> > >          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > >      } else {
> > >          vhost_vdpa_reset_device(dev);
> > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > index d10a89303e..64a46e37cb 100644
> > > --- a/include/hw/virtio/vhost-vdpa.h
> > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
> > >      void *shadow_vq_ops_opaque;
> > >      struct vhost_dev *dev;
> > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> > > +    IOMMUNotifier n;
> > >  } VhostVDPA;
> > >
> > > +struct vdpa_iommu {
> > > +    struct vhost_vdpa *dev;
> > > +    IOMMUMemoryRegion *iommu_mr;
> > > +    hwaddr iommu_offset;
> > > +    IOMMUNotifier n;
> > > +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> > > +};
> > > +
> > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > >                         void *vaddr, bool readonly);
> > >  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > > --
> > > 2.34.3
> >




* Re: [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c
  2022-10-31  6:54       ` Michael S. Tsirkin
@ 2022-10-31  6:56         ` Cindy Lu
  2022-10-31  7:07           ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Cindy Lu @ 2022-10-31  6:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, 31 Oct 2022 at 14:55, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Oct 31, 2022 at 02:44:11PM +0800, Cindy Lu wrote:
> > On Mon, 31 Oct 2022 at 14:38, Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Oct 31, 2022 at 11:10:19AM +0800, Cindy Lu wrote:
> > > > > - Move the implementation of vfio_get_xlat_addr() to softmmu/memory.c
> > > > >   and rename it to memory_get_xlat_addr(), so that other devices,
> > > > >   such as vDPA devices, can use it.
> > > > > - Keep a vfio_get_xlat_addr() wrapper in vfio/common.c that checks
> > > > >   whether the memory is backed by a discard manager, so that each
> > > > >   device can issue its own warning.
> > > >
> > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > >
> > > > Could you rebase on top of my tree? (Config interrupt support conflicts.)
> > >
> > > Hi Michael,
> > > Sure, will do. However, I found a crash in config interrupt while testing
> > > vhost-user.
> > > Should I post a new version for it, or maybe a patch later?
> > Thanks
> > Cindy
>
> New version, I will drop this one. So do you want this one picked up and
> config interrupt on top?
>
Sure, I will rebase the config interrupt patches on top of this.
Thanks
Cindy
> > > > ---
> > > >  hw/vfio/common.c      | 66 +++------------------------------------
> > > >  include/exec/memory.h |  4 +++
> > > >  softmmu/memory.c      | 72 +++++++++++++++++++++++++++++++++++++++++++
> > > >  3 files changed, 81 insertions(+), 61 deletions(-)
> > > >
> > > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > > > index ace9562a9b..6bc02b32c8 100644
> > > > --- a/hw/vfio/common.c
> > > > +++ b/hw/vfio/common.c
> > > > @@ -578,45 +578,11 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
> > > >  static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > >                                 ram_addr_t *ram_addr, bool *read_only)
> > > >  {
> > > > -    MemoryRegion *mr;
> > > > -    hwaddr xlat;
> > > > -    hwaddr len = iotlb->addr_mask + 1;
> > > > -    bool writable = iotlb->perm & IOMMU_WO;
> > > > -
> > > > -    /*
> > > > -     * The IOMMU TLB entry we have just covers translation through
> > > > -     * this IOMMU to its immediate target.  We need to translate
> > > > -     * it the rest of the way through to memory.
> > > > -     */
> > > > -    mr = address_space_translate(&address_space_memory,
> > > > -                                 iotlb->translated_addr,
> > > > -                                 &xlat, &len, writable,
> > > > -                                 MEMTXATTRS_UNSPECIFIED);
> > > > -    if (!memory_region_is_ram(mr)) {
> > > > -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> > > > -                     xlat);
> > > > -        return false;
> > > > -    } else if (memory_region_has_ram_discard_manager(mr)) {
> > > > -        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> > > > -        MemoryRegionSection tmp = {
> > > > -            .mr = mr,
> > > > -            .offset_within_region = xlat,
> > > > -            .size = int128_make64(len),
> > > > -        };
> > > > -
> > > > -        /*
> > > > -         * Malicious VMs can map memory into the IOMMU, which is expected
> > > > -         * to remain discarded. vfio will pin all pages, populating memory.
> > > > -         * Disallow that. vmstate priorities make sure any RamDiscardManager
> > > > -         * were already restored before IOMMUs are restored.
> > > > -         */
> > > > -        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> > > > -            error_report("iommu map to discarded memory (e.g., unplugged via"
> > > > -                         " virtio-mem): %"HWADDR_PRIx"",
> > > > -                         iotlb->translated_addr);
> > > > -            return false;
> > > > -        }
> > > > +    bool ret, mr_has_discard_manager;
> > > >
> > > > +    ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
> > > > +                               &mr_has_discard_manager);
> > > > +    if (ret && mr_has_discard_manager) {
> > > >          /*
> > > >           * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
> > > >           * pages will remain pinned inside vfio until unmapped, resulting in a
> > > > @@ -635,29 +601,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > >                           " intended via an IOMMU. It's possible to mitigate "
> > > >                           " by setting/adjusting RLIMIT_MEMLOCK.");
> > > >      }
> > > > -
> > > > -    /*
> > > > -     * Translation truncates length to the IOMMU page size,
> > > > -     * check that it did not truncate too much.
> > > > -     */
> > > > -    if (len & iotlb->addr_mask) {
> > > > -        error_report("iommu has granularity incompatible with target AS");
> > > > -        return false;
> > > > -    }
> > > > -
> > > > -    if (vaddr) {
> > > > -        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> > > > -    }
> > > > -
> > > > -    if (ram_addr) {
> > > > -        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> > > > -    }
> > > > -
> > > > -    if (read_only) {
> > > > -        *read_only = !writable || mr->readonly;
> > > > -    }
> > > > -
> > > > -    return true;
> > > > +    return ret;
> > > >  }
> > > >
> > > >  static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > > > index bfb1de8eea..d1e79c39dc 100644
> > > > --- a/include/exec/memory.h
> > > > +++ b/include/exec/memory.h
> > > > @@ -713,6 +713,10 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
> > > >  void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
> > > >                                               RamDiscardListener *rdl);
> > > >
> > > > +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > > +                          ram_addr_t *ram_addr, bool *read_only,
> > > > +                          bool *mr_has_discard_manager);
> > > > +
> > > >  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> > > >  typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
> > > >
> > > > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > > > index 7ba2048836..bc0be3f62c 100644
> > > > --- a/softmmu/memory.c
> > > > +++ b/softmmu/memory.c
> > > > @@ -33,6 +33,7 @@
> > > >  #include "qemu/accel.h"
> > > >  #include "hw/boards.h"
> > > >  #include "migration/vmstate.h"
> > > > +#include "exec/address-spaces.h"
> > > >
> > > >  //#define DEBUG_UNASSIGNED
> > > >
> > > > @@ -2121,6 +2122,77 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
> > > >      rdmc->unregister_listener(rdm, rdl);
> > > >  }
> > > >
> > > > +/* Called with rcu_read_lock held.  */
> > > > +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > > +                          ram_addr_t *ram_addr, bool *read_only,
> > > > +                          bool *mr_has_discard_manager)
> > > > +{
> > > > +    MemoryRegion *mr;
> > > > +    hwaddr xlat;
> > > > +    hwaddr len = iotlb->addr_mask + 1;
> > > > +    bool writable = iotlb->perm & IOMMU_WO;
> > > > +
> > > > +    if (mr_has_discard_manager) {
> > > > +        *mr_has_discard_manager = false;
> > > > +    }
> > > > +    /*
> > > > +     * The IOMMU TLB entry we have just covers translation through
> > > > +     * this IOMMU to its immediate target.  We need to translate
> > > > +     * it the rest of the way through to memory.
> > > > +     */
> > > > +    mr = address_space_translate(&address_space_memory, iotlb->translated_addr,
> > > > +                                 &xlat, &len, writable, MEMTXATTRS_UNSPECIFIED);
> > > > +    if (!memory_region_is_ram(mr)) {
> > > > +        error_report("iommu map to non memory area %" HWADDR_PRIx "", xlat);
> > > > +        return false;
> > > > +    } else if (memory_region_has_ram_discard_manager(mr)) {
> > > > +        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> > > > +        MemoryRegionSection tmp = {
> > > > +            .mr = mr,
> > > > +            .offset_within_region = xlat,
> > > > +            .size = int128_make64(len),
> > > > +        };
> > > > +        if (mr_has_discard_manager) {
> > > > +            *mr_has_discard_manager = true;
> > > > +        }
> > > > +        /*
> > > > +         * Malicious VMs can map memory into the IOMMU, which is expected
> > > > +         * to remain discarded. vfio will pin all pages, populating memory.
> > > > +         * Disallow that. vmstate priorities make sure any RamDiscardManager
> > > > +         * were already restored before IOMMUs are restored.
> > > > +         */
> > > > +        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> > > > +            error_report("iommu map to discarded memory (e.g., unplugged via"
> > > > +                         " virtio-mem): %" HWADDR_PRIx "",
> > > > +                         iotlb->translated_addr);
> > > > +            return false;
> > > > +        }
> > > > +    }
> > > > +
> > > > +    /*
> > > > +     * Translation truncates length to the IOMMU page size,
> > > > +     * check that it did not truncate too much.
> > > > +     */
> > > > +    if (len & iotlb->addr_mask) {
> > > > +        error_report("iommu has granularity incompatible with target AS");
> > > > +        return false;
> > > > +    }
> > > > +
> > > > +    if (vaddr) {
> > > > +        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> > > > +    }
> > > > +
> > > > +    if (ram_addr) {
> > > > +        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> > > > +    }
> > > > +
> > > > +    if (read_only) {
> > > > +        *read_only = !writable || mr->readonly;
> > > > +    }
> > > > +
> > > > +    return true;
> > > > +}
> > > > +
> > > >  void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
> > > >  {
> > > >      uint8_t mask = 1 << client;
> > > > --
> > > > 2.34.3
> > >
> > > On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> > > > > Add support for vIOMMU. Add new functions to handle IOMMU memory regions:
> > > > > - During iommu_region_add, register a specific IOMMU notifier
> > > > >   and store all notifiers in a list.
> > > > > - During iommu_region_del, find the matching IOMMU notifier and delete it
> > > > >   from the list.
> > > > >
> > > > > Verified with the vp_vdpa and vdpa_sim_net drivers.
> > > >
> > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > ---
> > > >  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
> > > >  include/hw/virtio/vhost-vdpa.h |  10 +++
> > > >  2 files changed, 122 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > index 3ff9ce3501..dcfaaccfa9 100644
> > > > --- a/hw/virtio/vhost-vdpa.c
> > > > +++ b/hw/virtio/vhost-vdpa.c
> > > > @@ -26,6 +26,7 @@
> > > >  #include "cpu.h"
> > > >  #include "trace.h"
> > > >  #include "qapi/error.h"
> > > > +#include "hw/virtio/virtio-access.h"
> > > >
> > > >  /*
> > > >   * Return one past the end of the end of section. Be careful with uint64_t
> > > > @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > >                                                  uint64_t iova_min,
> > > >                                                  uint64_t iova_max)
> > > >  {
> > > > -    Int128 llend;
> > > >
> > > >      if ((!memory_region_is_ram(section->mr) &&
> > > >           !memory_region_is_iommu(section->mr)) ||
> > > > @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > >          return true;
> > > >      }
> > > >
> > > > -    llend = vhost_vdpa_section_end(section);
> > > > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > > > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > > > -                     ", end addr=0x%" PRIx64 ")",
> > > > -                     iova_max, int128_get64(llend));
> > > > -        return true;
> > > > -    }
> > > > -
> > > >      return false;
> > > >  }
> > > >
> > > > @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> > > >      v->iotlb_batch_begin_sent = false;
> > > >  }
> > > >
> > > > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > > +{
> > > > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > > > +
> > > > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > > > +    struct vhost_vdpa *v = iommu->dev;
> > > > +    void *vaddr;
> > > > +    int ret;
> > > > +
> > > > +    if (iotlb->target_as != &address_space_memory) {
> > > > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > > > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > > > +        return;
> > > > +    }
> > > > +    RCU_READ_LOCK_GUARD();
> > > > +    vhost_vdpa_iotlb_batch_begin_once(v);
> > > > +
> > > > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > > > +        bool read_only;
> > > > +
> > > > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > > > +            return;
> > > > +        }
> > > > +        ret =
> > > > +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> > > > +        if (ret) {
> > > > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > > > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > > > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > > > +        }
> > > > +    } else {
> > > > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > > > +        if (ret) {
> > > > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > > > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > > > +                         v, iova, iotlb->addr_mask + 1, ret);
> > > > +        }
> > > > +    }
> > > > +}
> > > > +
> > > > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > > > +                                        MemoryRegionSection *section)
> > > > +{
> > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > +
> > > > +    struct vdpa_iommu *iommu;
> > > > +    Int128 end;
> > > > +    int iommu_idx;
> > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > +    int ret;
> > > > +
> > > > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > > > +
> > > > +    iommu = g_malloc0(sizeof(*iommu));
> > > > +    end =  int128_add(int128_make64(section->offset_within_region),
> > > > +            section->size);
> > > > +    end = int128_sub(end, int128_one());
> > > > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > > > +            MEMTXATTRS_UNSPECIFIED);
> > > > +
> > > > +    iommu->iommu_mr = iommu_mr;
> > > > +
> > > > +    iommu_notifier_init(
> > > > +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> > > > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > > > +    iommu->iommu_offset =
> > > > +        section->offset_within_address_space - section->offset_within_region;
> > > > +    iommu->dev = v;
> > > > +
> > > > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > > > +    if (ret) {
> > > > +        g_free(iommu);
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > > > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > > > +
> > > > +    return;
> > > > +}
> > > > +
> > > > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > > > +                                        MemoryRegionSection *section)
> > > > +{
> > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > +
> > > > +    struct vdpa_iommu *iommu;
> > > > +
> > > > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > > > +    {
> > > > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > > > +            iommu->n.start == section->offset_within_region) {
> > > > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > > > +            QLIST_REMOVE(iommu, iommu_next);
> > > > +            g_free(iommu);
> > > > +            break;
> > > > +        }
> > > > +    }
> > > > +}
> > > > +
> > > >  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > >                                             MemoryRegionSection *section)
> > > >  {
> > > > @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > >                                              v->iova_range.last)) {
> > > >          return;
> > > >      }
> > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > +        vhost_vdpa_iommu_region_add(listener, section);
> > > > +        return;
> > > > +    }
> > > >
> > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > > >                                              v->iova_range.last)) {
> > > >          return;
> > > >      }
> > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > +        vhost_vdpa_iommu_region_del(listener, section);
> > > > +        return;
> > > > +    }
> > > >
> > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > > >      .region_del = vhost_vdpa_listener_region_del,
> > > >  };
> > > >
> > > > +
> > > >  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > >                               void *arg)
> > > >  {
> > > > @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> > > >      v = dev->opaque;
> > > >      trace_vhost_vdpa_cleanup(dev, v);
> > > >      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> > > > -    memory_listener_unregister(&v->listener);
> > > >      vhost_vdpa_svq_cleanup(dev);
> > > >
> > > >      dev->opaque = NULL;
> > > > @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > > >      }
> > > >
> > > >      if (started) {
> > > > -        memory_listener_register(&v->listener, &address_space_memory);
> > > > +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> > > > +
> > > >          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > > >      } else {
> > > >          vhost_vdpa_reset_device(dev);
> > > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > > index d10a89303e..64a46e37cb 100644
> > > > --- a/include/hw/virtio/vhost-vdpa.h
> > > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > > @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
> > > >      void *shadow_vq_ops_opaque;
> > > >      struct vhost_dev *dev;
> > > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > > +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> > > > +    IOMMUNotifier n;
> > > >  } VhostVDPA;
> > > >
> > > > +struct vdpa_iommu {
> > > > +    struct vhost_vdpa *dev;
> > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > +    hwaddr iommu_offset;
> > > > +    IOMMUNotifier n;
> > > > +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> > > > +};
> > > > +
> > > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > > >                         void *vaddr, bool readonly);
> > > >  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > > > --
> > > > 2.34.3
> > >
>




* Re: [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU
  2022-10-31  3:10 ` [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU Cindy Lu
@ 2022-10-31  7:04   ` Michael S. Tsirkin
  2022-10-31  7:15     ` Cindy Lu
  0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2022-10-31  7:04 UTC (permalink / raw)
  To: Cindy Lu
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> Add support for vIOMMU. Add new functions to handle IOMMU memory regions:
> - During iommu_region_add, register a specific IOMMU notifier
>   and store all notifiers in a list.
> - During iommu_region_del, find the matching IOMMU notifier and delete it
>   from the list.
> 
> Verified with the vp_vdpa and vdpa_sim_net drivers.
> 
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
>  include/hw/virtio/vhost-vdpa.h |  10 +++
>  2 files changed, 122 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 3ff9ce3501..dcfaaccfa9 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -26,6 +26,7 @@
>  #include "cpu.h"
>  #include "trace.h"
>  #include "qapi/error.h"
> +#include "hw/virtio/virtio-access.h"
>  
>  /*
>   * Return one past the end of the end of section. Be careful with uint64_t
> @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
>                                                  uint64_t iova_min,
>                                                  uint64_t iova_max)
>  {
> -    Int128 llend;
>  
>      if ((!memory_region_is_ram(section->mr) &&
>           !memory_region_is_iommu(section->mr)) ||
> @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
>          return true;
>      }
>  
> -    llend = vhost_vdpa_section_end(section);
> -    if (int128_gt(llend, int128_make64(iova_max))) {
> -        error_report("RAM section out of device range (max=0x%" PRIx64
> -                     ", end addr=0x%" PRIx64 ")",
> -                     iova_max, int128_get64(llend));
> -        return true;
> -    }
> -
>      return false;
>  }
>

I couldn't figure out why we are completely removing this.
So this function is now checking iova_min but not iova_max?
Seems asymmetrical.

  
> @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
>      v->iotlb_batch_begin_sent = false;
>  }
>  
> +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> +{
> +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> +
> +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> +    struct vhost_vdpa *v = iommu->dev;
> +    void *vaddr;
> +    int ret;
> +
> +    if (iotlb->target_as != &address_space_memory) {
> +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> +        return;
> +    }
> +    RCU_READ_LOCK_GUARD();
> +    vhost_vdpa_iotlb_batch_begin_once(v);
> +
> +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> +        bool read_only;
> +
> +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> +            return;
> +        }
> +        ret =
> +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> +        if (ret) {
> +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> +        }
> +    } else {
> +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> +        if (ret) {
> +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> +                         v, iova, iotlb->addr_mask + 1, ret);
> +        }
> +    }
> +}
> +
> +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> +                                        MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +
> +    struct vdpa_iommu *iommu;
> +    Int128 end;
> +    int iommu_idx;
> +    IOMMUMemoryRegion *iommu_mr;
> +    int ret;
> +
> +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> +
> +    iommu = g_malloc0(sizeof(*iommu));
> +    end =  int128_add(int128_make64(section->offset_within_region),
> +            section->size);
> +    end = int128_sub(end, int128_one());
> +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> +            MEMTXATTRS_UNSPECIFIED);
> +
> +    iommu->iommu_mr = iommu_mr;
> +
> +    iommu_notifier_init(
> +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> +        section->offset_within_region, int128_get64(end), iommu_idx);
> +    iommu->iommu_offset =
> +        section->offset_within_address_space - section->offset_within_region;
> +    iommu->dev = v;
> +
> +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> +    if (ret) {
> +        g_free(iommu);
> +        return;
> +    }
> +
> +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> +
> +    return;
> +}
> +
> +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> +                                        MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +
> +    struct vdpa_iommu *iommu;
> +
> +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> +    {
> +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> +            iommu->n.start == section->offset_within_region) {
> +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> +            QLIST_REMOVE(iommu, iommu_next);
> +            g_free(iommu);
> +            break;
> +        }
> +    }
> +}
> +
>  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>                                             MemoryRegionSection *section)
>  {
> @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>                                              v->iova_range.last)) {
>          return;
>      }
> +    if (memory_region_is_iommu(section->mr)) {
> +        vhost_vdpa_iommu_region_add(listener, section);
> +        return;
> +    }
>  
>      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
>                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>                                              v->iova_range.last)) {
>          return;
>      }
> +    if (memory_region_is_iommu(section->mr)) {
> +        vhost_vdpa_iommu_region_del(listener, section);
> +        return;
> +    }
>  
>      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
>                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
>      .region_del = vhost_vdpa_listener_region_del,
>  };
>  
> +
>  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
>                               void *arg)
>  {


This change is not necessary.

> @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
>      v = dev->opaque;
>      trace_vhost_vdpa_cleanup(dev, v);
>      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> -    memory_listener_unregister(&v->listener);
>      vhost_vdpa_svq_cleanup(dev);
>  
>      dev->opaque = NULL;
> @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>      }
>  
>      if (started) {
> -        memory_listener_register(&v->listener, &address_space_memory);
> +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> +
>          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
>      } else {
>          vhost_vdpa_reset_device(dev);
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index d10a89303e..64a46e37cb 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
>      void *shadow_vq_ops_opaque;
>      struct vhost_dev *dev;
>      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> +    IOMMUNotifier n;
>  } VhostVDPA;
>  
> +struct vdpa_iommu {
> +    struct vhost_vdpa *dev;
> +    IOMMUMemoryRegion *iommu_mr;
> +    hwaddr iommu_offset;
> +    IOMMUNotifier n;
> +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> +};
> +

You need to add a typedef as per coding style.
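
For illustration only, a minimal sketch of what the typedef form could look
like. It assumes stand-in definitions for hwaddr, IOMMUMemoryRegion and
IOMMUNotifier so that it compiles outside the QEMU tree, and it replaces
QLIST_ENTRY with a plain pointer. The CamelCase name VDPAIOMMUState and the
helper vdpa_iommu_iova() are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* Stand-ins so the sketch builds outside QEMU; the real types are in QEMU headers. */
typedef uint64_t hwaddr;
typedef struct IOMMUMemoryRegion IOMMUMemoryRegion;
typedef struct IOMMUNotifier {
    hwaddr start;
    hwaddr end;
} IOMMUNotifier;
struct vhost_vdpa;

/* Hypothetical CamelCase typedef for the per-region IOMMU state. */
typedef struct VDPAIOMMUState {
    struct vhost_vdpa *dev;
    IOMMUMemoryRegion *iommu_mr;
    hwaddr iommu_offset;
    IOMMUNotifier n;
    struct VDPAIOMMUState *iommu_next; /* stands in for QLIST_ENTRY() */
} VDPAIOMMUState;

/* Hypothetical helper: same arithmetic as the notifier uses to build the IOVA. */
static hwaddr vdpa_iommu_iova(const VDPAIOMMUState *s, hwaddr iotlb_iova)
{
    return iotlb_iova + s->iommu_offset;
}
```

With a typedef in place, declarations can use the type name directly
instead of spelling out "struct vdpa_iommu" at every use.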

>  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
>                         void *vaddr, bool readonly);
>  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> -- 
> 2.34.3




* Re: [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c
  2022-10-31  6:56         ` Cindy Lu
@ 2022-10-31  7:07           ` Michael S. Tsirkin
  0 siblings, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2022-10-31  7:07 UTC (permalink / raw)
  To: Cindy Lu
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, Oct 31, 2022 at 02:56:47PM +0800, Cindy Lu wrote:
> On Mon, 31 Oct 2022 at 14:55, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Oct 31, 2022 at 02:44:11PM +0800, Cindy Lu wrote:
> > > On Mon, 31 Oct 2022 at 14:38, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Oct 31, 2022 at 11:10:19AM +0800, Cindy Lu wrote:
> > > > > - Move the implement vfio_get_xlat_addr to softmmu/memory.c, and
> > > > >   change the name to memory_get_xlat_addr(). So we can use this
> > > > >   function on other devices, such as vDPA device.
> > > > > - Add a new function vfio_get_xlat_addr in vfio/common.c, and it will check
> > > > >   whether the memory is backed by a discard manager. then device can
> > > > >   have its own warning.
> > > > >
> > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > >
> > > > Could you rebase on top of my tree (config interrupt support conflicts).
> > > >
> > > Hi Micheal,
> > > sure, will do, but I found a crash in config interrupt while testing
> > > vhost user,
> > > should I post a new version for it? Or maybe a patch later?
> > > Thanks
> > > Cindy
> >
> > New version, I will drop this one. So do you want this one picked up and
> > config interrupt on top?
> >
> sure, I will rebase the config interrupt patches on top of this
> Thanks
> Cindy


Hmm, no, I sent comments on this one too.

> > > > > ---
> > > > >  hw/vfio/common.c      | 66 +++------------------------------------
> > > > >  include/exec/memory.h |  4 +++
> > > > >  softmmu/memory.c      | 72 +++++++++++++++++++++++++++++++++++++++++++
> > > > >  3 files changed, 81 insertions(+), 61 deletions(-)
> > > > >
> > > > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > > > > index ace9562a9b..6bc02b32c8 100644
> > > > > --- a/hw/vfio/common.c
> > > > > +++ b/hw/vfio/common.c
> > > > > @@ -578,45 +578,11 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
> > > > >  static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > > >                                 ram_addr_t *ram_addr, bool *read_only)
> > > > >  {
> > > > > -    MemoryRegion *mr;
> > > > > -    hwaddr xlat;
> > > > > -    hwaddr len = iotlb->addr_mask + 1;
> > > > > -    bool writable = iotlb->perm & IOMMU_WO;
> > > > > -
> > > > > -    /*
> > > > > -     * The IOMMU TLB entry we have just covers translation through
> > > > > -     * this IOMMU to its immediate target.  We need to translate
> > > > > -     * it the rest of the way through to memory.
> > > > > -     */
> > > > > -    mr = address_space_translate(&address_space_memory,
> > > > > -                                 iotlb->translated_addr,
> > > > > -                                 &xlat, &len, writable,
> > > > > -                                 MEMTXATTRS_UNSPECIFIED);
> > > > > -    if (!memory_region_is_ram(mr)) {
> > > > > -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> > > > > -                     xlat);
> > > > > -        return false;
> > > > > -    } else if (memory_region_has_ram_discard_manager(mr)) {
> > > > > -        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> > > > > -        MemoryRegionSection tmp = {
> > > > > -            .mr = mr,
> > > > > -            .offset_within_region = xlat,
> > > > > -            .size = int128_make64(len),
> > > > > -        };
> > > > > -
> > > > > -        /*
> > > > > -         * Malicious VMs can map memory into the IOMMU, which is expected
> > > > > -         * to remain discarded. vfio will pin all pages, populating memory.
> > > > > -         * Disallow that. vmstate priorities make sure any RamDiscardManager
> > > > > -         * were already restored before IOMMUs are restored.
> > > > > -         */
> > > > > -        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> > > > > -            error_report("iommu map to discarded memory (e.g., unplugged via"
> > > > > -                         " virtio-mem): %"HWADDR_PRIx"",
> > > > > -                         iotlb->translated_addr);
> > > > > -            return false;
> > > > > -        }
> > > > > +    bool ret, mr_has_discard_manager;
> > > > >
> > > > > +    ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
> > > > > +                               &mr_has_discard_manager);
> > > > > +    if (ret && mr_has_discard_manager) {
> > > > >          /*
> > > > >           * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
> > > > >           * pages will remain pinned inside vfio until unmapped, resulting in a
> > > > > @@ -635,29 +601,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > > >                           " intended via an IOMMU. It's possible to mitigate "
> > > > >                           " by setting/adjusting RLIMIT_MEMLOCK.");
> > > > >      }
> > > > > -
> > > > > -    /*
> > > > > -     * Translation truncates length to the IOMMU page size,
> > > > > -     * check that it did not truncate too much.
> > > > > -     */
> > > > > -    if (len & iotlb->addr_mask) {
> > > > > -        error_report("iommu has granularity incompatible with target AS");
> > > > > -        return false;
> > > > > -    }
> > > > > -
> > > > > -    if (vaddr) {
> > > > > -        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> > > > > -    }
> > > > > -
> > > > > -    if (ram_addr) {
> > > > > -        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> > > > > -    }
> > > > > -
> > > > > -    if (read_only) {
> > > > > -        *read_only = !writable || mr->readonly;
> > > > > -    }
> > > > > -
> > > > > -    return true;
> > > > > +    return ret;
> > > > >  }
> > > > >
> > > > >  static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > > > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > > > > index bfb1de8eea..d1e79c39dc 100644
> > > > > --- a/include/exec/memory.h
> > > > > +++ b/include/exec/memory.h
> > > > > @@ -713,6 +713,10 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
> > > > >  void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
> > > > >                                               RamDiscardListener *rdl);
> > > > >
> > > > > +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > > > +                          ram_addr_t *ram_addr, bool *read_only,
> > > > > +                          bool *mr_has_discard_manager);
> > > > > +
> > > > >  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> > > > >  typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
> > > > >
> > > > > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > > > > index 7ba2048836..bc0be3f62c 100644
> > > > > --- a/softmmu/memory.c
> > > > > +++ b/softmmu/memory.c
> > > > > @@ -33,6 +33,7 @@
> > > > >  #include "qemu/accel.h"
> > > > >  #include "hw/boards.h"
> > > > >  #include "migration/vmstate.h"
> > > > > +#include "exec/address-spaces.h"
> > > > >
> > > > >  //#define DEBUG_UNASSIGNED
> > > > >
> > > > > @@ -2121,6 +2122,77 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
> > > > >      rdmc->unregister_listener(rdm, rdl);
> > > > >  }
> > > > >
> > > > > +/* Called with rcu_read_lock held.  */
> > > > > +bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
> > > > > +                          ram_addr_t *ram_addr, bool *read_only,
> > > > > +                          bool *mr_has_discard_manager)
> > > > > +{
> > > > > +    MemoryRegion *mr;
> > > > > +    hwaddr xlat;
> > > > > +    hwaddr len = iotlb->addr_mask + 1;
> > > > > +    bool writable = iotlb->perm & IOMMU_WO;
> > > > > +
> > > > > +    if (mr_has_discard_manager) {
> > > > > +        *mr_has_discard_manager = false;
> > > > > +    }
> > > > > +    /*
> > > > > +     * The IOMMU TLB entry we have just covers translation through
> > > > > +     * this IOMMU to its immediate target.  We need to translate
> > > > > +     * it the rest of the way through to memory.
> > > > > +     */
> > > > > +    mr = address_space_translate(&address_space_memory, iotlb->translated_addr,
> > > > > +                                 &xlat, &len, writable, MEMTXATTRS_UNSPECIFIED);
> > > > > +    if (!memory_region_is_ram(mr)) {
> > > > > +        error_report("iommu map to non memory area %" HWADDR_PRIx "", xlat);
> > > > > +        return false;
> > > > > +    } else if (memory_region_has_ram_discard_manager(mr)) {
> > > > > +        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
> > > > > +        MemoryRegionSection tmp = {
> > > > > +            .mr = mr,
> > > > > +            .offset_within_region = xlat,
> > > > > +            .size = int128_make64(len),
> > > > > +        };
> > > > > +        if (mr_has_discard_manager) {
> > > > > +            *mr_has_discard_manager = true;
> > > > > +        }
> > > > > +        /*
> > > > > +         * Malicious VMs can map memory into the IOMMU, which is expected
> > > > > +         * to remain discarded. vfio will pin all pages, populating memory.
> > > > > +         * Disallow that. vmstate priorities make sure any RamDiscardManager
> > > > > +         * were already restored before IOMMUs are restored.
> > > > > +         */
> > > > > +        if (!ram_discard_manager_is_populated(rdm, &tmp)) {
> > > > > +            error_report("iommu map to discarded memory (e.g., unplugged via"
> > > > > +                         " virtio-mem): %" HWADDR_PRIx "",
> > > > > +                         iotlb->translated_addr);
> > > > > +            return false;
> > > > > +        }
> > > > > +    }
> > > > > +
> > > > > +    /*
> > > > > +     * Translation truncates length to the IOMMU page size,
> > > > > +     * check that it did not truncate too much.
> > > > > +     */
> > > > > +    if (len & iotlb->addr_mask) {
> > > > > +        error_report("iommu has granularity incompatible with target AS");
> > > > > +        return false;
> > > > > +    }
> > > > > +
> > > > > +    if (vaddr) {
> > > > > +        *vaddr = memory_region_get_ram_ptr(mr) + xlat;
> > > > > +    }
> > > > > +
> > > > > +    if (ram_addr) {
> > > > > +        *ram_addr = memory_region_get_ram_addr(mr) + xlat;
> > > > > +    }
> > > > > +
> > > > > +    if (read_only) {
> > > > > +        *read_only = !writable || mr->readonly;
> > > > > +    }
> > > > > +
> > > > > +    return true;
> > > > > +}
> > > > > +
> > > > >  void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
> > > > >  {
> > > > >      uint8_t mask = 1 << client;
> > > > > --
> > > > > 2.34.3
> > > >
> > > > On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> > > > > Add support for vIOMMU. add the new function to deal with iommu MR.
> > > > > - during iommu_region_add register a specific IOMMU notifier,
> > > > >  and store all notifiers in a list.
> > > > > - during iommu_region_del, compare and delete the IOMMU notifier from the list
> > > > >
> > > > > Verified in vp_vdpa and vdpa_sim_net driver
> > > > >
> > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > ---
> > > > >  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
> > > > >  include/hw/virtio/vhost-vdpa.h |  10 +++
> > > > >  2 files changed, 122 insertions(+), 11 deletions(-)
> > > > >
> > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > > index 3ff9ce3501..dcfaaccfa9 100644
> > > > > --- a/hw/virtio/vhost-vdpa.c
> > > > > +++ b/hw/virtio/vhost-vdpa.c
> > > > > @@ -26,6 +26,7 @@
> > > > >  #include "cpu.h"
> > > > >  #include "trace.h"
> > > > >  #include "qapi/error.h"
> > > > > +#include "hw/virtio/virtio-access.h"
> > > > >
> > > > >  /*
> > > > >   * Return one past the end of the end of section. Be careful with uint64_t
> > > > > @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > > >                                                  uint64_t iova_min,
> > > > >                                                  uint64_t iova_max)
> > > > >  {
> > > > > -    Int128 llend;
> > > > >
> > > > >      if ((!memory_region_is_ram(section->mr) &&
> > > > >           !memory_region_is_iommu(section->mr)) ||
> > > > > @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > > >          return true;
> > > > >      }
> > > > >
> > > > > -    llend = vhost_vdpa_section_end(section);
> > > > > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > > > > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > > > > -                     ", end addr=0x%" PRIx64 ")",
> > > > > -                     iova_max, int128_get64(llend));
> > > > > -        return true;
> > > > > -    }
> > > > > -
> > > > >      return false;
> > > > >  }
> > > > >
> > > > > @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> > > > >      v->iotlb_batch_begin_sent = false;
> > > > >  }
> > > > >
> > > > > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > > > +{
> > > > > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > > > > +
> > > > > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > > > > +    struct vhost_vdpa *v = iommu->dev;
> > > > > +    void *vaddr;
> > > > > +    int ret;
> > > > > +
> > > > > +    if (iotlb->target_as != &address_space_memory) {
> > > > > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > > > > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > > > > +        return;
> > > > > +    }
> > > > > +    RCU_READ_LOCK_GUARD();
> > > > > +    vhost_vdpa_iotlb_batch_begin_once(v);
> > > > > +
> > > > > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > > > > +        bool read_only;
> > > > > +
> > > > > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > > > > +            return;
> > > > > +        }
> > > > > +        ret =
> > > > > +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> > > > > +        if (ret) {
> > > > > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > > > > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > > > > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > > > > +        }
> > > > > +    } else {
> > > > > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > > > > +        if (ret) {
> > > > > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > > > > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > > > > +                         v, iova, iotlb->addr_mask + 1, ret);
> > > > > +        }
> > > > > +    }
> > > > > +}
> > > > > +
> > > > > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > > > > +                                        MemoryRegionSection *section)
> > > > > +{
> > > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > > +
> > > > > +    struct vdpa_iommu *iommu;
> > > > > +    Int128 end;
> > > > > +    int iommu_idx;
> > > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > > +    int ret;
> > > > > +
> > > > > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > > > > +
> > > > > +    iommu = g_malloc0(sizeof(*iommu));
> > > > > +    end =  int128_add(int128_make64(section->offset_within_region),
> > > > > +            section->size);
> > > > > +    end = int128_sub(end, int128_one());
> > > > > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > > > > +            MEMTXATTRS_UNSPECIFIED);
> > > > > +
> > > > > +    iommu->iommu_mr = iommu_mr;
> > > > > +
> > > > > +    iommu_notifier_init(
> > > > > +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> > > > > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > > > > +    iommu->iommu_offset =
> > > > > +        section->offset_within_address_space - section->offset_within_region;
> > > > > +    iommu->dev = v;
> > > > > +
> > > > > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > > > > +    if (ret) {
> > > > > +        g_free(iommu);
> > > > > +        return;
> > > > > +    }
> > > > > +
> > > > > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > > > > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > > > > +
> > > > > +    return;
> > > > > +}
> > > > > +
> > > > > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > > > > +                                        MemoryRegionSection *section)
> > > > > +{
> > > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > > +
> > > > > +    struct vdpa_iommu *iommu;
> > > > > +
> > > > > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > > > > +    {
> > > > > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > > > > +            iommu->n.start == section->offset_within_region) {
> > > > > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > > > > +            QLIST_REMOVE(iommu, iommu_next);
> > > > > +            g_free(iommu);
> > > > > +            break;
> > > > > +        }
> > > > > +    }
> > > > > +}
> > > > > +
> > > > >  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > > >                                             MemoryRegionSection *section)
> > > > >  {
> > > > > @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > > >                                              v->iova_range.last)) {
> > > > >          return;
> > > > >      }
> > > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > > +        vhost_vdpa_iommu_region_add(listener, section);
> > > > > +        return;
> > > > > +    }
> > > > >
> > > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > > @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > > > >                                              v->iova_range.last)) {
> > > > >          return;
> > > > >      }
> > > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > > +        vhost_vdpa_iommu_region_del(listener, section);
> > > > > +        return;
> > > > > +    }
> > > > >
> > > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > > @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > > > >      .region_del = vhost_vdpa_listener_region_del,
> > > > >  };
> > > > >
> > > > > +
> > > > >  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > > >                               void *arg)
> > > > >  {
> > > > > @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> > > > >      v = dev->opaque;
> > > > >      trace_vhost_vdpa_cleanup(dev, v);
> > > > >      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> > > > > -    memory_listener_unregister(&v->listener);
> > > > >      vhost_vdpa_svq_cleanup(dev);
> > > > >
> > > > >      dev->opaque = NULL;
> > > > > @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > > > >      }
> > > > >
> > > > >      if (started) {
> > > > > -        memory_listener_register(&v->listener, &address_space_memory);
> > > > > +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> > > > > +
> > > > >          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > > > >      } else {
> > > > >          vhost_vdpa_reset_device(dev);
> > > > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > > > index d10a89303e..64a46e37cb 100644
> > > > > --- a/include/hw/virtio/vhost-vdpa.h
> > > > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > > > @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
> > > > >      void *shadow_vq_ops_opaque;
> > > > >      struct vhost_dev *dev;
> > > > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > > > +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> > > > > +    IOMMUNotifier n;
> > > > >  } VhostVDPA;
> > > > >
> > > > > +struct vdpa_iommu {
> > > > > +    struct vhost_vdpa *dev;
> > > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > > +    hwaddr iommu_offset;
> > > > > +    IOMMUNotifier n;
> > > > > +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> > > > > +};
> > > > > +
> > > > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > > > >                         void *vaddr, bool readonly);
> > > > >  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > > > > --
> > > > > 2.34.3
> > > >
> >




* Re: [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU
  2022-10-31  7:04   ` Michael S. Tsirkin
@ 2022-10-31  7:15     ` Cindy Lu
  2022-10-31  7:20       ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Cindy Lu @ 2022-10-31  7:15 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel



On Mon, 31 Oct 2022 at 15:04, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> > Add support for vIOMMU. add the new function to deal with iommu MR.
> > - during iommu_region_add register a specific IOMMU notifier,
> >  and store all notifiers in a list.
> > - during iommu_region_del, compare and delete the IOMMU notifier from the list
> >
> > Verified in vp_vdpa and vdpa_sim_net driver
> >
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > ---
> >  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
> >  include/hw/virtio/vhost-vdpa.h |  10 +++
> >  2 files changed, 122 insertions(+), 11 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 3ff9ce3501..dcfaaccfa9 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -26,6 +26,7 @@
> >  #include "cpu.h"
> >  #include "trace.h"
> >  #include "qapi/error.h"
> > +#include "hw/virtio/virtio-access.h"
> >
> >  /*
> >   * Return one past the end of the end of section. Be careful with uint64_t
> > @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> >                                                  uint64_t iova_min,
> >                                                  uint64_t iova_max)
> >  {
> > -    Int128 llend;
> >
> >      if ((!memory_region_is_ram(section->mr) &&
> >           !memory_region_is_iommu(section->mr)) ||
> > @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> >          return true;
> >      }
> >
> > -    llend = vhost_vdpa_section_end(section);
> > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > -                     ", end addr=0x%" PRIx64 ")",
> > -                     iova_max, int128_get64(llend));
> > -        return true;
> > -    }
> > -
> >      return false;
> >  }
> >
>
> I couldn't figure out we are completely removing this.
> So this function is now checking iova_min but not iova_max?
> Seems asymmetrical.
>
This is because there is an assert in int128_get64, so I just skipped
this part of the check:
static inline uint64_t int128_get64(Int128 a)
{
    uint64_t r = a;
    assert(r == a);
    return r;
}


>
> > @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> >      v->iotlb_batch_begin_sent = false;
> >  }
> >
> > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > +{
> > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > +
> > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > +    struct vhost_vdpa *v = iommu->dev;
> > +    void *vaddr;
> > +    int ret;
> > +
> > +    if (iotlb->target_as != &address_space_memory) {
> > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > +        return;
> > +    }
> > +    RCU_READ_LOCK_GUARD();
> > +    vhost_vdpa_iotlb_batch_begin_once(v);
> > +
> > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > +        bool read_only;
> > +
> > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > +            return;
> > +        }
> > +        ret =
> > +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> > +        if (ret) {
> > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > +        }
> > +    } else {
> > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > +        if (ret) {
> > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > +                         v, iova, iotlb->addr_mask + 1, ret);
> > +        }
> > +    }
> > +}
> > +
> > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > +                                        MemoryRegionSection *section)
> > +{
> > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > +
> > +    struct vdpa_iommu *iommu;
> > +    Int128 end;
> > +    int iommu_idx;
> > +    IOMMUMemoryRegion *iommu_mr;
> > +    int ret;
> > +
> > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > +
> > +    iommu = g_malloc0(sizeof(*iommu));
> > +    end =  int128_add(int128_make64(section->offset_within_region),
> > +            section->size);
> > +    end = int128_sub(end, int128_one());
> > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > +            MEMTXATTRS_UNSPECIFIED);
> > +
> > +    iommu->iommu_mr = iommu_mr;
> > +
> > +    iommu_notifier_init(
> > +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > +    iommu->iommu_offset =
> > +        section->offset_within_address_space - section->offset_within_region;
> > +    iommu->dev = v;
> > +
> > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > +    if (ret) {
> > +        g_free(iommu);
> > +        return;
> > +    }
> > +
> > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > +
> > +    return;
> > +}
> > +
> > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > +                                        MemoryRegionSection *section)
> > +{
> > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > +
> > +    struct vdpa_iommu *iommu;
> > +
> > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > +    {
> > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > +            iommu->n.start == section->offset_within_region) {
> > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > +            QLIST_REMOVE(iommu, iommu_next);
> > +            g_free(iommu);
> > +            break;
> > +        }
> > +    }
> > +}
> > +
> >  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >                                             MemoryRegionSection *section)
> >  {
> > @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >                                              v->iova_range.last)) {
> >          return;
> >      }
> > +    if (memory_region_is_iommu(section->mr)) {
> > +        vhost_vdpa_iommu_region_add(listener, section);
> > +        return;
> > +    }
> >
> >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >                                              v->iova_range.last)) {
> >          return;
> >      }
> > +    if (memory_region_is_iommu(section->mr)) {
> > +        vhost_vdpa_iommu_region_del(listener, section);
> > +        return;
> > +    }
> >
> >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> >      .region_del = vhost_vdpa_listener_region_del,
> >  };
> >
> > +
> >  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> >                               void *arg)
> >  {
>
>
> This change is not necessary.
>
will fix this
> > @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> >      v = dev->opaque;
> >      trace_vhost_vdpa_cleanup(dev, v);
> >      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> > -    memory_listener_unregister(&v->listener);
> >      vhost_vdpa_svq_cleanup(dev);
> >
> >      dev->opaque = NULL;
> > @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >      }
> >
> >      if (started) {
> > -        memory_listener_register(&v->listener, &address_space_memory);
> > +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> > +
> >          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> >      } else {
> >          vhost_vdpa_reset_device(dev);
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index d10a89303e..64a46e37cb 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
> >      void *shadow_vq_ops_opaque;
> >      struct vhost_dev *dev;
> >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> > +    IOMMUNotifier n;
> >  } VhostVDPA;
> >
> > +struct vdpa_iommu {
> > +    struct vhost_vdpa *dev;
> > +    IOMMUMemoryRegion *iommu_mr;
> > +    hwaddr iommu_offset;
> > +    IOMMUNotifier n;
> > +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> > +};
> > +
>
> You need to add a typedef as per coding style.
>
will fix this
Thanks
Cindy
> >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> >                         void *vaddr, bool readonly);
> >  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > --
> > 2.34.3
>



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU
  2022-10-31  7:15     ` Cindy Lu
@ 2022-10-31  7:20       ` Michael S. Tsirkin
  2022-10-31  8:30         ` Cindy Lu
  0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2022-10-31  7:20 UTC (permalink / raw)
  To: Cindy Lu
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, Oct 31, 2022 at 03:15:14PM +0800, Cindy Lu wrote:
> ,
> 
> 
> On Mon, 31 Oct 2022 at 15:04, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> > > Add support for vIOMMU. add the new function to deal with iommu MR.
> > > - during iommu_region_add register a specific IOMMU notifier,
> > >  and store all notifiers in a list.
> > > - during iommu_region_del, compare and delete the IOMMU notifier from the list
> > >
> > > Verified in vp_vdpa and vdpa_sim_net driver
> > >
> > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > ---
> > >  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
> > >  include/hw/virtio/vhost-vdpa.h |  10 +++
> > >  2 files changed, 122 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > index 3ff9ce3501..dcfaaccfa9 100644
> > > --- a/hw/virtio/vhost-vdpa.c
> > > +++ b/hw/virtio/vhost-vdpa.c
> > > @@ -26,6 +26,7 @@
> > >  #include "cpu.h"
> > >  #include "trace.h"
> > >  #include "qapi/error.h"
> > > +#include "hw/virtio/virtio-access.h"
> > >
> > >  /*
> > >   * Return one past the end of the end of section. Be careful with uint64_t
> > > @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > >                                                  uint64_t iova_min,
> > >                                                  uint64_t iova_max)
> > >  {
> > > -    Int128 llend;
> > >
> > >      if ((!memory_region_is_ram(section->mr) &&
> > >           !memory_region_is_iommu(section->mr)) ||
> > > @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > >          return true;
> > >      }
> > >
> > > -    llend = vhost_vdpa_section_end(section);
> > > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > > -                     ", end addr=0x%" PRIx64 ")",
> > > -                     iova_max, int128_get64(llend));
> > > -        return true;
> > > -    }
> > > -
> > >      return false;
> > >  }
> > >
> >
> > I couldn't figure out we are completely removing this.
> > So this function is now checking iova_min but not iova_max?
> > Seems asymmetrical.
> >
> this is because this is an asset for int128_get64,So I just not jump
> this part of the check,
> static inline uint64_t int128_get64(Int128 a)
> {
>     uint64_t r = a;
>     assert(r == a);
>     return r;
> }


?

Could not parse this. You mean assert? And removing functionality
because you don't like an error message does not make sense.
So find another way to print it?
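
One possible way to keep the check and still print the end address is sketched below. This is an illustration only, not part of the patch: `u128`, `u128_format_hex` and `section_out_of_range` are hypothetical names, and `unsigned __int128` stands in for QEMU's Int128 (so a compiler with `__int128` support is assumed). The idea is to format the value from its two 64-bit halves, so nothing is narrowed and no assertion can trigger.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for QEMU's Int128; assumes a compiler with __int128 support. */
typedef unsigned __int128 u128;

/*
 * Format a 128-bit value as hex from its two 64-bit halves, so the value
 * never has to be narrowed to uint64_t. Returns snprintf()'s result.
 */
static int u128_format_hex(char *buf, size_t len, u128 value)
{
    uint64_t hi = (uint64_t)(value >> 64);
    uint64_t lo = (uint64_t)value;

    if (hi != 0) {
        return snprintf(buf, len, "0x%" PRIx64 "%016" PRIx64, hi, lo);
    }
    return snprintf(buf, len, "0x%" PRIx64, lo);
}

/*
 * Hypothetical replacement for the removed check: same comparison as the
 * original, but the message is built without a narrowing conversion.
 */
static bool section_out_of_range(u128 llend, uint64_t iova_max)
{
    char end[40]; /* "0x" + 32 hex digits + NUL fits comfortably */

    if (llend <= (u128)iova_max) {
        return false;
    }
    u128_format_hex(end, sizeof(end), llend);
    fprintf(stderr, "RAM section out of device range (max=0x%" PRIx64
            ", end addr=%s)\n", iova_max, end);
    return true;
}
```

In QEMU itself the two halves would presumably come from int128_gethi() and int128_getlo() rather than from shifts on a native type.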


> 
> >
> > > @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> > >      v->iotlb_batch_begin_sent = false;
> > >  }
> > >
> > > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > +{
> > > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > > +
> > > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > > +    struct vhost_vdpa *v = iommu->dev;
> > > +    void *vaddr;
> > > +    int ret;
> > > +
> > > +    if (iotlb->target_as != &address_space_memory) {
> > > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > > +        return;
> > > +    }
> > > +    RCU_READ_LOCK_GUARD();
> > > +    vhost_vdpa_iotlb_batch_begin_once(v);
> > > +
> > > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > > +        bool read_only;
> > > +
> > > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > > +            return;
> > > +        }
> > > +        ret =
> > > +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> > > +        if (ret) {
> > > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > > +        }
> > > +    } else {
> > > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > > +        if (ret) {
> > > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > > +                         v, iova, iotlb->addr_mask + 1, ret);
> > > +        }
> > > +    }
> > > +}
> > > +
> > > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > > +                                        MemoryRegionSection *section)
> > > +{
> > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > +
> > > +    struct vdpa_iommu *iommu;
> > > +    Int128 end;
> > > +    int iommu_idx;
> > > +    IOMMUMemoryRegion *iommu_mr;
> > > +    int ret;
> > > +
> > > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > > +
> > > +    iommu = g_malloc0(sizeof(*iommu));
> > > +    end =  int128_add(int128_make64(section->offset_within_region),
> > > +            section->size);
> > > +    end = int128_sub(end, int128_one());
> > > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > > +            MEMTXATTRS_UNSPECIFIED);
> > > +
> > > +    iommu->iommu_mr = iommu_mr;
> > > +
> > > +    iommu_notifier_init(
> > > +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> > > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > > +    iommu->iommu_offset =
> > > +        section->offset_within_address_space - section->offset_within_region;
> > > +    iommu->dev = v;
> > > +
> > > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > > +    if (ret) {
> > > +        g_free(iommu);
> > > +        return;
> > > +    }
> > > +
> > > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > > +
> > > +    return;
> > > +}
> > > +
> > > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > > +                                        MemoryRegionSection *section)
> > > +{
> > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > +
> > > +    struct vdpa_iommu *iommu;
> > > +
> > > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > > +    {
> > > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > > +            iommu->n.start == section->offset_within_region) {
> > > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > > +            QLIST_REMOVE(iommu, iommu_next);
> > > +            g_free(iommu);
> > > +            break;
> > > +        }
> > > +    }
> > > +}
> > > +
> > >  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > >                                             MemoryRegionSection *section)
> > >  {
> > > @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > >                                              v->iova_range.last)) {
> > >          return;
> > >      }
> > > +    if (memory_region_is_iommu(section->mr)) {
> > > +        vhost_vdpa_iommu_region_add(listener, section);
> > > +        return;
> > > +    }
> > >
> > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > >                                              v->iova_range.last)) {
> > >          return;
> > >      }
> > > +    if (memory_region_is_iommu(section->mr)) {
> > > +        vhost_vdpa_iommu_region_del(listener, section);
> > > +        return;
> > > +    }
> > >
> > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > >      .region_del = vhost_vdpa_listener_region_del,
> > >  };
> > >
> > > +
> > >  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > >                               void *arg)
> > >  {
> >
> >
> > This change is not necessary.
> >
> will fix this
> > > @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> > >      v = dev->opaque;
> > >      trace_vhost_vdpa_cleanup(dev, v);
> > >      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> > > -    memory_listener_unregister(&v->listener);
> > >      vhost_vdpa_svq_cleanup(dev);
> > >
> > >      dev->opaque = NULL;
> > > @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > >      }
> > >
> > >      if (started) {
> > > -        memory_listener_register(&v->listener, &address_space_memory);
> > > +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> > > +
> > >          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > >      } else {
> > >          vhost_vdpa_reset_device(dev);
> > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > index d10a89303e..64a46e37cb 100644
> > > --- a/include/hw/virtio/vhost-vdpa.h
> > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
> > >      void *shadow_vq_ops_opaque;
> > >      struct vhost_dev *dev;
> > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> > > +    IOMMUNotifier n;
> > >  } VhostVDPA;
> > >
> > > +struct vdpa_iommu {
> > > +    struct vhost_vdpa *dev;
> > > +    IOMMUMemoryRegion *iommu_mr;
> > > +    hwaddr iommu_offset;
> > > +    IOMMUNotifier n;
> > > +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> > > +};
> > > +
> >
> > You need to add a typedef as per coding style.
> >
> will fix this
> Thanks
> Cindy
> > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > >                         void *vaddr, bool readonly);
> > >  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > > --
> > > 2.34.3
> >



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU
  2022-10-31  7:20       ` Michael S. Tsirkin
@ 2022-10-31  8:30         ` Cindy Lu
  2022-10-31 12:56           ` Cindy Lu
  0 siblings, 1 reply; 15+ messages in thread
From: Cindy Lu @ 2022-10-31  8:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, 31 Oct 2022 at 15:20, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Oct 31, 2022 at 03:15:14PM +0800, Cindy Lu wrote:
> > ,
> >
> >
> > On Mon, 31 Oct 2022 at 15:04, Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> > > > Add support for vIOMMU. add the new function to deal with iommu MR.
> > > > - during iommu_region_add register a specific IOMMU notifier,
> > > >  and store all notifiers in a list.
> > > > - during iommu_region_del, compare and delete the IOMMU notifier from the list
> > > >
> > > > Verified in vp_vdpa and vdpa_sim_net driver
> > > >
> > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > ---
> > > >  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
> > > >  include/hw/virtio/vhost-vdpa.h |  10 +++
> > > >  2 files changed, 122 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > index 3ff9ce3501..dcfaaccfa9 100644
> > > > --- a/hw/virtio/vhost-vdpa.c
> > > > +++ b/hw/virtio/vhost-vdpa.c
> > > > @@ -26,6 +26,7 @@
> > > >  #include "cpu.h"
> > > >  #include "trace.h"
> > > >  #include "qapi/error.h"
> > > > +#include "hw/virtio/virtio-access.h"
> > > >
> > > >  /*
> > > >   * Return one past the end of the end of section. Be careful with uint64_t
> > > > @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > >                                                  uint64_t iova_min,
> > > >                                                  uint64_t iova_max)
> > > >  {
> > > > -    Int128 llend;
> > > >
> > > >      if ((!memory_region_is_ram(section->mr) &&
> > > >           !memory_region_is_iommu(section->mr)) ||
> > > > @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > >          return true;
> > > >      }
> > > >
> > > > -    llend = vhost_vdpa_section_end(section);
> > > > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > > > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > > > -                     ", end addr=0x%" PRIx64 ")",
> > > > -                     iova_max, int128_get64(llend));
> > > > -        return true;
> > > > -    }
> > > > -
> > > >      return false;
> > > >  }
> > > >
> > >
> > > I couldn't figure out we are completely removing this.
> > > So this function is now checking iova_min but not iova_max?
> > > Seems asymmetrical.
> > >
> > this is because this is an asset for int128_get64,So I just not jump
> > this part of the check,
> > static inline uint64_t int128_get64(Int128 a)
> > {
> >     uint64_t r = a;
> >     assert(r == a);
> >     return r;
> > }
>
>
> ?
>
> Could not parse this. You mean assert? And removing functionality
> because you don't like an error message does not make sense.
> So find another way to print it?
>
Sorry for my mistake here.
I removed this part because it reports an error when an IOMMU MR is added.
Also, there is no such check in vfio.
It seems llend is always smaller than iova_max in the IOMMU domain.
Could we remove it for now? I will add more comments later.
Thanks
cindy
>
> >
> > >
> > > > @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> > > >      v->iotlb_batch_begin_sent = false;
> > > >  }
> > > >
> > > > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > > +{
> > > > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > > > +
> > > > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > > > +    struct vhost_vdpa *v = iommu->dev;
> > > > +    void *vaddr;
> > > > +    int ret;
> > > > +
> > > > +    if (iotlb->target_as != &address_space_memory) {
> > > > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > > > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > > > +        return;
> > > > +    }
> > > > +    RCU_READ_LOCK_GUARD();
> > > > +    vhost_vdpa_iotlb_batch_begin_once(v);
> > > > +
> > > > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > > > +        bool read_only;
> > > > +
> > > > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > > > +            return;
> > > > +        }
> > > > +        ret =
> > > > +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> > > > +        if (ret) {
> > > > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > > > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > > > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > > > +        }
> > > > +    } else {
> > > > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > > > +        if (ret) {
> > > > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > > > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > > > +                         v, iova, iotlb->addr_mask + 1, ret);
> > > > +        }
> > > > +    }
> > > > +}
> > > > +
> > > > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > > > +                                        MemoryRegionSection *section)
> > > > +{
> > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > +
> > > > +    struct vdpa_iommu *iommu;
> > > > +    Int128 end;
> > > > +    int iommu_idx;
> > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > +    int ret;
> > > > +
> > > > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > > > +
> > > > +    iommu = g_malloc0(sizeof(*iommu));
> > > > +    end =  int128_add(int128_make64(section->offset_within_region),
> > > > +            section->size);
> > > > +    end = int128_sub(end, int128_one());
> > > > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > > > +            MEMTXATTRS_UNSPECIFIED);
> > > > +
> > > > +    iommu->iommu_mr = iommu_mr;
> > > > +
> > > > +    iommu_notifier_init(
> > > > +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> > > > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > > > +    iommu->iommu_offset =
> > > > +        section->offset_within_address_space - section->offset_within_region;
> > > > +    iommu->dev = v;
> > > > +
> > > > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > > > +    if (ret) {
> > > > +        g_free(iommu);
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > > > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > > > +
> > > > +    return;
> > > > +}
> > > > +
> > > > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > > > +                                        MemoryRegionSection *section)
> > > > +{
> > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > +
> > > > +    struct vdpa_iommu *iommu;
> > > > +
> > > > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > > > +    {
> > > > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > > > +            iommu->n.start == section->offset_within_region) {
> > > > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > > > +            QLIST_REMOVE(iommu, iommu_next);
> > > > +            g_free(iommu);
> > > > +            break;
> > > > +        }
> > > > +    }
> > > > +}
> > > > +
> > > >  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > >                                             MemoryRegionSection *section)
> > > >  {
> > > > @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > >                                              v->iova_range.last)) {
> > > >          return;
> > > >      }
> > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > +        vhost_vdpa_iommu_region_add(listener, section);
> > > > +        return;
> > > > +    }
> > > >
> > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > > >                                              v->iova_range.last)) {
> > > >          return;
> > > >      }
> > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > +        vhost_vdpa_iommu_region_del(listener, section);
> > > > +        return;
> > > > +    }
> > > >
> > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > > >      .region_del = vhost_vdpa_listener_region_del,
> > > >  };
> > > >
> > > > +
> > > >  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > >                               void *arg)
> > > >  {
> > >
> > >
> > > This change is not necessary.
> > >
> > will fix this
> > > > @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> > > >      v = dev->opaque;
> > > >      trace_vhost_vdpa_cleanup(dev, v);
> > > >      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> > > > -    memory_listener_unregister(&v->listener);
> > > >      vhost_vdpa_svq_cleanup(dev);
> > > >
> > > >      dev->opaque = NULL;
> > > > @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > > >      }
> > > >
> > > >      if (started) {
> > > > -        memory_listener_register(&v->listener, &address_space_memory);
> > > > +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> > > > +
> > > >          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > > >      } else {
> > > >          vhost_vdpa_reset_device(dev);
> > > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > > index d10a89303e..64a46e37cb 100644
> > > > --- a/include/hw/virtio/vhost-vdpa.h
> > > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > > @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
> > > >      void *shadow_vq_ops_opaque;
> > > >      struct vhost_dev *dev;
> > > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > > +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> > > > +    IOMMUNotifier n;
> > > >  } VhostVDPA;
> > > >
> > > > +struct vdpa_iommu {
> > > > +    struct vhost_vdpa *dev;
> > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > +    hwaddr iommu_offset;
> > > > +    IOMMUNotifier n;
> > > > +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> > > > +};
> > > > +
> > >
> > > You need to add a typedef as per coding style.
> > >
> > will fix this
> > Thanks
> > Cindy
> > > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > > >                         void *vaddr, bool readonly);
> > > >  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > > > --
> > > > 2.34.3
> > >
>



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU
  2022-10-31  8:30         ` Cindy Lu
@ 2022-10-31 12:56           ` Cindy Lu
  2022-10-31 13:07             ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Cindy Lu @ 2022-10-31 12:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, 31 Oct 2022 at 16:30, Cindy Lu <lulu@redhat.com> wrote:
>
> On Mon, 31 Oct 2022 at 15:20, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Oct 31, 2022 at 03:15:14PM +0800, Cindy Lu wrote:
> > > ,
> > >
> > >
> > > On Mon, 31 Oct 2022 at 15:04, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> > > > > Add support for vIOMMU. add the new function to deal with iommu MR.
> > > > > - during iommu_region_add register a specific IOMMU notifier,
> > > > >  and store all notifiers in a list.
> > > > > - during iommu_region_del, compare and delete the IOMMU notifier from the list
> > > > >
> > > > > Verified in vp_vdpa and vdpa_sim_net driver
> > > > >
> > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > ---
> > > > >  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
> > > > >  include/hw/virtio/vhost-vdpa.h |  10 +++
> > > > >  2 files changed, 122 insertions(+), 11 deletions(-)
> > > > >
> > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > > index 3ff9ce3501..dcfaaccfa9 100644
> > > > > --- a/hw/virtio/vhost-vdpa.c
> > > > > +++ b/hw/virtio/vhost-vdpa.c
> > > > > @@ -26,6 +26,7 @@
> > > > >  #include "cpu.h"
> > > > >  #include "trace.h"
> > > > >  #include "qapi/error.h"
> > > > > +#include "hw/virtio/virtio-access.h"
> > > > >
> > > > >  /*
> > > > >   * Return one past the end of the end of section. Be careful with uint64_t
> > > > > @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > > >                                                  uint64_t iova_min,
> > > > >                                                  uint64_t iova_max)
> > > > >  {
> > > > > -    Int128 llend;
> > > > >
> > > > >      if ((!memory_region_is_ram(section->mr) &&
> > > > >           !memory_region_is_iommu(section->mr)) ||
> > > > > @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > > >          return true;
> > > > >      }
> > > > >
> > > > > -    llend = vhost_vdpa_section_end(section);
> > > > > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > > > > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > > > > -                     ", end addr=0x%" PRIx64 ")",
> > > > > -                     iova_max, int128_get64(llend));
> > > > > -        return true;
> > > > > -    }
> > > > > -
> > > > >      return false;
> > > > >  }
> > > > >
> > > >
> > > > I couldn't figure out we are completely removing this.
> > > > So this function is now checking iova_min but not iova_max?
> > > > Seems asymmetrical.
> > > >
> > > this is because this is an asset for int128_get64,So I just not jump
> > > this part of the check,
> > > static inline uint64_t int128_get64(Int128 a)
> > > {
> > >     uint64_t r = a;
> > >     assert(r == a);
> > >     return r;
> > > }
> >
> >
> > ?
> >
> > Could not parse this. You mean assert? And removing functionality
> > because you don't like an error message does not make sense.
> > So find another way to print it?
> >
> sorry for my mistake here
> for this part, I remove this because it will report error in iommu mr added
> Also there is no such check in vfio,
> seems the llend is always small than iov_max in iommu domain,
> not sure we can remove it first and I will add more comments later ?
> Thanks
> cindy
Sorry, what I meant is that llend is larger than iova_max here, so the
IOMMU MR cannot pass the check. Could we remove this check first?
Thanks
Cindy
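
To illustrate the problem described above, here is a minimal standalone
sketch, not the QEMU code. It assumes a host compiler with `__int128`
(GCC or Clang on a 64-bit target) and assumes the IOMMU MR section covers
the full 64-bit space, so its end is 2^64. The names `u128`,
`u128_to_u64_checked` and `section_end` are hypothetical stand-ins for
Int128, int128_get64() and vhost_vdpa_section_end().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's Int128 (assumption: the compiler provides __int128). */
typedef unsigned __int128 u128;

/*
 * Hypothetical non-asserting counterpart of int128_get64(): returns false
 * instead of aborting when the value does not fit in 64 bits.
 */
static bool u128_to_u64_checked(u128 a, uint64_t *out)
{
    uint64_t r = (uint64_t)a;

    if ((u128)r != a) {
        return false;   /* the narrowing lost bits, e.g. a == 2^64 */
    }
    *out = r;
    return true;
}

/* One past the last byte of a section; simplified, no page alignment. */
static u128 section_end(uint64_t offset, u128 size)
{
    return (u128)offset + size;
}
```

With a section of size 2^64 starting at 0, `section_end()` returns 2^64.
That value is greater than any 64-bit iova_max, and it also cannot be
narrowed to uint64_t, which is the condition the assert in int128_get64()
guards against.
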
> >
> > >
> > > >
> > > > > @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> > > > >      v->iotlb_batch_begin_sent = false;
> > > > >  }
> > > > >
> > > > > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > > > +{
> > > > > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > > > > +
> > > > > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > > > > +    struct vhost_vdpa *v = iommu->dev;
> > > > > +    void *vaddr;
> > > > > +    int ret;
> > > > > +
> > > > > +    if (iotlb->target_as != &address_space_memory) {
> > > > > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > > > > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > > > > +        return;
> > > > > +    }
> > > > > +    RCU_READ_LOCK_GUARD();
> > > > > +    vhost_vdpa_iotlb_batch_begin_once(v);
> > > > > +
> > > > > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > > > > +        bool read_only;
> > > > > +
> > > > > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > > > > +            return;
> > > > > +        }
> > > > > +        ret =
> > > > > +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> > > > > +        if (ret) {
> > > > > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > > > > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > > > > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > > > > +        }
> > > > > +    } else {
> > > > > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > > > > +        if (ret) {
> > > > > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > > > > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > > > > +                         v, iova, iotlb->addr_mask + 1, ret);
> > > > > +        }
> > > > > +    }
> > > > > +}
> > > > > +
> > > > > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > > > > +                                        MemoryRegionSection *section)
> > > > > +{
> > > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > > +
> > > > > +    struct vdpa_iommu *iommu;
> > > > > +    Int128 end;
> > > > > +    int iommu_idx;
> > > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > > +    int ret;
> > > > > +
> > > > > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > > > > +
> > > > > +    iommu = g_malloc0(sizeof(*iommu));
> > > > > +    end =  int128_add(int128_make64(section->offset_within_region),
> > > > > +            section->size);
> > > > > +    end = int128_sub(end, int128_one());
> > > > > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > > > > +            MEMTXATTRS_UNSPECIFIED);
> > > > > +
> > > > > +    iommu->iommu_mr = iommu_mr;
> > > > > +
> > > > > +    iommu_notifier_init(
> > > > > +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> > > > > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > > > > +    iommu->iommu_offset =
> > > > > +        section->offset_within_address_space - section->offset_within_region;
> > > > > +    iommu->dev = v;
> > > > > +
> > > > > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > > > > +    if (ret) {
> > > > > +        g_free(iommu);
> > > > > +        return;
> > > > > +    }
> > > > > +
> > > > > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > > > > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > > > > +
> > > > > +    return;
> > > > > +}
> > > > > +
> > > > > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > > > > +                                        MemoryRegionSection *section)
> > > > > +{
> > > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > > +
> > > > > +    struct vdpa_iommu *iommu;
> > > > > +
> > > > > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > > > > +    {
> > > > > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > > > > +            iommu->n.start == section->offset_within_region) {
> > > > > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > > > > +            QLIST_REMOVE(iommu, iommu_next);
> > > > > +            g_free(iommu);
> > > > > +            break;
> > > > > +        }
> > > > > +    }
> > > > > +}
> > > > > +
> > > > >  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > > >                                             MemoryRegionSection *section)
> > > > >  {
> > > > > @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > > >                                              v->iova_range.last)) {
> > > > >          return;
> > > > >      }
> > > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > > +        vhost_vdpa_iommu_region_add(listener, section);
> > > > > +        return;
> > > > > +    }
> > > > >
> > > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > > @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > > > >                                              v->iova_range.last)) {
> > > > >          return;
> > > > >      }
> > > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > > +        vhost_vdpa_iommu_region_del(listener, section);
> > > > > +        return;
> > > > > +    }
> > > > >
> > > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > > @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > > > >      .region_del = vhost_vdpa_listener_region_del,
> > > > >  };
> > > > >
> > > > > +
> > > > >  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > > >                               void *arg)
> > > > >  {
> > > >
> > > >
> > > > This change is not necessary.
> > > >
> > > will fix this
> > > > > @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> > > > >      v = dev->opaque;
> > > > >      trace_vhost_vdpa_cleanup(dev, v);
> > > > >      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> > > > > -    memory_listener_unregister(&v->listener);
> > > > >      vhost_vdpa_svq_cleanup(dev);
> > > > >
> > > > >      dev->opaque = NULL;
> > > > > @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > > > >      }
> > > > >
> > > > >      if (started) {
> > > > > -        memory_listener_register(&v->listener, &address_space_memory);
> > > > > +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> > > > > +
> > > > >          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > > > >      } else {
> > > > >          vhost_vdpa_reset_device(dev);
> > > > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > > > index d10a89303e..64a46e37cb 100644
> > > > > --- a/include/hw/virtio/vhost-vdpa.h
> > > > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > > > @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
> > > > >      void *shadow_vq_ops_opaque;
> > > > >      struct vhost_dev *dev;
> > > > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > > > +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> > > > > +    IOMMUNotifier n;
> > > > >  } VhostVDPA;
> > > > >
> > > > > +struct vdpa_iommu {
> > > > > +    struct vhost_vdpa *dev;
> > > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > > +    hwaddr iommu_offset;
> > > > > +    IOMMUNotifier n;
> > > > > +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> > > > > +};
> > > > > +
> > > >
> > > > You need to add a typedef as per coding style.
> > > >
> > > will fix this
> > > Thanks
> > > Cindy
> > > > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > > > >                         void *vaddr, bool readonly);
> > > > >  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > > > > --
> > > > > 2.34.3
> > > >
> >




* Re: [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU
  2022-10-31 12:56           ` Cindy Lu
@ 2022-10-31 13:07             ` Michael S. Tsirkin
  0 siblings, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2022-10-31 13:07 UTC (permalink / raw)
  To: Cindy Lu
  Cc: alex.williamson, jasowang, pbonzini, peterx, david, f4bug,
	sgarzare, qemu-devel

On Mon, Oct 31, 2022 at 08:56:22PM +0800, Cindy Lu wrote:
> On Mon, 31 Oct 2022 at 16:30, Cindy Lu <lulu@redhat.com> wrote:
> >
> > On Mon, 31 Oct 2022 at 15:20, Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Oct 31, 2022 at 03:15:14PM +0800, Cindy Lu wrote:
> > > > ,
> > > >
> > > >
> > > > On Mon, 31 Oct 2022 at 15:04, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Oct 31, 2022 at 11:10:20AM +0800, Cindy Lu wrote:
> > > > > > Add support for vIOMMU. add the new function to deal with iommu MR.
> > > > > > - during iommu_region_add register a specific IOMMU notifier,
> > > > > >  and store all notifiers in a list.
> > > > > > - during iommu_region_del, compare and delete the IOMMU notifier from the list
> > > > > >
> > > > > > Verified in vp_vdpa and vdpa_sim_net driver
> > > > > >
> > > > > > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > > > > > ---
> > > > > >  hw/virtio/vhost-vdpa.c         | 123 ++++++++++++++++++++++++++++++---
> > > > > >  include/hw/virtio/vhost-vdpa.h |  10 +++
> > > > > >  2 files changed, 122 insertions(+), 11 deletions(-)
> > > > > >
> > > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > > > index 3ff9ce3501..dcfaaccfa9 100644
> > > > > > --- a/hw/virtio/vhost-vdpa.c
> > > > > > +++ b/hw/virtio/vhost-vdpa.c
> > > > > > @@ -26,6 +26,7 @@
> > > > > >  #include "cpu.h"
> > > > > >  #include "trace.h"
> > > > > >  #include "qapi/error.h"
> > > > > > +#include "hw/virtio/virtio-access.h"
> > > > > >
> > > > > >  /*
> > > > > >   * Return one past the end of the end of section. Be careful with uint64_t
> > > > > > @@ -44,7 +45,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > > > >                                                  uint64_t iova_min,
> > > > > >                                                  uint64_t iova_max)
> > > > > >  {
> > > > > > -    Int128 llend;
> > > > > >
> > > > > >      if ((!memory_region_is_ram(section->mr) &&
> > > > > >           !memory_region_is_iommu(section->mr)) ||
> > > > > > @@ -61,14 +61,6 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> > > > > >          return true;
> > > > > >      }
> > > > > >
> > > > > > -    llend = vhost_vdpa_section_end(section);
> > > > > > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > > > > > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > > > > > -                     ", end addr=0x%" PRIx64 ")",
> > > > > > -                     iova_max, int128_get64(llend));
> > > > > > -        return true;
> > > > > > -    }
> > > > > > -
> > > > > >      return false;
> > > > > >  }
> > > > > >
> > > > >
> > > > > I couldn't figure out we are completely removing this.
> > > > > So this function is now checking iova_min but not iova_max?
> > > > > Seems asymmetrical.
> > > > >
> > > > this is because this is an asset for int128_get64,So I just not jump
> > > > this part of the check,
> > > > static inline uint64_t int128_get64(Int128 a)
> > > > {
> > > >     uint64_t r = a;
> > > >     assert(r == a);
> > > >     return r;
> > > > }
> > >
> > >
> > > ?
> > >
> > > Could not parse this. You mean assert? And removing functionality
> > > because you don't like an error message does not make sense.
> > > So find another way to print it?
> > >
> > sorry for my mistake here
> > for this part, I remove this because it will report error in iommu mr added
> > Also there is no such check in vfio,
> > seems the llend is always small than iov_max in iommu domain,
> > not sure we can remove it first and I will add more comments later ?
> > Thanks
> > cindy
> Sorry, what I meant is that llend is larger than iova_max here, so the
> IOMMU MR cannot pass the check. Could we remove this check first?
> Thanks
> Cindy


Yes, split it out with proper documentation first.
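
Purely as a sketch of one shape such a split-out change could take: keep
the iova_max check for RAM sections and document why IOMMU sections bypass
it. This is an assumption for illustration, not something decided in this
thread. `struct section_info` and `section_skipped` are hypothetical,
simplified stand-ins for MemoryRegionSection and
vhost_vdpa_listener_skipped_section(), and `__int128` support is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's Int128 (assumption: the compiler provides __int128). */
typedef unsigned __int128 u128;

/* Hypothetical, simplified description of a memory section. */
struct section_info {
    bool is_iommu;      /* true for an IOMMU memory region */
    uint64_t start;     /* offset within the address space */
    u128 size;          /* may be 2^64 for a full-range region */
};

/* Returns true when the listener should ignore the section. */
static bool section_skipped(const struct section_info *s,
                            uint64_t iova_min, uint64_t iova_max)
{
    if (s->start < iova_min) {
        return true;
    }

    /*
     * An IOMMU section only leads to a notifier registration; the actual
     * mappings arrive later through that notifier. Its end can be 2^64,
     * so the upper-bound check is not applied to it here.
     */
    if (s->is_iommu) {
        return false;
    }

    /* Same comparison as the removed hunk: section end against iova_max. */
    u128 llend = (u128)s->start + s->size;
    return llend > (u128)iova_max;
}
```

The comparison stays in 128-bit arithmetic and never narrows llend, so no
assertion can fire while evaluating it.
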

> > >
> > > >
> > > > >
> > > > > > @@ -173,6 +165,106 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> > > > > >      v->iotlb_batch_begin_sent = false;
> > > > > >  }
> > > > > >
> > > > > > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > > > > > +{
> > > > > > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > > > > > +
> > > > > > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > > > > > +    struct vhost_vdpa *v = iommu->dev;
> > > > > > +    void *vaddr;
> > > > > > +    int ret;
> > > > > > +
> > > > > > +    if (iotlb->target_as != &address_space_memory) {
> > > > > > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > > > > > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > > > > > +        return;
> > > > > > +    }
> > > > > > +    RCU_READ_LOCK_GUARD();
> > > > > > +    vhost_vdpa_iotlb_batch_begin_once(v);
> > > > > > +
> > > > > > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > > > > > +        bool read_only;
> > > > > > +
> > > > > > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > > > > > +            return;
> > > > > > +        }
> > > > > > +        ret =
> > > > > > +            vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr, read_only);
> > > > > > +        if (ret) {
> > > > > > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > > > > > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > > > > > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > > > > > +        }
> > > > > > +    } else {
> > > > > > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > > > > > +        if (ret) {
> > > > > > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > > > > > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > > > > > +                         v, iova, iotlb->addr_mask + 1, ret);
> > > > > > +        }
> > > > > > +    }
> > > > > > +}
> > > > > > +
> > > > > > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > > > > > +                                        MemoryRegionSection *section)
> > > > > > +{
> > > > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > > > +
> > > > > > +    struct vdpa_iommu *iommu;
> > > > > > +    Int128 end;
> > > > > > +    int iommu_idx;
> > > > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > > > +    int ret;
> > > > > > +
> > > > > > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > > > > > +
> > > > > > +    iommu = g_malloc0(sizeof(*iommu));
> > > > > > +    end =  int128_add(int128_make64(section->offset_within_region),
> > > > > > +            section->size);
> > > > > > +    end = int128_sub(end, int128_one());
> > > > > > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > > > > > +            MEMTXATTRS_UNSPECIFIED);
> > > > > > +
> > > > > > +    iommu->iommu_mr = iommu_mr;
> > > > > > +
> > > > > > +    iommu_notifier_init(
> > > > > > +        &iommu->n, vhost_vdpa_iommu_map_notify, IOMMU_NOTIFIER_IOTLB_EVENTS,
> > > > > > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > > > > > +    iommu->iommu_offset =
> > > > > > +        section->offset_within_address_space - section->offset_within_region;
> > > > > > +    iommu->dev = v;
> > > > > > +
> > > > > > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > > > > > +    if (ret) {
> > > > > > +        g_free(iommu);
> > > > > > +        return;
> > > > > > +    }
> > > > > > +
> > > > > > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > > > > > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > > > > > +
> > > > > > +    return;
> > > > > > +}
> > > > > > +
> > > > > > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > > > > > +                                        MemoryRegionSection *section)
> > > > > > +{
> > > > > > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > > > > > +
> > > > > > +    struct vdpa_iommu *iommu;
> > > > > > +
> > > > > > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > > > > > +    {
> > > > > > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > > > > > +            iommu->n.start == section->offset_within_region) {
> > > > > > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > > > > > +            QLIST_REMOVE(iommu, iommu_next);
> > > > > > +            g_free(iommu);
> > > > > > +            break;
> > > > > > +        }
> > > > > > +    }
> > > > > > +}
> > > > > > +
> > > > > >  static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > > > >                                             MemoryRegionSection *section)
> > > > > >  {
> > > > > > @@ -186,6 +278,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > > > > >                                              v->iova_range.last)) {
> > > > > >          return;
> > > > > >      }
> > > > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > > > +        vhost_vdpa_iommu_region_add(listener, section);
> > > > > > +        return;
> > > > > > +    }
> > > > > >
> > > > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > > > @@ -260,6 +356,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > > > > >                                              v->iova_range.last)) {
> > > > > >          return;
> > > > > >      }
> > > > > > +    if (memory_region_is_iommu(section->mr)) {
> > > > > > +        vhost_vdpa_iommu_region_del(listener, section);
> > > > > > +        return;
> > > > > > +    }
> > > > > >
> > > > > >      if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > > > > >                   (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > > > > > @@ -312,6 +412,7 @@ static const MemoryListener vhost_vdpa_memory_listener = {
> > > > > >      .region_del = vhost_vdpa_listener_region_del,
> > > > > >  };
> > > > > >
> > > > > > +
> > > > > >  static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > > > > >                               void *arg)
> > > > > >  {
> > > > >
> > > > >
> > > > > This change is not necessary.
> > > > >
> > > > will fix this
> > > > > > @@ -587,7 +688,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> > > > > >      v = dev->opaque;
> > > > > >      trace_vhost_vdpa_cleanup(dev, v);
> > > > > >      vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> > > > > > -    memory_listener_unregister(&v->listener);
> > > > > >      vhost_vdpa_svq_cleanup(dev);
> > > > > >
> > > > > >      dev->opaque = NULL;
> > > > > > @@ -1127,7 +1227,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > > > > >      }
> > > > > >
> > > > > >      if (started) {
> > > > > > -        memory_listener_register(&v->listener, &address_space_memory);
> > > > > > +        memory_listener_register(&v->listener, dev->vdev->dma_as);
> > > > > > +
> > > > > >          return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > >      } else {
> > > > > >          vhost_vdpa_reset_device(dev);
> > > > > > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > > > > > index d10a89303e..64a46e37cb 100644
> > > > > > --- a/include/hw/virtio/vhost-vdpa.h
> > > > > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > > > > @@ -41,8 +41,18 @@ typedef struct vhost_vdpa {
> > > > > >      void *shadow_vq_ops_opaque;
> > > > > >      struct vhost_dev *dev;
> > > > > >      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > > > > > +    QLIST_HEAD(, vdpa_iommu) iommu_list;
> > > > > > +    IOMMUNotifier n;
> > > > > >  } VhostVDPA;
> > > > > >
> > > > > > +struct vdpa_iommu {
> > > > > > +    struct vhost_vdpa *dev;
> > > > > > +    IOMMUMemoryRegion *iommu_mr;
> > > > > > +    hwaddr iommu_offset;
> > > > > > +    IOMMUNotifier n;
> > > > > > +    QLIST_ENTRY(vdpa_iommu) iommu_next;
> > > > > > +};
> > > > > > +
> > > > >
> > > > > You need to add a typedef as per coding style.
> > > > >
> > > > will fix this
> > > > Thanks
> > > > Cindy
> > > > > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > > > > >                         void *vaddr, bool readonly);
> > > > > >  int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
> > > > > > --
> > > > > > 2.34.3
> > > > >
> > >




Thread overview: 15+ messages
2022-10-31  3:10 [PATCH v9 0/2] vhost-vdpa: add support for vIOMMU Cindy Lu
2022-10-31  3:10 ` [PATCH v9 1/2] vfio: move implement of vfio_get_xlat_addr() to memory.c Cindy Lu
2022-10-31  3:20   ` Alex Williamson
2022-10-31  6:38   ` Michael S. Tsirkin
2022-10-31  6:44     ` Cindy Lu
2022-10-31  6:54       ` Michael S. Tsirkin
2022-10-31  6:56         ` Cindy Lu
2022-10-31  7:07           ` Michael S. Tsirkin
2022-10-31  3:10 ` [PATCH v9 2/2] vhost-vdpa: add support for vIOMMU Cindy Lu
2022-10-31  7:04   ` Michael S. Tsirkin
2022-10-31  7:15     ` Cindy Lu
2022-10-31  7:20       ` Michael S. Tsirkin
2022-10-31  8:30         ` Cindy Lu
2022-10-31 12:56           ` Cindy Lu
2022-10-31 13:07             ` Michael S. Tsirkin
