* [PATCH v12 0/1] vhost-vdpa: add support for vIOMMU
@ 2022-12-09 13:08 Cindy Lu
  2022-12-09 13:08 ` [PATCH v12 1/1] " Cindy Lu
  2022-12-20 14:33 ` [PATCH v12 0/1] " Michael S. Tsirkin
  0 siblings, 2 replies; 6+ messages in thread
From: Cindy Lu @ 2022-12-09 13:08 UTC (permalink / raw)
  To: lulu, jasowang, mst; +Cc: qemu-devel

These patches add support for vIOMMU in the vhost-vdpa device.
Verified with the vp_vdpa/vdpa_sim_net drivers and the intel_iommu/
virtio-iommu devices.
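
A typical test invocation looks roughly like the following (the
vhost-vdpa device node and option details are illustrative, not the
exact command line used):

  qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
      -device intel-iommu,intremap=on,device-iotlb=on \
      -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vdpa0 \
      -device virtio-net-pci,netdev=vdpa0,iommu_platform=on,disable-legacy=on \
      ...

For the virtio-iommu case, replace the intel-iommu device with
virtio-iommu-pci.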

changes in V3
1. Move the function vfio_get_xlat_addr to memory.c
2. Use the existing memory listener; when the MR is an
iommu MR, call the functions iommu_region_add/
iommu_region_del

changes in V4
1. Make the comments in vfio_get_xlat_addr more general

changes in V5
1. Address the comments in the last version
2. Add a new argument to the function vfio_get_xlat_addr, which shows whether
the memory is backed by a discard manager, so the device can have its
own warning.

changes in V6
Move the error_report for the unpopulated discard back to
memory_get_xlat_addr

changes in V7
Reword the error message to avoid duplicate information

changes in V8
Reorganize the code following the comments in the last version

changes in V9
Reorganize the code following the comments

changes in V10
Address the comments

changes in V11
Address the comments
Fix the crash found in testing

changes in V12
Address the comments, squash patch 1 into the next patch
Improve the code style issues

Cindy Lu (1):
  vhost-vdpa: add support for vIOMMU

 hw/virtio/vhost-vdpa.c         | 162 ++++++++++++++++++++++++++++++---
 include/hw/virtio/vhost-vdpa.h |  10 ++
 2 files changed, 161 insertions(+), 11 deletions(-)

-- 
2.34.3




* [PATCH v12 1/1] vhost-vdpa: add support for vIOMMU
  2022-12-09 13:08 [PATCH v12 0/1] vhost-vdpa: add support for vIOMMU Cindy Lu
@ 2022-12-09 13:08 ` Cindy Lu
  2022-12-13  8:17   ` Jason Wang
  2022-12-20 14:33 ` [PATCH v12 0/1] " Michael S. Tsirkin
  1 sibling, 1 reply; 6+ messages in thread
From: Cindy Lu @ 2022-12-09 13:08 UTC (permalink / raw)
  To: lulu, jasowang, mst; +Cc: qemu-devel

1. Skip the iova_max check in vhost_vdpa_listener_skipped_section() when
the MR is an IOMMU MR; move this check to vhost_vdpa_iommu_map_notify().
2. Add support for vIOMMU.
Add new functions to deal with IOMMU MRs:
- during iommu_region_add, register a dedicated IOMMU notifier
  and store all notifiers in a list.
- during iommu_region_del, look up the matching IOMMU notifier in the list
  and delete it.

Verified with the vp_vdpa and vdpa_sim_net drivers.

Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 hw/virtio/vhost-vdpa.c         | 162 ++++++++++++++++++++++++++++++---
 include/hw/virtio/vhost-vdpa.h |  10 ++
 2 files changed, 161 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7468e44b87..2b3920c2a1 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -26,6 +26,7 @@
 #include "cpu.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "hw/virtio/virtio-access.h"
 
 /*
  * Return one past the end of the end of section. Be careful with uint64_t
@@ -60,15 +61,22 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
                      iova_min, section->offset_within_address_space);
         return true;
     }
+    /*
+     * While using vIOMMU, the section may be larger than iova_max, but the
+     * memory that is actually mapped is smaller, so skip the check here.
+     * The check is done in vhost_vdpa_iommu_map_notify() instead, where
+     * the real size that is mapped to the kernel is known.
+     */
 
-    llend = vhost_vdpa_section_end(section);
-    if (int128_gt(llend, int128_make64(iova_max))) {
-        error_report("RAM section out of device range (max=0x%" PRIx64
-                     ", end addr=0x%" PRIx64 ")",
-                     iova_max, int128_get64(llend));
-        return true;
+    if (!memory_region_is_iommu(section->mr)) {
+        llend = vhost_vdpa_section_end(section);
+        if (int128_gt(llend, int128_make64(iova_max))) {
+            error_report("RAM section out of device range (max=0x%" PRIx64
+                         ", end addr=0x%" PRIx64 ")",
+                         iova_max, int128_get64(llend));
+            return true;
+        }
     }
-
     return false;
 }
 
@@ -173,6 +181,115 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
     v->iotlb_batch_begin_sent = false;
 }
 
+static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
+{
+    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
+
+    hwaddr iova = iotlb->iova + iommu->iommu_offset;
+    struct vhost_vdpa *v = iommu->dev;
+    void *vaddr;
+    int ret;
+    Int128 llend;
+
+    if (iotlb->target_as != &address_space_memory) {
+        error_report("Wrong target AS \"%s\", only system memory is allowed",
+                     iotlb->target_as->name ? iotlb->target_as->name : "none");
+        return;
+    }
+    RCU_READ_LOCK_GUARD();
+    /* check if RAM section out of device range */
+    llend = int128_add(int128_makes64(iotlb->addr_mask), int128_makes64(iova));
+    if (int128_gt(llend, int128_make64(v->iova_range.last))) {
+        error_report("RAM section out of device range (max=0x%" PRIx64
+                     ", end addr=0x%" PRIx64 ")",
+                     v->iova_range.last, int128_get64(llend));
+        return;
+    }
+
+    vhost_vdpa_iotlb_batch_begin_once(v);
+
+    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
+        bool read_only;
+
+        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
+            return;
+        }
+
+        ret = vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr,
+            read_only);
+        if (ret) {
+            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
+                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
+                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
+        }
+    } else {
+        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
+        if (ret) {
+            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
+                         "0x%" HWADDR_PRIx ") = %d (%m)",
+                         v, iova, iotlb->addr_mask + 1, ret);
+        }
+    }
+}
+
+static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
+                                        MemoryRegionSection *section)
+{
+    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
+
+    struct vdpa_iommu *iommu;
+    Int128 end;
+    int iommu_idx;
+    IOMMUMemoryRegion *iommu_mr;
+    int ret;
+
+    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
+
+    iommu = g_malloc0(sizeof(*iommu));
+    end = int128_add(int128_make64(section->offset_within_region),
+        section->size);
+    end = int128_sub(end, int128_one());
+    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
+        MEMTXATTRS_UNSPECIFIED);
+    iommu->iommu_mr = iommu_mr;
+    iommu_notifier_init(&iommu->n, vhost_vdpa_iommu_map_notify,
+        IOMMU_NOTIFIER_IOTLB_EVENTS,
+        section->offset_within_region, int128_get64(end), iommu_idx);
+    iommu->iommu_offset = section->offset_within_address_space -
+        section->offset_within_region;
+    iommu->dev = v;
+
+    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
+    if (ret) {
+        g_free(iommu);
+        return;
+    }
+
+    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
+    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
+
+    return;
+}
+
+static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
+                                        MemoryRegionSection *section)
+{
+    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
+
+    struct vdpa_iommu *iommu;
+
+    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
+    {
+        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
+            iommu->n.start == section->offset_within_region) {
+            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
+            QLIST_REMOVE(iommu, iommu_next);
+            g_free(iommu);
+            break;
+        }
+    }
+}
+
 static void vhost_vdpa_listener_region_add(MemoryListener *listener,
                                            MemoryRegionSection *section)
 {
@@ -187,6 +304,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
                                             v->iova_range.last)) {
         return;
     }
+    if (memory_region_is_iommu(section->mr)) {
+        vhost_vdpa_iommu_region_add(listener, section);
+        return;
+    }
 
     if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
                  (section->offset_within_region & ~TARGET_PAGE_MASK))) {
@@ -266,6 +387,9 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
                                             v->iova_range.last)) {
         return;
     }
+    if (memory_region_is_iommu(section->mr)) {
+        vhost_vdpa_iommu_region_del(listener, section);
+    }
 
     if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
                  (section->offset_within_region & ~TARGET_PAGE_MASK))) {
@@ -276,7 +400,8 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
     iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
     llend = vhost_vdpa_section_end(section);
 
-    trace_vhost_vdpa_listener_region_del(v, iova, int128_get64(llend));
+    trace_vhost_vdpa_listener_region_del(v, iova,
+        int128_get64(int128_sub(llend, int128_one())));
 
     if (int128_ge(int128_make64(iova), llend)) {
         return;
@@ -303,9 +428,24 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
         vhost_iova_tree_remove(v->iova_tree, *result);
     }
     vhost_vdpa_iotlb_batch_begin_once(v);
+    /*
+     * The unmap ioctl doesn't accept a full 64-bit span, so split it in two.
+     */
+    if (int128_eq(llsize, int128_2_64())) {
+        llsize = int128_rshift(llsize, 1);
+        ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
+        if (ret) {
+            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
+                         "0x%" HWADDR_PRIx ") = %d (%m)",
+                         v, iova, int128_get64(llsize), ret);
+        }
+        iova += int128_get64(llsize);
+    }
     ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
     if (ret) {
-        error_report("vhost_vdpa dma unmap error!");
+        error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
+                     "0x%" HWADDR_PRIx ") = %d (%m)",
+                     v, iova, int128_get64(llsize), ret);
     }
 
     memory_region_unref(section->mr);
@@ -597,7 +737,6 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
     v = dev->opaque;
     trace_vhost_vdpa_cleanup(dev, v);
     vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
-    memory_listener_unregister(&v->listener);
     vhost_vdpa_svq_cleanup(dev);
 
     dev->opaque = NULL;
@@ -1115,7 +1254,8 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
     }
 
     if (started) {
-        memory_listener_register(&v->listener, &address_space_memory);
+        memory_listener_register(&v->listener, dev->vdev->dma_as);
+
         return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
     } else {
         vhost_vdpa_reset_device(dev);
diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 1111d85643..0d5b2c9508 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -40,8 +40,18 @@ typedef struct vhost_vdpa {
     void *shadow_vq_ops_opaque;
     struct vhost_dev *dev;
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
+    QLIST_HEAD(, vdpa_iommu) iommu_list;
+    IOMMUNotifier n;
 } VhostVDPA;
 
+typedef struct vdpa_iommu {
+    struct vhost_vdpa *dev;
+    IOMMUMemoryRegion *iommu_mr;
+    hwaddr iommu_offset;
+    IOMMUNotifier n;
+    QLIST_ENTRY(vdpa_iommu) iommu_next;
+} VDPAIOMMUState;
+
 int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
                        void *vaddr, bool readonly);
 int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
-- 
2.34.3




* Re: [PATCH v12 1/1] vhost-vdpa: add support for vIOMMU
  2022-12-09 13:08 ` [PATCH v12 1/1] " Cindy Lu
@ 2022-12-13  8:17   ` Jason Wang
  2022-12-13 11:17     ` Eugenio Perez Martin
  0 siblings, 1 reply; 6+ messages in thread
From: Jason Wang @ 2022-12-13  8:17 UTC (permalink / raw)
  To: Cindy Lu, mst; +Cc: qemu-devel, Eugenio Perez Martin


On 2022/12/9 21:08, Cindy Lu wrote:
> 1. Skip the iova_max check in vhost_vdpa_listener_skipped_section() when
> the MR is an IOMMU MR; move this check to vhost_vdpa_iommu_map_notify().
> 2. Add support for vIOMMU.


So I think the changelog needs some tweaks; we need to explain why you
need to do the above.


> Add new functions to deal with IOMMU MRs:
> - during iommu_region_add, register a dedicated IOMMU notifier
>   and store all notifiers in a list.
> - during iommu_region_del, look up the matching IOMMU notifier in the list
>   and delete it.


And try to describe how you implement it, while avoiding duplicating what
the code below does.

E.g you can say you implement this through the IOMMU MAP notifier etc.


>
> Verified with the vp_vdpa and vdpa_sim_net drivers.
>
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>   hw/virtio/vhost-vdpa.c         | 162 ++++++++++++++++++++++++++++++---
>   include/hw/virtio/vhost-vdpa.h |  10 ++
>   2 files changed, 161 insertions(+), 11 deletions(-)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 7468e44b87..2b3920c2a1 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -26,6 +26,7 @@
>   #include "cpu.h"
>   #include "trace.h"
>   #include "qapi/error.h"
> +#include "hw/virtio/virtio-access.h"


Any reason you need to use virtio accessors here?


>   
>   /*
>    * Return one past the end of the end of section. Be careful with uint64_t
> @@ -60,15 +61,22 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
>                        iova_min, section->offset_within_address_space);
>           return true;
>       }
> +    /*
> +     * While using vIOMMU, the section may be larger than iova_max, but the
> +     * memory that is actually mapped is smaller, so skip the check here.
> +     * The check is done in vhost_vdpa_iommu_map_notify() instead, where
> +     * the real size that is mapped to the kernel is known.
> +     */
>   
> -    llend = vhost_vdpa_section_end(section);
> -    if (int128_gt(llend, int128_make64(iova_max))) {
> -        error_report("RAM section out of device range (max=0x%" PRIx64
> -                     ", end addr=0x%" PRIx64 ")",
> -                     iova_max, int128_get64(llend));
> -        return true;
> +    if (!memory_region_is_iommu(section->mr)) {
> +        llend = vhost_vdpa_section_end(section);
> +        if (int128_gt(llend, int128_make64(iova_max))) {
> +            error_report("RAM section out of device range (max=0x%" PRIx64
> +                         ", end addr=0x%" PRIx64 ")",
> +                         iova_max, int128_get64(llend));
> +            return true;
> +        }
>       }
> -
>       return false;
>   }
>   
> @@ -173,6 +181,115 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
>       v->iotlb_batch_begin_sent = false;
>   }
>   
> +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> +{
> +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> +
> +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> +    struct vhost_vdpa *v = iommu->dev;
> +    void *vaddr;
> +    int ret;
> +    Int128 llend;
> +
> +    if (iotlb->target_as != &address_space_memory) {
> +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> +        return;
> +    }
> +    RCU_READ_LOCK_GUARD();
> +    /* check if RAM section out of device range */
> +    llend = int128_add(int128_makes64(iotlb->addr_mask), int128_makes64(iova));
> +    if (int128_gt(llend, int128_make64(v->iova_range.last))) {
> +        error_report("RAM section out of device range (max=0x%" PRIx64
> +                     ", end addr=0x%" PRIx64 ")",
> +                     v->iova_range.last, int128_get64(llend));
> +        return;
> +    }
> +
> +    vhost_vdpa_iotlb_batch_begin_once(v);


Any reason we need to consider batching here? (or batching can be done 
via notifier?)


> +
> +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> +        bool read_only;
> +
> +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> +            return;
> +        }
> +
> +        ret = vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr,
> +            read_only);


Let's add some tracepoints for these, like what is done in
region_add()/del().
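
For example, something along these lines (the names are only illustrative,
and matching entries would also need to be added to hw/virtio/trace-events):

    /* in vhost_vdpa_iommu_map_notify(), mirroring the listener tracepoints */
    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
        trace_vhost_vdpa_iommu_map_notify_map(v, iova, iotlb->addr_mask + 1,
                                              vaddr, read_only);
        ...
    } else {
        trace_vhost_vdpa_iommu_map_notify_unmap(v, iova,
                                                iotlb->addr_mask + 1);
        ...
    }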


> +        if (ret) {
> +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> +        }
> +    } else {
> +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> +        if (ret) {
> +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> +                         v, iova, iotlb->addr_mask + 1, ret);
> +        }
> +    }
> +}
> +
> +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> +                                        MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +
> +    struct vdpa_iommu *iommu;
> +    Int128 end;
> +    int iommu_idx;
> +    IOMMUMemoryRegion *iommu_mr;
> +    int ret;
> +
> +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> +
> +    iommu = g_malloc0(sizeof(*iommu));
> +    end = int128_add(int128_make64(section->offset_within_region),
> +        section->size);


Though checkpatch.pl doesn't complain, the indentation looks odd, e.g.
the 's' of "section" should be indented below the "i" of "int128", etc.

You can tweak your editor to adopt the QEMU coding style.
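
I.e. something like:

    end = int128_add(int128_make64(section->offset_within_region),
                     section->size);
    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
                                                   MEMTXATTRS_UNSPECIFIED);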


> +    end = int128_sub(end, int128_one());
> +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> +        MEMTXATTRS_UNSPECIFIED);
> +    iommu->iommu_mr = iommu_mr;
> +    iommu_notifier_init(&iommu->n, vhost_vdpa_iommu_map_notify,
> +        IOMMU_NOTIFIER_IOTLB_EVENTS,
> +        section->offset_within_region, int128_get64(end), iommu_idx);
> +    iommu->iommu_offset = section->offset_within_address_space -
> +        section->offset_within_region;
> +    iommu->dev = v;
> +
> +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> +    if (ret) {
> +        g_free(iommu);
> +        return;
> +    }
> +
> +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> +
> +    return;
> +}
> +
> +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> +                                        MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +
> +    struct vdpa_iommu *iommu;
> +
> +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> +    {
> +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> +            iommu->n.start == section->offset_within_region) {
> +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> +            QLIST_REMOVE(iommu, iommu_next);
> +            g_free(iommu);
> +            break;
> +        }
> +    }
> +}
> +
>   static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>                                              MemoryRegionSection *section)
>   {
> @@ -187,6 +304,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>                                               v->iova_range.last)) {
>           return;
>       }
> +    if (memory_region_is_iommu(section->mr)) {
> +        vhost_vdpa_iommu_region_add(listener, section);


Adding Eugenio.

I think it needs to populate the iova_tree, otherwise the vIOMMU may break
shadow virtqueue (and we need to do it for region_del as well).

E.g you can test your code with shadow virtqueue via x-svq=on.
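
Very roughly, something like this in the map path of the notifier (an
untested sketch assuming the existing vhost_iova_tree/DMAMap helpers; the
unmap path would need the matching vhost_iova_tree_find_iova()/remove()):

    if (v->shadow_vqs_enabled) {
        /* let the SVQ IOVA allocator pick the IOVA the device will use */
        DMAMap mem_region = {
            .translated_addr = (hwaddr)(uintptr_t)vaddr,
            .size = iotlb->addr_mask,
            .perm = read_only ? IOMMU_RO : IOMMU_RW,
        };

        if (vhost_iova_tree_map_alloc(v->iova_tree, &mem_region) != IOVA_OK) {
            return;
        }
        iova = mem_region.iova;
    }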

Thanks





* Re: [PATCH v12 1/1] vhost-vdpa: add support for vIOMMU
  2022-12-13  8:17   ` Jason Wang
@ 2022-12-13 11:17     ` Eugenio Perez Martin
  0 siblings, 0 replies; 6+ messages in thread
From: Eugenio Perez Martin @ 2022-12-13 11:17 UTC (permalink / raw)
  To: Jason Wang; +Cc: Cindy Lu, mst, qemu-devel

On Tue, Dec 13, 2022 at 9:17 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2022/12/9 21:08, Cindy Lu wrote:
> > 1. Skip the iova_max check in vhost_vdpa_listener_skipped_section() when
> > the MR is an IOMMU MR; move this check to vhost_vdpa_iommu_map_notify().
> > 2. Add support for vIOMMU.
>
>
> So I think the changelog needs some tweaks; we need to explain why you
> need to do the above.
>
>
> > Add the new function to deal with iommu MR.
> > - during iommu_region_add register a specific IOMMU notifier,
> >   and store all notifiers in a list.
> > - during iommu_region_del, compare and delete the IOMMU notifier from the list
>
>
> And try to describe how you implement it, while avoiding duplicating what
> the code below does.
>
> E.g you can say you implement this through the IOMMU MAP notifier etc.
>
>
> >
> > Verified with the vp_vdpa and vdpa_sim_net drivers.
> >
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > ---
> >   hw/virtio/vhost-vdpa.c         | 162 ++++++++++++++++++++++++++++++---
> >   include/hw/virtio/vhost-vdpa.h |  10 ++
> >   2 files changed, 161 insertions(+), 11 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index 7468e44b87..2b3920c2a1 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -26,6 +26,7 @@
> >   #include "cpu.h"
> >   #include "trace.h"
> >   #include "qapi/error.h"
> > +#include "hw/virtio/virtio-access.h"
>
>
> Any reason you need to use virtio accessors here?
>
>
> >
> >   /*
> >    * Return one past the end of the end of section. Be careful with uint64_t
> > @@ -60,15 +61,22 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> >                        iova_min, section->offset_within_address_space);
> >           return true;
> >       }
> > +    /*
> > +     * While using vIOMMU, the section may be larger than iova_max, but the
> > +     * memory that is actually mapped is smaller, so skip the check here.
> > +     * The check is done in vhost_vdpa_iommu_map_notify() instead, where
> > +     * the real size that is mapped to the kernel is known.
> > +     */
> >
> > -    llend = vhost_vdpa_section_end(section);
> > -    if (int128_gt(llend, int128_make64(iova_max))) {
> > -        error_report("RAM section out of device range (max=0x%" PRIx64
> > -                     ", end addr=0x%" PRIx64 ")",
> > -                     iova_max, int128_get64(llend));
> > -        return true;
> > +    if (!memory_region_is_iommu(section->mr)) {
> > +        llend = vhost_vdpa_section_end(section);
> > +        if (int128_gt(llend, int128_make64(iova_max))) {
> > +            error_report("RAM section out of device range (max=0x%" PRIx64
> > +                         ", end addr=0x%" PRIx64 ")",
> > +                         iova_max, int128_get64(llend));
> > +            return true;
> > +        }
> >       }
> > -
> >       return false;
> >   }
> >
> > @@ -173,6 +181,115 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> >       v->iotlb_batch_begin_sent = false;
> >   }
> >
> > +static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > +{
> > +    struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
> > +
> > +    hwaddr iova = iotlb->iova + iommu->iommu_offset;
> > +    struct vhost_vdpa *v = iommu->dev;
> > +    void *vaddr;
> > +    int ret;
> > +    Int128 llend;
> > +
> > +    if (iotlb->target_as != &address_space_memory) {
> > +        error_report("Wrong target AS \"%s\", only system memory is allowed",
> > +                     iotlb->target_as->name ? iotlb->target_as->name : "none");
> > +        return;
> > +    }
> > +    RCU_READ_LOCK_GUARD();
> > +    /* check if RAM section out of device range */
> > +    llend = int128_add(int128_makes64(iotlb->addr_mask), int128_makes64(iova));
> > +    if (int128_gt(llend, int128_make64(v->iova_range.last))) {
> > +        error_report("RAM section out of device range (max=0x%" PRIx64
> > +                     ", end addr=0x%" PRIx64 ")",
> > +                     v->iova_range.last, int128_get64(llend));
> > +        return;
> > +    }
> > +
> > +    vhost_vdpa_iotlb_batch_begin_once(v);
>
>
> Any reason we need to consider batching here? (or batching can be done
> via notifier?)
>
>
> > +
> > +    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > +        bool read_only;
> > +
> > +        if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
> > +            return;
> > +        }
> > +
> > +        ret = vhost_vdpa_dma_map(v, iova, iotlb->addr_mask + 1, vaddr,
> > +            read_only);
>
>
> Let's add some tracepoints for these, like what is done in
> region_add()/del().
>
>
> > +        if (ret) {
> > +            error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
> > +                         "0x%" HWADDR_PRIx ", %p) = %d (%m)",
> > +                         v, iova, iotlb->addr_mask + 1, vaddr, ret);
> > +        }
> > +    } else {
> > +        ret = vhost_vdpa_dma_unmap(v, iova, iotlb->addr_mask + 1);
> > +        if (ret) {
> > +            error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
> > +                         "0x%" HWADDR_PRIx ") = %d (%m)",
> > +                         v, iova, iotlb->addr_mask + 1, ret);
> > +        }
> > +    }
> > +}
> > +
> > +static void vhost_vdpa_iommu_region_add(MemoryListener *listener,
> > +                                        MemoryRegionSection *section)
> > +{
> > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > +
> > +    struct vdpa_iommu *iommu;
> > +    Int128 end;
> > +    int iommu_idx;
> > +    IOMMUMemoryRegion *iommu_mr;
> > +    int ret;
> > +
> > +    iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> > +
> > +    iommu = g_malloc0(sizeof(*iommu));
> > +    end = int128_add(int128_make64(section->offset_within_region),
> > +        section->size);
>
>
> Though checkpatch.pl doesn't complain, the indentation looks odd, e.g.
> the 's' of "section" should be indented below the "i" of "int128", etc.
>
> You can tweak your editor to adopt the QEMU coding style.
>
>
> > +    end = int128_sub(end, int128_one());
> > +    iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
> > +        MEMTXATTRS_UNSPECIFIED);
> > +    iommu->iommu_mr = iommu_mr;
> > +    iommu_notifier_init(&iommu->n, vhost_vdpa_iommu_map_notify,
> > +        IOMMU_NOTIFIER_IOTLB_EVENTS,
> > +        section->offset_within_region, int128_get64(end), iommu_idx);
> > +    iommu->iommu_offset = section->offset_within_address_space -
> > +        section->offset_within_region;
> > +    iommu->dev = v;
> > +
> > +    ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
> > +    if (ret) {
> > +        g_free(iommu);
> > +        return;
> > +    }
> > +
> > +    QLIST_INSERT_HEAD(&v->iommu_list, iommu, iommu_next);
> > +    memory_region_iommu_replay(iommu->iommu_mr, &iommu->n);
> > +
> > +    return;
> > +}
> > +
> > +static void vhost_vdpa_iommu_region_del(MemoryListener *listener,
> > +                                        MemoryRegionSection *section)
> > +{
> > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > +
> > +    struct vdpa_iommu *iommu;
> > +
> > +    QLIST_FOREACH(iommu, &v->iommu_list, iommu_next)
> > +    {
> > +        if (MEMORY_REGION(iommu->iommu_mr) == section->mr &&
> > +            iommu->n.start == section->offset_within_region) {
> > +            memory_region_unregister_iommu_notifier(section->mr, &iommu->n);
> > +            QLIST_REMOVE(iommu, iommu_next);
> > +            g_free(iommu);
> > +            break;
> > +        }
> > +    }
> > +}
> > +
> >   static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >                                              MemoryRegionSection *section)
> >   {
> > @@ -187,6 +304,10 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >                                               v->iova_range.last)) {
> >           return;
> >       }
> > +    if (memory_region_is_iommu(section->mr)) {
> > +        vhost_vdpa_iommu_region_add(listener, section);
>
>
> Adding Eugenio.
>
> I think it needs to populate the iova_tree, otherwise the vIOMMU may break
> shadow virtqueue (and we need to do it for region_del as well).
>

Populating the iova_tree could be the easiest and most convenient way.

Thinking out loud, does the iommu offer a way to iterate through its iova
tree? That way SVQ would avoid duplicating the entries and the need to
maintain them.
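
(The closest thing I can see today is replaying the region into an ad-hoc
notifier; a rough, untested sketch of the idea, just to illustrate, not a
proposal:)

    typedef struct {
        IOMMUNotifier n;
        unsigned entries;
    } IOMMUWalker;

    static void iommu_walk_cb(IOMMUNotifier *n, IOMMUTLBEntry *entry)
    {
        IOMMUWalker *w = container_of(n, IOMMUWalker, n);

        /* entry->iova / entry->addr_mask describe one current mapping */
        w->entries++;
    }

    static unsigned iommu_count_mappings(IOMMUMemoryRegion *iommu_mr)
    {
        IOMMUWalker w = { .entries = 0 };

        iommu_notifier_init(&w.n, iommu_walk_cb, IOMMU_NOTIFIER_MAP,
                            0, HWADDR_MAX, 0);
        memory_region_iommu_replay(iommu_mr, &w.n);
        return w.entries;
    }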

SVQ would still need some entries in its own tree (SVQ vrings etc) anyhow.

Thanks!

> E.g you can test your code with shadow virtqueue via x-svq=on.
>
> Thanks
>
>




* Re: [PATCH v12 0/1] vhost-vdpa: add support for vIOMMU
  2022-12-09 13:08 [PATCH v12 0/1] vhost-vdpa: add support for vIOMMU Cindy Lu
  2022-12-09 13:08 ` [PATCH v12 1/1] " Cindy Lu
@ 2022-12-20 14:33 ` Michael S. Tsirkin
  2022-12-21  6:13   ` Cindy Lu
  1 sibling, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2022-12-20 14:33 UTC (permalink / raw)
  To: Cindy Lu; +Cc: jasowang, qemu-devel

On Fri, Dec 09, 2022 at 09:08:04PM +0800, Cindy Lu wrote:
> These patches add support for vIOMMU in the vhost-vdpa device.
> Verified with the vp_vdpa/vdpa_sim_net drivers and the intel_iommu/
> virtio-iommu devices.

Pls address comments and repost.

> changes in V3
> 1. Move the function vfio_get_xlat_addr to memory.c
> 2. Use the existing memory listener; when the MR is an
> iommu MR, call the functions iommu_region_add/
> iommu_region_del
> 
> changes in V4
> 1. Make the comments in vfio_get_xlat_addr more general
> 
> changes in V5
> 1. Address the comments in the last version
> 2. Add a new argument to the function vfio_get_xlat_addr, which shows whether
> the memory is backed by a discard manager, so the device can have its
> own warning.
> 
> changes in V6
> Move the error_report for the unpopulated discard back to
> memory_get_xlat_addr
> 
> changes in V7
> Reword the error message to avoid duplicate information
> 
> changes in V8
> Reorganize the code following the comments in the last version
> 
> changes in V9
> Reorganize the code following the comments
> 
> changes in V10
> Address the comments
> 
> changes in V11
> Address the comments
> Fix the crash found in testing
> 
> changes in V12
> Address the comments, squash patch 1 into the next patch
> Improve the code style issues
> 
> Cindy Lu (1):
>   vhost-vdpa: add support for vIOMMU
> 
>  hw/virtio/vhost-vdpa.c         | 162 ++++++++++++++++++++++++++++++---
>  include/hw/virtio/vhost-vdpa.h |  10 ++
>  2 files changed, 161 insertions(+), 11 deletions(-)
> 
> -- 
> 2.34.3




* Re: [PATCH v12 0/1] vhost-vdpa: add support for vIOMMU
  2022-12-20 14:33 ` [PATCH v12 0/1] " Michael S. Tsirkin
@ 2022-12-21  6:13   ` Cindy Lu
  0 siblings, 0 replies; 6+ messages in thread
From: Cindy Lu @ 2022-12-21  6:13 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: jasowang, qemu-devel

On Tue, 20 Dec 2022 at 22:33, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Dec 09, 2022 at 09:08:04PM +0800, Cindy Lu wrote:
> > These patches add support for vIOMMU in the vhost-vdpa device.
> > Verified with the vp_vdpa/vdpa_sim_net drivers and the intel_iommu/
> > virtio-iommu devices.
>
> Pls address comments and repost.
>
Hi Michael,
There are some issues found while running DPDK with SVQ enabled.
We are still working on them and will post a new version once we
address the bugs.
Thanks
Cindy
> > changes in V3
> > 1. Move the function vfio_get_xlat_addr to memory.c
> > 2. Use the existing memory listener; when the MR is an
> > iommu MR, call the functions iommu_region_add/
> > iommu_region_del
> >
> > changes in V4
> > 1. Make the comments in vfio_get_xlat_addr more general
> >
> > changes in V5
> > 1. Address the comments in the last version
> > 2. Add a new argument to the function vfio_get_xlat_addr, which shows whether
> > the memory is backed by a discard manager, so the device can have its
> > own warning.
> >
> > changes in V6
> > Move the error_report for the unpopulated discard back to
> > memory_get_xlat_addr
> >
> > changes in V7
> > Reword the error message to avoid duplicate information
> >
> > changes in V8
> > Reorganize the code following the comments in the last version
> >
> > changes in V9
> > Reorganize the code following the comments
> >
> > changes in V10
> > Address the comments
> >
> > changes in V11
> > Address the comments
> > Fix the crash found in testing
> >
> > changes in V12
> > Address the comments, squash patch 1 into the next patch
> > Improve the code style issues
> >
> > Cindy Lu (1):
> >   vhost-vdpa: add support for vIOMMU
> >
> >  hw/virtio/vhost-vdpa.c         | 162 ++++++++++++++++++++++++++++++---
> >  include/hw/virtio/vhost-vdpa.h |  10 ++
> >  2 files changed, 161 insertions(+), 11 deletions(-)
> >
> > --
> > 2.34.3
>



