All of lore.kernel.org
* [PATCH v7 0/5] virtio-iommu: VFIO integration
@ 2020-03-13  7:48 Bharat Bhushan
  2020-03-13  7:48 ` [PATCH v7 1/5] hw/vfio/common: Remove error print on mmio region translation by viommu Bharat Bhushan
                   ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-13  7:48 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux
  Cc: Bharat Bhushan

This patch series integrates VFIO with virtio-iommu.
This is only applicable for PCI pass-through with virtio-iommu.

This series is available at:
https://github.com/bharat-bhushan-devel/qemu.git virtio-iommu-vfio-integration-v7

This has been tested by assigning more than one PCI device to a virtual machine.

This series is based on:
  - virtio-iommu device emulation by Eric Auger.
    [v16,00/10] VIRTIO-IOMMU device
    https://github.com/eauger/qemu/tree/v4.2-virtio-iommu-v16

  - Linux 5.6.0-rc4

v6->v7:
  - corrected email-address

v5->v6:
  - Rebase to v16 version from Eric
  - Tested with upstream Linux
  - Added a patch from Eric/myself removing the mmio-region error print in vfio

v4->v5:
 - Rebase to v9 version from Eric
 - PCIe device hotplug fix
 - Added patch 1/5 from Eric's previous series (Eric somehow dropped it in
   the last version)
 - Patch "Translate the MSI doorbell in kvm_arch_fixup_msi_route"
   already integrated with vsmmu3

v3->v4:
 - Rebase to v4 version from Eric
 - Fixes from Eric with DPDK in VM
 - Logical division in multiple patches

v2->v3:
 - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device",
   which is based on top of v2.10-rc0
 - Fixed issue with two PCI devices
 - Addressed review comments

v1->v2:
  - Added trace events
  - removed vSMMU3 link in patch description

Bharat Bhushan (5):
  hw/vfio/common: Remove error print on mmio region translation by
    viommu
  virtio-iommu: Add iommu notifier for map/unmap
  virtio-iommu: Call iommu notifier for attach/detach
  virtio-iommu: add iommu replay
  virtio-iommu: add iommu notifier memory-region

 hw/vfio/common.c                 |   2 -
 hw/virtio/trace-events           |   5 +
 hw/virtio/virtio-iommu.c         | 189 ++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-iommu.h |   6 +
 4 files changed, 199 insertions(+), 3 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v7 1/5] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-03-13  7:48 [PATCH v7 0/5] virtio-iommu: VFIO integration Bharat Bhushan
@ 2020-03-13  7:48 ` Bharat Bhushan
  2020-03-13  7:48 ` [PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap Bharat Bhushan
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-13  7:48 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux
  Cc: Eric Auger, Bharat Bhushan

On ARM, the MSI doorbell is translated by the virtual IOMMU.
As such, address_space_translate() returns the MSI controller
MMIO region and we get an "iommu map to non memory area"
error message. Let's remove the latter.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 hw/vfio/common.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 5ca11488d6..c586edf47a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
                                  &xlat, &len, writable,
                                  MEMTXATTRS_UNSPECIFIED);
     if (!memory_region_is_ram(mr)) {
-        error_report("iommu map to non memory area %"HWADDR_PRIx"",
-                     xlat);
         return false;
     }
 
-- 
2.17.1




* [PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap
  2020-03-13  7:48 [PATCH v7 0/5] virtio-iommu: VFIO integration Bharat Bhushan
  2020-03-13  7:48 ` [PATCH v7 1/5] hw/vfio/common: Remove error print on mmio region translation by viommu Bharat Bhushan
@ 2020-03-13  7:48 ` Bharat Bhushan
  2020-03-13 14:25   ` Auger Eric
  2020-03-13  7:48 ` [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach Bharat Bhushan
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-13  7:48 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux
  Cc: Eric Auger, Bharat Bhushan

This patch extends the VIRTIO_IOMMU_T_MAP/UNMAP requests to
notify registered IOMMU notifiers, which in turn call the VFIO
notifier to map/unmap regions in the IOMMU.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/virtio/trace-events           |  2 +
 hw/virtio/virtio-iommu.c         | 66 +++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-iommu.h |  6 +++
 3 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index e83500bee9..d94a1cd8a3 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -73,3 +73,5 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
 virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
+virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
+virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 4cee8083bc..e51344a53e 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -123,6 +123,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     }
 }
 
+static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
+                                    hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
+    entry.perm = IOMMU_RW;
+    entry.translated_addr = paddr;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
+static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
+                                      hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size);
+    entry.perm = IOMMU_NONE;
+    entry.translated_addr = 0;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
     if (!ep->domain) {
@@ -307,9 +339,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
     uint64_t virt_start = le64_to_cpu(req->virt_start);
     uint64_t virt_end = le64_to_cpu(req->virt_end);
     uint32_t flags = le32_to_cpu(req->flags);
+    hwaddr size = virt_end - virt_start + 1;
+    VirtioIOMMUNotifierNode *node;
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUInterval *interval;
     VirtIOIOMMUMapping *mapping;
+    VirtIOIOMMUEndpoint *ep;
 
     if (flags & ~VIRTIO_IOMMU_MAP_F_MASK) {
         return VIRTIO_IOMMU_S_INVAL;
@@ -339,9 +374,37 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
 
     g_tree_insert(domain->mappings, interval, mapping);
 
+    /* All devices in an address-space share mapping */
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == node->iommu_dev->devfn) {
+                virtio_iommu_notify_map(&node->iommu_dev->iommu_mr,
+                                        virt_start, phys_start, size);
+            }
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
 
+static void virtio_iommu_remove_mapping(VirtIOIOMMU *s, VirtIOIOMMUDomain *domain,
+                                        VirtIOIOMMUInterval *interval)
+{
+    VirtioIOMMUNotifierNode *node;
+    VirtIOIOMMUEndpoint *ep;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == node->iommu_dev->devfn) {
+                virtio_iommu_notify_unmap(&node->iommu_dev->iommu_mr,
+                                          interval->low,
+                                          interval->high - interval->low + 1);
+            }
+        }
+    }
+    g_tree_remove(domain->mappings, (gpointer)(interval));
+}
+
 static int virtio_iommu_unmap(VirtIOIOMMU *s,
                               struct virtio_iommu_req_unmap *req)
 {
@@ -368,7 +431,7 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         uint64_t current_high = iter_key->high;
 
         if (interval.low <= current_low && interval.high >= current_high) {
-            g_tree_remove(domain->mappings, iter_key);
+            virtio_iommu_remove_mapping(s, domain, iter_key);
             trace_virtio_iommu_unmap_done(domain_id, current_low, current_high);
         } else {
             ret = VIRTIO_IOMMU_S_RANGE;
@@ -655,6 +718,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
 
+    QLIST_INIT(&s->notifiers_list);
     virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
                 sizeof(struct virtio_iommu_config));
 
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 6f67f1020a..4539c8ae72 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -44,6 +44,11 @@ typedef struct IOMMUPciBus {
     IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
 } IOMMUPciBus;
 
+typedef struct VirtioIOMMUNotifierNode {
+    IOMMUDevice *iommu_dev;
+    QLIST_ENTRY(VirtioIOMMUNotifierNode) next;
+} VirtioIOMMUNotifierNode;
+
 typedef struct VirtIOIOMMU {
     VirtIODevice parent_obj;
     VirtQueue *req_vq;
@@ -56,6 +61,7 @@ typedef struct VirtIOIOMMU {
     GTree *domains;
     QemuMutex mutex;
     GTree *endpoints;
+    QLIST_HEAD(, VirtioIOMMUNotifierNode) notifiers_list;
 } VirtIOIOMMU;
 
 #endif
-- 
2.17.1




* [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-13  7:48 [PATCH v7 0/5] virtio-iommu: VFIO integration Bharat Bhushan
  2020-03-13  7:48 ` [PATCH v7 1/5] hw/vfio/common: Remove error print on mmio region translation by viommu Bharat Bhushan
  2020-03-13  7:48 ` [PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap Bharat Bhushan
@ 2020-03-13  7:48 ` Bharat Bhushan
  2020-03-13 14:41   ` Auger Eric
  2020-03-13  7:48 ` [PATCH v7 4/5] virtio-iommu: add iommu replay Bharat Bhushan
  2020-03-13  7:48 ` [PATCH v7 5/5] virtio-iommu: add iommu notifier memory-region Bharat Bhushan
  4 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-13  7:48 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux
  Cc: Bharat Bhushan

IOMMU notifiers are called when a device is attached to
or detached from an address space.
This is needed for VFIO.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index e51344a53e..2006f72901 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
     uint32_t id;
     VirtIOIOMMUDomain *domain;
     QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
+    VirtIOIOMMU *viommu;
 } VirtIOIOMMUEndpoint;
 
 typedef struct VirtIOIOMMUInterval {
@@ -155,8 +156,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
     memory_region_notify_iommu(mr, 0, entry);
 }
 
+static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
+                                           gpointer data)
+{
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    return false;
+}
+
+static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
+                                         gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+
+    return false;
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
+    VirtioIOMMUNotifierNode *node;
+    VirtIOIOMMU *s = ep->viommu;
+    VirtIOIOMMUDomain *domain = ep->domain;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        if (ep->id == node->iommu_dev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
+                           &node->iommu_dev->iommu_mr);
+        }
+    }
+
     if (!ep->domain) {
         return;
     }
@@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
     }
     ep = g_malloc0(sizeof(*ep));
     ep->id = ep_id;
+    ep->viommu = s;
     trace_virtio_iommu_get_endpoint(ep_id);
     g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
     return ep;
@@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
 {
     uint32_t domain_id = le32_to_cpu(req->domain);
     uint32_t ep_id = le32_to_cpu(req->endpoint);
+    VirtioIOMMUNotifierNode *node;
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUEndpoint *ep;
 
@@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
 
     ep->domain = domain;
 
+    /* Replay existing address space mappings on the associated memory region */
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        if (ep_id == node->iommu_dev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
+                           &node->iommu_dev->iommu_mr);
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
 
-- 
2.17.1




* [PATCH v7 4/5] virtio-iommu: add iommu replay
  2020-03-13  7:48 [PATCH v7 0/5] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (2 preceding siblings ...)
  2020-03-13  7:48 ` [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach Bharat Bhushan
@ 2020-03-13  7:48 ` Bharat Bhushan
  2020-03-13  7:48 ` [PATCH v7 5/5] virtio-iommu: add iommu notifier memory-region Bharat Bhushan
  4 siblings, 0 replies; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-13  7:48 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux
  Cc: Bharat Bhushan

The default replay does not work with virtio-iommu,
so this patch provides virtio-iommu-specific replay functionality.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index d94a1cd8a3..8bae651191 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -75,3 +75,4 @@ virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid)
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
+virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 2006f72901..bcc9895b76 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -760,6 +760,49 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     return (ua > ub) - (ua < ub);
 }
 
+static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    trace_virtio_iommu_remap(interval->low, mapping->phys_addr,
+                             interval->high - interval->low + 1);
+    /* unmap previous entry and map again */
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+    return false;
+}
+
+static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    uint32_t sid;
+    VirtIOIOMMUEndpoint *ep;
+
+    sid = virtio_iommu_get_bdf(sdev);
+
+    qemu_mutex_lock(&s->mutex);
+
+    if (!s->endpoints) {
+        goto unlock;
+    }
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+    if (!ep || !ep->domain) {
+        goto unlock;
+    }
+
+    g_tree_foreach(ep->domain->mappings, virtio_iommu_remap, mr);
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -976,6 +1019,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = virtio_iommu_translate;
+    imrc->replay = virtio_iommu_replay;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.17.1




* [PATCH v7 5/5] virtio-iommu: add iommu notifier memory-region
  2020-03-13  7:48 [PATCH v7 0/5] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (3 preceding siblings ...)
  2020-03-13  7:48 ` [PATCH v7 4/5] virtio-iommu: add iommu replay Bharat Bhushan
@ 2020-03-13  7:48 ` Bharat Bhushan
  4 siblings, 0 replies; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-13  7:48 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux
  Cc: Bharat Bhushan

Finally, add a notify_flag_changed() callback to handle IOMMU
notifier flag changes on the memory region, adding or removing
notifier nodes as notifiers are registered and unregistered.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 hw/virtio/trace-events   |  2 ++
 hw/virtio/virtio-iommu.c | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 8bae651191..a486adcf6d 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uin
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
 virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
+virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
+virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index bcc9895b76..7744410f72 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -803,6 +803,37 @@ unlock:
     qemu_mutex_unlock(&s->mutex);
 }
 
+static int virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
+                                             IOMMUNotifierFlag old,
+                                             IOMMUNotifierFlag new,
+                                             Error **errp)
+{
+    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    VirtioIOMMUNotifierNode *node = NULL;
+    VirtioIOMMUNotifierNode *next_node = NULL;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name);
+        node = g_malloc0(sizeof(*node));
+        node->iommu_dev = sdev;
+        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
+        return 0;
+    }
+
+    /* update notifier node with new flags */
+    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+        if (node->iommu_dev == sdev) {
+            if (new == IOMMU_NOTIFIER_NONE) {
+                trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name);
+                QLIST_REMOVE(node, next);
+                g_free(node);
+            }
+        }
+    }
+    return 0;
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -1020,6 +1051,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
 
     imrc->translate = virtio_iommu_translate;
     imrc->replay = virtio_iommu_replay;
+    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.17.1




* Re: [PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap
  2020-03-13  7:48 ` [PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap Bharat Bhushan
@ 2020-03-13 14:25   ` Auger Eric
  2020-03-16  6:36     ` Bharat Bhushan
  0 siblings, 1 reply; 27+ messages in thread
From: Auger Eric @ 2020-03-13 14:25 UTC (permalink / raw)
  To: Bharat Bhushan, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, tnowicki, drjones,
	linuc.decode, qemu-devel, qemu-arm, bharatb.linux

Hi Bharat,
On 3/13/20 8:48 AM, Bharat Bhushan wrote:
> This patch extends VIRTIO_IOMMU_T_MAP/UNMAP request to
> notify registered iommu-notifier. Which will call vfio
s/iommu-notifier/iommu-notifiers
> notifier to map/unmap region in iommu.
can be any notifier (vhost/vfio).
> 
> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/virtio/trace-events           |  2 +
>  hw/virtio/virtio-iommu.c         | 66 +++++++++++++++++++++++++++++++-
>  include/hw/virtio/virtio-iommu.h |  6 +++
>  3 files changed, 73 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index e83500bee9..d94a1cd8a3 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -73,3 +73,5 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
>  virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
>  virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
>  virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
> +virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
> +virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 4cee8083bc..e51344a53e 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -123,6 +123,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
>      }
>  }
>  
> +static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
> +                                    hwaddr paddr, hwaddr size)
> +{
> +    IOMMUTLBEntry entry;
> +
> +    entry.target_as = &address_space_memory;
> +    entry.addr_mask = size - 1;
> +
> +    entry.iova = iova;
> +    trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
> +    entry.perm = IOMMU_RW;
> +    entry.translated_addr = paddr;
> +
> +    memory_region_notify_iommu(mr, 0, entry);
> +}
> +
> +static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
> +                                      hwaddr size)
> +{
> +    IOMMUTLBEntry entry;
> +
> +    entry.target_as = &address_space_memory;
> +    entry.addr_mask = size - 1;
> +
> +    entry.iova = iova;
> +    trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size);
> +    entry.perm = IOMMU_NONE;
> +    entry.translated_addr = 0;
> +
> +    memory_region_notify_iommu(mr, 0, entry);
> +}
> +
>  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
>  {
>      if (!ep->domain) {
> @@ -307,9 +339,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>      uint64_t virt_start = le64_to_cpu(req->virt_start);
>      uint64_t virt_end = le64_to_cpu(req->virt_end);
>      uint32_t flags = le32_to_cpu(req->flags);
> +    hwaddr size = virt_end - virt_start + 1;
> +    VirtioIOMMUNotifierNode *node;
>      VirtIOIOMMUDomain *domain;
>      VirtIOIOMMUInterval *interval;
>      VirtIOIOMMUMapping *mapping;
> +    VirtIOIOMMUEndpoint *ep;
>  
>      if (flags & ~VIRTIO_IOMMU_MAP_F_MASK) {
>          return VIRTIO_IOMMU_S_INVAL;
> @@ -339,9 +374,37 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>  
>      g_tree_insert(domain->mappings, interval, mapping);
>  
> +    /* All devices in an address-space share mapping */
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
> +            if (ep->id == node->iommu_dev->devfn) {
> +                virtio_iommu_notify_map(&node->iommu_dev->iommu_mr,
> +                                        virt_start, phys_start, size);
> +            }
> +        }
> +    }
> +
>      return VIRTIO_IOMMU_S_OK;
>  }
>  
> +static void virtio_iommu_remove_mapping(VirtIOIOMMU *s, VirtIOIOMMUDomain *domain,
> +                                        VirtIOIOMMUInterval *interval)
> +{
> +    VirtioIOMMUNotifierNode *node;
> +    VirtIOIOMMUEndpoint *ep;
> +
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
> +            if (ep->id == node->iommu_dev->devfn) {
> +                virtio_iommu_notify_unmap(&node->iommu_dev->iommu_mr,
> +                                          interval->low,
> +                                          interval->high - interval->low + 1);
> +            }
> +        }
> +    }
> +    g_tree_remove(domain->mappings, (gpointer)(interval));
> +}
What about virtio_iommu_put_domain(), where you destroy the mapping
gtree? I guess you also need to send invalidations there?
> +
>  static int virtio_iommu_unmap(VirtIOIOMMU *s,
>                                struct virtio_iommu_req_unmap *req)
>  {
> @@ -368,7 +431,7 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>          uint64_t current_high = iter_key->high;
>  
>          if (interval.low <= current_low && interval.high >= current_high) {
> -            g_tree_remove(domain->mappings, iter_key);
> +            virtio_iommu_remove_mapping(s, domain, iter_key);
>              trace_virtio_iommu_unmap_done(domain_id, current_low, current_high);
>          } else {
>              ret = VIRTIO_IOMMU_S_RANGE;
> @@ -655,6 +718,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>      VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
>  
> +    QLIST_INIT(&s->notifiers_list);
>      virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
>                  sizeof(struct virtio_iommu_config));
>  
> diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
> index 6f67f1020a..4539c8ae72 100644
> --- a/include/hw/virtio/virtio-iommu.h
> +++ b/include/hw/virtio/virtio-iommu.h
> @@ -44,6 +44,11 @@ typedef struct IOMMUPciBus {
>      IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
>  } IOMMUPciBus;
>  
> +typedef struct VirtioIOMMUNotifierNode {
> +    IOMMUDevice *iommu_dev;
> +    QLIST_ENTRY(VirtioIOMMUNotifierNode) next;
> +} VirtioIOMMUNotifierNode;
You may use scripts/git.orderfile for a better diff ordering.
> +
>  typedef struct VirtIOIOMMU {
>      VirtIODevice parent_obj;
>      VirtQueue *req_vq;
> @@ -56,6 +61,7 @@ typedef struct VirtIOIOMMU {
>      GTree *domains;
>      QemuMutex mutex;
>      GTree *endpoints;
> +    QLIST_HEAD(, VirtioIOMMUNotifierNode) notifiers_list;
See what was done in smmuv3 and intel. We now directly use a list of
IOMMUDevice directly. VirtioIOMMUNotifierNode does not bring anything extra.
>  } VirtIOIOMMU;
>  
>  #endif
> 

Thanks

Eric




* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-13  7:48 ` [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach Bharat Bhushan
@ 2020-03-13 14:41   ` Auger Eric
  2020-03-16  6:41     ` Bharat Bhushan
  0 siblings, 1 reply; 27+ messages in thread
From: Auger Eric @ 2020-03-13 14:41 UTC (permalink / raw)
  To: Bharat Bhushan, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, tnowicki, drjones,
	linuc.decode, qemu-devel, qemu-arm, bharatb.linux

Hi Bharat

On 3/13/20 8:48 AM, Bharat Bhushan wrote:
> iommu-notifier are called when a device is attached
IOMMU notifiers
> or detached to as address-space.
> This is needed for VFIO.
and vhost for detach
> 
> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> ---
>  hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
> 
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index e51344a53e..2006f72901 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
>      uint32_t id;
>      VirtIOIOMMUDomain *domain;
>      QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
> +    VirtIOIOMMU *viommu;
This needs special care on post-load. When migrating the EPs, only the id
is migrated. On post-load you need to set viommu, as is done for the
domain. Migration is allowed with vhost.
>  } VirtIOIOMMUEndpoint;
>  
>  typedef struct VirtIOIOMMUInterval {
> @@ -155,8 +156,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
>      memory_region_notify_iommu(mr, 0, entry);
>  }
>  
> +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
> +                                           gpointer data)
> +{
> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> +
> +    virtio_iommu_notify_unmap(mr, interval->low,
> +                              interval->high - interval->low + 1);
> +
> +    return false;
> +}
> +
> +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
> +                                         gpointer data)
> +{
> +    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> +
> +    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
> +                            interval->high - interval->low + 1);
> +
> +    return false;
> +}
> +
>  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
>  {
> +    VirtioIOMMUNotifierNode *node;
> +    VirtIOIOMMU *s = ep->viommu;
> +    VirtIOIOMMUDomain *domain = ep->domain;
> +
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        if (ep->id == node->iommu_dev->devfn) {
> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
> +                           &node->iommu_dev->iommu_mr);
I understand this should do the job for domain removal
> +        }
> +    }
> +
>      if (!ep->domain) {
>          return;
>      }
> @@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
>      }
>      ep = g_malloc0(sizeof(*ep));
>      ep->id = ep_id;
> +    ep->viommu = s;
>      trace_virtio_iommu_get_endpoint(ep_id);
>      g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
>      return ep;
> @@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>  {
>      uint32_t domain_id = le32_to_cpu(req->domain);
>      uint32_t ep_id = le32_to_cpu(req->endpoint);
> +    VirtioIOMMUNotifierNode *node;
>      VirtIOIOMMUDomain *domain;
>      VirtIOIOMMUEndpoint *ep;
>  
> @@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>  
>      ep->domain = domain;
>  
> +    /* Replay existing address space mappings on the associated memory region */
maybe use the "domain" terminology here.
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        if (ep_id == node->iommu_dev->devfn) {
> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
> +                           &node->iommu_dev->iommu_mr);
> +        }
> +    }
> +
>      return VIRTIO_IOMMU_S_OK;
>  }
>  
> 
Thanks

Eric




* Re: [PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap
  2020-03-13 14:25   ` Auger Eric
@ 2020-03-16  6:36     ` Bharat Bhushan
  2020-03-16  7:32       ` Auger Eric
  0 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-16  6:36 UTC (permalink / raw)
  To: Auger Eric
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, alex.williamson, qemu-arm, Bharat Bhushan,
	linuc.decode, eric.auger.pro

Hi Eric,

On Fri, Mar 13, 2020 at 7:55 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Bharat,
> On 3/13/20 8:48 AM, Bharat Bhushan wrote:
> > This patch extends VIRTIO_IOMMU_T_MAP/UNMAP request to
> > notify registered iommu-notifier. Which will call vfio
> s/iommu-notifier/iommu-notifiers
> > notifier to map/unmap region in iommu.
> can be any notifier (vhost/vfio).
> >
> > Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> > Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > ---
> >  hw/virtio/trace-events           |  2 +
> >  hw/virtio/virtio-iommu.c         | 66 +++++++++++++++++++++++++++++++-
> >  include/hw/virtio/virtio-iommu.h |  6 +++
> >  3 files changed, 73 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> > index e83500bee9..d94a1cd8a3 100644
> > --- a/hw/virtio/trace-events
> > +++ b/hw/virtio/trace-events
> > @@ -73,3 +73,5 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
> >  virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
> >  virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
> >  virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
> > +virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
> > +virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
> > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> > index 4cee8083bc..e51344a53e 100644
> > --- a/hw/virtio/virtio-iommu.c
> > +++ b/hw/virtio/virtio-iommu.c
> > @@ -123,6 +123,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
> >      }
> >  }
> >
> > +static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
> > +                                    hwaddr paddr, hwaddr size)
> > +{
> > +    IOMMUTLBEntry entry;
> > +
> > +    entry.target_as = &address_space_memory;
> > +    entry.addr_mask = size - 1;
> > +
> > +    entry.iova = iova;
> > +    trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
> > +    entry.perm = IOMMU_RW;
> > +    entry.translated_addr = paddr;
> > +
> > +    memory_region_notify_iommu(mr, 0, entry);
> > +}
> > +
> > +static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
> > +                                      hwaddr size)
> > +{
> > +    IOMMUTLBEntry entry;
> > +
> > +    entry.target_as = &address_space_memory;
> > +    entry.addr_mask = size - 1;
> > +
> > +    entry.iova = iova;
> > +    trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size);
> > +    entry.perm = IOMMU_NONE;
> > +    entry.translated_addr = 0;
> > +
> > +    memory_region_notify_iommu(mr, 0, entry);
> > +}
> > +
> >  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
> >  {
> >      if (!ep->domain) {
> > @@ -307,9 +339,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
> >      uint64_t virt_start = le64_to_cpu(req->virt_start);
> >      uint64_t virt_end = le64_to_cpu(req->virt_end);
> >      uint32_t flags = le32_to_cpu(req->flags);
> > +    hwaddr size = virt_end - virt_start + 1;
> > +    VirtioIOMMUNotifierNode *node;
> >      VirtIOIOMMUDomain *domain;
> >      VirtIOIOMMUInterval *interval;
> >      VirtIOIOMMUMapping *mapping;
> > +    VirtIOIOMMUEndpoint *ep;
> >
> >      if (flags & ~VIRTIO_IOMMU_MAP_F_MASK) {
> >          return VIRTIO_IOMMU_S_INVAL;
> > @@ -339,9 +374,37 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
> >
> >      g_tree_insert(domain->mappings, interval, mapping);
> >
> > +    /* All devices in an address-space share mapping */
> > +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> > +        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
> > +            if (ep->id == node->iommu_dev->devfn) {
> > +                virtio_iommu_notify_map(&node->iommu_dev->iommu_mr,
> > +                                        virt_start, phys_start, size);
> > +            }
> > +        }
> > +    }
> > +
> >      return VIRTIO_IOMMU_S_OK;
> >  }
> >
> > +static void virtio_iommu_remove_mapping(VirtIOIOMMU *s, VirtIOIOMMUDomain *domain,
> > +                                        VirtIOIOMMUInterval *interval)
> > +{
> > +    VirtioIOMMUNotifierNode *node;
> > +    VirtIOIOMMUEndpoint *ep;
> > +
> > +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> > +        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
> > +            if (ep->id == node->iommu_dev->devfn) {
> > +                virtio_iommu_notify_unmap(&node->iommu_dev->iommu_mr,
> > +                                          interval->low,
> > +                                          interval->high - interval->low + 1);
> > +            }
> > +        }
> > +    }
> > +    g_tree_remove(domain->mappings, (gpointer)(interval));
> > +}
> What about virtio_iommu_put_domain() where you destroy the mapping
> gtree. I guess you also need to send invalidations there?

In virtio_iommu_put_domain(), before destroying domain->mappings we
are calling virtio_iommu_detach_endpoint_from_domain(), which sends
invalidations, no?

> > +
> >  static int virtio_iommu_unmap(VirtIOIOMMU *s,
> >                                struct virtio_iommu_req_unmap *req)
> >  {
> > @@ -368,7 +431,7 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
> >          uint64_t current_high = iter_key->high;
> >
> >          if (interval.low <= current_low && interval.high >= current_high) {
> > -            g_tree_remove(domain->mappings, iter_key);
> > +            virtio_iommu_remove_mapping(s, domain, iter_key);
> >              trace_virtio_iommu_unmap_done(domain_id, current_low, current_high);
> >          } else {
> >              ret = VIRTIO_IOMMU_S_RANGE;
> > @@ -655,6 +718,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
> >      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >      VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
> >
> > +    QLIST_INIT(&s->notifiers_list);
> >      virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
> >                  sizeof(struct virtio_iommu_config));
> >
> > diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
> > index 6f67f1020a..4539c8ae72 100644
> > --- a/include/hw/virtio/virtio-iommu.h
> > +++ b/include/hw/virtio/virtio-iommu.h
> > @@ -44,6 +44,11 @@ typedef struct IOMMUPciBus {
> >      IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
> >  } IOMMUPciBus;
> >
> > +typedef struct VirtioIOMMUNotifierNode {
> > +    IOMMUDevice *iommu_dev;
> > +    QLIST_ENTRY(VirtioIOMMUNotifierNode) next;
> > +} VirtioIOMMUNotifierNode;
> You may use scripts/git.orderfile for a better diff ordering.

ok, I ran "git config diff.orderFile scripts/git.orderfile"

> > +
> >  typedef struct VirtIOIOMMU {
> >      VirtIODevice parent_obj;
> >      VirtQueue *req_vq;
> > @@ -56,6 +61,7 @@ typedef struct VirtIOIOMMU {
> >      GTree *domains;
> >      QemuMutex mutex;
> >      GTree *endpoints;
> > +    QLIST_HEAD(, VirtioIOMMUNotifierNode) notifiers_list;
> See what was done in smmuv3 and intel. We now use a list of IOMMUDevice
> directly; VirtioIOMMUNotifierNode does not bring anything extra.

ok,

Thanks
-Bharat

> >  } VirtIOIOMMU;
> >
> >  #endif
> >
>
> Thanks
>
> Eric
>



* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-13 14:41   ` Auger Eric
@ 2020-03-16  6:41     ` Bharat Bhushan
  2020-03-16  7:32       ` Auger Eric
  0 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-16  6:41 UTC (permalink / raw)
  To: Auger Eric
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, alex.williamson, qemu-arm, Bharat Bhushan,
	linuc.decode, eric.auger.pro

Hi Eric,

On Fri, Mar 13, 2020 at 8:11 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Bharat
>
> On 3/13/20 8:48 AM, Bharat Bhushan wrote:
> > iommu-notifier are called when a device is attached
> IOMMU notifiers
> > or detached to as address-space.
> > This is needed for VFIO.
> and vhost for detach
> >
> > Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> > ---
> >  hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 47 insertions(+)
> >
> > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> > index e51344a53e..2006f72901 100644
> > --- a/hw/virtio/virtio-iommu.c
> > +++ b/hw/virtio/virtio-iommu.c
> > @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
> >      uint32_t id;
> >      VirtIOIOMMUDomain *domain;
> >      QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
> > +    VirtIOIOMMU *viommu;
> This needs special care on post-load. When migrating the EPs, only the id
> is migrated. On post-load you need to set viommu as is done for the
> domain; migration is allowed with vhost.

ok, I have not tried vhost/migration. The change below sets viommu when
reconstructing the endpoint.

@@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer
key, gpointer value,

     QLIST_FOREACH(iter, &d->endpoint_list, next) {
         iter->domain = d;
+       iter->viommu = s;
         g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter);
     }
     return false; /* continue the domain traversal */

> >  } VirtIOIOMMUEndpoint;
> >
> >  typedef struct VirtIOIOMMUInterval {
> > @@ -155,8 +156,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
> >      memory_region_notify_iommu(mr, 0, entry);
> >  }
> >
> > +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
> > +                                           gpointer data)
> > +{
> > +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> > +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> > +
> > +    virtio_iommu_notify_unmap(mr, interval->low,
> > +                              interval->high - interval->low + 1);
> > +
> > +    return false;
> > +}
> > +
> > +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
> > +                                         gpointer data)
> > +{
> > +    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
> > +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> > +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> > +
> > +    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
> > +                            interval->high - interval->low + 1);
> > +
> > +    return false;
> > +}
> > +
> >  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
> >  {
> > +    VirtioIOMMUNotifierNode *node;
> > +    VirtIOIOMMU *s = ep->viommu;
> > +    VirtIOIOMMUDomain *domain = ep->domain;
> > +
> > +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> > +        if (ep->id == node->iommu_dev->devfn) {
> > +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
> > +                           &node->iommu_dev->iommu_mr);
> I understand this should do the job for domain removal

I did not get the comment; are you saying we should do this on domain removal?

> > +        }
> > +    }
> > +
> >      if (!ep->domain) {
> >          return;
> >      }
> > @@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
> >      }
> >      ep = g_malloc0(sizeof(*ep));
> >      ep->id = ep_id;
> > +    ep->viommu = s;
> >      trace_virtio_iommu_get_endpoint(ep_id);
> >      g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
> >      return ep;
> > @@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> >  {
> >      uint32_t domain_id = le32_to_cpu(req->domain);
> >      uint32_t ep_id = le32_to_cpu(req->endpoint);
> > +    VirtioIOMMUNotifierNode *node;
> >      VirtIOIOMMUDomain *domain;
> >      VirtIOIOMMUEndpoint *ep;
> >
> > @@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> >
> >      ep->domain = domain;
> >
> > +    /* Replay existing address space mappings on the associated memory region */
> maybe use the "domain" terminology here.

ok,

Thanks
-Bharat

> > +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> > +        if (ep_id == node->iommu_dev->devfn) {
> > +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
> > +                           &node->iommu_dev->iommu_mr);
> > +        }
> > +    }
> > +
> >      return VIRTIO_IOMMU_S_OK;
> >  }
> >
> >
> Thanks
>
> Eric
>



* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-16  6:41     ` Bharat Bhushan
@ 2020-03-16  7:32       ` Auger Eric
  2020-03-16  7:45         ` Bharat Bhushan
  0 siblings, 1 reply; 27+ messages in thread
From: Auger Eric @ 2020-03-16  7:32 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, alex.williamson, qemu-arm, Bharat Bhushan,
	linuc.decode, eric.auger.pro

Hi Bharat,

On 3/16/20 7:41 AM, Bharat Bhushan wrote:
> Hi Eric,
> 
> On Fri, Mar 13, 2020 at 8:11 PM Auger Eric <eric.auger@redhat.com> wrote:
>>
>> Hi Bharat
>>
>> On 3/13/20 8:48 AM, Bharat Bhushan wrote:
>>> iommu-notifier are called when a device is attached
>> IOMMU notifiers
>>> or detached to as address-space.
>>> This is needed for VFIO.
>> and vhost for detach
>>>
>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>> ---
>>>  hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 47 insertions(+)
>>>
>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>> index e51344a53e..2006f72901 100644
>>> --- a/hw/virtio/virtio-iommu.c
>>> +++ b/hw/virtio/virtio-iommu.c
>>> @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
>>>      uint32_t id;
>>>      VirtIOIOMMUDomain *domain;
>>>      QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
>>> +    VirtIOIOMMU *viommu;
>> This needs special care on post-load. When migrating the EPs, only the id
>> is migrated. On post-load you need to set viommu as is done for the
>> domain; migration is allowed with vhost.
> 
> ok, I have not tried vhost/migration. The change below sets viommu when
> reconstructing the endpoint.


Yes I think this should be OK.

In the end I gave the series a try with vhost/VFIO. With vhost it works
(not with a recent kernel though, but that issue may be kernel related).
With VFIO however it does not work for me.

The first issue is: your guest can use 4K pages while your host uses 64KB
pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
a way to pass the host settings to the VIRTIO-IOMMU device.

Even with 64KB pages, it did not work for me. I obviously do not see the
storm of VFIO_DMA_MAP failures, but I still see some, most probably due to
wrong notifications somewhere. I will try to investigate on my side.

Did you test with VFIO on your side?

Thanks

Eric
> 
> @@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer
> key, gpointer value,
> 
>      QLIST_FOREACH(iter, &d->endpoint_list, next) {
>          iter->domain = d;
> +       iter->viommu = s;
>          g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter);
>      }
>      return false; /* continue the domain traversal */
> 
>>>  } VirtIOIOMMUEndpoint;
>>>
>>>  typedef struct VirtIOIOMMUInterval {
>>> @@ -155,8 +156,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
>>>      memory_region_notify_iommu(mr, 0, entry);
>>>  }
>>>
>>> +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
>>> +                                           gpointer data)
>>> +{
>>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
>>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
>>> +
>>> +    virtio_iommu_notify_unmap(mr, interval->low,
>>> +                              interval->high - interval->low + 1);
>>> +
>>> +    return false;
>>> +}
>>> +
>>> +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
>>> +                                         gpointer data)
>>> +{
>>> +    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
>>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
>>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
>>> +
>>> +    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
>>> +                            interval->high - interval->low + 1);
>>> +
>>> +    return false;
>>> +}
>>> +
>>>  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
>>>  {
>>> +    VirtioIOMMUNotifierNode *node;
>>> +    VirtIOIOMMU *s = ep->viommu;
>>> +    VirtIOIOMMUDomain *domain = ep->domain;
>>> +
>>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>>> +        if (ep->id == node->iommu_dev->devfn) {
>>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
>>> +                           &node->iommu_dev->iommu_mr);
>> I understand this should do the job for domain removal
> 
> I did not get the comment; are you saying we should do this on domain removal?
see my reply on 2/5

Note the above code should be moved after the check of !ep->domain below
> 
>>> +        }
>>> +    }
>>> +
>>>      if (!ep->domain) {
>>>          return;
>>>      }
>>> @@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
>>>      }
>>>      ep = g_malloc0(sizeof(*ep));
>>>      ep->id = ep_id;
>>> +    ep->viommu = s;
>>>      trace_virtio_iommu_get_endpoint(ep_id);
>>>      g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
>>>      return ep;
>>> @@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>>>  {
>>>      uint32_t domain_id = le32_to_cpu(req->domain);
>>>      uint32_t ep_id = le32_to_cpu(req->endpoint);
>>> +    VirtioIOMMUNotifierNode *node;
>>>      VirtIOIOMMUDomain *domain;
>>>      VirtIOIOMMUEndpoint *ep;
>>>
>>> @@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>>>
>>>      ep->domain = domain;
>>>
>>> +    /* Replay existing address space mappings on the associated memory region */
>> maybe use the "domain" terminology here.
> 
> ok,
> 
> Thanks
> -Bharat
> 
>>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>>> +        if (ep_id == node->iommu_dev->devfn) {
>>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
>>> +                           &node->iommu_dev->iommu_mr);
>>> +        }
>>> +    }
>>> +
>>>      return VIRTIO_IOMMU_S_OK;
>>>  }
>>>
>>>
>> Thanks
>>
>> Eric
>>
> 




* Re: [PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap
  2020-03-16  6:36     ` Bharat Bhushan
@ 2020-03-16  7:32       ` Auger Eric
  0 siblings, 0 replies; 27+ messages in thread
From: Auger Eric @ 2020-03-16  7:32 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, alex.williamson, qemu-arm, Bharat Bhushan,
	linuc.decode, eric.auger.pro

Hi Bharat,

On 3/16/20 7:36 AM, Bharat Bhushan wrote:
> Hi Eric,
> 
> On Fri, Mar 13, 2020 at 7:55 PM Auger Eric <eric.auger@redhat.com> wrote:
>>
>> Hi Bharat,
>> On 3/13/20 8:48 AM, Bharat Bhushan wrote:
>>> This patch extends VIRTIO_IOMMU_T_MAP/UNMAP request to
>>> notify registered iommu-notifier. Which will call vfio
>> s/iommu-notifier/iommu-notifiers
>>> notifier to map/unmap region in iommu.
>> can be any notifier (vhost/vfio).
>>>
>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>> ---
>>>  hw/virtio/trace-events           |  2 +
>>>  hw/virtio/virtio-iommu.c         | 66 +++++++++++++++++++++++++++++++-
>>>  include/hw/virtio/virtio-iommu.h |  6 +++
>>>  3 files changed, 73 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
>>> index e83500bee9..d94a1cd8a3 100644
>>> --- a/hw/virtio/trace-events
>>> +++ b/hw/virtio/trace-events
>>> @@ -73,3 +73,5 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
>>>  virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
>>>  virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
>>>  virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
>>> +virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
>>> +virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>> index 4cee8083bc..e51344a53e 100644
>>> --- a/hw/virtio/virtio-iommu.c
>>> +++ b/hw/virtio/virtio-iommu.c
>>> @@ -123,6 +123,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
>>>      }
>>>  }
>>>
>>> +static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
>>> +                                    hwaddr paddr, hwaddr size)
>>> +{
>>> +    IOMMUTLBEntry entry;
>>> +
>>> +    entry.target_as = &address_space_memory;
>>> +    entry.addr_mask = size - 1;
>>> +
>>> +    entry.iova = iova;
>>> +    trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
>>> +    entry.perm = IOMMU_RW;
>>> +    entry.translated_addr = paddr;
>>> +
>>> +    memory_region_notify_iommu(mr, 0, entry);
>>> +}
>>> +
>>> +static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
>>> +                                      hwaddr size)
>>> +{
>>> +    IOMMUTLBEntry entry;
>>> +
>>> +    entry.target_as = &address_space_memory;
>>> +    entry.addr_mask = size - 1;
>>> +
>>> +    entry.iova = iova;
>>> +    trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size);
>>> +    entry.perm = IOMMU_NONE;
>>> +    entry.translated_addr = 0;
>>> +
>>> +    memory_region_notify_iommu(mr, 0, entry);
>>> +}
>>> +
>>>  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
>>>  {
>>>      if (!ep->domain) {
>>> @@ -307,9 +339,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>>>      uint64_t virt_start = le64_to_cpu(req->virt_start);
>>>      uint64_t virt_end = le64_to_cpu(req->virt_end);
>>>      uint32_t flags = le32_to_cpu(req->flags);
>>> +    hwaddr size = virt_end - virt_start + 1;
>>> +    VirtioIOMMUNotifierNode *node;
>>>      VirtIOIOMMUDomain *domain;
>>>      VirtIOIOMMUInterval *interval;
>>>      VirtIOIOMMUMapping *mapping;
>>> +    VirtIOIOMMUEndpoint *ep;
>>>
>>>      if (flags & ~VIRTIO_IOMMU_MAP_F_MASK) {
>>>          return VIRTIO_IOMMU_S_INVAL;
>>> @@ -339,9 +374,37 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>>>
>>>      g_tree_insert(domain->mappings, interval, mapping);
>>>
>>> +    /* All devices in an address-space share mapping */
>>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>>> +        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
>>> +            if (ep->id == node->iommu_dev->devfn) {
>>> +                virtio_iommu_notify_map(&node->iommu_dev->iommu_mr,
>>> +                                        virt_start, phys_start, size);
>>> +            }
>>> +        }
>>> +    }
>>> +
>>>      return VIRTIO_IOMMU_S_OK;
>>>  }
>>>
>>> +static void virtio_iommu_remove_mapping(VirtIOIOMMU *s, VirtIOIOMMUDomain *domain,
>>> +                                        VirtIOIOMMUInterval *interval)
>>> +{
>>> +    VirtioIOMMUNotifierNode *node;
>>> +    VirtIOIOMMUEndpoint *ep;
>>> +
>>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>>> +        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
>>> +            if (ep->id == node->iommu_dev->devfn) {
>>> +                virtio_iommu_notify_unmap(&node->iommu_dev->iommu_mr,
>>> +                                          interval->low,
>>> +                                          interval->high - interval->low + 1);
>>> +            }
>>> +        }
>>> +    }
>>> +    g_tree_remove(domain->mappings, (gpointer)(interval));
>>> +}
>> What about virtio_iommu_put_domain() where you destroy the mapping
>> gtree. I guess you also need to send invalidations there?
> 
> In virtio_iommu_put_domain(), before destroying domain->mappings we
> are calling virtio_iommu_detach_endpoint_from_domain(), which sends
> invalidations, no?
Yes I noticed that later as it is in patch 3/5 (this is the comment you
did not get)
> 
>>> +
>>>  static int virtio_iommu_unmap(VirtIOIOMMU *s,
>>>                                struct virtio_iommu_req_unmap *req)
>>>  {
>>> @@ -368,7 +431,7 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>>>          uint64_t current_high = iter_key->high;
>>>
>>>          if (interval.low <= current_low && interval.high >= current_high) {
>>> -            g_tree_remove(domain->mappings, iter_key);
>>> +            virtio_iommu_remove_mapping(s, domain, iter_key);
>>>              trace_virtio_iommu_unmap_done(domain_id, current_low, current_high);
>>>          } else {
>>>              ret = VIRTIO_IOMMU_S_RANGE;
>>> @@ -655,6 +718,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>>      VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
>>>
>>> +    QLIST_INIT(&s->notifiers_list);
>>>      virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
>>>                  sizeof(struct virtio_iommu_config));
>>>
>>> diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
>>> index 6f67f1020a..4539c8ae72 100644
>>> --- a/include/hw/virtio/virtio-iommu.h
>>> +++ b/include/hw/virtio/virtio-iommu.h
>>> @@ -44,6 +44,11 @@ typedef struct IOMMUPciBus {
>>>      IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
>>>  } IOMMUPciBus;
>>>
>>> +typedef struct VirtioIOMMUNotifierNode {
>>> +    IOMMUDevice *iommu_dev;
>>> +    QLIST_ENTRY(VirtioIOMMUNotifierNode) next;
>>> +} VirtioIOMMUNotifierNode;
>> You may use scripts/git.orderfile for a better diff ordering.
> 
> ok, I ran "git config diff.orderFile scripts/git.orderfile"
> 
>>> +
>>>  typedef struct VirtIOIOMMU {
>>>      VirtIODevice parent_obj;
>>>      VirtQueue *req_vq;
>>> @@ -56,6 +61,7 @@ typedef struct VirtIOIOMMU {
>>>      GTree *domains;
>>>      QemuMutex mutex;
>>>      GTree *endpoints;
>>> +    QLIST_HEAD(, VirtioIOMMUNotifierNode) notifiers_list;
>> See what was done in smmuv3 and intel. We now use a list of IOMMUDevice
>> directly; VirtioIOMMUNotifierNode does not bring anything extra.
> 
> ok,
> 
> Thanks
> -Bharat
> 
>>>  } VirtIOIOMMU;
>>>
>>>  #endif
>>>
>>
>> Thanks
>>
>> Eric
>>
> 




* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-16  7:32       ` Auger Eric
@ 2020-03-16  7:45         ` Bharat Bhushan
  2020-03-16  8:58           ` Bharat Bhushan
  0 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-16  7:45 UTC (permalink / raw)
  To: Auger Eric
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, alex.williamson, qemu-arm, Bharat Bhushan,
	linuc.decode, eric.auger.pro

Hi Eric,

On Mon, Mar 16, 2020 at 1:02 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Bharat,
>
> On 3/16/20 7:41 AM, Bharat Bhushan wrote:
> > Hi Eric,
> >
> > On Fri, Mar 13, 2020 at 8:11 PM Auger Eric <eric.auger@redhat.com> wrote:
> >>
> >> Hi Bharat
> >>
> >> On 3/13/20 8:48 AM, Bharat Bhushan wrote:
> >>> iommu-notifier are called when a device is attached
> >> IOMMU notifiers
> >>> or detached to as address-space.
> >>> This is needed for VFIO.
> >> and vhost for detach
> >>>
> >>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> >>> ---
> >>>  hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
> >>>  1 file changed, 47 insertions(+)
> >>>
> >>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> >>> index e51344a53e..2006f72901 100644
> >>> --- a/hw/virtio/virtio-iommu.c
> >>> +++ b/hw/virtio/virtio-iommu.c
> >>> @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
> >>>      uint32_t id;
> >>>      VirtIOIOMMUDomain *domain;
> >>>      QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
> >>> +    VirtIOIOMMU *viommu;
> >> This needs special care on post-load. When migrating the EPs, only the id
> >> is migrated. On post-load you need to set viommu as is done for the
> >> domain; migration is allowed with vhost.
> >
> > ok, I have not tried vhost/migration. The change below sets viommu when
> > reconstructing the endpoint.
>
>
> Yes I think this should be OK.
>
> In the end I gave the series a try with vhost/VFIO. With vhost it works
> (not with a recent kernel though, but that issue may be kernel related).
> With VFIO however it does not work for me.
>
> The first issue is: your guest can use 4K pages while your host uses 64KB
> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> a way to pass the host settings to the VIRTIO-IOMMU device.
>
> Even with 64KB pages, it did not work for me. I obviously do not have the
> storm of VFIO_DMA_MAP failures, but I have some, most probably due to
> some wrong notifications somewhere. I will try to investigate on my side.
>
> Did you test with VFIO on your side?

I did not try different page sizes; I only tested with a 4K page size.

Yes, it works. I tested with two network devices assigned to the VM, and both interfaces work.

I will first try with a 64K page size.

Thanks
-Bharat

>
> Thanks
>
> Eric
> >
> > @@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer
> > key, gpointer value,
> >
> >      QLIST_FOREACH(iter, &d->endpoint_list, next) {
> >          iter->domain = d;
> > +       iter->viommu = s;
> >          g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter);
> >      }
> >      return false; /* continue the domain traversal */
> >
> >>>  } VirtIOIOMMUEndpoint;
> >>>
> >>>  typedef struct VirtIOIOMMUInterval {
> >>> @@ -155,8 +156,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
> >>>      memory_region_notify_iommu(mr, 0, entry);
> >>>  }
> >>>
> >>> +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
> >>> +                                           gpointer data)
> >>> +{
> >>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> >>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> >>> +
> >>> +    virtio_iommu_notify_unmap(mr, interval->low,
> >>> +                              interval->high - interval->low + 1);
> >>> +
> >>> +    return false;
> >>> +}
> >>> +
> >>> +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
> >>> +                                         gpointer data)
> >>> +{
> >>> +    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
> >>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> >>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> >>> +
> >>> +    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
> >>> +                            interval->high - interval->low + 1);
> >>> +
> >>> +    return false;
> >>> +}
> >>> +
> >>>  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
> >>>  {
> >>> +    VirtioIOMMUNotifierNode *node;
> >>> +    VirtIOIOMMU *s = ep->viommu;
> >>> +    VirtIOIOMMUDomain *domain = ep->domain;
> >>> +
> >>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> >>> +        if (ep->id == node->iommu_dev->devfn) {
> >>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
> >>> +                           &node->iommu_dev->iommu_mr);
> >> I understand this should do the job for domain removal
> >
> > I did not get the comment; are you saying we should do this on domain removal?
> see my reply on 2/5
>
> Note the above code should be moved after the check of !ep->domain below

Oh yes, I will move it.

Thanks
-Bharat

> >
> >>> +        }
> >>> +    }
> >>> +
> >>>      if (!ep->domain) {
> >>>          return;
> >>>      }
> >>> @@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
> >>>      }
> >>>      ep = g_malloc0(sizeof(*ep));
> >>>      ep->id = ep_id;
> >>> +    ep->viommu = s;
> >>>      trace_virtio_iommu_get_endpoint(ep_id);
> >>>      g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
> >>>      return ep;
> >>> @@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> >>>  {
> >>>      uint32_t domain_id = le32_to_cpu(req->domain);
> >>>      uint32_t ep_id = le32_to_cpu(req->endpoint);
> >>> +    VirtioIOMMUNotifierNode *node;
> >>>      VirtIOIOMMUDomain *domain;
> >>>      VirtIOIOMMUEndpoint *ep;
> >>>
> >>> @@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> >>>
> >>>      ep->domain = domain;
> >>>
> >>> +    /* Replay existing address space mappings on the associated memory region */
> >> maybe use the "domain" terminology here.
> >
> > ok,
> >
> > Thanks
> > -Bharat
> >
> >>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> >>> +        if (ep_id == node->iommu_dev->devfn) {
> >>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
> >>> +                           &node->iommu_dev->iommu_mr);
> >>> +        }
> >>> +    }
> >>> +
> >>>      return VIRTIO_IOMMU_S_OK;
> >>>  }
> >>>
> >>>
> >> Thanks
> >>
> >> Eric
> >>
> >
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-16  7:45         ` Bharat Bhushan
@ 2020-03-16  8:58           ` Bharat Bhushan
  2020-03-16  9:04             ` Auger Eric
  0 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-16  8:58 UTC (permalink / raw)
  To: Auger Eric
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, alex.williamson, qemu-arm, Bharat Bhushan,
	linuc.decode, eric.auger.pro

Hi Eric,

On Mon, Mar 16, 2020 at 1:15 PM Bharat Bhushan <bharatb.linux@gmail.com> wrote:
>
> Hi Eric,
>
> On Mon, Mar 16, 2020 at 1:02 PM Auger Eric <eric.auger@redhat.com> wrote:
> >
> > Hi Bharat,
> >
> > On 3/16/20 7:41 AM, Bharat Bhushan wrote:
> > > Hi Eric,
> > >
> > > On Fri, Mar 13, 2020 at 8:11 PM Auger Eric <eric.auger@redhat.com> wrote:
> > >>
> > >> Hi Bharat
> > >>
> > >> On 3/13/20 8:48 AM, Bharat Bhushan wrote:
> > >>> iommu-notifier are called when a device is attached
> > >> IOMMU notifiers
> > >>> or detached to as address-space.
> > >>> This is needed for VFIO.
> > >> and vhost for detach
> > >>>
> > >>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> > >>> ---
> > >>>  hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
> > >>>  1 file changed, 47 insertions(+)
> > >>>
> > >>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> > >>> index e51344a53e..2006f72901 100644
> > >>> --- a/hw/virtio/virtio-iommu.c
> > >>> +++ b/hw/virtio/virtio-iommu.c
> > >>> @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
> > >>>      uint32_t id;
> > >>>      VirtIOIOMMUDomain *domain;
> > >>>      QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
> > >>> +    VirtIOIOMMU *viommu;
> > >> This needs special care on post-load. When migrating the EPs, only the id
> > >> is migrated. On post-load you need to set viommu as it is done for
> > >> domain. Migration is allowed with vhost.
> > >
> > > OK, I have not tried vhost/migration. The change below sets viommu when
> > > reconstructing the endpoint.
> >
> >
> > Yes I think this should be OK.
> >
> > In the end I gave the series a try with vhost/VFIO. With vhost it works
> > (not with a recent kernel though, but the issue may be related to the kernel).
> > With VFIO, however, it does not work for me.
> >
> > The first issue is: your guest can use 4K pages while your host uses 64KB
> > pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> > a way to pass the host settings to the VIRTIO-IOMMU device.
> >
> > Even with 64KB pages, it did not work for me. I obviously do not have the
> > storm of VFIO_DMA_MAP failures, but I have some, most probably due to
> > some wrong notifications somewhere. I will try to investigate on my side.
> >
> > Did you test with VFIO on your side?
>
> I did not try different page sizes; I only tested with a 4K page size.
>
> Yes, it works. I tested with two network devices assigned to the VM, and both interfaces work.
>
> I will first try with a 64K page size.

A 64K page size does not work for me either.

I think we are not passing the correct page_size_mask here
(config.page_size_mask is set to TARGET_PAGE_MASK, which is
0xfffffffffffff000).

We need to set this according to the host page size, correct?

Thanks
-Bharat

>
> Thanks
> -Bharat
>
> >
> > Thanks
> >
> > Eric
> > >
> > > @@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer
> > > key, gpointer value,
> > >
> > >      QLIST_FOREACH(iter, &d->endpoint_list, next) {
> > >          iter->domain = d;
> > > +       iter->viommu = s;
> > >          g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter);
> > >      }
> > >      return false; /* continue the domain traversal */
> > >
> > >>>  } VirtIOIOMMUEndpoint;
> > >>>
> > >>>  typedef struct VirtIOIOMMUInterval {
> > >>> @@ -155,8 +156,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
> > >>>      memory_region_notify_iommu(mr, 0, entry);
> > >>>  }
> > >>>
> > >>> +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
> > >>> +                                           gpointer data)
> > >>> +{
> > >>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> > >>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> > >>> +
> > >>> +    virtio_iommu_notify_unmap(mr, interval->low,
> > >>> +                              interval->high - interval->low + 1);
> > >>> +
> > >>> +    return false;
> > >>> +}
> > >>> +
> > >>> +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
> > >>> +                                         gpointer data)
> > >>> +{
> > >>> +    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
> > >>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> > >>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> > >>> +
> > >>> +    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
> > >>> +                            interval->high - interval->low + 1);
> > >>> +
> > >>> +    return false;
> > >>> +}
> > >>> +
> > >>>  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
> > >>>  {
> > >>> +    VirtioIOMMUNotifierNode *node;
> > >>> +    VirtIOIOMMU *s = ep->viommu;
> > >>> +    VirtIOIOMMUDomain *domain = ep->domain;
> > >>> +
> > >>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> > >>> +        if (ep->id == node->iommu_dev->devfn) {
> > >>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
> > >>> +                           &node->iommu_dev->iommu_mr);
> > >> I understand this should do the job for domain removal
> > >
> > > I did not get the comment; are you saying we should do this on domain removal?
> > see my reply on 2/5
> >
> > Note the above code should be moved after the check of !ep->domain below
>
> Oh yes, I will move it.
>
> Thanks
> -Bharat
>
> > >
> > >>> +        }
> > >>> +    }
> > >>> +
> > >>>      if (!ep->domain) {
> > >>>          return;
> > >>>      }
> > >>> @@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
> > >>>      }
> > >>>      ep = g_malloc0(sizeof(*ep));
> > >>>      ep->id = ep_id;
> > >>> +    ep->viommu = s;
> > >>>      trace_virtio_iommu_get_endpoint(ep_id);
> > >>>      g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
> > >>>      return ep;
> > >>> @@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> > >>>  {
> > >>>      uint32_t domain_id = le32_to_cpu(req->domain);
> > >>>      uint32_t ep_id = le32_to_cpu(req->endpoint);
> > >>> +    VirtioIOMMUNotifierNode *node;
> > >>>      VirtIOIOMMUDomain *domain;
> > >>>      VirtIOIOMMUEndpoint *ep;
> > >>>
> > >>> @@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> > >>>
> > >>>      ep->domain = domain;
> > >>>
> > >>> +    /* Replay existing address space mappings on the associated memory region */
> > >> maybe use the "domain" terminology here.
> > >
> > > ok,
> > >
> > > Thanks
> > > -Bharat
> > >
> > >>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> > >>> +        if (ep_id == node->iommu_dev->devfn) {
> > >>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
> > >>> +                           &node->iommu_dev->iommu_mr);
> > >>> +        }
> > >>> +    }
> > >>> +
> > >>>      return VIRTIO_IOMMU_S_OK;
> > >>>  }
> > >>>
> > >>>
> > >> Thanks
> > >>
> > >> Eric
> > >>
> > >
> >



* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-16  8:58           ` Bharat Bhushan
@ 2020-03-16  9:04             ` Auger Eric
  2020-03-16  9:10               ` Bharat Bhushan
  0 siblings, 1 reply; 27+ messages in thread
From: Auger Eric @ 2020-03-16  9:04 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, alex.williamson, qemu-arm, Bharat Bhushan,
	linuc.decode, eric.auger.pro

Hi Bharat,

On 3/16/20 9:58 AM, Bharat Bhushan wrote:
> Hi Eric,
> 
> On Mon, Mar 16, 2020 at 1:15 PM Bharat Bhushan <bharatb.linux@gmail.com> wrote:
>>
>> Hi Eric,
>>
>> On Mon, Mar 16, 2020 at 1:02 PM Auger Eric <eric.auger@redhat.com> wrote:
>>>
>>> Hi Bharat,
>>>
>>> On 3/16/20 7:41 AM, Bharat Bhushan wrote:
>>>> Hi Eric,
>>>>
>>>> On Fri, Mar 13, 2020 at 8:11 PM Auger Eric <eric.auger@redhat.com> wrote:
>>>>>
>>>>> Hi Bharat
>>>>>
>>>>> On 3/13/20 8:48 AM, Bharat Bhushan wrote:
>>>>>> iommu-notifier are called when a device is attached
>>>>> IOMMU notifiers
>>>>>> or detached to as address-space.
>>>>>> This is needed for VFIO.
>>>>> and vhost for detach
>>>>>>
>>>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>>>>> ---
>>>>>>  hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
>>>>>>  1 file changed, 47 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>>>>> index e51344a53e..2006f72901 100644
>>>>>> --- a/hw/virtio/virtio-iommu.c
>>>>>> +++ b/hw/virtio/virtio-iommu.c
>>>>>> @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
>>>>>>      uint32_t id;
>>>>>>      VirtIOIOMMUDomain *domain;
>>>>>>      QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
>>>>>> +    VirtIOIOMMU *viommu;
>>>>> This needs special care on post-load. When migrating the EPs, only the id
>>>>> is migrated. On post-load you need to set viommu as it is done for
>>>>> domain. Migration is allowed with vhost.
>>>>
>>>> OK, I have not tried vhost/migration. The change below sets viommu when
>>>> reconstructing the endpoint.
>>>
>>>
>>> Yes I think this should be OK.
>>>
>>> In the end I gave the series a try with vhost/VFIO. With vhost it works
>>> (not with a recent kernel though, but the issue may be related to the kernel).
>>> With VFIO, however, it does not work for me.
>>>
>>> The first issue is: your guest can use 4K pages while your host uses 64KB
>>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
>>> a way to pass the host settings to the VIRTIO-IOMMU device.
>>>
>>> Even with 64KB pages, it did not work for me. I obviously do not have the
>>> storm of VFIO_DMA_MAP failures, but I have some, most probably due to
>>> some wrong notifications somewhere. I will try to investigate on my side.
>>>
>>> Did you test with VFIO on your side?
>>
>> I did not try different page sizes; I only tested with a 4K page size.
>>
>> Yes, it works. I tested with two network devices assigned to the VM, and both interfaces work.
>>
>> I will first try with a 64K page size.
> 
> A 64K page size does not work for me either.
> 
> I think we are not passing the correct page_size_mask here
> (config.page_size_mask is set to TARGET_PAGE_MASK, which is
> 0xfffffffffffff000).
I guess you mean with guest using 4K and host using 64K.
> 
> We need to set this according to the host page size, correct?
Yes, that's correct. We need to put in place a control path to retrieve
the page settings on the host through VFIO and inform the virtio-iommu device.

Besides this issue, did you try with 64KB on both host and guest?

Thanks

Eric
> 
> Thanks
> -Bharat
> 
>>
>> Thanks
>> -Bharat
>>
>>>
>>> Thanks
>>>
>>> Eric
>>>>
>>>> @@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer
>>>> key, gpointer value,
>>>>
>>>>      QLIST_FOREACH(iter, &d->endpoint_list, next) {
>>>>          iter->domain = d;
>>>> +       iter->viommu = s;
>>>>          g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter);
>>>>      }
>>>>      return false; /* continue the domain traversal */
>>>>
>>>>>>  } VirtIOIOMMUEndpoint;
>>>>>>
>>>>>>  typedef struct VirtIOIOMMUInterval {
>>>>>> @@ -155,8 +156,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
>>>>>>      memory_region_notify_iommu(mr, 0, entry);
>>>>>>  }
>>>>>>
>>>>>> +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
>>>>>> +                                           gpointer data)
>>>>>> +{
>>>>>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
>>>>>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
>>>>>> +
>>>>>> +    virtio_iommu_notify_unmap(mr, interval->low,
>>>>>> +                              interval->high - interval->low + 1);
>>>>>> +
>>>>>> +    return false;
>>>>>> +}
>>>>>> +
>>>>>> +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
>>>>>> +                                         gpointer data)
>>>>>> +{
>>>>>> +    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
>>>>>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
>>>>>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
>>>>>> +
>>>>>> +    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
>>>>>> +                            interval->high - interval->low + 1);
>>>>>> +
>>>>>> +    return false;
>>>>>> +}
>>>>>> +
>>>>>>  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
>>>>>>  {
>>>>>> +    VirtioIOMMUNotifierNode *node;
>>>>>> +    VirtIOIOMMU *s = ep->viommu;
>>>>>> +    VirtIOIOMMUDomain *domain = ep->domain;
>>>>>> +
>>>>>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>>>>>> +        if (ep->id == node->iommu_dev->devfn) {
>>>>>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
>>>>>> +                           &node->iommu_dev->iommu_mr);
>>>>> I understand this should do the job for domain removal
>>>>
>>>> I did not get the comment; are you saying we should do this on domain removal?
>>> see my reply on 2/5
>>>
>>> Note the above code should be moved after the check of !ep->domain below
>>
>> Oh yes, I will move it.
>>
>> Thanks
>> -Bharat
>>
>>>>
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>>      if (!ep->domain) {
>>>>>>          return;
>>>>>>      }
>>>>>> @@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
>>>>>>      }
>>>>>>      ep = g_malloc0(sizeof(*ep));
>>>>>>      ep->id = ep_id;
>>>>>> +    ep->viommu = s;
>>>>>>      trace_virtio_iommu_get_endpoint(ep_id);
>>>>>>      g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
>>>>>>      return ep;
>>>>>> @@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>>>>>>  {
>>>>>>      uint32_t domain_id = le32_to_cpu(req->domain);
>>>>>>      uint32_t ep_id = le32_to_cpu(req->endpoint);
>>>>>> +    VirtioIOMMUNotifierNode *node;
>>>>>>      VirtIOIOMMUDomain *domain;
>>>>>>      VirtIOIOMMUEndpoint *ep;
>>>>>>
>>>>>> @@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>>>>>>
>>>>>>      ep->domain = domain;
>>>>>>
>>>>>> +    /* Replay existing address space mappings on the associated memory region */
>>>>> maybe use the "domain" terminology here.
>>>>
>>>> ok,
>>>>
>>>> Thanks
>>>> -Bharat
>>>>
>>>>>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>>>>>> +        if (ep_id == node->iommu_dev->devfn) {
>>>>>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
>>>>>> +                           &node->iommu_dev->iommu_mr);
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>>      return VIRTIO_IOMMU_S_OK;
>>>>>>  }
>>>>>>
>>>>>>
>>>>> Thanks
>>>>>
>>>>> Eric
>>>>>
>>>>
>>>
> 




* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-16  9:04             ` Auger Eric
@ 2020-03-16  9:10               ` Bharat Bhushan
  2020-03-16 10:11                 ` Jean-Philippe Brucker
  0 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-16  9:10 UTC (permalink / raw)
  To: Auger Eric
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, alex.williamson, qemu-arm, Bharat Bhushan,
	linuc.decode, eric.auger.pro

Hi Eric,

On Mon, Mar 16, 2020 at 2:35 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Bharat,
>
> On 3/16/20 9:58 AM, Bharat Bhushan wrote:
> > Hi Eric,
> >
> > On Mon, Mar 16, 2020 at 1:15 PM Bharat Bhushan <bharatb.linux@gmail.com> wrote:
> >>
> >> Hi Eric,
> >>
> >> On Mon, Mar 16, 2020 at 1:02 PM Auger Eric <eric.auger@redhat.com> wrote:
> >>>
> >>> Hi Bharat,
> >>>
> >>> On 3/16/20 7:41 AM, Bharat Bhushan wrote:
> >>>> Hi Eric,
> >>>>
> >>>> On Fri, Mar 13, 2020 at 8:11 PM Auger Eric <eric.auger@redhat.com> wrote:
> >>>>>
> >>>>> Hi Bharat
> >>>>>
> >>>>> On 3/13/20 8:48 AM, Bharat Bhushan wrote:
> >>>>>> iommu-notifier are called when a device is attached
> >>>>> IOMMU notifiers
> >>>>>> or detached to as address-space.
> >>>>>> This is needed for VFIO.
> >>>>> and vhost for detach
> >>>>>>
> >>>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> >>>>>> ---
> >>>>>>  hw/virtio/virtio-iommu.c | 47 ++++++++++++++++++++++++++++++++++++++++
> >>>>>>  1 file changed, 47 insertions(+)
> >>>>>>
> >>>>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> >>>>>> index e51344a53e..2006f72901 100644
> >>>>>> --- a/hw/virtio/virtio-iommu.c
> >>>>>> +++ b/hw/virtio/virtio-iommu.c
> >>>>>> @@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
> >>>>>>      uint32_t id;
> >>>>>>      VirtIOIOMMUDomain *domain;
> >>>>>>      QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
> >>>>>> +    VirtIOIOMMU *viommu;
> >>>>> This needs special care on post-load. When migrating the EPs, only the id
> >>>>> is migrated. On post-load you need to set viommu as it is done for
> >>>>> domain. Migration is allowed with vhost.
> >>>>
> >>>> OK, I have not tried vhost/migration. The change below sets viommu when
> >>>> reconstructing the endpoint.
> >>>
> >>>
> >>> Yes I think this should be OK.
> >>>
> >>> In the end I gave the series a try with vhost/VFIO. With vhost it works
> >>> (not with a recent kernel though, but the issue may be related to the kernel).
> >>> With VFIO, however, it does not work for me.
> >>>
> >>> The first issue is: your guest can use 4K pages while your host uses 64KB
> >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> >>> a way to pass the host settings to the VIRTIO-IOMMU device.
> >>>
> >>> Even with 64KB pages, it did not work for me. I obviously do not have the
> >>> storm of VFIO_DMA_MAP failures, but I have some, most probably due to
> >>> some wrong notifications somewhere. I will try to investigate on my side.
> >>>
> >>> Did you test with VFIO on your side?
> >>
> >> I did not try different page sizes; I only tested with a 4K page size.
> >>
> >> Yes, it works. I tested with two network devices assigned to the VM, and both interfaces work.
> >>
> >> I will first try with a 64K page size.
> >
> > A 64K page size does not work for me either.
> >
> > I think we are not passing the correct page_size_mask here
> > (config.page_size_mask is set to TARGET_PAGE_MASK, which is
> > 0xfffffffffffff000).
> I guess you mean with guest using 4K and host using 64K.
> >
> > We need to set this according to the host page size, correct?
> Yes, that's correct. We need to put in place a control path to retrieve
> the page settings on the host through VFIO and inform the virtio-iommu device.
>
> Besides this issue, did you try with 64KB on both host and guest?

I tried the following:
  - 4K host and 4K guest: works with the v7 version
  - 64K host and 64K guest: does not work with v7;
    hard-coding config.page_size_mask to 0xffffffffffff0000 makes it work
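
For reference, the two mask values in play are just ~(page_size - 1) for the 4K and 64K granules. A minimal sketch of that relationship (illustrative C, not the actual QEMU code; the function name is made up):

```c
#include <stdint.h>

/* Illustrative only: clearing the page-offset bits of a power-of-two
 * page size yields the mask values discussed in this thread:
 * 0xfffffffffffff000 for 4K (TARGET_PAGE_MASK) and 0xffffffffffff0000
 * for 64K. */
uint64_t page_size_to_mask(uint64_t page_size)
{
    return ~(page_size - 1);
}
```

For example, page_size_to_mask(64 * 1024) gives 0xffffffffffff0000, the value hard-coded in the experiment above.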

Thanks
-Bharat

>
> Thanks
>
> Eric
> >
> > Thanks
> > -Bharat
> >
> >>
> >> Thanks
> >> -Bharat
> >>
> >>>
> >>> Thanks
> >>>
> >>> Eric
> >>>>
> >>>> @@ -984,6 +973,7 @@ static gboolean reconstruct_endpoints(gpointer
> >>>> key, gpointer value,
> >>>>
> >>>>      QLIST_FOREACH(iter, &d->endpoint_list, next) {
> >>>>          iter->domain = d;
> >>>> +       iter->viommu = s;
> >>>>          g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter);
> >>>>      }
> >>>>      return false; /* continue the domain traversal */
> >>>>
> >>>>>>  } VirtIOIOMMUEndpoint;
> >>>>>>
> >>>>>>  typedef struct VirtIOIOMMUInterval {
> >>>>>> @@ -155,8 +156,44 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
> >>>>>>      memory_region_notify_iommu(mr, 0, entry);
> >>>>>>  }
> >>>>>>
> >>>>>> +static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
> >>>>>> +                                           gpointer data)
> >>>>>> +{
> >>>>>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> >>>>>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> >>>>>> +
> >>>>>> +    virtio_iommu_notify_unmap(mr, interval->low,
> >>>>>> +                              interval->high - interval->low + 1);
> >>>>>> +
> >>>>>> +    return false;
> >>>>>> +}
> >>>>>> +
> >>>>>> +static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
> >>>>>> +                                         gpointer data)
> >>>>>> +{
> >>>>>> +    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
> >>>>>> +    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
> >>>>>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> >>>>>> +
> >>>>>> +    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
> >>>>>> +                            interval->high - interval->low + 1);
> >>>>>> +
> >>>>>> +    return false;
> >>>>>> +}
> >>>>>> +
> >>>>>>  static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
> >>>>>>  {
> >>>>>> +    VirtioIOMMUNotifierNode *node;
> >>>>>> +    VirtIOIOMMU *s = ep->viommu;
> >>>>>> +    VirtIOIOMMUDomain *domain = ep->domain;
> >>>>>> +
> >>>>>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> >>>>>> +        if (ep->id == node->iommu_dev->devfn) {
> >>>>>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
> >>>>>> +                           &node->iommu_dev->iommu_mr);
> >>>>> I understand this should do the job for domain removal
> >>>>
> >>>> I did not get the comment; are you saying we should do this on domain removal?
> >>> see my reply on 2/5
> >>>
> >>> Note the above code should be moved after the check of !ep->domain below
> >>
> >> Oh yes, I will move it.
> >>
> >> Thanks
> >> -Bharat
> >>
> >>>>
> >>>>>> +        }
> >>>>>> +    }
> >>>>>> +
> >>>>>>      if (!ep->domain) {
> >>>>>>          return;
> >>>>>>      }
> >>>>>> @@ -178,6 +215,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
> >>>>>>      }
> >>>>>>      ep = g_malloc0(sizeof(*ep));
> >>>>>>      ep->id = ep_id;
> >>>>>> +    ep->viommu = s;
> >>>>>>      trace_virtio_iommu_get_endpoint(ep_id);
> >>>>>>      g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
> >>>>>>      return ep;
> >>>>>> @@ -272,6 +310,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> >>>>>>  {
> >>>>>>      uint32_t domain_id = le32_to_cpu(req->domain);
> >>>>>>      uint32_t ep_id = le32_to_cpu(req->endpoint);
> >>>>>> +    VirtioIOMMUNotifierNode *node;
> >>>>>>      VirtIOIOMMUDomain *domain;
> >>>>>>      VirtIOIOMMUEndpoint *ep;
> >>>>>>
> >>>>>> @@ -299,6 +338,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> >>>>>>
> >>>>>>      ep->domain = domain;
> >>>>>>
> >>>>>> +    /* Replay existing address space mappings on the associated memory region */
> >>>>> maybe use the "domain" terminology here.
> >>>>
> >>>> ok,
> >>>>
> >>>> Thanks
> >>>> -Bharat
> >>>>
> >>>>>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> >>>>>> +        if (ep_id == node->iommu_dev->devfn) {
> >>>>>> +            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
> >>>>>> +                           &node->iommu_dev->iommu_mr);
> >>>>>> +        }
> >>>>>> +    }
> >>>>>> +
> >>>>>>      return VIRTIO_IOMMU_S_OK;
> >>>>>>  }
> >>>>>>
> >>>>>>
> >>>>> Thanks
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>
> >>>
> >
>



* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-16  9:10               ` Bharat Bhushan
@ 2020-03-16 10:11                 ` Jean-Philippe Brucker
  2020-03-17  7:10                   ` Bharat Bhushan
  0 siblings, 1 reply; 27+ messages in thread
From: Jean-Philippe Brucker @ 2020-03-16 10:11 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, Auger Eric, alex.williamson, qemu-arm,
	Bharat Bhushan, linuc.decode, eric.auger.pro

Hi Bharat,

Could you Cc me on your next posting?  Unfortunately I don't have much
hardware for testing this at the moment, but I might be able to help a
little on the review.

On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> > >>> The first issue is: your guest can use 4K pages while your host uses 64KB
> > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> > >>> a way to pass the host settings to the VIRTIO-IOMMU device.
> > >>>
> > >>> Even with 64KB pages, it did not work for me. I obviously do not have the
> > >>> storm of VFIO_DMA_MAP failures, but I have some, most probably due to
> > >>> some wrong notifications somewhere. I will try to investigate on my side.
> > >>>
> > >>> Did you test with VFIO on your side?
> > >>
> > >> I did not try different page sizes; I only tested with a 4K page size.
> > >>
> > >> Yes, it works. I tested with two network devices assigned to the VM, and both interfaces work.
> > >>
> > >> I will first try with a 64K page size.
> > >
> > > A 64K page size does not work for me either.
> > >
> > > I think we are not passing the correct page_size_mask here
> > > (config.page_size_mask is set to TARGET_PAGE_MASK, which is
> > > 0xfffffffffffff000).
> > I guess you mean with guest using 4K and host using 64K.
> > >
> > > We need to set this correctly as per host page size, correct?
> > Yes that's correct. We need to put in place a control path to retrieve
> > the page settings on host through VFIO to inform the virtio-iommu device.
> >
> > Besides this issue, did you try with 64kB on host and guest?
> 
> I tried Followings
>   - 4k host and 4k guest  - it works with v7 version
>   - 64k host and 64k guest - it does not work with v7
>     hard-coded config.page_size_mask to 0xffffffffffff0000 and it works

You might get this from the iova_pgsizes bitmap returned by
VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global, so there
is the usual problem of aggregating consistent properties, but I'm
guessing that using the host page size as the granule here is safe enough.

If it is a problem, we can add a PROBE property for the page size mask,
allowing per-endpoint page masks to be defined. I have kernel patches
somewhere to do just that.
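
The aggregation idea above could be sketched as follows (illustrative C, not QEMU or kernel code; the helper names are made up). In VFIO, bit N of iova_pgsizes set means a granule of 1 << N bytes is supported, so intersecting the per-container bitmaps keeps only the page sizes every container can handle:

```c
#include <stdint.h>

/* Illustrative only: combine the iova_pgsizes bitmaps reported by
 * VFIO_IOMMU_GET_INFO for several containers.  The intersection retains
 * only the granules that every container supports. */
uint64_t aggregate_pgsizes(const uint64_t *per_container, int n)
{
    uint64_t mask = UINT64_MAX;   /* no containers: no restriction */
    for (int i = 0; i < n; i++) {
        mask &= per_container[i];
    }
    return mask;
}

/* Smallest common granule, e.g. 1 << 16 (64KiB) on a 64K host. */
uint64_t smallest_granule(uint64_t pgsizes)
{
    return pgsizes & -pgsizes;    /* isolate the lowest set bit */
}
```

With a 4K container (0xfffffffffffff000) and a 64K container (0xffffffffffff0000), the intersection is the 64K mask, matching the value that worked in the experiment above.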

Thanks,
Jean



* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-16 10:11                 ` Jean-Philippe Brucker
@ 2020-03-17  7:10                   ` Bharat Bhushan
  2020-03-17  8:25                     ` Auger Eric
  2020-03-17  8:53                     ` Jean-Philippe Brucker
  0 siblings, 2 replies; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-17  7:10 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, Auger Eric, alex.williamson, qemu-arm,
	Bharat Bhushan, linuc.decode, eric.auger.pro

Hi Jean,

On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> Hi Bharat,
>
> Could you Cc me on your next posting?  Unfortunately I don't have much
> hardware for testing this at the moment, but I might be able to help a
> little on the review.
>
> On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> > > >>> First issue is: your guest can use 4K page and your host can use 64KB
> > > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> > > >>> a way to pass the host settings to the VIRTIO-IOMMU device.
> > > >>>
> > > >>> Even with 64KB pages, it did not work for me. I have obviously not the
> > > >>> storm of VFIO_DMA_MAP failures but I have some, most probably due to
> > > >>> some wrong notifications somewhere. I will try to investigate on my side.
> > > >>>
> > > >>> Did you test with VFIO on your side?
> > > >>
> > > >> I did not tried with different page sizes, only tested with 4K page size.
> > > >>
> > > >> Yes it works, I tested with two n/w device assigned to VM, both interfaces works
> > > >>
> > > >> First I will try with 64k page size.
> > > >
> > > > 64K page size does not work for me as well,
> > > >
> > > > I think we are not passing correct page_size_mask here
> > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is
> > > > 0xfffffffffffff000))
> > > I guess you mean with guest using 4K and host using 64K.
> > > >
> > > > We need to set this correctly as per host page size, correct?
> > > Yes that's correct. We need to put in place a control path to retrieve
> > > the page settings on host through VFIO to inform the virtio-iommu device.
> > >
> > > Besides this issue, did you try with 64kB on host and guest?
> >
> > I tried Followings
> >   - 4k host and 4k guest  - it works with v7 version
> >   - 64k host and 64k guest - it does not work with v7
> >     hard-coded config.page_size_mask to 0xffffffffffff0000 and it works
>
> You might get this from the iova_pgsize bitmap returned by
> VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so there
> is the usual problem of aggregating consistent properties, but I'm
> guessing using the host page size as a granule here is safe enough.
>
> If it is a problem, we can add a PROBE property for page size mask,
> allowing to define per-endpoint page masks. I have kernel patches
> somewhere to do just that.

I do not see that we need a page size mask per endpoint.

While I am trying to understand which "page-size-mask" the guest will work with:

- 4K page size host and 4K page size guest
  config.page_size_mask = 0xfffffffffffff000 will work

- 64K page size host and 64K page size guest
  config.page_size_mask = 0xffffffffffff0000 will work

- 64K page size host and 4K page size guest
   1) config.page_size_mask = 0xfffffffffffff000 will not work, as VFIO
in the host expects iova and size to be aligned to 64K (PAGE_SIZE in the
host)
   2) config.page_size_mask = 0xffffffffffff0000 will not work either;
iova initialization (in the guest) expects the minimum page size
supported by the h/w to be equal to 4K (PAGE_SIZE in the guest)
       Should we look to relax this in the iova allocation code?

Thanks
-Bharat


>
> Thanks,
> Jean


* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-17  7:10                   ` Bharat Bhushan
@ 2020-03-17  8:25                     ` Auger Eric
  2020-03-17  8:53                     ` Jean-Philippe Brucker
  1 sibling, 0 replies; 27+ messages in thread
From: Auger Eric @ 2020-03-17  8:25 UTC (permalink / raw)
  To: Bharat Bhushan, Jean-Philippe Brucker
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, alex.williamson, qemu-arm, Bharat Bhushan,
	linuc.decode, eric.auger.pro

Hi Bharat,

On 3/17/20 8:10 AM, Bharat Bhushan wrote:
> Hi Jean,
> 
> On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
> <jean-philippe@linaro.org> wrote:
>>
>> Hi Bharat,
>>
>> Could you Cc me on your next posting?  Unfortunately I don't have much
>> hardware for testing this at the moment, but I might be able to help a
>> little on the review.
>>
>> On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
>>>>>>> First issue is: your guest can use 4K page and your host can use 64KB
>>>>>>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
>>>>>>> a way to pass the host settings to the VIRTIO-IOMMU device.
>>>>>>>
>>>>>>> Even with 64KB pages, it did not work for me. I have obviously not the
>>>>>>> storm of VFIO_DMA_MAP failures but I have some, most probably due to
>>>>>>> some wrong notifications somewhere. I will try to investigate on my side.
>>>>>>>
>>>>>>> Did you test with VFIO on your side?
>>>>>>
>>>>>> I did not tried with different page sizes, only tested with 4K page size.
>>>>>>
>>>>>> Yes it works, I tested with two n/w device assigned to VM, both interfaces works
>>>>>>
>>>>>> First I will try with 64k page size.
>>>>>
>>>>> 64K page size does not work for me as well,
>>>>>
>>>>> I think we are not passing correct page_size_mask here
>>>>> (config.page_size_mask is set to TARGET_PAGE_MASK ( which is
>>>>> 0xfffffffffffff000))
>>>> I guess you mean with guest using 4K and host using 64K.
>>>>>
>>>>> We need to set this correctly as per host page size, correct?
>>>> Yes that's correct. We need to put in place a control path to retrieve
>>>> the page settings on host through VFIO to inform the virtio-iommu device.
>>>>
>>>> Besides this issue, did you try with 64kB on host and guest?
>>>
>>> I tried Followings
>>>   - 4k host and 4k guest  - it works with v7 version
>>>   - 64k host and 64k guest - it does not work with v7
>>>     hard-coded config.page_size_mask to 0xffffffffffff0000 and it works
>>
>> You might get this from the iova_pgsize bitmap returned by
>> VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so there
>> is the usual problem of aggregating consistent properties, but I'm
>> guessing using the host page size as a granule here is safe enough.
>>
>> If it is a problem, we can add a PROBE property for page size mask,
>> allowing to define per-endpoint page masks. I have kernel patches
>> somewhere to do just that.
> 
> I do not see we need page size mask per endpoint.
The physical devices can be protected by different physical IOMMUs, and
those may have different page size support.
> 
> While I am trying to understand what "page-size-mask" guest will work with
> 
> - 4K page size host and 4k page size guest
>   config.page_size_mask = 0xffffffffffff000 will work
> 
> - 64K page size host and 64k page size guest
>   config.page_size_mask = 0xfffffffffff0000 will work
I guess not all the page sizes should be exposed by the virtio-iommu
device, only 4K and 64K.

If the host supports 4K, we should expose page sizes of 4K and bigger.
If the host supports 64K, we should expose page sizes of 64K and bigger.

The guest will be forced to use what is exposed, and that should work.

What is missing is a way to retrieve the host-supported page size
bitmask. I can try to help you with that if you want.

Maybe we should first try to upstream vhost support and then VFIO?

Thanks

Eric
> 
> - 64K page size host and 4k page size guest
>    1) config.page_size_mask = 0xffffffffffff000 will also not work as
> VFIO in host expect iova and size to be aligned to 64k (PAGE_SIZE in
> host)
>    2) config.page_size_mask = 0xfffffffffff0000 will not work, iova
> initialization (in guest) expect minimum page-size supported by h/w to
> be equal to 4k (PAGE_SIZE in guest)
>        Should we look to relax this in iova allocation code?
> 
> Thanks
> -Bharat
> 
> 
>>
>> Thanks,
>> Jean
> 



* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-17  7:10                   ` Bharat Bhushan
  2020-03-17  8:25                     ` Auger Eric
@ 2020-03-17  8:53                     ` Jean-Philippe Brucker
  2020-03-17  9:16                       ` Bharat Bhushan
  1 sibling, 1 reply; 27+ messages in thread
From: Jean-Philippe Brucker @ 2020-03-17  8:53 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: peter.maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, Auger Eric, alex.williamson, qemu-arm,
	Bharat Bhushan, linuc.decode, eric.auger.pro

On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote:
> Hi Jean,
> 
> On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
> <jean-philippe@linaro.org> wrote:
> >
> > Hi Bharat,
> >
> > Could you Cc me on your next posting?  Unfortunately I don't have much
> > hardware for testing this at the moment, but I might be able to help a
> > little on the review.
> >
> > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> > > > >>> First issue is: your guest can use 4K page and your host can use 64KB
> > > > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> > > > >>> a way to pass the host settings to the VIRTIO-IOMMU device.
> > > > >>>
> > > > >>> Even with 64KB pages, it did not work for me. I have obviously not the
> > > > >>> storm of VFIO_DMA_MAP failures but I have some, most probably due to
> > > > >>> some wrong notifications somewhere. I will try to investigate on my side.
> > > > >>>
> > > > >>> Did you test with VFIO on your side?
> > > > >>
> > > > >> I did not tried with different page sizes, only tested with 4K page size.
> > > > >>
> > > > >> Yes it works, I tested with two n/w device assigned to VM, both interfaces works
> > > > >>
> > > > >> First I will try with 64k page size.
> > > > >
> > > > > 64K page size does not work for me as well,
> > > > >
> > > > > I think we are not passing correct page_size_mask here
> > > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is
> > > > > 0xfffffffffffff000))
> > > > I guess you mean with guest using 4K and host using 64K.
> > > > >
> > > > > We need to set this correctly as per host page size, correct?
> > > > Yes that's correct. We need to put in place a control path to retrieve
> > > > the page settings on host through VFIO to inform the virtio-iommu device.
> > > >
> > > > Besides this issue, did you try with 64kB on host and guest?
> > >
> > > I tried Followings
> > >   - 4k host and 4k guest  - it works with v7 version
> > >   - 64k host and 64k guest - it does not work with v7
> > >     hard-coded config.page_size_mask to 0xffffffffffff0000 and it works
> >
> > You might get this from the iova_pgsize bitmap returned by
> > VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so there
> > is the usual problem of aggregating consistent properties, but I'm
> > guessing using the host page size as a granule here is safe enough.
> >
> > If it is a problem, we can add a PROBE property for page size mask,
> > allowing to define per-endpoint page masks. I have kernel patches
> > somewhere to do just that.
> 
> I do not see we need page size mask per endpoint.
> 
> While I am trying to understand what "page-size-mask" guest will work with
> 
> - 4K page size host and 4k page size guest
>   config.page_size_mask = 0xffffffffffff000 will work
> 
> - 64K page size host and 64k page size guest
>   config.page_size_mask = 0xfffffffffff0000 will work
> 
> - 64K page size host and 4k page size guest
>    1) config.page_size_mask = 0xffffffffffff000 will also not work as
> VFIO in host expect iova and size to be aligned to 64k (PAGE_SIZE in
> host)
>    2) config.page_size_mask = 0xfffffffffff0000 will not work, iova
> initialization (in guest) expect minimum page-size supported by h/w to
> be equal to 4k (PAGE_SIZE in guest)
>        Should we look to relax this in iova allocation code?

Oh right, that's not great. Maybe the BUG_ON() can be removed, I'll ask on
the list.

In the meantime, 64k granule is the right value to advertise to the guest
in this case. Did you try 64k guest 4k host?

Thanks,
Jean


* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-17  8:53                     ` Jean-Philippe Brucker
@ 2020-03-17  9:16                       ` Bharat Bhushan
  2020-03-17 15:59                         ` Jean-Philippe Brucker
  0 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-17  9:16 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Peter Maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, Auger Eric, alex.williamson, qemu-arm,
	Bharat Bhushan, linuc.decode, eric.auger.pro

Hi Jean,

On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote:
> > Hi Jean,
> >
> > On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
> > <jean-philippe@linaro.org> wrote:
> > >
> > > Hi Bharat,
> > >
> > > Could you Cc me on your next posting?  Unfortunately I don't have much
> > > hardware for testing this at the moment, but I might be able to help a
> > > little on the review.
> > >
> > > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> > > > > >>> First issue is: your guest can use 4K page and your host can use 64KB
> > > > > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> > > > > >>> a way to pass the host settings to the VIRTIO-IOMMU device.
> > > > > >>>
> > > > > >>> Even with 64KB pages, it did not work for me. I have obviously not the
> > > > > >>> storm of VFIO_DMA_MAP failures but I have some, most probably due to
> > > > > >>> some wrong notifications somewhere. I will try to investigate on my side.
> > > > > >>>
> > > > > >>> Did you test with VFIO on your side?
> > > > > >>
> > > > > >> I did not tried with different page sizes, only tested with 4K page size.
> > > > > >>
> > > > > >> Yes it works, I tested with two n/w device assigned to VM, both interfaces works
> > > > > >>
> > > > > >> First I will try with 64k page size.
> > > > > >
> > > > > > 64K page size does not work for me as well,
> > > > > >
> > > > > > I think we are not passing correct page_size_mask here
> > > > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is
> > > > > > 0xfffffffffffff000))
> > > > > I guess you mean with guest using 4K and host using 64K.
> > > > > >
> > > > > > We need to set this correctly as per host page size, correct?
> > > > > Yes that's correct. We need to put in place a control path to retrieve
> > > > > the page settings on host through VFIO to inform the virtio-iommu device.
> > > > >
> > > > > Besides this issue, did you try with 64kB on host and guest?
> > > >
> > > > I tried Followings
> > > >   - 4k host and 4k guest  - it works with v7 version
> > > >   - 64k host and 64k guest - it does not work with v7
> > > >     hard-coded config.page_size_mask to 0xffffffffffff0000 and it works
> > >
> > > You might get this from the iova_pgsize bitmap returned by
> > > VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so there
> > > is the usual problem of aggregating consistent properties, but I'm
> > > guessing using the host page size as a granule here is safe enough.
> > >
> > > If it is a problem, we can add a PROBE property for page size mask,
> > > allowing to define per-endpoint page masks. I have kernel patches
> > > somewhere to do just that.
> >
> > I do not see we need page size mask per endpoint.
> >
> > While I am trying to understand what "page-size-mask" guest will work with
> >
> > - 4K page size host and 4k page size guest
> >   config.page_size_mask = 0xffffffffffff000 will work
> >
> > - 64K page size host and 64k page size guest
> >   config.page_size_mask = 0xfffffffffff0000 will work
> >
> > - 64K page size host and 4k page size guest
> >    1) config.page_size_mask = 0xffffffffffff000 will also not work as
> > VFIO in host expect iova and size to be aligned to 64k (PAGE_SIZE in
> > host)
> >    2) config.page_size_mask = 0xfffffffffff0000 will not work, iova
> > initialization (in guest) expect minimum page-size supported by h/w to
> > be equal to 4k (PAGE_SIZE in guest)
> >        Should we look to relax this in iova allocation code?
>
> Oh right, that's not great. Maybe the BUG_ON() can be removed, I'll ask on
> the list.

Yes, the BUG_ON() in iova_init.
I tried removing it and it worked, but I have not analyzed the side effects.

>
> In the meantime, 64k granule is the right value to advertise to the guest
> in this case.
> Did you try 64k guest 4k host?

No, I will try.

Thanks
-Bharat

>
> Thanks,
> Jean


* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-17  9:16                       ` Bharat Bhushan
@ 2020-03-17 15:59                         ` Jean-Philippe Brucker
  2020-03-18 10:17                           ` Bharat Bhushan
  0 siblings, 1 reply; 27+ messages in thread
From: Jean-Philippe Brucker @ 2020-03-17 15:59 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Peter Maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, Auger Eric, alex.williamson, qemu-arm,
	Bharat Bhushan, linuc.decode, eric.auger.pro

On Tue, Mar 17, 2020 at 02:46:55PM +0530, Bharat Bhushan wrote:
> Hi Jean,
> 
> On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker
> <jean-philippe@linaro.org> wrote:
> >
> > On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote:
> > > Hi Jean,
> > >
> > > On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
> > > <jean-philippe@linaro.org> wrote:
> > > >
> > > > Hi Bharat,
> > > >
> > > > Could you Cc me on your next posting?  Unfortunately I don't have much
> > > > hardware for testing this at the moment, but I might be able to help a
> > > > little on the review.
> > > >
> > > > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> > > > > > >>> First issue is: your guest can use 4K page and your host can use 64KB
> > > > > > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> > > > > > >>> a way to pass the host settings to the VIRTIO-IOMMU device.
> > > > > > >>>
> > > > > > >>> Even with 64KB pages, it did not work for me. I have obviously not the
> > > > > > >>> storm of VFIO_DMA_MAP failures but I have some, most probably due to
> > > > > > >>> some wrong notifications somewhere. I will try to investigate on my side.
> > > > > > >>>
> > > > > > >>> Did you test with VFIO on your side?
> > > > > > >>
> > > > > > >> I did not tried with different page sizes, only tested with 4K page size.
> > > > > > >>
> > > > > > >> Yes it works, I tested with two n/w device assigned to VM, both interfaces works
> > > > > > >>
> > > > > > >> First I will try with 64k page size.
> > > > > > >
> > > > > > > 64K page size does not work for me as well,
> > > > > > >
> > > > > > > I think we are not passing correct page_size_mask here
> > > > > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is
> > > > > > > 0xfffffffffffff000))
> > > > > > I guess you mean with guest using 4K and host using 64K.
> > > > > > >
> > > > > > > We need to set this correctly as per host page size, correct?
> > > > > > Yes that's correct. We need to put in place a control path to retrieve
> > > > > > the page settings on host through VFIO to inform the virtio-iommu device.
> > > > > >
> > > > > > Besides this issue, did you try with 64kB on host and guest?
> > > > >
> > > > > I tried Followings
> > > > >   - 4k host and 4k guest  - it works with v7 version
> > > > >   - 64k host and 64k guest - it does not work with v7
> > > > >     hard-coded config.page_size_mask to 0xffffffffffff0000 and it works
> > > >
> > > > You might get this from the iova_pgsize bitmap returned by
> > > > VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so there
> > > > is the usual problem of aggregating consistent properties, but I'm
> > > > guessing using the host page size as a granule here is safe enough.
> > > >
> > > > If it is a problem, we can add a PROBE property for page size mask,
> > > > allowing to define per-endpoint page masks. I have kernel patches
> > > > somewhere to do just that.
> > >
> > > I do not see we need page size mask per endpoint.
> > >
> > > While I am trying to understand what "page-size-mask" guest will work with
> > >
> > > - 4K page size host and 4k page size guest
> > >   config.page_size_mask = 0xffffffffffff000 will work
> > >
> > > - 64K page size host and 64k page size guest
> > >   config.page_size_mask = 0xfffffffffff0000 will work
> > >
> > > - 64K page size host and 4k page size guest
> > >    1) config.page_size_mask = 0xffffffffffff000 will also not work as
> > > VFIO in host expect iova and size to be aligned to 64k (PAGE_SIZE in
> > > host)
> > >    2) config.page_size_mask = 0xfffffffffff0000 will not work, iova
> > > initialization (in guest) expect minimum page-size supported by h/w to
> > > be equal to 4k (PAGE_SIZE in guest)
> > >        Should we look to relax this in iova allocation code?
> >
> > Oh right, that's not great. Maybe the BUG_ON() can be removed, I'll ask on
> > the list.
> 
> yes, the BUG_ON in iova_init.
> I tried with removing same and it worked, but not analyzed side effects.

It might break the assumption from device drivers that mapping a page is
safe. For example they call alloc_page() followed by dma_map_page(). In
our situation dma-iommu.c will oblige and create one 64k mapping to one 4k
physical page. As a result the endpoint can access the neighbouring 60k of
memory.

This isn't too terrible. After all, even when the page sizes match, device
drivers can call dma_map_single() on sub-page buffers, which will also let
the endpoint access a whole page. The solution, if you don't trust the
endpoint, is to use bounce buffers.

But I suspect it's not as simple as removing the BUG_ON(), we'll need to
go over dma-iommu.c first. And it seems like assigning endpoints to guest
userspace won't work either in this config. In vfio_dma_do_map():

        mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;

        WARN_ON(mask & PAGE_MASK);

If I read this correctly the WARN will trigger in a 4k guest under 64k
host, right?  So maybe we can just say that this config isn't supported,
unless it's an important use-case for virtio-iommu?

Thanks,
Jean



* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-17 15:59                         ` Jean-Philippe Brucker
@ 2020-03-18 10:17                           ` Bharat Bhushan
  2020-03-18 11:17                             ` Jean-Philippe Brucker
  0 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-18 10:17 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Peter Maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, Auger Eric, alex.williamson, qemu-arm,
	Bharat Bhushan, linuc.decode, eric.auger.pro

Hi Jean,

On Tue, Mar 17, 2020 at 9:29 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> On Tue, Mar 17, 2020 at 02:46:55PM +0530, Bharat Bhushan wrote:
> > Hi Jean,
> >
> > On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker
> > <jean-philippe@linaro.org> wrote:
> > >
> > > On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote:
> > > > Hi Jean,
> > > >
> > > > On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
> > > > <jean-philippe@linaro.org> wrote:
> > > > >
> > > > > Hi Bharat,
> > > > >
> > > > > Could you Cc me on your next posting?  Unfortunately I don't have much
> > > > > hardware for testing this at the moment, but I might be able to help a
> > > > > little on the review.
> > > > >
> > > > > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> > > > > > > >>> First issue is: your guest can use 4K page and your host can use 64KB
> > > > > > > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> > > > > > > >>> a way to pass the host settings to the VIRTIO-IOMMU device.
> > > > > > > >>>
> > > > > > > >>> Even with 64KB pages, it did not work for me. I have obviously not the
> > > > > > > >>> storm of VFIO_DMA_MAP failures but I have some, most probably due to
> > > > > > > >>> some wrong notifications somewhere. I will try to investigate on my side.
> > > > > > > >>>
> > > > > > > >>> Did you test with VFIO on your side?
> > > > > > > >>
> > > > > > > >> I did not tried with different page sizes, only tested with 4K page size.
> > > > > > > >>
> > > > > > > >> Yes it works, I tested with two n/w device assigned to VM, both interfaces works
> > > > > > > >>
> > > > > > > >> First I will try with 64k page size.
> > > > > > > >
> > > > > > > > 64K page size does not work for me as well,
> > > > > > > >
> > > > > > > > I think we are not passing correct page_size_mask here
> > > > > > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is
> > > > > > > > 0xfffffffffffff000))
> > > > > > > I guess you mean with guest using 4K and host using 64K.
> > > > > > > >
> > > > > > > > We need to set this correctly as per host page size, correct?
> > > > > > > Yes that's correct. We need to put in place a control path to retrieve
> > > > > > > the page settings on host through VFIO to inform the virtio-iommu device.
> > > > > > >
> > > > > > > Besides this issue, did you try with 64kB on host and guest?
> > > > > >
> > > > > > I tried Followings
> > > > > >   - 4k host and 4k guest  - it works with v7 version
> > > > > >   - 64k host and 64k guest - it does not work with v7
> > > > > >     hard-coded config.page_size_mask to 0xffffffffffff0000 and it works
> > > > >
> > > > > You might get this from the iova_pgsize bitmap returned by
> > > > > VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so there
> > > > > is the usual problem of aggregating consistent properties, but I'm
> > > > > guessing using the host page size as a granule here is safe enough.
> > > > >
> > > > > If it is a problem, we can add a PROBE property for page size mask,
> > > > > allowing to define per-endpoint page masks. I have kernel patches
> > > > > somewhere to do just that.
> > > >
> > > > I do not see we need page size mask per endpoint.
> > > >
> > > > While I am trying to understand what "page-size-mask" guest will work with
> > > >
> > > > - 4K page size host and 4k page size guest
> > > >   config.page_size_mask = 0xffffffffffff000 will work
> > > >
> > > > - 64K page size host and 64k page size guest
> > > >   config.page_size_mask = 0xfffffffffff0000 will work
> > > >
> > > > - 64K page size host and 4k page size guest
> > > >    1) config.page_size_mask = 0xffffffffffff000 will also not work as
> > > > VFIO in host expect iova and size to be aligned to 64k (PAGE_SIZE in
> > > > host)
> > > >    2) config.page_size_mask = 0xfffffffffff0000 will not work, iova
> > > > initialization (in guest) expect minimum page-size supported by h/w to
> > > > be equal to 4k (PAGE_SIZE in guest)
> > > >        Should we look to relax this in iova allocation code?
> > >
> > > Oh right, that's not great. Maybe the BUG_ON() can be removed, I'll ask on
> > > the list.
> >
> > yes, the BUG_ON in iova_init.
> > I tried with removing same and it worked, but not analyzed side effects.
>
> It might break the assumption from device drivers that mapping a page is
> safe. For example they call alloc_page() followed by dma_map_page(). In
> our situation dma-iommu.c will oblige and create one 64k mapping to one 4k
> physical page. As a result the endpoint can access the neighbouring 60k of
> memory.
>
> This isn't too terrible. After all, even when the page sizes match, device
> drivers can call dma_map_single() on sub-page buffers, which will also let
> the endpoint access a whole page. The solution, if you don't trust the
> endpoint, is to use bounce buffers.
>
> But I suspect it's not as simple as removing the BUG_ON(), we'll need to
> go over dma-iommu.c first. And it seems like assigning endpoints to guest
> userspace won't work either in this config. In vfio_dma_do_map():
>
>         mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;
>
>         WARN_ON(mask & PAGE_MASK);

Yes, Agree

>
> If I read this correctly the WARN will trigger in a 4k guest under 64k
> host, right?  So maybe we can just say that this config isn't supported,
> unless it's an important use-case for virtio-iommu?

I sent the v8 version of the patch, and with that a guest and host with
the same page size should work.
I have not yet analyzed how to mark a 4K guest with a 64K host as an
unsupported configuration; I will analyze that and send a patch.

Thanks
-Bharat

>
> Thanks,
> Jean
>


* Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-18 10:17                           ` Bharat Bhushan
@ 2020-03-18 11:17                             ` Jean-Philippe Brucker
  2020-03-18 11:20                               ` [EXT] " Bharat Bhushan
  0 siblings, 1 reply; 27+ messages in thread
From: Jean-Philippe Brucker @ 2020-03-18 11:17 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Peter Maydell, kevin.tian, tnowicki, mst, drjones, peterx,
	qemu-devel, Auger Eric, alex.williamson, qemu-arm,
	Bharat Bhushan, linuc.decode, eric.auger.pro

On Wed, Mar 18, 2020 at 03:47:44PM +0530, Bharat Bhushan wrote:
> Hi Jean,
> 
> On Tue, Mar 17, 2020 at 9:29 PM Jean-Philippe Brucker
> <jean-philippe@linaro.org> wrote:
> >
> > On Tue, Mar 17, 2020 at 02:46:55PM +0530, Bharat Bhushan wrote:
> > > Hi Jean,
> > >
> > > On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker
> > > <jean-philippe@linaro.org> wrote:
> > > >
> > > > On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote:
> > > > > Hi Jean,
> > > > >
> > > > > On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
> > > > > <jean-philippe@linaro.org> wrote:
> > > > > >
> > > > > > Hi Bharat,
> > > > > >
> > > > > > Could you Cc me on your next posting?  Unfortunately I don't have much
> > > > > > hardware for testing this at the moment, but I might be able to help a
> > > > > > little on the review.
> > > > > >
> > > > > > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> > > > > > > > >>> First issue is: your guest can use 4K page and your host can use 64KB
> > > > > > > > >>> pages. In that case VFIO_DMA_MAP will fail with -EINVAL. We must devise
> > > > > > > > >>> a way to pass the host settings to the VIRTIO-IOMMU device.
> > > > > > > > >>>
> > > > > > > > >>> Even with 64KB pages, it did not work for me. I have obviously not the
> > > > > > > > >>> storm of VFIO_DMA_MAP failures but I have some, most probably due to
> > > > > > > > >>> some wrong notifications somewhere. I will try to investigate on my side.
> > > > > > > > >>>
> > > > > > > > >>> Did you test with VFIO on your side?
> > > > > > > > >>
> > > > > > > > >> I did not tried with different page sizes, only tested with 4K page size.
> > > > > > > > >>
> > > > > > > > >> Yes it works, I tested with two n/w device assigned to VM, both interfaces works
> > > > > > > > >>
> > > > > > > > >> First I will try with 64k page size.
> > > > > > > > >
> > > > > > > > > 64K page size does not work for me as well,
> > > > > > > > >
> > > > > > > > > I think we are not passing correct page_size_mask here
> > > > > > > > > (config.page_size_mask is set to TARGET_PAGE_MASK ( which is
> > > > > > > > > 0xfffffffffffff000))
> > > > > > > > I guess you mean with guest using 4K and host using 64K.
> > > > > > > > >
> > > > > > > > > We need to set this correctly as per host page size, correct?
> > > > > > > > Yes that's correct. We need to put in place a control path to retrieve
> > > > > > > > the page settings on host through VFIO to inform the virtio-iommu device.
> > > > > > > >
> > > > > > > > Besides this issue, did you try with 64kB on host and guest?
> > > > > > >
> > > > > > > I tried Followings
> > > > > > >   - 4k host and 4k guest  - it works with v7 version
> > > > > > >   - 64k host and 64k guest - it does not work with v7
> > > > > > >     hard-coded config.page_size_mask to 0xffffffffffff0000 and it works
> > > > > >
> > > > > > You might get this from the iova_pgsize bitmap returned by
> > > > > > VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is global so there
> > > > > > is the usual problem of aggregating consistent properties, but I'm
> > > > > > guessing using the host page size as a granule here is safe enough.
> > > > > >
> > > > > > If it is a problem, we can add a PROBE property for page size mask,
> > > > > > allowing to define per-endpoint page masks. I have kernel patches
> > > > > > somewhere to do just that.
> > > > >
> > > > > I do not see we need page size mask per endpoint.
> > > > >
> > > > > While I am trying to understand what "page-size-mask" guest will work with
> > > > >
> > > > > - 4K page size host and 4k page size guest
> > > > >   config.page_size_mask = 0xffffffffffff000 will work
> > > > >
> > > > > - 64K page size host and 64k page size guest
> > > > >   config.page_size_mask = 0xfffffffffff0000 will work
> > > > >
> > > > > - 64K page size host and 4k page size guest
> > > > >    1) config.page_size_mask = 0xffffffffffff000 will also not work as
> > > > > VFIO in host expect iova and size to be aligned to 64k (PAGE_SIZE in
> > > > > host)
> > > > >    2) config.page_size_mask = 0xfffffffffff0000 will not work, iova
> > > > > initialization (in guest) expect minimum page-size supported by h/w to
> > > > > be equal to 4k (PAGE_SIZE in guest)
> > > > >        Should we look to relax this in iova allocation code?
> > > >
> > > > Oh right, that's not great. Maybe the BUG_ON() can be removed, I'll ask on
> > > > the list.
> > >
> > > yes, the BUG_ON in iova_init.
> > > I tried with removing same and it worked, but not analyzed side effects.
> >
> > It might break the assumption from device drivers that mapping a page is
> > safe. For example they call alloc_page() followed by dma_map_page(). In
> > our situation dma-iommu.c will oblige and create one 64k mapping to one 4k
> > physical page. As a result the endpoint can access the neighbouring 60k of
> > memory.
> >
> > This isn't too terrible. After all, even when the page sizes match, device
> > drivers can call dma_map_single() on sub-page buffers, which will also let
> > the endpoint access a whole page. The solution, if you don't trust the
> > endpoint, is to use bounce buffers.
> >
> > But I suspect it's not as simple as removing the BUG_ON(), we'll need to
> > go over dma-iommu.c first. And it seems like assigning endpoints to guest
> > userspace won't work either in this config. In vfio_dma_do_map():
> >
> >         mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;
> >
> >         WARN_ON(mask & PAGE_MASK);
> 
> Yes, Agree
> 
> >
> > If I read this correctly the WARN will trigger in a 4k guest under 64k
> > host, right?  So maybe we can just say that this config isn't supported,
> > unless it's an important use-case for virtio-iommu?
> 
> I sent v8 version of patch and with that guest and host with same page
> size should work.
> While i have not yet added analyzed how to mark 4k guest and 64k host
> as un-supported configuration, will analyze and send patch.

I don't think there is anything to do for QEMU, it's Linux that doesn't
support the configuration. We could add something like the attached patch,
in the virtio-iommu driver, to abort more gracefully than with a BUG_ON().

Thanks,
Jean

[-- Attachment #2: 0001-iommu-virtio-Reject-IOMMU-page-granule-larger-than-P.patch --]
[-- Type: text/plain, Size: 2040 bytes --]

From 1fa08800c94f7ad6720b7e6fe26a65ed3d6ce715 Mon Sep 17 00:00:00 2001
From: Jean-Philippe Brucker <jean-philippe@linaro.org>
Date: Wed, 18 Mar 2020 11:59:19 +0100
Subject: [PATCH] iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE

We don't currently support IOMMUs with a page granule larger than the
system page size. Currently the IOVA allocator has a BUG_ON() in this
case, and VFIO has a WARN_ON().

It might be possible to remove these obstacles if necessary. If the host
uses 64kB pages and the guest uses 4kB, then a device driver calling
alloc_page() followed by dma_map_page() will create a 64kB mapping for a
4kB physical page, allowing the endpoint to access the neighbouring 60kB
of memory. This problem could be worked around with bounce buffers.

For the moment, rather than triggering the IOVA BUG_ON() on mismatched
page sizes, abort the virtio-iommu probe with an error message.

Reported-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/virtio-iommu.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 6d4e3c2a2ddb..80d5d8f621ab 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -998,6 +998,7 @@ static int viommu_probe(struct virtio_device *vdev)
 	struct device *parent_dev = vdev->dev.parent;
 	struct viommu_dev *viommu = NULL;
 	struct device *dev = &vdev->dev;
+	unsigned long viommu_page_size;
 	u64 input_start = 0;
 	u64 input_end = -1UL;
 	int ret;
@@ -1028,6 +1029,14 @@ static int viommu_probe(struct virtio_device *vdev)
 		goto err_free_vqs;
 	}
 
+	viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
+	if (viommu_page_size > PAGE_SIZE) {
+		dev_err(dev, "granule 0x%lx larger than system page size 0x%lx\n",
+			viommu_page_size, PAGE_SIZE);
+		ret = -EINVAL;
+		goto err_free_vqs;
+	}
+
 	viommu->map_flags = VIRTIO_IOMMU_MAP_F_READ | VIRTIO_IOMMU_MAP_F_WRITE;
 	viommu->last_domain = ~0U;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* RE: [EXT] Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-18 11:17                             ` Jean-Philippe Brucker
@ 2020-03-18 11:20                               ` Bharat Bhushan
  2020-03-18 11:42                                 ` Auger Eric
  0 siblings, 1 reply; 27+ messages in thread
From: Bharat Bhushan @ 2020-03-18 11:20 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Bharat Bhushan
  Cc: Peter Maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, Auger Eric, alex.williamson,
	qemu-arm, linuc.decode, eric.auger.pro



> -----Original Message-----
> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Sent: Wednesday, March 18, 2020 4:48 PM
> To: Bharat Bhushan <bharatb.linux@gmail.com>
> Cc: Auger Eric <eric.auger@redhat.com>; Peter Maydell
> <peter.maydell@linaro.org>; kevin.tian@intel.com; Tomasz Nowicki [C]
> <tnowicki@marvell.com>; mst@redhat.com; drjones@redhat.com;
> peterx@redhat.com; qemu-devel@nongnu.org; alex.williamson@redhat.com;
> qemu-arm@nongnu.org; Bharat Bhushan <bbhushan2@marvell.com>;
> linuc.decode@gmail.com; eric.auger.pro@gmail.com
> Subject: [EXT] Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for
> attach/detach
> 
> External Email
> 
> ----------------------------------------------------------------------
> On Wed, Mar 18, 2020 at 03:47:44PM +0530, Bharat Bhushan wrote:
> > Hi Jean,
> >
> > On Tue, Mar 17, 2020 at 9:29 PM Jean-Philippe Brucker
> > <jean-philippe@linaro.org> wrote:
> > >
> > > On Tue, Mar 17, 2020 at 02:46:55PM +0530, Bharat Bhushan wrote:
> > > > Hi Jean,
> > > >
> > > > On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker
> > > > <jean-philippe@linaro.org> wrote:
> > > > >
> > > > > On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote:
> > > > > > Hi Jean,
> > > > > >
> > > > > > On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
> > > > > > <jean-philippe@linaro.org> wrote:
> > > > > > >
> > > > > > > Hi Bharat,
> > > > > > >
> > > > > > > Could you Cc me on your next posting?  Unfortunately I don't
> > > > > > > have much hardware for testing this at the moment, but I
> > > > > > > might be able to help a little on the review.
> > > > > > >
> > > > > > > On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> > > > > > > > > >>> First issue is: your guest can use 4K page and your
> > > > > > > > > >>> host can use 64KB pages. In that case VFIO_DMA_MAP
> > > > > > > > > >>> will fail with -EINVAL. We must devise a way to pass the host
> settings to the VIRTIO-IOMMU device.
> > > > > > > > > >>>
> > > > > > > > > >>> Even with 64KB pages, it did not work for me. I have
> > > > > > > > > >>> obviously not the storm of VFIO_DMA_MAP failures but
> > > > > > > > > >>> I have some, most probably due to some wrong notifications
> somewhere. I will try to investigate on my side.
> > > > > > > > > >>>
> > > > > > > > > >>> Did you test with VFIO on your side?
> > > > > > > > > >>
> > > > > > > > > >> I did not tried with different page sizes, only tested with 4K page
> size.
> > > > > > > > > >>
> > > > > > > > > >> Yes it works, I tested with two n/w device assigned
> > > > > > > > > >> to VM, both interfaces works
> > > > > > > > > >>
> > > > > > > > > >> First I will try with 64k page size.
> > > > > > > > > >
> > > > > > > > > > 64K page size does not work for me as well,
> > > > > > > > > >
> > > > > > > > > > I think we are not passing correct page_size_mask here
> > > > > > > > > > (config.page_size_mask is set to TARGET_PAGE_MASK (
> > > > > > > > > > which is
> > > > > > > > > > 0xfffffffffffff000))
> > > > > > > > > I guess you mean with guest using 4K and host using 64K.
> > > > > > > > > >
> > > > > > > > > > We need to set this correctly as per host page size, correct?
> > > > > > > > > Yes that's correct. We need to put in place a control
> > > > > > > > > path to retrieve the page settings on host through VFIO to inform the
> virtio-iommu device.
> > > > > > > > >
> > > > > > > > > Besides this issue, did you try with 64kB on host and guest?
> > > > > > > >
> > > > > > > > I tried Followings
> > > > > > > >   - 4k host and 4k guest  - it works with v7 version
> > > > > > > >   - 64k host and 64k guest - it does not work with v7
> > > > > > > >     hard-coded config.page_size_mask to 0xffffffffffff0000
> > > > > > > > and it works
> > > > > > >
> > > > > > > You might get this from the iova_pgsize bitmap returned by
> > > > > > > VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is
> > > > > > > global so there is the usual problem of aggregating
> > > > > > > consistent properties, but I'm guessing using the host page size as a
> granule here is safe enough.
> > > > > > >
> > > > > > > If it is a problem, we can add a PROBE property for page
> > > > > > > size mask, allowing to define per-endpoint page masks. I
> > > > > > > have kernel patches somewhere to do just that.
> > > > > >
> > > > > > I do not see we need page size mask per endpoint.
> > > > > >
> > > > > > While I am trying to understand what "page-size-mask" guest
> > > > > > will work with
> > > > > >
> > > > > > - 4K page size host and 4k page size guest
> > > > > >   config.page_size_mask = 0xffffffffffff000 will work
> > > > > >
> > > > > > - 64K page size host and 64k page size guest
> > > > > >   config.page_size_mask = 0xfffffffffff0000 will work
> > > > > >
> > > > > > - 64K page size host and 4k page size guest
> > > > > >    1) config.page_size_mask = 0xffffffffffff000 will also not
> > > > > > work as VFIO in host expect iova and size to be aligned to 64k
> > > > > > (PAGE_SIZE in
> > > > > > host)
> > > > > >    2) config.page_size_mask = 0xfffffffffff0000 will not work,
> > > > > > iova initialization (in guest) expect minimum page-size
> > > > > > supported by h/w to be equal to 4k (PAGE_SIZE in guest)
> > > > > >        Should we look to relax this in iova allocation code?
> > > > >
> > > > > Oh right, that's not great. Maybe the BUG_ON() can be removed,
> > > > > I'll ask on the list.
> > > >
> > > > yes, the BUG_ON in iova_init.
> > > > I tried with removing same and it worked, but not analyzed side effects.
> > >
> > > It might break the assumption from device drivers that mapping a
> > > page is safe. For example they call alloc_page() followed by
> > > dma_map_page(). In our situation dma-iommu.c will oblige and create
> > > one 64k mapping to one 4k physical page. As a result the endpoint
> > > can access the neighbouring 60k of memory.
> > >
> > > This isn't too terrible. After all, even when the page sizes match,
> > > device drivers can call dma_map_single() on sub-page buffers, which
> > > will also let the endpoint access a whole page. The solution, if you
> > > don't trust the endpoint, is to use bounce buffers.
> > >
> > > But I suspect it's not as simple as removing the BUG_ON(), we'll
> > > need to go over dma-iommu.c first. And it seems like assigning
> > > endpoints to guest userspace won't work either in this config. In
> vfio_dma_do_map():
> > >
> > >         mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) -
> > > 1;
> > >
> > >         WARN_ON(mask & PAGE_MASK);
> >
> > Yes, Agree
> >
> > >
> > > If I read this correctly the WARN will trigger in a 4k guest under
> > > 64k host, right?  So maybe we can just say that this config isn't
> > > supported, unless it's an important use-case for virtio-iommu?
> >
> > I sent v8 version of patch and with that guest and host with same page
> > size should work.
> > While i have not yet added analyzed how to mark 4k guest and 64k host
> > as un-supported configuration, will analyze and send patch.
> 
> I don't think there is anything to do for QEMU, it's Linux that doesn't support the
> configuration. We could add something like the attached patch, in the virtio-
> iommu driver, to abort more gracefully than with a BUG_ON().

Yes, agreed, we need to make this change on the Linux side.

Thanks
-Bharat

> 
> Thanks,
> Jean


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [EXT] Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-18 11:20                               ` [EXT] " Bharat Bhushan
@ 2020-03-18 11:42                                 ` Auger Eric
  2020-03-18 12:00                                   ` Jean-Philippe Brucker
  0 siblings, 1 reply; 27+ messages in thread
From: Auger Eric @ 2020-03-18 11:42 UTC (permalink / raw)
  To: Bharat Bhushan, Jean-Philippe Brucker, Bharat Bhushan
  Cc: Peter Maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, alex.williamson, qemu-arm,
	linuc.decode, eric.auger.pro

Hi Jean,

On 3/18/20 12:20 PM, Bharat Bhushan wrote:
> 
> 
>> -----Original Message-----
>> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Sent: Wednesday, March 18, 2020 4:48 PM
>> To: Bharat Bhushan <bharatb.linux@gmail.com>
>> Cc: Auger Eric <eric.auger@redhat.com>; Peter Maydell
>> <peter.maydell@linaro.org>; kevin.tian@intel.com; Tomasz Nowicki [C]
>> <tnowicki@marvell.com>; mst@redhat.com; drjones@redhat.com;
>> peterx@redhat.com; qemu-devel@nongnu.org; alex.williamson@redhat.com;
>> qemu-arm@nongnu.org; Bharat Bhushan <bbhushan2@marvell.com>;
>> linuc.decode@gmail.com; eric.auger.pro@gmail.com
>> Subject: [EXT] Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for
>> attach/detach
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>> On Wed, Mar 18, 2020 at 03:47:44PM +0530, Bharat Bhushan wrote:
>>> Hi Jean,
>>>
>>> On Tue, Mar 17, 2020 at 9:29 PM Jean-Philippe Brucker
>>> <jean-philippe@linaro.org> wrote:
>>>>
>>>> On Tue, Mar 17, 2020 at 02:46:55PM +0530, Bharat Bhushan wrote:
>>>>> Hi Jean,
>>>>>
>>>>> On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker
>>>>> <jean-philippe@linaro.org> wrote:
>>>>>>
>>>>>> On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote:
>>>>>>> Hi Jean,
>>>>>>>
>>>>>>> On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
>>>>>>> <jean-philippe@linaro.org> wrote:
>>>>>>>>
>>>>>>>> Hi Bharat,
>>>>>>>>
>>>>>>>> Could you Cc me on your next posting?  Unfortunately I don't
>>>>>>>> have much hardware for testing this at the moment, but I
>>>>>>>> might be able to help a little on the review.
>>>>>>>>
>>>>>>>> On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
>>>>>>>>>>>>> First issue is: your guest can use 4K page and your
>>>>>>>>>>>>> host can use 64KB pages. In that case VFIO_DMA_MAP
>>>>>>>>>>>>> will fail with -EINVAL. We must devise a way to pass the host
>> settings to the VIRTIO-IOMMU device.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Even with 64KB pages, it did not work for me. I have
>>>>>>>>>>>>> obviously not the storm of VFIO_DMA_MAP failures but
>>>>>>>>>>>>> I have some, most probably due to some wrong notifications
>> somewhere. I will try to investigate on my side.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Did you test with VFIO on your side?
>>>>>>>>>>>>
>>>>>>>>>>>> I did not tried with different page sizes, only tested with 4K page
>> size.
>>>>>>>>>>>>
>>>>>>>>>>>> Yes it works, I tested with two n/w device assigned
>>>>>>>>>>>> to VM, both interfaces works
>>>>>>>>>>>>
>>>>>>>>>>>> First I will try with 64k page size.
>>>>>>>>>>>
>>>>>>>>>>> 64K page size does not work for me as well,
>>>>>>>>>>>
>>>>>>>>>>> I think we are not passing correct page_size_mask here
>>>>>>>>>>> (config.page_size_mask is set to TARGET_PAGE_MASK (
>>>>>>>>>>> which is
>>>>>>>>>>> 0xfffffffffffff000))
>>>>>>>>>> I guess you mean with guest using 4K and host using 64K.
>>>>>>>>>>>
>>>>>>>>>>> We need to set this correctly as per host page size, correct?
>>>>>>>>>> Yes that's correct. We need to put in place a control
>>>>>>>>>> path to retrieve the page settings on host through VFIO to inform the
>> virtio-iommu device.
>>>>>>>>>>
>>>>>>>>>> Besides this issue, did you try with 64kB on host and guest?
>>>>>>>>>
>>>>>>>>> I tried Followings
>>>>>>>>>   - 4k host and 4k guest  - it works with v7 version
>>>>>>>>>   - 64k host and 64k guest - it does not work with v7
>>>>>>>>>     hard-coded config.page_size_mask to 0xffffffffffff0000
>>>>>>>>> and it works
>>>>>>>>
>>>>>>>> You might get this from the iova_pgsize bitmap returned by
>>>>>>>> VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is
>>>>>>>> global so there is the usual problem of aggregating
>>>>>>>> consistent properties, but I'm guessing using the host page size as a
>> granule here is safe enough.
>>>>>>>>
>>>>>>>> If it is a problem, we can add a PROBE property for page
>>>>>>>> size mask, allowing to define per-endpoint page masks. I
>>>>>>>> have kernel patches somewhere to do just that.
>>>>>>>
>>>>>>> I do not see we need page size mask per endpoint.
>>>>>>>
>>>>>>> While I am trying to understand what "page-size-mask" guest
>>>>>>> will work with
>>>>>>>
>>>>>>> - 4K page size host and 4k page size guest
>>>>>>>   config.page_size_mask = 0xffffffffffff000 will work
>>>>>>>
>>>>>>> - 64K page size host and 64k page size guest
>>>>>>>   config.page_size_mask = 0xfffffffffff0000 will work
>>>>>>>
>>>>>>> - 64K page size host and 4k page size guest
>>>>>>>    1) config.page_size_mask = 0xffffffffffff000 will also not
>>>>>>> work as VFIO in host expect iova and size to be aligned to 64k
>>>>>>> (PAGE_SIZE in
>>>>>>> host)
>>>>>>>    2) config.page_size_mask = 0xfffffffffff0000 will not work,
>>>>>>> iova initialization (in guest) expect minimum page-size
>>>>>>> supported by h/w to be equal to 4k (PAGE_SIZE in guest)
>>>>>>>        Should we look to relax this in iova allocation code?
>>>>>>
>>>>>> Oh right, that's not great. Maybe the BUG_ON() can be removed,
>>>>>> I'll ask on the list.
>>>>>
>>>>> yes, the BUG_ON in iova_init.
So you mean in init_iova_domain()?

I see the BUG_ON was introduced in
0fb5fe874c42942e16c450ae05da453e13a1c09e "iommu: Make IOVA domain page
size explicit" but was it meant to remain?

Logically, when we later allocate buffer IOVAs for DMA accesses, shouldn't
it be possible to use the actual granule set at init() time and make sure
the allocated size is properly aligned to it?

Reading the commit msg it explicitly says that "the systems may contain
heterogeneous IOMMUs supporting differing minimum page sizes, which may
also not be common with the CPU page size".

Thanks

Eric


>>>>> I tried with removing same and it worked, but not analyzed side effects.
>>>>
>>>> It might break the assumption from device drivers that mapping a
>>>> page is safe. For example they call alloc_page() followed by
>>>> dma_map_page(). In our situation dma-iommu.c will oblige and create
>>>> one 64k mapping to one 4k physical page. As a result the endpoint
>>>> can access the neighbouring 60k of memory.
>>>>
>>>> This isn't too terrible. After all, even when the page sizes match,
>>>> device drivers can call dma_map_single() on sub-page buffers, which
>>>> will also let the endpoint access a whole page. The solution, if you
>>>> don't trust the endpoint, is to use bounce buffers.
>>>>
>>>> But I suspect it's not as simple as removing the BUG_ON(), we'll
>>>> need to go over dma-iommu.c first. And it seems like assigning
>>>> endpoints to guest userspace won't work either in this config. In
>> vfio_dma_do_map():
>>>>
>>>>         mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) -
>>>> 1;
>>>>
>>>>         WARN_ON(mask & PAGE_MASK);
>>>
>>> Yes, Agree
>>>
>>>>
>>>> If I read this correctly the WARN will trigger in a 4k guest under
>>>> 64k host, right?  So maybe we can just say that this config isn't
>>>> supported, unless it's an important use-case for virtio-iommu?
>>>
>>> I sent v8 version of patch and with that guest and host with same page
>>> size should work.
>>> While i have not yet added analyzed how to mark 4k guest and 64k host
>>> as un-supported configuration, will analyze and send patch.
>>
>> I don't think there is anything to do for QEMU, it's Linux that doesn't support the
>> configuration. We could add something like the attached patch, in the virtio-
>> iommu driver, to abort more gracefully than with a BUG_ON().
> 
> Yes agree, we need to have change in Linux side.
> 
> Thanks
> -Bharat
> 
>>
>> Thanks,
>> Jean
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [EXT] Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-18 11:42                                 ` Auger Eric
@ 2020-03-18 12:00                                   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 27+ messages in thread
From: Jean-Philippe Brucker @ 2020-03-18 12:00 UTC (permalink / raw)
  To: Auger Eric
  Cc: Peter Maydell, kevin.tian, alex.williamson, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, Bharat Bhushan, qemu-arm,
	Bharat Bhushan, linuc.decode, eric.auger.pro

On Wed, Mar 18, 2020 at 12:42:25PM +0100, Auger Eric wrote:
> Hi Jean,
> 
> On 3/18/20 12:20 PM, Bharat Bhushan wrote:
> > 
> > 
> >> -----Original Message-----
> >> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >> Sent: Wednesday, March 18, 2020 4:48 PM
> >> To: Bharat Bhushan <bharatb.linux@gmail.com>
> >> Cc: Auger Eric <eric.auger@redhat.com>; Peter Maydell
> >> <peter.maydell@linaro.org>; kevin.tian@intel.com; Tomasz Nowicki [C]
> >> <tnowicki@marvell.com>; mst@redhat.com; drjones@redhat.com;
> >> peterx@redhat.com; qemu-devel@nongnu.org; alex.williamson@redhat.com;
> >> qemu-arm@nongnu.org; Bharat Bhushan <bbhushan2@marvell.com>;
> >> linuc.decode@gmail.com; eric.auger.pro@gmail.com
> >> Subject: [EXT] Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for
> >> attach/detach
> >>
> >> External Email
> >>
> >> ----------------------------------------------------------------------
> >> On Wed, Mar 18, 2020 at 03:47:44PM +0530, Bharat Bhushan wrote:
> >>> Hi Jean,
> >>>
> >>> On Tue, Mar 17, 2020 at 9:29 PM Jean-Philippe Brucker
> >>> <jean-philippe@linaro.org> wrote:
> >>>>
> >>>> On Tue, Mar 17, 2020 at 02:46:55PM +0530, Bharat Bhushan wrote:
> >>>>> Hi Jean,
> >>>>>
> >>>>> On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker
> >>>>> <jean-philippe@linaro.org> wrote:
> >>>>>>
> >>>>>> On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote:
> >>>>>>> Hi Jean,
> >>>>>>>
> >>>>>>> On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
> >>>>>>> <jean-philippe@linaro.org> wrote:
> >>>>>>>>
> >>>>>>>> Hi Bharat,
> >>>>>>>>
> >>>>>>>> Could you Cc me on your next posting?  Unfortunately I don't
> >>>>>>>> have much hardware for testing this at the moment, but I
> >>>>>>>> might be able to help a little on the review.
> >>>>>>>>
> >>>>>>>> On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> >>>>>>>>>>>>> First issue is: your guest can use 4K page and your
> >>>>>>>>>>>>> host can use 64KB pages. In that case VFIO_DMA_MAP
> >>>>>>>>>>>>> will fail with -EINVAL. We must devise a way to pass the host
> >> settings to the VIRTIO-IOMMU device.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Even with 64KB pages, it did not work for me. I have
> >>>>>>>>>>>>> obviously not the storm of VFIO_DMA_MAP failures but
> >>>>>>>>>>>>> I have some, most probably due to some wrong notifications
> >> somewhere. I will try to investigate on my side.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Did you test with VFIO on your side?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I did not tried with different page sizes, only tested with 4K page
> >> size.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yes it works, I tested with two n/w device assigned
> >>>>>>>>>>>> to VM, both interfaces works
> >>>>>>>>>>>>
> >>>>>>>>>>>> First I will try with 64k page size.
> >>>>>>>>>>>
> >>>>>>>>>>> 64K page size does not work for me as well,
> >>>>>>>>>>>
> >>>>>>>>>>> I think we are not passing correct page_size_mask here
> >>>>>>>>>>> (config.page_size_mask is set to TARGET_PAGE_MASK (
> >>>>>>>>>>> which is
> >>>>>>>>>>> 0xfffffffffffff000))
> >>>>>>>>>> I guess you mean with guest using 4K and host using 64K.
> >>>>>>>>>>>
> >>>>>>>>>>> We need to set this correctly as per host page size, correct?
> >>>>>>>>>> Yes that's correct. We need to put in place a control
> >>>>>>>>>> path to retrieve the page settings on host through VFIO to inform the
> >> virtio-iommu device.
> >>>>>>>>>>
> >>>>>>>>>> Besides this issue, did you try with 64kB on host and guest?
> >>>>>>>>>
> >>>>>>>>> I tried Followings
> >>>>>>>>>   - 4k host and 4k guest  - it works with v7 version
> >>>>>>>>>   - 64k host and 64k guest - it does not work with v7
> >>>>>>>>>     hard-coded config.page_size_mask to 0xffffffffffff0000
> >>>>>>>>> and it works
> >>>>>>>>
> >>>>>>>> You might get this from the iova_pgsize bitmap returned by
> >>>>>>>> VFIO_IOMMU_GET_INFO. The virtio config.page_size_mask is
> >>>>>>>> global so there is the usual problem of aggregating
> >>>>>>>> consistent properties, but I'm guessing using the host page size as a
> >> granule here is safe enough.
> >>>>>>>>
> >>>>>>>> If it is a problem, we can add a PROBE property for page
> >>>>>>>> size mask, allowing to define per-endpoint page masks. I
> >>>>>>>> have kernel patches somewhere to do just that.
> >>>>>>>
> >>>>>>> I do not see we need page size mask per endpoint.
> >>>>>>>
> >>>>>>> While I am trying to understand what "page-size-mask" guest
> >>>>>>> will work with
> >>>>>>>
> >>>>>>> - 4K page size host and 4k page size guest
> >>>>>>>   config.page_size_mask = 0xffffffffffff000 will work
> >>>>>>>
> >>>>>>> - 64K page size host and 64k page size guest
> >>>>>>>   config.page_size_mask = 0xfffffffffff0000 will work
> >>>>>>>
> >>>>>>> - 64K page size host and 4k page size guest
> >>>>>>>    1) config.page_size_mask = 0xffffffffffff000 will also not
> >>>>>>> work as VFIO in host expect iova and size to be aligned to 64k
> >>>>>>> (PAGE_SIZE in
> >>>>>>> host)
> >>>>>>>    2) config.page_size_mask = 0xfffffffffff0000 will not work,
> >>>>>>> iova initialization (in guest) expect minimum page-size
> >>>>>>> supported by h/w to be equal to 4k (PAGE_SIZE in guest)
> >>>>>>>        Should we look to relax this in iova allocation code?
> >>>>>>
> >>>>>> Oh right, that's not great. Maybe the BUG_ON() can be removed,
> >>>>>> I'll ask on the list.
> >>>>>
> >>>>> yes, the BUG_ON in iova_init.
> So you mean in init_iova_domain()?
> 
> I see the BUG_ON was introduced in
> 0fb5fe874c42942e16c450ae05da453e13a1c09e "iommu: Make IOVA domain page
> size explicit" but was it meant to remain?

No, I don't think iova.c is the problem, we could as well remove the
BUG_ON(). My concern is more with dma-iommu.c and VFIO, which use
PAGE_SIZE in some places and could have ingrained assumptions that
iommu_pgsize <= PAGE_SIZE. We need a thorough audit of these drivers
before relaxing this. I started with dma-iommu.c yesterday but gave up
after seeing the VFIO WARN_ON.

I just sent the patch for virtio-iommu on the IOMMU list, we can continue
the discussion there.

Thanks,
Jean

> 
> Logically when we allocate buffer IOVAs for DMA accesses, later on,
> shouldn't it be possible to use the actual granule set on init() and
> make sure the allocated size is properly aligned.
> 
> Reading the commit msg it explicitly says that "the systems may contain
> heterogeneous IOMMUs supporting differing minimum page sizes, which may
> also not be common with the CPU page size".
> 
> Thanks
> 
> Eric
> 
> 
> >>>>> I tried with removing same and it worked, but not analyzed side effects.
> >>>>
> >>>> It might break the assumption from device drivers that mapping a
> >>>> page is safe. For example they call alloc_page() followed by
> >>>> dma_map_page(). In our situation dma-iommu.c will oblige and create
> >>>> one 64k mapping to one 4k physical page. As a result the endpoint
> >>>> can access the neighbouring 60k of memory.
> >>>>
> >>>> This isn't too terrible. After all, even when the page sizes match,
> >>>> device drivers can call dma_map_single() on sub-page buffers, which
> >>>> will also let the endpoint access a whole page. The solution, if you
> >>>> don't trust the endpoint, is to use bounce buffers.
> >>>>
> >>>> But I suspect it's not as simple as removing the BUG_ON(), we'll
> >>>> need to go over dma-iommu.c first. And it seems like assigning
> >>>> endpoints to guest userspace won't work either in this config. In
> >> vfio_dma_do_map():
> >>>>
> >>>>         mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) -
> >>>> 1;
> >>>>
> >>>>         WARN_ON(mask & PAGE_MASK);
> >>>
> >>> Yes, Agree
> >>>
> >>>>
> >>>> If I read this correctly the WARN will trigger in a 4k guest under
> >>>> 64k host, right?  So maybe we can just say that this config isn't
> >>>> supported, unless it's an important use-case for virtio-iommu?
> >>>
> >>> I sent v8 version of patch and with that guest and host with same page
> >>> size should work.
> >>> While i have not yet added analyzed how to mark 4k guest and 64k host
> >>> as un-supported configuration, will analyze and send patch.
> >>
> >> I don't think there is anything to do for QEMU, it's Linux that doesn't support the
> >> configuration. We could add something like the attached patch, in the virtio-
> >> iommu driver, to abort more gracefully than with a BUG_ON().
> > 
> > Yes agree, we need to have change in Linux side.
> > 
> > Thanks
> > -Bharat
> > 
> >>
> >> Thanks,
> >> Jean
> > 
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2020-03-18 12:02 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-13  7:48 [PATCH v7 0/5] virtio-iommu: VFIO integration Bharat Bhushan
2020-03-13  7:48 ` [PATCH v7 1/5] hw/vfio/common: Remove error print on mmio region translation by viommu Bharat Bhushan
2020-03-13  7:48 ` [PATCH v7 2/5] virtio-iommu: Add iommu notifier for map/unmap Bharat Bhushan
2020-03-13 14:25   ` Auger Eric
2020-03-16  6:36     ` Bharat Bhushan
2020-03-16  7:32       ` Auger Eric
2020-03-13  7:48 ` [PATCH v7 3/5] virtio-iommu: Call iommu notifier for attach/detach Bharat Bhushan
2020-03-13 14:41   ` Auger Eric
2020-03-16  6:41     ` Bharat Bhushan
2020-03-16  7:32       ` Auger Eric
2020-03-16  7:45         ` Bharat Bhushan
2020-03-16  8:58           ` Bharat Bhushan
2020-03-16  9:04             ` Auger Eric
2020-03-16  9:10               ` Bharat Bhushan
2020-03-16 10:11                 ` Jean-Philippe Brucker
2020-03-17  7:10                   ` Bharat Bhushan
2020-03-17  8:25                     ` Auger Eric
2020-03-17  8:53                     ` Jean-Philippe Brucker
2020-03-17  9:16                       ` Bharat Bhushan
2020-03-17 15:59                         ` Jean-Philippe Brucker
2020-03-18 10:17                           ` Bharat Bhushan
2020-03-18 11:17                             ` Jean-Philippe Brucker
2020-03-18 11:20                               ` [EXT] " Bharat Bhushan
2020-03-18 11:42                                 ` Auger Eric
2020-03-18 12:00                                   ` Jean-Philippe Brucker
2020-03-13  7:48 ` [PATCH v7 4/5] virtio-iommu: add iommu replay Bharat Bhushan
2020-03-13  7:48 ` [PATCH v7 5/5] virtio-iommu: add iommu notifier memory-region Bharat Bhushan
