All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration
@ 2017-08-21 10:48 Bharat Bhushan
  2017-08-21 10:48 ` [Qemu-devel] [PATCH v3 1/2] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Bharat Bhushan
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Bharat Bhushan @ 2017-08-21 10:48 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel
  Cc: wei, kevin.tian, marc.zyngier, tn, will.deacon, drjones,
	robin.murphy, christoffer.dall, bharat.bhushan, Bharat Bhushan

This V3 version is mainly about rebasing on v3 version on Virtio-iommu device
framework from Eric Augur and addresing review comments.

This patch series allows PCI pass-through using virtio-iommu.
  
This series is based on:
 - virtio-iommu specification written by Jean-Philippe Brucker
   [RFC 0/3] virtio-iommu: a paravirtualized IOMMU,
 
 - virtio-iommu driver by Jean-Philippe Brucker
   [RFC PATCH linux] iommu: Add virtio-iommu driver
 
 - virtio-iommu device emulation by Eric Augur.
   [RFC v3 0/8] VIRTIO-IOMMU device

PCI device pass-through and virtio-net-pci is tested with these changes using dma-ops

This patch series does not implement RESV_MEM changes proposal by Jean-Philippe "https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01796.html"

v2->v3:
 - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device"
   Which is based on top of v2.10-rc0 that 
 - Fixed issue with two PCI devices
 - Addressed review comments

v1->v2:
  - Added trace events
  - removed vSMMU3 link in patch description

Bharat Bhushan (2):
  target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  virtio-iommu: vfio integration with virtio-iommu

 hw/virtio/trace-events           |   5 ++
 hw/virtio/virtio-iommu.c         | 163 +++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio-iommu.h |   6 ++
 target/arm/kvm.c                 |  27 +++++++
 target/arm/trace-events          |   3 +
 5 files changed, 204 insertions(+)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH v3 1/2] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  2017-08-21 10:48 [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration Bharat Bhushan
@ 2017-08-21 10:48 ` Bharat Bhushan
  2017-08-21 10:48 ` [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu Bharat Bhushan
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Bharat Bhushan @ 2017-08-21 10:48 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel
  Cc: wei, kevin.tian, marc.zyngier, tn, will.deacon, drjones,
	robin.murphy, christoffer.dall, bharat.bhushan, Bharat Bhushan

Translate msi address if device is behind virtio-iommu.
This logic is similar to vSMMUv3/Intel iommu emulation.

This RFC patch does not handle the case where both vsmmuv3 and
virtio-iommu are available.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com>
---
v2->v3:
 - Rebased to on top of 2.10-rc0 and especially
  [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion

v1-v2:
  - Added trace events
  - removed vSMMU3 link in patch description

 target/arm/kvm.c        | 27 +++++++++++++++++++++++++++
 target/arm/trace-events |  3 +++
 2 files changed, 30 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7c17f0d..0219c9d 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -21,7 +21,11 @@
 #include "kvm_arm.h"
 #include "cpu.h"
 #include "internals.h"
+#include "trace.h"
 #include "hw/arm/arm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/virtio/virtio-iommu.h"
 #include "exec/memattrs.h"
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
@@ -662,6 +666,29 @@ int kvm_arm_vgic_probe(void)
 int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
                              uint64_t address, uint32_t data, PCIDevice *dev)
 {
+    AddressSpace *as = pci_device_iommu_address_space(dev);
+    IOMMUTLBEntry entry;
+    IOMMUDevice *sdev;
+    IOMMUMemoryRegionClass *imrc;
+
+    if (as == &address_space_memory) {
+        return 0;
+    }
+
+    /* MSI doorbell address is translated by an IOMMU */
+    sdev = container_of(as, IOMMUDevice, as);
+
+    imrc = memory_region_get_iommu_class_nocheck(&sdev->iommu_mr);
+
+    entry = imrc->translate(&sdev->iommu_mr, address, IOMMU_WO);
+
+    route->u.msi.address_lo = entry.translated_addr;
+    route->u.msi.address_hi = entry.translated_addr >> 32;
+
+    trace_kvm_arm_fixup_msi_route(address, sdev->devfn,
+                                  sdev->iommu_mr.parent_obj.name,
+                                  entry.translated_addr);
+
     return 0;
 }
 
diff --git a/target/arm/trace-events b/target/arm/trace-events
index e21c84f..eff2822 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -8,3 +8,6 @@ arm_gt_tval_write(int timer, uint64_t value) "gt_tval_write: timer %d value %" P
 arm_gt_ctl_write(int timer, uint64_t value) "gt_ctl_write: timer %d value %" PRIx64
 arm_gt_imask_toggle(int timer, int irqstate) "gt_ctl_write: timer %d IMASK toggle, new irqstate %d"
 arm_gt_cntvoff_write(uint64_t value) "gt_cntvoff_write: value %" PRIx64
+
+# target/arm/kvm.c
+kvm_arm_fixup_msi_route(uint64_t iova, uint32_t devid, const char *name, uint64_t gpa) "MSI addr = 0x%"PRIx64" is translated for devfn=%d through %s into 0x%"PRIx64
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu
  2017-08-21 10:48 [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration Bharat Bhushan
  2017-08-21 10:48 ` [Qemu-devel] [PATCH v3 1/2] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Bharat Bhushan
@ 2017-08-21 10:48 ` Bharat Bhushan
  2017-09-18  7:47   ` Auger Eric
  2017-08-23 16:41 ` [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration Auger Eric
  2017-08-29 17:06 ` [Qemu-devel] [Qemu-arm] " Linu Cherian
  3 siblings, 1 reply; 10+ messages in thread
From: Bharat Bhushan @ 2017-08-21 10:48 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel
  Cc: wei, kevin.tian, marc.zyngier, tn, will.deacon, drjones,
	robin.murphy, christoffer.dall, bharat.bhushan, Bharat Bhushan

This RFC patch allows virtio-iommu protection for PCI
device-passthrough.

MSI region is mapped by current version of virtio-iommu driver.
This uses VFIO extension of map/unmap notification when an area
of memory is mappedi/unmapped in emulated iommu device.

This series is tested with 2 PCI devices to virtual machine using
dma-ops and DPDK in VM is not yet tested.

Also with this series we observe below prints for MSI region mapping

   "qemu-system-aarch64: iommu map to non memory area 0"

   This print comes when vfio/map-notifier is called for MSI region.

vfio map/unmap notification is called for given device
   This assumes that devid passed in virtio_iommu_attach is same as devfn
   This assumption is based on 1:1 mapping of requested-id with device-id
   in QEMU.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com>
---
v2->v3:
 - Addressed review comments:
    - virtio-iommu_map_region function is split in two functions
      virtio_iommu_notify_map/virtio_iommu_notify_unmap
    - use size received from driver and do not split in 4K pages
 
 - map/unmap notification is called for given device/as
   This relies on devid passed in virtio_iommu_attach is same as devfn
   This is assumed as iommu-map maps 1:1 requested-id to device-id in QEMU
   Looking for comment about this assumtion.

 - Keeping track devices in address-space

 - Verified with 2 PCI endpoints
 - some code cleanup

 hw/virtio/trace-events           |   5 ++
 hw/virtio/virtio-iommu.c         | 163 +++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio-iommu.h |   6 ++
 3 files changed, 174 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 8db3d91..7e9663f 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -39,3 +39,8 @@ virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low,
 virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
+virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
+virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
+virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
+virtio_iommu_notify_map(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
+virtio_iommu_notify_unmap(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 9217587..9eae050 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -55,11 +55,13 @@ typedef struct viommu_interval {
 typedef struct viommu_dev {
     uint32_t id;
     viommu_as *as;
+    QLIST_ENTRY(viommu_dev) next;
 } viommu_dev;
 
 struct viommu_as {
     uint32_t id;
     GTree *mappings;
+    QLIST_HEAD(, viommu_dev) device_list;
 };
 
 static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
@@ -133,12 +135,70 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     }
 }
 
+static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
+                                    hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_map(iova, paddr, size);
+    entry.perm = IOMMU_RW;
+    entry.translated_addr = paddr;
+
+    memory_region_notify_iommu(mr, entry);
+}
+
+static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
+                                      hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_unmap(iova, paddr, size);
+    entry.perm = IOMMU_NONE;
+    entry.translated_addr = 0;
+
+    memory_region_notify_iommu(mr, entry);
+}
+
+static gboolean virtio_iommu_maping_unmap(gpointer key, gpointer value,
+                                          gpointer data)
+{
+    viommu_mapping *mapping = (viommu_mapping *) value;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_unmap(mr, mapping->virt_addr, 0, mapping->size);
+
+    return true;
+}
+
 static void virtio_iommu_detach_dev(VirtIOIOMMU *s, viommu_dev *dev)
 {
+    VirtioIOMMUNotifierNode *node;
     viommu_as *as = dev->as;
+    int devid = dev->id;
 
     trace_virtio_iommu_detach(dev->id);
 
+    /* Remove device from address-space list */
+    QLIST_REMOVE(dev, next);
+    /* unmap all if no devices attached to address-spaceRemove */
+    if (QLIST_EMPTY(&as->device_list)) {
+        QLIST_FOREACH(node, &s->notifiers_list, next) {
+            if (devid == node->iommu_dev->devfn) {
+                g_tree_foreach(as->mappings, virtio_iommu_maping_unmap,
+                               &node->iommu_dev->iommu_mr);
+            }
+        }
+    }
+
+    /* Remove device from global list */
     g_tree_remove(s->devices, GUINT_TO_POINTER(dev->id));
     g_tree_unref(as->mappings);
 }
@@ -171,6 +231,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     if (!as) {
         as = g_malloc0(sizeof(*as));
         as->id = asid;
+        QLIST_INIT(&as->device_list);
         as->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp,
                                          NULL, NULL, (GDestroyNotify)g_free);
         g_tree_insert(s->address_spaces, GUINT_TO_POINTER(asid), as);
@@ -182,6 +243,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     dev->id = devid;
     trace_virtio_iommu_new_devid(devid);
     g_tree_insert(s->devices, GUINT_TO_POINTER(devid), dev);
+    QLIST_INSERT_HEAD(&as->device_list, dev, next);
     g_tree_ref(as->mappings);
 
     return VIRTIO_IOMMU_S_OK;
@@ -219,6 +281,8 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
     viommu_as *as;
     viommu_interval *interval;
     viommu_mapping *mapping;
+    VirtioIOMMUNotifierNode *node;
+    viommu_dev *dev;
 
     interval = g_malloc0(sizeof(*interval));
 
@@ -230,6 +294,11 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
         return VIRTIO_IOMMU_S_NOENT;
     }
 
+    dev = QLIST_FIRST(&as->device_list);
+    if (!dev) {
+        return VIRTIO_IOMMU_S_NOENT;
+    }
+
     mapping = g_tree_lookup(as->mappings, (gpointer)interval);
     if (mapping) {
         g_free(interval);
@@ -246,6 +315,14 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
 
     g_tree_insert(as->mappings, interval, mapping);
 
+    /* All devices in an address-space share mapping */
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        if (dev->id == node->iommu_dev->devfn) {
+            virtio_iommu_notify_map(&node->iommu_dev->iommu_mr, virt_addr,
+                                    phys_addr, size);
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
 
@@ -259,6 +336,8 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
     viommu_mapping *mapping;
     viommu_interval interval;
     viommu_as *as;
+    VirtioIOMMUNotifierNode *node;
+    viommu_dev *dev;
 
     trace_virtio_iommu_unmap(asid, virt_addr, size, flags);
 
@@ -267,6 +346,12 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         error_report("%s: no as", __func__);
         return VIRTIO_IOMMU_S_NOENT;
     }
+
+    dev = QLIST_FIRST(&as->device_list);
+    if (!dev) {
+        return VIRTIO_IOMMU_S_NOENT;
+    }
+
     interval.low = virt_addr;
     interval.high = virt_addr + size - 1;
 
@@ -296,7 +381,15 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         } else {
             break;
         }
+
         if (interval.low >= interval.high) {
+            /* All devices in an address-space share mapping */
+            QLIST_FOREACH(node, &s->notifiers_list, next) {
+                if (dev->id == node->iommu_dev->devfn) {
+                    virtio_iommu_notify_unmap(&node->iommu_dev->iommu_mr, virt_addr,
+                                              0, size);
+                }
+            }
             return VIRTIO_IOMMU_S_OK;
         } else {
             mapping = g_tree_lookup(as->mappings, (gpointer)&interval);
@@ -439,6 +532,36 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
     }
 }
 
+static void virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
+                                          IOMMUNotifierFlag old,
+                                          IOMMUNotifierFlag new)
+{
+    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    VirtioIOMMUNotifierNode *node = NULL;
+    VirtioIOMMUNotifierNode *next_node = NULL;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name);
+        node = g_malloc0(sizeof(*node));
+        node->iommu_dev = sdev;
+        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
+        return;
+    }
+
+    /* update notifier node with new flags */
+    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+        if (node->iommu_dev == sdev) {
+            if (new == IOMMU_NOTIFIER_NONE) {
+                trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name);
+                QLIST_REMOVE(node, next);
+                g_free(node);
+            }
+            return;
+        }
+    }
+}
+
 static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
                                             IOMMUAccessFlags flag)
 {
@@ -553,11 +676,49 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     return (ua > ub) - (ua < ub);
 }
 
+static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
+{
+    viommu_mapping *mapping = (viommu_mapping *) value;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    trace_virtio_iommu_remap(mapping->virt_addr, mapping->phys_addr,
+                             mapping->size);
+    /* unmap previous entry and map again */
+    virtio_iommu_notify_unmap(mr, mapping->virt_addr, 0, mapping->size);
+
+    virtio_iommu_notify_map(mr, mapping->virt_addr, mapping->phys_addr,
+                            mapping->size);
+    return true;
+}
+
+static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    uint32_t sid;
+    viommu_dev *dev;
+
+    sid = virtio_iommu_get_sid(sdev);
+
+    qemu_mutex_lock(&s->mutex);
+
+    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(sid));
+    if (!dev) {
+        goto unlock;
+    }
+
+    g_tree_foreach(dev->as->mappings, virtio_iommu_remap, mr);
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
 
+    QLIST_INIT(&s->notifiers_list);
     virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
                 sizeof(struct virtio_iommu_config));
 
@@ -644,6 +805,8 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = virtio_iommu_translate;
+    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
+    imrc->replay = virtio_iommu_replay;
 }
 
 static const TypeInfo virtio_iommu_info = {
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index f9c988f..7e04184 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -46,6 +46,11 @@ typedef struct IOMMUPciBus {
     IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
 } IOMMUPciBus;
 
+typedef struct VirtioIOMMUNotifierNode {
+    IOMMUDevice *iommu_dev;
+    QLIST_ENTRY(VirtioIOMMUNotifierNode) next;
+} VirtioIOMMUNotifierNode;
+
 typedef struct VirtIOIOMMU {
     VirtIODevice parent_obj;
     VirtQueue *vq;
@@ -56,6 +61,7 @@ typedef struct VirtIOIOMMU {
     GTree *address_spaces;
     QemuMutex mutex;
     GTree *devices;
+    QLIST_HEAD(, VirtioIOMMUNotifierNode) notifiers_list;
 } VirtIOIOMMU;
 
 #endif
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration
  2017-08-21 10:48 [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration Bharat Bhushan
  2017-08-21 10:48 ` [Qemu-devel] [PATCH v3 1/2] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Bharat Bhushan
  2017-08-21 10:48 ` [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu Bharat Bhushan
@ 2017-08-23 16:41 ` Auger Eric
  2017-08-24  5:57   ` Bharat Bhushan
  2017-08-29 17:06 ` [Qemu-devel] [Qemu-arm] " Linu Cherian
  3 siblings, 1 reply; 10+ messages in thread
From: Auger Eric @ 2017-08-23 16:41 UTC (permalink / raw)
  To: Bharat Bhushan, eric.auger.pro, peter.maydell, alex.williamson,
	mst, qemu-arm, qemu-devel
  Cc: wei, kevin.tian, marc.zyngier, tn, will.deacon, drjones,
	robin.murphy, christoffer.dall

Hi Bharat,

On 21/08/2017 12:48, Bharat Bhushan wrote:
> This V3 version is mainly about rebasing on v3 version on Virtio-iommu device
> framework from Eric Augur and addresing review comments.
s/Augur/Auger ;-)
> 
> This patch series allows PCI pass-through using virtio-iommu.
>   
> This series is based on:
>  - virtio-iommu specification written by Jean-Philippe Brucker
>    [RFC 0/3] virtio-iommu: a paravirtualized IOMMU,
>  
>  - virtio-iommu driver by Jean-Philippe Brucker
>    [RFC PATCH linux] iommu: Add virtio-iommu driver
>  
>  - virtio-iommu device emulation by Eric Augur.
>    [RFC v3 0/8] VIRTIO-IOMMU device
> 
> PCI device pass-through and virtio-net-pci is tested with these changes using dma-ops

I confirm it works fine now with 2 assigned VFs.

However at the moment DPDK testpmd using those 2 VFs does not work for me:
1: [/home/augere/UPSTREAM/dpdk/install/bin/testpmd(rte_dump_stack+0x24)
[0x4a8a78]]

I haven't investigated yet...

Thanks

Eric
> 
> This patch series does not implement RESV_MEM changes proposal by Jean-Philippe "https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01796.html"
> 
> v2->v3:
>  - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device"
>    Which is based on top of v2.10-rc0 that 
>  - Fixed issue with two PCI devices
>  - Addressed review comments
> 
> v1->v2:
>   - Added trace events
>   - removed vSMMU3 link in patch description
> 
> Bharat Bhushan (2):
>   target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
>   virtio-iommu: vfio integration with virtio-iommu
> 
>  hw/virtio/trace-events           |   5 ++
>  hw/virtio/virtio-iommu.c         | 163 +++++++++++++++++++++++++++++++++++++++
>  include/hw/virtio/virtio-iommu.h |   6 ++
>  target/arm/kvm.c                 |  27 +++++++
>  target/arm/trace-events          |   3 +
>  5 files changed, 204 insertions(+)
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration
  2017-08-23 16:41 ` [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration Auger Eric
@ 2017-08-24  5:57   ` Bharat Bhushan
  0 siblings, 0 replies; 10+ messages in thread
From: Bharat Bhushan @ 2017-08-24  5:57 UTC (permalink / raw)
  To: Auger Eric, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel
  Cc: wei, kevin.tian, marc.zyngier, tn, will.deacon, drjones,
	robin.murphy, christoffer.dall

Hi Eric,

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: Wednesday, August 23, 2017 10:12 PM
> To: Bharat Bhushan <bharat.bhushan@nxp.com>;
> eric.auger.pro@gmail.com; peter.maydell@linaro.org;
> alex.williamson@redhat.com; mst@redhat.com; qemu-arm@nongnu.org;
> qemu-devel@nongnu.org
> Cc: wei@redhat.com; kevin.tian@intel.com; marc.zyngier@arm.com;
> tn@semihalf.com; will.deacon@arm.com; drjones@redhat.com;
> robin.murphy@arm.com; christoffer.dall@linaro.org
> Subject: Re: [PATCH v3 0/2] virtio-iommu: VFIO integration
> 
> Hi Bharat,
> 
> On 21/08/2017 12:48, Bharat Bhushan wrote:
> > This V3 version is mainly about rebasing on v3 version on Virtio-iommu
> > device framework from Eric Augur and addresing review comments.
> s/Augur/Auger ;-)
I am sorry,

> >
> > This patch series allows PCI pass-through using virtio-iommu.
> >
> > This series is based on:
> >  - virtio-iommu specification written by Jean-Philippe Brucker
> >    [RFC 0/3] virtio-iommu: a paravirtualized IOMMU,
> >
> >  - virtio-iommu driver by Jean-Philippe Brucker
> >    [RFC PATCH linux] iommu: Add virtio-iommu driver
> >
> >  - virtio-iommu device emulation by Eric Augur.
> >    [RFC v3 0/8] VIRTIO-IOMMU device
> >
> > PCI device pass-through and virtio-net-pci is tested with these
> > changes using dma-ops
> 
> I confirm it works fine now with 2 assigned VFs.
> 
> However at the moment DPDK testpmd using those 2 VFs does not work for
> me:
> 1:
> [/home/augere/UPSTREAM/dpdk/install/bin/testpmd(rte_dump_stack+0x2
> 4)
> [0x4a8a78]]
> 
> I haven't investigated yet...

I have not run DPDK before, I am compiling right now and run.

Thanks
-Bharat

> 
> Thanks
> 
> Eric
> >
> > This patch series does not implement RESV_MEM changes proposal by
> Jean-Philippe "https://lists.gnu.org/archive/html/qemu-devel/2017-
> 07/msg01796.html"
> >
> > v2->v3:
> >  - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device"
> >    Which is based on top of v2.10-rc0 that
> >  - Fixed issue with two PCI devices
> >  - Addressed review comments
> >
> > v1->v2:
> >   - Added trace events
> >   - removed vSMMU3 link in patch description
> >
> > Bharat Bhushan (2):
> >   target/arm/kvm: Translate the MSI doorbell in
> kvm_arch_fixup_msi_route
> >   virtio-iommu: vfio integration with virtio-iommu
> >
> >  hw/virtio/trace-events           |   5 ++
> >  hw/virtio/virtio-iommu.c         | 163
> +++++++++++++++++++++++++++++++++++++++
> >  include/hw/virtio/virtio-iommu.h |   6 ++
> >  target/arm/kvm.c                 |  27 +++++++
> >  target/arm/trace-events          |   3 +
> >  5 files changed, 204 insertions(+)
> >

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [Qemu-arm] [PATCH v3 0/2] virtio-iommu: VFIO integration
  2017-08-21 10:48 [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (2 preceding siblings ...)
  2017-08-23 16:41 ` [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration Auger Eric
@ 2017-08-29 17:06 ` Linu Cherian
  2017-08-29 19:42   ` Auger Eric
  3 siblings, 1 reply; 10+ messages in thread
From: Linu Cherian @ 2017-08-29 17:06 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: eric.auger, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, kevin.tian, marc.zyngier, tn, will.deacon,
	drjones, robin.murphy, christoffer.dall, linu.cherian

Hi,

On Mon Aug 21, 2017 at 04:18:52PM +0530, Bharat Bhushan wrote:
> This V3 version is mainly about rebasing on v3 version on Virtio-iommu device
> framework from Eric Augur and addresing review comments.
> 
> This patch series allows PCI pass-through using virtio-iommu.
>   
> This series is based on:
>  - virtio-iommu specification written by Jean-Philippe Brucker
>    [RFC 0/3] virtio-iommu: a paravirtualized IOMMU,
>  
>  - virtio-iommu driver by Jean-Philippe Brucker
>    [RFC PATCH linux] iommu: Add virtio-iommu driver
>  
>  - virtio-iommu device emulation by Eric Augur.
>    [RFC v3 0/8] VIRTIO-IOMMU device
> 
> PCI device pass-through and virtio-net-pci is tested with these changes using dma-ops
> 

Facing issues while trying to test with VFIO.

vfio_dma_map fails as below, 
qemu-system-aarch64: vfio_dma_map(0x1ff0da0, 0xfdfc7000, 0x1000, 0xffff79acc000) = -22 (Invalid argument)
Very likely this seem to be an issue with map size. Kernel PAGE_SIZE 
is 64k on my host and hence the map size for the physical SMMU also will
start with 64k. 

Qemu source: https://github.com/eauger/qemu.git + this patch series
             on branch v2.10.0-rc0-virtio-iommu-rfcv3
Linux source: git://linux-arm.org/linux-jpb.git
             on branch virtio-iommu/v0.1
Any pointers ?

The other related questions i had,
1.  In, virtio_iommu_device_realize in qemu,
   s->config.page_sizes = TARGET_PAGE_MASK;

  Same is being taken as pgsize_bitmap in virtio_iommu guest kernel driver.
  In, viommu_probe 
  virtio_cread(vdev, struct virtio_iommu_config, page_sizes,
                     &viommu->pgsize_bitmap);

  Should s->config.page_sizes be initialized with page bitmap instead
  of page mask ?

2. Should we not populate the supported page sizes based on  
   host kernel size and SMMU hardware capability rather than 
   based on the machine emulated on qemu? Atleast that makes
   sense for VFIO case.

> This patch series does not implement RESV_MEM changes proposal by Jean-Philippe "https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01796.html"
> 
> v2->v3:
>  - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device"
>    Which is based on top of v2.10-rc0 that 
>  - Fixed issue with two PCI devices
>  - Addressed review comments
> 
> v1->v2:
>   - Added trace events
>   - removed vSMMU3 link in patch description
> 
> Bharat Bhushan (2):
>   target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
>   virtio-iommu: vfio integration with virtio-iommu
> 
>  hw/virtio/trace-events           |   5 ++
>  hw/virtio/virtio-iommu.c         | 163 +++++++++++++++++++++++++++++++++++++++
>  include/hw/virtio/virtio-iommu.h |   6 ++
>  target/arm/kvm.c                 |  27 +++++++
>  target/arm/trace-events          |   3 +
>  5 files changed, 204 insertions(+)
> 
> -- 
> 1.9.3
> 
> 

-- 
Linu cherian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [Qemu-arm] [PATCH v3 0/2] virtio-iommu: VFIO integration
  2017-08-29 17:06 ` [Qemu-devel] [Qemu-arm] " Linu Cherian
@ 2017-08-29 19:42   ` Auger Eric
  0 siblings, 0 replies; 10+ messages in thread
From: Auger Eric @ 2017-08-29 19:42 UTC (permalink / raw)
  To: Linu Cherian, Bharat Bhushan
  Cc: peter.maydell, kevin.tian, drjones, mst, marc.zyngier, tn,
	will.deacon, qemu-devel, alex.williamson, qemu-arm, linu.cherian,
	robin.murphy, christoffer.dall, eric.auger.pro

Hi Linu,

On 29/08/2017 19:06, Linu Cherian wrote:
> Hi,
> 
> On Mon Aug 21, 2017 at 04:18:52PM +0530, Bharat Bhushan wrote:
>> This V3 version is mainly about rebasing on v3 version on Virtio-iommu device
>> framework from Eric Augur and addresing review comments.
>>
>> This patch series allows PCI pass-through using virtio-iommu.
>>   
>> This series is based on:
>>  - virtio-iommu specification written by Jean-Philippe Brucker
>>    [RFC 0/3] virtio-iommu: a paravirtualized IOMMU,
>>  
>>  - virtio-iommu driver by Jean-Philippe Brucker
>>    [RFC PATCH linux] iommu: Add virtio-iommu driver
>>  
>>  - virtio-iommu device emulation by Eric Augur.
>>    [RFC v3 0/8] VIRTIO-IOMMU device
>>
>> PCI device pass-through and virtio-net-pci is tested with these changes using dma-ops
>>
> 
> Facing issues while trying to test with VFIO.
> 
> vfio_dma_map fails as below, 
> qemu-system-aarch64: vfio_dma_map(0x1ff0da0, 0xfdfc7000, 0x1000, 0xffff79acc000) = -22 (Invalid argument)
> Very likely this seem to be an issue with map size. Kernel PAGE_SIZE 
> is 64k on my host and hence the map size for the physical SMMU also will
> start with 64k.
Most probably. I currently use 4KB on both host/guest.  Also the devices
I assign have BARs smaller than 64kB and this causes issue with DPDK.

> 
> Qemu source: https://github.com/eauger/qemu.git + this patch series
>              on branch v2.10.0-rc0-virtio-iommu-rfcv3
> Linux source: git://linux-arm.org/linux-jpb.git
>              on branch virtio-iommu/v0.1
> Any pointers ?
Looks good.
> 
> The other related questions i had,
> 1.  In, virtio_iommu_device_realize in qemu,
>    s->config.page_sizes = TARGET_PAGE_MASK;
> 
>   Same is being taken as pgsize_bitmap in virtio_iommu guest kernel driver.
>   In, viommu_probe 
>   virtio_cread(vdev, struct virtio_iommu_config, page_sizes,
>                      &viommu->pgsize_bitmap);
> 
>   Should s->config.page_sizes be initialized with page bitmap instead
>   of page mask ?
We currently support all page size bits greater or equal than the guest
page size
define TARGET_PAGE_MASK ~(TARGET_PAGE_SIZE - 1)

We evoked the problem you seem to face in
https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg05308.html

and the temporary solution was to use TARGET_PAGE_MASK
> 
> 2. Should we not populate the supported page sizes based on  
>    host kernel size and SMMU hardware capability rather than 
>    based on the machine emulated on qemu? Atleast that makes
>    sense for VFIO case.

I think Jean's proposal to address this issue is to enhance the PROBE
API. The driver would fetch for each device an accurate page_size_mask
that would characterize either the virtual iommu or the underlying
physical iommu. This would override the global page_size_mask. I think
the plan was to issue that for v0.5

Thanks

Eric
> 
>> This patch series does not implement RESV_MEM changes proposal by Jean-Philippe "https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01796.html"
>>
>> v2->v3:
>>  - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device"
>>    Which is based on top of v2.10-rc0 that 
>>  - Fixed issue with two PCI devices
>>  - Addressed review comments
>>
>> v1->v2:
>>   - Added trace events
>>   - removed vSMMU3 link in patch description
>>
>> Bharat Bhushan (2):
>>   target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
>>   virtio-iommu: vfio integration with virtio-iommu
>>
>>  hw/virtio/trace-events           |   5 ++
>>  hw/virtio/virtio-iommu.c         | 163 +++++++++++++++++++++++++++++++++++++++
>>  include/hw/virtio/virtio-iommu.h |   6 ++
>>  target/arm/kvm.c                 |  27 +++++++
>>  target/arm/trace-events          |   3 +
>>  5 files changed, 204 insertions(+)
>>
>> -- 
>> 1.9.3
>>
>>
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu
  2017-08-21 10:48 ` [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu Bharat Bhushan
@ 2017-09-18  7:47   ` Auger Eric
  2017-09-18 13:09     ` Auger Eric
  2017-09-22  7:45     ` Bharat Bhushan
  0 siblings, 2 replies; 10+ messages in thread
From: Auger Eric @ 2017-09-18  7:47 UTC (permalink / raw)
  To: Bharat Bhushan, eric.auger.pro, peter.maydell, alex.williamson,
	mst, qemu-arm, qemu-devel
  Cc: wei, kevin.tian, marc.zyngier, tn, will.deacon, drjones,
	robin.murphy, christoffer.dall

Hi Bharat,

On 21/08/2017 12:48, Bharat Bhushan wrote:
> This RFC patch allows virtio-iommu protection for PCI
> device-passthrough.
> 
> MSI region is mapped by current version of virtio-iommu driver.
> This uses VFIO extension of map/unmap notification when an area
> of memory is mappedi/unmapped in emulated iommu device.
> 
> This series is tested with 2 PCI devices to virtual machine using
> dma-ops and DPDK in VM is not yet tested.
> 
> Also with this series we observe below prints for MSI region mapping
> 
>    "qemu-system-aarch64: iommu map to non memory area 0"
> 
>    This print comes when vfio/map-notifier is called for MSI region.
> 
> vfio map/unmap notification is called for given device
>    This assumes that devid passed in virtio_iommu_attach is same as devfn
>    This assumption is based on 1:1 mapping of requested-id with device-id
>    in QEMU.
> 
> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com>
> ---
> v2->v3:
>  - Addressed review comments:
>     - virtio-iommu_map_region function is split in two functions
>       virtio_iommu_notify_map/virtio_iommu_notify_unmap
>     - use size received from driver and do not split in 4K pages
>  
>  - map/unmap notification is called for given device/as
>    This relies on devid passed in virtio_iommu_attach is same as devfn
>    This is assumed as iommu-map maps 1:1 requested-id to device-id in QEMU
>    Looking for comment about this assumtion.
> 
>  - Keeping track devices in address-space
> 
>  - Verified with 2 PCI endpoints
>  - some code cleanup
> 
>  hw/virtio/trace-events           |   5 ++
>  hw/virtio/virtio-iommu.c         | 163 +++++++++++++++++++++++++++++++++++++++
>  include/hw/virtio/virtio-iommu.h |   6 ++
>  3 files changed, 174 insertions(+)
> 
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 8db3d91..7e9663f 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -39,3 +39,8 @@ virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low,
>  virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
>  virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
>  virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
> +virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
> +virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
> +virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> +virtio_iommu_notify_map(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> +virtio_iommu_notify_unmap(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
useless paddr
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 9217587..9eae050 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -55,11 +55,13 @@ typedef struct viommu_interval {
>  typedef struct viommu_dev {
>      uint32_t id;
>      viommu_as *as;
> +    QLIST_ENTRY(viommu_dev) next;
>  } viommu_dev;
>  
>  struct viommu_as {
>      uint32_t id;
>      GTree *mappings;
> +    QLIST_HEAD(, viommu_dev) device_list;
>  };
>  
>  static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
> @@ -133,12 +135,70 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
>      }
>  }
>  
> +static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
> +                                    hwaddr paddr, hwaddr size)
> +{
> +    IOMMUTLBEntry entry;
> +
> +    entry.target_as = &address_space_memory;
> +    entry.addr_mask = size - 1;
> +
> +    entry.iova = iova;
> +    trace_virtio_iommu_notify_map(iova, paddr, size);
> +    entry.perm = IOMMU_RW;
> +    entry.translated_addr = paddr;
> +
> +    memory_region_notify_iommu(mr, entry);
> +}
> +
> +static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
> +                                      hwaddr paddr, hwaddr size)
unmap() doesn't need to have paddr arg.
> +{
> +    IOMMUTLBEntry entry;
> +
> +    entry.target_as = &address_space_memory;
> +    entry.addr_mask = size - 1;
> +
> +    entry.iova = iova;
> +    trace_virtio_iommu_notify_unmap(iova, paddr, size);
ditto
> +    entry.perm = IOMMU_NONE;
> +    entry.translated_addr = 0;
> +
> +    memory_region_notify_iommu(mr, entry);
> +}
> +
> +static gboolean virtio_iommu_maping_unmap(gpointer key, gpointer value,
> +                                          gpointer data)
s/virtio_iommu_maping_unmap/virtio_iommu_mapping_unmap
> +{
> +    viommu_mapping *mapping = (viommu_mapping *) value;
> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> +
> +    virtio_iommu_notify_unmap(mr, mapping->virt_addr, 0, mapping->size);
> +
> +    return true;
> +}
> +
>  static void virtio_iommu_detach_dev(VirtIOIOMMU *s, viommu_dev *dev)
>  {
> +    VirtioIOMMUNotifierNode *node;
>      viommu_as *as = dev->as;
> +    int devid = dev->id;
>  
>      trace_virtio_iommu_detach(dev->id);
>  
> +    /* Remove device from address-space list */
> +    QLIST_REMOVE(dev, next);
> +    /* unmap all if no devices attached to address-spaceRemove */
comment to be reworked
> +    if (QLIST_EMPTY(&as->device_list)) {
I don't get why you execute the code below only if the device_list is
empty. Each time a device is detached don't you need to notify its
associated iommu_mr?
> +        QLIST_FOREACH(node, &s->notifiers_list, next) {
> +            if (devid == node->iommu_dev->devfn) {
> +                g_tree_foreach(as->mappings, virtio_iommu_maping_unmap,
> +                               &node->iommu_dev->iommu_mr);
> +            }
> +        }
> +    }
> +
> +    /* Remove device from global list */
>      g_tree_remove(s->devices, GUINT_TO_POINTER(dev->id));
>      g_tree_unref(as->mappings);
>  }
> @@ -171,6 +231,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>      if (!as) {
>          as = g_malloc0(sizeof(*as));
>          as->id = asid;
> +        QLIST_INIT(&as->device_list);
>          as->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp,
>                                           NULL, NULL, (GDestroyNotify)g_free);
>          g_tree_insert(s->address_spaces, GUINT_TO_POINTER(asid), as);
> @@ -182,6 +243,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>      dev->id = devid;
>      trace_virtio_iommu_new_devid(devid);
>      g_tree_insert(s->devices, GUINT_TO_POINTER(devid), dev);
> +    QLIST_INSERT_HEAD(&as->device_list, dev, next);
>      g_tree_ref(as->mappings);
>  
>      return VIRTIO_IOMMU_S_OK;
> @@ -219,6 +281,8 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>      viommu_as *as;
>      viommu_interval *interval;
>      viommu_mapping *mapping;
> +    VirtioIOMMUNotifierNode *node;
> +    viommu_dev *dev;
>  
>      interval = g_malloc0(sizeof(*interval));
>  
> @@ -230,6 +294,11 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>          return VIRTIO_IOMMU_S_NOENT;
>      }
>  
> +    dev = QLIST_FIRST(&as->device_list);
> +    if (!dev) {
> +        return VIRTIO_IOMMU_S_NOENT;
> +    }
> +
>      mapping = g_tree_lookup(as->mappings, (gpointer)interval);
>      if (mapping) {
>          g_free(interval);
> @@ -246,6 +315,14 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>  
>      g_tree_insert(as->mappings, interval, mapping);
>  
> +    /* All devices in an address-space share mapping */
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        if (dev->id == node->iommu_dev->devfn) {
> +            virtio_iommu_notify_map(&node->iommu_dev->iommu_mr, virt_addr,
> +                                    phys_addr, size);
> +        }
> +    }
> +
>      return VIRTIO_IOMMU_S_OK;
>  }
>  
> @@ -259,6 +336,8 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>      viommu_mapping *mapping;
>      viommu_interval interval;
>      viommu_as *as;
> +    VirtioIOMMUNotifierNode *node;
> +    viommu_dev *dev;
>  
>      trace_virtio_iommu_unmap(asid, virt_addr, size, flags);
>  
> @@ -267,6 +346,12 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>          error_report("%s: no as", __func__);
>          return VIRTIO_IOMMU_S_NOENT;
>      }
> +
> +    dev = QLIST_FIRST(&as->device_list);
> +    if (!dev) {
> +        return VIRTIO_IOMMU_S_NOENT;
> +    }
> +
>      interval.low = virt_addr;
>      interval.high = virt_addr + size - 1;
>  
> @@ -296,7 +381,15 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>          } else {
>              break;
>          }
> +
extra line
>          if (interval.low >= interval.high) {
> +            /* All devices in an address-space share mapping */
> +            QLIST_FOREACH(node, &s->notifiers_list, next) {
> +                if (dev->id == node->iommu_dev->devfn) {
> +                    virtio_iommu_notify_unmap(&node->iommu_dev->iommu_mr, virt_addr,
> +                                              0, size);
Why do you notify only a single mr? All devices belonging to the as are
affected by this change. So don't you need to notify all as device's mr?
> +                }
> +            }
>              return VIRTIO_IOMMU_S_OK;
>          } else {
>              mapping = g_tree_lookup(as->mappings, (gpointer)&interval);
> @@ -439,6 +532,36 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>      }
>  }
>  
> +static void virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
> +                                          IOMMUNotifierFlag old,
> +                                          IOMMUNotifierFlag new)
> +{
> +    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr);
> +    VirtIOIOMMU *s = sdev->viommu;
> +    VirtioIOMMUNotifierNode *node = NULL;
> +    VirtioIOMMUNotifierNode *next_node = NULL;
> +
> +    if (old == IOMMU_NOTIFIER_NONE) {
> +        trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name);
> +        node = g_malloc0(sizeof(*node));
> +        node->iommu_dev = sdev;
> +        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
> +        return;
> +    }
> +
> +    /* update notifier node with new flags */
> +    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
> +        if (node->iommu_dev == sdev) {
> +            if (new == IOMMU_NOTIFIER_NONE) {
> +                trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name);
> +                QLIST_REMOVE(node, next);
> +                g_free(node);
> +            }
> +            return;
> +        }
> +    }
> +}
> +
>  static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>                                              IOMMUAccessFlags flag)
>  {
> @@ -553,11 +676,49 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
>      return (ua > ub) - (ua < ub);
>  }
>  
> +static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
> +{
> +    viommu_mapping *mapping = (viommu_mapping *) value;
> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> +
> +    trace_virtio_iommu_remap(mapping->virt_addr, mapping->phys_addr,
> +                             mapping->size);
> +    /* unmap previous entry and map again */
> +    virtio_iommu_notify_unmap(mr, mapping->virt_addr, 0, mapping->size);
> +
> +    virtio_iommu_notify_map(mr, mapping->virt_addr, mapping->phys_addr,
> +                            mapping->size);
> +    return true;
you need to return false here otherwise  g_tree_foreach will return on
the first as mapping.

Thanks

Eric
> +}
> +
> +static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
> +{
> +    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
> +    VirtIOIOMMU *s = sdev->viommu;
> +    uint32_t sid;
> +    viommu_dev *dev;
> +
> +    sid = virtio_iommu_get_sid(sdev);
> +
> +    qemu_mutex_lock(&s->mutex);
> +
> +    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(sid));
> +    if (!dev) {
> +        goto unlock;
> +    }
> +
> +    g_tree_foreach(dev->as->mappings, virtio_iommu_remap, mr);
> +
> +unlock:
> +    qemu_mutex_unlock(&s->mutex);
> +}
> +
>  static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>      VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
>  
> +    QLIST_INIT(&s->notifiers_list);
>      virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
>                  sizeof(struct virtio_iommu_config));
>  
> @@ -644,6 +805,8 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
>      IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
>  
>      imrc->translate = virtio_iommu_translate;
> +    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
> +    imrc->replay = virtio_iommu_replay;
>  }
>  
>  static const TypeInfo virtio_iommu_info = {
> diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
> index f9c988f..7e04184 100644
> --- a/include/hw/virtio/virtio-iommu.h
> +++ b/include/hw/virtio/virtio-iommu.h
> @@ -46,6 +46,11 @@ typedef struct IOMMUPciBus {
>      IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
>  } IOMMUPciBus;
>  
> +typedef struct VirtioIOMMUNotifierNode {
> +    IOMMUDevice *iommu_dev;
> +    QLIST_ENTRY(VirtioIOMMUNotifierNode) next;
> +} VirtioIOMMUNotifierNode;
> +
>  typedef struct VirtIOIOMMU {
>      VirtIODevice parent_obj;
>      VirtQueue *vq;
> @@ -56,6 +61,7 @@ typedef struct VirtIOIOMMU {
>      GTree *address_spaces;
>      QemuMutex mutex;
>      GTree *devices;
> +    QLIST_HEAD(, VirtioIOMMUNotifierNode) notifiers_list;
>  } VirtIOIOMMU;
>  
>  #endif
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu
  2017-09-18  7:47   ` Auger Eric
@ 2017-09-18 13:09     ` Auger Eric
  2017-09-22  7:45     ` Bharat Bhushan
  1 sibling, 0 replies; 10+ messages in thread
From: Auger Eric @ 2017-09-18 13:09 UTC (permalink / raw)
  To: Bharat Bhushan, eric.auger.pro, peter.maydell, alex.williamson,
	mst, qemu-arm, qemu-devel
  Cc: wei, kevin.tian, marc.zyngier, tn, will.deacon, drjones,
	robin.murphy, christoffer.dall

Hi Bharat,

On 18/09/2017 09:47, Auger Eric wrote:
> Hi Bharat,
> 
> On 21/08/2017 12:48, Bharat Bhushan wrote:
>> This RFC patch allows virtio-iommu protection for PCI
>> device-passthrough.
>>
>> MSI region is mapped by current version of virtio-iommu driver.
>> This uses VFIO extension of map/unmap notification when an area
>> of memory is mappedi/unmapped in emulated iommu device.
>>
>> This series is tested with 2 PCI devices to virtual machine using
>> dma-ops and DPDK in VM is not yet tested.
>>
>> Also with this series we observe below prints for MSI region mapping
>>
>>    "qemu-system-aarch64: iommu map to non memory area 0"
>>
>>    This print comes when vfio/map-notifier is called for MSI region.
>>
>> vfio map/unmap notification is called for given device
>>    This assumes that devid passed in virtio_iommu_attach is same as devfn
>>    This assumption is based on 1:1 mapping of requested-id with device-id
>>    in QEMU.
>>
>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com>
>> ---
>> v2->v3:
>>  - Addressed review comments:
>>     - virtio-iommu_map_region function is split in two functions
>>       virtio_iommu_notify_map/virtio_iommu_notify_unmap
>>     - use size received from driver and do not split in 4K pages
>>  
>>  - map/unmap notification is called for given device/as
>>    This relies on devid passed in virtio_iommu_attach is same as devfn
>>    This is assumed as iommu-map maps 1:1 requested-id to device-id in QEMU
>>    Looking for comment about this assumtion.
>>
>>  - Keeping track devices in address-space
>>
>>  - Verified with 2 PCI endpoints
>>  - some code cleanup
>>
>>  hw/virtio/trace-events           |   5 ++
>>  hw/virtio/virtio-iommu.c         | 163 +++++++++++++++++++++++++++++++++++++++
>>  include/hw/virtio/virtio-iommu.h |   6 ++
>>  3 files changed, 174 insertions(+)
>>
>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
>> index 8db3d91..7e9663f 100644
>> --- a/hw/virtio/trace-events
>> +++ b/hw/virtio/trace-events
>> @@ -39,3 +39,8 @@ virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low,
>>  virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
>>  virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
>>  virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
>> +virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
>> +virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
>> +virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
>> +virtio_iommu_notify_map(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
>> +virtio_iommu_notify_unmap(hwaddr iova, hwaddr paddr, hwaddr map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> useless paddr
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index 9217587..9eae050 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -55,11 +55,13 @@ typedef struct viommu_interval {
>>  typedef struct viommu_dev {
>>      uint32_t id;
>>      viommu_as *as;
>> +    QLIST_ENTRY(viommu_dev) next;
>>  } viommu_dev;
>>  
>>  struct viommu_as {
>>      uint32_t id;
>>      GTree *mappings;
>> +    QLIST_HEAD(, viommu_dev) device_list;
>>  };
>>  
>>  static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
>> @@ -133,12 +135,70 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
>>      }
>>  }
>>  
>> +static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
>> +                                    hwaddr paddr, hwaddr size)
>> +{
>> +    IOMMUTLBEntry entry;
>> +
>> +    entry.target_as = &address_space_memory;
>> +    entry.addr_mask = size - 1;
>> +
>> +    entry.iova = iova;
>> +    trace_virtio_iommu_notify_map(iova, paddr, size);
>> +    entry.perm = IOMMU_RW;
>> +    entry.translated_addr = paddr;
>> +
>> +    memory_region_notify_iommu(mr, entry);
>> +}
>> +
>> +static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
>> +                                      hwaddr paddr, hwaddr size)
> unmap() doesn't need to have paddr arg.
>> +{
>> +    IOMMUTLBEntry entry;
>> +
>> +    entry.target_as = &address_space_memory;
>> +    entry.addr_mask = size - 1;
>> +
>> +    entry.iova = iova;
>> +    trace_virtio_iommu_notify_unmap(iova, paddr, size);
> ditto
>> +    entry.perm = IOMMU_NONE;
>> +    entry.translated_addr = 0;
>> +
>> +    memory_region_notify_iommu(mr, entry);
>> +}
>> +
>> +static gboolean virtio_iommu_maping_unmap(gpointer key, gpointer value,
>> +                                          gpointer data)
> s/virtio_iommu_maping_unmap/virtio_iommu_mapping_unmap
>> +{
>> +    viommu_mapping *mapping = (viommu_mapping *) value;
>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
>> +
>> +    virtio_iommu_notify_unmap(mr, mapping->virt_addr, 0, mapping->size);
>> +
>> +    return true;
you need to return false here as well.
>> +}
>> +
>>  static void virtio_iommu_detach_dev(VirtIOIOMMU *s, viommu_dev *dev)
>>  {
>> +    VirtioIOMMUNotifierNode *node;
>>      viommu_as *as = dev->as;
>> +    int devid = dev->id;
>>  
>>      trace_virtio_iommu_detach(dev->id);
>>  
>> +    /* Remove device from address-space list */
>> +    QLIST_REMOVE(dev, next);
>> +    /* unmap all if no devices attached to address-spaceRemove */
> comment to be reworked
>> +    if (QLIST_EMPTY(&as->device_list)) {
> I don't get why you execute the code below only if the device_list is
> empty. Each time a device is detached don't you need to notify its
> associated iommu_mr?
>> +        QLIST_FOREACH(node, &s->notifiers_list, next) {
>> +            if (devid == node->iommu_dev->devfn) {
>> +                g_tree_foreach(as->mappings, virtio_iommu_maping_unmap,
>> +                               &node->iommu_dev->iommu_mr);
>> +            }
>> +        }
>> +    }
>> +
>> +    /* Remove device from global list */
>>      g_tree_remove(s->devices, GUINT_TO_POINTER(dev->id));
>>      g_tree_unref(as->mappings);
>>  }
>> @@ -171,6 +231,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>>      if (!as) {
>>          as = g_malloc0(sizeof(*as));
>>          as->id = asid;
>> +        QLIST_INIT(&as->device_list);
>>          as->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp,
>>                                           NULL, NULL, (GDestroyNotify)g_free);
>>          g_tree_insert(s->address_spaces, GUINT_TO_POINTER(asid), as);
>> @@ -182,6 +243,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
>>      dev->id = devid;
>>      trace_virtio_iommu_new_devid(devid);
>>      g_tree_insert(s->devices, GUINT_TO_POINTER(devid), dev);
>> +    QLIST_INSERT_HEAD(&as->device_list, dev, next);
>>      g_tree_ref(as->mappings);
You may attach a device on as which already has mappings. In this case
you need to replay the mappings on the iommu mr associated to the
device. So you need something similar to virtio_iommu_map(p)ing_unmap
but for map.

I now have something working with DPDK on top of v0.4. I will provide
you with a branch so that you can analyze my changes in the VFIO
integration.

Thanks

Eric
>>  
>>      return VIRTIO_IOMMU_S_OK;
>> @@ -219,6 +281,8 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>>      viommu_as *as;
>>      viommu_interval *interval;
>>      viommu_mapping *mapping;
>> +    VirtioIOMMUNotifierNode *node;
>> +    viommu_dev *dev;
>>  
>>      interval = g_malloc0(sizeof(*interval));
>>  
>> @@ -230,6 +294,11 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>>          return VIRTIO_IOMMU_S_NOENT;
>>      }
>>  
>> +    dev = QLIST_FIRST(&as->device_list);
>> +    if (!dev) {
>> +        return VIRTIO_IOMMU_S_NOENT;
>> +    }
>> +
>>      mapping = g_tree_lookup(as->mappings, (gpointer)interval);
>>      if (mapping) {
>>          g_free(interval);
>> @@ -246,6 +315,14 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
>>  
>>      g_tree_insert(as->mappings, interval, mapping);
>>  
>> +    /* All devices in an address-space share mapping */
>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>> +        if (dev->id == node->iommu_dev->devfn) {
>> +            virtio_iommu_notify_map(&node->iommu_dev->iommu_mr, virt_addr,
>> +                                    phys_addr, size);
>> +        }
>> +    }
>> +
>>      return VIRTIO_IOMMU_S_OK;
>>  }
>>  
>> @@ -259,6 +336,8 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>>      viommu_mapping *mapping;
>>      viommu_interval interval;
>>      viommu_as *as;
>> +    VirtioIOMMUNotifierNode *node;
>> +    viommu_dev *dev;
>>  
>>      trace_virtio_iommu_unmap(asid, virt_addr, size, flags);
>>  
>> @@ -267,6 +346,12 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>>          error_report("%s: no as", __func__);
>>          return VIRTIO_IOMMU_S_NOENT;
>>      }
>> +
>> +    dev = QLIST_FIRST(&as->device_list);
>> +    if (!dev) {
>> +        return VIRTIO_IOMMU_S_NOENT;
>> +    }
>> +
>>      interval.low = virt_addr;
>>      interval.high = virt_addr + size - 1;
>>  
>> @@ -296,7 +381,15 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>>          } else {
>>              break;
>>          }
>> +
> extra line
>>          if (interval.low >= interval.high) {
>> +            /* All devices in an address-space share mapping */
>> +            QLIST_FOREACH(node, &s->notifiers_list, next) {
>> +                if (dev->id == node->iommu_dev->devfn) {
>> +                    virtio_iommu_notify_unmap(&node->iommu_dev->iommu_mr, virt_addr,
>> +                                              0, size);
> Why do you notify only a single mr? All devices belonging to the as are
> affected by this change. So don't you need to notify all as device's mr?
>> +                }
>> +            }
>>              return VIRTIO_IOMMU_S_OK;
>>          } else {
>>              mapping = g_tree_lookup(as->mappings, (gpointer)&interval);
>> @@ -439,6 +532,36 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>>      }
>>  }
>>  
>> +static void virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
>> +                                          IOMMUNotifierFlag old,
>> +                                          IOMMUNotifierFlag new)
>> +{
>> +    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr);
>> +    VirtIOIOMMU *s = sdev->viommu;
>> +    VirtioIOMMUNotifierNode *node = NULL;
>> +    VirtioIOMMUNotifierNode *next_node = NULL;
>> +
>> +    if (old == IOMMU_NOTIFIER_NONE) {
>> +        trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name);
>> +        node = g_malloc0(sizeof(*node));
>> +        node->iommu_dev = sdev;
>> +        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
>> +        return;
>> +    }
>> +
>> +    /* update notifier node with new flags */
>> +    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
>> +        if (node->iommu_dev == sdev) {
>> +            if (new == IOMMU_NOTIFIER_NONE) {
>> +                trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name);
>> +                QLIST_REMOVE(node, next);
>> +                g_free(node);
>> +            }
>> +            return;
>> +        }
>> +    }
>> +}
>> +
>>  static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>>                                              IOMMUAccessFlags flag)
>>  {
>> @@ -553,11 +676,49 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
>>      return (ua > ub) - (ua < ub);
>>  }
>>  
>> +static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
>> +{
>> +    viommu_mapping *mapping = (viommu_mapping *) value;
>> +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
>> +
>> +    trace_virtio_iommu_remap(mapping->virt_addr, mapping->phys_addr,
>> +                             mapping->size);
>> +    /* unmap previous entry and map again */
>> +    virtio_iommu_notify_unmap(mr, mapping->virt_addr, 0, mapping->size);
>> +
>> +    virtio_iommu_notify_map(mr, mapping->virt_addr, mapping->phys_addr,
>> +                            mapping->size);
>> +    return true;
> you need to return false here otherwise  g_tree_foreach will return on
> the first as mapping.
> 
> Thanks
> 
> Eric
>> +}
>> +
>> +static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>> +{
>> +    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
>> +    VirtIOIOMMU *s = sdev->viommu;
>> +    uint32_t sid;
>> +    viommu_dev *dev;
>> +
>> +    sid = virtio_iommu_get_sid(sdev);
>> +
>> +    qemu_mutex_lock(&s->mutex);
>> +
>> +    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(sid));
>> +    if (!dev) {
>> +        goto unlock;
>> +    }
>> +
>> +    g_tree_foreach(dev->as->mappings, virtio_iommu_remap, mr);
>> +
>> +unlock:
>> +    qemu_mutex_unlock(&s->mutex);
>> +}
>> +
>>  static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>  {
>>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>      VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
>>  
>> +    QLIST_INIT(&s->notifiers_list);
>>      virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
>>                  sizeof(struct virtio_iommu_config));
>>  
>> @@ -644,6 +805,8 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
>>      IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
>>  
>>      imrc->translate = virtio_iommu_translate;
>> +    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
>> +    imrc->replay = virtio_iommu_replay;
>>  }
>>  
>>  static const TypeInfo virtio_iommu_info = {
>> diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
>> index f9c988f..7e04184 100644
>> --- a/include/hw/virtio/virtio-iommu.h
>> +++ b/include/hw/virtio/virtio-iommu.h
>> @@ -46,6 +46,11 @@ typedef struct IOMMUPciBus {
>>      IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
>>  } IOMMUPciBus;
>>  
>> +typedef struct VirtioIOMMUNotifierNode {
>> +    IOMMUDevice *iommu_dev;
>> +    QLIST_ENTRY(VirtioIOMMUNotifierNode) next;
>> +} VirtioIOMMUNotifierNode;
>> +
>>  typedef struct VirtIOIOMMU {
>>      VirtIODevice parent_obj;
>>      VirtQueue *vq;
>> @@ -56,6 +61,7 @@ typedef struct VirtIOIOMMU {
>>      GTree *address_spaces;
>>      QemuMutex mutex;
>>      GTree *devices;
>> +    QLIST_HEAD(, VirtioIOMMUNotifierNode) notifiers_list;
>>  } VirtIOIOMMU;
>>  
>>  #endif
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu
  2017-09-18  7:47   ` Auger Eric
  2017-09-18 13:09     ` Auger Eric
@ 2017-09-22  7:45     ` Bharat Bhushan
  1 sibling, 0 replies; 10+ messages in thread
From: Bharat Bhushan @ 2017-09-22  7:45 UTC (permalink / raw)
  To: Auger Eric, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel
  Cc: wei, kevin.tian, marc.zyngier, tn, will.deacon, drjones,
	robin.murphy, christoffer.dall



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: Monday, September 18, 2017 1:18 PM
> To: Bharat Bhushan <bharat.bhushan@nxp.com>;
> eric.auger.pro@gmail.com; peter.maydell@linaro.org;
> alex.williamson@redhat.com; mst@redhat.com; qemu-arm@nongnu.org;
> qemu-devel@nongnu.org
> Cc: wei@redhat.com; kevin.tian@intel.com; marc.zyngier@arm.com;
> tn@semihalf.com; will.deacon@arm.com; drjones@redhat.com;
> robin.murphy@arm.com; christoffer.dall@linaro.org
> Subject: Re: [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with
> virtio-iommu
> 
> Hi Bharat,
> 
> On 21/08/2017 12:48, Bharat Bhushan wrote:
> > This RFC patch allows virtio-iommu protection for PCI
> > device-passthrough.
> >
> > MSI region is mapped by current version of virtio-iommu driver.
> > This uses VFIO extension of map/unmap notification when an area of
> > memory is mappedi/unmapped in emulated iommu device.
> >
> > This series is tested with 2 PCI devices to virtual machine using
> > dma-ops and DPDK in VM is not yet tested.
> >
> > Also with this series we observe below prints for MSI region mapping
> >
> >    "qemu-system-aarch64: iommu map to non memory area 0"
> >
> >    This print comes when vfio/map-notifier is called for MSI region.
> >
> > vfio map/unmap notification is called for given device
> >    This assumes that devid passed in virtio_iommu_attach is same as devfn
> >    This assumption is based on 1:1 mapping of requested-id with device-id
> >    in QEMU.
> >
> > Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com>
> > ---
> > v2->v3:
> >  - Addressed review comments:
> >     - virtio-iommu_map_region function is split in two functions
> >       virtio_iommu_notify_map/virtio_iommu_notify_unmap
> >     - use size received from driver and do not split in 4K pages
> >
> >  - map/unmap notification is called for given device/as
> >    This relies on devid passed in virtio_iommu_attach is same as devfn
> >    This is assumed as iommu-map maps 1:1 requested-id to device-id in
> QEMU
> >    Looking for comment about this assumtion.
> >
> >  - Keeping track devices in address-space
> >
> >  - Verified with 2 PCI endpoints
> >  - some code cleanup
> >
> >  hw/virtio/trace-events           |   5 ++
> >  hw/virtio/virtio-iommu.c         | 163
> +++++++++++++++++++++++++++++++++++++++
> >  include/hw/virtio/virtio-iommu.h |   6 ++
> >  3 files changed, 174 insertions(+)
> >
> > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index
> > 8db3d91..7e9663f 100644
> > --- a/hw/virtio/trace-events
> > +++ b/hw/virtio/trace-events
> > @@ -39,3 +39,8 @@ virtio_iommu_unmap_left_interval(uint64_t low,
> > uint64_t high, uint64_t next_low,
> virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t
> next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"],
> new interval=[0x%"PRIx64",0x%"PRIx64"]"
> >  virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap
> inc [0x%"PRIx64",0x%"PRIx64"]"
> >  virtio_iommu_translate_result(uint64_t virt_addr, uint64_t phys_addr,
> uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
> > +virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu
> notifier node for memory region %s"
> > +virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu
> notifier node for memory region %s"
> > +virtio_iommu_remap(hwaddr iova, hwaddr pa, hwaddr size)
> "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> > +virtio_iommu_notify_map(hwaddr iova, hwaddr paddr, hwaddr
> map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> > +virtio_iommu_notify_unmap(hwaddr iova, hwaddr paddr, hwaddr
> map_size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> useless paddr
> > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index
> > 9217587..9eae050 100644
> > --- a/hw/virtio/virtio-iommu.c
> > +++ b/hw/virtio/virtio-iommu.c
> > @@ -55,11 +55,13 @@ typedef struct viommu_interval {  typedef struct
> > viommu_dev {
> >      uint32_t id;
> >      viommu_as *as;
> > +    QLIST_ENTRY(viommu_dev) next;
> >  } viommu_dev;
> >
> >  struct viommu_as {
> >      uint32_t id;
> >      GTree *mappings;
> > +    QLIST_HEAD(, viommu_dev) device_list;
> >  };
> >
> >  static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev) @@
> > -133,12 +135,70 @@ static gint interval_cmp(gconstpointer a, gconstpointer
> b, gpointer user_data)
> >      }
> >  }
> >
> > +static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr,
> hwaddr iova,
> > +                                    hwaddr paddr, hwaddr size) {
> > +    IOMMUTLBEntry entry;
> > +
> > +    entry.target_as = &address_space_memory;
> > +    entry.addr_mask = size - 1;
> > +
> > +    entry.iova = iova;
> > +    trace_virtio_iommu_notify_map(iova, paddr, size);
> > +    entry.perm = IOMMU_RW;
> > +    entry.translated_addr = paddr;
> > +
> > +    memory_region_notify_iommu(mr, entry); }
> > +
> > +static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr,
> hwaddr iova,
> > +                                      hwaddr paddr, hwaddr size)
> unmap() doesn't need to have paddr arg.

Yes, will fix this

> > +{
> > +    IOMMUTLBEntry entry;
> > +
> > +    entry.target_as = &address_space_memory;
> > +    entry.addr_mask = size - 1;
> > +
> > +    entry.iova = iova;
> > +    trace_virtio_iommu_notify_unmap(iova, paddr, size);
> ditto
> > +    entry.perm = IOMMU_NONE;
> > +    entry.translated_addr = 0;
> > +
> > +    memory_region_notify_iommu(mr, entry); }
> > +
> > +static gboolean virtio_iommu_maping_unmap(gpointer key, gpointer
> value,
> > +                                          gpointer data)
> s/virtio_iommu_maping_unmap/virtio_iommu_mapping_unmap
> > +{
> > +    viommu_mapping *mapping = (viommu_mapping *) value;
> > +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> > +
> > +    virtio_iommu_notify_unmap(mr, mapping->virt_addr, 0,
> > + mapping->size);
> > +
> > +    return true;
> > +}
> > +
> >  static void virtio_iommu_detach_dev(VirtIOIOMMU *s, viommu_dev
> *dev)
> > {
> > +    VirtioIOMMUNotifierNode *node;
> >      viommu_as *as = dev->as;
> > +    int devid = dev->id;
> >
> >      trace_virtio_iommu_detach(dev->id);
> >
> > +    /* Remove device from address-space list */
> > +    QLIST_REMOVE(dev, next);
> > +    /* unmap all if no devices attached to address-spaceRemove */
> comment to be reworked
> > +    if (QLIST_EMPTY(&as->device_list)) {
> I don't get why you execute the code below only if the device_list is empty.
> Each time a device is detached don't you need to notify its associated
> iommu_mr?

There can be multiple devices share same address space, should not we unmap address space when last device in removed? 

> > +        QLIST_FOREACH(node, &s->notifiers_list, next) {
> > +            if (devid == node->iommu_dev->devfn) {
> > +                g_tree_foreach(as->mappings, virtio_iommu_maping_unmap,
> > +                               &node->iommu_dev->iommu_mr);
> > +            }
> > +        }
> > +    }
> > +
> > +    /* Remove device from global list */
> >      g_tree_remove(s->devices, GUINT_TO_POINTER(dev->id));
> >      g_tree_unref(as->mappings);
> >  }
> > @@ -171,6 +231,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> >      if (!as) {
> >          as = g_malloc0(sizeof(*as));
> >          as->id = asid;
> > +        QLIST_INIT(&as->device_list);
> >          as->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp,
> >                                           NULL, NULL, (GDestroyNotify)g_free);
> >          g_tree_insert(s->address_spaces, GUINT_TO_POINTER(asid), as);
> > @@ -182,6 +243,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
> >      dev->id = devid;
> >      trace_virtio_iommu_new_devid(devid);
> >      g_tree_insert(s->devices, GUINT_TO_POINTER(devid), dev);
> > +    QLIST_INSERT_HEAD(&as->device_list, dev, next);
> >      g_tree_ref(as->mappings);
> >
> >      return VIRTIO_IOMMU_S_OK;
> > @@ -219,6 +281,8 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
> >      viommu_as *as;
> >      viommu_interval *interval;
> >      viommu_mapping *mapping;
> > +    VirtioIOMMUNotifierNode *node;
> > +    viommu_dev *dev;
> >
> >      interval = g_malloc0(sizeof(*interval));
> >
> > @@ -230,6 +294,11 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
> >          return VIRTIO_IOMMU_S_NOENT;
> >      }
> >
> > +    dev = QLIST_FIRST(&as->device_list);
> > +    if (!dev) {
> > +        return VIRTIO_IOMMU_S_NOENT;
> > +    }
> > +
> >      mapping = g_tree_lookup(as->mappings, (gpointer)interval);
> >      if (mapping) {
> >          g_free(interval);
> > @@ -246,6 +315,14 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
> >
> >      g_tree_insert(as->mappings, interval, mapping);
> >
> > +    /* All devices in an address-space share mapping */
> > +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> > +        if (dev->id == node->iommu_dev->devfn) {
> > +            virtio_iommu_notify_map(&node->iommu_dev->iommu_mr,
> virt_addr,
> > +                                    phys_addr, size);
> > +        }
> > +    }
> > +
> >      return VIRTIO_IOMMU_S_OK;
> >  }
> >
> > @@ -259,6 +336,8 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
> >      viommu_mapping *mapping;
> >      viommu_interval interval;
> >      viommu_as *as;
> > +    VirtioIOMMUNotifierNode *node;
> > +    viommu_dev *dev;
> >
> >      trace_virtio_iommu_unmap(asid, virt_addr, size, flags);
> >
> > @@ -267,6 +346,12 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
> >          error_report("%s: no as", __func__);
> >          return VIRTIO_IOMMU_S_NOENT;
> >      }
> > +
> > +    dev = QLIST_FIRST(&as->device_list);
> > +    if (!dev) {
> > +        return VIRTIO_IOMMU_S_NOENT;
> > +    }
> > +
> >      interval.low = virt_addr;
> >      interval.high = virt_addr + size - 1;
> >
> > @@ -296,7 +381,15 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
> >          } else {
> >              break;
> >          }
> > +
> extra line
> >          if (interval.low >= interval.high) {
> > +            /* All devices in an address-space share mapping */
> > +            QLIST_FOREACH(node, &s->notifiers_list, next) {
> > +                if (dev->id == node->iommu_dev->devfn) {
> > +                    virtio_iommu_notify_unmap(&node->iommu_dev-
> >iommu_mr, virt_addr,
> > +                                              0, size);
> Why do you notify only a single mr? All devices belonging to the as are
> affected by this change. So don't you need to notify all as device's mr?
> > +                }
> > +            }
> >              return VIRTIO_IOMMU_S_OK;
> >          } else {
> >              mapping = g_tree_lookup(as->mappings,
> > (gpointer)&interval); @@ -439,6 +532,36 @@ static void
> virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
> >      }
> >  }
> >
> > +static void virtio_iommu_notify_flag_changed(IOMMUMemoryRegion
> *iommu_mr,
> > +                                          IOMMUNotifierFlag old,
> > +                                          IOMMUNotifierFlag new) {
> > +    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice,
> iommu_mr);
> > +    VirtIOIOMMU *s = sdev->viommu;
> > +    VirtioIOMMUNotifierNode *node = NULL;
> > +    VirtioIOMMUNotifierNode *next_node = NULL;
> > +
> > +    if (old == IOMMU_NOTIFIER_NONE) {
> > +        trace_virtio_iommu_notify_flag_add(iommu_mr-
> >parent_obj.name);
> > +        node = g_malloc0(sizeof(*node));
> > +        node->iommu_dev = sdev;
> > +        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
> > +        return;
> > +    }
> > +
> > +    /* update notifier node with new flags */
> > +    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
> > +        if (node->iommu_dev == sdev) {
> > +            if (new == IOMMU_NOTIFIER_NONE) {
> > +                trace_virtio_iommu_notify_flag_del(iommu_mr-
> >parent_obj.name);
> > +                QLIST_REMOVE(node, next);
> > +                g_free(node);
> > +            }
> > +            return;
> > +        }
> > +    }
> > +}
> > +
> >  static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion
> *mr, hwaddr addr,
> >                                              IOMMUAccessFlags flag)  {
> > @@ -553,11 +676,49 @@ static gint int_cmp(gconstpointer a, gconstpointer
> b, gpointer user_data)
> >      return (ua > ub) - (ua < ub);
> >  }
> >
> > +static gboolean virtio_iommu_remap(gpointer key, gpointer value,
> > +gpointer data) {
> > +    viommu_mapping *mapping = (viommu_mapping *) value;
> > +    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
> > +
> > +    trace_virtio_iommu_remap(mapping->virt_addr, mapping->phys_addr,
> > +                             mapping->size);
> > +    /* unmap previous entry and map again */
> > +    virtio_iommu_notify_unmap(mr, mapping->virt_addr, 0,
> > + mapping->size);
> > +
> > +    virtio_iommu_notify_map(mr, mapping->virt_addr, mapping-
> >phys_addr,
> > +                            mapping->size);
> > +    return true;
> you need to return false here otherwise  g_tree_foreach will return on the
> first as mapping.

Yes,

Thanks
-Bharat
> 
> Thanks
> 
> Eric
> > +}
> > +
> > +static void virtio_iommu_replay(IOMMUMemoryRegion *mr,
> IOMMUNotifier
> > +*n) {
> > +    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
> > +    VirtIOIOMMU *s = sdev->viommu;
> > +    uint32_t sid;
> > +    viommu_dev *dev;
> > +
> > +    sid = virtio_iommu_get_sid(sdev);
> > +
> > +    qemu_mutex_lock(&s->mutex);
> > +
> > +    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(sid));
> > +    if (!dev) {
> > +        goto unlock;
> > +    }
> > +
> > +    g_tree_foreach(dev->as->mappings, virtio_iommu_remap, mr);
> > +
> > +unlock:
> > +    qemu_mutex_unlock(&s->mutex);
> > +}
> > +
> >  static void virtio_iommu_device_realize(DeviceState *dev, Error
> > **errp)  {
> >      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >      VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
> >
> > +    QLIST_INIT(&s->notifiers_list);
> >      virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
> >                  sizeof(struct virtio_iommu_config));
> >
> > @@ -644,6 +805,8 @@ static void
> virtio_iommu_memory_region_class_init(ObjectClass *klass,
> >      IOMMUMemoryRegionClass *imrc =
> IOMMU_MEMORY_REGION_CLASS(klass);
> >
> >      imrc->translate = virtio_iommu_translate;
> > +    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
> > +    imrc->replay = virtio_iommu_replay;
> >  }
> >
> >  static const TypeInfo virtio_iommu_info = { diff --git
> > a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
> > index f9c988f..7e04184 100644
> > --- a/include/hw/virtio/virtio-iommu.h
> > +++ b/include/hw/virtio/virtio-iommu.h
> > @@ -46,6 +46,11 @@ typedef struct IOMMUPciBus {
> >      IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically
> > alloc */  } IOMMUPciBus;
> >
> > +typedef struct VirtioIOMMUNotifierNode {
> > +    IOMMUDevice *iommu_dev;
> > +    QLIST_ENTRY(VirtioIOMMUNotifierNode) next; }
> > +VirtioIOMMUNotifierNode;
> > +
> >  typedef struct VirtIOIOMMU {
> >      VirtIODevice parent_obj;
> >      VirtQueue *vq;
> > @@ -56,6 +61,7 @@ typedef struct VirtIOIOMMU {
> >      GTree *address_spaces;
> >      QemuMutex mutex;
> >      GTree *devices;
> > +    QLIST_HEAD(, VirtioIOMMUNotifierNode) notifiers_list;
> >  } VirtIOIOMMU;
> >
> >  #endif
> >

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-09-22  7:45 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-21 10:48 [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration Bharat Bhushan
2017-08-21 10:48 ` [Qemu-devel] [PATCH v3 1/2] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Bharat Bhushan
2017-08-21 10:48 ` [Qemu-devel] [PATCH v3 2/2] virtio-iommu: vfio integration with virtio-iommu Bharat Bhushan
2017-09-18  7:47   ` Auger Eric
2017-09-18 13:09     ` Auger Eric
2017-09-22  7:45     ` Bharat Bhushan
2017-08-23 16:41 ` [Qemu-devel] [PATCH v3 0/2] virtio-iommu: VFIO integration Auger Eric
2017-08-24  5:57   ` Bharat Bhushan
2017-08-29 17:06 ` [Qemu-devel] [Qemu-arm] " Linu Cherian
2017-08-29 19:42   ` Auger Eric

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.