QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v9 0/9] virtio-iommu: VFIO integration
@ 2020-03-23  8:46 Bharat Bhushan
  2020-03-23  8:46 ` [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu Bharat Bhushan
                   ` (10 more replies)
  0 siblings, 11 replies; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Bharat Bhushan

This patch series integrates VFIO with virtio-iommu.
This is only applicable for PCI pass-through with virtio-iommu.

This series is available at:
https://github.com/bharat-bhushan-devel/qemu.git virtio-iommu-vfio-integration-v8

This is tested with assigning more than one pci devices to Virtual Machine.

v8-v9:
  - Have page size mask per endpint
  - Add PROBE interface, return page size mask
 
v7->v8:
  - Set page size mask as per host
    This fixes issue with 64K host/guest 
  - Device list from IOMMUDevice directly removed VirtioIOMMUNotifierNode
  - Add missing iep->viommu init on post-load

v6->v7:
  - corrected email-address

v5->v6:
  - Rebase to v16 version from Eric
  - Tested with upstream Linux
  - Added a patch from Eric/Myself on removing mmio-region error print in vfio

v4->v5:
 - Rebase to v9 version from Eric
 - PCIe device hotplug fix
 - Added Patch 1/5 from Eric previous series (Eric somehow dropped in
   last version.
 - Patch "Translate the MSI doorbell in kvm_arch_fixup_msi_route"
   already integrated with vsmmu3

v3->v4:
 - Rebase to v4 version from Eric
 - Fixes from Eric with DPDK in VM
 - Logical division in multiple patches

v2->v3:
 - This series is based on "[RFC v3 0/8] VIRTIO-IOMMU device"
   Which is based on top of v2.10-rc0 that
 - Fixed issue with two PCI devices
 - Addressed review comments

v1->v2:
  - Added trace events
  - removed vSMMU3 link in patch description

Bharat Bhushan (9):
  hw/vfio/common: Remove error print on mmio region translation by
    viommu
  memory: Add interface to set iommu page size mask
  vfio: set iommu page size as per host supported page size
  virtio-iommu: set supported page size mask
  virtio-iommu: Add iommu notifier for map/unmap
  virtio-iommu: Call iommu notifier for attach/detach
  virtio-iommu: add iommu replay
  virtio-iommu: Implement probe request
  virtio-iommu: add iommu notifier memory-region

 include/exec/memory.h                         |  20 +
 include/hw/virtio/virtio-iommu.h              |   3 +
 include/standard-headers/linux/virtio_iommu.h |   6 +
 hw/vfio/common.c                              |   5 +-
 hw/virtio/virtio-iommu.c                      | 352 +++++++++++++++++-
 memory.c                                      |  10 +
 hw/virtio/trace-events                        |   7 +
 7 files changed, 397 insertions(+), 6 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
@ 2020-03-23  8:46 ` Bharat Bhushan
  2020-03-23 23:08   ` Alex Williamson
  2020-03-23  8:46 ` [PATCH v9 2/9] memory: Add interface to set iommu page size mask Bharat Bhushan
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Eric Auger, Bharat Bhushan

On ARM, the MSI doorbell is translated by the virtual IOMMU.
As such address_space_translate() returns the MSI controller
MMIO region and we get an "iommu map to non memory area"
message. Let's remove this latter.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 hw/vfio/common.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 5ca11488d6..c586edf47a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
                                  &xlat, &len, writable,
                                  MEMTXATTRS_UNSPECIFIED);
     if (!memory_region_is_ram(mr)) {
-        error_report("iommu map to non memory area %"HWADDR_PRIx"",
-                     xlat);
         return false;
     }
 
-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v9 2/9] memory: Add interface to set iommu page size mask
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
  2020-03-23  8:46 ` [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu Bharat Bhushan
@ 2020-03-23  8:46 ` Bharat Bhushan
  2020-03-26 16:06   ` Auger Eric
  2020-03-23  8:46 ` [PATCH v9 3/9] vfio: set iommu page size as per host supported page size Bharat Bhushan
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Bharat Bhushan

Allow to set page size mask to be supported by iommu.
This is required to expose page size mask compatible with
host with virtio-iommu.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 include/exec/memory.h | 20 ++++++++++++++++++++
 memory.c              | 10 ++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index e85b7de99a..063c424854 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -355,6 +355,16 @@ typedef struct IOMMUMemoryRegionClass {
      * @iommu: the IOMMUMemoryRegion
      */
     int (*num_indexes)(IOMMUMemoryRegion *iommu);
+
+    /*
+     * Set supported IOMMU page size
+     *
+     * Optional method: if this is supported then set page size that
+     * can be supported by IOMMU. This is called to set supported page
+     * size as per host Linux.
+     */
+     void (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu,
+                                      uint64_t page_size_mask);
 } IOMMUMemoryRegionClass;
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
@@ -1363,6 +1373,16 @@ int memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr,
  */
 int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr);
 
+/**
+ * memory_region_iommu_set_page_size_mask: set the supported pages
+ * size by iommu.
+ *
+ * @iommu_mr: the memory region
+ * @page_size_mask: supported page size mask
+ */
+void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
+                                            uint64_t page_size_mask);
+
 /**
  * memory_region_name: get a memory region's name
  *
diff --git a/memory.c b/memory.c
index aeaa8dcc9e..14c8783084 100644
--- a/memory.c
+++ b/memory.c
@@ -1833,6 +1833,16 @@ static int memory_region_update_iommu_notify_flags(IOMMUMemoryRegion *iommu_mr,
     return ret;
 }
 
+void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
+                                            uint64_t page_size_mask)
+{
+    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
+
+    if (imrc->iommu_set_page_size_mask) {
+        imrc->iommu_set_page_size_mask(iommu_mr, page_size_mask);
+    }
+}
+
 int memory_region_register_iommu_notifier(MemoryRegion *mr,
                                           IOMMUNotifier *n, Error **errp)
 {
-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v9 3/9] vfio: set iommu page size as per host supported page size
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
  2020-03-23  8:46 ` [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu Bharat Bhushan
  2020-03-23  8:46 ` [PATCH v9 2/9] memory: Add interface to set iommu page size mask Bharat Bhushan
@ 2020-03-23  8:46 ` Bharat Bhushan
  2020-03-23  8:46 ` [PATCH v9 4/9] virtio-iommu: set supported page size mask Bharat Bhushan
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Bharat Bhushan

Set iommu supported page size mask same as host Linux
supported page size mask.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 hw/vfio/common.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index c586edf47a..6ea50d696f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -635,6 +635,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
                             int128_get64(llend),
                             iommu_idx);
 
+        memory_region_iommu_set_page_size_mask(giommu->iommu,
+                                               container->pgsizes);
+
         ret = memory_region_register_iommu_notifier(section->mr, &giommu->n,
                                                     &err);
         if (ret) {
-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v9 4/9] virtio-iommu: set supported page size mask
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (2 preceding siblings ...)
  2020-03-23  8:46 ` [PATCH v9 3/9] vfio: set iommu page size as per host supported page size Bharat Bhushan
@ 2020-03-23  8:46 ` Bharat Bhushan
  2020-03-26 15:51   ` Auger Eric
  2020-03-23  8:46 ` [PATCH v9 5/9] virtio-iommu: Add iommu notifier for map/unmap Bharat Bhushan
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Bharat Bhushan

Add optional interface to set page size mask.
Currently this is set global configuration and not
per endpoint.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 include/hw/virtio/virtio-iommu.h | 1 +
 hw/virtio/virtio-iommu.c         | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 6f67f1020a..4efa09610a 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -35,6 +35,7 @@ typedef struct IOMMUDevice {
     void         *viommu;
     PCIBus       *bus;
     int           devfn;
+    uint64_t      page_size_mask;
     IOMMUMemoryRegion  iommu_mr;
     AddressSpace  as;
 } IOMMUDevice;
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 4cee8083bc..a28818202c 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -650,6 +650,14 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     return (ua > ub) - (ua < ub);
 }
 
+static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
+                                            uint64_t page_size_mask)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+
+    sdev->page_size_mask = page_size_mask;
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -865,6 +873,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = virtio_iommu_translate;
+    imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v9 5/9] virtio-iommu: Add iommu notifier for map/unmap
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (3 preceding siblings ...)
  2020-03-23  8:46 ` [PATCH v9 4/9] virtio-iommu: set supported page size mask Bharat Bhushan
@ 2020-03-23  8:46 ` Bharat Bhushan
  2020-03-23  8:46 ` [PATCH v9 6/9] virtio-iommu: Call iommu notifier for attach/detach Bharat Bhushan
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Eric Auger, Bharat Bhushan

This patch extends VIRTIO_IOMMU_T_MAP/UNMAP request to
notify registered iommu-notifier. Which will call vfio
notifier to map/unmap region in iommu.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/hw/virtio/virtio-iommu.h |  2 +
 hw/virtio/virtio-iommu.c         | 67 +++++++++++++++++++++++++++++++-
 hw/virtio/trace-events           |  2 +
 3 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 4efa09610a..e53586df70 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -38,6 +38,7 @@ typedef struct IOMMUDevice {
     uint64_t      page_size_mask;
     IOMMUMemoryRegion  iommu_mr;
     AddressSpace  as;
+    QLIST_ENTRY(IOMMUDevice) next;
 } IOMMUDevice;
 
 typedef struct IOMMUPciBus {
@@ -57,6 +58,7 @@ typedef struct VirtIOIOMMU {
     GTree *domains;
     QemuMutex mutex;
     GTree *endpoints;
+    QLIST_HEAD(, IOMMUDevice) notifiers_list;
 } VirtIOIOMMU;
 
 #endif
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index a28818202c..bd464d4fb3 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -123,6 +123,38 @@ static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
     }
 }
 
+static void virtio_iommu_notify_map(IOMMUMemoryRegion *mr, hwaddr iova,
+                                    hwaddr paddr, hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_map(mr->parent_obj.name, iova, paddr, size);
+    entry.perm = IOMMU_RW;
+    entry.translated_addr = paddr;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
+static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
+                                      hwaddr size)
+{
+    IOMMUTLBEntry entry;
+
+    entry.target_as = &address_space_memory;
+    entry.addr_mask = size - 1;
+
+    entry.iova = iova;
+    trace_virtio_iommu_notify_unmap(mr->parent_obj.name, iova, size);
+    entry.perm = IOMMU_NONE;
+    entry.translated_addr = 0;
+
+    memory_region_notify_iommu(mr, 0, entry);
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
     if (!ep->domain) {
@@ -307,9 +339,12 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
     uint64_t virt_start = le64_to_cpu(req->virt_start);
     uint64_t virt_end = le64_to_cpu(req->virt_end);
     uint32_t flags = le32_to_cpu(req->flags);
+    hwaddr size = virt_end - virt_start + 1;
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUInterval *interval;
     VirtIOIOMMUMapping *mapping;
+    VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;
 
     if (flags & ~VIRTIO_IOMMU_MAP_F_MASK) {
         return VIRTIO_IOMMU_S_INVAL;
@@ -339,9 +374,38 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
 
     g_tree_insert(domain->mappings, interval, mapping);
 
+    /* All devices in an address-space share mapping */
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == sdev->devfn) {
+                virtio_iommu_notify_map(&sdev->iommu_mr,
+                                        virt_start, phys_start, size);
+            }
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
 
+static void virtio_iommu_remove_mapping(VirtIOIOMMU *s,
+                                        VirtIOIOMMUDomain *domain,
+                                        VirtIOIOMMUInterval *interval)
+{
+    VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;
+
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        QLIST_FOREACH(ep, &domain->endpoint_list, next) {
+            if (ep->id == sdev->devfn) {
+                virtio_iommu_notify_unmap(&sdev->iommu_mr,
+                                          interval->low,
+                                          interval->high - interval->low + 1);
+            }
+        }
+    }
+    g_tree_remove(domain->mappings, (gpointer)(interval));
+}
+
 static int virtio_iommu_unmap(VirtIOIOMMU *s,
                               struct virtio_iommu_req_unmap *req)
 {
@@ -368,7 +432,7 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
         uint64_t current_high = iter_key->high;
 
         if (interval.low <= current_low && interval.high >= current_high) {
-            g_tree_remove(domain->mappings, iter_key);
+            virtio_iommu_remove_mapping(s, domain, iter_key);
             trace_virtio_iommu_unmap_done(domain_id, current_low, current_high);
         } else {
             ret = VIRTIO_IOMMU_S_RANGE;
@@ -663,6 +727,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
 
+    QLIST_INIT(&s->notifiers_list);
     virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
                 sizeof(struct virtio_iommu_config));
 
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index e83500bee9..d94a1cd8a3 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -73,3 +73,5 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
 virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
+virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
+virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v9 6/9] virtio-iommu: Call iommu notifier for attach/detach
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (4 preceding siblings ...)
  2020-03-23  8:46 ` [PATCH v9 5/9] virtio-iommu: Add iommu notifier for map/unmap Bharat Bhushan
@ 2020-03-23  8:46 ` Bharat Bhushan
  2020-03-23  8:46 ` [PATCH v9 7/9] virtio-iommu: add iommu replay Bharat Bhushan
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Bharat Bhushan

iommu-notifier are called when a device is attached
or detached to as address-space.
This is needed for VFIO.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 hw/virtio/virtio-iommu.c | 49 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index bd464d4fb3..88849aa7b9 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -49,6 +49,7 @@ typedef struct VirtIOIOMMUEndpoint {
     uint32_t id;
     VirtIOIOMMUDomain *domain;
     QLIST_ENTRY(VirtIOIOMMUEndpoint) next;
+    VirtIOIOMMU *viommu;
 } VirtIOIOMMUEndpoint;
 
 typedef struct VirtIOIOMMUInterval {
@@ -155,11 +156,48 @@ static void virtio_iommu_notify_unmap(IOMMUMemoryRegion *mr, hwaddr iova,
     memory_region_notify_iommu(mr, 0, entry);
 }
 
+static gboolean virtio_iommu_mapping_unmap(gpointer key, gpointer value,
+                                           gpointer data)
+{
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    return false;
+}
+
+static gboolean virtio_iommu_mapping_map(gpointer key, gpointer value,
+                                         gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+
+    return false;
+}
+
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
+    VirtIOIOMMU *s = ep->viommu;
+    VirtIOIOMMUDomain *domain = ep->domain;
+    IOMMUDevice *sdev;
+
     if (!ep->domain) {
         return;
     }
+
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        if (ep->id == sdev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_unmap,
+                           &sdev->iommu_mr);
+        }
+    }
+
     QLIST_REMOVE(ep, next);
     ep->domain = NULL;
 }
@@ -178,6 +216,7 @@ static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
     }
     ep = g_malloc0(sizeof(*ep));
     ep->id = ep_id;
+    ep->viommu = s;
     trace_virtio_iommu_get_endpoint(ep_id);
     g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
     return ep;
@@ -274,6 +313,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     uint32_t ep_id = le32_to_cpu(req->endpoint);
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;
 
     trace_virtio_iommu_attach(domain_id, ep_id);
 
@@ -299,6 +339,14 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
 
     ep->domain = domain;
 
+    /* Replay domain mappings on the associated memory region */
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        if (ep_id == sdev->devfn) {
+            g_tree_foreach(domain->mappings, virtio_iommu_mapping_map,
+                           &sdev->iommu_mr);
+        }
+    }
+
     return VIRTIO_IOMMU_S_OK;
 }
 
@@ -872,6 +920,7 @@ static gboolean reconstruct_endpoints(gpointer key, gpointer value,
 
     QLIST_FOREACH(iter, &d->endpoint_list, next) {
         iter->domain = d;
+        iter->viommu = s;
         g_tree_insert(s->endpoints, GUINT_TO_POINTER(iter->id), iter);
     }
     return false; /* continue the domain traversal */
-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v9 7/9] virtio-iommu: add iommu replay
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (5 preceding siblings ...)
  2020-03-23  8:46 ` [PATCH v9 6/9] virtio-iommu: Call iommu notifier for attach/detach Bharat Bhushan
@ 2020-03-23  8:46 ` Bharat Bhushan
  2020-03-23  8:46 ` [PATCH v9 8/9] virtio-iommu: Implement probe request Bharat Bhushan
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Bharat Bhushan

Default replay does not work with virtio-iommu,
so this patch provide virtio-iommu replay functionality.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 hw/virtio/virtio-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++++
 hw/virtio/trace-events   |  1 +
 2 files changed, 45 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 88849aa7b9..747e3cf1da 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -770,6 +770,49 @@ static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
     sdev->page_size_mask = page_size_mask;
 }
 
+static gboolean virtio_iommu_remap(gpointer key, gpointer value, gpointer data)
+{
+    VirtIOIOMMUMapping *mapping = (VirtIOIOMMUMapping *) value;
+    VirtIOIOMMUInterval *interval = (VirtIOIOMMUInterval *) key;
+    IOMMUMemoryRegion *mr = (IOMMUMemoryRegion *) data;
+
+    trace_virtio_iommu_remap(interval->low, mapping->phys_addr,
+                             interval->high - interval->low + 1);
+    /* unmap previous entry and map again */
+    virtio_iommu_notify_unmap(mr, interval->low,
+                              interval->high - interval->low + 1);
+
+    virtio_iommu_notify_map(mr, interval->low, mapping->phys_addr,
+                            interval->high - interval->low + 1);
+    return false;
+}
+
+static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+    uint32_t sid;
+    VirtIOIOMMUEndpoint *ep;
+
+    sid = virtio_iommu_get_bdf(sdev);
+
+    qemu_mutex_lock(&s->mutex);
+
+    if (!s->endpoints) {
+        goto unlock;
+    }
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+    if (!ep || !ep->domain) {
+        goto unlock;
+    }
+
+    g_tree_foreach(ep->domain->mappings, virtio_iommu_remap, mr);
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -988,6 +1031,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
 
     imrc->translate = virtio_iommu_translate;
     imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
+    imrc->replay = virtio_iommu_replay;
 }
 
 static const TypeInfo virtio_iommu_info = {
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index d94a1cd8a3..8bae651191 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -75,3 +75,4 @@ virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid)
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
+virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v9 8/9] virtio-iommu: Implement probe request
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (6 preceding siblings ...)
  2020-03-23  8:46 ` [PATCH v9 7/9] virtio-iommu: add iommu replay Bharat Bhushan
@ 2020-03-23  8:46 ` Bharat Bhushan
  2020-03-26 15:48   ` Auger Eric
  2020-04-23 16:09   ` Jean-Philippe Brucker
  2020-03-23  8:46 ` [PATCH v9 9/9] virtio-iommu: add iommu notifier memory-region Bharat Bhushan
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Eric Auger, Bharat Bhushan

This patch implements the PROBE request. Currently supported
page size mask per endpoint is returned. Also append a NONE
property in the end.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/standard-headers/linux/virtio_iommu.h |   6 +
 hw/virtio/virtio-iommu.c                      | 161 +++++++++++++++++-
 hw/virtio/trace-events                        |   2 +
 3 files changed, 166 insertions(+), 3 deletions(-)

diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
index b9443b83a1..8a0d47b907 100644
--- a/include/standard-headers/linux/virtio_iommu.h
+++ b/include/standard-headers/linux/virtio_iommu.h
@@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap {
 
 #define VIRTIO_IOMMU_PROBE_T_NONE		0
 #define VIRTIO_IOMMU_PROBE_T_RESV_MEM		1
+#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK	2
 
 #define VIRTIO_IOMMU_PROBE_T_MASK		0xfff
 
@@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem {
 	uint64_t					end;
 };
 
+struct virtio_iommu_probe_pgsize_mask {
+	struct virtio_iommu_probe_property      head;
+	uint64_t				pgsize_bitmap;
+};
+
 struct virtio_iommu_req_probe {
 	struct virtio_iommu_req_head		head;
 	uint32_t					endpoint;
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 747e3cf1da..63fbacdcdc 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -38,6 +38,10 @@
 
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
+#define VIOMMU_PROBE_SIZE 512
+
+#define SUPPORTED_PROBE_PROPERTIES (\
+    1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK)
 
 typedef struct VirtIOIOMMUDomain {
     uint32_t id;
@@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping {
     uint32_t flags;
 } VirtIOIOMMUMapping;
 
+typedef struct VirtIOIOMMUPropBuffer {
+    VirtIOIOMMUEndpoint *endpoint;
+    size_t filled;
+    uint8_t *start;
+    bool error;
+} VirtIOIOMMUPropBuffer;
+
 static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
 {
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
@@ -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
     return ret;
 }
 
+static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer *bufstate)
+{
+    struct virtio_iommu_probe_property *prop;
+
+    prop = (struct virtio_iommu_probe_property *)
+                (bufstate->start + bufstate->filled);
+    prop->type = 0;
+    prop->length = 0;
+    bufstate->filled += sizeof(*prop);
+    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
+    return 0;
+}
+
+static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer *bufstate)
+{
+    struct virtio_iommu_probe_pgsize_mask *page_size_mask;
+    size_t prop_size = sizeof(*page_size_mask);
+    VirtIOIOMMUEndpoint *ep = bufstate->endpoint;
+    VirtIOIOMMU *s = ep->viommu;
+    IOMMUDevice *sdev;
+
+    if (bufstate->filled + prop_size >= VIOMMU_PROBE_SIZE) {
+        bufstate->error = true;
+        /* get the traversal stopped by returning true */
+        return true;
+    }
+
+    page_size_mask = (struct virtio_iommu_probe_pgsize_mask *)
+                     (bufstate->start + bufstate->filled);
+
+    page_size_mask->head.type = VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK;
+    page_size_mask->head.length = prop_size;
+    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
+        if (ep->id == sdev->devfn) {
+            page_size_mask->pgsize_bitmap = sdev->page_size_mask;
+	}
+    }
+    bufstate->filled += sizeof(*page_size_mask);
+    trace_virtio_iommu_fill_pgsize_mask_property(bufstate->endpoint->id,
+                                                 page_size_mask->pgsize_bitmap,
+                                                 bufstate->filled);
+    return false;
+}
+
+/* Fill the properties[] buffer with properties of type @type */
+static int virtio_iommu_fill_property(int type,
+                                      VirtIOIOMMUPropBuffer *bufstate)
+{
+    int ret = -ENOSPC;
+
+    if (bufstate->filled + sizeof(struct virtio_iommu_probe_property)
+            >= VIOMMU_PROBE_SIZE) {
+        /* no space left for the header */
+        bufstate->error = true;
+        goto out;
+    }
+
+    switch (type) {
+    case VIRTIO_IOMMU_PROBE_T_NONE:
+        ret = virtio_iommu_fill_none_prop(bufstate);
+        break;
+    case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
+    {
+        ret = virtio_iommu_fill_page_size_mask(bufstate);
+	break;
+    }
+    default:
+        ret = -ENOENT;
+        break;
+    }
+out:
+    if (ret) {
+        error_report("%s property of type=%d could not be filled (%d),"
+                     " remaining size = 0x%lx",
+                     __func__, type, ret, bufstate->filled);
+    }
+    return ret;
+}
+
+/**
+ * virtio_iommu_probe - Fill the probe request buffer with all
+ * the properties the device is able to return and add a NONE
+ * property at the end. @buf points to properties[].
+ */
+static int virtio_iommu_probe(VirtIOIOMMU *s,
+                              struct virtio_iommu_req_probe *req,
+                              uint8_t *buf)
+{
+    uint32_t ep_id = le32_to_cpu(req->endpoint);
+    VirtIOIOMMUEndpoint *ep = virtio_iommu_get_endpoint(s, ep_id);
+    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
+    VirtIOIOMMUPropBuffer bufstate = {.start = buf, .filled = 0,
+                                       .error = false, .endpoint = ep};
+
+    while ((type = ctz32(prop_types)) != 32) {
+        if (virtio_iommu_fill_property(type, &bufstate)) {
+            goto failure;
+        }
+        prop_types &= ~(1 << type);
+    }
+    if (virtio_iommu_fill_property(VIRTIO_IOMMU_PROBE_T_NONE, &bufstate)) {
+        goto failure;
+    }
+    return VIRTIO_IOMMU_S_OK;
+failure:
+    return VIRTIO_IOMMU_S_INVAL;
+}
+
 static int virtio_iommu_iov_to_req(struct iovec *iov,
                                    unsigned int iov_cnt,
                                    void *req, size_t req_sz)
@@ -519,6 +638,17 @@ virtio_iommu_handle_req(detach)
 virtio_iommu_handle_req(map)
 virtio_iommu_handle_req(unmap)
 
+static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
+                                     struct iovec *iov,
+                                     unsigned int iov_cnt,
+                                     uint8_t *buf)
+{
+    struct virtio_iommu_req_probe req;
+    int ret = virtio_iommu_iov_to_req(iov, iov_cnt, &req, sizeof(req));
+
+    return ret ? ret : virtio_iommu_probe(s, &req, buf);
+}
+
 static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
@@ -564,17 +694,33 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
         case VIRTIO_IOMMU_T_UNMAP:
             tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
             break;
+        case VIRTIO_IOMMU_T_PROBE:
+        {
+            struct virtio_iommu_req_tail *ptail;
+            uint8_t *buf = g_malloc0(s->config.probe_size + sizeof(tail));
+
+            ptail = (struct virtio_iommu_req_tail *)
+                        (buf + s->config.probe_size);
+            ptail->status = virtio_iommu_handle_probe(s, iov, iov_cnt, buf);
+
+            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
+                              buf, s->config.probe_size + sizeof(tail));
+            g_free(buf);
+            assert(sz == s->config.probe_size + sizeof(tail));
+            goto push;
+        }
         default:
             tail.status = VIRTIO_IOMMU_S_UNSUPP;
         }
-        qemu_mutex_unlock(&s->mutex);
 
 out:
         sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
                           &tail, sizeof(tail));
         assert(sz == sizeof(tail));
 
-        virtqueue_push(vq, elem, sizeof(tail));
+push:
+        qemu_mutex_unlock(&s->mutex);
+        virtqueue_push(vq, elem, sz);
         virtio_notify(vdev, vq);
         g_free(elem);
     }
@@ -634,16 +780,23 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
     VirtIOIOMMUEndpoint *ep;
     uint32_t sid, flags;
     bool bypass_allowed;
+    hwaddr addr_mask;
     bool found;
 
     interval.low = addr;
     interval.high = addr + 1;
 
+    if (sdev->page_size_mask) {
+        addr_mask = (1 << ctz32(sdev->page_size_mask)) - 1;
+    } else {
+        addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1;
+    }
+
     IOMMUTLBEntry entry = {
         .target_as = &address_space_memory,
         .iova = addr,
         .translated_addr = addr,
-        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
+        .addr_mask = addr_mask,
         .perm = IOMMU_NONE,
     };
 
@@ -831,6 +984,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     s->config.page_size_mask = TARGET_PAGE_MASK;
     s->config.input_range.end = -1UL;
     s->config.domain_range.end = 32;
+    s->config.probe_size = VIOMMU_PROBE_SIZE;
 
     virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
     virtio_add_feature(&s->features, VIRTIO_RING_F_INDIRECT_DESC);
@@ -840,6 +994,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MAP_UNMAP);
     virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS);
     virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
+    virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
 
     qemu_mutex_init(&s->mutex);
 
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 8bae651191..b0a6e4bda3 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uin
 virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
 virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
 virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
+virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
+virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask, size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v9 9/9] virtio-iommu: add iommu notifier memory-region
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (7 preceding siblings ...)
  2020-03-23  8:46 ` [PATCH v9 8/9] virtio-iommu: Implement probe request Bharat Bhushan
@ 2020-03-23  8:46 ` Bharat Bhushan
  2020-03-23  9:52 ` [PATCH v9 0/9] virtio-iommu: VFIO integration no-reply
  2020-03-23  9:59 ` no-reply
  10 siblings, 0 replies; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-23  8:46 UTC (permalink / raw)
  To: peter.maydell, peterx, eric.auger.pro, alex.williamson,
	kevin.tian, mst, tnowicki, drjones, linuc.decode, qemu-devel,
	qemu-arm, bharatb.linux, jean-philippe, yang.zhong
  Cc: Bharat Bhushan

Finally add notify_flag_changed() to for memory-region
access flag iommu flag change notifier

Finally add the memory notifier

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
---
 hw/virtio/virtio-iommu.c | 22 ++++++++++++++++++++++
 hw/virtio/trace-events   |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 63fbacdcdc..413792b626 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -966,6 +966,27 @@ unlock:
     qemu_mutex_unlock(&s->mutex);
 }
 
+static int virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
+                                             IOMMUNotifierFlag old,
+                                             IOMMUNotifierFlag new,
+                                             Error **errp)
+{
+    IOMMUDevice *sdev = container_of(iommu_mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_add(iommu_mr->parent_obj.name);
+        QLIST_INSERT_HEAD(&s->notifiers_list, sdev, next);
+        return 0;
+    }
+
+    if (new == IOMMU_NOTIFIER_NONE) {
+        trace_virtio_iommu_notify_flag_del(iommu_mr->parent_obj.name);
+        QLIST_REMOVE(sdev, next);
+    }
+    return 0;
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -1187,6 +1208,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->translate = virtio_iommu_translate;
     imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
     imrc->replay = virtio_iommu_replay;
+    imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
 }
 
 static const TypeInfo virtio_iommu_info = {
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index b0a6e4bda3..6b7495ac3d 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -78,3 +78,5 @@ virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "m
 virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
 virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
 virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask, size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
+virtio_iommu_notify_flag_add(const char *iommu) "Add virtio-iommu notifier node for memory region %s"
+virtio_iommu_notify_flag_del(const char *iommu) "Del virtio-iommu notifier node for memory region %s"
-- 
2.17.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 0/9] virtio-iommu: VFIO integration
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (8 preceding siblings ...)
  2020-03-23  8:46 ` [PATCH v9 9/9] virtio-iommu: add iommu notifier memory-region Bharat Bhushan
@ 2020-03-23  9:52 ` no-reply
  2020-03-23  9:59 ` no-reply
  10 siblings, 0 replies; 37+ messages in thread
From: no-reply @ 2020-03-23  9:52 UTC (permalink / raw)
  To: bbhushan2
  Cc: yang.zhong, peter.maydell, kevin.tian, tnowicki, mst, drjones,
	peterx, qemu-devel, alex.williamson, qemu-arm, bharatb.linux,
	jean-philippe, linuc.decode, bbhushan2, eric.auger.pro

Patchew URL: https://patchew.org/QEMU/20200323084617.1782-1-bbhushan2@marvell.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH v9 0/9] virtio-iommu: VFIO integration
Message-id: 20200323084617.1782-1-bbhushan2@marvell.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
   9b26a61..29e0855  master     -> master
 - [tag update]      patchew/20200320160622.8040-1-richard.henderson@linaro.org -> patchew/20200320160622.8040-1-richard.henderson@linaro.org
Switched to a new branch 'test'
8eeb60c virtio-iommu: add iommu notifier memory-region
176e083 virtio-iommu: Implement probe request
221f4ba virtio-iommu: add iommu replay
30211fd virtio-iommu: Call iommu notifier for attach/detach
84f316d virtio-iommu: Add iommu notifier for map/unmap
8cc559f virtio-iommu: set supported page size mask
d4062a9 vfio: set iommu page size as per host supported page size
cc9310f memory: Add interface to set iommu page size mask
2b43555 hw/vfio/common: Remove error print on mmio region translation by viommu

=== OUTPUT BEGIN ===
1/9 Checking commit 2b43555a2f62 (hw/vfio/common: Remove error print on mmio region translation by viommu)
2/9 Checking commit cc9310f32d06 (memory: Add interface to set iommu page size mask)
3/9 Checking commit d4062a953cef (vfio: set iommu page size as per host supported page size)
4/9 Checking commit 8cc559f025ae (virtio-iommu: set supported page size mask)
5/9 Checking commit 84f316d68429 (virtio-iommu: Add iommu notifier for map/unmap)
6/9 Checking commit 30211fdb36cd (virtio-iommu: Call iommu notifier for attach/detach)
7/9 Checking commit 221f4ba0d125 (virtio-iommu: add iommu replay)
8/9 Checking commit 176e0834bcee (virtio-iommu: Implement probe request)
ERROR: code indent should never use tabs
#93: FILE: hw/virtio/virtio-iommu.c:539:
+^I}$

ERROR: code indent should never use tabs
#122: FILE: hw/virtio/virtio-iommu.c:568:
+^Ibreak;$

total: 2 errors, 0 warnings, 250 lines checked

Patch 8/9 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

9/9 Checking commit 8eeb60c12ace (virtio-iommu: add iommu notifier memory-region)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200323084617.1782-1-bbhushan2@marvell.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 0/9] virtio-iommu: VFIO integration
  2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
                   ` (9 preceding siblings ...)
  2020-03-23  9:52 ` [PATCH v9 0/9] virtio-iommu: VFIO integration no-reply
@ 2020-03-23  9:59 ` no-reply
  10 siblings, 0 replies; 37+ messages in thread
From: no-reply @ 2020-03-23  9:59 UTC (permalink / raw)
  To: bbhushan2
  Cc: yang.zhong, peter.maydell, kevin.tian, tnowicki, mst, drjones,
	peterx, qemu-devel, alex.williamson, qemu-arm, bharatb.linux,
	jean-philippe, linuc.decode, bbhushan2, eric.auger.pro

Patchew URL: https://patchew.org/QEMU/20200323084617.1782-1-bbhushan2@marvell.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      x86_64-softmmu/hw/virtio/virtio-blk-pci.o
  CC      x86_64-softmmu/hw/virtio/virtio-net-pci.o
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c: In function 'virtio_iommu_fill_property':
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c:576:22: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'size_t' {aka 'long long unsigned int'} [-Werror=format=]
         error_report("%s property of type=%d could not be filled (%d),"
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c:578:43:
---
                                           ~~^
                                           %llx
cc1: all warnings being treated as errors
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: hw/virtio/virtio-iommu.o] Error 1
make[1]: *** Waiting for unfinished jobs....
  CC      aarch64-softmmu/hw/scsi/virtio-scsi-dataplane.o
  CC      aarch64-softmmu/hw/semihosting/config.o
---
  CC      aarch64-softmmu/hw/virtio/virtio-balloon-pci.o
  CC      aarch64-softmmu/hw/virtio/virtio-iommu-pci.o
  CC      aarch64-softmmu/hw/virtio/virtio-scsi-pci.o
make: *** [Makefile:527: x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs....
  CC      aarch64-softmmu/hw/virtio/virtio-blk-pci.o
  CC      aarch64-softmmu/hw/virtio/virtio-net-pci.o
---
  CC      aarch64-softmmu/hw/arm/virt.o
  CC      aarch64-softmmu/hw/arm/virt-acpi-build.o
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c: In function 'virtio_iommu_fill_property':
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c:576:22: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'size_t' {aka 'long long unsigned int'} [-Werror=format=]
         error_report("%s property of type=%d could not be filled (%d),"
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/qemu-test/src/hw/virtio/virtio-iommu.c:578:43:
---
                                           ~~^
                                           %llx
cc1: all warnings being treated as errors
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: hw/virtio/virtio-iommu.o] Error 1
make[1]: *** Waiting for unfinished jobs....
  CC      aarch64-softmmu/hw/arm/digic_boards.o
make: *** [Makefile:527: aarch64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=480fcb56b8454a80ae140ce373e940ee', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-rkry8xh9/src/docker-src.2020-03-23-05.55.47.1669:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=480fcb56b8454a80ae140ce373e940ee
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-rkry8xh9/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    3m47.039s
user    0m7.589s


The full log is available at
http://patchew.org/logs/20200323084617.1782-1-bbhushan2@marvell.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-03-23  8:46 ` [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu Bharat Bhushan
@ 2020-03-23 23:08   ` Alex Williamson
  2020-03-26 17:35     ` Auger Eric
  0 siblings, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2020-03-23 23:08 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: yang.zhong, peter.maydell, kevin.tian, tnowicki, mst, drjones,
	peterx, qemu-devel, Eric Auger, bharatb.linux, qemu-arm,
	jean-philippe, linuc.decode, David Gibson, eric.auger.pro

[Cc +dwg who originated this warning]

On Mon, 23 Mar 2020 14:16:09 +0530
Bharat Bhushan <bbhushan2@marvell.com> wrote:

> On ARM, the MSI doorbell is translated by the virtual IOMMU.
> As such address_space_translate() returns the MSI controller
> MMIO region and we get an "iommu map to non memory area"
> message. Let's remove this latter.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> ---
>  hw/vfio/common.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 5ca11488d6..c586edf47a 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
>                                   &xlat, &len, writable,
>                                   MEMTXATTRS_UNSPECIFIED);
>      if (!memory_region_is_ram(mr)) {
> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> -                     xlat);
>          return false;
>      }
>  

I'm a bit confused here, I think we need more justification beyond "we
hit this warning and we don't want to because it's ok in this one
special case, therefore remove it".  I assume the special case is that
the device MSI address is managed via the SET_IRQS ioctl and therefore
we won't actually get DMAs to this range.  But I imagine the case that
was in mind when adding this warning was general peer-to-peer between
and assigned and emulated device.  Maybe there's an argument to be made
that such a p2p mapping might also be used in a non-vIOMMU case.  We
skip creating those mappings and drivers continue to work, maybe
because nobody attempts to do p2p DMA with the types of devices we
emulate, maybe because p2p DMA is not absolutely reliable on bare metal
and drivers test it before using it.  Anyway, I need a better argument
why this is ok.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
  2020-03-23  8:46 ` [PATCH v9 8/9] virtio-iommu: Implement probe request Bharat Bhushan
@ 2020-03-26 15:48   ` Auger Eric
  2020-03-27  5:40     ` [EXT] " Bharat Bhushan
  2020-04-23 16:09   ` Jean-Philippe Brucker
  1 sibling, 1 reply; 37+ messages in thread
From: Auger Eric @ 2020-03-26 15:48 UTC (permalink / raw)
  To: Bharat Bhushan, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, tnowicki, drjones,
	linuc.decode, qemu-devel, qemu-arm, bharatb.linux, jean-philippe,
	yang.zhong

Hi Bharat

On 3/23/20 9:46 AM, Bharat Bhushan wrote:
> This patch implements the PROBE request. Currently supported
> page size mask per endpoint is returned. Also append a NONE
> property in the end.
> 
> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  include/standard-headers/linux/virtio_iommu.h |   6 +
Changes to virtio_iommu.h should be in a separate patch
you should use ./scripts/update-linux-headers.sh
See for instance:
ddda37483d  linux-headers: update
until the uapi updates are not upstream you can link to your kernel
branch and mention this is a temporary linux header update or partial if
you just want to pick up the iommu.h changes.

>  hw/virtio/virtio-iommu.c                      | 161 +++++++++++++++++-
>  hw/virtio/trace-events                        |   2 +
>  3 files changed, 166 insertions(+), 3 deletions(-)
> 
> diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
> index b9443b83a1..8a0d47b907 100644
> --- a/include/standard-headers/linux/virtio_iommu.h
> +++ b/include/standard-headers/linux/virtio_iommu.h
> @@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap {
>  
>  #define VIRTIO_IOMMU_PROBE_T_NONE		0
>  #define VIRTIO_IOMMU_PROBE_T_RESV_MEM		1
> +#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK	2
>  
>  #define VIRTIO_IOMMU_PROBE_T_MASK		0xfff
>  
> @@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem {
>  	uint64_t					end;
>  };
>  
> +struct virtio_iommu_probe_pgsize_mask {
> +	struct virtio_iommu_probe_property      head;
> +	uint64_t				pgsize_bitmap;
> +};
> +
>  struct virtio_iommu_req_probe {
>  	struct virtio_iommu_req_head		head;
>  	uint32_t					endpoint;
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 747e3cf1da..63fbacdcdc 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -38,6 +38,10 @@
>  
>  /* Max size */
>  #define VIOMMU_DEFAULT_QUEUE_SIZE 256
> +#define VIOMMU_PROBE_SIZE 512
> +
> +#define SUPPORTED_PROBE_PROPERTIES (\
> +    1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK)
>  
>  typedef struct VirtIOIOMMUDomain {
>      uint32_t id;
> @@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping {
>      uint32_t flags;
>  } VirtIOIOMMUMapping;
>  
> +typedef struct VirtIOIOMMUPropBuffer {
> +    VirtIOIOMMUEndpoint *endpoint;
> +    size_t filled;
> +    uint8_t *start;
> +    bool error;
> +} VirtIOIOMMUPropBuffer;
> +
>  static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
>  {
>      return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
> @@ -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>      return ret;
>  }
>  
> +static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer *bufstate)
> +{
> +    struct virtio_iommu_probe_property *prop;
> +
> +    prop = (struct virtio_iommu_probe_property *)
> +                (bufstate->start + bufstate->filled);
> +    prop->type = 0;
> +    prop->length = 0;
> +    bufstate->filled += sizeof(*prop);
> +    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
> +    return 0;
> +}
> +
> +static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer *bufstate)
> +{
> +    struct virtio_iommu_probe_pgsize_mask *page_size_mask;
> +    size_t prop_size = sizeof(*page_size_mask);
> +    VirtIOIOMMUEndpoint *ep = bufstate->endpoint;
> +    VirtIOIOMMU *s = ep->viommu;
> +    IOMMUDevice *sdev;
> +
> +    if (bufstate->filled + prop_size >= VIOMMU_PROBE_SIZE) {
> +        bufstate->error = true;
> +        /* get the traversal stopped by returning true */
> +        return true;
> +    }
> +
> +    page_size_mask = (struct virtio_iommu_probe_pgsize_mask *)
> +                     (bufstate->start + bufstate->filled);
> +
> +    page_size_mask->head.type = VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK;
> +    page_size_mask->head.length = prop_size;
> +    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
> +        if (ep->id == sdev->devfn) {
> +            page_size_mask->pgsize_bitmap = sdev->page_size_mask;
> +	}
> +    }
> +    bufstate->filled += sizeof(*page_size_mask);
> +    trace_virtio_iommu_fill_pgsize_mask_property(bufstate->endpoint->id,
> +                                                 page_size_mask->pgsize_bitmap,
> +                                                 bufstate->filled);
> +    return false;
> +}
> +
> +/* Fill the properties[] buffer with properties of type @type */
> +static int virtio_iommu_fill_property(int type,
> +                                      VirtIOIOMMUPropBuffer *bufstate)
> +{
> +    int ret = -ENOSPC;
> +
> +    if (bufstate->filled + sizeof(struct virtio_iommu_probe_property)
> +            >= VIOMMU_PROBE_SIZE) {
> +        /* no space left for the header */
> +        bufstate->error = true;
> +        goto out;
> +    }
> +
> +    switch (type) {
> +    case VIRTIO_IOMMU_PROBE_T_NONE:
> +        ret = virtio_iommu_fill_none_prop(bufstate);
> +        break;
> +    case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
> +    {
> +        ret = virtio_iommu_fill_page_size_mask(bufstate);
I don't think you want to fill the property of each EP. Only for those
whose sdev->page_size_mask was properly set. For instance if you mix
virtio and vfio devices, virtio ones won't have this field set.
> +	break;
> +    }
> +    default:
> +        ret = -ENOENT;
> +        break;
> +    }
> +out:
> +    if (ret) {
> +        error_report("%s property of type=%d could not be filled (%d),"
> +                     " remaining size = 0x%lx",
> +                     __func__, type, ret, bufstate->filled);
> +    }
> +    return ret;
> +}
> +
> +/**
> + * virtio_iommu_probe - Fill the probe request buffer with all
> + * the properties the device is able to return and add a NONE
> + * property at the end. @buf points to properties[].
> + */
> +static int virtio_iommu_probe(VirtIOIOMMU *s,
> +                              struct virtio_iommu_req_probe *req,
> +                              uint8_t *buf)
> +{
> +    uint32_t ep_id = le32_to_cpu(req->endpoint);
> +    VirtIOIOMMUEndpoint *ep = virtio_iommu_get_endpoint(s, ep_id);
> +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
> +    VirtIOIOMMUPropBuffer bufstate = {.start = buf, .filled = 0,
> +                                       .error = false, .endpoint = ep};
> +
> +    while ((type = ctz32(prop_types)) != 32) {
> +        if (virtio_iommu_fill_property(type, &bufstate)) {
> +            goto failure;
> +        }
> +        prop_types &= ~(1 << type);
> +    }
> +    if (virtio_iommu_fill_property(VIRTIO_IOMMU_PROBE_T_NONE, &bufstate)) {
> +        goto failure;
> +    }
> +    return VIRTIO_IOMMU_S_OK;
> +failure:
> +    return VIRTIO_IOMMU_S_INVAL;
> +}
> +
>  static int virtio_iommu_iov_to_req(struct iovec *iov,
>                                     unsigned int iov_cnt,
>                                     void *req, size_t req_sz)
> @@ -519,6 +638,17 @@ virtio_iommu_handle_req(detach)
>  virtio_iommu_handle_req(map)
>  virtio_iommu_handle_req(unmap)
>  
> +static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
> +                                     struct iovec *iov,
> +                                     unsigned int iov_cnt,
> +                                     uint8_t *buf)
> +{
> +    struct virtio_iommu_req_probe req;
> +    int ret = virtio_iommu_iov_to_req(iov, iov_cnt, &req, sizeof(req));
> +
> +    return ret ? ret : virtio_iommu_probe(s, &req, buf);
> +}
> +
>  static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>  {
>      VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
> @@ -564,17 +694,33 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>          case VIRTIO_IOMMU_T_UNMAP:
>              tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
>              break;
> +        case VIRTIO_IOMMU_T_PROBE:
> +        {
> +            struct virtio_iommu_req_tail *ptail;
> +            uint8_t *buf = g_malloc0(s->config.probe_size + sizeof(tail));
> +
> +            ptail = (struct virtio_iommu_req_tail *)
> +                        (buf + s->config.probe_size);
> +            ptail->status = virtio_iommu_handle_probe(s, iov, iov_cnt, buf);
> +
> +            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
> +                              buf, s->config.probe_size + sizeof(tail));
> +            g_free(buf);
> +            assert(sz == s->config.probe_size + sizeof(tail));
> +            goto push;
> +        }
>          default:
>              tail.status = VIRTIO_IOMMU_S_UNSUPP;
>          }
> -        qemu_mutex_unlock(&s->mutex);
>  
>  out:
>          sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
>                            &tail, sizeof(tail));
>          assert(sz == sizeof(tail));
>  
> -        virtqueue_push(vq, elem, sizeof(tail));
> +push:
> +        qemu_mutex_unlock(&s->mutex);
> +        virtqueue_push(vq, elem, sz);
>          virtio_notify(vdev, vq);
>          g_free(elem);
>      }
> @@ -634,16 +780,23 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>      VirtIOIOMMUEndpoint *ep;
>      uint32_t sid, flags;
>      bool bypass_allowed;
> +    hwaddr addr_mask;
>      bool found;
>  
>      interval.low = addr;
>      interval.high = addr + 1;
>  
> +    if (sdev->page_size_mask) {
> +        addr_mask = (1 << ctz32(sdev->page_size_mask)) - 1;
> +    } else {
> +        addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1;
> +    }
This change does not belong ot this patch. Rather belongs to
"virtio-iommu: set supported page size mask"
> +
>      IOMMUTLBEntry entry = {
>          .target_as = &address_space_memory,
>          .iova = addr,
>          .translated_addr = addr,
> -        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
> +        .addr_mask = addr_mask,
>          .perm = IOMMU_NONE,
>      };
>  
> @@ -831,6 +984,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>      s->config.page_size_mask = TARGET_PAGE_MASK;
>      s->config.input_range.end = -1UL;
>      s->config.domain_range.end = 32;
> +    s->config.probe_size = VIOMMU_PROBE_SIZE;
>  
>      virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
>      virtio_add_feature(&s->features, VIRTIO_RING_F_INDIRECT_DESC);
> @@ -840,6 +994,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MAP_UNMAP);
>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS);
>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
> +    virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
>  
>      qemu_mutex_init(&s->mutex);
>  
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 8bae651191..b0a6e4bda3 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uin
>  virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
>  virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
>  virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> +virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
> +virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask, size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
> 
Thanks

Eric



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 4/9] virtio-iommu: set supported page size mask
  2020-03-23  8:46 ` [PATCH v9 4/9] virtio-iommu: set supported page size mask Bharat Bhushan
@ 2020-03-26 15:51   ` Auger Eric
  2020-03-27  5:13     ` [EXT] " Bharat Bhushan
  0 siblings, 1 reply; 37+ messages in thread
From: Auger Eric @ 2020-03-26 15:51 UTC (permalink / raw)
  To: Bharat Bhushan, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, tnowicki, drjones,
	linuc.decode, qemu-devel, qemu-arm, bharatb.linux, jean-philippe,
	yang.zhong

Hi Bharat,

On 3/23/20 9:46 AM, Bharat Bhushan wrote:
> Add optional interface to set page size mask.
> Currently this is set global configuration and not
> per endpoint.
This allows to override the page size mask per end-point?
> 
> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> ---
>  include/hw/virtio/virtio-iommu.h | 1 +
>  hw/virtio/virtio-iommu.c         | 9 +++++++++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
> index 6f67f1020a..4efa09610a 100644
> --- a/include/hw/virtio/virtio-iommu.h
> +++ b/include/hw/virtio/virtio-iommu.h
> @@ -35,6 +35,7 @@ typedef struct IOMMUDevice {
>      void         *viommu;
>      PCIBus       *bus;
>      int           devfn;
> +    uint64_t      page_size_mask;
>      IOMMUMemoryRegion  iommu_mr;
>      AddressSpace  as;
>  } IOMMUDevice;
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 4cee8083bc..a28818202c 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -650,6 +650,14 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
>      return (ua > ub) - (ua < ub);
>  }
>  
> +static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
> +                                            uint64_t page_size_mask)
> +{
> +    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
> +
> +    sdev->page_size_mask = page_size_mask;
> +}
> +
>  static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> @@ -865,6 +873,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
>      IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
>  
>      imrc->translate = virtio_iommu_translate;
> +    imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
>  }
>  
>  static const TypeInfo virtio_iommu_info = {
> 
Thanks

Eric



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 2/9] memory: Add interface to set iommu page size mask
  2020-03-23  8:46 ` [PATCH v9 2/9] memory: Add interface to set iommu page size mask Bharat Bhushan
@ 2020-03-26 16:06   ` Auger Eric
  2020-03-27  5:33     ` [EXT] " Bharat Bhushan
  0 siblings, 1 reply; 37+ messages in thread
From: Auger Eric @ 2020-03-26 16:06 UTC (permalink / raw)
  To: Bharat Bhushan, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, tnowicki, drjones,
	linuc.decode, qemu-devel, qemu-arm, bharatb.linux, jean-philippe,
	yang.zhong

Hi Bharat,
On 3/23/20 9:46 AM, Bharat Bhushan wrote:
> Allow to set page size mask to be supported by iommu.
by iommu memory region. I mean this is not global to the IOMMU.
> This is required to expose page size mask compatible with
> host with virtio-iommu.
> 
> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> ---
>  include/exec/memory.h | 20 ++++++++++++++++++++
>  memory.c              | 10 ++++++++++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index e85b7de99a..063c424854 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -355,6 +355,16 @@ typedef struct IOMMUMemoryRegionClass {
>       * @iommu: the IOMMUMemoryRegion
>       */
>      int (*num_indexes)(IOMMUMemoryRegion *iommu);
> +
> +    /*
> +     * Set supported IOMMU page size
> +     *
> +     * Optional method: if this is supported then set page size that
> +     * can be supported by IOMMU. This is called to set supported page
> +     * size as per host Linux.
What about: If supported, allows to restrict the page size mask that can
be supported with a given IOMMU memory region. For example, this allows
to propagate host physical IOMMU page size mask limitations to the
virtual IOMMU (vfio assignment with virtual iommu).
> +     */
> +     void (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu,
> +                                      uint64_t page_size_mask);
>  } IOMMUMemoryRegionClass;
>  
>  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> @@ -1363,6 +1373,16 @@ int memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr,
>   */
>  int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr);
>  
> +/**
> + * memory_region_iommu_set_page_size_mask: set the supported pages
> + * size by iommu.
supported page sizes for a given IOMMU memory region
> + *
> + * @iommu_mr: the memory region
IOMMU memory region
> + * @page_size_mask: supported page size mask
> + */
> +void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
> +                                            uint64_t page_size_mask);
> +
>  /**
>   * memory_region_name: get a memory region's name
>   *
> diff --git a/memory.c b/memory.c
> index aeaa8dcc9e..14c8783084 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1833,6 +1833,16 @@ static int memory_region_update_iommu_notify_flags(IOMMUMemoryRegion *iommu_mr,
>      return ret;
>  }
>  
> +void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
> +                                            uint64_t page_size_mask)
> +{
> +    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
> +
> +    if (imrc->iommu_set_page_size_mask) {
> +        imrc->iommu_set_page_size_mask(iommu_mr, page_size_mask);
Shouldn't it return an int in case the setting cannot be applied?
> +    }
> +}
> +
>  int memory_region_register_iommu_notifier(MemoryRegion *mr,
>                                            IOMMUNotifier *n, Error **errp)
>  {
> 
Thanks
Eric



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-03-23 23:08   ` Alex Williamson
@ 2020-03-26 17:35     ` Auger Eric
  2020-03-26 17:53       ` Alex Williamson
  0 siblings, 1 reply; 37+ messages in thread
From: Auger Eric @ 2020-03-26 17:35 UTC (permalink / raw)
  To: Alex Williamson, Bharat Bhushan
  Cc: yang.zhong, peter.maydell, kevin.tian, tnowicki, mst, drjones,
	peterx, qemu-devel, bharatb.linux, qemu-arm, jean-philippe,
	linuc.decode, David Gibson, eric.auger.pro

Hi Alex,

On 3/24/20 12:08 AM, Alex Williamson wrote:
> [Cc +dwg who originated this warning]
> 
> On Mon, 23 Mar 2020 14:16:09 +0530
> Bharat Bhushan <bbhushan2@marvell.com> wrote:
> 
>> On ARM, the MSI doorbell is translated by the virtual IOMMU.
>> As such address_space_translate() returns the MSI controller
>> MMIO region and we get an "iommu map to non memory area"
>> message. Let's remove this latter.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>> ---
>>  hw/vfio/common.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 5ca11488d6..c586edf47a 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
>>                                   &xlat, &len, writable,
>>                                   MEMTXATTRS_UNSPECIFIED);
>>      if (!memory_region_is_ram(mr)) {
>> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
>> -                     xlat);
>>          return false;
>>      }
>>  
> 
> I'm a bit confused here, I think we need more justification beyond "we
> hit this warning and we don't want to because it's ok in this one
> special case, therefore remove it".  I assume the special case is that
> the device MSI address is managed via the SET_IRQS ioctl and therefore
> we won't actually get DMAs to this range.
Yes exactly. The guest creates a mapping between one giova and this gpa
(corresponding to the MSI controller doorbell) because MSIs are mapped
on ARM. But practically the physical device is programmed with an host
chosen iova that maps onto the physical MSI controller's doorbell. so
the device never performs DMA accesses to this range.

  But I imagine the case that
> was in mind when adding this warning was general peer-to-peer between
> and assigned and emulated device.
yes makes sense.

  Maybe there's an argument to be made
> that such a p2p mapping might also be used in a non-vIOMMU case.  We
> skip creating those mappings and drivers continue to work, maybe
> because nobody attempts to do p2p DMA with the types of devices we
> emulate, maybe because p2p DMA is not absolutely reliable on bare metal
> and drivers test it before using it.
MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
iommu_dma_get_msi_page).
One idea could be to pass that flag through the IOMMU Notifier mechanism
into the iotlb->perm. Eventually when we get this in vfio_get_vaddr() we
would not print the warning. Could that make sense?

Thanks

Eric



  Anyway, I need a better argument
> why this is ok.  Thanks,
> 
> Alex
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-03-26 17:35     ` Auger Eric
@ 2020-03-26 17:53       ` Alex Williamson
  2020-03-27  5:50         ` [EXT] " Bharat Bhushan
  2020-04-02  9:01         ` Bharat Bhushan
  0 siblings, 2 replies; 37+ messages in thread
From: Alex Williamson @ 2020-03-26 17:53 UTC (permalink / raw)
  To: Auger Eric
  Cc: yang.zhong, peter.maydell, kevin.tian, tnowicki, mst, drjones,
	peterx, qemu-devel, bharatb.linux, qemu-arm, jean-philippe,
	Bharat Bhushan, David Gibson, linuc.decode, eric.auger.pro

On Thu, 26 Mar 2020 18:35:48 +0100
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Alex,
> 
> On 3/24/20 12:08 AM, Alex Williamson wrote:
> > [Cc +dwg who originated this warning]
> > 
> > On Mon, 23 Mar 2020 14:16:09 +0530
> > Bharat Bhushan <bbhushan2@marvell.com> wrote:
> >   
> >> On ARM, the MSI doorbell is translated by the virtual IOMMU.
> >> As such address_space_translate() returns the MSI controller
> >> MMIO region and we get an "iommu map to non memory area"
> >> message. Let's remove this latter.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> >> ---
> >>  hw/vfio/common.c | 2 --
> >>  1 file changed, 2 deletions(-)
> >>
> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> index 5ca11488d6..c586edf47a 100644
> >> --- a/hw/vfio/common.c
> >> +++ b/hw/vfio/common.c
> >> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
> >>                                   &xlat, &len, writable,
> >>                                   MEMTXATTRS_UNSPECIFIED);
> >>      if (!memory_region_is_ram(mr)) {
> >> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> >> -                     xlat);
> >>          return false;
> >>      }
> >>    
> > 
> > I'm a bit confused here, I think we need more justification beyond "we
> > hit this warning and we don't want to because it's ok in this one
> > special case, therefore remove it".  I assume the special case is that
> > the device MSI address is managed via the SET_IRQS ioctl and therefore
> > we won't actually get DMAs to this range.  
> Yes exactly. The guest creates a mapping between one giova and this gpa
> (corresponding to the MSI controller doorbell) because MSIs are mapped
> on ARM. But practically the physical device is programmed with an host
> chosen iova that maps onto the physical MSI controller's doorbell. so
> the device never performs DMA accesses to this range.
> 
>   But I imagine the case that
> > was in mind when adding this warning was general peer-to-peer between
> > and assigned and emulated device.  
> yes makes sense.
> 
>   Maybe there's an argument to be made
> > that such a p2p mapping might also be used in a non-vIOMMU case.  We
> > skip creating those mappings and drivers continue to work, maybe
> > because nobody attempts to do p2p DMA with the types of devices we
> > emulate, maybe because p2p DMA is not absolutely reliable on bare metal
> > and drivers test it before using it.  
> MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
> iommu_dma_get_msi_page).
> One idea could be to pass that flag through the IOMMU Notifier mechanism
> into the iotlb->perm. Eventually when we get this in vfio_get_vaddr() we
> would not print the warning. Could that make sense?

Yeah, if we can identify a valid case that doesn't need a warning,
that's fine by me.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [EXT] Re: [PATCH v9 4/9] virtio-iommu: set supported page size mask
  2020-03-26 15:51   ` Auger Eric
@ 2020-03-27  5:13     ` Bharat Bhushan
  2020-03-27  8:28       ` Auger Eric
  0 siblings, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-27  5:13 UTC (permalink / raw)
  To: Auger Eric, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, Tomasz Nowicki [C],
	drjones, linuc.decode, qemu-devel, qemu-arm, bharatb.linux,
	jean-philippe, yang.zhong

Hi Eric,

> -----Original Message-----
> From: Auger Eric <eric.auger@redhat.com>
> Sent: Thursday, March 26, 2020 9:22 PM
> To: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
> peterx@redhat.com; eric.auger.pro@gmail.com; alex.williamson@redhat.com;
> kevin.tian@intel.com; mst@redhat.com; Tomasz Nowicki [C]
> <tnowicki@marvell.com>; drjones@redhat.com; linuc.decode@gmail.com; qemu-
> devel@nongnu.org; qemu-arm@nongnu.org; bharatb.linux@gmail.com; jean-
> philippe@linaro.org; yang.zhong@intel.com
> Subject: [EXT] Re: [PATCH v9 4/9] virtio-iommu: set supported page size mask
> 
> External Email
> 
> ----------------------------------------------------------------------
> Hi Bharat,
> 
> On 3/23/20 9:46 AM, Bharat Bhushan wrote:
> > Add optional interface to set page size mask.
> > Currently this is set global configuration and not per endpoint.
> This allows to override the page size mask per end-point?

This patch adds per endpoint page-size-mask configuration in addition to global page-size-mask.
endpoint page-size-mask will override global page-size-mask configuration for that endpoint.

Thanks
-Bharat

> >
> > Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> > ---
> >  include/hw/virtio/virtio-iommu.h | 1 +
> >  hw/virtio/virtio-iommu.c         | 9 +++++++++
> >  2 files changed, 10 insertions(+)
> >
> > diff --git a/include/hw/virtio/virtio-iommu.h
> > b/include/hw/virtio/virtio-iommu.h
> > index 6f67f1020a..4efa09610a 100644
> > --- a/include/hw/virtio/virtio-iommu.h
> > +++ b/include/hw/virtio/virtio-iommu.h
> > @@ -35,6 +35,7 @@ typedef struct IOMMUDevice {
> >      void         *viommu;
> >      PCIBus       *bus;
> >      int           devfn;
> > +    uint64_t      page_size_mask;
> >      IOMMUMemoryRegion  iommu_mr;
> >      AddressSpace  as;
> >  } IOMMUDevice;
> > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index
> > 4cee8083bc..a28818202c 100644
> > --- a/hw/virtio/virtio-iommu.c
> > +++ b/hw/virtio/virtio-iommu.c
> > @@ -650,6 +650,14 @@ static gint int_cmp(gconstpointer a, gconstpointer b,
> gpointer user_data)
> >      return (ua > ub) - (ua < ub);
> >  }
> >
> > +static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
> > +                                            uint64_t page_size_mask)
> > +{
> > +    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
> > +
> > +    sdev->page_size_mask = page_size_mask; }
> > +
> >  static void virtio_iommu_device_realize(DeviceState *dev, Error
> > **errp)  {
> >      VirtIODevice *vdev = VIRTIO_DEVICE(dev); @@ -865,6 +873,7 @@
> > static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
> >      IOMMUMemoryRegionClass *imrc =
> IOMMU_MEMORY_REGION_CLASS(klass);
> >
> >      imrc->translate = virtio_iommu_translate;
> > +    imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
> >  }
> >
> >  static const TypeInfo virtio_iommu_info = {
> >
> Thanks
> 
> Eric



^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [EXT] Re: [PATCH v9 2/9] memory: Add interface to set iommu page size mask
  2020-03-26 16:06   ` Auger Eric
@ 2020-03-27  5:33     ` Bharat Bhushan
  2020-03-27  8:27       ` Auger Eric
  0 siblings, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-27  5:33 UTC (permalink / raw)
  To: Auger Eric, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, Tomasz Nowicki [C],
	drjones, linuc.decode, qemu-devel, qemu-arm, bharatb.linux,
	jean-philippe, yang.zhong

Hi Eric,

> -----Original Message-----
> From: Auger Eric <eric.auger@redhat.com>
> Sent: Thursday, March 26, 2020 9:36 PM
> To: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
> peterx@redhat.com; eric.auger.pro@gmail.com; alex.williamson@redhat.com;
> kevin.tian@intel.com; mst@redhat.com; Tomasz Nowicki [C]
> <tnowicki@marvell.com>; drjones@redhat.com; linuc.decode@gmail.com; qemu-
> devel@nongnu.org; qemu-arm@nongnu.org; bharatb.linux@gmail.com; jean-
> philippe@linaro.org; yang.zhong@intel.com
> Subject: [EXT] Re: [PATCH v9 2/9] memory: Add interface to set iommu page size
> mask
> 
> External Email
> 
> ----------------------------------------------------------------------
> Hi Bharat,
> On 3/23/20 9:46 AM, Bharat Bhushan wrote:
> > Allow to set page size mask to be supported by iommu.
> by iommu memory region. I mean this is not global to the IOMMU.

Yes.

> > This is required to expose page size mask compatible with host with
> > virtio-iommu.
> >
> > Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> > ---
> >  include/exec/memory.h | 20 ++++++++++++++++++++
> >  memory.c              | 10 ++++++++++
> >  2 files changed, 30 insertions(+)
> >
> > diff --git a/include/exec/memory.h b/include/exec/memory.h index
> > e85b7de99a..063c424854 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -355,6 +355,16 @@ typedef struct IOMMUMemoryRegionClass {
> >       * @iommu: the IOMMUMemoryRegion
> >       */
> >      int (*num_indexes)(IOMMUMemoryRegion *iommu);
> > +
> > +    /*
> > +     * Set supported IOMMU page size
> > +     *
> > +     * Optional method: if this is supported then set page size that
> > +     * can be supported by IOMMU. This is called to set supported page
> > +     * size as per host Linux.
> What about: If supported, allows to restrict the page size mask that can be
> supported with a given IOMMU memory region. For example, this allows to
> propagate host physical IOMMU page size mask limitations to the virtual IOMMU
> (vfio assignment with virtual iommu).

Much better 

> > +     */
> > +     void (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu,
> > +                                      uint64_t page_size_mask);
> >  } IOMMUMemoryRegionClass;
> >
> >  typedef struct CoalescedMemoryRange CoalescedMemoryRange; @@ -1363,6
> > +1373,16 @@ int
> memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr,
> >   */
> >  int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr);
> >
> > +/**
> > + * memory_region_iommu_set_page_size_mask: set the supported pages
> > + * size by iommu.
> supported page sizes for a given IOMMU memory region
> > + *
> > + * @iommu_mr: the memory region
> IOMMU memory region
> > + * @page_size_mask: supported page size mask  */ void
> > +memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion
> *iommu_mr,
> > +                                            uint64_t page_size_mask);
> > +
> >  /**
> >   * memory_region_name: get a memory region's name
> >   *
> > diff --git a/memory.c b/memory.c
> > index aeaa8dcc9e..14c8783084 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -1833,6 +1833,16 @@ static int
> memory_region_update_iommu_notify_flags(IOMMUMemoryRegion
> *iommu_mr,
> >      return ret;
> >  }
> >
> > +void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion
> *iommu_mr,
> > +                                            uint64_t page_size_mask)
> > +{
> > +    IOMMUMemoryRegionClass *imrc =
> > +IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
> > +
> > +    if (imrc->iommu_set_page_size_mask) {
> > +        imrc->iommu_set_page_size_mask(iommu_mr, page_size_mask);
> Shouldn't it return an int in case the setting cannot be applied?

iommu_set_page_size_mask() is setting page-size-mask for endpoint. Below function from code

	static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
              	                              uint64_t page_size_mask)
	{
    		IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);

	              sdev->page_size_mask = page_size_mask;
	}

Do you see any reason it cannot be applied, am I missing something?

Thanks
-Bharat

> > +    }
> > +}
> > +
> >  int memory_region_register_iommu_notifier(MemoryRegion *mr,
> >                                            IOMMUNotifier *n, Error
> > **errp)  {
> >
> Thanks
> Eric



^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [EXT] Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
  2020-03-26 15:48   ` Auger Eric
@ 2020-03-27  5:40     ` Bharat Bhushan
  2020-03-27  8:34       ` Auger Eric
  0 siblings, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-27  5:40 UTC (permalink / raw)
  To: Auger Eric, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, Tomasz Nowicki [C],
	drjones, linuc.decode, qemu-devel, qemu-arm, bharatb.linux,
	jean-philippe, yang.zhong

Hi Eric,

> -----Original Message-----
> From: Auger Eric <eric.auger@redhat.com>
> Sent: Thursday, March 26, 2020 9:18 PM
> To: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
> peterx@redhat.com; eric.auger.pro@gmail.com; alex.williamson@redhat.com;
> kevin.tian@intel.com; mst@redhat.com; Tomasz Nowicki [C]
> <tnowicki@marvell.com>; drjones@redhat.com; linuc.decode@gmail.com; qemu-
> devel@nongnu.org; qemu-arm@nongnu.org; bharatb.linux@gmail.com; jean-
> philippe@linaro.org; yang.zhong@intel.com
> Subject: [EXT] Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
> 
> External Email
> 
> ----------------------------------------------------------------------
> Hi Bharat
> 
> On 3/23/20 9:46 AM, Bharat Bhushan wrote:
> > This patch implements the PROBE request. Currently supported page size
> > mask per endpoint is returned. Also append a NONE property in the end.
> >
> > Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> > Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > ---
> >  include/standard-headers/linux/virtio_iommu.h |   6 +
> Changes to virtio_iommu.h should be in a separate patch you should use
> ./scripts/update-linux-headers.sh See for instance:
> ddda37483d  linux-headers: update
> until the uapi updates are not upstream you can link to your kernel branch and
> mention this is a temporary linux header update or partial if you just want to pick
> up the iommu.h changes.

yes, I am sorry.

> 
> >  hw/virtio/virtio-iommu.c                      | 161 +++++++++++++++++-
> >  hw/virtio/trace-events                        |   2 +
> >  3 files changed, 166 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/standard-headers/linux/virtio_iommu.h
> > b/include/standard-headers/linux/virtio_iommu.h
> > index b9443b83a1..8a0d47b907 100644
> > --- a/include/standard-headers/linux/virtio_iommu.h
> > +++ b/include/standard-headers/linux/virtio_iommu.h
> > @@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap {
> >
> >  #define VIRTIO_IOMMU_PROBE_T_NONE		0
> >  #define VIRTIO_IOMMU_PROBE_T_RESV_MEM		1
> > +#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK	2
> >
> >  #define VIRTIO_IOMMU_PROBE_T_MASK		0xfff
> >
> > @@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem {
> >  	uint64_t					end;
> >  };
> >
> > +struct virtio_iommu_probe_pgsize_mask {
> > +	struct virtio_iommu_probe_property      head;
> > +	uint64_t				pgsize_bitmap;
> > +};
> > +
> >  struct virtio_iommu_req_probe {
> >  	struct virtio_iommu_req_head		head;
> >  	uint32_t					endpoint;
> > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index
> > 747e3cf1da..63fbacdcdc 100644
> > --- a/hw/virtio/virtio-iommu.c
> > +++ b/hw/virtio/virtio-iommu.c
> > @@ -38,6 +38,10 @@
> >
> >  /* Max size */
> >  #define VIOMMU_DEFAULT_QUEUE_SIZE 256
> > +#define VIOMMU_PROBE_SIZE 512
> > +
> > +#define SUPPORTED_PROBE_PROPERTIES (\
> > +    1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK)
> >
> >  typedef struct VirtIOIOMMUDomain {
> >      uint32_t id;
> > @@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping {
> >      uint32_t flags;
> >  } VirtIOIOMMUMapping;
> >
> > +typedef struct VirtIOIOMMUPropBuffer {
> > +    VirtIOIOMMUEndpoint *endpoint;
> > +    size_t filled;
> > +    uint8_t *start;
> > +    bool error;
> > +} VirtIOIOMMUPropBuffer;
> > +
> >  static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)  {
> >      return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn); @@
> > -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
> >      return ret;
> >  }
> >
> > +static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer
> > +*bufstate) {
> > +    struct virtio_iommu_probe_property *prop;
> > +
> > +    prop = (struct virtio_iommu_probe_property *)
> > +                (bufstate->start + bufstate->filled);
> > +    prop->type = 0;
> > +    prop->length = 0;
> > +    bufstate->filled += sizeof(*prop);
> > +    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
> > +    return 0;
> > +}
> > +
> > +static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer
> > +*bufstate) {
> > +    struct virtio_iommu_probe_pgsize_mask *page_size_mask;
> > +    size_t prop_size = sizeof(*page_size_mask);
> > +    VirtIOIOMMUEndpoint *ep = bufstate->endpoint;
> > +    VirtIOIOMMU *s = ep->viommu;
> > +    IOMMUDevice *sdev;
> > +
> > +    if (bufstate->filled + prop_size >= VIOMMU_PROBE_SIZE) {
> > +        bufstate->error = true;
> > +        /* get the traversal stopped by returning true */
> > +        return true;
> > +    }
> > +
> > +    page_size_mask = (struct virtio_iommu_probe_pgsize_mask *)
> > +                     (bufstate->start + bufstate->filled);
> > +
> > +    page_size_mask->head.type = VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK;
> > +    page_size_mask->head.length = prop_size;
> > +    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
> > +        if (ep->id == sdev->devfn) {
> > +            page_size_mask->pgsize_bitmap = sdev->page_size_mask;
> > +	}
> > +    }
> > +    bufstate->filled += sizeof(*page_size_mask);
> > +    trace_virtio_iommu_fill_pgsize_mask_property(bufstate->endpoint->id,
> > +                                                 page_size_mask->pgsize_bitmap,
> > +                                                 bufstate->filled);
> > +    return false;
> > +}
> > +
> > +/* Fill the properties[] buffer with properties of type @type */
> > +static int virtio_iommu_fill_property(int type,
> > +                                      VirtIOIOMMUPropBuffer
> > +*bufstate) {
> > +    int ret = -ENOSPC;
> > +
> > +    if (bufstate->filled + sizeof(struct virtio_iommu_probe_property)
> > +            >= VIOMMU_PROBE_SIZE) {
> > +        /* no space left for the header */
> > +        bufstate->error = true;
> > +        goto out;
> > +    }
> > +
> > +    switch (type) {
> > +    case VIRTIO_IOMMU_PROBE_T_NONE:
> > +        ret = virtio_iommu_fill_none_prop(bufstate);
> > +        break;
> > +    case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
> > +    {
> > +        ret = virtio_iommu_fill_page_size_mask(bufstate);
> I don't think you want to fill the property of each EP. Only for those whose sdev-
> >page_size_mask was properly set. For instance if you mix virtio and vfio devices,
> virtio ones won't have this field set.

This something I am looking for clarification ( asked in spec document patch). Copy pasting same here:

What some clarification about two page-size-mask configurations available.
 - Global configuration for page-size-mask
 - per endpoint page-size-mask configuration

PAGE_SIZE_MASK probe for and endpoint can return zero or non-zero value.
If it returns non-zero value than it will override the global configuration.
If PAGE_SIZE_MASK probe for and endpoint return zero value than global page-size-mask configuration will be used.

Is that correct?

Thanks
-Bharat

> > +	break;
> > +    }
> > +    default:
> > +        ret = -ENOENT;
> > +        break;
> > +    }
> > +out:
> > +    if (ret) {
> > +        error_report("%s property of type=%d could not be filled (%d),"
> > +                     " remaining size = 0x%lx",
> > +                     __func__, type, ret, bufstate->filled);
> > +    }
> > +    return ret;
> > +}
> > +
> > +/**
> > + * virtio_iommu_probe - Fill the probe request buffer with all
> > + * the properties the device is able to return and add a NONE
> > + * property at the end. @buf points to properties[].
> > + */
> > +static int virtio_iommu_probe(VirtIOIOMMU *s,
> > +                              struct virtio_iommu_req_probe *req,
> > +                              uint8_t *buf) {
> > +    uint32_t ep_id = le32_to_cpu(req->endpoint);
> > +    VirtIOIOMMUEndpoint *ep = virtio_iommu_get_endpoint(s, ep_id);
> > +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
> > +    VirtIOIOMMUPropBuffer bufstate = {.start = buf, .filled = 0,
> > +                                       .error = false, .endpoint =
> > +ep};
> > +
> > +    while ((type = ctz32(prop_types)) != 32) {
> > +        if (virtio_iommu_fill_property(type, &bufstate)) {
> > +            goto failure;
> > +        }
> > +        prop_types &= ~(1 << type);
> > +    }
> > +    if (virtio_iommu_fill_property(VIRTIO_IOMMU_PROBE_T_NONE, &bufstate))
> {
> > +        goto failure;
> > +    }
> > +    return VIRTIO_IOMMU_S_OK;
> > +failure:
> > +    return VIRTIO_IOMMU_S_INVAL;
> > +}
> > +
> >  static int virtio_iommu_iov_to_req(struct iovec *iov,
> >                                     unsigned int iov_cnt,
> >                                     void *req, size_t req_sz) @@
> > -519,6 +638,17 @@ virtio_iommu_handle_req(detach)
> >  virtio_iommu_handle_req(map)
> >  virtio_iommu_handle_req(unmap)
> >
> > +static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
> > +                                     struct iovec *iov,
> > +                                     unsigned int iov_cnt,
> > +                                     uint8_t *buf) {
> > +    struct virtio_iommu_req_probe req;
> > +    int ret = virtio_iommu_iov_to_req(iov, iov_cnt, &req,
> > +sizeof(req));
> > +
> > +    return ret ? ret : virtio_iommu_probe(s, &req, buf); }
> > +
> >  static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue
> > *vq)  {
> >      VirtIOIOMMU *s = VIRTIO_IOMMU(vdev); @@ -564,17 +694,33 @@ static
> > void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
> >          case VIRTIO_IOMMU_T_UNMAP:
> >              tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
> >              break;
> > +        case VIRTIO_IOMMU_T_PROBE:
> > +        {
> > +            struct virtio_iommu_req_tail *ptail;
> > +            uint8_t *buf = g_malloc0(s->config.probe_size +
> > + sizeof(tail));
> > +
> > +            ptail = (struct virtio_iommu_req_tail *)
> > +                        (buf + s->config.probe_size);
> > +            ptail->status = virtio_iommu_handle_probe(s, iov,
> > + iov_cnt, buf);
> > +
> > +            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
> > +                              buf, s->config.probe_size + sizeof(tail));
> > +            g_free(buf);
> > +            assert(sz == s->config.probe_size + sizeof(tail));
> > +            goto push;
> > +        }
> >          default:
> >              tail.status = VIRTIO_IOMMU_S_UNSUPP;
> >          }
> > -        qemu_mutex_unlock(&s->mutex);
> >
> >  out:
> >          sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
> >                            &tail, sizeof(tail));
> >          assert(sz == sizeof(tail));
> >
> > -        virtqueue_push(vq, elem, sizeof(tail));
> > +push:
> > +        qemu_mutex_unlock(&s->mutex);
> > +        virtqueue_push(vq, elem, sz);
> >          virtio_notify(vdev, vq);
> >          g_free(elem);
> >      }
> > @@ -634,16 +780,23 @@ static IOMMUTLBEntry
> virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
> >      VirtIOIOMMUEndpoint *ep;
> >      uint32_t sid, flags;
> >      bool bypass_allowed;
> > +    hwaddr addr_mask;
> >      bool found;
> >
> >      interval.low = addr;
> >      interval.high = addr + 1;
> >
> > +    if (sdev->page_size_mask) {
> > +        addr_mask = (1 << ctz32(sdev->page_size_mask)) - 1;
> > +    } else {
> > +        addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1;
> > +    }
> This change does not belong ot this patch. Rather belongs to
> "virtio-iommu: set supported page size mask"

Thanks for pointing, will change.

Thanks
-Bharat

> > +
> >      IOMMUTLBEntry entry = {
> >          .target_as = &address_space_memory,
> >          .iova = addr,
> >          .translated_addr = addr,
> > -        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
> > +        .addr_mask = addr_mask,
> >          .perm = IOMMU_NONE,
> >      };
> >
> > @@ -831,6 +984,7 @@ static void virtio_iommu_device_realize(DeviceState
> *dev, Error **errp)
> >      s->config.page_size_mask = TARGET_PAGE_MASK;
> >      s->config.input_range.end = -1UL;
> >      s->config.domain_range.end = 32;
> > +    s->config.probe_size = VIOMMU_PROBE_SIZE;
> >
> >      virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
> >      virtio_add_feature(&s->features, VIRTIO_RING_F_INDIRECT_DESC); @@
> > -840,6 +994,7 @@ static void virtio_iommu_device_realize(DeviceState *dev,
> Error **errp)
> >      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MAP_UNMAP);
> >      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS);
> >      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
> > +    virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
> >
> >      qemu_mutex_init(&s->mutex);
> >
> > diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index
> > 8bae651191..b0a6e4bda3 100644
> > --- a/hw/virtio/trace-events
> > +++ b/hw/virtio/trace-events
> > @@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t
> > flags, uint32_t endpoint, uin  virtio_iommu_notify_map(const char
> > *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s
> > iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
> virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size)
> "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64  virtio_iommu_remap(uint64_t iova,
> uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> > +virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
> > +virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask,
> size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
> >
> Thanks
> 
> Eric



^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-03-26 17:53       ` Alex Williamson
@ 2020-03-27  5:50         ` Bharat Bhushan
  2020-04-02  9:01         ` Bharat Bhushan
  1 sibling, 0 replies; 37+ messages in thread
From: Bharat Bhushan @ 2020-03-27  5:50 UTC (permalink / raw)
  To: Alex Williamson, Auger Eric
  Cc: yang.zhong, peter.maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, bharatb.linux, qemu-arm,
	jean-philippe, linuc.decode, David Gibson, eric.auger.pro

Hi Alex, Eric,

> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Thursday, March 26, 2020 11:23 PM
> To: Auger Eric <eric.auger@redhat.com>
> Cc: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
> peterx@redhat.com; eric.auger.pro@gmail.com; kevin.tian@intel.com;
> mst@redhat.com; Tomasz Nowicki [C] <tnowicki@marvell.com>;
> drjones@redhat.com; linuc.decode@gmail.com; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org; bharatb.linux@gmail.com; jean-philippe@linaro.org;
> yang.zhong@intel.com; David Gibson <david@gibson.dropbear.id.au>
> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio
> region translation by viommu
> 
> External Email
> 
> ----------------------------------------------------------------------
> On Thu, 26 Mar 2020 18:35:48 +0100
> Auger Eric <eric.auger@redhat.com> wrote:
> 
> > Hi Alex,
> >
> > On 3/24/20 12:08 AM, Alex Williamson wrote:
> > > [Cc +dwg who originated this warning]
> > >
> > > On Mon, 23 Mar 2020 14:16:09 +0530
> > > Bharat Bhushan <bbhushan2@marvell.com> wrote:
> > >
> > >> On ARM, the MSI doorbell is translated by the virtual IOMMU.
> > >> As such address_space_translate() returns the MSI controller MMIO
> > >> region and we get an "iommu map to non memory area"
> > >> message. Let's remove this latter.
> > >>
> > >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > >> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> > >> ---
> > >>  hw/vfio/common.c | 2 --
> > >>  1 file changed, 2 deletions(-)
> > >>
> > >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
> > >> 5ca11488d6..c586edf47a 100644
> > >> --- a/hw/vfio/common.c
> > >> +++ b/hw/vfio/common.c
> > >> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb,
> void **vaddr,
> > >>                                   &xlat, &len, writable,
> > >>                                   MEMTXATTRS_UNSPECIFIED);
> > >>      if (!memory_region_is_ram(mr)) {
> > >> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> > >> -                     xlat);
> > >>          return false;
> > >>      }
> > >>
> > >
> > > I'm a bit confused here, I think we need more justification beyond
> > > "we hit this warning and we don't want to because it's ok in this
> > > one special case, therefore remove it".  I assume the special case
> > > is that the device MSI address is managed via the SET_IRQS ioctl and
> > > therefore we won't actually get DMAs to this range.
> > Yes exactly. The guest creates a mapping between one giova and this
> > gpa (corresponding to the MSI controller doorbell) because MSIs are
> > mapped on ARM. But practically the physical device is programmed with
> > an host chosen iova that maps onto the physical MSI controller's
> > doorbell. so the device never performs DMA accesses to this range.
> >
> >   But I imagine the case that
> > > was in mind when adding this warning was general peer-to-peer
> > > between and assigned and emulated device.
> > yes makes sense.
> >
> >   Maybe there's an argument to be made
> > > that such a p2p mapping might also be used in a non-vIOMMU case.  We
> > > skip creating those mappings and drivers continue to work, maybe
> > > because nobody attempts to do p2p DMA with the types of devices we
> > > emulate, maybe because p2p DMA is not absolutely reliable on bare
> > > metal and drivers test it before using it.
> > MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
> > iommu_dma_get_msi_page).
> > One idea could be to pass that flag through the IOMMU Notifier
> > mechanism into the iotlb->perm. Eventually when we get this in
> > vfio_get_vaddr() we would not print the warning. Could that make sense?
> 
> Yeah, if we can identify a valid case that doesn't need a warning, that's fine by me.
> Thanks,

Will change as per above suggestion by Eric.

Thanks
-Bharat

> 
> Alex



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 2/9] memory: Add interface to set iommu page size mask
  2020-03-27  5:33     ` [EXT] " Bharat Bhushan
@ 2020-03-27  8:27       ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-03-27  8:27 UTC (permalink / raw)
  To: Bharat Bhushan, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, Tomasz Nowicki [C],
	drjones, linuc.decode, qemu-devel, qemu-arm, bharatb.linux,
	jean-philippe, yang.zhong

Hi Bharat,

On 3/27/20 6:33 AM, Bharat Bhushan wrote:
> Hi Eric,
> 
>> -----Original Message-----
>> From: Auger Eric <eric.auger@redhat.com>
>> Sent: Thursday, March 26, 2020 9:36 PM
>> To: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
>> peterx@redhat.com; eric.auger.pro@gmail.com; alex.williamson@redhat.com;
>> kevin.tian@intel.com; mst@redhat.com; Tomasz Nowicki [C]
>> <tnowicki@marvell.com>; drjones@redhat.com; linuc.decode@gmail.com; qemu-
>> devel@nongnu.org; qemu-arm@nongnu.org; bharatb.linux@gmail.com; jean-
>> philippe@linaro.org; yang.zhong@intel.com
>> Subject: [EXT] Re: [PATCH v9 2/9] memory: Add interface to set iommu page size
>> mask
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>> Hi Bharat,
>> On 3/23/20 9:46 AM, Bharat Bhushan wrote:
>>> Allow to set page size mask to be supported by iommu.
>> by iommu memory region. I mean this is not global to the IOMMU.
> 
> Yes.
> 
>>> This is required to expose page size mask compatible with host with
>>> virtio-iommu.
>>>
>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>> ---
>>>  include/exec/memory.h | 20 ++++++++++++++++++++
>>>  memory.c              | 10 ++++++++++
>>>  2 files changed, 30 insertions(+)
>>>
>>> diff --git a/include/exec/memory.h b/include/exec/memory.h index
>>> e85b7de99a..063c424854 100644
>>> --- a/include/exec/memory.h
>>> +++ b/include/exec/memory.h
>>> @@ -355,6 +355,16 @@ typedef struct IOMMUMemoryRegionClass {
>>>       * @iommu: the IOMMUMemoryRegion
>>>       */
>>>      int (*num_indexes)(IOMMUMemoryRegion *iommu);
>>> +
>>> +    /*
>>> +     * Set supported IOMMU page size
>>> +     *
>>> +     * Optional method: if this is supported then set page size that
>>> +     * can be supported by IOMMU. This is called to set supported page
>>> +     * size as per host Linux.
>> What about: If supported, allows to restrict the page size mask that can be
>> supported with a given IOMMU memory region. For example, this allows to
>> propagate host physical IOMMU page size mask limitations to the virtual IOMMU
>> (vfio assignment with virtual iommu).
> 
> Much better 
> 
>>> +     */
>>> +     void (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu,
>>> +                                      uint64_t page_size_mask);
>>>  } IOMMUMemoryRegionClass;
>>>
>>>  typedef struct CoalescedMemoryRange CoalescedMemoryRange; @@ -1363,6
>>> +1373,16 @@ int
>> memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr,
>>>   */
>>>  int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr);
>>>
>>> +/**
>>> + * memory_region_iommu_set_page_size_mask: set the supported pages
>>> + * size by iommu.
>> supported page sizes for a given IOMMU memory region
>>> + *
>>> + * @iommu_mr: the memory region
>> IOMMU memory region
>>> + * @page_size_mask: supported page size mask  */ void
>>> +memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion
>> *iommu_mr,
>>> +                                            uint64_t page_size_mask);
>>> +
>>>  /**
>>>   * memory_region_name: get a memory region's name
>>>   *
>>> diff --git a/memory.c b/memory.c
>>> index aeaa8dcc9e..14c8783084 100644
>>> --- a/memory.c
>>> +++ b/memory.c
>>> @@ -1833,6 +1833,16 @@ static int
>> memory_region_update_iommu_notify_flags(IOMMUMemoryRegion
>> *iommu_mr,
>>>      return ret;
>>>  }
>>>
>>> +void memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion
>> *iommu_mr,
>>> +                                            uint64_t page_size_mask)
>>> +{
>>> +    IOMMUMemoryRegionClass *imrc =
>>> +IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
>>> +
>>> +    if (imrc->iommu_set_page_size_mask) {
>>> +        imrc->iommu_set_page_size_mask(iommu_mr, page_size_mask);
>> Shouldn't it return an int in case the setting cannot be applied?
> 
> iommu_set_page_size_mask() is setting page-size-mask for endpoint. Below function from code
> 
> 	static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
>               	                              uint64_t page_size_mask)
> 	{
>     		IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
> 
> 	              sdev->page_size_mask = page_size_mask;
> 	}
> 
> Do you see any reason it cannot be applied, am I missing something?
maybe the function can be called several times and one setting is
already in place. In other contexts, maybe the setting we try to apply
is not compatible with other settings enforced?

Thanks

Eric
> 
> Thanks
> -Bharat
> 
>>> +    }
>>> +}
>>> +
>>>  int memory_region_register_iommu_notifier(MemoryRegion *mr,
>>>                                            IOMMUNotifier *n, Error
>>> **errp)  {
>>>
>> Thanks
>> Eric
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 4/9] virtio-iommu: set supported page size mask
  2020-03-27  5:13     ` [EXT] " Bharat Bhushan
@ 2020-03-27  8:28       ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-03-27  8:28 UTC (permalink / raw)
  To: Bharat Bhushan, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, Tomasz Nowicki [C],
	drjones, linuc.decode, qemu-devel, qemu-arm, bharatb.linux,
	jean-philippe, yang.zhong

Hi Bharat,
On 3/27/20 6:13 AM, Bharat Bhushan wrote:
> Hi Eric,
> 
>> -----Original Message-----
>> From: Auger Eric <eric.auger@redhat.com>
>> Sent: Thursday, March 26, 2020 9:22 PM
>> To: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
>> peterx@redhat.com; eric.auger.pro@gmail.com; alex.williamson@redhat.com;
>> kevin.tian@intel.com; mst@redhat.com; Tomasz Nowicki [C]
>> <tnowicki@marvell.com>; drjones@redhat.com; linuc.decode@gmail.com; qemu-
>> devel@nongnu.org; qemu-arm@nongnu.org; bharatb.linux@gmail.com; jean-
>> philippe@linaro.org; yang.zhong@intel.com
>> Subject: [EXT] Re: [PATCH v9 4/9] virtio-iommu: set supported page size mask
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>> Hi Bharat,
>>
>> On 3/23/20 9:46 AM, Bharat Bhushan wrote:
>>> Add optional interface to set page size mask.
>>> Currently this is set global configuration and not per endpoint.
>> This allows to override the page size mask per end-point?
> 
> This patch adds per endpoint page-size-mask configuration in addition to global page-size-mask.
> endpoint page-size-mask will override global page-size-mask configuration for that endpoint.
Yes my comment was a suggestion for commit msg rephrasing. Sorry if it
was unclear. The above text sounds clearer.

thanks

Eric
> 
> Thanks
> -Bharat
> 
>>>
>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>> ---
>>>  include/hw/virtio/virtio-iommu.h | 1 +
>>>  hw/virtio/virtio-iommu.c         | 9 +++++++++
>>>  2 files changed, 10 insertions(+)
>>>
>>> diff --git a/include/hw/virtio/virtio-iommu.h
>>> b/include/hw/virtio/virtio-iommu.h
>>> index 6f67f1020a..4efa09610a 100644
>>> --- a/include/hw/virtio/virtio-iommu.h
>>> +++ b/include/hw/virtio/virtio-iommu.h
>>> @@ -35,6 +35,7 @@ typedef struct IOMMUDevice {
>>>      void         *viommu;
>>>      PCIBus       *bus;
>>>      int           devfn;
>>> +    uint64_t      page_size_mask;
>>>      IOMMUMemoryRegion  iommu_mr;
>>>      AddressSpace  as;
>>>  } IOMMUDevice;
>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index
>>> 4cee8083bc..a28818202c 100644
>>> --- a/hw/virtio/virtio-iommu.c
>>> +++ b/hw/virtio/virtio-iommu.c
>>> @@ -650,6 +650,14 @@ static gint int_cmp(gconstpointer a, gconstpointer b,
>> gpointer user_data)
>>>      return (ua > ub) - (ua < ub);
>>>  }
>>>
>>> +static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
>>> +                                            uint64_t page_size_mask)
>>> +{
>>> +    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
>>> +
>>> +    sdev->page_size_mask = page_size_mask; }
>>> +
>>>  static void virtio_iommu_device_realize(DeviceState *dev, Error
>>> **errp)  {
>>>      VirtIODevice *vdev = VIRTIO_DEVICE(dev); @@ -865,6 +873,7 @@
>>> static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
>>>      IOMMUMemoryRegionClass *imrc =
>> IOMMU_MEMORY_REGION_CLASS(klass);
>>>
>>>      imrc->translate = virtio_iommu_translate;
>>> +    imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
>>>  }
>>>
>>>  static const TypeInfo virtio_iommu_info = {
>>>
>> Thanks
>>
>> Eric
> 
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
  2020-03-27  5:40     ` [EXT] " Bharat Bhushan
@ 2020-03-27  8:34       ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-03-27  8:34 UTC (permalink / raw)
  To: Bharat Bhushan, peter.maydell, peterx, eric.auger.pro,
	alex.williamson, kevin.tian, mst, Tomasz Nowicki [C],
	drjones, linuc.decode, qemu-devel, qemu-arm, bharatb.linux,
	jean-philippe, yang.zhong

Hi Bharat,

On 3/27/20 6:40 AM, Bharat Bhushan wrote:
> Hi Eric,
> 
>> -----Original Message-----
>> From: Auger Eric <eric.auger@redhat.com>
>> Sent: Thursday, March 26, 2020 9:18 PM
>> To: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
>> peterx@redhat.com; eric.auger.pro@gmail.com; alex.williamson@redhat.com;
>> kevin.tian@intel.com; mst@redhat.com; Tomasz Nowicki [C]
>> <tnowicki@marvell.com>; drjones@redhat.com; linuc.decode@gmail.com; qemu-
>> devel@nongnu.org; qemu-arm@nongnu.org; bharatb.linux@gmail.com; jean-
>> philippe@linaro.org; yang.zhong@intel.com
>> Subject: [EXT] Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>> Hi Bharat
>>
>> On 3/23/20 9:46 AM, Bharat Bhushan wrote:
>>> This patch implements the PROBE request. Currently supported page size
>>> mask per endpoint is returned. Also append a NONE property in the end.
>>>
>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>> ---
>>>  include/standard-headers/linux/virtio_iommu.h |   6 +
>> Changes to virtio_iommu.h should be in a separate patch you should use
>> ./scripts/update-linux-headers.sh See for instance:
>> ddda37483d  linux-headers: update
>> until the uapi updates are not upstream you can link to your kernel branch and
>> mention this is a temporary linux header update or partial if you just want to pick
>> up the iommu.h changes.
> 
> yes, I am sorry.
no problem
> 
>>
>>>  hw/virtio/virtio-iommu.c                      | 161 +++++++++++++++++-
>>>  hw/virtio/trace-events                        |   2 +
>>>  3 files changed, 166 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/include/standard-headers/linux/virtio_iommu.h
>>> b/include/standard-headers/linux/virtio_iommu.h
>>> index b9443b83a1..8a0d47b907 100644
>>> --- a/include/standard-headers/linux/virtio_iommu.h
>>> +++ b/include/standard-headers/linux/virtio_iommu.h
>>> @@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap {
>>>
>>>  #define VIRTIO_IOMMU_PROBE_T_NONE		0
>>>  #define VIRTIO_IOMMU_PROBE_T_RESV_MEM		1
>>> +#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK	2
>>>
>>>  #define VIRTIO_IOMMU_PROBE_T_MASK		0xfff
>>>
>>> @@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem {
>>>  	uint64_t					end;
>>>  };
>>>
>>> +struct virtio_iommu_probe_pgsize_mask {
>>> +	struct virtio_iommu_probe_property      head;
>>> +	uint64_t				pgsize_bitmap;
>>> +};
>>> +
>>>  struct virtio_iommu_req_probe {
>>>  	struct virtio_iommu_req_head		head;
>>>  	uint32_t					endpoint;
>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index
>>> 747e3cf1da..63fbacdcdc 100644
>>> --- a/hw/virtio/virtio-iommu.c
>>> +++ b/hw/virtio/virtio-iommu.c
>>> @@ -38,6 +38,10 @@
>>>
>>>  /* Max size */
>>>  #define VIOMMU_DEFAULT_QUEUE_SIZE 256
>>> +#define VIOMMU_PROBE_SIZE 512
>>> +
>>> +#define SUPPORTED_PROBE_PROPERTIES (\
>>> +    1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK)
>>>
>>>  typedef struct VirtIOIOMMUDomain {
>>>      uint32_t id;
>>> @@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping {
>>>      uint32_t flags;
>>>  } VirtIOIOMMUMapping;
>>>
>>> +typedef struct VirtIOIOMMUPropBuffer {
>>> +    VirtIOIOMMUEndpoint *endpoint;
>>> +    size_t filled;
>>> +    uint8_t *start;
>>> +    bool error;
>>> +} VirtIOIOMMUPropBuffer;
>>> +
>>>  static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)  {
>>>      return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn); @@
>>> -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>>>      return ret;
>>>  }
>>>
>>> +static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer
>>> +*bufstate) {
>>> +    struct virtio_iommu_probe_property *prop;
>>> +
>>> +    prop = (struct virtio_iommu_probe_property *)
>>> +                (bufstate->start + bufstate->filled);
>>> +    prop->type = 0;
>>> +    prop->length = 0;
>>> +    bufstate->filled += sizeof(*prop);
>>> +    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
>>> +    return 0;
>>> +}
>>> +
>>> +static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer
>>> +*bufstate) {
>>> +    struct virtio_iommu_probe_pgsize_mask *page_size_mask;
>>> +    size_t prop_size = sizeof(*page_size_mask);
>>> +    VirtIOIOMMUEndpoint *ep = bufstate->endpoint;
>>> +    VirtIOIOMMU *s = ep->viommu;
>>> +    IOMMUDevice *sdev;
>>> +
>>> +    if (bufstate->filled + prop_size >= VIOMMU_PROBE_SIZE) {
>>> +        bufstate->error = true;
>>> +        /* get the traversal stopped by returning true */
>>> +        return true;
>>> +    }
>>> +
>>> +    page_size_mask = (struct virtio_iommu_probe_pgsize_mask *)
>>> +                     (bufstate->start + bufstate->filled);
>>> +
>>> +    page_size_mask->head.type = VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK;
>>> +    page_size_mask->head.length = prop_size;
>>> +    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
>>> +        if (ep->id == sdev->devfn) {
>>> +            page_size_mask->pgsize_bitmap = sdev->page_size_mask;
>>> +	}
>>> +    }
>>> +    bufstate->filled += sizeof(*page_size_mask);
>>> +    trace_virtio_iommu_fill_pgsize_mask_property(bufstate->endpoint->id,
>>> +                                                 page_size_mask->pgsize_bitmap,
>>> +                                                 bufstate->filled);
>>> +    return false;
>>> +}
>>> +
>>> +/* Fill the properties[] buffer with properties of type @type */
>>> +static int virtio_iommu_fill_property(int type,
>>> +                                      VirtIOIOMMUPropBuffer
>>> +*bufstate) {
>>> +    int ret = -ENOSPC;
>>> +
>>> +    if (bufstate->filled + sizeof(struct virtio_iommu_probe_property)
>>> +            >= VIOMMU_PROBE_SIZE) {
>>> +        /* no space left for the header */
>>> +        bufstate->error = true;
>>> +        goto out;
>>> +    }
>>> +
>>> +    switch (type) {
>>> +    case VIRTIO_IOMMU_PROBE_T_NONE:
>>> +        ret = virtio_iommu_fill_none_prop(bufstate);
>>> +        break;
>>> +    case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
>>> +    {
>>> +        ret = virtio_iommu_fill_page_size_mask(bufstate);
>> I don't think you want to fill the property of each EP. Only for those whose sdev-
>>> page_size_mask was properly set. For instance if you mix virtio and vfio devices,
>> virtio ones won't have this field set.
> 
> This something I am looking for clarification ( asked in spec document patch). Copy pasting same here:
> 
> What some clarification about two page-size-mask configurations available.
>  - Global configuration for page-size-mask
>  - per endpoint page-size-mask configuration
> 
> PAGE_SIZE_MASK probe for and endpoint can return zero or non-zero value.
> If it returns non-zero value than it will override the global configuration.
> If PAGE_SIZE_MASK probe for and endpoint return zero value than global page-size-mask configuration will be used.
Hum, that's not my current understanding. I understand it would also
override the global page size mask with 0. Hence my suggestion to *not*
return the property in that case.

Thanks

Eric
> 
> Is that correct?
> 
> Thanks
> -Bharat
> 
>>> +	break;
>>> +    }
>>> +    default:
>>> +        ret = -ENOENT;
>>> +        break;
>>> +    }
>>> +out:
>>> +    if (ret) {
>>> +        error_report("%s property of type=%d could not be filled (%d),"
>>> +                     " remaining size = 0x%lx",
>>> +                     __func__, type, ret, bufstate->filled);
>>> +    }
>>> +    return ret;
>>> +}
>>> +
>>> +/**
>>> + * virtio_iommu_probe - Fill the probe request buffer with all
>>> + * the properties the device is able to return and add a NONE
>>> + * property at the end. @buf points to properties[].
>>> + */
>>> +static int virtio_iommu_probe(VirtIOIOMMU *s,
>>> +                              struct virtio_iommu_req_probe *req,
>>> +                              uint8_t *buf) {
>>> +    uint32_t ep_id = le32_to_cpu(req->endpoint);
>>> +    VirtIOIOMMUEndpoint *ep = virtio_iommu_get_endpoint(s, ep_id);
>>> +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
>>> +    VirtIOIOMMUPropBuffer bufstate = {.start = buf, .filled = 0,
>>> +                                       .error = false, .endpoint =
>>> +ep};
>>> +
>>> +    while ((type = ctz32(prop_types)) != 32) {
>>> +        if (virtio_iommu_fill_property(type, &bufstate)) {
>>> +            goto failure;
>>> +        }
>>> +        prop_types &= ~(1 << type);
>>> +    }
>>> +    if (virtio_iommu_fill_property(VIRTIO_IOMMU_PROBE_T_NONE, &bufstate))
>> {
>>> +        goto failure;
>>> +    }
>>> +    return VIRTIO_IOMMU_S_OK;
>>> +failure:
>>> +    return VIRTIO_IOMMU_S_INVAL;
>>> +}
>>> +
>>>  static int virtio_iommu_iov_to_req(struct iovec *iov,
>>>                                     unsigned int iov_cnt,
>>>                                     void *req, size_t req_sz) @@
>>> -519,6 +638,17 @@ virtio_iommu_handle_req(detach)
>>>  virtio_iommu_handle_req(map)
>>>  virtio_iommu_handle_req(unmap)
>>>
>>> +static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
>>> +                                     struct iovec *iov,
>>> +                                     unsigned int iov_cnt,
>>> +                                     uint8_t *buf) {
>>> +    struct virtio_iommu_req_probe req;
>>> +    int ret = virtio_iommu_iov_to_req(iov, iov_cnt, &req,
>>> +sizeof(req));
>>> +
>>> +    return ret ? ret : virtio_iommu_probe(s, &req, buf); }
>>> +
>>>  static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue
>>> *vq)  {
>>>      VirtIOIOMMU *s = VIRTIO_IOMMU(vdev); @@ -564,17 +694,33 @@ static
>>> void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>>>          case VIRTIO_IOMMU_T_UNMAP:
>>>              tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
>>>              break;
>>> +        case VIRTIO_IOMMU_T_PROBE:
>>> +        {
>>> +            struct virtio_iommu_req_tail *ptail;
>>> +            uint8_t *buf = g_malloc0(s->config.probe_size +
>>> + sizeof(tail));
>>> +
>>> +            ptail = (struct virtio_iommu_req_tail *)
>>> +                        (buf + s->config.probe_size);
>>> +            ptail->status = virtio_iommu_handle_probe(s, iov,
>>> + iov_cnt, buf);
>>> +
>>> +            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
>>> +                              buf, s->config.probe_size + sizeof(tail));
>>> +            g_free(buf);
>>> +            assert(sz == s->config.probe_size + sizeof(tail));
>>> +            goto push;
>>> +        }
>>>          default:
>>>              tail.status = VIRTIO_IOMMU_S_UNSUPP;
>>>          }
>>> -        qemu_mutex_unlock(&s->mutex);
>>>
>>>  out:
>>>          sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
>>>                            &tail, sizeof(tail));
>>>          assert(sz == sizeof(tail));
>>>
>>> -        virtqueue_push(vq, elem, sizeof(tail));
>>> +push:
>>> +        qemu_mutex_unlock(&s->mutex);
>>> +        virtqueue_push(vq, elem, sz);
>>>          virtio_notify(vdev, vq);
>>>          g_free(elem);
>>>      }
>>> @@ -634,16 +780,23 @@ static IOMMUTLBEntry
>> virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>>>      VirtIOIOMMUEndpoint *ep;
>>>      uint32_t sid, flags;
>>>      bool bypass_allowed;
>>> +    hwaddr addr_mask;
>>>      bool found;
>>>
>>>      interval.low = addr;
>>>      interval.high = addr + 1;
>>>
>>> +    if (sdev->page_size_mask) {
>>> +        addr_mask = (1 << ctz32(sdev->page_size_mask)) - 1;
>>> +    } else {
>>> +        addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1;
>>> +    }
>> This change does not belong ot this patch. Rather belongs to
>> "virtio-iommu: set supported page size mask"
> 
> Thanks for pointing, will change.
> 
> Thanks
> -Bharat
> 
>>> +
>>>      IOMMUTLBEntry entry = {
>>>          .target_as = &address_space_memory,
>>>          .iova = addr,
>>>          .translated_addr = addr,
>>> -        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
>>> +        .addr_mask = addr_mask,
>>>          .perm = IOMMU_NONE,
>>>      };
>>>
>>> @@ -831,6 +984,7 @@ static void virtio_iommu_device_realize(DeviceState
>> *dev, Error **errp)
>>>      s->config.page_size_mask = TARGET_PAGE_MASK;
>>>      s->config.input_range.end = -1UL;
>>>      s->config.domain_range.end = 32;
>>> +    s->config.probe_size = VIOMMU_PROBE_SIZE;
>>>
>>>      virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
>>>      virtio_add_feature(&s->features, VIRTIO_RING_F_INDIRECT_DESC); @@
>>> -840,6 +994,7 @@ static void virtio_iommu_device_realize(DeviceState *dev,
>> Error **errp)
>>>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MAP_UNMAP);
>>>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS);
>>>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
>>> +    virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
>>>
>>>      qemu_mutex_init(&s->mutex);
>>>
>>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index
>>> 8bae651191..b0a6e4bda3 100644
>>> --- a/hw/virtio/trace-events
>>> +++ b/hw/virtio/trace-events
>>> @@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t
>>> flags, uint32_t endpoint, uin  virtio_iommu_notify_map(const char
>>> *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s
>>> iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
>> virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size)
>> "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64  virtio_iommu_remap(uint64_t iova,
>> uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
>>> +virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
>>> +virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask,
>> size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
>>>
>> Thanks
>>
>> Eric
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-03-26 17:53       ` Alex Williamson
  2020-03-27  5:50         ` [EXT] " Bharat Bhushan
@ 2020-04-02  9:01         ` Bharat Bhushan
  2020-04-24 14:17           ` Auger Eric
  1 sibling, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-04-02  9:01 UTC (permalink / raw)
  To: Alex Williamson, Auger Eric
  Cc: yang.zhong, peter.maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, bharatb.linux, qemu-arm,
	jean-philippe, linuc.decode, David Gibson, eric.auger.pro

Hi Eric/Alex,

> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Thursday, March 26, 2020 11:23 PM
> To: Auger Eric <eric.auger@redhat.com>
> Cc: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
> peterx@redhat.com; eric.auger.pro@gmail.com; kevin.tian@intel.com;
> mst@redhat.com; Tomasz Nowicki [C] <tnowicki@marvell.com>;
> drjones@redhat.com; linuc.decode@gmail.com; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org; bharatb.linux@gmail.com; jean-philippe@linaro.org;
> yang.zhong@intel.com; David Gibson <david@gibson.dropbear.id.au>
> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio
> region translation by viommu
> 
> External Email
> 
> ----------------------------------------------------------------------
> On Thu, 26 Mar 2020 18:35:48 +0100
> Auger Eric <eric.auger@redhat.com> wrote:
> 
> > Hi Alex,
> >
> > On 3/24/20 12:08 AM, Alex Williamson wrote:
> > > [Cc +dwg who originated this warning]
> > >
> > > On Mon, 23 Mar 2020 14:16:09 +0530
> > > Bharat Bhushan <bbhushan2@marvell.com> wrote:
> > >
> > >> On ARM, the MSI doorbell is translated by the virtual IOMMU.
> > >> As such address_space_translate() returns the MSI controller MMIO
> > >> region and we get an "iommu map to non memory area"
> > >> message. Let's remove this latter.
> > >>
> > >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > >> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> > >> ---
> > >>  hw/vfio/common.c | 2 --
> > >>  1 file changed, 2 deletions(-)
> > >>
> > >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
> > >> 5ca11488d6..c586edf47a 100644
> > >> --- a/hw/vfio/common.c
> > >> +++ b/hw/vfio/common.c
> > >> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb,
> void **vaddr,
> > >>                                   &xlat, &len, writable,
> > >>                                   MEMTXATTRS_UNSPECIFIED);
> > >>      if (!memory_region_is_ram(mr)) {
> > >> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> > >> -                     xlat);
> > >>          return false;
> > >>      }
> > >>
> > >
> > > I'm a bit confused here, I think we need more justification beyond
> > > "we hit this warning and we don't want to because it's ok in this
> > > one special case, therefore remove it".  I assume the special case
> > > is that the device MSI address is managed via the SET_IRQS ioctl and
> > > therefore we won't actually get DMAs to this range.
> > Yes exactly. The guest creates a mapping between one giova and this
> > gpa (corresponding to the MSI controller doorbell) because MSIs are
> > mapped on ARM. But practically the physical device is programmed with
> > an host chosen iova that maps onto the physical MSI controller's
> > doorbell. so the device never performs DMA accesses to this range.
> >
> >   But I imagine the case that
> > > was in mind when adding this warning was general peer-to-peer
> > > between and assigned and emulated device.
> > yes makes sense.
> >
> >   Maybe there's an argument to be made
> > > that such a p2p mapping might also be used in a non-vIOMMU case.  We
> > > skip creating those mappings and drivers continue to work, maybe
> > > because nobody attempts to do p2p DMA with the types of devices we
> > > emulate, maybe because p2p DMA is not absolutely reliable on bare
> > > metal and drivers test it before using it.
> > MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
> > iommu_dma_get_msi_page).
> > One idea could be to pass that flag through the IOMMU Notifier
> > mechanism into the iotlb->perm. Eventually when we get this in
> > vfio_get_vaddr() we would not print the warning. Could that make sense?
> 
> Yeah, if we can identify a valid case that doesn't need a warning, that's fine by me.
> Thanks,

Let me know if I understood the proposal correctly:

virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO) in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check same flag and will not print the warning.

Is above correct?

Thanks
-Bharat

> 
> Alex



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
  2020-03-23  8:46 ` [PATCH v9 8/9] virtio-iommu: Implement probe request Bharat Bhushan
  2020-03-26 15:48   ` Auger Eric
@ 2020-04-23 16:09   ` Jean-Philippe Brucker
  2020-04-24 13:51     ` Auger Eric
  1 sibling, 1 reply; 37+ messages in thread
From: Jean-Philippe Brucker @ 2020-04-23 16:09 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: yang.zhong, peter.maydell, kevin.tian, tnowicki, mst, drjones,
	peterx, qemu-devel, Eric Auger, alex.williamson, qemu-arm,
	bharatb.linux, linuc.decode, eric.auger.pro

Hi Bharat,

A few more things found while rebasing

On Mon, Mar 23, 2020 at 02:16:16PM +0530, Bharat Bhushan wrote:
> This patch implements the PROBE request. Currently supported
> page size mask per endpoint is returned. Also append a NONE
> property in the end.
> 
> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  include/standard-headers/linux/virtio_iommu.h |   6 +
>  hw/virtio/virtio-iommu.c                      | 161 +++++++++++++++++-
>  hw/virtio/trace-events                        |   2 +
>  3 files changed, 166 insertions(+), 3 deletions(-)
> 
> diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
> index b9443b83a1..8a0d47b907 100644
> --- a/include/standard-headers/linux/virtio_iommu.h
> +++ b/include/standard-headers/linux/virtio_iommu.h
> @@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap {
>  
>  #define VIRTIO_IOMMU_PROBE_T_NONE		0
>  #define VIRTIO_IOMMU_PROBE_T_RESV_MEM		1
> +#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK	2
>  
>  #define VIRTIO_IOMMU_PROBE_T_MASK		0xfff
>  
> @@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem {
>  	uint64_t					end;
>  };
>  
> +struct virtio_iommu_probe_pgsize_mask {
> +	struct virtio_iommu_probe_property      head;
> +	uint64_t				pgsize_bitmap;
> +};
> +
>  struct virtio_iommu_req_probe {
>  	struct virtio_iommu_req_head		head;
>  	uint32_t					endpoint;
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 747e3cf1da..63fbacdcdc 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -38,6 +38,10 @@
>  
>  /* Max size */
>  #define VIOMMU_DEFAULT_QUEUE_SIZE 256
> +#define VIOMMU_PROBE_SIZE 512
> +
> +#define SUPPORTED_PROBE_PROPERTIES (\
> +    1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK)
>  
>  typedef struct VirtIOIOMMUDomain {
>      uint32_t id;
> @@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping {
>      uint32_t flags;
>  } VirtIOIOMMUMapping;
>  
> +typedef struct VirtIOIOMMUPropBuffer {
> +    VirtIOIOMMUEndpoint *endpoint;
> +    size_t filled;
> +    uint8_t *start;
> +    bool error;

It doesn't seem like bufstate->error gets used anywhere

> +} VirtIOIOMMUPropBuffer;
> +
>  static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
>  {
>      return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
> @@ -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>      return ret;
>  }
>  
> +static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer *bufstate)
> +{
> +    struct virtio_iommu_probe_property *prop;
> +
> +    prop = (struct virtio_iommu_probe_property *)
> +                (bufstate->start + bufstate->filled);
> +    prop->type = 0;
> +    prop->length = 0;
> +    bufstate->filled += sizeof(*prop);
> +    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
> +    return 0;
> +}
> +
> +static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer *bufstate)
> +{
> +    struct virtio_iommu_probe_pgsize_mask *page_size_mask;
> +    size_t prop_size = sizeof(*page_size_mask);
> +    VirtIOIOMMUEndpoint *ep = bufstate->endpoint;
> +    VirtIOIOMMU *s = ep->viommu;
> +    IOMMUDevice *sdev;
> +
> +    if (bufstate->filled + prop_size >= VIOMMU_PROBE_SIZE) {
> +        bufstate->error = true;
> +        /* get the traversal stopped by returning true */
> +        return true;
> +    }
> +
> +    page_size_mask = (struct virtio_iommu_probe_pgsize_mask *)
> +                     (bufstate->start + bufstate->filled);
> +
> +    page_size_mask->head.type = VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK;
> +    page_size_mask->head.length = prop_size;
> +    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
> +        if (ep->id == sdev->devfn) {
> +            page_size_mask->pgsize_bitmap = sdev->page_size_mask;

Do we need a cpu_to_le64 here?

> +	}
> +    }
> +    bufstate->filled += sizeof(*page_size_mask);
> +    trace_virtio_iommu_fill_pgsize_mask_property(bufstate->endpoint->id,
> +                                                 page_size_mask->pgsize_bitmap,
> +                                                 bufstate->filled);
> +    return false;
> +}
> +
> +/* Fill the properties[] buffer with properties of type @type */
> +static int virtio_iommu_fill_property(int type,
> +                                      VirtIOIOMMUPropBuffer *bufstate)
> +{
> +    int ret = -ENOSPC;
> +
> +    if (bufstate->filled + sizeof(struct virtio_iommu_probe_property)
> +            >= VIOMMU_PROBE_SIZE) {
> +        /* no space left for the header */
> +        bufstate->error = true;
> +        goto out;
> +    }
> +
> +    switch (type) {
> +    case VIRTIO_IOMMU_PROBE_T_NONE:
> +        ret = virtio_iommu_fill_none_prop(bufstate);
> +        break;
> +    case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
> +    {
> +        ret = virtio_iommu_fill_page_size_mask(bufstate);
> +	break;
> +    }
> +    default:
> +        ret = -ENOENT;
> +        break;
> +    }
> +out:
> +    if (ret) {
> +        error_report("%s property of type=%d could not be filled (%d),"
> +                     " remaining size = 0x%lx",
> +                     __func__, type, ret, bufstate->filled);
> +    }
> +    return ret;
> +}
> +
> +/**
> + * virtio_iommu_probe - Fill the probe request buffer with all
> + * the properties the device is able to return and add a NONE
> + * property at the end. @buf points to properties[].
> + */
> +static int virtio_iommu_probe(VirtIOIOMMU *s,
> +                              struct virtio_iommu_req_probe *req,
> +                              uint8_t *buf)
> +{
> +    uint32_t ep_id = le32_to_cpu(req->endpoint);
> +    VirtIOIOMMUEndpoint *ep = virtio_iommu_get_endpoint(s, ep_id);
> +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
> +    VirtIOIOMMUPropBuffer bufstate = {.start = buf, .filled = 0,
> +                                       .error = false, .endpoint = ep};

Probably need to check if ep is valid before going further

> +
> +    while ((type = ctz32(prop_types)) != 32) {
> +        if (virtio_iommu_fill_property(type, &bufstate)) {
> +            goto failure;
> +        }
> +        prop_types &= ~(1 << type);
> +    }
> +    if (virtio_iommu_fill_property(VIRTIO_IOMMU_PROBE_T_NONE, &bufstate)) {
> +        goto failure;
> +    }

We got rid of the NONE property in the spec, now there is: 

 "If the device doesn’t fill all probe_size bytes with properties, it
  SHOULD fill the remaining bytes of properties with zeroes."

So I think you can get rid of virtio_iommu_fill_none_prop() and fill the
rest of the buffer with zeroes here instead.

Thanks,
Jean

> +    return VIRTIO_IOMMU_S_OK;
> +failure:
> +    return VIRTIO_IOMMU_S_INVAL;
> +}
> +
>  static int virtio_iommu_iov_to_req(struct iovec *iov,
>                                     unsigned int iov_cnt,
>                                     void *req, size_t req_sz)
> @@ -519,6 +638,17 @@ virtio_iommu_handle_req(detach)
>  virtio_iommu_handle_req(map)
>  virtio_iommu_handle_req(unmap)
>  
> +static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
> +                                     struct iovec *iov,
> +                                     unsigned int iov_cnt,
> +                                     uint8_t *buf)
> +{
> +    struct virtio_iommu_req_probe req;
> +    int ret = virtio_iommu_iov_to_req(iov, iov_cnt, &req, sizeof(req));
> +
> +    return ret ? ret : virtio_iommu_probe(s, &req, buf);
> +}
> +
>  static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>  {
>      VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
> @@ -564,17 +694,33 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>          case VIRTIO_IOMMU_T_UNMAP:
>              tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
>              break;
> +        case VIRTIO_IOMMU_T_PROBE:
> +        {
> +            struct virtio_iommu_req_tail *ptail;
> +            uint8_t *buf = g_malloc0(s->config.probe_size + sizeof(tail));
> +
> +            ptail = (struct virtio_iommu_req_tail *)
> +                        (buf + s->config.probe_size);
> +            ptail->status = virtio_iommu_handle_probe(s, iov, iov_cnt, buf);
> +
> +            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
> +                              buf, s->config.probe_size + sizeof(tail));
> +            g_free(buf);
> +            assert(sz == s->config.probe_size + sizeof(tail));
> +            goto push;
> +        }
>          default:
>              tail.status = VIRTIO_IOMMU_S_UNSUPP;
>          }
> -        qemu_mutex_unlock(&s->mutex);
>  
>  out:
>          sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
>                            &tail, sizeof(tail));
>          assert(sz == sizeof(tail));
>  
> -        virtqueue_push(vq, elem, sizeof(tail));
> +push:
> +        qemu_mutex_unlock(&s->mutex);
> +        virtqueue_push(vq, elem, sz);
>          virtio_notify(vdev, vq);
>          g_free(elem);
>      }
> @@ -634,16 +780,23 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>      VirtIOIOMMUEndpoint *ep;
>      uint32_t sid, flags;
>      bool bypass_allowed;
> +    hwaddr addr_mask;
>      bool found;
>  
>      interval.low = addr;
>      interval.high = addr + 1;
>  
> +    if (sdev->page_size_mask) {
> +        addr_mask = (1 << ctz32(sdev->page_size_mask)) - 1;
> +    } else {
> +        addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1;
> +    }
> +
>      IOMMUTLBEntry entry = {
>          .target_as = &address_space_memory,
>          .iova = addr,
>          .translated_addr = addr,
> -        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
> +        .addr_mask = addr_mask,
>          .perm = IOMMU_NONE,
>      };
>  
> @@ -831,6 +984,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>      s->config.page_size_mask = TARGET_PAGE_MASK;
>      s->config.input_range.end = -1UL;
>      s->config.domain_range.end = 32;
> +    s->config.probe_size = VIOMMU_PROBE_SIZE;
>  
>      virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
>      virtio_add_feature(&s->features, VIRTIO_RING_F_INDIRECT_DESC);
> @@ -840,6 +994,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MAP_UNMAP);
>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS);
>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
> +    virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
>  
>      qemu_mutex_init(&s->mutex);
>  
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 8bae651191..b0a6e4bda3 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uin
>  virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
>  virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
>  virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> +virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
> +virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask, size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
  2020-04-23 16:09   ` Jean-Philippe Brucker
@ 2020-04-24 13:51     ` Auger Eric
  2020-05-05  9:06       ` Bharat Bhushan
  0 siblings, 1 reply; 37+ messages in thread
From: Auger Eric @ 2020-04-24 13:51 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Bharat Bhushan
  Cc: yang.zhong, peter.maydell, kevin.tian, tnowicki, mst, drjones,
	peterx, qemu-devel, alex.williamson, qemu-arm, bharatb.linux,
	linuc.decode, eric.auger.pro

Hi Bharat,
On 4/23/20 6:09 PM, Jean-Philippe Brucker wrote:
> Hi Bharat,
> 
> A few more things found while rebasing
> 
> On Mon, Mar 23, 2020 at 02:16:16PM +0530, Bharat Bhushan wrote:
>> This patch implements the PROBE request. Currently supported
>> page size mask per endpoint is returned. Also append a NONE
>> property in the end.
>>
>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  include/standard-headers/linux/virtio_iommu.h |   6 +
>>  hw/virtio/virtio-iommu.c                      | 161 +++++++++++++++++-
>>  hw/virtio/trace-events                        |   2 +
>>  3 files changed, 166 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
>> index b9443b83a1..8a0d47b907 100644
>> --- a/include/standard-headers/linux/virtio_iommu.h
>> +++ b/include/standard-headers/linux/virtio_iommu.h
>> @@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap {
>>  
>>  #define VIRTIO_IOMMU_PROBE_T_NONE		0
>>  #define VIRTIO_IOMMU_PROBE_T_RESV_MEM		1
>> +#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK	2
>>  
>>  #define VIRTIO_IOMMU_PROBE_T_MASK		0xfff
>>  
>> @@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem {
>>  	uint64_t					end;
>>  };
>>  
>> +struct virtio_iommu_probe_pgsize_mask {
>> +	struct virtio_iommu_probe_property      head;
>> +	uint64_t				pgsize_bitmap;
>> +};
>> +
>>  struct virtio_iommu_req_probe {
>>  	struct virtio_iommu_req_head		head;
>>  	uint32_t					endpoint;
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index 747e3cf1da..63fbacdcdc 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -38,6 +38,10 @@
>>  
>>  /* Max size */
>>  #define VIOMMU_DEFAULT_QUEUE_SIZE 256
>> +#define VIOMMU_PROBE_SIZE 512
>> +
>> +#define SUPPORTED_PROBE_PROPERTIES (\
>> +    1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK)
>>  
>>  typedef struct VirtIOIOMMUDomain {
>>      uint32_t id;
>> @@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping {
>>      uint32_t flags;
>>  } VirtIOIOMMUMapping;
>>  
>> +typedef struct VirtIOIOMMUPropBuffer {
>> +    VirtIOIOMMUEndpoint *endpoint;
>> +    size_t filled;
>> +    uint8_t *start;
>> +    bool error;
> 
> It doesn't seem like bufstate->error gets used anywhere
maybe rebase your work on
[PATCH for-4.2 v10 10/15] virtio-iommu: Implement probe request
which tests it.

Also in
[Qemu-devel] [PATCH for-4.2 v10 10/15] virtio-iommu: Implement probe request
I changed the implementation to keep it simpler.

Thanks

Eric
> 
>> +} VirtIOIOMMUPropBuffer;
>> +
>>  static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
>>  {
>>      return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
>> @@ -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>>      return ret;
>>  }
>>  
>> +static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer *bufstate)
>> +{
>> +    struct virtio_iommu_probe_property *prop;
>> +
>> +    prop = (struct virtio_iommu_probe_property *)
>> +                (bufstate->start + bufstate->filled);
>> +    prop->type = 0;
>> +    prop->length = 0;
>> +    bufstate->filled += sizeof(*prop);
>> +    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
>> +    return 0;
>> +}
>> +
>> +static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer *bufstate)
>> +{
>> +    struct virtio_iommu_probe_pgsize_mask *page_size_mask;
>> +    size_t prop_size = sizeof(*page_size_mask);
>> +    VirtIOIOMMUEndpoint *ep = bufstate->endpoint;
>> +    VirtIOIOMMU *s = ep->viommu;
>> +    IOMMUDevice *sdev;
>> +
>> +    if (bufstate->filled + prop_size >= VIOMMU_PROBE_SIZE) {
>> +        bufstate->error = true;
>> +        /* get the traversal stopped by returning true */
>> +        return true;
>> +    }
>> +
>> +    page_size_mask = (struct virtio_iommu_probe_pgsize_mask *)
>> +                     (bufstate->start + bufstate->filled);
>> +
>> +    page_size_mask->head.type = VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK;
>> +    page_size_mask->head.length = prop_size;
>> +    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
>> +        if (ep->id == sdev->devfn) {
>> +            page_size_mask->pgsize_bitmap = sdev->page_size_mask;
> 
> Do we need a cpu_to_le64 here?
> 
>> +	}
>> +    }
>> +    bufstate->filled += sizeof(*page_size_mask);
>> +    trace_virtio_iommu_fill_pgsize_mask_property(bufstate->endpoint->id,
>> +                                                 page_size_mask->pgsize_bitmap,
>> +                                                 bufstate->filled);
>> +    return false;
>> +}
>> +
>> +/* Fill the properties[] buffer with properties of type @type */
>> +static int virtio_iommu_fill_property(int type,
>> +                                      VirtIOIOMMUPropBuffer *bufstate)
>> +{
>> +    int ret = -ENOSPC;
>> +
>> +    if (bufstate->filled + sizeof(struct virtio_iommu_probe_property)
>> +            >= VIOMMU_PROBE_SIZE) {
>> +        /* no space left for the header */
>> +        bufstate->error = true;
>> +        goto out;
>> +    }
>> +
>> +    switch (type) {
>> +    case VIRTIO_IOMMU_PROBE_T_NONE:
>> +        ret = virtio_iommu_fill_none_prop(bufstate);
>> +        break;
>> +    case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
>> +    {
>> +        ret = virtio_iommu_fill_page_size_mask(bufstate);
>> +	break;
>> +    }
>> +    default:
>> +        ret = -ENOENT;
>> +        break;
>> +    }
>> +out:
>> +    if (ret) {
>> +        error_report("%s property of type=%d could not be filled (%d),"
>> +                     " remaining size = 0x%lx",
>> +                     __func__, type, ret, bufstate->filled);
>> +    }
>> +    return ret;
>> +}
>> +
>> +/**
>> + * virtio_iommu_probe - Fill the probe request buffer with all
>> + * the properties the device is able to return and add a NONE
>> + * property at the end. @buf points to properties[].
>> + */
>> +static int virtio_iommu_probe(VirtIOIOMMU *s,
>> +                              struct virtio_iommu_req_probe *req,
>> +                              uint8_t *buf)
>> +{
>> +    uint32_t ep_id = le32_to_cpu(req->endpoint);
>> +    VirtIOIOMMUEndpoint *ep = virtio_iommu_get_endpoint(s, ep_id);
>> +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
>> +    VirtIOIOMMUPropBuffer bufstate = {.start = buf, .filled = 0,
>> +                                       .error = false, .endpoint = ep};
> 
> Probably need to check if ep is valid before going further
> 
>> +
>> +    while ((type = ctz32(prop_types)) != 32) {
>> +        if (virtio_iommu_fill_property(type, &bufstate)) {
>> +            goto failure;
>> +        }
>> +        prop_types &= ~(1 << type);
>> +    }
>> +    if (virtio_iommu_fill_property(VIRTIO_IOMMU_PROBE_T_NONE, &bufstate)) {
>> +        goto failure;
>> +    }
> 
> We got rid of the NONE property in the spec, now there is: 
> 
>  "If the device doesn’t fill all probe_size bytes with properties, it
>   SHOULD fill the remaining bytes of properties with zeroes."
> 
> So I think you can get rid of virtio_iommu_fill_none_prop() and fill the
> rest of the buffer with zeroes here instead.
> 
> Thanks,
> Jean
> 
>> +    return VIRTIO_IOMMU_S_OK;
>> +failure:
>> +    return VIRTIO_IOMMU_S_INVAL;
>> +}
>> +
>>  static int virtio_iommu_iov_to_req(struct iovec *iov,
>>                                     unsigned int iov_cnt,
>>                                     void *req, size_t req_sz)
>> @@ -519,6 +638,17 @@ virtio_iommu_handle_req(detach)
>>  virtio_iommu_handle_req(map)
>>  virtio_iommu_handle_req(unmap)
>>  
>> +static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
>> +                                     struct iovec *iov,
>> +                                     unsigned int iov_cnt,
>> +                                     uint8_t *buf)
>> +{
>> +    struct virtio_iommu_req_probe req;
>> +    int ret = virtio_iommu_iov_to_req(iov, iov_cnt, &req, sizeof(req));
>> +
>> +    return ret ? ret : virtio_iommu_probe(s, &req, buf);
>> +}
>> +
>>  static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>>  {
>>      VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
>> @@ -564,17 +694,33 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>>          case VIRTIO_IOMMU_T_UNMAP:
>>              tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
>>              break;
>> +        case VIRTIO_IOMMU_T_PROBE:
>> +        {
>> +            struct virtio_iommu_req_tail *ptail;
>> +            uint8_t *buf = g_malloc0(s->config.probe_size + sizeof(tail));
>> +
>> +            ptail = (struct virtio_iommu_req_tail *)
>> +                        (buf + s->config.probe_size);
>> +            ptail->status = virtio_iommu_handle_probe(s, iov, iov_cnt, buf);
>> +
>> +            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
>> +                              buf, s->config.probe_size + sizeof(tail));
>> +            g_free(buf);
>> +            assert(sz == s->config.probe_size + sizeof(tail));
>> +            goto push;
>> +        }
>>          default:
>>              tail.status = VIRTIO_IOMMU_S_UNSUPP;
>>          }
>> -        qemu_mutex_unlock(&s->mutex);
>>  
>>  out:
>>          sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
>>                            &tail, sizeof(tail));
>>          assert(sz == sizeof(tail));
>>  
>> -        virtqueue_push(vq, elem, sizeof(tail));
>> +push:
>> +        qemu_mutex_unlock(&s->mutex);
>> +        virtqueue_push(vq, elem, sz);
>>          virtio_notify(vdev, vq);
>>          g_free(elem);
>>      }
>> @@ -634,16 +780,23 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>>      VirtIOIOMMUEndpoint *ep;
>>      uint32_t sid, flags;
>>      bool bypass_allowed;
>> +    hwaddr addr_mask;
>>      bool found;
>>  
>>      interval.low = addr;
>>      interval.high = addr + 1;
>>  
>> +    if (sdev->page_size_mask) {
>> +        addr_mask = (1 << ctz32(sdev->page_size_mask)) - 1;
>> +    } else {
>> +        addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1;
>> +    }
>> +
>>      IOMMUTLBEntry entry = {
>>          .target_as = &address_space_memory,
>>          .iova = addr,
>>          .translated_addr = addr,
>> -        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
>> +        .addr_mask = addr_mask,
>>          .perm = IOMMU_NONE,
>>      };
>>  
>> @@ -831,6 +984,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>      s->config.page_size_mask = TARGET_PAGE_MASK;
>>      s->config.input_range.end = -1UL;
>>      s->config.domain_range.end = 32;
>> +    s->config.probe_size = VIOMMU_PROBE_SIZE;
>>  
>>      virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
>>      virtio_add_feature(&s->features, VIRTIO_RING_F_INDIRECT_DESC);
>> @@ -840,6 +994,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MAP_UNMAP);
>>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS);
>>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
>> +    virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
>>  
>>      qemu_mutex_init(&s->mutex);
>>  
>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
>> index 8bae651191..b0a6e4bda3 100644
>> --- a/hw/virtio/trace-events
>> +++ b/hw/virtio/trace-events
>> @@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uin
>>  virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
>>  virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
>>  virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
>> +virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
>> +virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask, size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
>> -- 
>> 2.17.1
>>
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-04-02  9:01         ` Bharat Bhushan
@ 2020-04-24 14:17           ` Auger Eric
  2020-05-05  9:25             ` Bharat Bhushan
  0 siblings, 1 reply; 37+ messages in thread
From: Auger Eric @ 2020-04-24 14:17 UTC (permalink / raw)
  To: Bharat Bhushan, Alex Williamson
  Cc: yang.zhong, peter.maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, bharatb.linux, qemu-arm,
	jean-philippe, linuc.decode, eric.auger.pro, David Gibson

Hi Bharat,

On 4/2/20 11:01 AM, Bharat Bhushan wrote:
> Hi Eric/Alex,
> 
>> -----Original Message-----
>> From: Alex Williamson <alex.williamson@redhat.com>
>> Sent: Thursday, March 26, 2020 11:23 PM
>> To: Auger Eric <eric.auger@redhat.com>
>> Cc: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
>> peterx@redhat.com; eric.auger.pro@gmail.com; kevin.tian@intel.com;
>> mst@redhat.com; Tomasz Nowicki [C] <tnowicki@marvell.com>;
>> drjones@redhat.com; linuc.decode@gmail.com; qemu-devel@nongnu.org; qemu-
>> arm@nongnu.org; bharatb.linux@gmail.com; jean-philippe@linaro.org;
>> yang.zhong@intel.com; David Gibson <david@gibson.dropbear.id.au>
>> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio
>> region translation by viommu
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>> On Thu, 26 Mar 2020 18:35:48 +0100
>> Auger Eric <eric.auger@redhat.com> wrote:
>>
>>> Hi Alex,
>>>
>>> On 3/24/20 12:08 AM, Alex Williamson wrote:
>>>> [Cc +dwg who originated this warning]
>>>>
>>>> On Mon, 23 Mar 2020 14:16:09 +0530
>>>> Bharat Bhushan <bbhushan2@marvell.com> wrote:
>>>>
>>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU.
>>>>> As such address_space_translate() returns the MSI controller MMIO
>>>>> region and we get an "iommu map to non memory area"
>>>>> message. Let's remove this latter.
>>>>>
>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>>>> ---
>>>>>  hw/vfio/common.c | 2 --
>>>>>  1 file changed, 2 deletions(-)
>>>>>
>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
>>>>> 5ca11488d6..c586edf47a 100644
>>>>> --- a/hw/vfio/common.c
>>>>> +++ b/hw/vfio/common.c
>>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb,
>> void **vaddr,
>>>>>                                   &xlat, &len, writable,
>>>>>                                   MEMTXATTRS_UNSPECIFIED);
>>>>>      if (!memory_region_is_ram(mr)) {
>>>>> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
>>>>> -                     xlat);
>>>>>          return false;
>>>>>      }
>>>>>
>>>>
>>>> I'm a bit confused here, I think we need more justification beyond
>>>> "we hit this warning and we don't want to because it's ok in this
>>>> one special case, therefore remove it".  I assume the special case
>>>> is that the device MSI address is managed via the SET_IRQS ioctl and
>>>> therefore we won't actually get DMAs to this range.
>>> Yes exactly. The guest creates a mapping between one giova and this
>>> gpa (corresponding to the MSI controller doorbell) because MSIs are
>>> mapped on ARM. But practically the physical device is programmed with
>>> an host chosen iova that maps onto the physical MSI controller's
>>> doorbell. so the device never performs DMA accesses to this range.
>>>
>>>   But I imagine the case that
>>>> was in mind when adding this warning was general peer-to-peer
>>>> between and assigned and emulated device.
>>> yes makes sense.
>>>
>>>   Maybe there's an argument to be made
>>>> that such a p2p mapping might also be used in a non-vIOMMU case.  We
>>>> skip creating those mappings and drivers continue to work, maybe
>>>> because nobody attempts to do p2p DMA with the types of devices we
>>>> emulate, maybe because p2p DMA is not absolutely reliable on bare
>>>> metal and drivers test it before using it.
>>> MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
>>> iommu_dma_get_msi_page).
>>> One idea could be to pass that flag through the IOMMU Notifier
>>> mechanism into the iotlb->perm. Eventually when we get this in
>>> vfio_get_vaddr() we would not print the warning. Could that make sense?
>>
>> Yeah, if we can identify a valid case that doesn't need a warning, that's fine by me.
>> Thanks,
> 
> Let me know if I understood the proposal correctly:
> 
> virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
> In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO) in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check same flag and will not print the warning.>
> Is above correct?
Yes that's what I had in mind.

Thanks

Eric
> 
> Thanks
> -Bharat
> 
>>
>> Alex
> 
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
  2020-04-24 13:51     ` Auger Eric
@ 2020-05-05  9:06       ` Bharat Bhushan
  2020-05-07 14:42         ` Auger Eric
  0 siblings, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-05-05  9:06 UTC (permalink / raw)
  To: Auger Eric
  Cc: Zhong, Yang, Jean-Philippe Brucker, kevin.tian, tnowicki,
	Peter Maydell, mst, drjones, peterx, qemu-devel, alex.williamson,
	qemu-arm, Bharat Bhushan, linuc.decode, eric.auger.pro

On Fri, Apr 24, 2020 at 7:22 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Bharat,
> On 4/23/20 6:09 PM, Jean-Philippe Brucker wrote:
> > Hi Bharat,
> >
> > A few more things found while rebasing
> >
> > On Mon, Mar 23, 2020 at 02:16:16PM +0530, Bharat Bhushan wrote:
> >> This patch implements the PROBE request. Currently supported
> >> page size mask per endpoint is returned. Also append a NONE
> >> property in the end.
> >>
> >> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> ---
> >>  include/standard-headers/linux/virtio_iommu.h |   6 +
> >>  hw/virtio/virtio-iommu.c                      | 161 +++++++++++++++++-
> >>  hw/virtio/trace-events                        |   2 +
> >>  3 files changed, 166 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
> >> index b9443b83a1..8a0d47b907 100644
> >> --- a/include/standard-headers/linux/virtio_iommu.h
> >> +++ b/include/standard-headers/linux/virtio_iommu.h
> >> @@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap {
> >>
> >>  #define VIRTIO_IOMMU_PROBE_T_NONE           0
> >>  #define VIRTIO_IOMMU_PROBE_T_RESV_MEM               1
> >> +#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK 2
> >>
> >>  #define VIRTIO_IOMMU_PROBE_T_MASK           0xfff
> >>
> >> @@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem {
> >>      uint64_t                                        end;
> >>  };
> >>
> >> +struct virtio_iommu_probe_pgsize_mask {
> >> +    struct virtio_iommu_probe_property      head;
> >> +    uint64_t                                pgsize_bitmap;
> >> +};
> >> +
> >>  struct virtio_iommu_req_probe {
> >>      struct virtio_iommu_req_head            head;
> >>      uint32_t                                        endpoint;
> >> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> >> index 747e3cf1da..63fbacdcdc 100644
> >> --- a/hw/virtio/virtio-iommu.c
> >> +++ b/hw/virtio/virtio-iommu.c
> >> @@ -38,6 +38,10 @@
> >>
> >>  /* Max size */
> >>  #define VIOMMU_DEFAULT_QUEUE_SIZE 256
> >> +#define VIOMMU_PROBE_SIZE 512
> >> +
> >> +#define SUPPORTED_PROBE_PROPERTIES (\
> >> +    1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK)
> >>
> >>  typedef struct VirtIOIOMMUDomain {
> >>      uint32_t id;
> >> @@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping {
> >>      uint32_t flags;
> >>  } VirtIOIOMMUMapping;
> >>
> >> +typedef struct VirtIOIOMMUPropBuffer {
> >> +    VirtIOIOMMUEndpoint *endpoint;
> >> +    size_t filled;
> >> +    uint8_t *start;
> >> +    bool error;
> >
> > It doesn't seem like bufstate->error gets used anywhere
> maybe rebase your work on
> [PATCH for-4.2 v10 10/15] virtio-iommu: Implement probe request
> which tests it.

This was the staring point for me, As of now i moved away from "error"
from above struct.

>
> Also in
> [Qemu-devel] [PATCH for-4.2 v10 10/15] virtio-iommu: Implement probe request
> I changed the implementation to keep it simpler.
>
> Thanks
>
> Eric
> >
> >> +} VirtIOIOMMUPropBuffer;
> >> +
> >>  static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
> >>  {
> >>      return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
> >> @@ -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
> >>      return ret;
> >>  }
> >>
> >> +static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer *bufstate)
> >> +{
> >> +    struct virtio_iommu_probe_property *prop;
> >> +
> >> +    prop = (struct virtio_iommu_probe_property *)
> >> +                (bufstate->start + bufstate->filled);
> >> +    prop->type = 0;
> >> +    prop->length = 0;
> >> +    bufstate->filled += sizeof(*prop);
> >> +    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
> >> +    return 0;
> >> +}
> >> +
> >> +static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer *bufstate)
> >> +{
> >> +    struct virtio_iommu_probe_pgsize_mask *page_size_mask;
> >> +    size_t prop_size = sizeof(*page_size_mask);
> >> +    VirtIOIOMMUEndpoint *ep = bufstate->endpoint;
> >> +    VirtIOIOMMU *s = ep->viommu;
> >> +    IOMMUDevice *sdev;
> >> +
> >> +    if (bufstate->filled + prop_size >= VIOMMU_PROBE_SIZE) {
> >> +        bufstate->error = true;
> >> +        /* get the traversal stopped by returning true */
> >> +        return true;
> >> +    }
> >> +
> >> +    page_size_mask = (struct virtio_iommu_probe_pgsize_mask *)
> >> +                     (bufstate->start + bufstate->filled);
> >> +
> >> +    page_size_mask->head.type = VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK;
> >> +    page_size_mask->head.length = prop_size;
> >> +    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
> >> +        if (ep->id == sdev->devfn) {
> >> +            page_size_mask->pgsize_bitmap = sdev->page_size_mask;
> >
> > Do we need a cpu_to_le64 here?

Ack, yes, even head.type and head.length needed  cpu_to_le16().

> >
> >> +    }
> >> +    }
> >> +    bufstate->filled += sizeof(*page_size_mask);
> >> +    trace_virtio_iommu_fill_pgsize_mask_property(bufstate->endpoint->id,
> >> +                                                 page_size_mask->pgsize_bitmap,
> >> +                                                 bufstate->filled);
> >> +    return false;
> >> +}
> >> +
> >> +/* Fill the properties[] buffer with properties of type @type */
> >> +static int virtio_iommu_fill_property(int type,
> >> +                                      VirtIOIOMMUPropBuffer *bufstate)
> >> +{
> >> +    int ret = -ENOSPC;
> >> +
> >> +    if (bufstate->filled + sizeof(struct virtio_iommu_probe_property)
> >> +            >= VIOMMU_PROBE_SIZE) {
> >> +        /* no space left for the header */
> >> +        bufstate->error = true;
> >> +        goto out;
> >> +    }
> >> +
> >> +    switch (type) {
> >> +    case VIRTIO_IOMMU_PROBE_T_NONE:
> >> +        ret = virtio_iommu_fill_none_prop(bufstate);
> >> +        break;
> >> +    case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
> >> +    {
> >> +        ret = virtio_iommu_fill_page_size_mask(bufstate);
> >> +    break;
> >> +    }
> >> +    default:
> >> +        ret = -ENOENT;
> >> +        break;
> >> +    }
> >> +out:
> >> +    if (ret) {
> >> +        error_report("%s property of type=%d could not be filled (%d),"
> >> +                     " remaining size = 0x%lx",
> >> +                     __func__, type, ret, bufstate->filled);
> >> +    }
> >> +    return ret;
> >> +}
> >> +
> >> +/**
> >> + * virtio_iommu_probe - Fill the probe request buffer with all
> >> + * the properties the device is able to return and add a NONE
> >> + * property at the end. @buf points to properties[].
> >> + */
> >> +static int virtio_iommu_probe(VirtIOIOMMU *s,
> >> +                              struct virtio_iommu_req_probe *req,
> >> +                              uint8_t *buf)
> >> +{
> >> +    uint32_t ep_id = le32_to_cpu(req->endpoint);
> >> +    VirtIOIOMMUEndpoint *ep = virtio_iommu_get_endpoint(s, ep_id);
> >> +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
> >> +    VirtIOIOMMUPropBuffer bufstate = {.start = buf, .filled = 0,
> >> +                                       .error = false, .endpoint = ep};
> >
> > Probably need to check if ep is valid before going further

yes, will take care

> >
> >> +
> >> +    while ((type = ctz32(prop_types)) != 32) {
> >> +        if (virtio_iommu_fill_property(type, &bufstate)) {
> >> +            goto failure;
> >> +        }
> >> +        prop_types &= ~(1 << type);
> >> +    }
> >> +    if (virtio_iommu_fill_property(VIRTIO_IOMMU_PROBE_T_NONE, &bufstate)) {
> >> +        goto failure;
> >> +    }
> >
> > We got rid of the NONE property in the spec, now there is:
> >
> >  "If the device doesn’t fill all probe_size bytes with properties, it
> >   SHOULD fill the remaining bytes of properties with zeroes."
> >
> > So I think you can get rid of virtio_iommu_fill_none_prop() and fill the
> > rest of the buffer with zeroes here instead.

will do changes accordingly.

Stay Safe !!

Thanks
-Bharat
> >
> > Thanks,
> > Jean
> >
> >> +    return VIRTIO_IOMMU_S_OK;
> >> +failure:
> >> +    return VIRTIO_IOMMU_S_INVAL;
> >> +}
> >> +
> >>  static int virtio_iommu_iov_to_req(struct iovec *iov,
> >>                                     unsigned int iov_cnt,
> >>                                     void *req, size_t req_sz)
> >> @@ -519,6 +638,17 @@ virtio_iommu_handle_req(detach)
> >>  virtio_iommu_handle_req(map)
> >>  virtio_iommu_handle_req(unmap)
> >>
> >> +static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
> >> +                                     struct iovec *iov,
> >> +                                     unsigned int iov_cnt,
> >> +                                     uint8_t *buf)
> >> +{
> >> +    struct virtio_iommu_req_probe req;
> >> +    int ret = virtio_iommu_iov_to_req(iov, iov_cnt, &req, sizeof(req));
> >> +
> >> +    return ret ? ret : virtio_iommu_probe(s, &req, buf);
> >> +}
> >> +
> >>  static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
> >>  {
> >>      VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
> >> @@ -564,17 +694,33 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
> >>          case VIRTIO_IOMMU_T_UNMAP:
> >>              tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
> >>              break;
> >> +        case VIRTIO_IOMMU_T_PROBE:
> >> +        {
> >> +            struct virtio_iommu_req_tail *ptail;
> >> +            uint8_t *buf = g_malloc0(s->config.probe_size + sizeof(tail));
> >> +
> >> +            ptail = (struct virtio_iommu_req_tail *)
> >> +                        (buf + s->config.probe_size);
> >> +            ptail->status = virtio_iommu_handle_probe(s, iov, iov_cnt, buf);
> >> +
> >> +            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
> >> +                              buf, s->config.probe_size + sizeof(tail));
> >> +            g_free(buf);
> >> +            assert(sz == s->config.probe_size + sizeof(tail));
> >> +            goto push;
> >> +        }
> >>          default:
> >>              tail.status = VIRTIO_IOMMU_S_UNSUPP;
> >>          }
> >> -        qemu_mutex_unlock(&s->mutex);
> >>
> >>  out:
> >>          sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
> >>                            &tail, sizeof(tail));
> >>          assert(sz == sizeof(tail));
> >>
> >> -        virtqueue_push(vq, elem, sizeof(tail));
> >> +push:
> >> +        qemu_mutex_unlock(&s->mutex);
> >> +        virtqueue_push(vq, elem, sz);
> >>          virtio_notify(vdev, vq);
> >>          g_free(elem);
> >>      }
> >> @@ -634,16 +780,23 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
> >>      VirtIOIOMMUEndpoint *ep;
> >>      uint32_t sid, flags;
> >>      bool bypass_allowed;
> >> +    hwaddr addr_mask;
> >>      bool found;
> >>
> >>      interval.low = addr;
> >>      interval.high = addr + 1;
> >>
> >> +    if (sdev->page_size_mask) {
> >> +        addr_mask = (1 << ctz32(sdev->page_size_mask)) - 1;
> >> +    } else {
> >> +        addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1;
> >> +    }
> >> +
> >>      IOMMUTLBEntry entry = {
> >>          .target_as = &address_space_memory,
> >>          .iova = addr,
> >>          .translated_addr = addr,
> >> -        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
> >> +        .addr_mask = addr_mask,
> >>          .perm = IOMMU_NONE,
> >>      };
> >>
> >> @@ -831,6 +984,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
> >>      s->config.page_size_mask = TARGET_PAGE_MASK;
> >>      s->config.input_range.end = -1UL;
> >>      s->config.domain_range.end = 32;
> >> +    s->config.probe_size = VIOMMU_PROBE_SIZE;
> >>
> >>      virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
> >>      virtio_add_feature(&s->features, VIRTIO_RING_F_INDIRECT_DESC);
> >> @@ -840,6 +994,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
> >>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MAP_UNMAP);
> >>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS);
> >>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
> >> +    virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
> >>
> >>      qemu_mutex_init(&s->mutex);
> >>
> >> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> >> index 8bae651191..b0a6e4bda3 100644
> >> --- a/hw/virtio/trace-events
> >> +++ b/hw/virtio/trace-events
> >> @@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uin
> >>  virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
> >>  virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
> >>  virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
> >> +virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
> >> +virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask, size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
> >> --
> >> 2.17.1
> >>
> >
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-04-24 14:17           ` Auger Eric
@ 2020-05-05  9:25             ` Bharat Bhushan
  2020-05-05  9:30               ` Auger Eric
  0 siblings, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-05-05  9:25 UTC (permalink / raw)
  To: Auger Eric
  Cc: yang.zhong, peter.maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, Alex Williamson, qemu-arm,
	jean-philippe, Bharat Bhushan, eric.auger.pro, linuc.decode,
	David Gibson

Hi Eric,

On Fri, Apr 24, 2020 at 7:47 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Bharat,
>
> On 4/2/20 11:01 AM, Bharat Bhushan wrote:
> > Hi Eric/Alex,
> >
> >> -----Original Message-----
> >> From: Alex Williamson <alex.williamson@redhat.com>
> >> Sent: Thursday, March 26, 2020 11:23 PM
> >> To: Auger Eric <eric.auger@redhat.com>
> >> Cc: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
> >> peterx@redhat.com; eric.auger.pro@gmail.com; kevin.tian@intel.com;
> >> mst@redhat.com; Tomasz Nowicki [C] <tnowicki@marvell.com>;
> >> drjones@redhat.com; linuc.decode@gmail.com; qemu-devel@nongnu.org; qemu-
> >> arm@nongnu.org; bharatb.linux@gmail.com; jean-philippe@linaro.org;
> >> yang.zhong@intel.com; David Gibson <david@gibson.dropbear.id.au>
> >> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio
> >> region translation by viommu
> >>
> >> External Email
> >>
> >> ----------------------------------------------------------------------
> >> On Thu, 26 Mar 2020 18:35:48 +0100
> >> Auger Eric <eric.auger@redhat.com> wrote:
> >>
> >>> Hi Alex,
> >>>
> >>> On 3/24/20 12:08 AM, Alex Williamson wrote:
> >>>> [Cc +dwg who originated this warning]
> >>>>
> >>>> On Mon, 23 Mar 2020 14:16:09 +0530
> >>>> Bharat Bhushan <bbhushan2@marvell.com> wrote:
> >>>>
> >>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU.
> >>>>> As such address_space_translate() returns the MSI controller MMIO
> >>>>> region and we get an "iommu map to non memory area"
> >>>>> message. Let's remove this latter.
> >>>>>
> >>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> >>>>> ---
> >>>>>  hw/vfio/common.c | 2 --
> >>>>>  1 file changed, 2 deletions(-)
> >>>>>
> >>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
> >>>>> 5ca11488d6..c586edf47a 100644
> >>>>> --- a/hw/vfio/common.c
> >>>>> +++ b/hw/vfio/common.c
> >>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb,
> >> void **vaddr,
> >>>>>                                   &xlat, &len, writable,
> >>>>>                                   MEMTXATTRS_UNSPECIFIED);
> >>>>>      if (!memory_region_is_ram(mr)) {
> >>>>> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> >>>>> -                     xlat);
> >>>>>          return false;
> >>>>>      }
> >>>>>
> >>>>
> >>>> I'm a bit confused here, I think we need more justification beyond
> >>>> "we hit this warning and we don't want to because it's ok in this
> >>>> one special case, therefore remove it".  I assume the special case
> >>>> is that the device MSI address is managed via the SET_IRQS ioctl and
> >>>> therefore we won't actually get DMAs to this range.
> >>> Yes exactly. The guest creates a mapping between one giova and this
> >>> gpa (corresponding to the MSI controller doorbell) because MSIs are
> >>> mapped on ARM. But practically the physical device is programmed with
> >>> an host chosen iova that maps onto the physical MSI controller's
> >>> doorbell. so the device never performs DMA accesses to this range.
> >>>
> >>>   But I imagine the case that
> >>>> was in mind when adding this warning was general peer-to-peer
> >>>> between and assigned and emulated device.
> >>> yes makes sense.
> >>>
> >>>   Maybe there's an argument to be made
> >>>> that such a p2p mapping might also be used in a non-vIOMMU case.  We
> >>>> skip creating those mappings and drivers continue to work, maybe
> >>>> because nobody attempts to do p2p DMA with the types of devices we
> >>>> emulate, maybe because p2p DMA is not absolutely reliable on bare
> >>>> metal and drivers test it before using it.
> >>> MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
> >>> iommu_dma_get_msi_page).
> >>> One idea could be to pass that flag through the IOMMU Notifier
> >>> mechanism into the iotlb->perm. Eventually when we get this in
> >>> vfio_get_vaddr() we would not print the warning. Could that make sense?
> >>
> >> Yeah, if we can identify a valid case that doesn't need a warning, that's fine by me.
> >> Thanks,
> >
> > Let me know if I understood the proposal correctly:
> >
> > virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
> > In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO) in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check same flag and will not print the warning.>
> > Is above correct?
> Yes that's what I had in mind.

In that case virtio-iommu driver in guest should not make map
(VIRTIO_IOMMU_T_MAP) call as it known nothing to be mapped.

Stay Safe

Thanks
-Bharat

>
> Thanks
>
> Eric
> >
> > Thanks
> > -Bharat
> >
> >>
> >> Alex
> >
> >
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-05-05  9:25             ` Bharat Bhushan
@ 2020-05-05  9:30               ` Auger Eric
  2020-05-05  9:46                 ` Bharat Bhushan
  0 siblings, 1 reply; 37+ messages in thread
From: Auger Eric @ 2020-05-05  9:30 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: yang.zhong, peter.maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, Alex Williamson, qemu-arm,
	jean-philippe, Bharat Bhushan, eric.auger.pro, linuc.decode,
	David Gibson

Hi Bharat,

On 5/5/20 11:25 AM, Bharat Bhushan wrote:
> Hi Eric,
> 
> On Fri, Apr 24, 2020 at 7:47 PM Auger Eric <eric.auger@redhat.com> wrote:
>>
>> Hi Bharat,
>>
>> On 4/2/20 11:01 AM, Bharat Bhushan wrote:
>>> Hi Eric/Alex,
>>>
>>>> -----Original Message-----
>>>> From: Alex Williamson <alex.williamson@redhat.com>
>>>> Sent: Thursday, March 26, 2020 11:23 PM
>>>> To: Auger Eric <eric.auger@redhat.com>
>>>> Cc: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
>>>> peterx@redhat.com; eric.auger.pro@gmail.com; kevin.tian@intel.com;
>>>> mst@redhat.com; Tomasz Nowicki [C] <tnowicki@marvell.com>;
>>>> drjones@redhat.com; linuc.decode@gmail.com; qemu-devel@nongnu.org; qemu-
>>>> arm@nongnu.org; bharatb.linux@gmail.com; jean-philippe@linaro.org;
>>>> yang.zhong@intel.com; David Gibson <david@gibson.dropbear.id.au>
>>>> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio
>>>> region translation by viommu
>>>>
>>>> External Email
>>>>
>>>> ----------------------------------------------------------------------
>>>> On Thu, 26 Mar 2020 18:35:48 +0100
>>>> Auger Eric <eric.auger@redhat.com> wrote:
>>>>
>>>>> Hi Alex,
>>>>>
>>>>> On 3/24/20 12:08 AM, Alex Williamson wrote:
>>>>>> [Cc +dwg who originated this warning]
>>>>>>
>>>>>> On Mon, 23 Mar 2020 14:16:09 +0530
>>>>>> Bharat Bhushan <bbhushan2@marvell.com> wrote:
>>>>>>
>>>>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU.
>>>>>>> As such address_space_translate() returns the MSI controller MMIO
>>>>>>> region and we get an "iommu map to non memory area"
>>>>>>> message. Let's remove this latter.
>>>>>>>
>>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>>>>>> ---
>>>>>>>  hw/vfio/common.c | 2 --
>>>>>>>  1 file changed, 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
>>>>>>> 5ca11488d6..c586edf47a 100644
>>>>>>> --- a/hw/vfio/common.c
>>>>>>> +++ b/hw/vfio/common.c
>>>>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb,
>>>> void **vaddr,
>>>>>>>                                   &xlat, &len, writable,
>>>>>>>                                   MEMTXATTRS_UNSPECIFIED);
>>>>>>>      if (!memory_region_is_ram(mr)) {
>>>>>>> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
>>>>>>> -                     xlat);
>>>>>>>          return false;
>>>>>>>      }
>>>>>>>
>>>>>>
>>>>>> I'm a bit confused here, I think we need more justification beyond
>>>>>> "we hit this warning and we don't want to because it's ok in this
>>>>>> one special case, therefore remove it".  I assume the special case
>>>>>> is that the device MSI address is managed via the SET_IRQS ioctl and
>>>>>> therefore we won't actually get DMAs to this range.
>>>>> Yes exactly. The guest creates a mapping between one giova and this
>>>>> gpa (corresponding to the MSI controller doorbell) because MSIs are
>>>>> mapped on ARM. But practically the physical device is programmed with
>>>>> an host chosen iova that maps onto the physical MSI controller's
>>>>> doorbell. so the device never performs DMA accesses to this range.
>>>>>
>>>>>   But I imagine the case that
>>>>>> was in mind when adding this warning was general peer-to-peer
>>>>>> between and assigned and emulated device.
>>>>> yes makes sense.
>>>>>
>>>>>   Maybe there's an argument to be made
>>>>>> that such a p2p mapping might also be used in a non-vIOMMU case.  We
>>>>>> skip creating those mappings and drivers continue to work, maybe
>>>>>> because nobody attempts to do p2p DMA with the types of devices we
>>>>>> emulate, maybe because p2p DMA is not absolutely reliable on bare
>>>>>> metal and drivers test it before using it.
>>>>> MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
>>>>> iommu_dma_get_msi_page).
>>>>> One idea could be to pass that flag through the IOMMU Notifier
>>>>> mechanism into the iotlb->perm. Eventually when we get this in
>>>>> vfio_get_vaddr() we would not print the warning. Could that make sense?
>>>>
>>>> Yeah, if we can identify a valid case that doesn't need a warning, that's fine by me.
>>>> Thanks,
>>>
>>> Let me know if I understood the proposal correctly:
>>>
>>> virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
>>> In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO) in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check same flag and will not print the warning.>
>>> Is above correct?
>> Yes that's what I had in mind.
> 
> In that case virtio-iommu driver in guest should not make map
> (VIRTIO_IOMMU_T_MAP) call as it known nothing to be mapped.
sorry I don't catch what you meant. Please can you elaborate?

Thanks

Eric
> 
> Stay Safe
> 
> Thanks
> -Bharat
> 
>>
>> Thanks
>>
>> Eric
>>>
>>> Thanks
>>> -Bharat
>>>
>>>>
>>>> Alex
>>>
>>>
>>
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-05-05  9:30               ` Auger Eric
@ 2020-05-05  9:46                 ` Bharat Bhushan
  2020-05-05 10:18                   ` Bharat Bhushan
  0 siblings, 1 reply; 37+ messages in thread
From: Bharat Bhushan @ 2020-05-05  9:46 UTC (permalink / raw)
  To: Auger Eric
  Cc: yang.zhong, peter.maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, Alex Williamson, qemu-arm,
	jean-philippe, Bharat Bhushan, eric.auger.pro, linuc.decode,
	David Gibson

hi Eric,

On Tue, May 5, 2020 at 3:00 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Bharat,
>
> On 5/5/20 11:25 AM, Bharat Bhushan wrote:
> > Hi Eric,
> >
> > On Fri, Apr 24, 2020 at 7:47 PM Auger Eric <eric.auger@redhat.com> wrote:
> >>
> >> Hi Bharat,
> >>
> >> On 4/2/20 11:01 AM, Bharat Bhushan wrote:
> >>> Hi Eric/Alex,
> >>>
> >>>> -----Original Message-----
> >>>> From: Alex Williamson <alex.williamson@redhat.com>
> >>>> Sent: Thursday, March 26, 2020 11:23 PM
> >>>> To: Auger Eric <eric.auger@redhat.com>
> >>>> Cc: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
> >>>> peterx@redhat.com; eric.auger.pro@gmail.com; kevin.tian@intel.com;
> >>>> mst@redhat.com; Tomasz Nowicki [C] <tnowicki@marvell.com>;
> >>>> drjones@redhat.com; linuc.decode@gmail.com; qemu-devel@nongnu.org; qemu-
> >>>> arm@nongnu.org; bharatb.linux@gmail.com; jean-philippe@linaro.org;
> >>>> yang.zhong@intel.com; David Gibson <david@gibson.dropbear.id.au>
> >>>> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio
> >>>> region translation by viommu
> >>>>
> >>>> External Email
> >>>>
> >>>> ----------------------------------------------------------------------
> >>>> On Thu, 26 Mar 2020 18:35:48 +0100
> >>>> Auger Eric <eric.auger@redhat.com> wrote:
> >>>>
> >>>>> Hi Alex,
> >>>>>
> >>>>> On 3/24/20 12:08 AM, Alex Williamson wrote:
> >>>>>> [Cc +dwg who originated this warning]
> >>>>>>
> >>>>>> On Mon, 23 Mar 2020 14:16:09 +0530
> >>>>>> Bharat Bhushan <bbhushan2@marvell.com> wrote:
> >>>>>>
> >>>>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU.
> >>>>>>> As such address_space_translate() returns the MSI controller MMIO
> >>>>>>> region and we get an "iommu map to non memory area"
> >>>>>>> message. Let's remove this latter.
> >>>>>>>
> >>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>>>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> >>>>>>> ---
> >>>>>>>  hw/vfio/common.c | 2 --
> >>>>>>>  1 file changed, 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
> >>>>>>> 5ca11488d6..c586edf47a 100644
> >>>>>>> --- a/hw/vfio/common.c
> >>>>>>> +++ b/hw/vfio/common.c
> >>>>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb,
> >>>> void **vaddr,
> >>>>>>>                                   &xlat, &len, writable,
> >>>>>>>                                   MEMTXATTRS_UNSPECIFIED);
> >>>>>>>      if (!memory_region_is_ram(mr)) {
> >>>>>>> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> >>>>>>> -                     xlat);
> >>>>>>>          return false;
> >>>>>>>      }
> >>>>>>>
> >>>>>>
> >>>>>> I'm a bit confused here, I think we need more justification beyond
> >>>>>> "we hit this warning and we don't want to because it's ok in this
> >>>>>> one special case, therefore remove it".  I assume the special case
> >>>>>> is that the device MSI address is managed via the SET_IRQS ioctl and
> >>>>>> therefore we won't actually get DMAs to this range.
> >>>>> Yes exactly. The guest creates a mapping between one giova and this
> >>>>> gpa (corresponding to the MSI controller doorbell) because MSIs are
> >>>>> mapped on ARM. But practically the physical device is programmed with
> >>>>> an host chosen iova that maps onto the physical MSI controller's
> >>>>> doorbell. so the device never performs DMA accesses to this range.
> >>>>>
> >>>>>   But I imagine the case that
> >>>>>> was in mind when adding this warning was general peer-to-peer
> >>>>>> between and assigned and emulated device.
> >>>>> yes makes sense.
> >>>>>
> >>>>>   Maybe there's an argument to be made
> >>>>>> that such a p2p mapping might also be used in a non-vIOMMU case.  We
> >>>>>> skip creating those mappings and drivers continue to work, maybe
> >>>>>> because nobody attempts to do p2p DMA with the types of devices we
> >>>>>> emulate, maybe because p2p DMA is not absolutely reliable on bare
> >>>>>> metal and drivers test it before using it.
> >>>>> MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
> >>>>> iommu_dma_get_msi_page).
> >>>>> One idea could be to pass that flag through the IOMMU Notifier
> >>>>> mechanism into the iotlb->perm. Eventually when we get this in
> >>>>> vfio_get_vaddr() we would not print the warning. Could that make sense?
> >>>>
> >>>> Yeah, if we can identify a valid case that doesn't need a warning, that's fine by me.
> >>>> Thanks,
> >>>
> >>> Let me know if I understood the proposal correctly:
> >>>
> >>> virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
> >>> In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO) in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check same flag and will not print the warning.>
> >>> Is above correct?
> >> Yes that's what I had in mind.
> >
> > In that case virtio-iommu driver in guest should not make map
> > (VIRTIO_IOMMU_T_MAP) call as it known nothing to be mapped.
> sorry I don't catch what you meant. Please can you elaborate?

What I understood of the proposal is:
Linux:
 1) MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
iommu_dma_get_msi_page)
 2) virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP)
with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.

GEMU:
3) virtio-iommu device - If VIRTIO_IOMMU_MAP_F_MMIO flag set then will
set a new defined flag (say IOMMU_MMIO) in iotlb->perm in
memory_region_notify_iommu()
4. vfio_get_vaddr() will check same flag and will not print the
warning. Also vfio_iommu_map_notify() will not do anything.

So, rather than going down to step 3 and 4, can we avoid maling map()
calling in step-2 itself?

Thanks
-Bharat

>
> Thanks
>
> Eric
> >
> > Stay Safe
> >
> > Thanks
> > -Bharat
> >
> >>
> >> Thanks
> >>
> >> Eric
> >>>
> >>> Thanks
> >>> -Bharat
> >>>
> >>>>
> >>>> Alex
> >>>
> >>>
> >>
> >
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-05-05  9:46                 ` Bharat Bhushan
@ 2020-05-05 10:18                   ` Bharat Bhushan
  2020-05-05 12:05                     ` Auger Eric
  2020-05-07 14:40                     ` Auger Eric
  0 siblings, 2 replies; 37+ messages in thread
From: Bharat Bhushan @ 2020-05-05 10:18 UTC (permalink / raw)
  To: Auger Eric
  Cc: yang.zhong, peter.maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, Alex Williamson, qemu-arm,
	jean-philippe, Bharat Bhushan, eric.auger.pro, linuc.decode,
	David Gibson

Hi Eric,

On Tue, May 5, 2020 at 3:16 PM Bharat Bhushan <bharatb.linux@gmail.com> wrote:
>
> hi Eric,
>
> On Tue, May 5, 2020 at 3:00 PM Auger Eric <eric.auger@redhat.com> wrote:
> >
> > Hi Bharat,
> >
> > On 5/5/20 11:25 AM, Bharat Bhushan wrote:
> > > Hi Eric,
> > >
> > > On Fri, Apr 24, 2020 at 7:47 PM Auger Eric <eric.auger@redhat.com> wrote:
> > >>
> > >> Hi Bharat,
> > >>
> > >> On 4/2/20 11:01 AM, Bharat Bhushan wrote:
> > >>> Hi Eric/Alex,
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Alex Williamson <alex.williamson@redhat.com>
> > >>>> Sent: Thursday, March 26, 2020 11:23 PM
> > >>>> To: Auger Eric <eric.auger@redhat.com>
> > >>>> Cc: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
> > >>>> peterx@redhat.com; eric.auger.pro@gmail.com; kevin.tian@intel.com;
> > >>>> mst@redhat.com; Tomasz Nowicki [C] <tnowicki@marvell.com>;
> > >>>> drjones@redhat.com; linuc.decode@gmail.com; qemu-devel@nongnu.org; qemu-
> > >>>> arm@nongnu.org; bharatb.linux@gmail.com; jean-philippe@linaro.org;
> > >>>> yang.zhong@intel.com; David Gibson <david@gibson.dropbear.id.au>
> > >>>> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio
> > >>>> region translation by viommu
> > >>>>
> > >>>> External Email
> > >>>>
> > >>>> ----------------------------------------------------------------------
> > >>>> On Thu, 26 Mar 2020 18:35:48 +0100
> > >>>> Auger Eric <eric.auger@redhat.com> wrote:
> > >>>>
> > >>>>> Hi Alex,
> > >>>>>
> > >>>>> On 3/24/20 12:08 AM, Alex Williamson wrote:
> > >>>>>> [Cc +dwg who originated this warning]
> > >>>>>>
> > >>>>>> On Mon, 23 Mar 2020 14:16:09 +0530
> > >>>>>> Bharat Bhushan <bbhushan2@marvell.com> wrote:
> > >>>>>>
> > >>>>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU.
> > >>>>>>> As such address_space_translate() returns the MSI controller MMIO
> > >>>>>>> region and we get an "iommu map to non memory area"
> > >>>>>>> message. Let's remove this latter.
> > >>>>>>>
> > >>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > >>>>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
> > >>>>>>> ---
> > >>>>>>>  hw/vfio/common.c | 2 --
> > >>>>>>>  1 file changed, 2 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
> > >>>>>>> 5ca11488d6..c586edf47a 100644
> > >>>>>>> --- a/hw/vfio/common.c
> > >>>>>>> +++ b/hw/vfio/common.c
> > >>>>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb,
> > >>>> void **vaddr,
> > >>>>>>>                                   &xlat, &len, writable,
> > >>>>>>>                                   MEMTXATTRS_UNSPECIFIED);
> > >>>>>>>      if (!memory_region_is_ram(mr)) {
> > >>>>>>> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
> > >>>>>>> -                     xlat);
> > >>>>>>>          return false;
> > >>>>>>>      }
> > >>>>>>>
> > >>>>>>
> > >>>>>> I'm a bit confused here, I think we need more justification beyond
> > >>>>>> "we hit this warning and we don't want to because it's ok in this
> > >>>>>> one special case, therefore remove it".  I assume the special case
> > >>>>>> is that the device MSI address is managed via the SET_IRQS ioctl and
> > >>>>>> therefore we won't actually get DMAs to this range.
> > >>>>> Yes exactly. The guest creates a mapping between one giova and this
> > >>>>> gpa (corresponding to the MSI controller doorbell) because MSIs are
> > >>>>> mapped on ARM. But practically the physical device is programmed with
> > >>>>> an host chosen iova that maps onto the physical MSI controller's
> > >>>>> doorbell. so the device never performs DMA accesses to this range.
> > >>>>>
> > >>>>>   But I imagine the case that
> > >>>>>> was in mind when adding this warning was general peer-to-peer
> > >>>>>> between and assigned and emulated device.
> > >>>>> yes makes sense.
> > >>>>>
> > >>>>>   Maybe there's an argument to be made
> > >>>>>> that such a p2p mapping might also be used in a non-vIOMMU case.  We
> > >>>>>> skip creating those mappings and drivers continue to work, maybe
> > >>>>>> because nobody attempts to do p2p DMA with the types of devices we
> > >>>>>> emulate, maybe because p2p DMA is not absolutely reliable on bare
> > >>>>>> metal and drivers test it before using it.
> > >>>>> MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
> > >>>>> iommu_dma_get_msi_page).
> > >>>>> One idea could be to pass that flag through the IOMMU Notifier
> > >>>>> mechanism into the iotlb->perm. Eventually when we get this in
> > >>>>> vfio_get_vaddr() we would not print the warning. Could that make sense?
> > >>>>
> > >>>> Yeah, if we can identify a valid case that doesn't need a warning, that's fine by me.
> > >>>> Thanks,
> > >>>
> > >>> Let me know if I understood the proposal correctly:
> > >>>
> > >>> virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
> > >>> In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO) in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check same flag and will not print the warning.>
> > >>> Is above correct?
> > >> Yes that's what I had in mind.
> > >
> > > In that case virtio-iommu driver in guest should not make map
> > > (VIRTIO_IOMMU_T_MAP) call as it known nothing to be mapped.
> > sorry I don't catch what you meant. Please can you elaborate?
>
> What I understood of the proposal is:
> Linux:
>  1) MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
> iommu_dma_get_msi_page)
>  2) virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP)
> with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
>
> GEMU:
> 3) virtio-iommu device - If VIRTIO_IOMMU_MAP_F_MMIO flag set then will
> set a new defined flag (say IOMMU_MMIO) in iotlb->perm in
> memory_region_notify_iommu()
> 4. vfio_get_vaddr() will check same flag and will not print the
> warning. Also vfio_iommu_map_notify() will not do anything.
>
> So, rather than going down to step 3 and 4, can we avoid maling map()
> calling in step-2 itself?

ohh, We need to setup msi translation mapping, correct.

Thanks
-Bharat

>
> Thanks
> -Bharat
>
> >
> > Thanks
> >
> > Eric
> > >
> > > Stay Safe
> > >
> > > Thanks
> > > -Bharat
> > >
> > >>
> > >> Thanks
> > >>
> > >> Eric
> > >>>
> > >>> Thanks
> > >>> -Bharat
> > >>>
> > >>>>
> > >>>> Alex
> > >>>
> > >>>
> > >>
> > >
> >


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-05-05 10:18                   ` Bharat Bhushan
@ 2020-05-05 12:05                     ` Auger Eric
  2020-05-07 14:40                     ` Auger Eric
  1 sibling, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-05 12:05 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: yang.zhong, peter.maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, Alex Williamson, qemu-arm,
	jean-philippe, Bharat Bhushan, David Gibson, linuc.decode,
	eric.auger.pro

Hi Bharat,

On 5/5/20 12:18 PM, Bharat Bhushan wrote:
> Hi Eric,
> 
> On Tue, May 5, 2020 at 3:16 PM Bharat Bhushan <bharatb.linux@gmail.com> wrote:
>>
>> hi Eric,
>>
>> On Tue, May 5, 2020 at 3:00 PM Auger Eric <eric.auger@redhat.com> wrote:
>>>
>>> Hi Bharat,
>>>
>>> On 5/5/20 11:25 AM, Bharat Bhushan wrote:
>>>> Hi Eric,
>>>>
>>>> On Fri, Apr 24, 2020 at 7:47 PM Auger Eric <eric.auger@redhat.com> wrote:
>>>>>
>>>>> Hi Bharat,
>>>>>
>>>>> On 4/2/20 11:01 AM, Bharat Bhushan wrote:
>>>>>> Hi Eric/Alex,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Alex Williamson <alex.williamson@redhat.com>
>>>>>>> Sent: Thursday, March 26, 2020 11:23 PM
>>>>>>> To: Auger Eric <eric.auger@redhat.com>
>>>>>>> Cc: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
>>>>>>> peterx@redhat.com; eric.auger.pro@gmail.com; kevin.tian@intel.com;
>>>>>>> mst@redhat.com; Tomasz Nowicki [C] <tnowicki@marvell.com>;
>>>>>>> drjones@redhat.com; linuc.decode@gmail.com; qemu-devel@nongnu.org; qemu-
>>>>>>> arm@nongnu.org; bharatb.linux@gmail.com; jean-philippe@linaro.org;
>>>>>>> yang.zhong@intel.com; David Gibson <david@gibson.dropbear.id.au>
>>>>>>> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio
>>>>>>> region translation by viommu
>>>>>>>
>>>>>>> External Email
>>>>>>>
>>>>>>> ----------------------------------------------------------------------
>>>>>>> On Thu, 26 Mar 2020 18:35:48 +0100
>>>>>>> Auger Eric <eric.auger@redhat.com> wrote:
>>>>>>>
>>>>>>>> Hi Alex,
>>>>>>>>
>>>>>>>> On 3/24/20 12:08 AM, Alex Williamson wrote:
>>>>>>>>> [Cc +dwg who originated this warning]
>>>>>>>>>
>>>>>>>>> On Mon, 23 Mar 2020 14:16:09 +0530
>>>>>>>>> Bharat Bhushan <bbhushan2@marvell.com> wrote:
>>>>>>>>>
>>>>>>>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU.
>>>>>>>>>> As such address_space_translate() returns the MSI controller MMIO
>>>>>>>>>> region and we get an "iommu map to non memory area"
>>>>>>>>>> message. Let's remove this latter.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>>>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>>>>>>>>> ---
>>>>>>>>>>  hw/vfio/common.c | 2 --
>>>>>>>>>>  1 file changed, 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
>>>>>>>>>> 5ca11488d6..c586edf47a 100644
>>>>>>>>>> --- a/hw/vfio/common.c
>>>>>>>>>> +++ b/hw/vfio/common.c
>>>>>>>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb,
>>>>>>> void **vaddr,
>>>>>>>>>>                                   &xlat, &len, writable,
>>>>>>>>>>                                   MEMTXATTRS_UNSPECIFIED);
>>>>>>>>>>      if (!memory_region_is_ram(mr)) {
>>>>>>>>>> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
>>>>>>>>>> -                     xlat);
>>>>>>>>>>          return false;
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm a bit confused here, I think we need more justification beyond
>>>>>>>>> "we hit this warning and we don't want to because it's ok in this
>>>>>>>>> one special case, therefore remove it".  I assume the special case
>>>>>>>>> is that the device MSI address is managed via the SET_IRQS ioctl and
>>>>>>>>> therefore we won't actually get DMAs to this range.
>>>>>>>> Yes exactly. The guest creates a mapping between one giova and this
>>>>>>>> gpa (corresponding to the MSI controller doorbell) because MSIs are
>>>>>>>> mapped on ARM. But practically the physical device is programmed with
>>>>>>>> an host chosen iova that maps onto the physical MSI controller's
>>>>>>>> doorbell. so the device never performs DMA accesses to this range.
>>>>>>>>
>>>>>>>>   But I imagine the case that
>>>>>>>>> was in mind when adding this warning was general peer-to-peer
>>>>>>>>> between and assigned and emulated device.
>>>>>>>> yes makes sense.
>>>>>>>>
>>>>>>>>   Maybe there's an argument to be made
>>>>>>>>> that such a p2p mapping might also be used in a non-vIOMMU case.  We
>>>>>>>>> skip creating those mappings and drivers continue to work, maybe
>>>>>>>>> because nobody attempts to do p2p DMA with the types of devices we
>>>>>>>>> emulate, maybe because p2p DMA is not absolutely reliable on bare
>>>>>>>>> metal and drivers test it before using it.
>>>>>>>> MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
>>>>>>>> iommu_dma_get_msi_page).
>>>>>>>> One idea could be to pass that flag through the IOMMU Notifier
>>>>>>>> mechanism into the iotlb->perm. Eventually when we get this in
>>>>>>>> vfio_get_vaddr() we would not print the warning. Could that make sense?
>>>>>>>
>>>>>>> Yeah, if we can identify a valid case that doesn't need a warning, that's fine by me.
>>>>>>> Thanks,
>>>>>>
>>>>>> Let me know if I understood the proposal correctly:
>>>>>>
>>>>>> virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
>>>>>> In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO) in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check same flag and will not print the warning.>
>>>>>> Is above correct?
>>>>> Yes that's what I had in mind.
>>>>
>>>> In that case virtio-iommu driver in guest should not make map
>>>> (VIRTIO_IOMMU_T_MAP) call as it known nothing to be mapped.
>>> sorry I don't catch what you meant. Please can you elaborate?
>>
>> What I understood of the proposal is:
>> Linux:
>>  1) MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
>> iommu_dma_get_msi_page)
>>  2) virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP)
>> with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
>>
>> GEMU:
>> 3) virtio-iommu device - If VIRTIO_IOMMU_MAP_F_MMIO flag set then will
>> set a new defined flag (say IOMMU_MMIO) in iotlb->perm in
>> memory_region_notify_iommu()
>> 4. vfio_get_vaddr() will check same flag and will not print the
>> warning. Also vfio_iommu_map_notify() will not do anything.
>>
>> So, rather than going down to step 3 and 4, can we avoid maling map()
>> calling in step-2 itself?
> 
> ohh, We need to setup msi translation mapping, correct.
The virtio-iommu driver is supposed to obey the map() operation argument
and cascade this to the device. I don't see why it would filter this out.

Now we should have means to declare the vITS GPA as an HW MSI reserved
region and avoid calling the map(). I will try this out.

Thanks

Eric

> 
> Thanks
> -Bharat
> 
>>
>> Thanks
>> -Bharat
>>
>>>
>>> Thanks
>>>
>>> Eric
>>>>
>>>> Stay Safe
>>>>
>>>> Thanks
>>>> -Bharat
>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Eric
>>>>>>
>>>>>> Thanks
>>>>>> -Bharat
>>>>>>
>>>>>>>
>>>>>>> Alex
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu
  2020-05-05 10:18                   ` Bharat Bhushan
  2020-05-05 12:05                     ` Auger Eric
@ 2020-05-07 14:40                     ` Auger Eric
  1 sibling, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-07 14:40 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: yang.zhong, peter.maydell, kevin.tian, Tomasz Nowicki [C],
	mst, drjones, peterx, qemu-devel, Alex Williamson, qemu-arm,
	jean-philippe, Bharat Bhushan, David Gibson, linuc.decode,
	eric.auger.pro

Hi Bharat,

On 5/5/20 12:18 PM, Bharat Bhushan wrote:
> Hi Eric,
> 
> On Tue, May 5, 2020 at 3:16 PM Bharat Bhushan <bharatb.linux@gmail.com> wrote:
>>
>> hi Eric,
>>
>> On Tue, May 5, 2020 at 3:00 PM Auger Eric <eric.auger@redhat.com> wrote:
>>>
>>> Hi Bharat,
>>>
>>> On 5/5/20 11:25 AM, Bharat Bhushan wrote:
>>>> Hi Eric,
>>>>
>>>> On Fri, Apr 24, 2020 at 7:47 PM Auger Eric <eric.auger@redhat.com> wrote:
>>>>>
>>>>> Hi Bharat,
>>>>>
>>>>> On 4/2/20 11:01 AM, Bharat Bhushan wrote:
>>>>>> Hi Eric/Alex,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Alex Williamson <alex.williamson@redhat.com>
>>>>>>> Sent: Thursday, March 26, 2020 11:23 PM
>>>>>>> To: Auger Eric <eric.auger@redhat.com>
>>>>>>> Cc: Bharat Bhushan <bbhushan2@marvell.com>; peter.maydell@linaro.org;
>>>>>>> peterx@redhat.com; eric.auger.pro@gmail.com; kevin.tian@intel.com;
>>>>>>> mst@redhat.com; Tomasz Nowicki [C] <tnowicki@marvell.com>;
>>>>>>> drjones@redhat.com; linuc.decode@gmail.com; qemu-devel@nongnu.org; qemu-
>>>>>>> arm@nongnu.org; bharatb.linux@gmail.com; jean-philippe@linaro.org;
>>>>>>> yang.zhong@intel.com; David Gibson <david@gibson.dropbear.id.au>
>>>>>>> Subject: [EXT] Re: [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio
>>>>>>> region translation by viommu
>>>>>>>
>>>>>>> External Email
>>>>>>>
>>>>>>> ----------------------------------------------------------------------
>>>>>>> On Thu, 26 Mar 2020 18:35:48 +0100
>>>>>>> Auger Eric <eric.auger@redhat.com> wrote:
>>>>>>>
>>>>>>>> Hi Alex,
>>>>>>>>
>>>>>>>> On 3/24/20 12:08 AM, Alex Williamson wrote:
>>>>>>>>> [Cc +dwg who originated this warning]
>>>>>>>>>
>>>>>>>>> On Mon, 23 Mar 2020 14:16:09 +0530
>>>>>>>>> Bharat Bhushan <bbhushan2@marvell.com> wrote:
>>>>>>>>>
>>>>>>>>>> On ARM, the MSI doorbell is translated by the virtual IOMMU.
>>>>>>>>>> As such address_space_translate() returns the MSI controller MMIO
>>>>>>>>>> region and we get an "iommu map to non memory area"
>>>>>>>>>> message. Let's remove this latter.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>>>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>>>>>>>>> ---
>>>>>>>>>>  hw/vfio/common.c | 2 --
>>>>>>>>>>  1 file changed, 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
>>>>>>>>>> 5ca11488d6..c586edf47a 100644
>>>>>>>>>> --- a/hw/vfio/common.c
>>>>>>>>>> +++ b/hw/vfio/common.c
>>>>>>>>>> @@ -426,8 +426,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb,
>>>>>>> void **vaddr,
>>>>>>>>>>                                   &xlat, &len, writable,
>>>>>>>>>>                                   MEMTXATTRS_UNSPECIFIED);
>>>>>>>>>>      if (!memory_region_is_ram(mr)) {
>>>>>>>>>> -        error_report("iommu map to non memory area %"HWADDR_PRIx"",
>>>>>>>>>> -                     xlat);
>>>>>>>>>>          return false;
>>>>>>>>>>      }
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm a bit confused here, I think we need more justification beyond
>>>>>>>>> "we hit this warning and we don't want to because it's ok in this
>>>>>>>>> one special case, therefore remove it".  I assume the special case
>>>>>>>>> is that the device MSI address is managed via the SET_IRQS ioctl and
>>>>>>>>> therefore we won't actually get DMAs to this range.
>>>>>>>> Yes exactly. The guest creates a mapping between one giova and this
>>>>>>>> gpa (corresponding to the MSI controller doorbell) because MSIs are
>>>>>>>> mapped on ARM. But practically the physical device is programmed with
>>>>>>>> an host chosen iova that maps onto the physical MSI controller's
>>>>>>>> doorbell. so the device never performs DMA accesses to this range.
>>>>>>>>
>>>>>>>>   But I imagine the case that
>>>>>>>>> was in mind when adding this warning was general peer-to-peer
>>>>>>>>> between and assigned and emulated device.
>>>>>>>> yes makes sense.
>>>>>>>>
>>>>>>>>   Maybe there's an argument to be made
>>>>>>>>> that such a p2p mapping might also be used in a non-vIOMMU case.  We
>>>>>>>>> skip creating those mappings and drivers continue to work, maybe
>>>>>>>>> because nobody attempts to do p2p DMA with the types of devices we
>>>>>>>>> emulate, maybe because p2p DMA is not absolutely reliable on bare
>>>>>>>>> metal and drivers test it before using it.
>>>>>>>> MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
>>>>>>>> iommu_dma_get_msi_page).
>>>>>>>> One idea could be to pass that flag through the IOMMU Notifier
>>>>>>>> mechanism into the iotlb->perm. Eventually when we get this in
>>>>>>>> vfio_get_vaddr() we would not print the warning. Could that make sense?
>>>>>>>
>>>>>>> Yeah, if we can identify a valid case that doesn't need a warning, that's fine by me.
>>>>>>> Thanks,
>>>>>>
>>>>>> Let me know if I understood the proposal correctly:
>>>>>>
>>>>>> virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP) with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
>>>>>> In qemu, virtio-iommu device will set a new defined flag (say IOMMU_MMIO) in iotlb->perm in memory_region_notify_iommu(). vfio_get_vaddr() will check same flag and will not print the warning.>
>>>>>> Is above correct?
>>>>> Yes that's what I had in mind.
>>>>
>>>> In that case virtio-iommu driver in guest should not make map
>>>> (VIRTIO_IOMMU_T_MAP) call as it known nothing to be mapped.
>>> sorry I don't catch what you meant. Please can you elaborate?
>>
>> What I understood of the proposal is:
>> Linux:
>>  1) MSI doorbells are mapped using the IOMMU_MMIO flag (dma-iommu.c
>> iommu_dma_get_msi_page)
>>  2) virtio-iommu driver in guest will make map (VIRTIO_IOMMU_T_MAP)
>> with VIRTIO_IOMMU_MAP_F_MMIO flag for MSI mapping.
>>
>> GEMU:
>> 3) virtio-iommu device - If VIRTIO_IOMMU_MAP_F_MMIO flag set then will
>> set a new defined flag (say IOMMU_MMIO) in iotlb->perm in
>> memory_region_notify_iommu()
>> 4. vfio_get_vaddr() will check same flag and will not print the
>> warning. Also vfio_iommu_map_notify() will not do anything.
>>
>> So, rather than going down to step 3 and 4, can we avoid maling map()
>> calling in step-2 itself?
> 
> ohh, We need to setup msi translation mapping, correct.

I sent
[PATCH 0/5] VIRTIO-IOMMU probe request support and MSI bypass on ARM

With this series, the guest does not attempt to map the ITS MSI doorbell
anymore. So you shouldn't get "iommu map to non memory area" anymore if
you rebase on it and remove this patch.

At the moment I only bypass the ITS doorbell but I think it is safe to
do the same with GICv2M doorbell as well. Anyway all assigned devices
will be able to access this latter.

Thanks

Eric
> 
> Thanks
> -Bharat
> 
>>
>> Thanks
>> -Bharat
>>
>>>
>>> Thanks
>>>
>>> Eric
>>>>
>>>> Stay Safe
>>>>
>>>> Thanks
>>>> -Bharat
>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Eric
>>>>>>
>>>>>> Thanks
>>>>>> -Bharat
>>>>>>
>>>>>>>
>>>>>>> Alex
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v9 8/9] virtio-iommu: Implement probe request
  2020-05-05  9:06       ` Bharat Bhushan
@ 2020-05-07 14:42         ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-07 14:42 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Zhong, Yang, Jean-Philippe Brucker, kevin.tian, tnowicki, mst,
	Peter Maydell, drjones, peterx, qemu-devel, alex.williamson,
	qemu-arm, Bharat Bhushan, linuc.decode, eric.auger.pro

Hi Bharat,

On 5/5/20 11:06 AM, Bharat Bhushan wrote:
> On Fri, Apr 24, 2020 at 7:22 PM Auger Eric <eric.auger@redhat.com> wrote:
>>
>> Hi Bharat,
>> On 4/23/20 6:09 PM, Jean-Philippe Brucker wrote:
>>> Hi Bharat,
>>>
>>> A few more things found while rebasing
>>>
>>> On Mon, Mar 23, 2020 at 02:16:16PM +0530, Bharat Bhushan wrote:
>>>> This patch implements the PROBE request. Currently supported
>>>> page size mask per endpoint is returned. Also append a NONE
>>>> property in the end.
>>>>
>>>> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>> ---
>>>>  include/standard-headers/linux/virtio_iommu.h |   6 +
>>>>  hw/virtio/virtio-iommu.c                      | 161 +++++++++++++++++-
>>>>  hw/virtio/trace-events                        |   2 +
>>>>  3 files changed, 166 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
>>>> index b9443b83a1..8a0d47b907 100644
>>>> --- a/include/standard-headers/linux/virtio_iommu.h
>>>> +++ b/include/standard-headers/linux/virtio_iommu.h
>>>> @@ -111,6 +111,7 @@ struct virtio_iommu_req_unmap {
>>>>
>>>>  #define VIRTIO_IOMMU_PROBE_T_NONE           0
>>>>  #define VIRTIO_IOMMU_PROBE_T_RESV_MEM               1
>>>> +#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK 2
>>>>
>>>>  #define VIRTIO_IOMMU_PROBE_T_MASK           0xfff
>>>>
>>>> @@ -130,6 +131,11 @@ struct virtio_iommu_probe_resv_mem {
>>>>      uint64_t                                        end;
>>>>  };
>>>>
>>>> +struct virtio_iommu_probe_pgsize_mask {
>>>> +    struct virtio_iommu_probe_property      head;
>>>> +    uint64_t                                pgsize_bitmap;
>>>> +};
>>>> +
>>>>  struct virtio_iommu_req_probe {
>>>>      struct virtio_iommu_req_head            head;
>>>>      uint32_t                                        endpoint;
>>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>>> index 747e3cf1da..63fbacdcdc 100644
>>>> --- a/hw/virtio/virtio-iommu.c
>>>> +++ b/hw/virtio/virtio-iommu.c
>>>> @@ -38,6 +38,10 @@
>>>>
>>>>  /* Max size */
>>>>  #define VIOMMU_DEFAULT_QUEUE_SIZE 256
>>>> +#define VIOMMU_PROBE_SIZE 512
>>>> +
>>>> +#define SUPPORTED_PROBE_PROPERTIES (\
>>>> +    1 << VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK)
>>>>
>>>>  typedef struct VirtIOIOMMUDomain {
>>>>      uint32_t id;
>>>> @@ -62,6 +66,13 @@ typedef struct VirtIOIOMMUMapping {
>>>>      uint32_t flags;
>>>>  } VirtIOIOMMUMapping;
>>>>
>>>> +typedef struct VirtIOIOMMUPropBuffer {
>>>> +    VirtIOIOMMUEndpoint *endpoint;
>>>> +    size_t filled;
>>>> +    uint8_t *start;
>>>> +    bool error;
>>>
>>> It doesn't seem like bufstate->error gets used anywhere
>> maybe rebase your work on
>> [PATCH for-4.2 v10 10/15] virtio-iommu: Implement probe request
>> which tests it.
> 
> This was the staring point for me, As of now i moved away from "error"
> from above struct.
> 
>>
>> Also in
>> [Qemu-devel] [PATCH for-4.2 v10 10/15] virtio-iommu: Implement probe request
>> I changed the implementation to keep it simpler.

In [PATCH 2/5] virtio-iommu: Implement RESV_MEM probe request
I reworked the original implementation. I think this version is simpler
to start from. I removed the VirtIOIOMMUPropBuffer struct which looks
more complex than requested.

Thanks

Eric
>>
>> Thanks
>>
>> Eric
>>>
>>>> +} VirtIOIOMMUPropBuffer;
>>>> +
>>>>  static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
>>>>  {
>>>>      return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
>>>> @@ -490,6 +501,114 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
>>>>      return ret;
>>>>  }
>>>>
>>>> +static int virtio_iommu_fill_none_prop(VirtIOIOMMUPropBuffer *bufstate)
>>>> +{
>>>> +    struct virtio_iommu_probe_property *prop;
>>>> +
>>>> +    prop = (struct virtio_iommu_probe_property *)
>>>> +                (bufstate->start + bufstate->filled);
>>>> +    prop->type = 0;
>>>> +    prop->length = 0;
>>>> +    bufstate->filled += sizeof(*prop);
>>>> +    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int virtio_iommu_fill_page_size_mask(VirtIOIOMMUPropBuffer *bufstate)
>>>> +{
>>>> +    struct virtio_iommu_probe_pgsize_mask *page_size_mask;
>>>> +    size_t prop_size = sizeof(*page_size_mask);
>>>> +    VirtIOIOMMUEndpoint *ep = bufstate->endpoint;
>>>> +    VirtIOIOMMU *s = ep->viommu;
>>>> +    IOMMUDevice *sdev;
>>>> +
>>>> +    if (bufstate->filled + prop_size >= VIOMMU_PROBE_SIZE) {
>>>> +        bufstate->error = true;
>>>> +        /* get the traversal stopped by returning true */
>>>> +        return true;
>>>> +    }
>>>> +
>>>> +    page_size_mask = (struct virtio_iommu_probe_pgsize_mask *)
>>>> +                     (bufstate->start + bufstate->filled);
>>>> +
>>>> +    page_size_mask->head.type = VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK;
>>>> +    page_size_mask->head.length = prop_size;
>>>> +    QLIST_FOREACH(sdev, &s->notifiers_list, next) {
>>>> +        if (ep->id == sdev->devfn) {
>>>> +            page_size_mask->pgsize_bitmap = sdev->page_size_mask;
>>>
>>> Do we need a cpu_to_le64 here?
> 
> Ack, yes, even head.type and head.length needed  cpu_to_le16().
> 
>>>
>>>> +    }
>>>> +    }
>>>> +    bufstate->filled += sizeof(*page_size_mask);
>>>> +    trace_virtio_iommu_fill_pgsize_mask_property(bufstate->endpoint->id,
>>>> +                                                 page_size_mask->pgsize_bitmap,
>>>> +                                                 bufstate->filled);
>>>> +    return false;
>>>> +}
>>>> +
>>>> +/* Fill the properties[] buffer with properties of type @type */
>>>> +static int virtio_iommu_fill_property(int type,
>>>> +                                      VirtIOIOMMUPropBuffer *bufstate)
>>>> +{
>>>> +    int ret = -ENOSPC;
>>>> +
>>>> +    if (bufstate->filled + sizeof(struct virtio_iommu_probe_property)
>>>> +            >= VIOMMU_PROBE_SIZE) {
>>>> +        /* no space left for the header */
>>>> +        bufstate->error = true;
>>>> +        goto out;
>>>> +    }
>>>> +
>>>> +    switch (type) {
>>>> +    case VIRTIO_IOMMU_PROBE_T_NONE:
>>>> +        ret = virtio_iommu_fill_none_prop(bufstate);
>>>> +        break;
>>>> +    case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
>>>> +    {
>>>> +        ret = virtio_iommu_fill_page_size_mask(bufstate);
>>>> +    break;
>>>> +    }
>>>> +    default:
>>>> +        ret = -ENOENT;
>>>> +        break;
>>>> +    }
>>>> +out:
>>>> +    if (ret) {
>>>> +        error_report("%s property of type=%d could not be filled (%d),"
>>>> +                     " remaining size = 0x%lx",
>>>> +                     __func__, type, ret, bufstate->filled);
>>>> +    }
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +/**
>>>> + * virtio_iommu_probe - Fill the probe request buffer with all
>>>> + * the properties the device is able to return and add a NONE
>>>> + * property at the end. @buf points to properties[].
>>>> + */
>>>> +static int virtio_iommu_probe(VirtIOIOMMU *s,
>>>> +                              struct virtio_iommu_req_probe *req,
>>>> +                              uint8_t *buf)
>>>> +{
>>>> +    uint32_t ep_id = le32_to_cpu(req->endpoint);
>>>> +    VirtIOIOMMUEndpoint *ep = virtio_iommu_get_endpoint(s, ep_id);
>>>> +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
>>>> +    VirtIOIOMMUPropBuffer bufstate = {.start = buf, .filled = 0,
>>>> +                                       .error = false, .endpoint = ep};
>>>
>>> Probably need to check if ep is valid before going further
> 
> yes, will take care
> 
>>>
>>>> +
>>>> +    while ((type = ctz32(prop_types)) != 32) {
>>>> +        if (virtio_iommu_fill_property(type, &bufstate)) {
>>>> +            goto failure;
>>>> +        }
>>>> +        prop_types &= ~(1 << type);
>>>> +    }
>>>> +    if (virtio_iommu_fill_property(VIRTIO_IOMMU_PROBE_T_NONE, &bufstate)) {
>>>> +        goto failure;
>>>> +    }
>>>
>>> We got rid of the NONE property in the spec, now there is:
>>>
>>>  "If the device doesn’t fill all probe_size bytes with properties, it
>>>   SHOULD fill the remaining bytes of properties with zeroes."
>>>
>>> So I think you can get rid of virtio_iommu_fill_none_prop() and fill the
>>> rest of the buffer with zeroes here instead.
> 
> will do changes accordingly.
> 
> Stay Safe !!
> 
> Thanks
> -Bharat
>>>
>>> Thanks,
>>> Jean
>>>
>>>> +    return VIRTIO_IOMMU_S_OK;
>>>> +failure:
>>>> +    return VIRTIO_IOMMU_S_INVAL;
>>>> +}
>>>> +
>>>>  static int virtio_iommu_iov_to_req(struct iovec *iov,
>>>>                                     unsigned int iov_cnt,
>>>>                                     void *req, size_t req_sz)
>>>> @@ -519,6 +638,17 @@ virtio_iommu_handle_req(detach)
>>>>  virtio_iommu_handle_req(map)
>>>>  virtio_iommu_handle_req(unmap)
>>>>
>>>> +static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
>>>> +                                     struct iovec *iov,
>>>> +                                     unsigned int iov_cnt,
>>>> +                                     uint8_t *buf)
>>>> +{
>>>> +    struct virtio_iommu_req_probe req;
>>>> +    int ret = virtio_iommu_iov_to_req(iov, iov_cnt, &req, sizeof(req));
>>>> +
>>>> +    return ret ? ret : virtio_iommu_probe(s, &req, buf);
>>>> +}
>>>> +
>>>>  static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>>>>  {
>>>>      VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
>>>> @@ -564,17 +694,33 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
>>>>          case VIRTIO_IOMMU_T_UNMAP:
>>>>              tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
>>>>              break;
>>>> +        case VIRTIO_IOMMU_T_PROBE:
>>>> +        {
>>>> +            struct virtio_iommu_req_tail *ptail;
>>>> +            uint8_t *buf = g_malloc0(s->config.probe_size + sizeof(tail));
>>>> +
>>>> +            ptail = (struct virtio_iommu_req_tail *)
>>>> +                        (buf + s->config.probe_size);
>>>> +            ptail->status = virtio_iommu_handle_probe(s, iov, iov_cnt, buf);
>>>> +
>>>> +            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
>>>> +                              buf, s->config.probe_size + sizeof(tail));
>>>> +            g_free(buf);
>>>> +            assert(sz == s->config.probe_size + sizeof(tail));
>>>> +            goto push;
>>>> +        }
>>>>          default:
>>>>              tail.status = VIRTIO_IOMMU_S_UNSUPP;
>>>>          }
>>>> -        qemu_mutex_unlock(&s->mutex);
>>>>
>>>>  out:
>>>>          sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
>>>>                            &tail, sizeof(tail));
>>>>          assert(sz == sizeof(tail));
>>>>
>>>> -        virtqueue_push(vq, elem, sizeof(tail));
>>>> +push:
>>>> +        qemu_mutex_unlock(&s->mutex);
>>>> +        virtqueue_push(vq, elem, sz);
>>>>          virtio_notify(vdev, vq);
>>>>          g_free(elem);
>>>>      }
>>>> @@ -634,16 +780,23 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>>>>      VirtIOIOMMUEndpoint *ep;
>>>>      uint32_t sid, flags;
>>>>      bool bypass_allowed;
>>>> +    hwaddr addr_mask;
>>>>      bool found;
>>>>
>>>>      interval.low = addr;
>>>>      interval.high = addr + 1;
>>>>
>>>> +    if (sdev->page_size_mask) {
>>>> +        addr_mask = (1 << ctz32(sdev->page_size_mask)) - 1;
>>>> +    } else {
>>>> +        addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1;
>>>> +    }
>>>> +
>>>>      IOMMUTLBEntry entry = {
>>>>          .target_as = &address_space_memory,
>>>>          .iova = addr,
>>>>          .translated_addr = addr,
>>>> -        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
>>>> +        .addr_mask = addr_mask,
>>>>          .perm = IOMMU_NONE,
>>>>      };
>>>>
>>>> @@ -831,6 +984,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>>>      s->config.page_size_mask = TARGET_PAGE_MASK;
>>>>      s->config.input_range.end = -1UL;
>>>>      s->config.domain_range.end = 32;
>>>> +    s->config.probe_size = VIOMMU_PROBE_SIZE;
>>>>
>>>>      virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
>>>>      virtio_add_feature(&s->features, VIRTIO_RING_F_INDIRECT_DESC);
>>>> @@ -840,6 +994,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>>>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MAP_UNMAP);
>>>>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS);
>>>>      virtio_add_feature(&s->features, VIRTIO_IOMMU_F_MMIO);
>>>> +    virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
>>>>
>>>>      qemu_mutex_init(&s->mutex);
>>>>
>>>> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
>>>> index 8bae651191..b0a6e4bda3 100644
>>>> --- a/hw/virtio/trace-events
>>>> +++ b/hw/virtio/trace-events
>>>> @@ -76,3 +76,5 @@ virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uin
>>>>  virtio_iommu_notify_map(const char *name, uint64_t iova, uint64_t paddr, uint64_t map_size) "mr=%s iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64
>>>>  virtio_iommu_notify_unmap(const char *name, uint64_t iova, uint64_t map_size) "mr=%s iova=0x%"PRIx64" size=0x%"PRIx64
>>>>  virtio_iommu_remap(uint64_t iova, uint64_t pa, uint64_t size) "iova=0x%"PRIx64" pa=0x%" PRIx64" size=0x%"PRIx64""
>>>> +virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
>>>> +virtio_iommu_fill_pgsize_mask_property(uint32_t devid, uint64_t pgsize_mask, size_t filled) "dev= %d, pgsize_mask=0x%"PRIx64" filled=0x%lx"
>>>> --
>>>> 2.17.1
>>>>
>>>
>>
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, back to index

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-23  8:46 [PATCH v9 0/9] virtio-iommu: VFIO integration Bharat Bhushan
2020-03-23  8:46 ` [PATCH v9 1/9] hw/vfio/common: Remove error print on mmio region translation by viommu Bharat Bhushan
2020-03-23 23:08   ` Alex Williamson
2020-03-26 17:35     ` Auger Eric
2020-03-26 17:53       ` Alex Williamson
2020-03-27  5:50         ` [EXT] " Bharat Bhushan
2020-04-02  9:01         ` Bharat Bhushan
2020-04-24 14:17           ` Auger Eric
2020-05-05  9:25             ` Bharat Bhushan
2020-05-05  9:30               ` Auger Eric
2020-05-05  9:46                 ` Bharat Bhushan
2020-05-05 10:18                   ` Bharat Bhushan
2020-05-05 12:05                     ` Auger Eric
2020-05-07 14:40                     ` Auger Eric
2020-03-23  8:46 ` [PATCH v9 2/9] memory: Add interface to set iommu page size mask Bharat Bhushan
2020-03-26 16:06   ` Auger Eric
2020-03-27  5:33     ` [EXT] " Bharat Bhushan
2020-03-27  8:27       ` Auger Eric
2020-03-23  8:46 ` [PATCH v9 3/9] vfio: set iommu page size as per host supported page size Bharat Bhushan
2020-03-23  8:46 ` [PATCH v9 4/9] virtio-iommu: set supported page size mask Bharat Bhushan
2020-03-26 15:51   ` Auger Eric
2020-03-27  5:13     ` [EXT] " Bharat Bhushan
2020-03-27  8:28       ` Auger Eric
2020-03-23  8:46 ` [PATCH v9 5/9] virtio-iommu: Add iommu notifier for map/unmap Bharat Bhushan
2020-03-23  8:46 ` [PATCH v9 6/9] virtio-iommu: Call iommu notifier for attach/detach Bharat Bhushan
2020-03-23  8:46 ` [PATCH v9 7/9] virtio-iommu: add iommu replay Bharat Bhushan
2020-03-23  8:46 ` [PATCH v9 8/9] virtio-iommu: Implement probe request Bharat Bhushan
2020-03-26 15:48   ` Auger Eric
2020-03-27  5:40     ` [EXT] " Bharat Bhushan
2020-03-27  8:34       ` Auger Eric
2020-04-23 16:09   ` Jean-Philippe Brucker
2020-04-24 13:51     ` Auger Eric
2020-05-05  9:06       ` Bharat Bhushan
2020-05-07 14:42         ` Auger Eric
2020-03-23  8:46 ` [PATCH v9 9/9] virtio-iommu: add iommu notifier memory-region Bharat Bhushan
2020-03-23  9:52 ` [PATCH v9 0/9] virtio-iommu: VFIO integration no-reply
2020-03-23  9:59 ` no-reply

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
		qemu-devel@nongnu.org
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.nongnu.qemu-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git