* [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration
@ 2019-05-27 11:41 Eric Auger
From: Eric Auger @ 2019-05-27 11:41 UTC
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Up to now vSMMUv3 has not been integrated with VFIO. VFIO
integration requires programming the physical IOMMU consistently
with the guest mappings. However, as opposed to VTD, SMMUv3 has
no "Caching Mode" which allows easy trapping of guest mappings.
This means the vSMMUv3 cannot use the same VFIO integration as VTD.

However, SMMUv3 has 2 translation stages. This was devised with
the virtualization use case in mind, where stage 1 is "owned" by
the guest whereas the host uses stage 2 for VM isolation.
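
To make the two-stage idea concrete, here is a minimal toy sketch of a
nested walk: stage 1 maps a device IOVA to a guest intermediate physical
address (IPA), and stage 2 maps that IPA to a host physical address. The
ToyMapping type and the toy_* functions are hypothetical names invented
for this illustration. They model each stage as a single contiguous
range and are not taken from QEMU or from the SMMUv3 page table format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy type: one contiguous range mapping per stage. */
typedef struct {
    uint64_t in_base;   /* first input address covered by the range */
    uint64_t size;      /* length of the range in bytes */
    uint64_t out_base;  /* output address corresponding to in_base */
} ToyMapping;

/* Translate addr through one stage; returns false on a miss. */
static bool toy_translate(const ToyMapping *m, uint64_t addr, uint64_t *out)
{
    assert(m != NULL && out != NULL);
    if (addr < m->in_base || addr - m->in_base >= m->size) {
        return false;
    }
    *out = m->out_base + (addr - m->in_base);
    return true;
}

/*
 * Nested walk: stage 1 (guest owned) turns the IOVA into an IPA,
 * then stage 2 (host owned) turns the IPA into a PA.
 */
static bool toy_nested_translate(const ToyMapping *s1, const ToyMapping *s2,
                                 uint64_t iova, uint64_t *pa)
{
    uint64_t ipa;

    if (!toy_translate(s1, iova, &ipa)) {
        return false;   /* stage 1 miss */
    }
    return toy_translate(s2, ipa, pa);
}
```

The point of the split is that the guest can rewrite the stage 1 range
at will while the host keeps sole control of the stage 2 range, so a
guest mapping can never reach memory outside what stage 2 allows.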

This series sets up this nested translation stage. It only works
if there is one physical SMMUv3 used along with QEMU vSMMUv3 (in
other words, it does not work if there is a physical SMMUv2).

The series uses a new kernel user API [1], still under definition.

- We force the host to use stage 2 instead of stage 1, when we
  detect a vSMMUv3 is behind a VFIO device. For a VFIO device
  without any virtual IOMMU, we still use stage 1 as many existing
  SMMUs expect this behavior.
- We introduce new IOTLB "config" notifiers, requested to notify
  changes in the config of a given iommu memory region. So now
  we have notifiers for IOTLB changes and config changes.
- vSMMUv3 calls config notifiers when STE (Stream Table Entries)
  are updated by the guest.
- We implement a specific UNMAP notifier that conveys guest
  IOTLB invalidations to the host.
- We implement a new MAP notifier, only used for MSI IOVAs, so
  that the host can build a nested stage translation for MSI IOVAs.
- As the legacy MAP notifier is not called anymore, we must make
  sure stage 2 mappings are set. This is achieved through another
  memory listener.
- Physical SMMU faults are reported via an eventfd mechanism and
  reinjected into the guest.
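
As background for the last item, this is a minimal sketch of how an
eventfd carries a "something happened" signal from a producer to a
consumer on Linux. It only illustrates the generic eventfd(2) counter
semantics. The toy_fault_signal() and toy_fault_drain() names are
hypothetical, and the sketch does not show how the fault records
themselves are transported (the patch titles in this series mention
dedicated fault regions for that).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Producer side: add one to the eventfd counter to flag a pending event. */
static int toy_fault_signal(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Consumer side: read the counter, which also resets it to zero.
 * On success *count holds the number of signals since the last drain.
 */
static int toy_fault_drain(int fd, uint64_t *count)
{
    assert(fd >= 0 && count != NULL);
    return read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

With a descriptor created by eventfd(0, EFD_NONBLOCK), several signals
coalesce into a single wakeup, so a consumer should treat the count as
a hint and process every pending record after draining.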

Note: The first patch is a code cleanup and was sent separately.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v4.0.0-2stage-rfcv4

Compatible with kernel series:
[PATCH v8 00/29] SMMUv3 Nested Stage Setup
(https://lkml.org/lkml/2019/5/26/95)

History:
v3 -> v4:
- adapt to changes in uapi (asid cache invalidation)
- check VFIO_PCI_DMA_FAULT_IRQ_INDEX is supported at kernel level
  before attempting to set signaling for it.
- sync on 5.2-rc1 kernel headers + Drew's patch that imports sve_context.h
- fix MSI binding for MSI (not MSIX)
- fix mingw compilation

v2 -> v3:
- rework fault handling
- MSI binding registration done in vfio-pci. MSI binding tear down called
  on container cleanup path
- leaf parameter propagated

v1 -> v2:
- Fixed dual assignment (asid now correctly propagated on TLB invalidations)
- Integrated fault reporting


Andrew Jones (1):
  update-linux-headers: Add sve_context.h to asm-arm64

Eric Auger (26):
  vfio/common: Introduce vfio_set_irq_signaling helper
  update-linux-headers: Import iommu.h
  header update against 5.2.0-rc1 and IOMMU/VFIO nested stage APIs
  memory: add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute
  memory: add IOMMU_ATTR_MSI_TRANSLATE IOMMU memory region attribute
  hw/arm/smmuv3: Advertise VFIO_NESTED and MSI_TRANSLATE attributes
  hw/vfio/common: Force nested if iommu requires it
  memory: Prepare for different kinds of IOMMU MR notifiers
  memory: Add IOMMUConfigNotifier
  memory: Add arch_id and leaf fields in IOTLBEntry
  hw/arm/smmuv3: Store the PASID table GPA in the translation config
  hw/arm/smmuv3: Implement dummy replay
  hw/arm/smmuv3: Fill the IOTLBEntry arch_id on NH_VA invalidation
  hw/arm/smmuv3: Fill the IOTLBEntry leaf field on NH_VA invalidation
  hw/arm/smmuv3: Notify on config changes
  hw/vfio/common: Introduce vfio_alloc_guest_iommu helper
  hw/vfio/common: Introduce hostwin_from_range helper
  hw/vfio/common: Introduce helpers to DMA map/unmap a RAM section
  hw/vfio/common: Setup nested stage mappings
  hw/vfio/common: Register a MAP notifier for MSI binding
  vfio-pci: Expose MSI stage 1 bindings to the host
  memory: Introduce IOMMU Memory Region inject_faults API
  hw/arm/smmuv3: Implement fault injection
  vfio-pci: register handler for iommu fault
  vfio-pci: Set up fault regions
  vfio-pci: Implement the DMA fault handler

 exec.c                          |  12 +-
 hw/arm/smmu-common.c            |  10 +-
 hw/arm/smmuv3.c                 | 198 +++++++++--
 hw/arm/trace-events             |   3 +-
 hw/i386/amd_iommu.c             |   2 +-
 hw/i386/intel_iommu.c           |  25 +-
 hw/misc/tz-mpc.c                |   8 +-
 hw/ppc/spapr_iommu.c            |   2 +-
 hw/s390x/s390-pci-inst.c        |   4 +-
 hw/vfio/common.c                | 572 ++++++++++++++++++++++++++------
 hw/vfio/pci.c                   | 471 ++++++++++++++++----------
 hw/vfio/pci.h                   |   4 +
 hw/vfio/platform.c              |  54 ++-
 hw/vfio/trace-events            |   8 +-
 hw/virtio/vhost.c               |  14 +-
 include/exec/memory.h           | 158 +++++++--
 include/hw/arm/smmu-common.h    |   1 +
 include/hw/vfio/vfio-common.h   |  10 +
 linux-headers/linux/iommu.h     | 280 ++++++++++++++++
 linux-headers/linux/vfio.h      | 107 ++++++
 memory.c                        |  67 +++-
 scripts/update-linux-headers.sh |   5 +-
 22 files changed, 1593 insertions(+), 422 deletions(-)
 create mode 100644 linux-headers/linux/iommu.h

-- 
2.20.1




* [Qemu-devel] [RFC v4 01/27] vfio/common: Introduce vfio_set_irq_signaling helper
@ 2019-05-27 11:41 ` Eric Auger
From: Eric Auger @ 2019-05-27 11:41 UTC
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

The code used to assign an interrupt index/subindex to an
eventfd is duplicated many times. Let's introduce a helper that
sets or unsets the signaling for an ACTION_TRIGGER,
ACTION_MASK or ACTION_UNMASK action.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v1 -> v2:
- don't call GET_IRQ_INFO in vfio_set_irq_signaling()
  and restore quiet check in vfio_register_req_notifier.
  Nicer display of the IRQ name.

This is a follow-up to
[PATCH v2 0/2] vfio-pci: Introduce vfio_set_event_handler().
It looks to me that introducing vfio_set_irq_signaling() has more
benefits in terms of code reduction and the helper abstraction
looks cleaner.
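
For readers unfamiliar with the pattern being factored out, here is a
standalone sketch of the buffer each call site used to build by hand: a
fixed header followed by one 32-bit eventfd in the trailing data area.
struct toy_irq_set and the TOY_* flags are local stand-ins written for
this illustration. They are intended to mirror struct vfio_irq_set and
the VFIO_IRQ_SET_* bits from <linux/vfio.h>, but treat the exact values
as assumptions and check the real header. No ioctl is issued here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for struct vfio_irq_set (assumed layout). */
struct toy_irq_set {
    uint32_t argsz;
    uint32_t flags;
    uint32_t index;
    uint32_t start;
    uint32_t count;
    uint8_t  data[];
};

/* Local stand-ins for the VFIO_IRQ_SET_* flag bits (assumed values). */
#define TOY_IRQ_SET_DATA_EVENTFD   (1u << 2)
#define TOY_IRQ_SET_ACTION_MASK    (1u << 3)
#define TOY_IRQ_SET_ACTION_UNMASK  (1u << 4)
#define TOY_IRQ_SET_ACTION_TRIGGER (1u << 5)

/* Build the argument buffer for one index/subindex; fd == -1 means tear down. */
static struct toy_irq_set *toy_build_irq_set(uint32_t index, uint32_t subindex,
                                             uint32_t action, int32_t fd)
{
    size_t argsz = sizeof(struct toy_irq_set) + sizeof(fd);
    struct toy_irq_set *s = calloc(1, argsz);

    assert(s != NULL);
    s->argsz = (uint32_t)argsz;
    s->flags = TOY_IRQ_SET_DATA_EVENTFD | action;
    s->index = index;
    s->start = subindex;
    s->count = 1;
    memcpy(s->data, &fd, sizeof(fd));   /* the single eventfd payload */
    return s;
}

/* Read the eventfd back out of the trailing data area. */
static int32_t toy_irq_set_fd(const struct toy_irq_set *s)
{
    int32_t fd;

    memcpy(&fd, s->data, sizeof(fd));
    return fd;
}
```

Only the index, subindex, action and fd vary between call sites, which
is why they end up as the parameters of the helper.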
---
 hw/vfio/common.c              |  78 ++++++++++++
 hw/vfio/pci.c                 | 217 ++++++++--------------------------
 hw/vfio/platform.c            |  54 +++------
 include/hw/vfio/vfio-common.h |   2 +
 4 files changed, 150 insertions(+), 201 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 4374cc6176..1f1deff360 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -95,6 +95,84 @@ void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index)
     ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 
+static inline const char *action_to_str(int action)
+{
+    switch (action) {
+    case VFIO_IRQ_SET_ACTION_MASK:
+        return "MASK";
+    case VFIO_IRQ_SET_ACTION_UNMASK:
+        return "UNMASK";
+    case VFIO_IRQ_SET_ACTION_TRIGGER:
+        return "TRIGGER";
+    default:
+        return "UNKNOWN ACTION";
+    }
+}
+
+static char *irq_to_str(int index, int subindex)
+{
+    char *str;
+
+    switch (index) {
+    case VFIO_PCI_INTX_IRQ_INDEX:
+        str = g_strdup_printf("INTX-%d", subindex);
+        break;
+    case VFIO_PCI_MSI_IRQ_INDEX:
+        str = g_strdup_printf("MSI-%d", subindex);
+        break;
+    case VFIO_PCI_MSIX_IRQ_INDEX:
+        str = g_strdup_printf("MSIX-%d", subindex);
+        break;
+    case VFIO_PCI_ERR_IRQ_INDEX:
+        str = g_strdup_printf("ERR-%d", subindex);
+        break;
+    case VFIO_PCI_REQ_IRQ_INDEX:
+        str = g_strdup_printf("REQ-%d", subindex);
+        break;
+    default:
+        str = g_strdup_printf("index %d (unknown)", index);
+        break;
+    }
+    return str;
+}
+
+int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,
+                           int action, int fd, Error **errp)
+{
+    struct vfio_irq_set *irq_set;
+    int argsz, ret = 0;
+    int32_t *pfd;
+    char *irq_name;
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | action;
+    irq_set->index = index;
+    irq_set->start = subindex;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+    *pfd = fd;
+
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+    g_free(irq_set);
+
+    if (!ret) {
+        return 0;
+    }
+
+    error_setg_errno(errp, -ret, "VFIO_DEVICE_SET_IRQS failure");
+    irq_name = irq_to_str(index, subindex);
+    error_prepend(errp,
+                  "Failed to %s %s eventfd signaling for interrupt %s: ",
+                  fd < 0 ? "tear down" : "set up", action_to_str(action),
+                  irq_name);
+    g_free(irq_name);
+    return ret;
+}
+
 /*
  * IO Port/MMIO - Beware of the endians, VFIO is always little endian
  */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 8e555db12e..3095379747 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -113,9 +113,7 @@ static void vfio_intx_enable_kvm(VFIOPCIDevice *vdev, Error **errp)
         .gsi = vdev->intx.route.irq,
         .flags = KVM_IRQFD_FLAG_RESAMPLE,
     };
-    struct vfio_irq_set *irq_set;
-    int ret, argsz;
-    int32_t *pfd;
+    Error *err = NULL;
 
     if (vdev->no_kvm_intx || !kvm_irqfds_enabled() ||
         vdev->intx.route.mode != PCI_INTX_ENABLED ||
@@ -143,22 +141,10 @@ static void vfio_intx_enable_kvm(VFIOPCIDevice *vdev, Error **errp)
         goto fail_irqfd;
     }
 
-    argsz = sizeof(*irq_set) + sizeof(*pfd);
-
-    irq_set = g_malloc0(argsz);
-    irq_set->argsz = argsz;
-    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
-    irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
-    irq_set->start = 0;
-    irq_set->count = 1;
-    pfd = (int32_t *)&irq_set->data;
-
-    *pfd = irqfd.resamplefd;
-
-    ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
-    g_free(irq_set);
-    if (ret) {
-        error_setg_errno(errp, -ret, "failed to setup INTx unmask fd");
+    if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX, 0,
+                               VFIO_IRQ_SET_ACTION_UNMASK,
+                               irqfd.resamplefd, &err)) {
+        error_propagate(errp, err);
         goto fail_vfio;
     }
 
@@ -262,10 +248,10 @@ static void vfio_intx_update(PCIDevice *pdev)
 static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
 {
     uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
-    int ret, argsz, retval = 0;
-    struct vfio_irq_set *irq_set;
-    int32_t *pfd;
     Error *err = NULL;
+    int32_t fd;
+    int ret;
+
 
     if (!pin) {
         return 0;
@@ -292,27 +278,15 @@ static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
         error_setg_errno(errp, -ret, "event_notifier_init failed");
         return ret;
     }
+    fd = event_notifier_get_fd(&vdev->intx.interrupt);
+    qemu_set_fd_handler(fd, vfio_intx_interrupt, NULL, vdev);
 
-    argsz = sizeof(*irq_set) + sizeof(*pfd);
-
-    irq_set = g_malloc0(argsz);
-    irq_set->argsz = argsz;
-    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
-    irq_set->index = VFIO_PCI_INTX_IRQ_INDEX;
-    irq_set->start = 0;
-    irq_set->count = 1;
-    pfd = (int32_t *)&irq_set->data;
-
-    *pfd = event_notifier_get_fd(&vdev->intx.interrupt);
-    qemu_set_fd_handler(*pfd, vfio_intx_interrupt, NULL, vdev);
-
-    ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
-    if (ret) {
-        error_setg_errno(errp, -ret, "failed to setup INTx fd");
-        qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
+    if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX, 0,
+                               VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) {
+        error_propagate(errp, err);
+        qemu_set_fd_handler(fd, NULL, NULL, vdev);
         event_notifier_cleanup(&vdev->intx.interrupt);
-        retval = -errno;
-        goto cleanup;
+        return -errno;
     }
 
     vfio_intx_enable_kvm(vdev, &err);
@@ -323,11 +297,7 @@ static int vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp)
     vdev->interrupt = VFIO_INT_INTx;
 
     trace_vfio_intx_enable(vdev->vbasedev.name);
-
-cleanup:
-    g_free(irq_set);
-
-    return retval;
+    return 0;
 }
 
 static void vfio_intx_disable(VFIOPCIDevice *vdev)
@@ -530,31 +500,19 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
             error_report("vfio: failed to enable vectors, %d", ret);
         }
     } else {
-        int argsz;
-        struct vfio_irq_set *irq_set;
-        int32_t *pfd;
-
-        argsz = sizeof(*irq_set) + sizeof(*pfd);
-
-        irq_set = g_malloc0(argsz);
-        irq_set->argsz = argsz;
-        irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
-                         VFIO_IRQ_SET_ACTION_TRIGGER;
-        irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-        irq_set->start = nr;
-        irq_set->count = 1;
-        pfd = (int32_t *)&irq_set->data;
+        Error *err = NULL;
+        int32_t fd;
 
         if (vector->virq >= 0) {
-            *pfd = event_notifier_get_fd(&vector->kvm_interrupt);
+            fd = event_notifier_get_fd(&vector->kvm_interrupt);
         } else {
-            *pfd = event_notifier_get_fd(&vector->interrupt);
+            fd = event_notifier_get_fd(&vector->interrupt);
         }
 
-        ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
-        g_free(irq_set);
-        if (ret) {
-            error_report("vfio: failed to modify vector, %d", ret);
+        if (vfio_set_irq_signaling(&vdev->vbasedev,
+                                     VFIO_PCI_MSIX_IRQ_INDEX, nr,
+                                     VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) {
+            error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
         }
     }
 
@@ -591,26 +549,10 @@ static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
      * be re-asserted on unmask.  Nothing to do if already using QEMU mode.
      */
     if (vector->virq >= 0) {
-        int argsz;
-        struct vfio_irq_set *irq_set;
-        int32_t *pfd;
+        int32_t fd = event_notifier_get_fd(&vector->interrupt);
 
-        argsz = sizeof(*irq_set) + sizeof(*pfd);
-
-        irq_set = g_malloc0(argsz);
-        irq_set->argsz = argsz;
-        irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
-                         VFIO_IRQ_SET_ACTION_TRIGGER;
-        irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-        irq_set->start = nr;
-        irq_set->count = 1;
-        pfd = (int32_t *)&irq_set->data;
-
-        *pfd = event_notifier_get_fd(&vector->interrupt);
-
-        ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-        g_free(irq_set);
+        vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX, nr,
+                               VFIO_IRQ_SET_ACTION_TRIGGER, fd, NULL);
     }
 }
 
@@ -2636,10 +2578,8 @@ static void vfio_err_notifier_handler(void *opaque)
  */
 static void vfio_register_err_notifier(VFIOPCIDevice *vdev)
 {
-    int ret;
-    int argsz;
-    struct vfio_irq_set *irq_set;
-    int32_t *pfd;
+    Error *err = NULL;
+    int32_t fd;
 
     if (!vdev->pci_aer) {
         return;
@@ -2651,58 +2591,30 @@ static void vfio_register_err_notifier(VFIOPCIDevice *vdev)
         return;
     }
 
-    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    fd = event_notifier_get_fd(&vdev->err_notifier);
+    qemu_set_fd_handler(fd, vfio_err_notifier_handler, NULL, vdev);
 
-    irq_set = g_malloc0(argsz);
-    irq_set->argsz = argsz;
-    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
-                     VFIO_IRQ_SET_ACTION_TRIGGER;
-    irq_set->index = VFIO_PCI_ERR_IRQ_INDEX;
-    irq_set->start = 0;
-    irq_set->count = 1;
-    pfd = (int32_t *)&irq_set->data;
-
-    *pfd = event_notifier_get_fd(&vdev->err_notifier);
-    qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev);
-
-    ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
-    if (ret) {
-        error_report("vfio: Failed to set up error notification");
-        qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
+    if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_ERR_IRQ_INDEX, 0,
+                               VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) {
+        error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
+        qemu_set_fd_handler(fd, NULL, NULL, vdev);
         event_notifier_cleanup(&vdev->err_notifier);
         vdev->pci_aer = false;
     }
-    g_free(irq_set);
 }
 
 static void vfio_unregister_err_notifier(VFIOPCIDevice *vdev)
 {
-    int argsz;
-    struct vfio_irq_set *irq_set;
-    int32_t *pfd;
-    int ret;
+    Error *err = NULL;
 
     if (!vdev->pci_aer) {
         return;
     }
 
-    argsz = sizeof(*irq_set) + sizeof(*pfd);
-
-    irq_set = g_malloc0(argsz);
-    irq_set->argsz = argsz;
-    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
-                     VFIO_IRQ_SET_ACTION_TRIGGER;
-    irq_set->index = VFIO_PCI_ERR_IRQ_INDEX;
-    irq_set->start = 0;
-    irq_set->count = 1;
-    pfd = (int32_t *)&irq_set->data;
-    *pfd = -1;
-
-    ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
-    if (ret) {
-        error_report("vfio: Failed to de-assign error fd: %m");
+    if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_ERR_IRQ_INDEX, 0,
+                               VFIO_IRQ_SET_ACTION_TRIGGER, -1, &err)) {
+        error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
     }
-    g_free(irq_set);
     qemu_set_fd_handler(event_notifier_get_fd(&vdev->err_notifier),
                         NULL, NULL, vdev);
     event_notifier_cleanup(&vdev->err_notifier);
@@ -2727,9 +2639,8 @@ static void vfio_register_req_notifier(VFIOPCIDevice *vdev)
 {
     struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info),
                                       .index = VFIO_PCI_REQ_IRQ_INDEX };
-    int argsz;
-    struct vfio_irq_set *irq_set;
-    int32_t *pfd;
+    Error *err = NULL;
+    int32_t fd;
 
     if (!(vdev->features & VFIO_FEATURE_ENABLE_REQ)) {
         return;
@@ -2745,57 +2656,31 @@ static void vfio_register_req_notifier(VFIOPCIDevice *vdev)
         return;
     }
 
-    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    fd = event_notifier_get_fd(&vdev->req_notifier);
+    qemu_set_fd_handler(fd, vfio_req_notifier_handler, NULL, vdev);
 
-    irq_set = g_malloc0(argsz);
-    irq_set->argsz = argsz;
-    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
-                     VFIO_IRQ_SET_ACTION_TRIGGER;
-    irq_set->index = VFIO_PCI_REQ_IRQ_INDEX;
-    irq_set->start = 0;
-    irq_set->count = 1;
-    pfd = (int32_t *)&irq_set->data;
-
-    *pfd = event_notifier_get_fd(&vdev->req_notifier);
-    qemu_set_fd_handler(*pfd, vfio_req_notifier_handler, NULL, vdev);
-
-    if (ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set)) {
-        error_report("vfio: Failed to set up device request notification");
-        qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
+    if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_REQ_IRQ_INDEX, 0,
+                           VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) {
+        error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
+        qemu_set_fd_handler(fd, NULL, NULL, vdev);
         event_notifier_cleanup(&vdev->req_notifier);
     } else {
         vdev->req_enabled = true;
     }
-
-    g_free(irq_set);
 }
 
 static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
 {
-    int argsz;
-    struct vfio_irq_set *irq_set;
-    int32_t *pfd;
+    Error *err = NULL;
 
     if (!vdev->req_enabled) {
         return;
     }
 
-    argsz = sizeof(*irq_set) + sizeof(*pfd);
-
-    irq_set = g_malloc0(argsz);
-    irq_set->argsz = argsz;
-    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
-                     VFIO_IRQ_SET_ACTION_TRIGGER;
-    irq_set->index = VFIO_PCI_REQ_IRQ_INDEX;
-    irq_set->start = 0;
-    irq_set->count = 1;
-    pfd = (int32_t *)&irq_set->data;
-    *pfd = -1;
-
-    if (ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set)) {
-        error_report("vfio: Failed to de-assign device request fd: %m");
+    if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_REQ_IRQ_INDEX, 0,
+                               VFIO_IRQ_SET_ACTION_TRIGGER, -1, &err)) {
+        error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
     }
-    g_free(irq_set);
     qemu_set_fd_handler(event_notifier_get_fd(&vdev->req_notifier),
                         NULL, NULL, vdev);
     event_notifier_cleanup(&vdev->req_notifier);
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index d52d6552e0..98f8038f31 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -106,26 +106,19 @@ static int vfio_set_trigger_eventfd(VFIOINTp *intp,
                                     eventfd_user_side_handler_t handler)
 {
     VFIODevice *vbasedev = &intp->vdev->vbasedev;
-    struct vfio_irq_set *irq_set;
-    int argsz, ret;
-    int32_t *pfd;
+    int32_t fd = event_notifier_get_fd(intp->interrupt);
+    Error *err = NULL;
+    int ret;
 
-    argsz = sizeof(*irq_set) + sizeof(*pfd);
-    irq_set = g_malloc0(argsz);
-    irq_set->argsz = argsz;
-    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
-    irq_set->index = intp->pin;
-    irq_set->start = 0;
-    irq_set->count = 1;
-    pfd = (int32_t *)&irq_set->data;
-    *pfd = event_notifier_get_fd(intp->interrupt);
-    qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
-    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
-    if (ret < 0) {
-        error_report("vfio: Failed to set trigger eventfd: %m");
-        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+    qemu_set_fd_handler(fd, (IOHandler *)handler, NULL, intp);
+
+    ret = vfio_set_irq_signaling(vbasedev, intp->pin, 0,
+                                 VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err);
+    if (ret) {
+        error_reportf_err(err, VFIO_MSG_PREFIX, vbasedev->name);
+        qemu_set_fd_handler(fd, NULL, NULL, NULL);
     }
-    g_free(irq_set);
+
     return ret;
 }
 
@@ -361,25 +354,16 @@ static void vfio_start_eventfd_injection(SysBusDevice *sbdev, qemu_irq irq)
  */
 static int vfio_set_resample_eventfd(VFIOINTp *intp)
 {
+    int32_t fd = event_notifier_get_fd(intp->unmask);
     VFIODevice *vbasedev = &intp->vdev->vbasedev;
-    struct vfio_irq_set *irq_set;
-    int argsz, ret;
-    int32_t *pfd;
+    Error *err = NULL;
+    int ret;
 
-    argsz = sizeof(*irq_set) + sizeof(*pfd);
-    irq_set = g_malloc0(argsz);
-    irq_set->argsz = argsz;
-    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
-    irq_set->index = intp->pin;
-    irq_set->start = 0;
-    irq_set->count = 1;
-    pfd = (int32_t *)&irq_set->data;
-    *pfd = event_notifier_get_fd(intp->unmask);
-    qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
-    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
-    g_free(irq_set);
-    if (ret < 0) {
-        error_report("vfio: Failed to set resample eventfd: %m");
+    qemu_set_fd_handler(fd, NULL, NULL, NULL);
+    ret = vfio_set_irq_signaling(vbasedev, intp->pin, 0,
+                                 VFIO_IRQ_SET_ACTION_UNMASK, fd, &err);
+    if (ret) {
+        error_reportf_err(err, VFIO_MSG_PREFIX, vbasedev->name);
     }
     return ret;
 }
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 1155b79678..686d99ff8c 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -167,6 +167,8 @@ void vfio_put_base_device(VFIODevice *vbasedev);
 void vfio_disable_irqindex(VFIODevice *vbasedev, int index);
 void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index);
 void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index);
+int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,
+                           int action, int fd, Error **errp);
 void vfio_region_write(void *opaque, hwaddr addr,
                            uint64_t data, unsigned size);
 uint64_t vfio_region_read(void *opaque,
-- 
2.20.1




* [Qemu-devel] [RFC v4 02/27] update-linux-headers: Import iommu.h
@ 2019-05-27 11:41 ` Eric Auger
From: Eric Auger @ 2019-05-27 11:41 UTC
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Update the script to import the new iommu.h uapi header.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 scripts/update-linux-headers.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index f76d77363b..dfdfdfddcf 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -141,7 +141,7 @@ done
 
 rm -rf "$output/linux-headers/linux"
 mkdir -p "$output/linux-headers/linux"
-for header in kvm.h vfio.h vfio_ccw.h vhost.h \
+for header in kvm.h vfio.h vfio_ccw.h vhost.h iommu.h \
               psci.h psp-sev.h userfaultfd.h mman.h; do
     cp "$tmpdir/include/linux/$header" "$output/linux-headers/linux"
 done
-- 
2.20.1




* [Qemu-devel] [RFC v4 03/27] update-linux-headers: Add sve_context.h to asm-arm64
@ 2019-05-27 11:41 ` Eric Auger
From: Eric Auger @ 2019-05-27 11:41 UTC
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

From: Andrew Jones <drjones@redhat.com>

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 scripts/update-linux-headers.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index dfdfdfddcf..c97d485b08 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -99,6 +99,9 @@ for arch in $ARCHLIST; do
         cp "$tmpdir/include/asm/$header" "$output/linux-headers/asm-$arch"
     done
 
+    if [ $arch = arm64 ]; then
+        cp "$tmpdir/include/asm/sve_context.h" "$output/linux-headers/asm-arm64/"
+    fi
     if [ $arch = mips ]; then
         cp "$tmpdir/include/asm/sgidefs.h" "$output/linux-headers/asm-mips/"
         cp "$tmpdir/include/asm/unistd_o32.h" "$output/linux-headers/asm-mips/"
-- 
2.20.1




* [Qemu-devel] [RFC v4 04/27] header update against 5.2.0-rc1 and IOMMU/VFIO nested stage APIs
@ 2019-05-27 11:41 ` Eric Auger
From: Eric Auger @ 2019-05-27 11:41 UTC
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

This is an update against the following development branch:
https://github.com/eauger/linux/tree/v5.2.0-rc1-2stage-v8.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 linux-headers/linux/iommu.h | 280 ++++++++++++++++++++++++++++++++++++
 linux-headers/linux/vfio.h  | 107 ++++++++++++++
 2 files changed, 387 insertions(+)
 create mode 100644 linux-headers/linux/iommu.h

diff --git a/linux-headers/linux/iommu.h b/linux-headers/linux/iommu.h
new file mode 100644
index 0000000000..0a59d6439c
--- /dev/null
+++ b/linux-headers/linux/iommu.h
@@ -0,0 +1,280 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * IOMMU user API definitions
+ */
+
+#ifndef _IOMMU_H
+#define _IOMMU_H
+
+#include <linux/types.h>
+
+#define IOMMU_FAULT_PERM_READ	(1 << 0) /* read */
+#define IOMMU_FAULT_PERM_WRITE	(1 << 1) /* write */
+#define IOMMU_FAULT_PERM_EXEC	(1 << 2) /* exec */
+#define IOMMU_FAULT_PERM_PRIV	(1 << 3) /* privileged */
+
+/* Generic fault types, can be expanded IRQ remapping fault */
+enum iommu_fault_type {
+	IOMMU_FAULT_DMA_UNRECOV = 1,	/* unrecoverable fault */
+	IOMMU_FAULT_PAGE_REQ,		/* page request fault */
+};
+
+enum iommu_fault_reason {
+	IOMMU_FAULT_REASON_UNKNOWN = 0,
+
+	/* Could not access the PASID table (fetch caused external abort) */
+	IOMMU_FAULT_REASON_PASID_FETCH,
+
+	/* PASID entry is invalid or has configuration errors */
+	IOMMU_FAULT_REASON_BAD_PASID_ENTRY,
+
+	/*
+	 * PASID is out of range (e.g. exceeds the maximum PASID
+	 * supported by the IOMMU) or disabled.
+	 */
+	IOMMU_FAULT_REASON_PASID_INVALID,
+
+	/*
+	 * An external abort occurred fetching (or updating) a translation
+	 * table descriptor
+	 */
+	IOMMU_FAULT_REASON_WALK_EABT,
+
+	/*
+	 * Could not access the page table entry (Bad address),
+	 * actual translation fault
+	 */
+	IOMMU_FAULT_REASON_PTE_FETCH,
+
+	/* Protection flag check failed */
+	IOMMU_FAULT_REASON_PERMISSION,
+
+	/* access flag check failed */
+	IOMMU_FAULT_REASON_ACCESS,
+
+	/* Output address of a translation stage caused Address Size fault */
+	IOMMU_FAULT_REASON_OOR_ADDRESS,
+};
+
+/**
+ * struct iommu_fault_unrecoverable - Unrecoverable fault data
+ * @reason: reason of the fault, from &enum iommu_fault_reason
+ * @flags: parameters of this fault (IOMMU_FAULT_UNRECOV_* values)
+ * @pasid: Process Address Space ID
+ * @perm: Requested permission access using by the incoming transaction
+ *        (IOMMU_FAULT_PERM_* values)
+ * @addr: offending page address
+ * @fetch_addr: address that caused a fetch abort, if any
+ */
+struct iommu_fault_unrecoverable {
+	__u32	reason;
+#define IOMMU_FAULT_UNRECOV_PASID_VALID		(1 << 0)
+#define IOMMU_FAULT_UNRECOV_ADDR_VALID		(1 << 1)
+#define IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID	(1 << 2)
+	__u32	flags;
+	__u32	pasid;
+	__u32	perm;
+	__u64	addr;
+	__u64	fetch_addr;
+};
+
+/**
+ * struct iommu_fault_page_request - Page Request data
+ * @flags: encodes whether the corresponding fields are valid and whether this
+ *         is the last page in group (IOMMU_FAULT_PAGE_REQUEST_* values)
+ * @pasid: Process Address Space ID
+ * @grpid: Page Request Group Index
+ * @perm: requested page permissions (IOMMU_FAULT_PERM_* values)
+ * @addr: page address
+ * @private_data: device-specific private information
+ */
+struct iommu_fault_page_request {
+#define IOMMU_FAULT_PAGE_REQUEST_PASID_VALID	(1 << 0)
+#define IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE	(1 << 1)
+#define IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA	(1 << 2)
+	__u32	flags;
+	__u32	pasid;
+	__u32	grpid;
+	__u32	perm;
+	__u64	addr;
+	__u64	private_data[2];
+};
+
+/**
+ * struct iommu_fault - Generic fault data
+ * @type: fault type from &enum iommu_fault_type
+ * @padding: reserved for future use (should be zero)
+ * @event: Fault event, when @type is %IOMMU_FAULT_DMA_UNRECOV
+ * @prm: Page Request message, when @type is %IOMMU_FAULT_PAGE_REQ
+ */
+struct iommu_fault {
+	__u32	type;
+	__u32	padding;
+	union {
+		struct iommu_fault_unrecoverable event;
+		struct iommu_fault_page_request prm;
+	};
+};
+
+/**
+ * struct iommu_pasid_smmuv3 - ARM SMMUv3 Stream Table Entry stage 1 related
+ *     information
+ * @version: API version of this structure
+ * @s1fmt: STE s1fmt (format of the CD table: single CD, linear table
+ *         or 2-level table)
+ * @s1dss: STE s1dss (specifies the behavior when @pasid_bits != 0
+ *         and no PASID is passed along with the incoming transaction)
+ * @padding: reserved for future use (should be zero)
+ *
+ * The PASID table is referred to as the Context Descriptor (CD) table on ARM
+ * SMMUv3. Please refer to the ARM SMMU 3.x spec (ARM IHI 0070A) for full
+ * details.
+ */
+struct iommu_pasid_smmuv3 {
+#define PASID_TABLE_SMMUV3_CFG_VERSION_1 1
+	__u32	version;
+	__u8	s1fmt;
+	__u8	s1dss;
+	__u8	padding[2];
+};
+
+/**
+ * struct iommu_pasid_table_config - PASID table data used to bind guest PASID
+ *     table to the host IOMMU
+ * @version: API version to prepare for future extensions
+ * @format: format of the PASID table
+ * @base_ptr: guest physical address of the PASID table
+ * @pasid_bits: number of PASID bits used in the PASID table
+ * @config: indicates whether the guest translation stage must
+ *          be translated, bypassed or aborted.
+ * @padding: reserved for future use (should be zero)
+ * @smmuv3: table information when @format is %IOMMU_PASID_FORMAT_SMMUV3
+ */
+struct iommu_pasid_table_config {
+#define PASID_TABLE_CFG_VERSION_1 1
+	__u32	version;
+#define IOMMU_PASID_FORMAT_SMMUV3	1
+	__u32	format;
+	__u64	base_ptr;
+	__u8	pasid_bits;
+#define IOMMU_PASID_CONFIG_TRANSLATE	1
+#define IOMMU_PASID_CONFIG_BYPASS	2
+#define IOMMU_PASID_CONFIG_ABORT	3
+	__u8	config;
+	__u8    padding[6];
+	union {
+		struct iommu_pasid_smmuv3 smmuv3;
+	};
+};
+
+/* defines the granularity of the invalidation */
+enum iommu_inv_granularity {
+	IOMMU_INV_GRANU_DOMAIN,	/* domain-selective invalidation */
+	IOMMU_INV_GRANU_PASID,	/* PASID-selective invalidation */
+	IOMMU_INV_GRANU_ADDR,	/* page-selective invalidation */
+	IOMMU_INV_GRANU_NR,	/* number of invalidation granularities */
+};
+
+/**
+ * struct iommu_inv_addr_info - Address Selective Invalidation Structure
+ *
+ * @flags: indicates the granularity of the address-selective invalidation
+ * - If the PASID bit is set, the @pasid field is populated and the invalidation
+ *   relates to cache entries tagged with this PASID and matching the address
+ *   range.
+ * - If ARCHID bit is set, @archid is populated and the invalidation relates
+ *   to cache entries tagged with this architecture specific ID and matching
+ *   the address range.
+ * - Both PASID and ARCHID can be set as they may tag different caches.
+ * - If neither PASID nor ARCHID is set, global addr invalidation applies.
+ * - The LEAF flag indicates whether only the leaf PTE caching needs to be
+ *   invalidated and other paging structure caches can be preserved.
+ * @pasid: process address space ID
+ * @archid: architecture-specific ID
+ * @addr: first stage/level input address
+ * @granule_size: page/block size of the mapping in bytes
+ * @nb_granules: number of contiguous granules to be invalidated
+ */
+struct iommu_inv_addr_info {
+#define IOMMU_INV_ADDR_FLAGS_PASID	(1 << 0)
+#define IOMMU_INV_ADDR_FLAGS_ARCHID	(1 << 1)
+#define IOMMU_INV_ADDR_FLAGS_LEAF	(1 << 2)
+	__u32	flags;
+	__u32	archid;
+	__u64	pasid;
+	__u64	addr;
+	__u64	granule_size;
+	__u64	nb_granules;
+};
+
+/**
+ * struct iommu_inv_pasid_info - PASID Selective Invalidation Structure
+ *
+ * @flags: indicates the granularity of the PASID-selective invalidation
+ * - If the PASID bit is set, the @pasid field is populated and the invalidation
+ *   relates to cache entries tagged with this PASID and matching the address
+ *   range.
+ * - If the ARCHID bit is set, the @archid is populated and the invalidation
+ *   relates to cache entries tagged with this architecture specific ID and
+ *   matching the address range.
+ * - Both PASID and ARCHID can be set as they may tag different caches.
+ * - At least one of PASID or ARCHID must be set.
+ * @pasid: process address space ID
+ * @archid: architecture-specific ID
+ */
+struct iommu_inv_pasid_info {
+#define IOMMU_INV_PASID_FLAGS_PASID	(1 << 0)
+#define IOMMU_INV_PASID_FLAGS_ARCHID	(1 << 1)
+	__u32	flags;
+	__u32	archid;
+	__u64	pasid;
+};
+
+/**
+ * struct iommu_cache_invalidate_info - First level/stage invalidation
+ *     information
+ * @version: API version of this structure
+ * @cache: bitfield selecting which caches to invalidate
+ * @granularity: defines the lowest granularity used for the invalidation:
+ *     domain > PASID > addr
+ * @padding: reserved for future use (should be zero)
+ * @pasid_info: invalidation data when @granularity is %IOMMU_INV_GRANU_PASID
+ * @addr_info: invalidation data when @granularity is %IOMMU_INV_GRANU_ADDR
+ *
+ * Not all the combinations of cache/granularity are valid:
+ *
+ * +--------------+---------------+---------------+---------------+
+ * | type /       |   DEV_IOTLB   |     IOTLB     |      PASID    |
+ * | granularity  |               |               |      cache    |
+ * +==============+===============+===============+===============+
+ * | DOMAIN       |       N/A     |       Y       |       Y       |
+ * +--------------+---------------+---------------+---------------+
+ * | PASID        |       Y       |       Y       |       Y       |
+ * +--------------+---------------+---------------+---------------+
+ * | ADDR         |       Y       |       Y       |       N/A     |
+ * +--------------+---------------+---------------+---------------+
+ *
+ * Invalidations by %IOMMU_INV_GRANU_DOMAIN don't take any argument other than
+ * @version and @cache.
+ *
+ * If multiple cache types are invalidated simultaneously, they all
+ * must support the used granularity.
+ */
+struct iommu_cache_invalidate_info {
+#define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
+	__u32	version;
+/* IOMMU paging structure cache */
+#define IOMMU_CACHE_INV_TYPE_IOTLB	(1 << 0) /* IOMMU IOTLB */
+#define IOMMU_CACHE_INV_TYPE_DEV_IOTLB	(1 << 1) /* Device IOTLB */
+#define IOMMU_CACHE_INV_TYPE_PASID	(1 << 2) /* PASID cache */
+#define IOMMU_CACHE_INV_TYPE_NR		(3)
+	__u8	cache;
+	__u8	granularity;
+	__u8	padding[2];
+	union {
+		struct iommu_inv_pasid_info pasid_info;
+		struct iommu_inv_addr_info addr_info;
+	};
+};
+
+#endif /* _IOMMU_H */
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 24f505199f..f8e355896c 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -14,6 +14,7 @@
 
 #include <linux/types.h>
 #include <linux/ioctl.h>
+#include <linux/iommu.h>
 
 #define VFIO_API_VERSION	0
 
@@ -306,6 +307,10 @@ struct vfio_region_info_cap_type {
 #define VFIO_REGION_TYPE_GFX                    (1)
 #define VFIO_REGION_SUBTYPE_GFX_EDID            (1)
 
+#define VFIO_REGION_TYPE_NESTED			(2)
+#define VFIO_REGION_SUBTYPE_NESTED_FAULT_PROD	(1)
+#define VFIO_REGION_SUBTYPE_NESTED_FAULT_CONS	(2)
+
 /**
  * struct vfio_region_gfx_edid - EDID region layout.
  *
@@ -554,6 +559,7 @@ enum {
 	VFIO_PCI_MSIX_IRQ_INDEX,
 	VFIO_PCI_ERR_IRQ_INDEX,
 	VFIO_PCI_REQ_IRQ_INDEX,
+	VFIO_PCI_DMA_FAULT_IRQ_INDEX,
 	VFIO_PCI_NUM_IRQS
 };
 
@@ -700,6 +706,44 @@ struct vfio_device_ioeventfd {
 
 #define VFIO_DEVICE_IOEVENTFD		_IO(VFIO_TYPE, VFIO_BASE + 16)
 
+
+/*
+ * Capability exposed by the Producer Fault Region
+ * @version: max fault ABI version supported by the kernel
+ */
+#define VFIO_REGION_INFO_CAP_PRODUCER_FAULT	6
+
+struct vfio_region_info_cap_fault {
+	struct vfio_info_cap_header header;
+	__u32 version;
+};
+
+/*
+ * Producer Fault Region (Read-Only from user space perspective)
+ * Contains the fault circular buffer and the producer index
+ * @version: version of the fault record uapi
+ * @entry_size: size of each fault record
+ * @offset: offset of the start of the queue
+ * @prod: producer index relative to the start of the queue
+ */
+struct vfio_region_fault_prod {
+	__u32   version;
+	__u32	nb_entries;
+	__u32   entry_size;
+	__u32	offset;
+	__u32   prod;
+};
+
+/*
+ * Consumer Fault Region (Write-Only from the user space perspective)
+ * @version: ABI version requested by the userspace
+ * @cons: consumer index relative to the start of the queue
+ */
+struct vfio_region_fault_cons {
+	__u32 version;
+	__u32 cons;
+};
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
@@ -763,6 +807,69 @@ struct vfio_iommu_type1_dma_unmap {
 #define VFIO_IOMMU_ENABLE	_IO(VFIO_TYPE, VFIO_BASE + 15)
 #define VFIO_IOMMU_DISABLE	_IO(VFIO_TYPE, VFIO_BASE + 16)
 
+/**
+ * VFIO_IOMMU_ATTACH_PASID_TABLE - _IOWR(VFIO_TYPE, VFIO_BASE + 22,
+ *			struct vfio_iommu_type1_attach_pasid_table)
+ *
+ * Passes the PASID table to the host. Calling ATTACH_PASID_TABLE
+ * while a table is already installed is allowed: it replaces the old
+ * table. DETACH does a comprehensive tear down of the nested mode.
+ */
+struct vfio_iommu_type1_attach_pasid_table {
+	__u32	argsz;
+	__u32	flags;
+	struct iommu_pasid_table_config config;
+};
+#define VFIO_IOMMU_ATTACH_PASID_TABLE	_IO(VFIO_TYPE, VFIO_BASE + 22)
+
+/**
+ * VFIO_IOMMU_DETACH_PASID_TABLE - _IOWR(VFIO_TYPE, VFIO_BASE + 23)
+ * Detaches the PASID table
+ */
+#define VFIO_IOMMU_DETACH_PASID_TABLE	_IO(VFIO_TYPE, VFIO_BASE + 23)
+
+/**
+ * VFIO_IOMMU_CACHE_INVALIDATE - _IOWR(VFIO_TYPE, VFIO_BASE + 24,
+ *			struct vfio_iommu_type1_cache_invalidate)
+ *
+ * Propagate guest IOMMU cache invalidation to the host.
+ */
+struct vfio_iommu_type1_cache_invalidate {
+	__u32   argsz;
+	__u32   flags;
+	struct iommu_cache_invalidate_info info;
+};
+#define VFIO_IOMMU_CACHE_INVALIDATE      _IO(VFIO_TYPE, VFIO_BASE + 24)
+
+/**
+ * VFIO_IOMMU_BIND_MSI - _IOWR(VFIO_TYPE, VFIO_BASE + 25,
+ *			struct vfio_iommu_type1_bind_msi)
+ *
+ * Pass a stage 1 MSI doorbell mapping to the host so that the
+ * latter can build a nested stage 2 mapping
+ */
+struct vfio_iommu_type1_bind_msi {
+	__u32   argsz;
+	__u32   flags;
+	__u64	iova;
+	__u64	gpa;
+	__u64	size;
+};
+#define VFIO_IOMMU_BIND_MSI      _IO(VFIO_TYPE, VFIO_BASE + 25)
+
+/**
+ * VFIO_IOMMU_UNBIND_MSI - _IOWR(VFIO_TYPE, VFIO_BASE + 26,
+ *			struct vfio_iommu_type1_unbind_msi)
+ *
+ * Unregister an MSI mapping
+ */
+struct vfio_iommu_type1_unbind_msi {
+	__u32   argsz;
+	__u32   flags;
+	__u64	iova;
+};
+#define VFIO_IOMMU_UNBIND_MSI      _IO(VFIO_TYPE, VFIO_BASE + 26)
+
 /* -------- Additional API for SPAPR TCE (Server POWERPC) IOMMU -------- */
 
 /*
-- 
2.20.1
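
For readers of the header above, the cache/granularity table in the
struct iommu_cache_invalidate_info comment can be read as a small lookup.
The following is only a sketch: the constants are local copies of the
values in the patch, inv_combination_is_valid() is a hypothetical helper
that does not exist in the kernel or in QEMU, and rejecting an empty
cache selection is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the header's values, so the sketch builds on its own. */
enum iommu_inv_granularity {
    IOMMU_INV_GRANU_DOMAIN,
    IOMMU_INV_GRANU_PASID,
    IOMMU_INV_GRANU_ADDR,
    IOMMU_INV_GRANU_NR,
};
#define IOMMU_CACHE_INV_TYPE_IOTLB      (1 << 0)
#define IOMMU_CACHE_INV_TYPE_DEV_IOTLB  (1 << 1)
#define IOMMU_CACHE_INV_TYPE_PASID      (1 << 2)

/* Hypothetical helper: true if every selected cache supports @granularity. */
static bool inv_combination_is_valid(uint8_t cache, uint8_t granularity)
{
    /* One entry per table row: the caches marked "Y" for that granularity. */
    static const uint8_t supported[IOMMU_INV_GRANU_NR] = {
        [IOMMU_INV_GRANU_DOMAIN] = IOMMU_CACHE_INV_TYPE_IOTLB |
                                   IOMMU_CACHE_INV_TYPE_PASID,
        [IOMMU_INV_GRANU_PASID]  = IOMMU_CACHE_INV_TYPE_IOTLB |
                                   IOMMU_CACHE_INV_TYPE_DEV_IOTLB |
                                   IOMMU_CACHE_INV_TYPE_PASID,
        [IOMMU_INV_GRANU_ADDR]   = IOMMU_CACHE_INV_TYPE_IOTLB |
                                   IOMMU_CACHE_INV_TYPE_DEV_IOTLB,
    };

    /* Assumption: an unknown granularity or an empty selection is invalid. */
    if (granularity >= IOMMU_INV_GRANU_NR || cache == 0) {
        return false;
    }
    /* Valid only if no selected cache falls outside the supported set. */
    return (cache & ~supported[granularity]) == 0;
}
```

This mirrors the rule stated in the comment: when several cache types are
invalidated at once, they all must support the granularity in use.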


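The producer and consumer fault regions introduced in vfio.h describe a
circular queue of fault records. A rough sketch of the index arithmetic
a user space consumer might need is shown here. The structs are local
mirrors of the ones in the patch, the ex_* helpers are hypothetical, and
the sketch assumes both indexes stay in [0, nb_entries) and wrap around,
which the header does not spell out.

```c
#include <stdint.h>

/* Local mirrors of the region headers (not the real uapi include). */
struct ex_fault_prod {
    uint32_t version;
    uint32_t nb_entries;
    uint32_t entry_size;
    uint32_t offset;
    uint32_t prod;
};

struct ex_fault_cons {
    uint32_t version;
    uint32_t cons;
};

/* Records produced but not yet consumed (assumes wrapping indexes). */
static uint32_t ex_fault_pending(const struct ex_fault_prod *p,
                                 const struct ex_fault_cons *c)
{
    if (p->nb_entries == 0 || p->prod >= p->nb_entries ||
        c->cons >= p->nb_entries) {
        return 0; /* empty or inconsistent header: nothing to read */
    }
    if (p->prod >= c->cons) {
        return p->prod - c->cons;
    }
    return p->nb_entries - c->cons + p->prod;
}

/* Byte offset, from the start of the producer region, of record @idx. */
static uint64_t ex_fault_record_offset(const struct ex_fault_prod *p,
                                       uint32_t idx)
{
    return (uint64_t)p->offset + (uint64_t)idx * p->entry_size;
}

/* Index to write to the consumer region after reading one record. */
static uint32_t ex_fault_next_cons(const struct ex_fault_prod *p,
                                   uint32_t cons)
{
    return p->nb_entries ? (cons + 1) % p->nb_entries : 0;
}
```

A consumer would read the record at ex_fault_record_offset(p, cons),
handle it, then publish ex_fault_next_cons(p, cons) through the consumer
region until no record is pending.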


* [Qemu-devel] [RFC v4 05/27] memory: add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (3 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 04/27] header update against 5.2.0-rc1 and IOMMU/VFIO nested stage APIs Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 06/27] memory: add IOMMU_ATTR_MSI_TRANSLATE " Eric Auger
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

We introduce a new IOMMU Memory Region attribute, IOMMU_ATTR_VFIO_NESTED,
which tells whether the virtual IOMMU requires physical nested
stages for VFIO integration. The Intel virtual IOMMU supports Caching
Mode and does not require 2 stages at the physical level. However, the
virtual ARM SMMU does not implement such a caching mode and requires
the physical stage 1 to be used for VFIO integration.
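
As an illustration only, the attribute can be modelled outside QEMU as a
boolean queried through a get_attr-style callback. Every ex_* name here
is hypothetical and merely mimics the shape of the QEMU callback. The
point of the sketch is that a vIOMMU which does not know the attribute
leaves the caller's default (false) in place.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for enum IOMMUMemoryRegionAttr (hypothetical names). */
enum ex_attr {
    EX_ATTR_SPAPR_TCE_FD,
    EX_ATTR_VFIO_NESTED,
};

typedef int (*ex_get_attr_fn)(enum ex_attr attr, void *data);

/* Model of a vIOMMU without Caching Mode: it asks for nested stages. */
static int ex_smmu_get_attr(enum ex_attr attr, void *data)
{
    if (attr == EX_ATTR_VFIO_NESTED) {
        *(bool *)data = true;
        return 0;
    }
    return -EINVAL;
}

/* Model of a vIOMMU that does not implement the attribute at all. */
static int ex_caching_mode_get_attr(enum ex_attr attr, void *data)
{
    (void)attr;
    (void)data;
    return -EINVAL;
}

/* Query helper: defaults to "no nesting" when the attribute is unknown. */
static bool ex_requires_nested(ex_get_attr_fn get_attr)
{
    bool nested = false;

    if (get_attr && get_attr(EX_ATTR_VFIO_NESTED, &nested) != 0) {
        nested = false;
    }
    return nested;
}
```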

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/exec/memory.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9144a47f57..352a00169f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -204,7 +204,8 @@ struct MemoryRegionOps {
 };
 
 enum IOMMUMemoryRegionAttr {
-    IOMMU_ATTR_SPAPR_TCE_FD
+    IOMMU_ATTR_SPAPR_TCE_FD,
+    IOMMU_ATTR_VFIO_NESTED,
 };
 
 /**
-- 
2.20.1




* [Qemu-devel] [RFC v4 06/27] memory: add IOMMU_ATTR_MSI_TRANSLATE IOMMU memory region attribute
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (4 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 05/27] memory: add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 07/27] hw/arm/smmuv3: Advertise VFIO_NESTED and MSI_TRANSLATE attributes Eric Auger
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

We introduce a new IOMMU Memory Region attribute, IOMMU_ATTR_MSI_TRANSLATE,
which tells whether the virtual IOMMU translates MSIs. The ARM SMMU
will expose this attribute since, as opposed to Intel DMAR, MSIs
are translated like any other DMA request.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/exec/memory.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 352a00169f..146a6096da 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -206,6 +206,7 @@ struct MemoryRegionOps {
 enum IOMMUMemoryRegionAttr {
     IOMMU_ATTR_SPAPR_TCE_FD,
     IOMMU_ATTR_VFIO_NESTED,
+    IOMMU_ATTR_MSI_TRANSLATE,
 };
 
 /**
-- 
2.20.1




* [Qemu-devel] [RFC v4 07/27] hw/arm/smmuv3: Advertise VFIO_NESTED and MSI_TRANSLATE attributes
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (5 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 06/27] memory: add IOMMU_ATTR_MSI_TRANSLATE " Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 08/27] hw/vfio/common: Force nested if iommu requires it Eric Auger
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Virtual SMMUv3 requires physical nested stages for VFIO integration
and translates MSIs. So let's advertise those attributes.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v2 -> v3:
- also advertise MSI_TRANSLATE
---
 hw/arm/smmuv3.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index fd8ec7860e..761d722395 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1490,6 +1490,20 @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
     }
 }
 
+static int smmuv3_get_attr(IOMMUMemoryRegion *iommu,
+                           enum IOMMUMemoryRegionAttr attr,
+                           void *data)
+{
+    if (attr == IOMMU_ATTR_VFIO_NESTED) {
+        *(bool *) data = true;
+        return 0;
+    } else if (attr == IOMMU_ATTR_MSI_TRANSLATE) {
+        *(bool *) data = true;
+        return 0;
+    }
+    return -EINVAL;
+}
+
 static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
                                                   void *data)
 {
@@ -1497,6 +1511,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
 
     imrc->translate = smmuv3_translate;
     imrc->notify_flag_changed = smmuv3_notify_flag_changed;
+    imrc->get_attr = smmuv3_get_attr;
 }
 
 static const TypeInfo smmuv3_type_info = {
-- 
2.20.1




* [Qemu-devel] [RFC v4 08/27] hw/vfio/common: Force nested if iommu requires it
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (6 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 07/27] hw/arm/smmuv3: Advertise VFIO_NESTED and MSI_TRANSLATE attributes Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-28  2:47   ` Peter Xu
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 09/27] memory: Prepare for different kinds of IOMMU MR notifiers Eric Auger
                   ` (20 subsequent siblings)
  28 siblings, 1 reply; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

If we detect that the address space is translated by a virtual
IOMMU which requires nested stages, let's set up the container
with the VFIO_TYPE1_NESTING_IOMMU iommu_type.
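
The selection rule can be sketched independently of VFIO as follows.
This is a simplified model, not the QEMU code: the EX_* values are
made-up bit flags rather than the real VFIO extension numbers, the
"supported" bitmask stands in for the VFIO_CHECK_EXTENSION ioctl, and
treating "nested requested but unavailable" as a hard failure is an
assumption about the intent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags; NOT the real VFIO_*_IOMMU extension numbers. */
enum ex_iommu_type {
    EX_TYPE1         = 1 << 0,
    EX_TYPE1V2       = 1 << 1,
    EX_TYPE1_NESTING = 1 << 2,
};

/* Pick the richest supported type; nesting only if explicitly requested. */
static int ex_pick_iommu_type(unsigned supported, bool force_nested)
{
    static const int order[] = { EX_TYPE1_NESTING, EX_TYPE1V2, EX_TYPE1 };

    for (size_t i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
        if (!(supported & (unsigned)order[i])) {
            continue; /* the host does not offer this type */
        }
        if (order[i] == EX_TYPE1_NESTING && !force_nested) {
            continue; /* nesting is available but nobody asked for it */
        }
        return order[i];
    }
    return -1;
}

/* Returns the chosen type, or -1 if the request cannot be satisfied. */
static int ex_init_container(unsigned supported, bool force_nested)
{
    int type = ex_pick_iommu_type(supported, force_nested);

    if (type < 0) {
        return -1;
    }
    if (force_nested && type != EX_TYPE1_NESTING) {
        return -1; /* the vIOMMU needs nesting, the host cannot provide it */
    }
    return type;
}
```

With this ordering, a guest without a nesting-capable vIOMMU keeps
getting the same type as before, even on a host that supports nesting.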

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v2 -> v3:
- add "nested only is selected if requested by @force_nested"
  comment in this patch
---
 hw/vfio/common.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 1f1deff360..99ade21056 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1136,14 +1136,19 @@ static void vfio_put_address_space(VFIOAddressSpace *space)
  * vfio_get_iommu_type - selects the richest iommu_type (v2 first)
  */
 static int vfio_get_iommu_type(VFIOContainer *container,
+                               bool force_nested,
                                Error **errp)
 {
-    int iommu_types[] = { VFIO_TYPE1v2_IOMMU, VFIO_TYPE1_IOMMU,
+    int iommu_types[] = { VFIO_TYPE1_NESTING_IOMMU,
+                          VFIO_TYPE1v2_IOMMU, VFIO_TYPE1_IOMMU,
                           VFIO_SPAPR_TCE_v2_IOMMU, VFIO_SPAPR_TCE_IOMMU };
     int i;
 
     for (i = 0; i < ARRAY_SIZE(iommu_types); i++) {
         if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu_types[i])) {
+            if (iommu_types[i] == VFIO_TYPE1_NESTING_IOMMU && !force_nested) {
+                continue;
+            }
             return iommu_types[i];
         }
     }
@@ -1152,11 +1157,11 @@ static int vfio_get_iommu_type(VFIOContainer *container,
 }
 
 static int vfio_init_container(VFIOContainer *container, int group_fd,
-                               Error **errp)
+                               bool force_nested, Error **errp)
 {
     int iommu_type, ret;
 
-    iommu_type = vfio_get_iommu_type(container, errp);
+    iommu_type = vfio_get_iommu_type(container, force_nested, errp);
     if (iommu_type < 0) {
         return iommu_type;
     }
@@ -1192,6 +1197,14 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     VFIOContainer *container;
     int ret, fd;
     VFIOAddressSpace *space;
+    IOMMUMemoryRegion *iommu_mr;
+    bool force_nested = false;
+
+    if (as != &address_space_memory && memory_region_is_iommu(as->root)) {
+        iommu_mr = IOMMU_MEMORY_REGION(as->root);
+        memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_VFIO_NESTED,
+                                     (void *)&force_nested);
+    }
 
     space = vfio_get_address_space(as);
 
@@ -1252,12 +1265,18 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
 
-    ret = vfio_init_container(container, group->fd, errp);
+    ret = vfio_init_container(container, group->fd, force_nested, errp);
     if (ret) {
         goto free_container_exit;
     }
 
+    if (force_nested && container->iommu_type != VFIO_TYPE1_NESTING_IOMMU) {
+        error_setg(errp, "nested mode requested by the virtual IOMMU "
+                   "but not supported by the vfio iommu");
+    }
+
     switch (container->iommu_type) {
+    case VFIO_TYPE1_NESTING_IOMMU:
     case VFIO_TYPE1v2_IOMMU:
     case VFIO_TYPE1_IOMMU:
     {
-- 
2.20.1




* [Qemu-devel] [RFC v4 09/27] memory: Prepare for different kinds of IOMMU MR notifiers
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (7 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 08/27] hw/vfio/common: Force nested if iommu requires it Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-28  4:48   ` Peter Xu
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 10/27] memory: Add IOMMUConfigNotifier Eric Auger
                   ` (19 subsequent siblings)
  28 siblings, 1 reply; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Current IOMMUNotifiers are dedicated to IOTLB-related notifications,
i.e. MAP/UNMAP. We plan to introduce new types of notifiers, for
instance to notify vIOMMU configuration changes. Those new
notifiers may not be characterized by any associated address
space range.

So let's create a specialized IOMMUIOLTBNotifier datatype.
The base IOMMUNotifier will be able to encapsulate either of
the notifier types, including the upcoming IOMMUConfigNotifier.

We also rename:
- IOMMU_NOTIFIER_* into IOMMU_NOTIFIER_IOTLB_*
- *_notify_* into *iotlb_notify_*

All calling sites are updated.
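
To illustrate why call sites now test the notifier flags before touching
the range, here is a standalone model of a base notifier that wraps
either kind. All ex_* names are hypothetical, EX_NOTIFIER_CONFIG is an
invented flag value, and the layout is a guess at the general shape, not
a copy of the QEMU structures.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_NOTIFIER_IOTLB_UNMAP 0x1
#define EX_NOTIFIER_IOTLB_MAP   0x2
#define EX_NOTIFIER_CONFIG      0x4 /* hypothetical config flag */

struct ex_iotlb_notifier {
    uint64_t start; /* first address covered by the notifier */
    uint64_t end;   /* last address covered by the notifier */
};

struct ex_config_notifier {
    int unused; /* a config notifier carries no address range */
};

/* Base notifier: the flags tell which union member is meaningful. */
struct ex_notifier {
    unsigned notifier_flags;
    union {
        struct ex_iotlb_notifier iotlb_notifier;
        struct ex_config_notifier config_notifier;
    };
};

/*
 * Store the address mask (end - start) of every UNMAP-capable notifier
 * in @masks and return how many were found. Config notifiers are
 * skipped, since their range fields are not valid.
 */
static size_t ex_collect_unmap_masks(const struct ex_notifier *n,
                                     size_t count, uint64_t *masks)
{
    size_t used = 0;

    for (size_t i = 0; i < count; i++) {
        if (!(n[i].notifier_flags & EX_NOTIFIER_IOTLB_UNMAP)) {
            continue;
        }
        masks[used++] = n[i].iotlb_notifier.end - n[i].iotlb_notifier.start;
    }
    return used;
}
```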

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 exec.c                   | 12 ++++-----
 hw/arm/smmu-common.c     | 10 ++++---
 hw/arm/smmuv3.c          |  8 +++---
 hw/i386/amd_iommu.c      |  2 +-
 hw/i386/intel_iommu.c    | 25 ++++++++++--------
 hw/misc/tz-mpc.c         |  8 +++---
 hw/ppc/spapr_iommu.c     |  2 +-
 hw/s390x/s390-pci-inst.c |  4 +--
 hw/vfio/common.c         | 13 ++++-----
 hw/virtio/vhost.c        | 14 +++++-----
 include/exec/memory.h    | 57 +++++++++++++++++++++++++---------------
 memory.c                 | 32 ++++++++++++----------
 12 files changed, 107 insertions(+), 80 deletions(-)

diff --git a/exec.c b/exec.c
index 4e734770c2..ed4c5149ac 100644
--- a/exec.c
+++ b/exec.c
@@ -686,12 +686,12 @@ static void tcg_register_iommu_notifier(CPUState *cpu,
          * just register interest in the whole thing, on the assumption
          * that iommu reconfiguration will be rare.
          */
-        iommu_notifier_init(&notifier->n,
-                            tcg_iommu_unmap_notify,
-                            IOMMU_NOTIFIER_UNMAP,
-                            0,
-                            HWADDR_MAX,
-                            iommu_idx);
+        iommu_iotlb_notifier_init(&notifier->n,
+                                  tcg_iommu_unmap_notify,
+                                  IOMMU_NOTIFIER_IOTLB_UNMAP,
+                                  0,
+                                  HWADDR_MAX,
+                                  iommu_idx);
         memory_region_register_iommu_notifier(notifier->mr, &notifier->n);
     }
 
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index e94be6db6c..ee81038fc0 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -391,11 +391,11 @@ static void smmu_unmap_notifier_range(IOMMUNotifier *n)
     IOMMUTLBEntry entry;
 
     entry.target_as = &address_space_memory;
-    entry.iova = n->start;
+    entry.iova = n->iotlb_notifier.start;
     entry.perm = IOMMU_NONE;
-    entry.addr_mask = n->end - n->start;
+    entry.addr_mask = n->iotlb_notifier.end - n->iotlb_notifier.start;
 
-    memory_region_notify_one(n, &entry);
+    memory_region_iotlb_notify_one(n, &entry);
 }
 
 /* Unmap all notifiers attached to @mr */
@@ -405,7 +405,9 @@ inline void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
 
     trace_smmu_inv_notifiers_mr(mr->parent_obj.name);
     IOMMU_NOTIFIER_FOREACH(n, mr) {
-        smmu_unmap_notifier_range(n);
+        if (n->notifier_flags & IOMMU_NOTIFIER_IOTLB_UNMAP) {
+            smmu_unmap_notifier_range(n);
+        }
     }
 }
 
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 761d722395..1744874e72 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -822,7 +822,7 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
     entry.addr_mask = (1 << tt->granule_sz) - 1;
     entry.perm = IOMMU_NONE;
 
-    memory_region_notify_one(n, &entry);
+    memory_region_iotlb_notify_one(n, &entry);
 }
 
 /* invalidate an asid/iova tuple in all mr's */
@@ -837,7 +837,9 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova)
         trace_smmuv3_inv_notifiers_iova(mr->parent_obj.name, asid, iova);
 
         IOMMU_NOTIFIER_FOREACH(n, mr) {
-            smmuv3_notify_iova(mr, n, asid, iova);
+            if (n->notifier_flags & IOMMU_NOTIFIER_IOTLB_UNMAP) {
+                smmuv3_notify_iova(mr, n, asid, iova);
+            }
         }
     }
 }
@@ -1473,7 +1475,7 @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
     SMMUv3State *s3 = sdev->smmu;
     SMMUState *s = &(s3->smmu_state);
 
-    if (new & IOMMU_NOTIFIER_MAP) {
+    if (new & IOMMU_NOTIFIER_IOTLB_MAP) {
         int bus_num = pci_bus_num(sdev->bus);
         PCIDevice *pcidev = pci_find_device(sdev->bus, bus_num, sdev->devfn);
 
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 4a4e2c7fd4..7479e74a5c 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1470,7 +1470,7 @@ static void amdvi_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
 {
     AMDVIAddressSpace *as = container_of(iommu, AMDVIAddressSpace, iommu);
 
-    if (new & IOMMU_NOTIFIER_MAP) {
+    if (new & IOMMU_NOTIFIER_IOTLB_MAP) {
         error_report("device %02x.%02x.%x requires iommu notifier which is not "
                      "currently supported", as->bus_num, PCI_SLOT(as->devfn),
                      PCI_FUNC(as->devfn));
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 44b1231157..dff90ed3fa 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -174,7 +174,7 @@ static void vtd_update_scalable_state(IntelIOMMUState *s)
 /* Whether the address space needs to notify new mappings */
 static inline gboolean vtd_as_has_map_notifier(VTDAddressSpace *as)
 {
-    return as->notifier_flags & IOMMU_NOTIFIER_MAP;
+    return as->notifier_flags & IOMMU_NOTIFIER_IOTLB_MAP;
 }
 
 /* GHashTable functions */
@@ -1361,7 +1361,7 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
 static int vtd_sync_shadow_page_hook(IOMMUTLBEntry *entry,
                                      void *private)
 {
-    memory_region_notify_iommu((IOMMUMemoryRegion *)private, 0, *entry);
+    memory_region_iotlb_notify_iommu((IOMMUMemoryRegion *)private, 0, *entry);
     return 0;
 }
 
@@ -1928,7 +1928,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
                     .addr_mask = size - 1,
                     .perm = IOMMU_NONE,
                 };
-                memory_region_notify_iommu(&vtd_as->iommu, 0, entry);
+                memory_region_iotlb_notify_iommu(&vtd_as->iommu, 0, entry);
             }
         }
     }
@@ -2393,7 +2393,7 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
     entry.iova = addr;
     entry.perm = IOMMU_NONE;
     entry.translated_addr = 0;
-    memory_region_notify_iommu(&vtd_dev_as->iommu, 0, entry);
+    memory_region_iotlb_notify_iommu(&vtd_dev_as->iommu, 0, entry);
 
 done:
     return true;
@@ -2925,7 +2925,7 @@ static void vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
     VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
     IntelIOMMUState *s = vtd_as->iommu_state;
 
-    if (!s->caching_mode && new & IOMMU_NOTIFIER_MAP) {
+    if (!s->caching_mode && new & IOMMU_NOTIFIER_IOTLB_MAP) {
         error_report("We need to set caching-mode=on for intel-iommu to enable "
                      "device assignment with IOMMU protection.");
         exit(1);
@@ -3368,8 +3368,9 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
 {
     IOMMUTLBEntry entry;
     hwaddr size;
-    hwaddr start = n->start;
-    hwaddr end = n->end;
+
+    hwaddr start = n->iotlb_notifier.start;
+    hwaddr end = n->iotlb_notifier.end;
     IntelIOMMUState *s = as->iommu_state;
     DMAMap map;
 
@@ -3405,7 +3406,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
 
     entry.target_as = &address_space_memory;
     /* Adjust iova for the size */
-    entry.iova = n->start & ~(size - 1);
+    entry.iova = n->iotlb_notifier.start & ~(size - 1);
     /* This field is meaningless for unmap */
     entry.translated_addr = 0;
     entry.perm = IOMMU_NONE;
@@ -3420,7 +3421,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
     map.size = entry.addr_mask;
     iova_tree_remove(as->iova_tree, &map);
 
-    memory_region_notify_one(n, &entry);
+    memory_region_iotlb_notify_one(n, &entry);
 }
 
 static void vtd_address_space_unmap_all(IntelIOMMUState *s)
@@ -3430,7 +3431,9 @@ static void vtd_address_space_unmap_all(IntelIOMMUState *s)
 
     QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
         IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
-            vtd_address_space_unmap(vtd_as, n);
+            if (n->notifier_flags & IOMMU_NOTIFIER_IOTLB_UNMAP) {
+                vtd_address_space_unmap(vtd_as, n);
+            }
         }
     }
 }
@@ -3443,7 +3446,7 @@ static void vtd_address_space_refresh_all(IntelIOMMUState *s)
 
 static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private)
 {
-    memory_region_notify_one((IOMMUNotifier *)private, entry);
+    memory_region_iotlb_notify_one((IOMMUNotifier *)private, entry);
     return 0;
 }
 
diff --git a/hw/misc/tz-mpc.c b/hw/misc/tz-mpc.c
index 9a84be75ed..f735d60e0f 100644
--- a/hw/misc/tz-mpc.c
+++ b/hw/misc/tz-mpc.c
@@ -100,8 +100,8 @@ static void tz_mpc_iommu_notify(TZMPC *s, uint32_t lutidx,
         entry.translated_addr = addr;
 
         entry.perm = IOMMU_NONE;
-        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
-        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
+        memory_region_iotlb_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
+        memory_region_iotlb_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
 
         entry.perm = IOMMU_RW;
         if (block_is_ns) {
@@ -109,13 +109,13 @@ static void tz_mpc_iommu_notify(TZMPC *s, uint32_t lutidx,
         } else {
             entry.target_as = &s->downstream_as;
         }
-        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
+        memory_region_iotlb_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
         if (block_is_ns) {
             entry.target_as = &s->downstream_as;
         } else {
             entry.target_as = &s->blocked_io_as;
         }
-        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
+        memory_region_iotlb_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
     }
 }
 
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 5aff4d5a05..91da0dfb9c 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -459,7 +459,7 @@ static target_ulong put_tce_emu(SpaprTceTable *tcet, target_ulong ioba,
     entry.translated_addr = tce & page_mask;
     entry.addr_mask = ~page_mask;
     entry.perm = spapr_tce_iommu_access_flags(tce);
-    memory_region_notify_iommu(&tcet->iommu, 0, entry);
+    memory_region_iotlb_notify_iommu(&tcet->iommu, 0, entry);
 
     return H_SUCCESS;
 }
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index be2896232d..63bb23accd 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -594,7 +594,7 @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
             }
 
             notify.perm = IOMMU_NONE;
-            memory_region_notify_iommu(&iommu->iommu_mr, 0, notify);
+            memory_region_iotlb_notify_iommu(&iommu->iommu_mr, 0, notify);
             notify.perm = entry->perm;
         }
 
@@ -606,7 +606,7 @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
         g_hash_table_replace(iommu->iotlb, &cache->iova, cache);
     }
 
-    memory_region_notify_iommu(&iommu->iommu_mr, 0, notify);
+    memory_region_iotlb_notify_iommu(&iommu->iommu_mr, 0, notify);
 }
 
 int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 99ade21056..4183772618 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -623,11 +623,11 @@ static void vfio_listener_region_add(MemoryListener *listener,
         llend = int128_sub(llend, int128_one());
         iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
                                                        MEMTXATTRS_UNSPECIFIED);
-        iommu_notifier_init(&giommu->n, vfio_iommu_map_notify,
-                            IOMMU_NOTIFIER_ALL,
-                            section->offset_within_region,
-                            int128_get64(llend),
-                            iommu_idx);
+        iommu_iotlb_notifier_init(&giommu->n, vfio_iommu_map_notify,
+                                  IOMMU_NOTIFIER_IOTLB_ALL,
+                                  section->offset_within_region,
+                                  int128_get64(llend),
+                                  iommu_idx);
         QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
         memory_region_register_iommu_notifier(section->mr, &giommu->n);
@@ -721,7 +721,8 @@ static void vfio_listener_region_del(MemoryListener *listener,
 
         QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
             if (MEMORY_REGION(giommu->iommu) == section->mr &&
-                giommu->n.start == section->offset_within_region) {
+                is_iommu_iotlb_notifier(&giommu->n) &&
+                giommu->n.iotlb_notifier.start == section->offset_within_region) {
                 memory_region_unregister_iommu_notifier(section->mr,
                                                         &giommu->n);
                 QLIST_REMOVE(giommu, giommu_next);
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 7f61018f2a..263a45d05b 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -677,11 +677,11 @@ static void vhost_iommu_region_add(MemoryListener *listener,
     end = int128_sub(end, int128_one());
     iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
                                                    MEMTXATTRS_UNSPECIFIED);
-    iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
-                        IOMMU_NOTIFIER_UNMAP,
-                        section->offset_within_region,
-                        int128_get64(end),
-                        iommu_idx);
+    iommu_iotlb_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
+                              IOMMU_NOTIFIER_IOTLB_UNMAP,
+                              section->offset_within_region,
+                              int128_get64(end),
+                              iommu_idx);
     iommu->mr = section->mr;
     iommu->iommu_offset = section->offset_within_address_space -
                           section->offset_within_region;
@@ -703,8 +703,8 @@ static void vhost_iommu_region_del(MemoryListener *listener,
     }
 
     QLIST_FOREACH(iommu, &dev->iommu_list, iommu_next) {
-        if (iommu->mr == section->mr &&
-            iommu->n.start == section->offset_within_region) {
+        if (iommu->mr == section->mr && is_iommu_iotlb_notifier(&iommu->n) &&
+            iommu->n.iotlb_notifier.start == section->offset_within_region) {
             memory_region_unregister_iommu_notifier(iommu->mr,
                                                     &iommu->n);
             QLIST_REMOVE(iommu, iommu_next);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 146a6096da..42d10b29ef 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -81,23 +81,29 @@ struct IOMMUTLBEntry {
 typedef enum {
     IOMMU_NOTIFIER_NONE = 0,
     /* Notify cache invalidations */
-    IOMMU_NOTIFIER_UNMAP = 0x1,
+    IOMMU_NOTIFIER_IOTLB_UNMAP = 0x1,
     /* Notify entry changes (newly created entries) */
-    IOMMU_NOTIFIER_MAP = 0x2,
+    IOMMU_NOTIFIER_IOTLB_MAP = 0x2,
 } IOMMUNotifierFlag;
 
-#define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
+#define IOMMU_NOTIFIER_IOTLB_ALL (IOMMU_NOTIFIER_IOTLB_MAP | IOMMU_NOTIFIER_IOTLB_UNMAP)
 
 struct IOMMUNotifier;
 typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
                             IOMMUTLBEntry *data);
 
-struct IOMMUNotifier {
+typedef struct IOMMUIOLTBNotifier {
     IOMMUNotify notify;
-    IOMMUNotifierFlag notifier_flags;
     /* Notify for address space range start <= addr <= end */
     hwaddr start;
     hwaddr end;
+} IOMMUIOLTBNotifier;
+
+struct IOMMUNotifier {
+    IOMMUNotifierFlag notifier_flags;
+    union {
+        IOMMUIOLTBNotifier iotlb_notifier;
+    };
     int iommu_idx;
     QLIST_ENTRY(IOMMUNotifier) node;
 };
@@ -126,15 +132,18 @@ typedef struct IOMMUNotifier IOMMUNotifier;
 /* RAM is a persistent kind memory */
 #define RAM_PMEM (1 << 5)
 
-static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
-                                       IOMMUNotifierFlag flags,
-                                       hwaddr start, hwaddr end,
-                                       int iommu_idx)
+static inline void iommu_iotlb_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
+                                             IOMMUNotifierFlag flags,
+                                             hwaddr start, hwaddr end,
+                                             int iommu_idx)
 {
-    n->notify = fn;
+    assert(flags & IOMMU_NOTIFIER_IOTLB_MAP ||
+           flags & IOMMU_NOTIFIER_IOTLB_UNMAP);
+    assert(start < end);
     n->notifier_flags = flags;
-    n->start = start;
-    n->end = end;
+    n->iotlb_notifier.notify = fn;
+    n->iotlb_notifier.start = start;
+    n->iotlb_notifier.end = end;
     n->iommu_idx = iommu_idx;
 }
 
@@ -633,6 +642,11 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
                                                        uint64_t length,
                                                        void *host),
                                        Error **errp);
+
+static inline bool is_iommu_iotlb_notifier(IOMMUNotifier *n)
+{
+    return n->notifier_flags & IOMMU_NOTIFIER_IOTLB_ALL;
+}
 #ifdef CONFIG_POSIX
 
 /**
@@ -1018,7 +1032,8 @@ static inline IOMMUMemoryRegionClass *memory_region_get_iommu_class_nocheck(
 uint64_t memory_region_iommu_get_min_page_size(IOMMUMemoryRegion *iommu_mr);
 
 /**
- * memory_region_notify_iommu: notify a change in an IOMMU translation entry.
+ * memory_region_iotlb_notify_iommu: notify a change in an IOMMU translation
+ * entry.
  *
  * The notification type will be decided by entry.perm bits:
  *
@@ -1035,15 +1050,15 @@ uint64_t memory_region_iommu_get_min_page_size(IOMMUMemoryRegion *iommu_mr);
  *         replaces all old entries for the same virtual I/O address range.
  *         Deleted entries have .@perm == 0.
  */
-void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
-                                int iommu_idx,
-                                IOMMUTLBEntry entry);
+void memory_region_iotlb_notify_iommu(IOMMUMemoryRegion *iommu_mr,
+                                      int iommu_idx,
+                                      IOMMUTLBEntry entry);
 
 /**
- * memory_region_notify_one: notify a change in an IOMMU translation
- *                           entry to a single notifier
+ * memory_region_iotlb_notify_one: notify a change in an IOMMU translation
+ *                                 entry to a single notifier
  *
- * This works just like memory_region_notify_iommu(), but it only
+ * This works just like memory_region_iotlb_notify_iommu(), but it only
  * notifies a specific notifier, not all of them.
  *
  * @notifier: the notifier to be notified
@@ -1051,8 +1066,8 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
  *         replaces all old entries for the same virtual I/O address range.
  *         Deleted entries have .@perm == 0.
  */
-void memory_region_notify_one(IOMMUNotifier *notifier,
-                              IOMMUTLBEntry *entry);
+void memory_region_iotlb_notify_one(IOMMUNotifier *notifier,
+                                    IOMMUTLBEntry *entry);
 
 /**
  * memory_region_register_iommu_notifier: register a notifier for changes to
diff --git a/memory.c b/memory.c
index 3071c4bdad..924396a3ce 100644
--- a/memory.c
+++ b/memory.c
@@ -1863,7 +1863,9 @@ void memory_region_register_iommu_notifier(MemoryRegion *mr,
     /* We need to register for at least one bitfield */
     iommu_mr = IOMMU_MEMORY_REGION(mr);
     assert(n->notifier_flags != IOMMU_NOTIFIER_NONE);
-    assert(n->start <= n->end);
+    if (is_iommu_iotlb_notifier(n)) {
+        assert(n->iotlb_notifier.start <= n->iotlb_notifier.end);
+    }
     assert(n->iommu_idx >= 0 &&
            n->iommu_idx < memory_region_iommu_num_indexes(iommu_mr));
 
@@ -1899,7 +1901,7 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
     for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
         iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx);
         if (iotlb.perm != IOMMU_NONE) {
-            n->notify(n, &iotlb);
+            n->iotlb_notifier.notify(n, &iotlb);
         }
 
         /* if (2^64 - MR size) < granularity, it's possible to get an
@@ -1933,42 +1935,44 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
     memory_region_update_iommu_notify_flags(iommu_mr);
 }
 
-void memory_region_notify_one(IOMMUNotifier *notifier,
-                              IOMMUTLBEntry *entry)
+void memory_region_iotlb_notify_one(IOMMUNotifier *notifier,
+                                    IOMMUTLBEntry *entry)
 {
     IOMMUNotifierFlag request_flags;
 
+    assert(is_iommu_iotlb_notifier(notifier));
     /*
      * Skip the notification if the notification does not overlap
      * with registered range.
      */
-    if (notifier->start > entry->iova + entry->addr_mask ||
-        notifier->end < entry->iova) {
+    if (notifier->iotlb_notifier.start > entry->iova + entry->addr_mask ||
+        notifier->iotlb_notifier.end < entry->iova) {
         return;
     }
 
     if (entry->perm & IOMMU_RW) {
-        request_flags = IOMMU_NOTIFIER_MAP;
+        request_flags = IOMMU_NOTIFIER_IOTLB_MAP;
     } else {
-        request_flags = IOMMU_NOTIFIER_UNMAP;
+        request_flags = IOMMU_NOTIFIER_IOTLB_UNMAP;
     }
 
     if (notifier->notifier_flags & request_flags) {
-        notifier->notify(notifier, entry);
+        notifier->iotlb_notifier.notify(notifier, entry);
     }
 }
 
-void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
-                                int iommu_idx,
-                                IOMMUTLBEntry entry)
+void memory_region_iotlb_notify_iommu(IOMMUMemoryRegion *iommu_mr,
+                                      int iommu_idx,
+                                      IOMMUTLBEntry entry)
 {
     IOMMUNotifier *iommu_notifier;
 
     assert(memory_region_is_iommu(MEMORY_REGION(iommu_mr)));
 
     IOMMU_NOTIFIER_FOREACH(iommu_notifier, iommu_mr) {
-        if (iommu_notifier->iommu_idx == iommu_idx) {
-            memory_region_notify_one(iommu_notifier, &entry);
+        if (iommu_notifier->iommu_idx == iommu_idx &&
+            is_iommu_iotlb_notifier(iommu_notifier)) {
+            memory_region_iotlb_notify_one(iommu_notifier, &entry);
         }
     }
 }
-- 
2.20.1




* [Qemu-devel] [RFC v4 10/27] memory: Add IOMMUConfigNotifier
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (8 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 09/27] memory: Prepare for different kinds of IOMMU MR notifiers Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 11/27] memory: Add arch_id and leaf fields in IOTLBEntry Eric Auger
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

With this patch, an IOMMUNotifier can be either an IOTLB
notifier or a config notifier. A config notifier is called
when the guest translation configuration changes. This gives
the host a chance to update the physical IOMMU configuration
so that it is consistent with the guest view.

The notifier is passed an IOMMUConfig. The first type of
configuration introduced here is the PASID table
configuration.

We introduce the associated helpers, iommu_config_notifier_init()
and memory_region_config_notify_iommu().
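
The notifier split described above can be sketched outside QEMU as a
flag-tagged union. This is only an illustration under stated assumptions:
every Demo*/demo_* name below is hypothetical, the types are simplified
stand-ins for IOMMUNotifier, IOMMUTLBEntry and IOMMUConfig, and the opaque
pointer exists only so the example callback can record what it saw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for IOMMUTLBEntry / IOMMUConfig (not QEMU's types). */
typedef struct DemoTLBEntry { uint64_t iova, addr_mask; int perm; } DemoTLBEntry;
typedef struct DemoConfig { uint64_t pasid_table_base; } DemoConfig;

enum {
    DEMO_IOTLB_UNMAP  = 0x1,
    DEMO_IOTLB_MAP    = 0x2,
    DEMO_CONFIG_PASID = 0x4,
};
#define DEMO_IOTLB_ALL  (DEMO_IOTLB_MAP | DEMO_IOTLB_UNMAP)
#define DEMO_CONFIG_ALL (DEMO_CONFIG_PASID)

struct DemoNotifier;
typedef void (*DemoIOTLBNotify)(struct DemoNotifier *n, DemoTLBEntry *e);
typedef void (*DemoConfigNotify)(struct DemoNotifier *n, DemoConfig *c);

typedef struct DemoNotifier {
    int flags;                 /* selects which union member is valid */
    union {
        struct { DemoIOTLBNotify notify; uint64_t start, end; } iotlb;
        struct { DemoConfigNotify notify; } config;
    };
    int iommu_idx;
    void *opaque;              /* demo-only: lets the callback record data */
} DemoNotifier;

static void demo_config_notifier_init(DemoNotifier *n, DemoConfigNotify fn,
                                      int flags, int iommu_idx, void *opaque)
{
    assert(flags & DEMO_CONFIG_ALL);
    n->flags = flags;
    n->iommu_idx = iommu_idx;
    n->config.notify = fn;
    n->opaque = opaque;
}

static bool demo_is_config_notifier(const DemoNotifier *n)
{
    return (n->flags & DEMO_CONFIG_ALL) != 0;
}

/* Call every matching config notifier; returns how many were called. */
static int demo_config_notify_all(DemoNotifier *list, size_t count,
                                  int iommu_idx, int flag, DemoConfig *cfg)
{
    int called = 0;

    for (size_t i = 0; i < count; i++) {
        DemoNotifier *n = &list[i];

        /* Same three conditions as the patch: index, kind, exact flag. */
        if (n->iommu_idx == iommu_idx && demo_is_config_notifier(n) &&
            n->flags == flag) {
            n->config.notify(n, cfg);
            called++;
        }
    }
    return called;
}

/* Example callback: remember the PASID table base that was announced. */
static void demo_record_pasid_base(DemoNotifier *n, DemoConfig *cfg)
{
    *(uint64_t *)n->opaque = cfg->pasid_table_base;
}
```

Because the flags decide which union member is live, an IOTLB notifier
sitting in the same list is skipped by the config dispatch loop rather than
being called through the wrong function pointer.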

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v1 -> v2:
- use pasid_table config
- pass IOMMUNotifierFlag flags to iommu_config_notifier_init
  to prepare for other config flags
- Introduce IOMMUConfig
- s/IOMMU_NOTIFIER_S1_CFG/IOMMU_NOTIFIER_PASID_CFG
- remove unused IOMMUStage1ConfigType
---
 hw/vfio/common.c      | 15 ++++++++-----
 include/exec/memory.h | 52 ++++++++++++++++++++++++++++++++++++++++++-
 memory.c              | 25 +++++++++++++++++++++
 3 files changed, 86 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 4183772618..75fb568f95 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -720,11 +720,16 @@ static void vfio_listener_region_del(MemoryListener *listener,
         VFIOGuestIOMMU *giommu;
 
         QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
-            if (MEMORY_REGION(giommu->iommu) == section->mr &&
-                is_iommu_iotlb_notifier(&giommu->n) &&
-                giommu->n.iotlb_notifier.start == section->offset_within_region) {
-                memory_region_unregister_iommu_notifier(section->mr,
-                                                        &giommu->n);
+            if (MEMORY_REGION(giommu->iommu) == section->mr) {
+                if (is_iommu_iotlb_notifier(&giommu->n) &&
+                    giommu->n.iotlb_notifier.start ==
+                        section->offset_within_region) {
+                    memory_region_unregister_iommu_notifier(section->mr,
+                                                            &giommu->n);
+                } else {
+                    memory_region_unregister_iommu_notifier(section->mr,
+                                                            &giommu->n);
+                }
                 QLIST_REMOVE(giommu, giommu_next);
                 g_free(giommu);
                 break;
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 42d10b29ef..701cb83367 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -26,6 +26,9 @@
 #include "qom/object.h"
 #include "qemu/rcu.h"
 #include "hw/qdev-core.h"
+#ifdef CONFIG_LINUX
+#include <linux/iommu.h>
+#endif
 
 #define RAM_ADDR_INVALID (~(ram_addr_t)0)
 
@@ -74,6 +77,14 @@ struct IOMMUTLBEntry {
     IOMMUAccessFlags perm;
 };
 
+typedef struct IOMMUConfig {
+    union {
+#ifdef __linux__
+        struct iommu_pasid_table_config pasid_cfg;
+#endif
+    };
+} IOMMUConfig;
+
 /*
  * Bitmap for different IOMMUNotifier capabilities. Each notifier can
  * register with one or multiple IOMMU Notifier capability bit(s).
@@ -84,13 +95,18 @@ typedef enum {
     IOMMU_NOTIFIER_IOTLB_UNMAP = 0x1,
     /* Notify entry changes (newly created entries) */
     IOMMU_NOTIFIER_IOTLB_MAP = 0x2,
+    /* Notify stage 1 config changes */
+    IOMMU_NOTIFIER_CONFIG_PASID = 0x4,
 } IOMMUNotifierFlag;
 
 #define IOMMU_NOTIFIER_IOTLB_ALL (IOMMU_NOTIFIER_IOTLB_MAP | IOMMU_NOTIFIER_IOTLB_UNMAP)
+#define IOMMU_NOTIFIER_CONFIG_ALL (IOMMU_NOTIFIER_CONFIG_PASID)
 
 struct IOMMUNotifier;
 typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
                             IOMMUTLBEntry *data);
+typedef void (*IOMMUConfigNotify)(struct IOMMUNotifier *notifier,
+                                  IOMMUConfig *cfg);
 
 typedef struct IOMMUIOLTBNotifier {
     IOMMUNotify notify;
@@ -99,10 +115,15 @@ typedef struct IOMMUIOLTBNotifier {
     hwaddr end;
 } IOMMUIOLTBNotifier;
 
+typedef struct IOMMUConfigNotifier {
+    IOMMUConfigNotify notify;
+} IOMMUConfigNotifier;
+
 struct IOMMUNotifier {
     IOMMUNotifierFlag notifier_flags;
     union {
         IOMMUIOLTBNotifier iotlb_notifier;
+        IOMMUConfigNotifier config_notifier;
     };
     int iommu_idx;
     QLIST_ENTRY(IOMMUNotifier) node;
@@ -147,6 +168,16 @@ static inline void iommu_iotlb_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
     n->iommu_idx = iommu_idx;
 }
 
+static inline void iommu_config_notifier_init(IOMMUNotifier *n,
+                                              IOMMUConfigNotify fn,
+                                              IOMMUNotifierFlag flags,
+                                              int iommu_idx)
+{
+    n->notifier_flags = flags;
+    n->iommu_idx = iommu_idx;
+    n->config_notifier.notify = fn;
+}
+
 /*
  * Memory region callbacks
  */
@@ -647,6 +678,12 @@ static inline bool is_iommu_iotlb_notifier(IOMMUNotifier *n)
 {
     return n->notifier_flags & IOMMU_NOTIFIER_IOTLB_ALL;
 }
+
+static inline bool is_iommu_config_notifier(IOMMUNotifier *n)
+{
+    return n->notifier_flags & IOMMU_NOTIFIER_CONFIG_ALL;
+}
+
 #ifdef CONFIG_POSIX
 
 /**
@@ -1054,6 +1091,19 @@ void memory_region_iotlb_notify_iommu(IOMMUMemoryRegion *iommu_mr,
                                       int iommu_idx,
                                       IOMMUTLBEntry entry);
 
+/**
+ * memory_region_config_notify_iommu: notify a change in a translation
+ * configuration structure.
+ * @iommu_mr: the memory region that was changed
+ * @iommu_idx: the IOMMU index for the translation table which has changed
+ * @flag: config change type
+ * @config: new guest config
+ */
+void memory_region_config_notify_iommu(IOMMUMemoryRegion *iommu_mr,
+                                       int iommu_idx,
+                                       IOMMUNotifierFlag flag,
+                                       IOMMUConfig *config);
+
 /**
  * memory_region_iotlb_notify_one: notify a change in an IOMMU translation
  *                                 entry to a single notifier
@@ -1071,7 +1121,7 @@ void memory_region_iotlb_notify_one(IOMMUNotifier *notifier,
 
 /**
  * memory_region_register_iommu_notifier: register a notifier for changes to
- * IOMMU translation entries.
+ * IOMMU translation entries or translation config settings.
  *
  * @mr: the memory region to observe
  * @n: the IOMMUNotifier to be added; the notify callback receives a
diff --git a/memory.c b/memory.c
index 924396a3ce..d90d8ea67e 100644
--- a/memory.c
+++ b/memory.c
@@ -1935,6 +1935,13 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
     memory_region_update_iommu_notify_flags(iommu_mr);
 }
 
+static void
+memory_region_config_notify_one(IOMMUNotifier *notifier,
+                                IOMMUConfig *cfg)
+{
+    notifier->config_notifier.notify(notifier, cfg);
+}
+
 void memory_region_iotlb_notify_one(IOMMUNotifier *notifier,
                                     IOMMUTLBEntry *entry)
 {
@@ -1977,6 +1984,24 @@ void memory_region_iotlb_notify_iommu(IOMMUMemoryRegion *iommu_mr,
     }
 }
 
+void memory_region_config_notify_iommu(IOMMUMemoryRegion *iommu_mr,
+                                       int iommu_idx,
+                                       IOMMUNotifierFlag flag,
+                                       IOMMUConfig *config)
+{
+    IOMMUNotifier *iommu_notifier;
+
+    assert(memory_region_is_iommu(MEMORY_REGION(iommu_mr)));
+
+    IOMMU_NOTIFIER_FOREACH(iommu_notifier, iommu_mr) {
+        if (iommu_notifier->iommu_idx == iommu_idx &&
+            is_iommu_config_notifier(iommu_notifier) &&
+            iommu_notifier->notifier_flags == flag) {
+            memory_region_config_notify_one(iommu_notifier, config);
+        }
+    }
+}
+
 int memory_region_iommu_get_attr(IOMMUMemoryRegion *iommu_mr,
                                  enum IOMMUMemoryRegionAttr attr,
                                  void *data)
-- 
2.20.1




* [Qemu-devel] [RFC v4 11/27] memory: Add arch_id and leaf fields in IOTLBEntry
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (9 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 10/27] memory: Add IOMMUConfigNotifier Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 12/27] hw/arm/smmuv3: Store the PASID table GPA in the translation config Eric Auger
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

TLB entries are usually tagged with an ID such as the ASID
or PASID. When propagating an invalidation command from the
guest to the host, we need to pass this ID.

We also add a leaf field which indicates, for an invalidation
notification, whether only cache entries for the last level of
translation need to be invalidated.
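
As a rough sketch of how a consumer might use the two new fields, the code
below turns an UNMAP notification into an invalidation request. The
DemoTLBEntry and DemoInvRequest types and the demo_unmap_to_inv() helper are
hypothetical and are not part of this series; only the arch_id and leaf
semantics are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down version of IOMMUTLBEntry with the new fields. */
typedef struct DemoTLBEntry {
    uint64_t iova;
    uint64_t addr_mask;   /* 0xfff describes a 4K range */
    int      perm;        /* 0 means "no mapping / invalidation" */
    uint32_t arch_id;     /* architecture specific tag, e.g. an ASID */
    bool     leaf;        /* only last-level entries need invalidating */
} DemoTLBEntry;

/* Hypothetical request a host-side backend could build from the entry. */
typedef struct DemoInvRequest {
    uint32_t asid;
    uint64_t addr;
    uint64_t size;
    bool     leaf_only;
} DemoInvRequest;

/* Returns false when the entry is not an invalidation notification. */
static bool demo_unmap_to_inv(const DemoTLBEntry *e, DemoInvRequest *req)
{
    if (e->perm != 0) {
        return false;
    }
    req->asid = e->arch_id;
    req->addr = e->iova & ~e->addr_mask;  /* align to the range start */
    req->size = e->addr_mask + 1;
    req->leaf_only = e->leaf;
    return true;
}
```

Without arch_id the request could only target an address range across all
tags, and without leaf it would always have to cover every translation
level.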

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/exec/memory.h | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 701cb83367..9f107ebedb 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -69,12 +69,30 @@ typedef enum {
 
 #define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : 0))
 
+/**
+ * IOMMUTLBEntry - IOMMU TLB entry
+ *
+ * Structure used when performing a translation or when notifying MAP or
+ * UNMAP (invalidation) events
+ *
+ * @target_as: target address space
+ * @iova: IO virtual address (input)
+ * @translated_addr: translated address (output)
+ * @addr_mask: address mask (0xfff means 4K binding), must be 2^n - 1
+ * @perm: permission flag of the mapping (NONE encodes no mapping or
+ * invalidation notification)
+ * @arch_id: architecture specific ID tagging the TLB
+ * @leaf: when @perm is NONE, indicates whether only caches for the last
+ * level of translation need to be invalidated.
+ */
 struct IOMMUTLBEntry {
     AddressSpace    *target_as;
     hwaddr           iova;
     hwaddr           translated_addr;
-    hwaddr           addr_mask;  /* 0xfff = 4k translation */
+    hwaddr           addr_mask;
     IOMMUAccessFlags perm;
+    uint32_t         arch_id;
+    bool             leaf;
 };
 
 typedef struct IOMMUConfig {
-- 
2.20.1




* [Qemu-devel] [RFC v4 12/27] hw/arm/smmuv3: Store the PASID table GPA in the translation config
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (10 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 11/27] memory: Add arch_id and leaf fields in IOTLBEntry Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 13/27] hw/arm/smmuv3: Implement dummy replay Eric Auger
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

For VFIO integration we will need to pass the Context Descriptor (CD)
table GPA to the host. The CD table is also referred to as the PASID
table. Its GPA corresponds to the s1ctxptr field of the Stream Table
Entry. So let's decode and store it in the configuration structure.
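
The decoding step can be pictured with the following sketch. It assumes
that the context pointer occupies bits [51:6] of the first 64 bits of the
STE and that the STE is held as 32-bit words; DemoSTE and demo_ste_ctxptr()
are hypothetical names, and the real STE_CTXPTR macro may mask a different
number of upper bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical STE container: 64 bytes as sixteen 32-bit words. */
typedef struct DemoSTE {
    uint32_t word[16];
} DemoSTE;

/* Assumed layout: context pointer in bits [51:6] of words 0 and 1. */
static uint64_t demo_ste_ctxptr(const DemoSTE *ste)
{
    uint64_t lo = ste->word[0] & 0xffffffc0u;   /* bits [31:6]  */
    uint64_t hi = ste->word[1] & 0x000fffffu;   /* bits [51:32] */

    return (hi << 32) | lo;
}
```

The low six bits are masked off because they carry other STE fields, so the
resulting address is always 64-byte aligned.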

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3.c              | 1 +
 include/hw/arm/smmu-common.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 1744874e72..96d4147533 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -351,6 +351,7 @@ static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg,
                       "SMMUv3 S1 stalling fault model not allowed yet\n");
         goto bad_ste;
     }
+    cfg->s1ctxptr = STE_CTXPTR(ste);
     return 0;
 
 bad_ste:
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 1f37844e5c..353668f4ea 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -68,6 +68,7 @@ typedef struct SMMUTransCfg {
     uint8_t tbi;               /* Top Byte Ignore */
     uint16_t asid;
     SMMUTransTableInfo tt[2];
+    dma_addr_t s1ctxptr;
     uint32_t iotlb_hits;       /* counts IOTLB hits for this asid */
     uint32_t iotlb_misses;     /* counts IOTLB misses for this asid */
 } SMMUTransCfg;
-- 
2.20.1




* [Qemu-devel] [RFC v4 13/27] hw/arm/smmuv3: Implement dummy replay
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (11 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 12/27] hw/arm/smmuv3: Store the PASID table GPA in the translation config Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 14/27] hw/arm/smmuv3: Fill the IOTLBEntry arch_id on NH_VA invalidation Eric Auger
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

The default implementation of memory_region_iommu_replay() must
not be used, as it forces the translation of the whole RAM range.
The purpose of that function is to update the shadow page tables.
However, with nested stages there is no shadow page table, so
we can simply return.
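
To give a feel for the cost being avoided, here is a small model of a
replay entry point that walks the whole region in granule-sized steps
unless an override hook is installed. It is a sketch only: DemoIOMMUOps,
demo_replay() and demo_nested_replay() are hypothetical names, and the walk
merely counts the translate calls a generic loop would make.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-IOMMU hook table with an optional replay override. */
typedef struct DemoIOMMUOps {
    void (*replay)(void *opaque);   /* may be NULL */
} DemoIOMMUOps;

/*
 * Returns the number of translate calls a generic walk would issue.
 * When an override is present it is called instead and no walk happens.
 */
static uint64_t demo_replay(const DemoIOMMUOps *ops, void *opaque,
                            uint64_t region_size, uint64_t granule)
{
    uint64_t translate_calls = 0;

    if (ops->replay) {
        ops->replay(opaque);
        return 0;
    }
    for (uint64_t addr = 0; addr < region_size; addr += granule) {
        translate_calls++;          /* stands in for one translate() */
    }
    return translate_calls;
}

/* Override for the nested case: no table walk, only records it ran. */
static void demo_nested_replay(void *opaque)
{
    *(bool *)opaque = true;
}
```

In this model a 1 MiB region with a 4 KiB granule already costs 256
translate calls, and the count grows linearly with the region size.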

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 96d4147533..8db605adab 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1507,6 +1507,11 @@ static int smmuv3_get_attr(IOMMUMemoryRegion *iommu,
     return -EINVAL;
 }
 
+static inline void
+smmuv3_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
+{
+}
+
 static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
                                                   void *data)
 {
@@ -1515,6 +1520,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->translate = smmuv3_translate;
     imrc->notify_flag_changed = smmuv3_notify_flag_changed;
     imrc->get_attr = smmuv3_get_attr;
+    imrc->replay = smmuv3_replay;
 }
 
 static const TypeInfo smmuv3_type_info = {
-- 
2.20.1




* [Qemu-devel] [RFC v4 14/27] hw/arm/smmuv3: Fill the IOTLBEntry arch_id on NH_VA invalidation
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (12 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 13/27] hw/arm/smmuv3: Implement dummy replay Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 15/27] hw/arm/smmuv3: Fill the IOTLBEntry leaf field " Eric Auger
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

When the guest invalidates one S1 entry, it passes the asid.
When propagating this invalidation down to the host, the asid
information must also be passed. So let's fill the arch_id field
introduced for that purpose.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 8db605adab..b6eb61304d 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -822,6 +822,7 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
     entry.iova = iova;
     entry.addr_mask = (1 << tt->granule_sz) - 1;
     entry.perm = IOMMU_NONE;
+    entry.arch_id = asid;
 
     memory_region_iotlb_notify_one(n, &entry);
 }
-- 
2.20.1




* [Qemu-devel] [RFC v4 15/27] hw/arm/smmuv3: Fill the IOTLBEntry leaf field on NH_VA invalidation
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (13 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 14/27] hw/arm/smmuv3: Fill the IOTLBEntry arch_id on NH_VA invalidation Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 16/27] hw/arm/smmuv3: Notify on config changes Eric Auger
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Let's propagate the leaf attribute throughout the invalidation path.
This hint is used to reduce the scope of the invalidations to the
last level of translation. Not enforcing it induces large performance
penalties in nested mode.
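
A minimal sketch of the producer side follows, showing how an invalidation
entry might be filled from the command fields. DemoTLBEntry and
demo_make_inv_entry() are hypothetical names. The asid is taken as an int
because the NH_VAA path in this patch passes -1, which becomes 0xffffffff
once stored in a 32-bit unsigned arch_id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_IOMMU_NONE 0   /* hypothetical stand-in for IOMMU_NONE */

/* Hypothetical, trimmed-down invalidation entry. */
typedef struct DemoTLBEntry {
    uint64_t iova;
    uint64_t addr_mask;
    int      perm;
    uint32_t arch_id;
    bool     leaf;
} DemoTLBEntry;

/* Build an UNMAP entry covering one granule, tagged with asid and leaf. */
static DemoTLBEntry demo_make_inv_entry(int asid, uint64_t iova,
                                        unsigned granule_sz, bool leaf)
{
    DemoTLBEntry e;

    e.iova = iova;
    e.addr_mask = ((uint64_t)1 << granule_sz) - 1;  /* 12 gives 0xfff */
    e.perm = DEMO_IOMMU_NONE;                       /* marks invalidation */
    e.arch_id = (uint32_t)asid;                     /* -1 wraps to all ones */
    e.leaf = leaf;
    return e;
}
```

A consumer that wants to treat "all ASIDs" specially therefore has to
compare arch_id against the wrapped value rather than against -1 as a
signed quantity.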

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3.c     | 16 +++++++++-------
 hw/arm/trace-events |  2 +-
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index b6eb61304d..f2f3724686 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -792,8 +792,7 @@ epilogue:
  */
 static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
                                IOMMUNotifier *n,
-                               int asid,
-                               dma_addr_t iova)
+                               int asid, dma_addr_t iova, bool leaf)
 {
     SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
     SMMUEventInfo event = {};
@@ -823,12 +822,14 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
     entry.addr_mask = (1 << tt->granule_sz) - 1;
     entry.perm = IOMMU_NONE;
     entry.arch_id = asid;
+    entry.leaf = leaf;
 
     memory_region_iotlb_notify_one(n, &entry);
 }
 
 /* invalidate an asid/iova tuple in all mr's */
-static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova)
+static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid,
+                                      dma_addr_t iova, bool leaf)
 {
     SMMUDevice *sdev;
 
@@ -840,7 +841,7 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova)
 
         IOMMU_NOTIFIER_FOREACH(n, mr) {
             if (n->notifier_flags & IOMMU_NOTIFIER_IOTLB_UNMAP) {
-                smmuv3_notify_iova(mr, n, asid, iova);
+                smmuv3_notify_iova(mr, n, asid, iova, leaf);
             }
         }
     }
@@ -979,9 +980,10 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
         {
             dma_addr_t addr = CMD_ADDR(&cmd);
             uint16_t vmid = CMD_VMID(&cmd);
+            bool leaf = CMD_LEAF(&cmd);
 
-            trace_smmuv3_cmdq_tlbi_nh_vaa(vmid, addr);
-            smmuv3_inv_notifiers_iova(bs, -1, addr);
+            trace_smmuv3_cmdq_tlbi_nh_vaa(vmid, addr, leaf);
+            smmuv3_inv_notifiers_iova(bs, -1, addr, leaf);
             smmu_iotlb_inv_all(bs);
             break;
         }
@@ -993,7 +995,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
             bool leaf = CMD_LEAF(&cmd);
 
             trace_smmuv3_cmdq_tlbi_nh_va(vmid, asid, addr, leaf);
-            smmuv3_inv_notifiers_iova(bs, asid, addr);
+            smmuv3_inv_notifiers_iova(bs, asid, addr, leaf);
             smmu_iotlb_inv_iova(bs, asid, addr);
             break;
         }
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 0acedcedc6..3809005cba 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -43,7 +43,7 @@ smmuv3_cmdq_cfgi_cd(uint32_t sid) "streamid = %d"
 smmuv3_config_cache_hit(uint32_t sid, uint32_t hits, uint32_t misses, uint32_t perc) "Config cache HIT for sid %d (hits=%d, misses=%d, hit rate=%d)"
 smmuv3_config_cache_miss(uint32_t sid, uint32_t hits, uint32_t misses, uint32_t perc) "Config cache MISS for sid %d (hits=%d, misses=%d, hit rate=%d)"
 smmuv3_cmdq_tlbi_nh_va(int vmid, int asid, uint64_t addr, bool leaf) "vmid =%d asid =%d addr=0x%"PRIx64" leaf=%d"
-smmuv3_cmdq_tlbi_nh_vaa(int vmid, uint64_t addr) "vmid =%d addr=0x%"PRIx64
+smmuv3_cmdq_tlbi_nh_vaa(int vmid, uint64_t addr, bool leaf) "vmid =%d addr=0x%"PRIx64" leaf=%d"
 smmuv3_cmdq_tlbi_nh(void) ""
 smmuv3_cmdq_tlbi_nh_asid(uint16_t asid) "asid=%d"
 smmu_iotlb_cache_hit(uint16_t asid, uint64_t addr, uint32_t hit, uint32_t miss, uint32_t p) "IOTLB cache HIT asid=%d addr=0x%"PRIx64" hit=%d miss=%d hit rate=%d"
-- 
2.20.1




* [Qemu-devel] [RFC v4 16/27] hw/arm/smmuv3: Notify on config changes
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (14 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 15/27] hw/arm/smmuv3: Fill the IOTLBEntry leaf field " Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 17/27] hw/vfio/common: Introduce vfio_alloc_guest_iommu helper Eric Auger
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

If IOMMU config notifiers are attached to the IOMMU memory
region, we execute them, passing as argument the
iommu_pasid_table_config struct updated with the new
vIOMMU translation config. Config notifiers are called on
STE changes. At the physical level, they translate into
CMD_CFGI_STE_* commands.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4:
- fix compile issue with mingw

v2 -> v3:
- adapt to pasid_cfg field changes. Use local variable
- add trace event
- set version fields
- use CONFIG_PASID

v1 -> v2:
- do not notify anymore on CD change. Anyway the smmuv3 linux
  driver is not sending any CD invalidation commands. If we were
  to propagate CD invalidation commands, we would use the
  CACHE_INVALIDATE VFIO ioctl.
- notify precise config flags to prepare for the addition of new
  flags
---
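A rough sketch of the order in which the decoded STE state is mapped
onto the stage 1 config passed to the notifier. S1Config, DecodedSte
and s1_config_from_ste() are hypothetical names used for illustration;
they stand in for the IOMMU_PASID_CONFIG_* values and for the
SMMUTransCfg fields used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the uapi IOMMU_PASID_CONFIG_* values. */
typedef enum {
    S1_CONFIG_TRANSLATE,
    S1_CONFIG_BYPASS,
    S1_CONFIG_ABORT,
} S1Config;

/* Hypothetical subset of the decoded STE state. */
typedef struct {
    bool disabled;
    bool bypassed;
    bool aborted;
} DecodedSte;

/* Bypass wins over abort; translate is the default. */
S1Config s1_config_from_ste(const DecodedSte *ste)
{
    assert(ste != NULL);

    if (ste->disabled || ste->bypassed) {
        return S1_CONFIG_BYPASS;
    }
    if (ste->aborted) {
        return S1_CONFIG_ABORT;
    }
    return S1_CONFIG_TRANSLATE;
}
```

The disabled/bypassed test comes first, as in the patch, so an STE
flagged both disabled and aborted is reported as bypass.
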
 hw/arm/smmuv3.c     | 76 +++++++++++++++++++++++++++++++++++----------
 hw/arm/trace-events |  1 +
 2 files changed, 60 insertions(+), 17 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index f2f3724686..db03313672 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -16,6 +16,10 @@
  * with this program; if not, see <http://www.gnu.org/licenses/>.
  */
 
+#ifdef __linux__
+#include "linux/iommu.h"
+#endif
+
 #include "qemu/osdep.h"
 #include "hw/boards.h"
 #include "sysemu/sysemu.h"
@@ -847,6 +851,59 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid,
     }
 }
 
+static void smmuv3_notify_config_change(SMMUState *bs, uint32_t sid)
+{
+#ifdef __linux__
+    IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
+    SMMUEventInfo event = {.type = SMMU_EVT_NONE, .sid = sid};
+    SMMUTransCfg *cfg;
+    SMMUDevice *sdev;
+
+    if (!mr) {
+        return;
+    }
+
+    sdev = container_of(mr, SMMUDevice, iommu);
+
+    /* flush QEMU config cache */
+    smmuv3_flush_config(sdev);
+
+    if (mr->iommu_notify_flags & IOMMU_NOTIFIER_CONFIG_PASID) {
+        /* force a guest RAM config structure decoding */
+        cfg = smmuv3_get_config(sdev, &event);
+
+        if (cfg) {
+            IOMMUConfig iommu_config = {
+                .pasid_cfg.version = PASID_TABLE_CFG_VERSION_1,
+                .pasid_cfg.format = IOMMU_PASID_FORMAT_SMMUV3,
+                .pasid_cfg.base_ptr = cfg->s1ctxptr,
+                .pasid_cfg.smmuv3.version = PASID_TABLE_SMMUV3_CFG_VERSION_1,
+            };
+
+            if (cfg->disabled || cfg->bypassed) {
+                iommu_config.pasid_cfg.config = IOMMU_PASID_CONFIG_BYPASS;
+            } else if (cfg->aborted) {
+                iommu_config.pasid_cfg.config = IOMMU_PASID_CONFIG_ABORT;
+            } else {
+                iommu_config.pasid_cfg.config = IOMMU_PASID_CONFIG_TRANSLATE;
+            }
+
+            trace_smmuv3_notify_config_change(mr->parent_obj.name,
+                                              iommu_config.pasid_cfg.config,
+                                              iommu_config.pasid_cfg.base_ptr);
+
+            memory_region_config_notify_iommu(mr, 0,
+                                              IOMMU_NOTIFIER_CONFIG_PASID,
+                                              &iommu_config);
+        } else {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s error decoding the configuration for iommu mr=%s\n",
+                         __func__, mr->parent_obj.name);
+        }
+    }
+#endif
+}
+
 static int smmuv3_cmdq_consume(SMMUv3State *s)
 {
     SMMUState *bs = ARM_SMMU(s);
@@ -897,22 +954,14 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
         case SMMU_CMD_CFGI_STE:
         {
             uint32_t sid = CMD_SID(&cmd);
-            IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
-            SMMUDevice *sdev;
 
             if (CMD_SSEC(&cmd)) {
                 cmd_error = SMMU_CERROR_ILL;
                 break;
             }
 
-            if (!mr) {
-                break;
-            }
-
             trace_smmuv3_cmdq_cfgi_ste(sid);
-            sdev = container_of(mr, SMMUDevice, iommu);
-            smmuv3_flush_config(sdev);
-
+            smmuv3_notify_config_change(bs, sid);
             break;
         }
         case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
@@ -929,14 +978,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
             trace_smmuv3_cmdq_cfgi_ste_range(start, end);
 
             for (i = start; i <= end; i++) {
-                IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, i);
-                SMMUDevice *sdev;
-
-                if (!mr) {
-                    continue;
-                }
-                sdev = container_of(mr, SMMUDevice, iommu);
-                smmuv3_flush_config(sdev);
+                smmuv3_notify_config_change(bs, i);
             }
             break;
         }
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 3809005cba..741e645ae2 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -52,4 +52,5 @@ smmuv3_config_cache_inv(uint32_t sid) "Config cache INV for sid %d"
 smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
 smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
 smmuv3_inv_notifiers_iova(const char *name, uint16_t asid, uint64_t iova) "iommu mr=%s asid=%d iova=0x%"PRIx64
+smmuv3_notify_config_change(const char *name, uint8_t config, uint64_t s1ctxptr) "iommu mr=%s config=%d s1ctxptr=0x%"PRIx64
 
-- 
2.20.1




* [Qemu-devel] [RFC v4 17/27] hw/vfio/common: Introduce vfio_alloc_guest_iommu helper
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (15 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 16/27] hw/arm/smmuv3: Notify on config changes Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 18/27] hw/vfio/common: Introduce hostwin_from_range helper Eric Auger
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Soon this code will be called several times. So let's introduce
a helper.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/vfio/common.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 75fb568f95..7df8b92563 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -24,6 +24,7 @@
 #include <linux/kvm.h>
 #endif
 #include <linux/vfio.h>
+#include <linux/iommu.h>
 
 #include "hw/vfio/vfio-common.h"
 #include "hw/vfio/vfio.h"
@@ -497,6 +498,19 @@ out:
     rcu_read_unlock();
 }
 
+static VFIOGuestIOMMU *vfio_alloc_guest_iommu(VFIOContainer *container,
+                                              IOMMUMemoryRegion *iommu,
+                                              hwaddr offset)
+{
+    VFIOGuestIOMMU *giommu = g_new0(VFIOGuestIOMMU, 1);
+
+    giommu->container = container;
+    giommu->iommu = iommu;
+    giommu->iommu_offset = offset;
+    /* notifier will be registered separately */
+    return giommu;
+}
+
 static void vfio_listener_region_add(MemoryListener *listener,
                                      MemoryRegionSection *section)
 {
@@ -604,6 +618,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
     if (memory_region_is_iommu(section->mr)) {
         VFIOGuestIOMMU *giommu;
         IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
+        hwaddr offset;
         int iommu_idx;
 
         trace_vfio_listener_region_add_iommu(iova, end);
@@ -613,11 +628,11 @@ static void vfio_listener_region_add(MemoryListener *listener,
          * would be the right place to wire that up (tell the KVM
          * device emulation the VFIO iommu handles to use).
          */
-        giommu = g_malloc0(sizeof(*giommu));
-        giommu->iommu = iommu_mr;
-        giommu->iommu_offset = section->offset_within_address_space -
-                               section->offset_within_region;
-        giommu->container = container;
+
+        offset = section->offset_within_address_space -
+                    section->offset_within_region;
+        giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+
         llend = int128_add(int128_make64(section->offset_within_region),
                            section->size);
         llend = int128_sub(llend, int128_one());
-- 
2.20.1




* [Qemu-devel] [RFC v4 18/27] hw/vfio/common: Introduce hostwin_from_range helper
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (16 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 17/27] hw/vfio/common: Introduce vfio_alloc_guest_iommu helper Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 19/27] hw/vfio/common: Introduce helpers to DMA map/unmap a RAM section Eric Auger
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Let's introduce a hostwin_from_range() helper that returns the
hostwin encapsulating an IOVA range or NULL if none is found.

This improves the readability of callers and removes the usage
of hostwin_found.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
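For illustration, the same lookup on a plain singly linked list
instead of a QLIST. HostWindow and window_from_range() are
hypothetical names, not part of QEMU; both bounds of the range are
inclusive, as in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VFIOHostDMAWindow. */
typedef struct HostWindow {
    uint64_t min_iova;
    uint64_t max_iova;        /* inclusive */
    struct HostWindow *next;
} HostWindow;

/* Return the first window fully containing [iova, end], or NULL. */
HostWindow *window_from_range(HostWindow *head, uint64_t iova, uint64_t end)
{
    assert(iova <= end);

    for (HostWindow *w = head; w != NULL; w = w->next) {
        if (w->min_iova <= iova && end <= w->max_iova) {
            return w;
        }
    }
    return NULL;
}
```

A range straddling two windows is not matched by either of them, so
the caller gets NULL and can report the error in one place.
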
 hw/vfio/common.c | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7df8b92563..5c4b444f24 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -511,6 +511,19 @@ static VFIOGuestIOMMU *vfio_alloc_guest_iommu(VFIOContainer *container,
     return giommu;
 }
 
+static VFIOHostDMAWindow *
+hostwin_from_range(VFIOContainer *container, hwaddr iova, hwaddr end)
+{
+    VFIOHostDMAWindow *hostwin;
+
+    QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
+        if (hostwin->min_iova <= iova && end <= hostwin->max_iova) {
+            return hostwin;
+        }
+    }
+    return NULL;
+}
+
 static void vfio_listener_region_add(MemoryListener *listener,
                                      MemoryRegionSection *section)
 {
@@ -520,7 +533,6 @@ static void vfio_listener_region_add(MemoryListener *listener,
     void *vaddr;
     int ret;
     VFIOHostDMAWindow *hostwin;
-    bool hostwin_found;
 
     if (vfio_listener_skipped_section(section)) {
         trace_vfio_listener_region_add_skip(
@@ -597,15 +609,8 @@ static void vfio_listener_region_add(MemoryListener *listener,
 #endif
     }
 
-    hostwin_found = false;
-    QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
-        if (hostwin->min_iova <= iova && end <= hostwin->max_iova) {
-            hostwin_found = true;
-            break;
-        }
-    }
-
-    if (!hostwin_found) {
+    hostwin = hostwin_from_range(container, iova, end);
+    if (!hostwin) {
         error_report("vfio: IOMMU container %p can't map guest IOVA region"
                      " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
                      container, iova, end);
@@ -776,16 +781,10 @@ static void vfio_listener_region_del(MemoryListener *listener,
 
     if (memory_region_is_ram_device(section->mr)) {
         hwaddr pgmask;
-        VFIOHostDMAWindow *hostwin;
-        bool hostwin_found = false;
+        VFIOHostDMAWindow *hostwin =
+            hostwin_from_range(container, iova, end);
 
-        QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
-            if (hostwin->min_iova <= iova && end <= hostwin->max_iova) {
-                hostwin_found = true;
-                break;
-            }
-        }
-        assert(hostwin_found); /* or region_add() would have failed */
+        assert(hostwin); /* or region_add() would have failed */
 
         pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
         try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask));
-- 
2.20.1




* [Qemu-devel] [RFC v4 19/27] hw/vfio/common: Introduce helpers to DMA map/unmap a RAM section
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (17 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 18/27] hw/vfio/common: Introduce hostwin_from_range helper Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 20/27] hw/vfio/common: Setup nested stage mappings Eric Auger
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Let's introduce two helpers that DMA map/unmap a RAM
section. Those helpers will be called for nested stage setup in
another call site. This also makes the structure of
vfio_listener_region_add/del() clearer.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
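A sketch of the range arithmetic both helpers share, under two loud
assumptions: a 4kB target page, and plain uint64_t arithmetic where
QEMU uses Int128. DmaRange, section_to_dma_range() and
range_is_host_aligned() are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE UINT64_C(0x1000)   /* assumed 4kB target page */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

typedef struct {
    uint64_t iova;  /* start, rounded up to a page boundary */
    uint64_t size;  /* length in bytes, a multiple of the page size */
} DmaRange;

/* Shrink a section to whole pages; false if no whole page remains. */
bool section_to_dma_range(uint64_t offset, uint64_t size, DmaRange *out)
{
    uint64_t iova, end;

    assert(out != NULL);
    assert(offset <= UINT64_MAX - size);                   /* no wrap */
    assert(offset <= UINT64_MAX - (SKETCH_PAGE_SIZE - 1)); /* no wrap */

    iova = (offset + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;
    end = (offset + size) & SKETCH_PAGE_MASK;              /* exclusive */
    if (iova >= end) {
        return false;
    }
    out->iova = iova;
    out->size = end - iova;
    return true;
}

/* Check alignment on the smallest page size set in the bitmap. */
bool range_is_host_aligned(const DmaRange *r, uint64_t iova_pgsizes)
{
    uint64_t pgmask;

    assert(r != NULL);
    assert(iova_pgsizes != 0);

    /* lowest set bit minus one, same value as (1 << ctz64(x)) - 1 */
    pgmask = (iova_pgsizes & (~iova_pgsizes + 1)) - 1;
    return !((r->iova & pgmask) || (r->size & pgmask));
}
```

The start is rounded up and the end rounded down, so a section
smaller than one page yields no mapping at all.
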
 hw/vfio/common.c     | 178 ++++++++++++++++++++++++++-----------------
 hw/vfio/trace-events |   4 +-
 2 files changed, 109 insertions(+), 73 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 5c4b444f24..26bc2ab19f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -524,13 +524,116 @@ hostwin_from_range(VFIOContainer *container, hwaddr iova, hwaddr end)
     return NULL;
 }
 
+static int vfio_dma_map_ram_section(VFIOContainer *container,
+                                    MemoryRegionSection *section)
+{
+    VFIOHostDMAWindow *hostwin;
+    Int128 llend, llsize;
+    hwaddr iova, end;
+    void *vaddr;
+    int ret;
+
+    assert(memory_region_is_ram(section->mr));
+
+    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+    llend = int128_make64(section->offset_within_address_space);
+    llend = int128_add(llend, section->size);
+    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
+    end = int128_get64(int128_sub(llend, int128_one()));
+
+    vaddr = memory_region_get_ram_ptr(section->mr) +
+            section->offset_within_region +
+            (iova - section->offset_within_address_space);
+
+    hostwin = hostwin_from_range(container, iova, end);
+    if (!hostwin) {
+        error_report("vfio: IOMMU container %p can't map guest IOVA region"
+                     " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
+                     container, iova, end);
+        return -EFAULT;
+    }
+
+    trace_vfio_dma_map_ram(iova, end, vaddr);
+
+    llsize = int128_sub(llend, int128_make64(iova));
+
+    if (memory_region_is_ram_device(section->mr)) {
+        hwaddr pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
+
+        if ((iova & pgmask) || (int128_get64(llsize) & pgmask)) {
+            trace_vfio_listener_region_add_no_dma_map(
+                memory_region_name(section->mr),
+                section->offset_within_address_space,
+                int128_getlo(section->size),
+                pgmask + 1);
+            return 0;
+        }
+    }
+
+    ret = vfio_dma_map(container, iova, int128_get64(llsize),
+                       vaddr, section->readonly);
+    if (ret) {
+        error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
+                     "0x%"HWADDR_PRIx", %p) = %d (%m)",
+                     container, iova, int128_get64(llsize), vaddr, ret);
+        if (memory_region_is_ram_device(section->mr)) {
+            /* Allow unexpected mappings not to be fatal for RAM devices */
+            return 0;
+        }
+        return ret;
+    }
+    return 0;
+}
+
+static void vfio_dma_unmap_ram_section(VFIOContainer *container,
+                                       MemoryRegionSection *section)
+{
+    Int128 llend, llsize;
+    hwaddr iova, end;
+    bool try_unmap = true;
+    int ret;
+
+    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+    llend = int128_make64(section->offset_within_address_space);
+    llend = int128_add(llend, section->size);
+    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
+
+    if (int128_ge(int128_make64(iova), llend)) {
+        return;
+    }
+    end = int128_get64(int128_sub(llend, int128_one()));
+
+    llsize = int128_sub(llend, int128_make64(iova));
+
+    trace_vfio_dma_unmap_ram(iova, end);
+
+    if (memory_region_is_ram_device(section->mr)) {
+        hwaddr pgmask;
+        VFIOHostDMAWindow *hostwin =
+            hostwin_from_range(container, iova, end);
+
+        assert(hostwin); /* or region_add() would have failed */
+
+        pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
+        try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask));
+    }
+
+    if (try_unmap) {
+        ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
+        if (ret) {
+            error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
+                         "0x%"HWADDR_PRIx") = %d (%m)",
+                         container, iova, int128_get64(llsize), ret);
+        }
+    }
+}
+
 static void vfio_listener_region_add(MemoryListener *listener,
                                      MemoryRegionSection *section)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     hwaddr iova, end;
-    Int128 llend, llsize;
-    void *vaddr;
+    Int128 llend;
     int ret;
     VFIOHostDMAWindow *hostwin;
 
@@ -657,41 +760,10 @@ static void vfio_listener_region_add(MemoryListener *listener,
     }
 
     /* Here we assume that memory_region_is_ram(section->mr)==true */
-
-    vaddr = memory_region_get_ram_ptr(section->mr) +
-            section->offset_within_region +
-            (iova - section->offset_within_address_space);
-
-    trace_vfio_listener_region_add_ram(iova, end, vaddr);
-
-    llsize = int128_sub(llend, int128_make64(iova));
-
-    if (memory_region_is_ram_device(section->mr)) {
-        hwaddr pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
-
-        if ((iova & pgmask) || (int128_get64(llsize) & pgmask)) {
-            trace_vfio_listener_region_add_no_dma_map(
-                memory_region_name(section->mr),
-                section->offset_within_address_space,
-                int128_getlo(section->size),
-                pgmask + 1);
-            return;
-        }
-    }
-
-    ret = vfio_dma_map(container, iova, int128_get64(llsize),
-                       vaddr, section->readonly);
+    ret = vfio_dma_map_ram_section(container, section);
     if (ret) {
-        error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
-                     "0x%"HWADDR_PRIx", %p) = %d (%m)",
-                     container, iova, int128_get64(llsize), vaddr, ret);
-        if (memory_region_is_ram_device(section->mr)) {
-            /* Allow unexpected mappings not to be fatal for RAM devices */
-            return;
-        }
         goto fail;
     }
-
     return;
 
 fail:
@@ -717,10 +789,6 @@ static void vfio_listener_region_del(MemoryListener *listener,
                                      MemoryRegionSection *section)
 {
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
-    hwaddr iova, end;
-    Int128 llend, llsize;
-    int ret;
-    bool try_unmap = true;
 
     if (vfio_listener_skipped_section(section)) {
         trace_vfio_listener_region_del_skip(
@@ -765,39 +833,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
          */
     }
 
-    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
-    llend = int128_make64(section->offset_within_address_space);
-    llend = int128_add(llend, section->size);
-    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
-
-    if (int128_ge(int128_make64(iova), llend)) {
-        return;
-    }
-    end = int128_get64(int128_sub(llend, int128_one()));
-
-    llsize = int128_sub(llend, int128_make64(iova));
-
-    trace_vfio_listener_region_del(iova, end);
-
-    if (memory_region_is_ram_device(section->mr)) {
-        hwaddr pgmask;
-        VFIOHostDMAWindow *hostwin =
-            hostwin_from_range(container, iova, end);
-
-        assert(hostwin); /* or region_add() would have failed */
-
-        pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
-        try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask));
-    }
-
-    if (try_unmap) {
-        ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
-        if (ret) {
-            error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
-                         "0x%"HWADDR_PRIx") = %d (%m)",
-                         container, iova, int128_get64(llsize), ret);
-        }
-    }
+    vfio_dma_unmap_ram_section(container, section);
 
     memory_region_unref(section->mr);
 
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index b1ef55a33f..410801de6e 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -97,10 +97,10 @@ vfio_iommu_map_notify(const char *op, uint64_t iova_start, uint64_t iova_end) "i
 vfio_listener_region_add_skip(uint64_t start, uint64_t end) "SKIPPING region_add 0x%"PRIx64" - 0x%"PRIx64
 vfio_spapr_group_attach(int groupfd, int tablefd) "Attached groupfd %d to liobn fd %d"
 vfio_listener_region_add_iommu(uint64_t start, uint64_t end) "region_add [iommu] 0x%"PRIx64" - 0x%"PRIx64
-vfio_listener_region_add_ram(uint64_t iova_start, uint64_t iova_end, void *vaddr) "region_add [ram] 0x%"PRIx64" - 0x%"PRIx64" [%p]"
+vfio_dma_map_ram(uint64_t iova_start, uint64_t iova_end, void *vaddr) "region_add [ram] 0x%"PRIx64" - 0x%"PRIx64" [%p]"
 vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not aligned to 0x%"PRIx64" and cannot be mapped for DMA"
 vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING region_del 0x%"PRIx64" - 0x%"PRIx64
-vfio_listener_region_del(uint64_t start, uint64_t end) "region_del 0x%"PRIx64" - 0x%"PRIx64
+vfio_dma_unmap_ram(uint64_t start, uint64_t end) "region_del 0x%"PRIx64" - 0x%"PRIx64
 vfio_disconnect_container(int fd) "close container->fd=%d"
 vfio_put_group(int fd) "close group->fd=%d"
 vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
-- 
2.20.1




* [Qemu-devel] [RFC v4 20/27] hw/vfio/common: Setup nested stage mappings
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (18 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 19/27] hw/vfio/common: Introduce helpers to DMA map/unmap a RAM section Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 21/27] hw/vfio/common: Register a MAP notifier for MSI binding Eric Auger
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

In nested mode, legacy vfio_iommu_map_notify cannot be used as
there is no "caching" mode and we do not trap on map.

On Intel, vfio_iommu_map_notify was used to DMA map the RAM
through the host single stage.

With nested mode, we need to set up the stage 2 and the stage 1
separately. This patch introduces a prereg_listener to set up
the stage 2 mapping.

The stage 1 mapping, owned by the guest, is passed to the host
when the guest invalidates the stage 1 configuration, through
a dedicated config IOMMU notifier. Guest IOTLB invalidations
are cascaded down to the host through another IOMMU MR UNMAP
notifier.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- use iommu_inv_pasid_info for ASID invalidation

v2 -> v3:
- use VFIO_IOMMU_ATTACH_PASID_TABLE
- new user API
- handle leaf

v1 -> v2:
- adapt to uapi changes
- pass the asid
- pass IOMMU_NOTIFIER_S1_CFG when initializing the config notifier
---
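A sketch of how the UNMAP notifier chooses between an address-based
and an ASID-wide invalidation. HostInvRequest and
host_inv_from_unmap() are hypothetical names; the real code fills a
struct vfio_iommu_type1_cache_invalidate from the still evolving user
API. The 64kB threshold is the one used in this patch.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { INV_BY_ADDR, INV_BY_ASID } InvGranularity;

/* Hypothetical, simplified view of the request sent to the host. */
typedef struct {
    InvGranularity granularity;
    int asid;
    uint64_t addr;          /* meaningful for INV_BY_ADDR only */
    uint64_t granule_size;  /* meaningful for INV_BY_ADDR only */
} HostInvRequest;

#define ADDR_INV_MAX_SIZE UINT64_C(0x10000)  /* threshold from the patch */

HostInvRequest host_inv_from_unmap(uint64_t iova, uint64_t addr_mask,
                                   uint64_t iommu_offset, int asid)
{
    HostInvRequest req = { .asid = asid };
    uint64_t size;

    assert(addr_mask != UINT64_MAX); /* size would wrap to 0 */
    size = addr_mask + 1;

    if (size <= ADDR_INV_MAX_SIZE) {
        /* small range: invalidate one granule by address */
        req.granularity = INV_BY_ADDR;
        req.addr = iova + iommu_offset;
        req.granule_size = size;
    } else {
        /* larger range: fall back to invalidating the whole ASID */
        req.granularity = INV_BY_ASID;
    }
    return req;
}
```

Anything larger than 64kB is turned into an ASID-wide invalidation
rather than being split into many per-granule requests.
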
 hw/vfio/common.c     | 151 +++++++++++++++++++++++++++++++++++++++----
 hw/vfio/trace-events |   2 +
 2 files changed, 142 insertions(+), 11 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 26bc2ab19f..084e3f30e6 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -445,6 +445,71 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
     return true;
 }
 
+/* Pass the guest stage 1 config to the host */
+static void vfio_iommu_nested_notify(IOMMUNotifier *n, IOMMUConfig *cfg)
+{
+    VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
+    VFIOContainer *container = giommu->container;
+    struct vfio_iommu_type1_attach_pasid_table info;
+    int ret;
+
+    info.argsz = sizeof(info);
+    info.flags = 0;
+    memcpy(&info.config, &cfg->pasid_cfg, sizeof(cfg->pasid_cfg));
+
+    ret = ioctl(container->fd, VFIO_IOMMU_ATTACH_PASID_TABLE, &info);
+    if (ret) {
+        error_report("%p: failed to pass S1 config to the host (%d)",
+                     container, ret);
+    }
+}
+
+/* Propagate a guest IOTLB invalidation to the host (nested mode) */
+static void vfio_iommu_unmap_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
+{
+    VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
+    hwaddr start = iotlb->iova + giommu->iommu_offset;
+
+    VFIOContainer *container = giommu->container;
+    struct vfio_iommu_type1_cache_invalidate ustruct;
+    size_t size = iotlb->addr_mask + 1;
+    int ret;
+
+    assert(iotlb->perm == IOMMU_NONE);
+
+    ustruct.argsz = sizeof(ustruct);
+    ustruct.flags = 0;
+    ustruct.info.version = IOMMU_CACHE_INVALIDATE_INFO_VERSION_1;
+
+    if (size <= 0x10000) {
+        ustruct.info.cache = IOMMU_CACHE_INV_TYPE_IOTLB;
+        ustruct.info.granularity = IOMMU_INV_GRANU_ADDR;
+        ustruct.info.addr_info.flags = IOMMU_INV_ADDR_FLAGS_ARCHID;
+        if (iotlb->leaf) {
+            ustruct.info.addr_info.flags |= IOMMU_INV_ADDR_FLAGS_LEAF;
+        }
+        ustruct.info.addr_info.archid = iotlb->arch_id;
+        ustruct.info.addr_info.addr = start;
+        ustruct.info.addr_info.granule_size = size;
+        ustruct.info.addr_info.nb_granules = 1;
+        trace_vfio_iommu_addr_inv_iotlb(iotlb->arch_id, start, size, 1,
+                                        iotlb->leaf);
+    } else {
+        ustruct.info.cache = IOMMU_CACHE_INV_TYPE_IOTLB;
+        ustruct.info.granularity = IOMMU_INV_GRANU_PASID;
+        ustruct.info.pasid_info.archid = iotlb->arch_id;
+        ustruct.info.pasid_info.flags = IOMMU_INV_PASID_FLAGS_ARCHID;
+        trace_vfio_iommu_asid_inv_iotlb(iotlb->arch_id);
+    }
+
+    ret = ioctl(container->fd, VFIO_IOMMU_CACHE_INVALIDATE, &ustruct);
+    if (ret) {
+        error_report("%p: failed to invalidate CACHE for 0x%"PRIx64
+                     " mask=0x%"PRIx64" (%d)",
+                     container, start, iotlb->addr_mask, ret);
+    }
+}
+
 static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
 {
     VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
@@ -628,6 +693,32 @@ static void vfio_dma_unmap_ram_section(VFIOContainer *container,
     }
 }
 
+static void vfio_prereg_listener_region_add(MemoryListener *listener,
+                                            MemoryRegionSection *section)
+{
+    VFIOContainer *container =
+        container_of(listener, VFIOContainer, prereg_listener);
+
+    if (!memory_region_is_ram(section->mr)) {
+        return;
+    }
+
+    vfio_dma_map_ram_section(container, section);
+}
+
+static void vfio_prereg_listener_region_del(MemoryListener *listener,
+                                            MemoryRegionSection *section)
+{
+    VFIOContainer *container =
+        container_of(listener, VFIOContainer, prereg_listener);
+
+    if (!memory_region_is_ram(section->mr)) {
+        return;
+    }
+
+    vfio_dma_unmap_ram_section(container, section);
+}
+
 static void vfio_listener_region_add(MemoryListener *listener,
                                      MemoryRegionSection *section)
 {
@@ -739,21 +830,40 @@ static void vfio_listener_region_add(MemoryListener *listener,
 
         offset = section->offset_within_address_space -
                     section->offset_within_region;
-        giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
-
         llend = int128_add(int128_make64(section->offset_within_region),
                            section->size);
         llend = int128_sub(llend, int128_one());
         iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
                                                        MEMTXATTRS_UNSPECIFIED);
-        iommu_iotlb_notifier_init(&giommu->n, vfio_iommu_map_notify,
-                                  IOMMU_NOTIFIER_IOTLB_ALL,
-                                  section->offset_within_region,
-                                  int128_get64(llend),
-                                  iommu_idx);
-        QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
-        memory_region_register_iommu_notifier(section->mr, &giommu->n);
+        if (container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
+            /* Config notifier to propagate guest stage 1 config changes */
+            giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+            iommu_config_notifier_init(&giommu->n, vfio_iommu_nested_notify,
+                                       IOMMU_NOTIFIER_CONFIG_PASID, iommu_idx);
+            QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
+            memory_region_register_iommu_notifier(section->mr, &giommu->n);
+
+            /* IOTLB unmap notifier to propagate guest IOTLB invalidations */
+            giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+            iommu_iotlb_notifier_init(&giommu->n, vfio_iommu_unmap_notify,
+                                      IOMMU_NOTIFIER_IOTLB_UNMAP,
+                                      section->offset_within_region,
+                                      int128_get64(llend),
+                                      iommu_idx);
+            QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
+            memory_region_register_iommu_notifier(section->mr, &giommu->n);
+        } else {
+            /* MAP/UNMAP IOTLB notifier */
+            giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+            iommu_iotlb_notifier_init(&giommu->n, vfio_iommu_map_notify,
+                                      IOMMU_NOTIFIER_IOTLB_ALL,
+                                      section->offset_within_region,
+                                      int128_get64(llend),
+                                      iommu_idx);
+            QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
+            memory_region_register_iommu_notifier(section->mr, &giommu->n);
+        }
         memory_region_iommu_replay(giommu->iommu, &giommu->n);
 
         return;
@@ -850,15 +960,21 @@ static void vfio_listener_region_del(MemoryListener *listener,
     }
 }
 
-static const MemoryListener vfio_memory_listener = {
+static MemoryListener vfio_memory_listener = {
     .region_add = vfio_listener_region_add,
     .region_del = vfio_listener_region_del,
 };
 
+static MemoryListener vfio_memory_prereg_listener = {
+    .region_add = vfio_prereg_listener_region_add,
+    .region_del = vfio_prereg_listener_region_del,
+};
+
 static void vfio_listener_release(VFIOContainer *container)
 {
     memory_listener_unregister(&container->listener);
-    if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) {
+    if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU ||
+        container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
         memory_listener_unregister(&container->prereg_listener);
     }
 }
@@ -1354,6 +1470,19 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
         }
         vfio_host_win_add(container, 0, (hwaddr)-1, info.iova_pgsizes);
         container->pgsizes = info.iova_pgsizes;
+
+        if (container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
+            container->prereg_listener = vfio_memory_prereg_listener;
+            memory_listener_register(&container->prereg_listener,
+                                     &address_space_memory);
+            if (container->error) {
+                memory_listener_unregister(&container->prereg_listener);
+                ret = container->error;
+                error_setg(errp, "RAM memory listener initialization failed "
+                           "for container");
+                goto free_container_exit;
+            }
+        }
         break;
     }
     case VFIO_SPAPR_TCE_v2_IOMMU:
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 410801de6e..9f1868af2d 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -115,6 +115,8 @@ vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Devic
 vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]"
 vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%0x8"
 vfio_dma_unmap_overflow_workaround(void) ""
+vfio_iommu_addr_inv_iotlb(int asid, uint64_t addr, uint64_t size, uint64_t nb_granules, bool leaf) "nested IOTLB invalidate asid=%d, addr=0x%"PRIx64" granule_size=0x%"PRIx64" nb_granules=0x%"PRIx64" leaf=%d"
+vfio_iommu_asid_inv_iotlb(int asid) "nested IOTLB invalidate asid=%d"
 
 # platform.c
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
-- 
2.20.1




* [Qemu-devel] [RFC v4 21/27] hw/vfio/common: Register a MAP notifier for MSI binding
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (19 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 20/27] hw/vfio/common: Setup nested stage mappings Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 22/27] vfio-pci: Expose MSI stage 1 bindings to the host Eric Auger
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Instantiate a MAP notifier to register the MSI stage 1
binding (gIOVA -> gDB) with the host. This allows the host
to build a nested mapping towards the physical doorbell:
guest IOVA -> guest doorbell -> physical doorbell.
         Stage 1           Stage 2

The unregistration is done on VFIO container deallocation.
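
As an illustration only (this is not the patch code), the bookkeeping
described above can be sketched in standalone C. MsiBinding, msi_bind()
and msi_unbind_all() are hypothetical names, and the actual VFIO ioctls
are assumed to be issued at the points marked in the comments:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-container binding record. */
typedef struct MsiBinding {
    uint64_t iova;              /* guest IOVA of the doorbell */
    uint64_t gpa;               /* guest physical doorbell address */
    uint64_t size;
    struct MsiBinding *next;
} MsiBinding;

/*
 * Record a stage 1 MSI binding unless one already exists for @iova.
 * Returns 1 if a new binding was recorded, 0 if it was already known,
 * -1 on allocation failure.
 */
static int msi_bind(MsiBinding **head, uint64_t iova, uint64_t gpa,
                    uint64_t size)
{
    MsiBinding *b;

    assert(head != NULL);
    for (b = *head; b; b = b->next) {
        if (b->iova == iova) {
            return 0;           /* already bound, nothing to do */
        }
    }
    /* assumption: the bind ioctl would be issued here */
    b = calloc(1, sizeof(*b));
    if (!b) {
        return -1;
    }
    b->iova = iova;
    b->gpa = gpa;
    b->size = size;
    b->next = *head;
    *head = b;
    return 1;
}

/* Drop every recorded binding; returns how many were released. */
static unsigned msi_unbind_all(MsiBinding **head)
{
    unsigned n = 0;

    assert(head != NULL);
    while (*head) {
        MsiBinding *b = *head;

        /* assumption: the unbind ioctl would be issued here */
        *head = b->next;
        free(b);
        n++;
    }
    return n;
}
```

Keying the list on the gIOVA makes repeated MAP notifications for the
same doorbell harmless and gives the release path a single list to walk.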

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v2 -> v3:
- only register the notifier if the IOMMU translates MSIs
- record the msi bindings in a container list and unregister on
  container release
---
 hw/vfio/common.c              | 69 +++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h |  8 ++++
 2 files changed, 77 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 084e3f30e6..532ede0e70 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -510,6 +510,56 @@ static void vfio_iommu_unmap_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     }
 }
 
+static void vfio_iommu_msi_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
+{
+    VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
+    VFIOContainer *container = giommu->container;
+    int ret;
+
+    struct vfio_iommu_type1_bind_msi ustruct;
+    VFIOMSIBinding *binding;
+
+    QLIST_FOREACH(binding, &container->msibinding_list, next) {
+        if (binding->iova == iotlb->iova) {
+            return;
+        }
+    }
+    ustruct.argsz = sizeof(struct vfio_iommu_type1_bind_msi);
+    ustruct.flags = 0;
+
+    ustruct.iova = iotlb->iova;
+    ustruct.gpa = iotlb->translated_addr;
+    ustruct.size = iotlb->addr_mask + 1;
+    ret = ioctl(container->fd, VFIO_IOMMU_BIND_MSI, &ustruct);
+    if (ret) {
+        error_report("%s: failed to register the stage1 MSI binding (%d)",
+                     __func__, ret);
+    }
+    binding = g_new0(VFIOMSIBinding, 1);
+    binding->iova = ustruct.iova;
+    binding->gpa = ustruct.gpa;
+    binding->size = ustruct.size;
+
+    QLIST_INSERT_HEAD(&container->msibinding_list, binding, next);
+}
+
+static void vfio_container_unbind_msis(VFIOContainer *container)
+{
+    VFIOMSIBinding *binding, *tmp;
+
+    QLIST_FOREACH_SAFE(binding, &container->msibinding_list, next, tmp) {
+        struct vfio_iommu_type1_unbind_msi ustruct;
+
+        /* the MSI doorbell is not used anymore, unregister it */
+        ustruct.argsz = sizeof(struct vfio_iommu_type1_unbind_msi);
+        ustruct.flags = 0;
+        ustruct.iova = binding->iova;
+        ioctl(container->fd, VFIO_IOMMU_UNBIND_MSI, &ustruct);
+        QLIST_REMOVE(binding, next);
+        g_free(binding);
+    }
+}
+
 static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
 {
     VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
@@ -837,6 +887,8 @@ static void vfio_listener_region_add(MemoryListener *listener,
                                                        MEMTXATTRS_UNSPECIFIED);
 
         if (container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
+            bool translate_msi;
+
             /* Config notifier to propagate guest stage 1 config changes */
             giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
             iommu_config_notifier_init(&giommu->n, vfio_iommu_nested_notify,
@@ -853,6 +905,21 @@ static void vfio_listener_region_add(MemoryListener *listener,
                                       iommu_idx);
             QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
             memory_region_register_iommu_notifier(section->mr, &giommu->n);
+
+            memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_MSI_TRANSLATE,
+                                         (void *)&translate_msi);
+            if (translate_msi) {
+                giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+                iommu_iotlb_notifier_init(&giommu->n,
+                                          vfio_iommu_msi_map_notify,
+                                          IOMMU_NOTIFIER_IOTLB_MAP,
+                                          section->offset_within_region,
+                                          int128_get64(llend),
+                                          iommu_idx);
+                QLIST_INSERT_HEAD(&container->giommu_list, giommu,
+                                  giommu_next);
+                memory_region_register_iommu_notifier(section->mr, &giommu->n);
+            }
         } else {
             /* MAP/UNMAP IOTLB notifier */
             giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
@@ -1629,6 +1696,8 @@ static void vfio_disconnect_container(VFIOGroup *group)
             g_free(giommu);
         }
 
+        vfio_container_unbind_msis(container);
+
         trace_vfio_disconnect_container(container->fd);
         close(container->fd);
         g_free(container);
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 686d99ff8c..c862d87725 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -64,6 +64,13 @@ typedef struct VFIOAddressSpace {
     QLIST_ENTRY(VFIOAddressSpace) list;
 } VFIOAddressSpace;
 
+typedef struct VFIOMSIBinding {
+    hwaddr iova;
+    hwaddr gpa;
+    hwaddr size;
+    QLIST_ENTRY(VFIOMSIBinding) next;
+} VFIOMSIBinding;
+
 struct VFIOGroup;
 
 typedef struct VFIOContainer {
@@ -83,6 +90,7 @@ typedef struct VFIOContainer {
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
+    QLIST_HEAD(, VFIOMSIBinding) msibinding_list;
     QLIST_ENTRY(VFIOContainer) next;
 } VFIOContainer;
 
-- 
2.20.1




* [Qemu-devel] [RFC v4 22/27] vfio-pci: Expose MSI stage 1 bindings to the host
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (20 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 21/27] hw/vfio/common: Register a MAP notifier for MSI binding Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 23/27] memory: Introduce IOMMU Memory Region inject_faults API Eric Auger
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

When the guest is exposed to a virtual IOMMU that translates
MSIs, the guest allocates an IOVA (gIOVA) that maps the virtual
doorbell (gDB). In nested mode, when the MSI is set up, we pass
this stage 1 mapping to the host so that it can use this stage 1
binding to create a nested stage translating into the physical
doorbell. Conversely, when the MSI setup is torn down, we
unregister this binding.

For registration, we directly use the IOMMU memory region
translate() callback since the addr_mask is returned in the
IOTLB entry. address_space_translate() does not return this information.

Now that we use a MAP notifier, let's remove the warning against
the usage of MAP notifiers (historically used along with Intel's
caching mode).

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4:
- move the MSI binding registration in vfio_enable_vectors
  to address the MSI use case
---
 hw/arm/smmuv3.c      |  8 -------
 hw/vfio/pci.c        | 50 +++++++++++++++++++++++++++++++++++++++++++-
 hw/vfio/trace-events |  2 ++
 3 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index db03313672..a697968ace 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1521,14 +1521,6 @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
     SMMUv3State *s3 = sdev->smmu;
     SMMUState *s = &(s3->smmu_state);
 
-    if (new & IOMMU_NOTIFIER_IOTLB_MAP) {
-        int bus_num = pci_bus_num(sdev->bus);
-        PCIDevice *pcidev = pci_find_device(sdev->bus, bus_num, sdev->devfn);
-
-        warn_report("SMMUv3 does not support notification on MAP: "
-                     "device %s will not function properly", pcidev->name);
-    }
-
     if (old == IOMMU_NOTIFIER_NONE) {
         trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
         QLIST_INSERT_HEAD(&s->devices_with_notifiers, sdev, next);
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 3095379747..b613b20501 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -358,6 +358,48 @@ static void vfio_msi_interrupt(void *opaque)
     notify(&vdev->pdev, nr);
 }
 
+static int vfio_register_msi_binding(VFIOPCIDevice *vdev, int vector_n)
+{
+    PCIDevice *dev = &vdev->pdev;
+    AddressSpace *as = pci_device_iommu_address_space(dev);
+    MSIMessage msg = pci_get_msi_message(dev, vector_n);
+    IOMMUMemoryRegionClass *imrc;
+    IOMMUMemoryRegion *iommu_mr;
+    bool msi_translate = false, nested = false;
+    IOMMUTLBEntry entry;
+
+    if (as == &address_space_memory) {
+        return 0;
+    }
+
+    iommu_mr = IOMMU_MEMORY_REGION(as->root);
+    memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_MSI_TRANSLATE,
+                                 (void *)&msi_translate);
+    memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_VFIO_NESTED,
+                                 (void *)&nested);
+    imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
+
+    if (!nested || !msi_translate) {
+        return 0;
+    }
+
+    /* MSI doorbell address is translated by an IOMMU */
+
+    rcu_read_lock();
+    entry = imrc->translate(iommu_mr, msg.address, IOMMU_WO, 0);
+    rcu_read_unlock();
+
+    if (entry.perm == IOMMU_NONE) {
+        return -ENOENT;
+    }
+
+    trace_vfio_register_msi_binding(vdev->vbasedev.name, vector_n,
+                                    msg.address, entry.translated_addr);
+
+    memory_region_iotlb_notify_iommu(iommu_mr, 0, entry);
+    return 0;
+}
+
 static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
 {
     struct vfio_irq_set *irq_set;
@@ -375,7 +417,7 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
     fds = (int32_t *)&irq_set->data;
 
     for (i = 0; i < vdev->nr_vectors; i++) {
-        int fd = -1;
+        int ret, fd = -1;
 
         /*
          * MSI vs MSI-X - The guest has direct access to MSI mask and pending
@@ -390,6 +432,12 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
             } else {
                 fd = event_notifier_get_fd(&vdev->msi_vectors[i].kvm_interrupt);
             }
+            ret = vfio_register_msi_binding(vdev, i);
+            if (ret) {
+                error_report("%s failed to register S1 MSI binding "
+                             "for vector %d(%d)", __func__, i, ret);
+                return ret;
+            }
         }
 
         fds[i] = fd;
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 9f1868af2d..5de97a8882 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -117,6 +117,8 @@ vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype
 vfio_dma_unmap_overflow_workaround(void) ""
 vfio_iommu_addr_inv_iotlb(int asid, uint64_t addr, uint64_t size, uint64_t nb_granules, bool leaf) "nested IOTLB invalidate asid=%d, addr=0x%"PRIx64" granule_size=0x%"PRIx64" nb_granules=0x%"PRIx64" leaf=%d"
 vfio_iommu_asid_inv_iotlb(int asid) "nested IOTLB invalidate asid=%d"
+vfio_register_msi_binding(const char *name, int vector, uint64_t giova, uint64_t gdb) "%s: register vector %d gIOVA=0x%"PRIx64" -> gDB=0x%"PRIx64" stage 1 mapping"
+vfio_unregister_msi_binding(const char *name, int vector, uint64_t giova) "%s: unregister vector %d gIOVA=0x%"PRIx64 " stage 1 mapping"
 
 # platform.c
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
-- 
2.20.1




* [Qemu-devel] [RFC v4 23/27] memory: Introduce IOMMU Memory Region inject_faults API
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (21 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 22/27] vfio-pci: Expose MSI stage 1 bindings to the host Eric Auger
@ 2019-05-27 11:41 ` Eric Auger
  2019-05-27 11:42 ` [Qemu-devel] [RFC v4 24/27] hw/arm/smmuv3: Implement fault injection Eric Auger
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:41 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

This new API allows injecting @count iommu_faults into
the IOMMU memory region.
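
The optional-method pattern used by this API can be sketched in
standalone C as follows. This is an illustration under assumptions, not
the QEMU implementation: RegionOps, struct fault and the helper names
are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical fault record; the real layout comes from the kernel API. */
struct fault {
    int reason;
};

/* Hypothetical class with one optional method. */
typedef struct RegionOps {
    int (*inject_faults)(void *opaque, int count, struct fault *buf);
} RegionOps;

/* Dispatch to the optional method, or return -ENOENT if it is absent. */
static int region_inject_faults(const RegionOps *ops, void *opaque,
                                int count, struct fault *buf)
{
    assert(ops != NULL);
    if (!ops->inject_faults) {
        return -ENOENT;
    }
    return ops->inject_faults(opaque, count, buf);
}

/* Example implementation: just accumulate the number of faults seen. */
static int count_faults(void *opaque, int count, struct fault *buf)
{
    int *total = opaque;

    (void)buf;
    *total += count;
    return 0;
}
```

Callers can then treat -ENOENT as "this IOMMU model does not support
fault injection" without knowing which model sits behind the region.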

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/exec/memory.h | 25 +++++++++++++++++++++++++
 memory.c              | 10 ++++++++++
 2 files changed, 35 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9f107ebedb..593ee7fc50 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -57,6 +57,8 @@ struct MemoryRegionMmio {
     CPUWriteMemoryFunc *write[3];
 };
 
+struct iommu_fault;
+
 typedef struct IOMMUTLBEntry IOMMUTLBEntry;
 
 /* See address_space_translate: bit 0 is read, bit 1 is write.  */
@@ -400,6 +402,19 @@ typedef struct IOMMUMemoryRegionClass {
      * @iommu: the IOMMUMemoryRegion
      */
     int (*num_indexes)(IOMMUMemoryRegion *iommu);
+
+    /*
+     * Inject @count faults into the IOMMU memory region
+     *
+     * Optional method: if this method is not provided, then
+     * memory_region_inject_faults() will return -ENOENT
+     *
+     * @iommu: the IOMMU memory region to inject the faults in
+     * @count: number of faults to inject
+     * @buf: fault buffer
+     */
+    int (*inject_faults)(IOMMUMemoryRegion *iommu, int count,
+                         struct iommu_fault *buf);
 } IOMMUMemoryRegionClass;
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
@@ -1216,6 +1231,16 @@ int memory_region_iommu_attrs_to_index(IOMMUMemoryRegion *iommu_mr,
  */
 int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr);
 
+/**
+ * memory_region_inject_faults : inject @count faults stored in @buf
+ *
+ * @iommu_mr: the IOMMU memory region
+ * @count: number of faults to be injected
+ * @buf: buffer containing the faults
+ */
+int memory_region_inject_faults(IOMMUMemoryRegion *iommu_mr, int count,
+                                struct iommu_fault *buf);
+
 /**
  * memory_region_name: get a memory region's name
  *
diff --git a/memory.c b/memory.c
index d90d8ea67e..16996ef14e 100644
--- a/memory.c
+++ b/memory.c
@@ -2038,6 +2038,16 @@ int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr)
     return imrc->num_indexes(iommu_mr);
 }
 
+int memory_region_inject_faults(IOMMUMemoryRegion *iommu_mr, int count,
+                                struct iommu_fault *buf)
+{
+    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
+    if (!imrc->inject_faults) {
+        return -ENOENT;
+    }
+    return imrc->inject_faults(iommu_mr, count, buf);
+}
+
 void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
 {
     uint8_t mask = 1 << client;
-- 
2.20.1




* [Qemu-devel] [RFC v4 24/27] hw/arm/smmuv3: Implement fault injection
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (22 preceding siblings ...)
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 23/27] memory: Introduce IOMMU Memory Region inject_faults API Eric Auger
@ 2019-05-27 11:42 ` Eric Auger
  2019-05-27 11:42 ` [Qemu-devel] [RFC v4 25/27] vfio-pci: register handler for iommu fault Eric Auger
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:42 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

We convert iommu_fault structs received from the kernel
into the data struct used by the emulation code and record
the events into the virtual event queue.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- fix compilation issue on mingw

Exhaustive mapping remains to be done
---
 hw/arm/smmuv3.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index a697968ace..4b6480bec0 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1549,6 +1549,76 @@ smmuv3_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
 {
 }
 
+struct iommu_fault;
+
+static inline int
+smmuv3_inject_faults(IOMMUMemoryRegion *iommu_mr, int count,
+                     struct iommu_fault *buf)
+{
+#ifdef __linux__
+    SMMUDevice *sdev = container_of(iommu_mr, SMMUDevice, iommu);
+    SMMUv3State *s3 = sdev->smmu;
+    uint32_t sid = smmu_get_sid(sdev);
+    int i;
+
+    for (i = 0; i < count; i++) {
+        SMMUEventInfo info = {};
+        struct iommu_fault_unrecoverable *record;
+
+        if (buf[i].type != IOMMU_FAULT_DMA_UNRECOV) {
+            continue;
+        }
+
+        info.sid = sid;
+        record = &buf[i].event;
+
+        switch (record->reason) {
+        case IOMMU_FAULT_REASON_PASID_INVALID:
+            info.type = SMMU_EVT_C_BAD_SUBSTREAMID;
+            /* TODO further fill info.u.c_bad_substream */
+            break;
+        case IOMMU_FAULT_REASON_PASID_FETCH:
+            info.type = SMMU_EVT_F_CD_FETCH;
+            break;
+        case IOMMU_FAULT_REASON_BAD_PASID_ENTRY:
+            info.type = SMMU_EVT_C_BAD_CD;
+            /* TODO further fill info.u.c_bad_cd */
+            break;
+        case IOMMU_FAULT_REASON_WALK_EABT:
+            info.type = SMMU_EVT_F_WALK_EABT;
+            info.u.f_walk_eabt.addr = record->addr;
+            info.u.f_walk_eabt.addr2 = record->fetch_addr;
+            break;
+        case IOMMU_FAULT_REASON_PTE_FETCH:
+            info.type = SMMU_EVT_F_TRANSLATION;
+            info.u.f_translation.addr = record->addr;
+            break;
+        case IOMMU_FAULT_REASON_OOR_ADDRESS:
+            info.type = SMMU_EVT_F_ADDR_SIZE;
+            info.u.f_addr_size.addr = record->addr;
+            break;
+        case IOMMU_FAULT_REASON_ACCESS:
+            info.type = SMMU_EVT_F_ACCESS;
+            info.u.f_access.addr = record->addr;
+            break;
+        case IOMMU_FAULT_REASON_PERMISSION:
+            info.type = SMMU_EVT_F_PERMISSION;
+            info.u.f_permission.addr = record->addr;
+            break;
+        default:
+            warn_report("%s Unexpected fault reason received from host: %d",
+                        __func__, record->reason);
+            continue;
+        }
+
+        smmuv3_record_event(s3, &info);
+    }
+    return 0;
+#else
+    return -1;
+#endif
+}
+
 static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
                                                   void *data)
 {
@@ -1558,6 +1628,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->notify_flag_changed = smmuv3_notify_flag_changed;
     imrc->get_attr = smmuv3_get_attr;
     imrc->replay = smmuv3_replay;
+    imrc->inject_faults = smmuv3_inject_faults;
 }
 
 static const TypeInfo smmuv3_type_info = {
-- 
2.20.1




* [Qemu-devel] [RFC v4 25/27] vfio-pci: register handler for iommu fault
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (23 preceding siblings ...)
  2019-05-27 11:42 ` [Qemu-devel] [RFC v4 24/27] hw/arm/smmuv3: Implement fault injection Eric Auger
@ 2019-05-27 11:42 ` Eric Auger
  2019-05-27 11:42 ` [Qemu-devel] [RFC v4 26/27] vfio-pci: Set up fault regions Eric Auger
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:42 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

We use the VFIO_PCI_DMA_FAULT_IRQ_INDEX "irq" index to set/unset
a notifier for physical DMA faults. The associated eventfd is
triggered, in nested mode, whenever a fault is detected at the
physical IOMMU level.
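
For readers unfamiliar with eventfd semantics, the test-and-clear
behaviour a handler relies on can be sketched with the plain Linux
eventfd API. This is a Linux-only illustration, not QEMU code, and the
helper names event_signal() and event_test_and_clear() are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Signal the eventfd once; returns 0 on success, -1 on failure. */
static int event_signal(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Returns 1 if the eventfd had been signaled since the last call (and
 * resets its counter), 0 otherwise. @fd is assumed to have been created
 * with EFD_NONBLOCK so that an unsignaled read does not block.
 */
static int event_test_and_clear(int fd)
{
    uint64_t value;

    assert(fd >= 0);
    return read(fd, &value, sizeof(value)) == (ssize_t)sizeof(value);
}
```

Several signals raised before the handler runs collapse into one
wake-up, so the handler has to drain every pending fault, not just one.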

As this is the first use of this new IRQ index, also handle it
in irq_to_str() in case the signaling setup fails.

The actual handler will be implemented in subsequent patches.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- check VFIO_PCI_DMA_FAULT_IRQ_INDEX is supported at kernel level
  before attempting to set signaling for it.
---
 hw/vfio/common.c |  3 +++
 hw/vfio/pci.c    | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.h    |  1 +
 3 files changed, 56 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 532ede0e70..cf0087321e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -130,6 +130,9 @@ static char *irq_to_str(int index, int subindex)
     case VFIO_PCI_REQ_IRQ_INDEX:
         str = g_strdup_printf("REQ-%d", subindex);
         break;
+    case VFIO_PCI_DMA_FAULT_IRQ_INDEX:
+        str = g_strdup_printf("DMA-FAULT-%d", subindex);
+        break;
     default:
         str = g_strdup_printf("index %d (unknown)", index);
         break;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index b613b20501..29d4f633b0 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2736,6 +2736,56 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
     vdev->req_enabled = false;
 }
 
+static void vfio_dma_fault_notifier_handler(void *opaque)
+{
+    VFIOPCIDevice *vdev = opaque;
+
+    if (!event_notifier_test_and_clear(&vdev->dma_fault_notifier)) {
+        return;
+    }
+}
+
+static void vfio_register_dma_fault_notifier(VFIOPCIDevice *vdev)
+{
+    struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info),
+                                      .index = VFIO_PCI_DMA_FAULT_IRQ_INDEX };
+    Error *err = NULL;
+    int32_t fd;
+
+    if (ioctl(vdev->vbasedev.fd,
+              VFIO_DEVICE_GET_IRQ_INFO, &irq_info) < 0 || irq_info.count < 1) {
+        return;
+    }
+
+    if (event_notifier_init(&vdev->dma_fault_notifier, 0)) {
+        error_report("vfio: Unable to init event notifier for dma fault");
+        return;
+    }
+
+    fd = event_notifier_get_fd(&vdev->dma_fault_notifier);
+    qemu_set_fd_handler(fd, vfio_dma_fault_notifier_handler, NULL, vdev);
+
+    if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_DMA_FAULT_IRQ_INDEX, 0,
+                               VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) {
+        error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
+        qemu_set_fd_handler(fd, NULL, NULL, vdev);
+        event_notifier_cleanup(&vdev->dma_fault_notifier);
+    }
+}
+
+static void vfio_unregister_dma_fault_notifier(VFIOPCIDevice *vdev)
+{
+    Error *err = NULL;
+
+    if (vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_DMA_FAULT_IRQ_INDEX, 0,
+                               VFIO_IRQ_SET_ACTION_TRIGGER, -1, &err)) {
+        error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
+    }
+    qemu_set_fd_handler(event_notifier_get_fd(&vdev->dma_fault_notifier),
+                        NULL, NULL, vdev);
+    event_notifier_cleanup(&vdev->dma_fault_notifier);
+}
+
 static void vfio_realize(PCIDevice *pdev, Error **errp)
 {
     VFIOPCIDevice *vdev = PCI_VFIO(pdev);
@@ -3035,6 +3085,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 
     vfio_register_err_notifier(vdev);
     vfio_register_req_notifier(vdev);
+    vfio_register_dma_fault_notifier(vdev);
     vfio_setup_resetfn_quirk(vdev);
 
     return;
@@ -3073,6 +3124,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
+    vfio_unregister_dma_fault_notifier(vdev);
     pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
     vfio_disable_interrupts(vdev);
     if (vdev->intx.mmap_timer) {
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index cfcd1a81b8..96d29d667b 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -135,6 +135,7 @@ typedef struct VFIOPCIDevice {
     PCIHostDeviceAddress host;
     EventNotifier err_notifier;
     EventNotifier req_notifier;
+    EventNotifier dma_fault_notifier;
     int (*resetfn)(struct VFIOPCIDevice *);
     uint32_t vendor_id;
     uint32_t device_id;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [RFC v4 26/27] vfio-pci: Set up fault regions
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (24 preceding siblings ...)
  2019-05-27 11:42 ` [Qemu-devel] [RFC v4 25/27] vfio-pci: register handler for iommu fault Eric Auger
@ 2019-05-27 11:42 ` Eric Auger
  2019-05-27 11:42 ` [Qemu-devel] [RFC v4 27/27] vfio-pci: Implement the DMA fault handler Eric Auger
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:42 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

We set up two fault regions. The producer fault region is read-only
from the user-space perspective. It is composed of the fault queue
(mmappable) and a header written by the kernel, located in a separate
page.

The consumer fault region is write-only from the user-space perspective.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
---
 hw/vfio/pci.c | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.h |  2 ++
 2 files changed, 101 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 29d4f633b0..8208171f92 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2505,11 +2505,100 @@ int vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp)
     return 0;
 }
 
+static void vfio_init_fault_regions(VFIOPCIDevice *vdev, Error **errp)
+{
+    struct vfio_region_info *fault_region_info = NULL;
+    struct vfio_region_info_cap_fault *cap_fault;
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    struct vfio_info_cap_header *hdr;
+    char *fault_region_name = NULL;
+    uint32_t max_version;
+    ssize_t bytes;
+    int ret;
+
+    /* Producer Fault Region */
+    ret = vfio_get_dev_region_info(&vdev->vbasedev,
+                                   VFIO_REGION_TYPE_NESTED,
+                                   VFIO_REGION_SUBTYPE_NESTED_FAULT_PROD,
+                                   &fault_region_info);
+    if (!ret) {
+        hdr = vfio_get_region_info_cap(fault_region_info,
+                                       VFIO_REGION_INFO_CAP_PRODUCER_FAULT);
+        if (!hdr) {
+            error_setg(errp, "failed to retrieve fault ABI max version");
+            g_free(fault_region_info);
+            return;
+        }
+        cap_fault = container_of(hdr, struct vfio_region_info_cap_fault,
+                                 header);
+        max_version = cap_fault->version;
+
+        fault_region_name = g_strdup_printf("%s FAULT PROD %d",
+                                            vbasedev->name,
+                                            fault_region_info->index);
+
+        ret = vfio_region_setup(OBJECT(vdev), vbasedev,
+                                &vdev->fault_prod_region,
+                                fault_region_info->index,
+                                fault_region_name);
+        if (ret) {
+            error_setg_errno(errp, -ret,
+                             "failed to setup the fault prod region %d",
+                             fault_region_info->index);
+            goto out;
+        }
+
+        ret = vfio_region_mmap(&vdev->fault_prod_region);
+        if (ret) {
+            error_report("Failed to mmap fault queue(%d)", ret);
+        }
+
+        g_free(fault_region_info);
+        g_clear_pointer(&fault_region_name, g_free);
+    } else {
+        goto out;
+    }
+
+    /* Consumer Fault Region */
+    ret = vfio_get_dev_region_info(&vdev->vbasedev,
+                                   VFIO_REGION_TYPE_NESTED,
+                                   VFIO_REGION_SUBTYPE_NESTED_FAULT_CONS,
+                                   &fault_region_info);
+    if (!ret) {
+        fault_region_name = g_strdup_printf("%s FAULT CONS %d",
+                                            vbasedev->name,
+                                            fault_region_info->index);
+
+        ret = vfio_region_setup(OBJECT(vdev), vbasedev,
+                                &vdev->fault_cons_region,
+                                fault_region_info->index,
+                                fault_region_name);
+        if (ret) {
+            error_setg_errno(errp, -ret,
+                             "failed to setup the fault cons region %d",
+                             fault_region_info->index);
+        }
+
+        /* Set the chosen fault ABI version in the consumer header */
+        bytes = pwrite(vdev->vbasedev.fd, &max_version, 4,
+                       vdev->fault_cons_region.fd_offset);
+        if (bytes != 4) {
+            error_setg(errp,
+                       "Unable to set the chosen fault ABI version (%d)",
+                       max_version);
+        }
+    }
+out:
+    g_free(fault_region_name);
+    g_free(fault_region_info);
+}
+
 static void vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
 {
     VFIODevice *vbasedev = &vdev->vbasedev;
     struct vfio_region_info *reg_info;
     struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
+    Error *err = NULL;
     int i, ret = -1;
 
     /* Sanity check device */
@@ -2573,6 +2662,12 @@ static void vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
         }
     }
 
+    vfio_init_fault_regions(vdev, &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+
     irq_info.index = VFIO_PCI_ERR_IRQ_INDEX;
 
     ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
@@ -3105,6 +3200,8 @@ static void vfio_instance_finalize(Object *obj)
 
     vfio_display_finalize(vdev);
     vfio_bars_finalize(vdev);
+    vfio_region_finalize(&vdev->fault_prod_region);
+    vfio_region_finalize(&vdev->fault_cons_region);
     g_free(vdev->emulated_config_bits);
     g_free(vdev->rom);
     /*
@@ -3125,6 +3222,8 @@ static void vfio_exitfn(PCIDevice *pdev)
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
     vfio_unregister_dma_fault_notifier(vdev);
+    vfio_region_exit(&vdev->fault_prod_region);
+    vfio_region_exit(&vdev->fault_cons_region);
     pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
     vfio_disable_interrupts(vdev);
     if (vdev->intx.mmap_timer) {
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 96d29d667b..ee64081b47 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -136,6 +136,8 @@ typedef struct VFIOPCIDevice {
     EventNotifier err_notifier;
     EventNotifier req_notifier;
     EventNotifier dma_fault_notifier;
+    VFIORegion fault_prod_region;
+    VFIORegion fault_cons_region;
     int (*resetfn)(struct VFIOPCIDevice *);
     uint32_t vendor_id;
     uint32_t device_id;
-- 
2.20.1
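
Going by the two pwrite() calls in patches 26 and 27, the consumer
region header appears to hold a 32-bit ABI version at offset 0 and a
32-bit consumer index at offset 4. The following is only a minimal
sketch of that access pattern, not code from the series: the demo_*
names and the offsets enum are hypothetical, and any seekable file
descriptor (for instance a temporary file) stands in for the VFIO
device fd.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdint.h>
#include <stdio.h>      /* only needed if a tmpfile() stands in for the device fd */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical field offsets inside the consumer region header,
 * inferred from the pwrite() calls in the patches. */
enum {
    DEMO_CONS_VERSION_OFF = 0,  /* 32-bit chosen fault ABI version */
    DEMO_CONS_INDEX_OFF   = 4,  /* 32-bit consumer index */
};

/* Write one 32-bit field at region_off + field_off.
 * Returns 0 on success, -1 on a short or failed write. */
int demo_write_u32(int fd, off_t region_off, off_t field_off, uint32_t val)
{
    ssize_t n = pwrite(fd, &val, sizeof(val), region_off + field_off);

    return n == (ssize_t)sizeof(val) ? 0 : -1;
}

/* Read one 32-bit field back from region_off + field_off.
 * Returns 0 on success, -1 on a short or failed read. */
int demo_read_u32(int fd, off_t region_off, off_t field_off, uint32_t *val)
{
    ssize_t n = pread(fd, val, sizeof(*val), region_off + field_off);

    return n == (ssize_t)sizeof(*val) ? 0 : -1;
}
```

Comparing the byte count against the field size, as the patch does,
treats a short write as an error as well as a failed one.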



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [RFC v4 27/27] vfio-pci: Implement the DMA fault handler
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (25 preceding siblings ...)
  2019-05-27 11:42 ` [Qemu-devel] [RFC v4 26/27] vfio-pci: Set up fault regions Eric Auger
@ 2019-05-27 11:42 ` Eric Auger
  2019-05-27 12:31 ` [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration no-reply
  2019-07-11  1:53 ` Zhangfei Gao
  28 siblings, 0 replies; 36+ messages in thread
From: Eric Auger @ 2019-05-27 11:42 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
  Cc: drjones, yi.l.liu, mst, jean-philippe.brucker, zhangfei.gao,
	peterx, alex.williamson, vincent.stehle

Whenever the eventfd is triggered, we retrieve the DMA faults
from the mmapped fault region and inject them into the IOMMU
memory region.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/vfio/pci.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.h |  1 +
 2 files changed, 54 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 8208171f92..a07acf98c7 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2834,10 +2834,63 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
 static void vfio_dma_fault_notifier_handler(void *opaque)
 {
     VFIOPCIDevice *vdev = opaque;
+    PCIDevice *pdev = &vdev->pdev;
+    AddressSpace *as = pci_device_iommu_address_space(pdev);
+    IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(as->root);
+    struct vfio_region_fault_prod header;
+    struct iommu_fault *queue;
+    char *queue_buffer = NULL;
+    ssize_t bytes;
 
     if (!event_notifier_test_and_clear(&vdev->dma_fault_notifier)) {
         return;
     }
+
+    if (!vdev->fault_prod_region.size || !vdev->fault_cons_region.size) {
+        return;
+    }
+
+    bytes = pread(vdev->vbasedev.fd, &header, sizeof(header),
+                  vdev->fault_prod_region.fd_offset);
+    if (bytes != sizeof(header)) {
+        error_report("%s unable to read the fault region header (%zd)",
+                     __func__, bytes);
+        return;
+    }
+
+    /* Normally the fault queue is mmapped */
+    queue = (struct iommu_fault *)vdev->fault_prod_region.mmaps[0].mmap;
+    if (!queue) {
+        size_t queue_size = header.nb_entries * header.entry_size;
+
+        error_report("%s: fault queue not mmapped: slower fault handling",
+                     vdev->vbasedev.name);
+
+        queue_buffer = g_malloc(queue_size);
+        bytes =  pread(vdev->vbasedev.fd, queue_buffer, queue_size,
+                       vdev->fault_prod_region.fd_offset + header.offset);
+        if (bytes != queue_size) {
+            error_report("%s unable to read the fault queue (%zd)",
+                         __func__, bytes);
+            return;
+        }
+
+        queue = (struct iommu_fault *)queue_buffer;
+    }
+
+    while (vdev->fault_cons_index != header.prod) {
+        memory_region_inject_faults(iommu_mr, 1,
+                                    &queue[vdev->fault_cons_index]);
+        vdev->fault_cons_index =
+            (vdev->fault_cons_index + 1) % header.nb_entries;
+    }
+    bytes = pwrite(vdev->vbasedev.fd, &vdev->fault_cons_index, 4,
+                   vdev->fault_cons_region.fd_offset + 4);
+    if (bytes != 4) {
+        error_report("%s unable to write the fault region cons index (%zd)",
+                     __func__, bytes);
+    }
+    g_free(queue_buffer);
 }
 
 static void vfio_register_dma_fault_notifier(VFIOPCIDevice *vdev)
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index ee64081b47..01737d9372 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -138,6 +138,7 @@ typedef struct VFIOPCIDevice {
     EventNotifier dma_fault_notifier;
     VFIORegion fault_prod_region;
     VFIORegion fault_cons_region;
+    uint32_t fault_cons_index;
     int (*resetfn)(struct VFIOPCIDevice *);
     uint32_t vendor_id;
     uint32_t device_id;
-- 
2.20.1
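
The consume loop in the fault handler above is a classic
producer/consumer ring walk: records are processed from the consumer
index up to (but not including) the producer index, wrapping at
nb_entries. A minimal standalone sketch of that loop follows. It is an
illustration under assumptions, not the patch's code: struct
demo_fault is a hypothetical stand-in for struct iommu_fault, and the
callback replaces the real fault injection.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one fault record; the real record is
 * struct iommu_fault from the kernel uAPI and is not reproduced here. */
struct demo_fault {
    uint32_t id;
};

/* Callback invoked once per pending record (stand-in for injection). */
typedef void (*demo_inject_fn)(const struct demo_fault *f, void *opaque);

/*
 * Process every record between cons and prod, wrapping at nb_entries.
 * Returns the updated consumer index, which the caller would then
 * publish back to the producer.
 */
uint32_t demo_drain_queue(const struct demo_fault *queue,
                          uint32_t nb_entries, uint32_t prod,
                          uint32_t cons, demo_inject_fn inject,
                          void *opaque)
{
    /* Both indices must lie inside the ring, or the loop never ends. */
    assert(nb_entries > 0 && prod < nb_entries && cons < nb_entries);

    while (cons != prod) {
        inject(&queue[cons], opaque);
        cons = (cons + 1) % nb_entries;
    }
    return cons;
}

/* Example callback state: number of records seen and the sum of ids. */
struct demo_stats {
    uint32_t count;
    uint32_t sum;
};

/* Example callback: accumulate into a struct demo_stats. */
void demo_count(const struct demo_fault *f, void *opaque)
{
    struct demo_stats *s = opaque;

    s->count++;
    s->sum += f->id;
}
```

The range check at the top is an addition of this sketch: with an
out-of-range producer index the loop would never terminate.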



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (26 preceding siblings ...)
  2019-05-27 11:42 ` [Qemu-devel] [RFC v4 27/27] vfio-pci: Implement the DMA fault handler Eric Auger
@ 2019-05-27 12:31 ` no-reply
  2019-07-11  1:53 ` Zhangfei Gao
  28 siblings, 0 replies; 36+ messages in thread
From: no-reply @ 2019-05-27 12:31 UTC (permalink / raw)
  To: eric.auger
  Cc: peter.maydell, drjones, yi.l.liu, mst, jean-philippe.brucker,
	zhangfei.gao, qemu-devel, peterx, eric.auger, alex.williamson,
	qemu-arm, vincent.stehle, eric.auger.pro

Patchew URL: https://patchew.org/QEMU/20190527114203.2762-1-eric.auger@redhat.com/



Hi,

This series failed build test on s390x host. Please find the details below.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e
CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install

echo
echo "=== ENV ==="
env

echo
echo "=== PACKAGES ==="
rpm -qa
=== TEST SCRIPT END ===

  CC      ppc-softmmu/hw/display/vga.o
  CC      mips-softmmu/hw/mips/mips_r4k.o
/var/tmp/patchew-tester-tmp-oaqfmxu5/src/hw/ppc/spapr_iommu.c: In function ‘spapr_tce_replay’:
/var/tmp/patchew-tester-tmp-oaqfmxu5/src/hw/ppc/spapr_iommu.c:161:14: error: ‘IOMMUNotifier’ {aka ‘struct IOMMUNotifier’} has no member named ‘notify’
  161 |             n->notify(n, &iotlb);
      |              ^~
make[1]: *** [/var/tmp/patchew-tester-tmp-oaqfmxu5/src/rules.mak:69: hw/ppc/spapr_iommu.o] Error 1


The full log is available at
http://patchew.org/logs/20190527114203.2762-1-eric.auger@redhat.com/testing.s390x/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [RFC v4 08/27] hw/vfio/common: Force nested if iommu requires it
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 08/27] hw/vfio/common: Force nested if iommu requires it Eric Auger
@ 2019-05-28  2:47   ` Peter Xu
  2019-05-28 12:51     ` Auger Eric
  0 siblings, 1 reply; 36+ messages in thread
From: Peter Xu @ 2019-05-28  2:47 UTC (permalink / raw)
  To: Eric Auger
  Cc: peter.maydell, drjones, yi.l.liu, mst, jean-philippe.brucker,
	zhangfei.gao, qemu-devel, alex.williamson, qemu-arm,
	vincent.stehle, eric.auger.pro

On Mon, May 27, 2019 at 01:41:44PM +0200, Eric Auger wrote:
> In case we detect the address space is translated by
> a virtual IOMMU which requires nested stages, let's set up
> the container with the VFIO_TYPE1_NESTING_IOMMU iommu_type.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v2 -> v3:
> - add "nested only is selected if requested by @force_nested"
>   comment in this patch
> ---
>  hw/vfio/common.c | 27 +++++++++++++++++++++++----
>  1 file changed, 23 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 1f1deff360..99ade21056 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1136,14 +1136,19 @@ static void vfio_put_address_space(VFIOAddressSpace *space)
>   * vfio_get_iommu_type - selects the richest iommu_type (v2 first)
>   */
>  static int vfio_get_iommu_type(VFIOContainer *container,
> +                               bool force_nested,
>                                 Error **errp)
>  {
> -    int iommu_types[] = { VFIO_TYPE1v2_IOMMU, VFIO_TYPE1_IOMMU,
> +    int iommu_types[] = { VFIO_TYPE1_NESTING_IOMMU,
> +                          VFIO_TYPE1v2_IOMMU, VFIO_TYPE1_IOMMU,
>                            VFIO_SPAPR_TCE_v2_IOMMU, VFIO_SPAPR_TCE_IOMMU };
>      int i;
>  
>      for (i = 0; i < ARRAY_SIZE(iommu_types); i++) {
>          if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu_types[i])) {
> +            if (iommu_types[i] == VFIO_TYPE1_NESTING_IOMMU && !force_nested) {

If force_nested==true and if the kernel does not support
VFIO_TYPE1_NESTING_IOMMU, we will still return other iommu types?
That does not seem to match what "force" means here.

What I feel like is that we want an "iommu_nest_types[]" which only
contains VFIO_TYPE1_NESTING_IOMMU.  Then:

        if (nested) {
                target_types = iommu_nest_types;
        } else {
                target_types = iommu_types;
        }

        foreach (target_types)
                ...

        return -EINVAL;

Might be clearer?  Then we can drop [2] below since we'll fail earlier
at [1].
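
For illustration, the selection scheme suggested above could be
sketched in standalone C roughly as follows. This is only a sketch
under assumptions: the DEMO_* identifiers are hypothetical
placeholders (the real values come from <linux/vfio.h>), and the
callback stands in for the VFIO_CHECK_EXTENSION ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type identifiers, not the real VFIO constants. */
enum demo_iommu_type {
    DEMO_TYPE1 = 1,
    DEMO_TYPE1V2,
    DEMO_TYPE1_NESTING,
};

/* Stand-in for the extension check: does the host support 'type'? */
typedef bool (*demo_check_fn)(int type);

/*
 * Pick the candidate list from the 'nested' request, then return the
 * first supported type. With nested set, only the nesting type is
 * eligible, so an unsupported request fails with -EINVAL.
 */
int demo_get_iommu_type(bool nested, demo_check_fn supported)
{
    static const int nest_types[] = { DEMO_TYPE1_NESTING };
    static const int plain_types[] = { DEMO_TYPE1V2, DEMO_TYPE1 };
    const int *types = nested ? nest_types : plain_types;
    size_t n = nested ? sizeof(nest_types) / sizeof(nest_types[0])
                      : sizeof(plain_types) / sizeof(plain_types[0]);

    for (size_t i = 0; i < n; i++) {
        if (supported(types[i])) {
            return types[i];
        }
    }
    return -EINVAL;
}

/* Example host without nesting support. */
bool demo_host_without_nesting(int type)
{
    return type == DEMO_TYPE1 || type == DEMO_TYPE1V2;
}

/* Example host that also supports nesting. */
bool demo_host_with_nesting(int type)
{
    return demo_host_without_nesting(type) || type == DEMO_TYPE1_NESTING;
}
```

With separate candidate lists, a nested request can never silently
fall back to a non-nested type.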

> +                continue;
> +            }
>              return iommu_types[i];
>          }
>      }
> @@ -1152,11 +1157,11 @@ static int vfio_get_iommu_type(VFIOContainer *container,
>  }
>  
>  static int vfio_init_container(VFIOContainer *container, int group_fd,
> -                               Error **errp)
> +                               bool force_nested, Error **errp)
>  {
>      int iommu_type, ret;
>  
> -    iommu_type = vfio_get_iommu_type(container, errp);
> +    iommu_type = vfio_get_iommu_type(container, force_nested, errp);
>      if (iommu_type < 0) {
>          return iommu_type;

[1]

>      }
> @@ -1192,6 +1197,14 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>      VFIOContainer *container;
>      int ret, fd;
>      VFIOAddressSpace *space;
> +    IOMMUMemoryRegion *iommu_mr;
> +    bool force_nested = false;
> +
> +    if (as != &address_space_memory && memory_region_is_iommu(as->root)) {
> +        iommu_mr = IOMMU_MEMORY_REGION(as->root);
> +        memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_VFIO_NESTED,
> +                                     (void *)&force_nested);
> +    }
>  
>      space = vfio_get_address_space(as);
>  
> @@ -1252,12 +1265,18 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>      QLIST_INIT(&container->giommu_list);
>      QLIST_INIT(&container->hostwin_list);
>  
> -    ret = vfio_init_container(container, group->fd, errp);
> +    ret = vfio_init_container(container, group->fd, force_nested, errp);
>      if (ret) {
>          goto free_container_exit;
>      }
>  
> +    if (force_nested && container->iommu_type != VFIO_TYPE1_NESTING_IOMMU) {
> +            error_setg(errp, "nested mode requested by the virtual IOMMU "
> +                       "but not supported by the vfio iommu");
> +    }

[2]

> +
>      switch (container->iommu_type) {
> +    case VFIO_TYPE1_NESTING_IOMMU:
>      case VFIO_TYPE1v2_IOMMU:
>      case VFIO_TYPE1_IOMMU:
>      {
> -- 
> 2.20.1
> 

Regards,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [RFC v4 09/27] memory: Prepare for different kinds of IOMMU MR notifiers
  2019-05-27 11:41 ` [Qemu-devel] [RFC v4 09/27] memory: Prepare for different kinds of IOMMU MR notifiers Eric Auger
@ 2019-05-28  4:48   ` Peter Xu
  2019-05-28 17:11     ` Auger Eric
  0 siblings, 1 reply; 36+ messages in thread
From: Peter Xu @ 2019-05-28  4:48 UTC (permalink / raw)
  To: Eric Auger
  Cc: peter.maydell, drjones, yi.l.liu, mst, jean-philippe.brucker,
	zhangfei.gao, qemu-devel, alex.williamson, qemu-arm,
	vincent.stehle, eric.auger.pro

On Mon, May 27, 2019 at 01:41:45PM +0200, Eric Auger wrote:

[...]

> @@ -3368,8 +3368,9 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>  {
>      IOMMUTLBEntry entry;
>      hwaddr size;
> -    hwaddr start = n->start;
> -    hwaddr end = n->end;
> +

(extra new line)

> +    hwaddr start = n->iotlb_notifier.start;
> +    hwaddr end = n->iotlb_notifier.end;
>      IntelIOMMUState *s = as->iommu_state;
>      DMAMap map;

[...]

>  typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
>                              IOMMUTLBEntry *data);
>  
> -struct IOMMUNotifier {
> +typedef struct IOMMUIOLTBNotifier {
>      IOMMUNotify notify;

Hi, Eric,

I wasn't following the thread much before, so sorry if it is too late
to ask this - have you thought about using the Notifier struct directly?
It would (1) allow the user to register for both IOTLB | CONFIG flags
in the same notifier, whereas currently we need to register one for
each (this worries me a bit: as we grow the types of flags further,
one registration could need quite a few notifiers), and (2) let the
notifier part be shared by different events. Then, when notifying, the
(void *) data can be a union:

struct IOMMUEvent {
  int event; // can be one of the notifier flags
  union {
    struct IOTLBEvent {
      ...
    };
    struct PASIDEvent {
      ...
    };
  }
}

Then the handler hook would be simple too:

handler (data)
{
  switch (data.event) {
    ...
  }
}

I would be fine with the current patch if this series is close to
being merged, because even if we want that, we can do it on top when we
introduce even more notifiers; I just wanted to ask out loud first.
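
As a compilable illustration of the tagged-union idea above, a
minimal sketch might look like the following. All names and fields
here are hypothetical and chosen only for the example; they are not
taken from QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event kinds; one notifier could subscribe to several. */
enum demo_event_type {
    DEMO_EVENT_IOTLB  = 1,
    DEMO_EVENT_CONFIG = 2,
};

/* Hypothetical payloads, one per event kind. */
struct demo_iotlb_event {
    uint64_t iova;
    uint64_t addr_mask;
};

struct demo_config_event {
    uint32_t stream_id;
};

/* Tagged union: 'type' says which member of 'u' is valid. */
struct demo_iommu_event {
    enum demo_event_type type;
    union {
        struct demo_iotlb_event iotlb;
        struct demo_config_event config;
    } u;
};

/*
 * Single handler for all event kinds. Stores one representative value
 * in *out and returns the handled type, or -1 for an unknown type.
 */
int demo_handle_event(const struct demo_iommu_event *ev, uint64_t *out)
{
    switch (ev->type) {
    case DEMO_EVENT_IOTLB:
        *out = ev->u.iotlb.iova;
        return DEMO_EVENT_IOTLB;
    case DEMO_EVENT_CONFIG:
        *out = ev->u.config.stream_id;
        return DEMO_EVENT_CONFIG;
    }
    return -1;
}
```

A new event kind then costs one enum value, one union member and one
case label, with no change to the notifier registration.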

> -    IOMMUNotifierFlag notifier_flags;
>      /* Notify for address space range start <= addr <= end */
>      hwaddr start;
>      hwaddr end;
> +} IOMMUIOLTBNotifier;
> +
> +struct IOMMUNotifier {
> +    IOMMUNotifierFlag notifier_flags;
> +    union {
> +        IOMMUIOLTBNotifier iotlb_notifier;
> +    };
>      int iommu_idx;
>      QLIST_ENTRY(IOMMUNotifier) node;
>  };
> @@ -126,15 +132,18 @@ typedef struct IOMMUNotifier IOMMUNotifier;
>  /* RAM is a persistent kind memory */
>  #define RAM_PMEM (1 << 5)
>  
> -static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
> -                                       IOMMUNotifierFlag flags,
> -                                       hwaddr start, hwaddr end,
> -                                       int iommu_idx)
> +static inline void iommu_iotlb_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
> +                                             IOMMUNotifierFlag flags,
> +                                             hwaddr start, hwaddr end,
> +                                             int iommu_idx)
>  {
> -    n->notify = fn;
> +    assert(flags & IOMMU_NOTIFIER_IOTLB_MAP ||
> +           flags & IOMMU_NOTIFIER_IOTLB_UNMAP);

Can use IOMMU_NOTIFIER_IOTLB_ALL directly?
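
To spell out why the combined mask is equivalent, here is a small
sketch. The flag values are hypothetical and chosen only for the
example, and it assumes the _ALL mask is simply MAP | UNMAP.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
enum demo_notifier_flag {
    DEMO_IOTLB_UNMAP = 0x1,
    DEMO_IOTLB_MAP   = 0x2,
    DEMO_CONFIG      = 0x4,
    DEMO_IOTLB_ALL   = DEMO_IOTLB_MAP | DEMO_IOTLB_UNMAP,
};

/* Long form: test each IOTLB flag separately. */
bool demo_has_iotlb_flag_long(unsigned flags)
{
    return (flags & DEMO_IOTLB_MAP) || (flags & DEMO_IOTLB_UNMAP);
}

/* Short form: one test against the combined mask. */
bool demo_has_iotlb_flag(unsigned flags)
{
    return (flags & DEMO_IOTLB_ALL) != 0;
}
```

Both forms agree for every combination of flags, so the assert can
test the combined mask in a single expression.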

> +    assert(start < end);
>      n->notifier_flags = flags;
> -    n->start = start;
> -    n->end = end;
> +    n->iotlb_notifier.notify = fn;
> +    n->iotlb_notifier.start = start;
> +    n->iotlb_notifier.end = end;
>      n->iommu_idx = iommu_idx;
>  }

Otherwise the patch looks good to me.

Regards,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [RFC v4 08/27] hw/vfio/common: Force nested if iommu requires it
  2019-05-28  2:47   ` Peter Xu
@ 2019-05-28 12:51     ` Auger Eric
  0 siblings, 0 replies; 36+ messages in thread
From: Auger Eric @ 2019-05-28 12:51 UTC (permalink / raw)
  To: Peter Xu
  Cc: peter.maydell, drjones, yi.l.liu, mst, jean-philippe.brucker,
	zhangfei.gao, qemu-devel, alex.williamson, qemu-arm,
	vincent.stehle, eric.auger.pro

Hi Peter,

On 5/28/19 4:47 AM, Peter Xu wrote:
> On Mon, May 27, 2019 at 01:41:44PM +0200, Eric Auger wrote:
>> In case we detect the address space is translated by
>> a virtual IOMMU which requires nested stages, let's set up
>> the container with the VFIO_TYPE1_NESTING_IOMMU iommu_type.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v2 -> v3:
>> - add "nested only is selected if requested by @force_nested"
>>   comment in this patch
>> ---
>>  hw/vfio/common.c | 27 +++++++++++++++++++++++----
>>  1 file changed, 23 insertions(+), 4 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 1f1deff360..99ade21056 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1136,14 +1136,19 @@ static void vfio_put_address_space(VFIOAddressSpace *space)
>>   * vfio_get_iommu_type - selects the richest iommu_type (v2 first)
>>   */
>>  static int vfio_get_iommu_type(VFIOContainer *container,
>> +                               bool force_nested,
>>                                 Error **errp)
>>  {
>> -    int iommu_types[] = { VFIO_TYPE1v2_IOMMU, VFIO_TYPE1_IOMMU,
>> +    int iommu_types[] = { VFIO_TYPE1_NESTING_IOMMU,
>> +                          VFIO_TYPE1v2_IOMMU, VFIO_TYPE1_IOMMU,
>>                            VFIO_SPAPR_TCE_v2_IOMMU, VFIO_SPAPR_TCE_IOMMU };
>>      int i;
>>  
>>      for (i = 0; i < ARRAY_SIZE(iommu_types); i++) {
>>          if (ioctl(container->fd, VFIO_CHECK_EXTENSION, iommu_types[i])) {
>> +            if (iommu_types[i] == VFIO_TYPE1_NESTING_IOMMU && !force_nested) {
> 
> If force_nested==true and if the kernel does not support
> VFIO_TYPE1_NESTING_IOMMU, we will still return other iommu types?
> That seems to not match with what "force" mean here.
> 
> What I feel like is that we want an "iommu_nest_types[]" which only
> contains VFIO_TYPE1_NESTING_IOMMU.  Then:
> 
>         if (nested) {
>                 target_types = iommu_nest_types;
>         } else {
>                 target_types = iommu_types;
>         }
> 
>         foreach (target_types)
>                 ...
> 
>         return -EINVAL;
> 
> Might be clearer?  Then we can drop [2] below since we'll fail earlier
> at [1].

Agreed. I can fail immediately if nested mode was requested but is
not supported. This will be clearer.

Thanks!


Eric
> 
>> +                continue;
>> +            }
>>              return iommu_types[i];
>>          }
>>      }
>> @@ -1152,11 +1157,11 @@ static int vfio_get_iommu_type(VFIOContainer *container,
>>  }
>>  
>>  static int vfio_init_container(VFIOContainer *container, int group_fd,
>> -                               Error **errp)
>> +                               bool force_nested, Error **errp)
>>  {
>>      int iommu_type, ret;
>>  
>> -    iommu_type = vfio_get_iommu_type(container, errp);
>> +    iommu_type = vfio_get_iommu_type(container, force_nested, errp);
>>      if (iommu_type < 0) {
>>          return iommu_type;
> 
> [1]
> 
>>      }
>> @@ -1192,6 +1197,14 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>>      VFIOContainer *container;
>>      int ret, fd;
>>      VFIOAddressSpace *space;
>> +    IOMMUMemoryRegion *iommu_mr;
>> +    bool force_nested = false;
>> +
>> +    if (as != &address_space_memory && memory_region_is_iommu(as->root)) {
>> +        iommu_mr = IOMMU_MEMORY_REGION(as->root);
>> +        memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_VFIO_NESTED,
>> +                                     (void *)&force_nested);
>> +    }
>>  
>>      space = vfio_get_address_space(as);
>>  
>> @@ -1252,12 +1265,18 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>>      QLIST_INIT(&container->giommu_list);
>>      QLIST_INIT(&container->hostwin_list);
>>  
>> -    ret = vfio_init_container(container, group->fd, errp);
>> +    ret = vfio_init_container(container, group->fd, force_nested, errp);
>>      if (ret) {
>>          goto free_container_exit;
>>      }
>>  
>> +    if (force_nested && container->iommu_type != VFIO_TYPE1_NESTING_IOMMU) {
>> +            error_setg(errp, "nested mode requested by the virtual IOMMU "
>> +                       "but not supported by the vfio iommu");
>> +    }
> 
> [2]
> 
>> +
>>      switch (container->iommu_type) {
>> +    case VFIO_TYPE1_NESTING_IOMMU:
>>      case VFIO_TYPE1v2_IOMMU:
>>      case VFIO_TYPE1_IOMMU:
>>      {
>> -- 
>> 2.20.1
>>
> 
> Regards,
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [RFC v4 09/27] memory: Prepare for different kinds of IOMMU MR notifiers
  2019-05-28  4:48   ` Peter Xu
@ 2019-05-28 17:11     ` Auger Eric
  0 siblings, 0 replies; 36+ messages in thread
From: Auger Eric @ 2019-05-28 17:11 UTC (permalink / raw)
  To: Peter Xu
  Cc: peter.maydell, drjones, yi.l.liu, mst, jean-philippe.brucker,
	zhangfei.gao, qemu-devel, alex.williamson, qemu-arm,
	vincent.stehle, eric.auger.pro

Hi Peter,

On 5/28/19 6:48 AM, Peter Xu wrote:
> On Mon, May 27, 2019 at 01:41:45PM +0200, Eric Auger wrote:
> 
> [...]
> 
>> @@ -3368,8 +3368,9 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>>  {
>>      IOMMUTLBEntry entry;
>>      hwaddr size;
>> -    hwaddr start = n->start;
>> -    hwaddr end = n->end;
>> +
> 
> (extra new line)
> 
>> +    hwaddr start = n->iotlb_notifier.start;
>> +    hwaddr end = n->iotlb_notifier.end;
>>      IntelIOMMUState *s = as->iommu_state;
>>      DMAMap map;
> 
> [...]
> 
>>  typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
>>                              IOMMUTLBEntry *data);
>>  
>> -struct IOMMUNotifier {
>> +typedef struct IOMMUIOLTBNotifier {
>>      IOMMUNotify notify;
> 
> Hi, Eric,
> 
> I wasn't following the thread much before so sorry to ask this if too
> late - have you thought about using the Notifier struct direct?
> Because then it'll (1) allow the user to register with both IOTLB |
> CONFIG flags in the same notifier while currently we'll need to
> register one for each (and this worries me a bit on when we grow the
> types of flags further then one register can have quite a few
> notifiers) (2) the notifier part can be shared by different events.
> Then when notify the (void *) data can be an union:
> 
> struct IOMMUEvent {
>   int event; // can be one of the notifier flags
>   union {
>     struct IOTLBEvent {
>       ...
>     };
>     struct PASIDEvent {
>       ...
>     };
>   }
> }

I am currently prototyping your suggestion. I think it would clarify
some parts of the code by making the type of event that is propagated
explicit. I will send a separate RFC for this change.

Thanks!

Eric
> 
> Then the handler hook would be simple too:
> 
> handler (data)
> {
>   switch (data.event) {
>     ...
>   }
> }
> 
> I would be fine with current patch if this series is close to be
> merged because even if we want that we can do that on top when we
> introduce even more notifiers, but just to ask loud first.
> 
>> -    IOMMUNotifierFlag notifier_flags;
>>      /* Notify for address space range start <= addr <= end */
>>      hwaddr start;
>>      hwaddr end;
>> +} IOMMUIOLTBNotifier;
>> +
>> +struct IOMMUNotifier {
>> +    IOMMUNotifierFlag notifier_flags;
>> +    union {
>> +        IOMMUIOLTBNotifier iotlb_notifier;
>> +    };
>>      int iommu_idx;
>>      QLIST_ENTRY(IOMMUNotifier) node;
>>  };
>> @@ -126,15 +132,18 @@ typedef struct IOMMUNotifier IOMMUNotifier;
>>  /* RAM is a persistent kind memory */
>>  #define RAM_PMEM (1 << 5)
>>  
>> -static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
>> -                                       IOMMUNotifierFlag flags,
>> -                                       hwaddr start, hwaddr end,
>> -                                       int iommu_idx)
>> +static inline void iommu_iotlb_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
>> +                                             IOMMUNotifierFlag flags,
>> +                                             hwaddr start, hwaddr end,
>> +                                             int iommu_idx)
>>  {
>> -    n->notify = fn;
>> +    assert(flags & IOMMU_NOTIFIER_IOTLB_MAP ||
>> +           flags & IOMMU_NOTIFIER_IOTLB_UNMAP);
> 
> Can use IOMMU_NOTIFIER_IOTLB_ALL directly?
> 
>> +    assert(start < end);
>>      n->notifier_flags = flags;
>> -    n->start = start;
>> -    n->end = end;
>> +    n->iotlb_notifier.notify = fn;
>> +    n->iotlb_notifier.start = start;
>> +    n->iotlb_notifier.end = end;
>>      n->iommu_idx = iommu_idx;
>>  }
> 
> Otherwise the patch looks good to me.
> 
> Regards,
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration
  2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
                   ` (27 preceding siblings ...)
  2019-05-27 12:31 ` [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration no-reply
@ 2019-07-11  1:53 ` Zhangfei Gao
  2019-07-11  5:55   ` Auger Eric
  28 siblings, 1 reply; 36+ messages in thread
From: Zhangfei Gao @ 2019-07-11  1:53 UTC (permalink / raw)
  To: Eric Auger
  Cc: Peter Maydell, drjones, yi.l.liu, Michael S. Tsirkin,
	jean-philippe.brucker, zhangfei.gao, qemu-devel, peterx,
	alex.williamson, qemu-arm, vincent.stehle, eric.auger.pro

On Mon, May 27, 2019 at 7:44 PM Eric Auger <eric.auger@redhat.com> wrote:
>
> Up to now vSMMUv3 has not been integrated with VFIO. VFIO
> integration requires to program the physical IOMMU consistently
> with the guest mappings. However, as opposed to VTD, SMMUv3 has
> no "Caching Mode" which allows easy trapping of guest mappings.
> This means the vSMMUV3 cannot use the same VFIO integration as VTD.
>
> However SMMUv3 has 2 translation stages. This was devised with
> virtualization use case in mind where stage 1 is "owned" by the
> guest whereas the host uses stage 2 for VM isolation.
>
> This series sets up this nested translation stage. It only works
> if there is one physical SMMUv3 used along with QEMU vSMMUv3 (in
> other words, it does not work if there is a physical SMMUv2).
>
> The series uses a new kernel user API [1], still under definition.
>
> - We force the host to use stage 2 instead of stage 1, when we
>   detect a vSMMUv3 is behind a VFIO device. For a VFIO device
>   without any virtual IOMMU, we still use stage 1 as many existing
>   SMMUs expect this behavior.
> - We introduce new IOTLB "config" notifiers, requested to notify
>   changes in the config of a given iommu memory region. So now
>   we have notifiers for IOTLB changes and config changes.
> - vSMMUv3 calls config notifiers when STE (Stream Table Entries)
>   are updated by the guest.
> - We implement a specific UNMAP notifier that conveys guest
>   IOTLB invalidations to the host.
> - We implement a new MAP notifier, used only for MSI IOVAs, so
>   that the host can build a nested stage translation for MSI IOVAs.
> - As the legacy MAP notifier is not called anymore, we must make
>   sure stage 2 mappings are set. This is achieved through another
>   memory listener.
> - Physical SMMU faults are reported to the guest via an eventfd
>   mechanism and reinjected into the guest.
>
> Note: The first patch is a code cleanup and was sent separately.
>
> Best Regards
>
> Eric
>
> This series can be found at:
> https://github.com/eauger/qemu/tree/v4.0.0-2stage-rfcv4
>
> Compatible with kernel series:
> [PATCH v8 00/29] SMMUv3 Nested Stage Setup
> (https://lkml.org/lkml/2019/5/26/95)
>

I have tested VFIO mode in QEMU on an arm64 platform.

Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
qemu: https://github.com/eauger/qemu/tree/v4.0.0-2stage-rfcv4
kernel: https://github.com/eauger/linux/tree/v5.2-rc1-2stage-v8



* Re: [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration
  2019-07-11  1:53 ` Zhangfei Gao
@ 2019-07-11  5:55   ` Auger Eric
  2019-07-11  6:18     ` Zhangfei Gao
  0 siblings, 1 reply; 36+ messages in thread
From: Auger Eric @ 2019-07-11  5:55 UTC (permalink / raw)
  To: Zhangfei Gao
  Cc: Peter Maydell, drjones, yi.l.liu, Michael S. Tsirkin,
	jean-philippe.brucker, zhangfei.gao, qemu-devel, peterx,
	alex.williamson, qemu-arm, vincent.stehle, eric.auger.pro

Hi Zhangfei,

On 7/11/19 3:53 AM, Zhangfei Gao wrote:
> I have tested VFIO mode in QEMU on an arm64 platform.
> 
> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> qemu: https://github.com/eauger/qemu/tree/v4.0.0-2stage-rfcv4
> kernel: https://github.com/eauger/linux/tree/v5.2-rc1-2stage-v8

Your testing is really appreciated.

Both the kernel and QEMU series will be respun. I am currently waiting
for the 5.3 kernel window, as it will resolve some dependencies on the
fault reporting APIs. My focus is to get the updated kernel series
reviewed and tested, and then refine the QEMU integration accordingly.

Thanks

Eric



* Re: [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration
  2019-07-11  5:55   ` Auger Eric
@ 2019-07-11  6:18     ` Zhangfei Gao
  0 siblings, 0 replies; 36+ messages in thread
From: Zhangfei Gao @ 2019-07-11  6:18 UTC (permalink / raw)
  To: Auger Eric
  Cc: Peter Maydell, drjones, yi.l.liu, Michael S. Tsirkin,
	jean-philippe.brucker, zhangfei.gao, qemu-devel, peterx,
	alex.williamson, qemu-arm, vincent.stehle, eric.auger.pro

On Thu, Jul 11, 2019 at 1:55 PM Auger Eric <eric.auger@redhat.com> wrote:
>
> Hi Zhangfei,
>
> On 7/11/19 3:53 AM, Zhangfei Gao wrote:
> > I have tested VFIO mode in QEMU on an arm64 platform.
> >
> > Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> > qemu: https://github.com/eauger/qemu/tree/v4.0.0-2stage-rfcv4
> > kernel: https://github.com/eauger/linux/tree/v5.2-rc1-2stage-v8
>
> Your testing is really appreciated.
>
> Both the kernel and QEMU series will be respun. I am currently waiting
> for the 5.3 kernel window, as it will resolve some dependencies on the
> fault reporting APIs. My focus is to get the updated kernel series
> reviewed and tested, and then refine the QEMU integration accordingly.
>
Thanks Eric, that's great, since I found that the kernel part
(drivers/iommu/arm-smmu-v3.c) conflicts with Jean's SVA patch,
especially this one:
iommu/smmuv3: Dynamically allocate s1_cfg and s2_cfg

Thanks


Thread overview: 36+ messages
2019-05-27 11:41 [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 01/27] vfio/common: Introduce vfio_set_irq_signaling helper Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 02/27] update-linux-headers: Import iommu.h Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 03/27] update-linux-headers: Add sve_context.h to asm-arm64 Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 04/27] header update against 5.2.0-rc1 and IOMMU/VFIO nested stage APIs Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 05/27] memory: add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 06/27] memory: add IOMMU_ATTR_MSI_TRANSLATE " Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 07/27] hw/arm/smmuv3: Advertise VFIO_NESTED and MSI_TRANSLATE attributes Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 08/27] hw/vfio/common: Force nested if iommu requires it Eric Auger
2019-05-28  2:47   ` Peter Xu
2019-05-28 12:51     ` Auger Eric
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 09/27] memory: Prepare for different kinds of IOMMU MR notifiers Eric Auger
2019-05-28  4:48   ` Peter Xu
2019-05-28 17:11     ` Auger Eric
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 10/27] memory: Add IOMMUConfigNotifier Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 11/27] memory: Add arch_id and leaf fields in IOTLBEntry Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 12/27] hw/arm/smmuv3: Store the PASID table GPA in the translation config Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 13/27] hw/arm/smmuv3: Implement dummy replay Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 14/27] hw/arm/smmuv3: Fill the IOTLBEntry arch_id on NH_VA invalidation Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 15/27] hw/arm/smmuv3: Fill the IOTLBEntry leaf field " Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 16/27] hw/arm/smmuv3: Notify on config changes Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 17/27] hw/vfio/common: Introduce vfio_alloc_guest_iommu helper Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 18/27] hw/vfio/common: Introduce hostwin_from_range helper Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 19/27] hw/vfio/common: Introduce helpers to DMA map/unmap a RAM section Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 20/27] hw/vfio/common: Setup nested stage mappings Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 21/27] hw/vfio/common: Register a MAP notifier for MSI binding Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 22/27] vfio-pci: Expose MSI stage 1 bindings to the host Eric Auger
2019-05-27 11:41 ` [Qemu-devel] [RFC v4 23/27] memory: Introduce IOMMU Memory Region inject_faults API Eric Auger
2019-05-27 11:42 ` [Qemu-devel] [RFC v4 24/27] hw/arm/smmuv3: Implement fault injection Eric Auger
2019-05-27 11:42 ` [Qemu-devel] [RFC v4 25/27] vfio-pci: register handler for iommu fault Eric Auger
2019-05-27 11:42 ` [Qemu-devel] [RFC v4 26/27] vfio-pci: Set up fault regions Eric Auger
2019-05-27 11:42 ` [Qemu-devel] [RFC v4 27/27] vfio-pci: Implement the DMA fault handler Eric Auger
2019-05-27 12:31 ` [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration no-reply
2019-07-11  1:53 ` Zhangfei Gao
2019-07-11  5:55   ` Auger Eric
2019-07-11  6:18     ` Zhangfei Gao
