* [Qemu-devel] [PATCH v5 0/3] IOMMU: intel_iommu support map and unmap notifications
From: Aviv B.D @ 2016-10-20 21:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S. Tsirkin, Alex Williamson, Peter Xu, Jan Kiszka,
	Aviv Ben-David

From: "Aviv Ben-David" <bd.aviv@gmail.com>

* Advertise the Cache Mode capability in the IOMMU capability register.
  This capability is controlled by the "cache-mode" property of the intel-iommu
  device. To enable it, invoke QEMU with "-device intel-iommu,cache-mode=true".

* On IOTLB page invalidation in the Intel vIOMMU, check whether the invalidated
  domain belongs to a registered notifier, and notify accordingly (see the
  sketch below).
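
As a point of reference, a consumer such as VFIO hooks into these notifications
roughly as sketched below. This is only a sketch: the handler and helper names
are made up, and the notifier fields are the ones this series relies on.

    /* Sketch only: my_map_unmap_notify is an illustrative handler name. */
    static void my_map_unmap_notify(IOMMUNotifier *n, IOMMUTLBEntry *entry)
    {
        if (entry->perm == IOMMU_NONE) {
            /* unmap notification: tear down the mapping at entry->iova */
        } else {
            /* map notification: translated_addr and addr_mask describe it */
        }
    }

    static IOMMUNotifier my_notifier = {
        .notify         = my_map_unmap_notify,
        .notifier_flags = IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP,
    };

    /* Illustrative registration against an IOMMU MemoryRegion. */
    static void my_register(MemoryRegion *iommu_mr)
    {
        memory_region_register_iommu_notifier(iommu_mr, &my_notifier);
    }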

Currently this series still does not enable VFIO device support with a vIOMMU
present. Current problems:
* vfio_iommu_map_notify is not aware of the memory range belonging to a
  specific VFIOGuestIOMMU.
* memory_region_iommu_replay hangs QEMU at startup while it iterates over the
  64-bit address space. Commenting out the call to this function yields a
  working VFIO device while the vIOMMU is present.

Changes from v1 to v2:
* remove the assumption that the cache is never cleared
* fix a lockup under high load.

Changes from v2 to v3:
* remove debug leftovers
* split into separate commits
* change is_write to flags in vtd_do_iommu_translate, add IOMMU_NO_FAIL
  to suppress errors from propagating to the guest.

Changes from v3 to v4:
* Add a property to the intel_iommu device to control the CM capability,
  defaulting to false.
* Use s->iommu_ops.notify_flag_changed to register notifiers.

Changes from v4 to v4 RESEND:
* Fix coding style issues pointed out by the checkpatch.pl script.

Changes from v4 to v5:
* Reduce the number of changes in patch 2 and make flags a real bitfield.
* Revert deleted debug prints.
* Fix memory leak in patch 3.

Aviv Ben-David (3):
  IOMMU: add option to enable VTD_CAP_CM to vIOMMU capability exposed to
    guest
  IOMMU: change iommu_op->translate's is_write to flags, add support to
    NO_FAIL flag mode
  IOMMU: enable intel_iommu map and unmap notifiers

 exec.c                         |   3 +-
 hw/i386/amd_iommu.c            |   4 +-
 hw/i386/intel_iommu.c          | 171 ++++++++++++++++++++++++++++++++++-------
 hw/i386/intel_iommu_internal.h |   3 +
 hw/ppc/spapr_iommu.c           |   2 +-
 include/exec/memory.h          |   6 +-
 include/hw/i386/intel_iommu.h  |  11 +++
 memory.c                       |   3 +-
 8 files changed, 168 insertions(+), 35 deletions(-)

-- 
1.9.1


* [Qemu-devel] [PATCH v5 1/3] IOMMU: add option to enable VTD_CAP_CM to vIOMMU capability exposed to guest
From: Aviv B.D @ 2016-10-20 21:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S. Tsirkin, Alex Williamson, Peter Xu, Jan Kiszka,
	Aviv Ben-David

From: "Aviv Ben-David" <bd.aviv@gmail.com>

This capability asks the guest to issue a cache invalidation for each map
operation. We can use these invalidations to trap map operations in the
hypervisor.
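
For reference, the guest discovers this by testing bit 7 (CM) of the capability
register. A rough sketch with an illustrative helper name:

    /* Sketch: Cache Mode is bit 7 of the VT-d capability register. */
    static inline bool vtd_cap_has_cache_mode(uint64_t cap_reg)
    {
        return (cap_reg >> 7) & 1;    /* same bit as VTD_CAP_CM (1ULL << 7) */
    }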

Signed-off-by: Aviv Ben-David <bd.aviv@gmail.com>
---
 hw/i386/intel_iommu.c          | 5 +++++
 hw/i386/intel_iommu_internal.h | 1 +
 include/hw/i386/intel_iommu.h  | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 1655a65..834887f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2017,6 +2017,7 @@ static Property vtd_properties[] = {
     DEFINE_PROP_ON_OFF_AUTO("eim", IntelIOMMUState, intr_eim,
                             ON_OFF_AUTO_AUTO),
     DEFINE_PROP_BOOL("x-buggy-eim", IntelIOMMUState, buggy_eim, false),
+    DEFINE_PROP_BOOL("cache-mode", IntelIOMMUState, cache_mode_enabled, FALSE),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -2391,6 +2392,10 @@ static void vtd_init(IntelIOMMUState *s)
         assert(s->intr_eim != ON_OFF_AUTO_AUTO);
     }
 
+    if (s->cache_mode_enabled) {
+        s->cap |= VTD_CAP_CM;
+    }
+
     vtd_reset_context_cache(s);
     vtd_reset_iotlb(s);
 
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 0829a50..35d9f3a 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -201,6 +201,7 @@
 #define VTD_CAP_MAMV                (VTD_MAMV << 48)
 #define VTD_CAP_PSI                 (1ULL << 39)
 #define VTD_CAP_SLLPS               ((1ULL << 34) | (1ULL << 35))
+#define VTD_CAP_CM                  (1ULL << 7)
 
 /* Supported Adjusted Guest Address Widths */
 #define VTD_CAP_SAGAW_SHIFT         8
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 1989c1e..42d293f 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -258,6 +258,8 @@ struct IntelIOMMUState {
     uint8_t womask[DMAR_REG_SIZE];  /* WO (write only - read returns 0) */
     uint32_t version;
 
+    bool cache_mode_enabled;        /* RO - is cap CM enabled? */
+
     dma_addr_t root;                /* Current root table pointer */
     bool root_extended;             /* Type of root table (extended or not) */
     bool dmar_enabled;              /* Set if DMA remapping is enabled */
-- 
1.9.1


* [Qemu-devel] [PATCH v5 2/3] IOMMU: change iommu_op->translate's is_write to flags, add support to NO_FAIL flag mode
From: Aviv B.D @ 2016-10-20 21:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S. Tsirkin, Alex Williamson, Peter Xu, Jan Kiszka,
	Aviv Ben-David

From: "Aviv Ben-David" <bd.aviv@gmail.com>

Support translation attempts that do not report errors to the guest on
translation failure.
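
A caller that only wants to probe whether a mapping exists (as the notification
code in patch 3 does) can pass the new flag and test the returned permission.
A rough sketch, assuming iommu_mr and addr are in scope:

    IOMMUTLBEntry entry = iommu_mr->iommu_ops->translate(iommu_mr, addr,
                                                         IOMMU_NO_FAIL);
    if (entry.perm != IOMMU_NONE) {
        /* mapping present: entry.translated_addr and entry.addr_mask apply */
    } else {
        /* no mapping, and no fault was reported to the guest */
    }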

Signed-off-by: Aviv Ben-David <bd.aviv@gmail.com>
---
 exec.c                |  3 ++-
 hw/i386/amd_iommu.c   |  4 ++--
 hw/i386/intel_iommu.c | 59 +++++++++++++++++++++++++++++++++------------------
 hw/ppc/spapr_iommu.c  |  2 +-
 include/exec/memory.h |  6 ++++--
 memory.c              |  3 ++-
 6 files changed, 49 insertions(+), 28 deletions(-)

diff --git a/exec.c b/exec.c
index e63c5a1..9e2e5ca 100644
--- a/exec.c
+++ b/exec.c
@@ -432,7 +432,8 @@ MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
             break;
         }
 
-        iotlb = mr->iommu_ops->translate(mr, addr, is_write);
+        iotlb = mr->iommu_ops->translate(mr, addr,
+                                         is_write ? IOMMU_WO : IOMMU_RO);
         addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
                 | (addr & iotlb.addr_mask));
         *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 47b79d9..1f0d76b 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -988,7 +988,7 @@ static inline bool amdvi_is_interrupt_addr(hwaddr addr)
 }
 
 static IOMMUTLBEntry amdvi_translate(MemoryRegion *iommu, hwaddr addr,
-                                     bool is_write)
+                                     IOMMUAccessFlags flags)
 {
     AMDVIAddressSpace *as = container_of(iommu, AMDVIAddressSpace, iommu);
     AMDVIState *s = as->iommu_state;
@@ -1017,7 +1017,7 @@ static IOMMUTLBEntry amdvi_translate(MemoryRegion *iommu, hwaddr addr,
         return ret;
     }
 
-    amdvi_do_translate(as, addr, is_write, &ret);
+    amdvi_do_translate(as, addr, flags & IOMMU_WO, &ret);
     trace_amdvi_translation_result(as->bus_num, PCI_SLOT(as->devfn),
             PCI_FUNC(as->devfn), addr, ret.translated_addr);
     return ret;
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 834887f..0821079 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -631,7 +631,8 @@ static bool vtd_slpte_nonzero_rsvd(uint64_t slpte, uint32_t level)
 /* Given the @gpa, get relevant @slptep. @slpte_level will be the last level
  * of the translation, can be used for deciding the size of large page.
  */
-static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa, bool is_write,
+static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa,
+                            IOMMUAccessFlags flags,
                             uint64_t *slptep, uint32_t *slpte_level,
                             bool *reads, bool *writes)
 {
@@ -640,7 +641,7 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa, bool is_write,
     uint32_t offset;
     uint64_t slpte;
     uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
-    uint64_t access_right_check;
+    uint64_t access_right_check = 0;
 
     /* Check if @gpa is above 2^X-1, where X is the minimum of MGAW in CAP_REG
      * and AW in context-entry.
@@ -651,7 +652,15 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa, bool is_write,
     }
 
     /* FIXME: what is the Atomics request here? */
-    access_right_check = is_write ? VTD_SL_W : VTD_SL_R;
+    if (flags & IOMMU_WO) {
+        access_right_check |= VTD_SL_W;
+    }
+    if (flags & IOMMU_RO) {
+        access_right_check |= VTD_SL_R;
+    }
+    if (flags & IOMMU_NO_FAIL) {
+        access_right_check |= VTD_SL_R | VTD_SL_W;
+    }
 
     while (true) {
         offset = vtd_gpa_level_offset(gpa, level);
@@ -673,8 +682,8 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa, bool is_write,
         if (!(slpte & access_right_check)) {
             VTD_DPRINTF(GENERAL, "error: lack of %s permission for "
                         "gpa 0x%"PRIx64 " slpte 0x%"PRIx64,
-                        (is_write ? "write" : "read"), gpa, slpte);
-            return is_write ? -VTD_FR_WRITE : -VTD_FR_READ;
+                        (flags == IOMMU_WO ? "write" : "read"), gpa, slpte);
+            return (flags & IOMMU_WO) ? -VTD_FR_WRITE : -VTD_FR_READ;
         }
         if (vtd_slpte_nonzero_rsvd(slpte, level)) {
             VTD_DPRINTF(GENERAL, "error: non-zero reserved field in second "
@@ -791,11 +800,13 @@ static inline bool vtd_is_interrupt_addr(hwaddr addr)
  *
  * @bus_num: The bus number
  * @devfn: The devfn, which is the  combined of device and function number
- * @is_write: The access is a write operation
+ * @flags: The access permission of the operation, use IOMMU_NO_FAIL to
+ *         suppress translation errors (e.g. no mapping present)
  * @entry: IOMMUTLBEntry that contain the addr to be translated and result
  */
 static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
-                                   uint8_t devfn, hwaddr addr, bool is_write,
+                                   uint8_t devfn, hwaddr addr,
+                                   IOMMUAccessFlags flags,
                                    IOMMUTLBEntry *entry)
 {
     IntelIOMMUState *s = vtd_as->iommu_state;
@@ -813,7 +824,7 @@ static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
 
     /* Check if the request is in interrupt address range */
     if (vtd_is_interrupt_addr(addr)) {
-        if (is_write) {
+        if (flags & IOMMU_WO) {
             /* FIXME: since we don't know the length of the access here, we
              * treat Non-DWORD length write requests without PASID as
              * interrupt requests, too. Withoud interrupt remapping support,
@@ -829,7 +840,8 @@ static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
         } else {
             VTD_DPRINTF(GENERAL, "error: read request from interrupt address "
                         "gpa 0x%"PRIx64, addr);
-            vtd_report_dmar_fault(s, source_id, addr, VTD_FR_READ, is_write);
+            vtd_report_dmar_fault(s, source_id, addr, VTD_FR_READ,
+                    flags & IOMMU_WO);
             return;
         }
     }
@@ -858,12 +870,15 @@ static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
         is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD;
         if (ret_fr) {
             ret_fr = -ret_fr;
-            if (is_fpd_set && vtd_is_qualified_fault(ret_fr)) {
-                VTD_DPRINTF(FLOG, "fault processing is disabled for DMA "
+            if (!(flags & IOMMU_NO_FAIL)) {
+                if (is_fpd_set && vtd_is_qualified_fault(ret_fr)) {
+                    VTD_DPRINTF(FLOG, "fault processing is disabled for DMA "
                             "requests through this context-entry "
                             "(with FPD Set)");
-            } else {
-                vtd_report_dmar_fault(s, source_id, addr, ret_fr, is_write);
+                } else {
+                    vtd_report_dmar_fault(s, source_id, addr, ret_fr,
+                            flags & IOMMU_WO);
+                }
             }
             return;
         }
@@ -876,15 +891,17 @@ static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
         cc_entry->context_cache_gen = s->context_cache_gen;
     }
 
-    ret_fr = vtd_gpa_to_slpte(&ce, addr, is_write, &slpte, &level,
+    ret_fr = vtd_gpa_to_slpte(&ce, addr, flags, &slpte, &level,
                               &reads, &writes);
     if (ret_fr) {
         ret_fr = -ret_fr;
-        if (is_fpd_set && vtd_is_qualified_fault(ret_fr)) {
-            VTD_DPRINTF(FLOG, "fault processing is disabled for DMA requests "
-                        "through this context-entry (with FPD Set)");
-        } else {
-            vtd_report_dmar_fault(s, source_id, addr, ret_fr, is_write);
+        if (!(flags & IOMMU_NO_FAIL)) {
+            if (is_fpd_set && vtd_is_qualified_fault(ret_fr)) {
+                VTD_DPRINTF(FLOG, "fault processing is disabled for DMA "
+                        "requests through this context-entry (with FPD Set)");
+            } else {
+                vtd_report_dmar_fault(s, source_id, addr, ret_fr, flags & IOMMU_WO);
+            }
         }
         return;
     }
@@ -1946,7 +1963,7 @@ static void vtd_mem_write(void *opaque, hwaddr addr,
 }
 
 static IOMMUTLBEntry vtd_iommu_translate(MemoryRegion *iommu, hwaddr addr,
-                                         bool is_write)
+                                         IOMMUAccessFlags flags)
 {
     VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
     IntelIOMMUState *s = vtd_as->iommu_state;
@@ -1968,7 +1985,7 @@ static IOMMUTLBEntry vtd_iommu_translate(MemoryRegion *iommu, hwaddr addr,
     }
 
     vtd_do_iommu_translate(vtd_as, vtd_as->bus, vtd_as->devfn, addr,
-                           is_write, &ret);
+                           flags, &ret);
     VTD_DPRINTF(MMU,
                 "bus %"PRIu8 " slot %"PRIu8 " func %"PRIu8 " devfn %"PRIu8
                 " gpa 0x%"PRIx64 " hpa 0x%"PRIx64, pci_bus_num(vtd_as->bus),
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index ae30bbe..d19c3ff 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -110,7 +110,7 @@ static void spapr_tce_free_table(uint64_t *table, int fd, uint32_t nb_table)
 
 /* Called from RCU critical section */
 static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr,
-                                               bool is_write)
+                                               IOMMUAccessFlags flags)
 {
     sPAPRTCETable *tcet = container_of(iommu, sPAPRTCETable, iommu);
     uint64_t tce;
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 10d7eac..0d4acb9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -57,6 +57,7 @@ typedef enum {
     IOMMU_RO   = 1,
     IOMMU_WO   = 2,
     IOMMU_RW   = 3,
+    IOMMU_NO_FAIL  = 4, /* may not be present, don't report errors to guest */
 } IOMMUAccessFlags;
 
 struct IOMMUTLBEntry {
@@ -168,10 +169,11 @@ struct MemoryRegionOps {
 };
 
 typedef struct MemoryRegionIOMMUOps MemoryRegionIOMMUOps;
-
 struct MemoryRegionIOMMUOps {
     /* Return a TLB entry that contains a given address. */
-    IOMMUTLBEntry (*translate)(MemoryRegion *iommu, hwaddr addr, bool is_write);
+    IOMMUTLBEntry (*translate)(MemoryRegion *iommu,
+                               hwaddr addr,
+                               IOMMUAccessFlags flags);
     /* Returns minimum supported page size */
     uint64_t (*get_min_page_size)(MemoryRegion *iommu);
     /* Called when IOMMU Notifier flag changed */
diff --git a/memory.c b/memory.c
index 58f9269..dfbb9a0 100644
--- a/memory.c
+++ b/memory.c
@@ -1563,7 +1563,8 @@ void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
     granularity = memory_region_iommu_get_min_page_size(mr);
 
     for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
-        iotlb = mr->iommu_ops->translate(mr, addr, is_write);
+        iotlb = mr->iommu_ops->translate(mr, addr,
+                                         is_write ? IOMMU_WO : IOMMU_RO);
         if (iotlb.perm != IOMMU_NONE) {
             n->notify(n, &iotlb);
         }
-- 
1.9.1


* [Qemu-devel] [PATCH v5 3/3] IOMMU: enable intel_iommu map and unmap notifiers
From: Aviv B.D @ 2016-10-20 21:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S. Tsirkin, Alex Williamson, Peter Xu, Jan Kiszka,
	Aviv Ben-David

From: "Aviv Ben-David" <bd.aviv@gmail.com>

Adds a list of registered vtd_as's to the Intel IOMMU state to avoid iterating
over every PCI device when searching for the corresponding domain.
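
The page-selective invalidations carry an address-mask order "am". A quick
worked example of how that becomes the notifier entries below, using the 4K
page shift (12) and the VTD_PAGE_MASK() helper from intel_iommu_internal.h:

    /* am = 2: the invalidation covers (1 << 2) = 4 pages = 16KiB.           */
    /* entry.addr_mask = ~VTD_PAGE_MASK(12 + 2) = (1ULL << 14) - 1 = 0x3fff. */
    /* The map path then walks addr .. addr + 16KiB, one mapping at a time.  */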

Signed-off-by: Aviv Ben-David <bd.aviv@gmail.com>
---
 hw/i386/intel_iommu.c          | 109 ++++++++++++++++++++++++++++++++++++++---
 hw/i386/intel_iommu_internal.h |   2 +
 include/hw/i386/intel_iommu.h  |   9 ++++
 3 files changed, 112 insertions(+), 8 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 0821079..46e9b8e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -54,6 +54,9 @@ static int vtd_dbgflags = VTD_DBGBIT(GENERAL) | VTD_DBGBIT(CSR);
 #define VTD_DPRINTF(what, fmt, ...) do {} while (0)
 #endif
 
+static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
+                                    uint8_t devfn, VTDContextEntry *ce);
+
 static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
                             uint64_t wmask, uint64_t w1cmask)
 {
@@ -145,6 +148,23 @@ static uint64_t vtd_set_clear_mask_quad(IntelIOMMUState *s, hwaddr addr,
     return new_val;
 }
 
+static int vtd_get_did_dev(IntelIOMMUState *s, uint8_t bus_num, uint8_t devfn,
+                           uint16_t *domain_id)
+{
+    VTDContextEntry ce;
+    int ret_fr;
+
+    assert(domain_id);
+
+    ret_fr = vtd_dev_to_context_entry(s, bus_num, devfn, &ce);
+    if (ret_fr) {
+        return -1;
+    }
+
+    *domain_id = VTD_CONTEXT_ENTRY_DID(ce.hi);
+    return 0;
+}
+
 /* GHashTable functions */
 static gboolean vtd_uint64_equal(gconstpointer v1, gconstpointer v2)
 {
@@ -679,10 +699,10 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa,
         }
         *reads = (*reads) && (slpte & VTD_SL_R);
         *writes = (*writes) && (slpte & VTD_SL_W);
-        if (!(slpte & access_right_check)) {
+        if (!(slpte & access_right_check) && !(flags & IOMMU_NO_FAIL)) {
             VTD_DPRINTF(GENERAL, "error: lack of %s permission for "
                         "gpa 0x%"PRIx64 " slpte 0x%"PRIx64,
-                        (flags == IOMMU_WO ? "write" : "read"), gpa, slpte);
+                        (flags & IOMMU_WO ? "write" : "read"), gpa, slpte);
             return (flags & IOMMU_WO) ? -VTD_FR_WRITE : -VTD_FR_READ;
         }
         if (vtd_slpte_nonzero_rsvd(slpte, level)) {
@@ -1064,6 +1084,55 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domain_id)
                                 &domain_id);
 }
 
+static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
+                                           uint16_t domain_id, hwaddr addr,
+                                           uint8_t am)
+{
+    IntelIOMMUNotifierNode *node;
+
+    QLIST_FOREACH(node, &(s->notifiers_list), next) {
+        VTDAddressSpace *vtd_as = node->vtd_as;
+        uint16_t vfio_domain_id;
+        int ret = vtd_get_did_dev(s, pci_bus_num(vtd_as->bus), vtd_as->devfn,
+                                  &vfio_domain_id);
+        if (!ret && domain_id == vfio_domain_id) {
+            IOMMUTLBEntry entry;
+
+            /* notify unmap */
+            if (node->notifier_flag & IOMMU_NOTIFIER_UNMAP) {
+                VTD_DPRINTF(GENERAL, "Remove addr 0x%"PRIx64 " mask %d",
+                            addr, am);
+                entry.target_as = &address_space_memory;
+                entry.iova = addr & VTD_PAGE_MASK_4K;
+                entry.translated_addr = 0;
+                entry.addr_mask = ~VTD_PAGE_MASK(VTD_PAGE_SHIFT_4K + am);
+                entry.perm = IOMMU_NONE;
+                memory_region_notify_iommu(&node->vtd_as->iommu, entry);
+            }
+
+            /* notify map */
+            if (node->notifier_flag & IOMMU_NOTIFIER_MAP) {
+                hwaddr original_addr = addr;
+                VTD_DPRINTF(GENERAL, "add addr 0x%"PRIx64 " mask %d", addr, am);
+                while (addr < original_addr + (1 << am) * VTD_PAGE_SIZE) {
+                    /* call to vtd_iommu_translate */
+                    IOMMUTLBEntry entry = s->iommu_ops.translate(
+                                                         &node->vtd_as->iommu,
+                                                         addr,
+                                                         IOMMU_NO_FAIL);
+                    if (entry.perm != IOMMU_NONE) {
+                        addr += entry.addr_mask + 1;
+                        memory_region_notify_iommu(&node->vtd_as->iommu, entry);
+                    } else {
+                        addr += VTD_PAGE_SIZE;
+                    }
+                }
+            }
+        }
+    }
+}
+
+
 static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
                                       hwaddr addr, uint8_t am)
 {
@@ -1074,6 +1143,8 @@ static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
     info.addr = addr;
     info.mask = ~((1 << am) - 1);
     g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page, &info);
+
+    vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am);
 }
 
 /* Flush IOTLB
@@ -1999,15 +2070,37 @@ static void vtd_iommu_notify_flag_changed(MemoryRegion *iommu,
                                           IOMMUNotifierFlag new)
 {
     VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
+    IntelIOMMUState *s = vtd_as->iommu_state;
+    IntelIOMMUNotifierNode *node = NULL;
+    IntelIOMMUNotifierNode *next_node = NULL;
 
-    if (new & IOMMU_NOTIFIER_MAP) {
-        error_report("Device at bus %s addr %02x.%d requires iommu "
-                     "notifier which is currently not supported by "
-                     "intel-iommu emulation",
-                     vtd_as->bus->qbus.name, PCI_SLOT(vtd_as->devfn),
-                     PCI_FUNC(vtd_as->devfn));
+    if (!s->cache_mode_enabled && new & IOMMU_NOTIFIER_MAP) {
+        error_report("We need to set cache_mode=1 for intel-iommu to enable "
+                     "device assignment with IOMMU protection.");
         exit(1);
     }
+
+    /* Add a new node if no notifier existed before this call */
+    if (old == IOMMU_NOTIFIER_NONE) {
+        node = g_malloc0(sizeof(*node));
+        node->vtd_as = vtd_as;
+        node->notifier_flag = new;
+        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
+        return;
+    }
+
+    /* update notifier node with new flags */
+    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+        if (node->vtd_as == vtd_as) {
+            if (new == IOMMU_NOTIFIER_NONE) {
+                QLIST_REMOVE(node, next);
+                g_free(node);
+            } else {
+                node->notifier_flag = new;
+            }
+            return;
+        }
+    }
 }
 
 static const VMStateDescription vtd_vmstate = {
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 35d9f3a..96df4ae 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -381,6 +381,8 @@ typedef struct VTDIOTLBPageInvInfo VTDIOTLBPageInvInfo;
 #define VTD_PAGE_SHIFT_1G           30
 #define VTD_PAGE_MASK_1G            (~((1ULL << VTD_PAGE_SHIFT_1G) - 1))
 
+#define VTD_PAGE_MASK(shift)        (~((1ULL << (shift)) - 1))
+
 struct VTDRootEntry {
     uint64_t val;
     uint64_t rsvd;
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 42d293f..ebe9250 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -63,6 +63,7 @@ typedef union VTD_IR_TableEntry VTD_IR_TableEntry;
 typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
 typedef struct VTDIrq VTDIrq;
 typedef struct VTD_MSIMessage VTD_MSIMessage;
+typedef struct IntelIOMMUNotifierNode IntelIOMMUNotifierNode;
 
 /* Context-Entry */
 struct VTDContextEntry {
@@ -248,6 +249,12 @@ struct VTD_MSIMessage {
 /* When IR is enabled, all MSI/MSI-X data bits should be zero */
 #define VTD_IR_MSI_DATA          (0)
 
+struct IntelIOMMUNotifierNode {
+    VTDAddressSpace *vtd_as;
+    IOMMUNotifierFlag notifier_flag;
+    QLIST_ENTRY(IntelIOMMUNotifierNode) next;
+};
+
 /* The iommu (DMAR) device state struct */
 struct IntelIOMMUState {
     X86IOMMUState x86_iommu;
@@ -285,6 +292,8 @@ struct IntelIOMMUState {
     MemoryRegionIOMMUOps iommu_ops;
     GHashTable *vtd_as_by_busptr;   /* VTDBus objects indexed by PCIBus* reference */
     VTDBus *vtd_as_by_bus_num[VTD_PCI_BUS_MAX]; /* VTDBus objects indexed by bus number */
+    /* list of registered notifiers */
+    QLIST_HEAD(, IntelIOMMUNotifierNode) notifiers_list;
 
     /* interrupt remapping */
     bool intr_enabled;              /* Whether guest enabled IR */
-- 
1.9.1


* Re: [Qemu-devel] [PATCH v5 0/3] IOMMU: intel_iommu support map and unmap notifications
From: Chao Gao @ 2016-10-26 16:37 UTC (permalink / raw)
  To: Aviv B.D
  Cc: qemu-devel, Jan Kiszka, Alex Williamson, Peter Xu, Michael S. Tsirkin

After applying the patch series on top of commit ede0cbeb, I ran the following commands:
modprobe vfio-pci
echo 8086 1528 > /sys/bus/pci/drivers/vfio-pci/new_id
echo "0000:03:00.1" > /sys/bus/pci/devices/0000\:03\:00.1/driver/unbind
echo 8086 1528 > /sys/bus/pci/drivers/vfio-pci/new_id
to prepare the NIC device for assignment.
After that, I tried to boot a guest with
qemu-system-x86_64 -boot c -m 4096 -monitor pty -serial stdio --enable-kvm \
-M kernel-irqchip=split -bios bios.bin -smp cpus=288 -hda vdisk.img \
-machine q35 -device intel-iommu,intremap=on,eim=on,cache-mode=true \
-net none -device vfio-pci,host=03:00.1
but I encountered these errors:
qemu-system-x86_64: -device vfio-pci,host=03:00.1: iommu map to non memory area e258e000
qemu-system-x86_64: -device vfio-pci,host=03:00.1: iommu map to non memory area e258f000
qemu-system-x86_64: -device vfio-pci,host=03:00.1: iommu map to non memory area e2590000

Did I make a mistake in this test? How can I correct it?

By the way, there is a build error:
qemu/hw/pci-host/apb.c:326:5: error: initialization from incompatible pointer type [-Werror]
     .translate = pbm_translate_iommu,
     ^
qemu/hw/pci-host/apb.c:326:5: error: (near initialization for 'pbm_iommu_ops.translate') [-Werror]
cc1: all warnings being treated as errors
make: *** [hw/pci-host/apb.o] Error 1 

* Re: [Qemu-devel] [PATCH v5 0/3] IOMMU: intel_iommu support map and unmap notifications
From: Aviv B.D. @ 2016-10-30 10:18 UTC (permalink / raw)
  To: Aviv B.D, qemu-devel, Jan Kiszka, Alex Williamson, Peter Xu,
	Michael S. Tsirkin

I will fix the warning in apb.c...
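
It should only need the same prototype change that this series applies to the
spapr and amd_iommu callbacks. An untested sketch, assuming apb.c's callback
still takes bool is_write:

-static IOMMUTLBEntry pbm_translate_iommu(MemoryRegion *iommu, hwaddr addr,
-                                         bool is_write)
+static IOMMUTLBEntry pbm_translate_iommu(MemoryRegion *iommu, hwaddr addr,
+                                         IOMMUAccessFlags flags)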

Can you post your qemu build configuration?

Thanks,
Aviv.

