* [PATCH v2 0/4] iommu improvements
@ 2018-12-03 17:40 Paul Durrant
  2018-12-03 17:40 ` [PATCH v2 1/4] amd-iommu: add flush iommu_ops Paul Durrant
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Paul Durrant @ 2018-12-03 17:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant

Paul Durrant (4):
  amd-iommu: add flush iommu_ops
  iommu: rename wrapper functions
  iommu: elide flushing for higher order map/unmap operations
  x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order()

 xen/arch/arm/p2m.c                            |  13 ++-
 xen/arch/x86/mm.c                             |  11 +--
 xen/arch/x86/mm/p2m-ept.c                     |   4 +-
 xen/arch/x86/mm/p2m-pt.c                      |   5 +-
 xen/arch/x86/mm/p2m.c                         |  17 ++--
 xen/arch/x86/x86_64/mm.c                      |   9 +-
 xen/common/grant_table.c                      |  14 ++--
 xen/common/memory.c                           |   6 +-
 xen/drivers/passthrough/amd/iommu_map.c       | 116 ++++++++++++++++++++------
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   2 +
 xen/drivers/passthrough/arm/smmu.c            |  15 ++--
 xen/drivers/passthrough/iommu.c               |  61 +++++++++++---
 xen/drivers/passthrough/vtd/iommu.c           |  45 ++++++----
 xen/drivers/passthrough/x86/iommu.c           |  20 +++--
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  10 ++-
 xen/include/xen/iommu.h                       |  46 +++++++---
 16 files changed, 281 insertions(+), 113 deletions(-)

-- 
2.11.0



* [PATCH v2 1/4] amd-iommu: add flush iommu_ops
  2018-12-03 17:40 [PATCH v2 0/4] iommu improvements Paul Durrant
@ 2018-12-03 17:40 ` Paul Durrant
  2018-12-04 14:24   ` Jan Beulich
  2018-12-03 17:40 ` [PATCH v2 2/4] iommu: rename wrapper functions Paul Durrant
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Paul Durrant @ 2018-12-03 17:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Suravee Suthikulpanit, Andrew Cooper, Paul Durrant,
	Jan Beulich, Brian Woods, Roger Pau Monné

The iommu_ops structure contains two methods for flushing: 'iotlb_flush' and
'iotlb_flush_all'. This patch adds implementations of these for AMD IOMMUs.

The iotlb_flush method takes a base DFN and a (4k) page count, but the
flush needs to be done by page order (i.e. 0, 9 or 18). Because a flush
operation is fairly expensive to perform, the code calculates the
minimum-order single flush that will cover the specified page range rather
than performing multiple flushes.
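
For illustration only (not part of this patch): a minimal, self-contained
sketch of the order selection described above, using made-up DFN and page
count values (for which a single order 9, i.e. 2M, flush suffices). It
mirrors the flush_count() logic added below.

    #include <stdio.h>

    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

    /* Number of order-sized blocks touched by [dfn, dfn + page_count). */
    static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
                                     unsigned int order)
    {
        unsigned long start = dfn / (1u << order);
        unsigned long end = DIV_ROUND_UP(dfn + page_count, 1u << order);

        return end - start;
    }

    int main(void)
    {
        unsigned long dfn = 0x201;      /* arbitrary example values */
        unsigned int page_count = 0x1ff;
        unsigned int order;

        if ( page_count == 1 )
            order = 0;                  /* single 4k page */
        else if ( flush_count(dfn, page_count, 9) == 1 )
            order = 9;                  /* range fits in one 2M block */
        else if ( flush_count(dfn, page_count, 18) == 1 )
            order = 18;                 /* range fits in one 1G block */
        else
            order = ~0u;                /* no single flush; flush everything */

        printf("selected flush order: %u\n", order);

        return 0;
    }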

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Brian Woods <brian.woods@amd.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>

v2:
 - Treat passing INVALID_DFN to iommu_iotlb_flush() as an error, and a zero
   page_count as a no-op.
 - Get rid of dfn_lt().
---
 xen/drivers/passthrough/amd/iommu_map.c       | 50 +++++++++++++++++++++++++++
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  2 ++
 xen/drivers/passthrough/iommu.c               |  6 +++-
 xen/drivers/passthrough/vtd/iommu.c           |  2 ++
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  3 ++
 xen/include/xen/iommu.h                       |  5 +++
 6 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 04cb7b3182..b6ddf211c6 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -631,6 +631,56 @@ int amd_iommu_unmap_page(struct domain *d, dfn_t dfn)
     spin_unlock(&hd->arch.mapping_lock);
 
     amd_iommu_flush_pages(d, dfn_x(dfn), 0);
+    return 0;
+}
+
+static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
+                                 unsigned int order)
+{
+    unsigned long start = dfn / (1u << order);
+    unsigned long end = DIV_ROUND_UP(dfn + page_count, (1u << order));
+
+    ASSERT(end > start);
+    return end - start;
+}
+
+int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
+                                unsigned int page_count)
+{
+    unsigned long dfn_l = dfn_x(dfn);
+
+    ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
+
+    /* If the range wraps then just flush everything */
+    if ( dfn_l + page_count < dfn_l )
+    {
+        amd_iommu_flush_all_pages(d);
+        return 0;
+    }
+
+    /*
+     * Flushes are expensive so find the minimal single flush that will
+     * cover the page range.
+     *
+     * NOTE: It is unnecessary to round down the DFN value to align with
+     *       the flush order here. This is done by the internals of the
+     *       flush code.
+     */
+    if ( page_count == 1 ) /* order 0 flush count */
+        amd_iommu_flush_pages(d, dfn_l, 0);
+    else if ( flush_count(dfn_l, page_count, 9) == 1 )
+        amd_iommu_flush_pages(d, dfn_l, 9);
+    else if ( flush_count(dfn_l, page_count, 18) == 1 )
+        amd_iommu_flush_pages(d, dfn_l, 18);
+    else
+        amd_iommu_flush_all_pages(d);
+
+    return 0;
+}
+
+int amd_iommu_flush_iotlb_all(struct domain *d)
+{
+    amd_iommu_flush_all_pages(d);
 
     return 0;
 }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 900136390d..33a3798f36 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -579,6 +579,8 @@ static const struct iommu_ops __initconstrel amd_iommu_ops = {
     .teardown = amd_iommu_domain_destroy,
     .map_page = amd_iommu_map_page,
     .unmap_page = amd_iommu_unmap_page,
+    .iotlb_flush = amd_iommu_flush_iotlb_pages,
+    .iotlb_flush_all = amd_iommu_flush_iotlb_all,
     .free_page_table = deallocate_page_table,
     .reassign_device = reassign_device,
     .get_device_group_id = amd_iommu_group_id,
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index ac62d7f52a..c1cce08551 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -414,9 +414,13 @@ int iommu_iotlb_flush(struct domain *d, dfn_t dfn, unsigned int page_count)
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
 
-    if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
+    if ( !iommu_enabled || !hd->platform_ops ||
+         !hd->platform_ops->iotlb_flush || !page_count )
         return 0;
 
+    if ( dfn_eq(dfn, INVALID_DFN) )
+        return -EINVAL;
+
     rc = hd->platform_ops->iotlb_flush(d, dfn, page_count);
     if ( unlikely(rc) )
     {
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1601278b07..d2fa5e2b25 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -635,6 +635,8 @@ static int __must_check iommu_flush_iotlb_pages(struct domain *d,
                                                 dfn_t dfn,
                                                 unsigned int page_count)
 {
+    ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
+
     return iommu_flush_iotlb(d, dfn, 1, page_count);
 }
 
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 718a44f956..88715329ca 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -60,6 +60,9 @@ int __must_check amd_iommu_alloc_root(struct domain_iommu *hd);
 int amd_iommu_reserve_domain_unity_map(struct domain *domain,
                                        paddr_t phys_addr, unsigned long size,
                                        int iw, int ir);
+int __must_check amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
+                                             unsigned int page_count);
+int __must_check amd_iommu_flush_iotlb_all(struct domain *d);
 
 /* Share p2m table with iommu */
 void amd_iommu_share_p2m(struct domain *d);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 3d78126801..da8294bac8 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -52,6 +52,11 @@ static inline bool_t dfn_eq(dfn_t x, dfn_t y)
     return dfn_x(x) == dfn_x(y);
 }
 
+static inline bool_t dfn_lt(dfn_t x, dfn_t y)
+{
+    return dfn_x(x) < dfn_x(y);
+}
+
 extern bool_t iommu_enable, iommu_enabled;
 extern bool_t force_iommu, iommu_verbose;
 extern bool_t iommu_workaround_bios_bug, iommu_igfx;
-- 
2.11.0



* [PATCH v2 2/4] iommu: rename wrapper functions
  2018-12-03 17:40 [PATCH v2 0/4] iommu improvements Paul Durrant
  2018-12-03 17:40 ` [PATCH v2 1/4] amd-iommu: add flush iommu_ops Paul Durrant
@ 2018-12-03 17:40 ` Paul Durrant
  2018-12-04 14:51   ` Jan Beulich
  2018-12-03 17:40 ` [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations Paul Durrant
  2018-12-03 17:40 ` [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order() Paul Durrant
  3 siblings, 1 reply; 18+ messages in thread
From: Paul Durrant @ 2018-12-03 17:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich,
	Roger Pau Monné

A subsequent patch will add semantically different versions of
iommu_map/unmap() so, in advance of that change, this patch renames the
existing functions to iommu_legacy_map/unmap() and modifies all call-sites.

The patch also renames iommu_iotlb_flush[_all] to the shorter name(s)
iommu_flush[_all] (also renaming an internal VT-d function to avoid a name
clash) and co-locates the declarations with those of
iommu_legacy_map/unmap().

The only changes in this patch that are not purely cosmetic are in
arch_iommu_populate_page_table() and iommu_hwdom_init(), which now call
iommu_legacy_map() rather than calling the map_page() iommu_ops method
directly.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>

v2:
 - New in v2.
---
 xen/arch/arm/p2m.c                  |  4 ++--
 xen/arch/x86/mm.c                   | 11 ++++++-----
 xen/arch/x86/mm/p2m-ept.c           |  4 ++--
 xen/arch/x86/mm/p2m-pt.c            |  5 +++--
 xen/arch/x86/mm/p2m.c               | 12 ++++++------
 xen/arch/x86/x86_64/mm.c            |  9 +++++----
 xen/common/grant_table.c            | 14 +++++++-------
 xen/common/memory.c                 |  4 ++--
 xen/drivers/passthrough/iommu.c     | 14 +++++++-------
 xen/drivers/passthrough/vtd/iommu.c | 10 +++++-----
 xen/drivers/passthrough/x86/iommu.c | 13 ++++++-------
 xen/include/xen/iommu.h             | 19 ++++++++++---------
 12 files changed, 61 insertions(+), 58 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 6c76298ebc..e8b7624492 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -971,8 +971,8 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 
     if ( need_iommu_pt_sync(p2m->domain) &&
          (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
-        rc = iommu_iotlb_flush(p2m->domain, _dfn(gfn_x(sgfn)),
-                               1UL << page_order);
+        rc = iommu_flush(p2m->domain, _dfn(gfn_x(sgfn)),
+                         1UL << page_order);
     else
         rc = 0;
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 28a003063e..746f0b0258 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2801,12 +2801,13 @@ static int _get_page_type(struct page_info *page, unsigned long type,
             mfn_t mfn = page_to_mfn(page);
 
             if ( (x & PGT_type_mask) == PGT_writable_page )
-                iommu_ret = iommu_unmap(d, _dfn(mfn_x(mfn)),
-                                        PAGE_ORDER_4K);
+                iommu_ret = iommu_legacy_unmap(d, _dfn(mfn_x(mfn)),
+                                               PAGE_ORDER_4K);
             else if ( type == PGT_writable_page )
-                iommu_ret = iommu_map(d, _dfn(mfn_x(mfn)), mfn,
-                                      PAGE_ORDER_4K,
-                                      IOMMUF_readable | IOMMUF_writable);
+                iommu_ret = iommu_legacy_map(d, _dfn(mfn_x(mfn)), mfn,
+                                             PAGE_ORDER_4K,
+                                             IOMMUF_readable |
+                                             IOMMUF_writable);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 6e4e375bad..64a49c07b7 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -882,8 +882,8 @@ out:
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
         else if ( need_iommu_pt_sync(d) )
             rc = iommu_flags ?
-                iommu_map(d, _dfn(gfn), mfn, order, iommu_flags) :
-                iommu_unmap(d, _dfn(gfn), order);
+                iommu_legacy_map(d, _dfn(gfn), mfn, order, iommu_flags) :
+                iommu_legacy_unmap(d, _dfn(gfn), order);
     }
 
     unmap_domain_page(table);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 17a6b61f12..69ffb08179 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -686,8 +686,9 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
 
         if ( need_iommu_pt_sync(p2m->domain) )
             rc = iommu_pte_flags ?
-                iommu_map(d, _dfn(gfn), mfn, page_order, iommu_pte_flags) :
-                iommu_unmap(d, _dfn(gfn), page_order);
+                iommu_legacy_map(d, _dfn(gfn), mfn, page_order,
+                                 iommu_pte_flags) :
+                iommu_legacy_unmap(d, _dfn(gfn), page_order);
         else if ( iommu_use_hap_pt(d) && iommu_old_flags )
             amd_iommu_flush_pages(p2m->domain, gfn, page_order);
     }
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fea4497910..ed76e96d33 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -733,7 +733,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
 
     if ( !paging_mode_translate(p2m->domain) )
         return need_iommu_pt_sync(p2m->domain) ?
-            iommu_unmap(p2m->domain, _dfn(mfn), page_order) : 0;
+            iommu_legacy_unmap(p2m->domain, _dfn(mfn), page_order) : 0;
 
     ASSERT(gfn_locked_by_me(p2m, gfn));
     P2M_DEBUG("removing gfn=%#lx mfn=%#lx\n", gfn_l, mfn);
@@ -780,8 +780,8 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
 
     if ( !paging_mode_translate(d) )
         return (need_iommu_pt_sync(d) && t == p2m_ram_rw) ?
-            iommu_map(d, _dfn(mfn_x(mfn)), mfn, page_order,
-                      IOMMUF_readable | IOMMUF_writable) : 0;
+            iommu_legacy_map(d, _dfn(mfn_x(mfn)), mfn, page_order,
+                             IOMMUF_readable | IOMMUF_writable) : 0;
 
     /* foreign pages are added thru p2m_add_foreign */
     if ( p2m_is_foreign(t) )
@@ -1151,8 +1151,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
     {
         if ( !need_iommu_pt_sync(d) )
             return 0;
-        return iommu_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
-                         IOMMUF_readable | IOMMUF_writable);
+        return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
+                                IOMMUF_readable | IOMMUF_writable);
     }
 
     gfn_lock(p2m, gfn, 0);
@@ -1242,7 +1242,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
     {
         if ( !need_iommu_pt_sync(d) )
             return 0;
-        return iommu_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
+        return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 11977f2671..8056679de0 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1436,15 +1436,16 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
          !need_iommu_pt_sync(hardware_domain) )
     {
         for ( i = spfn; i < epfn; i++ )
-            if ( iommu_map(hardware_domain, _dfn(i), _mfn(i),
-                           PAGE_ORDER_4K,
-                           IOMMUF_readable | IOMMUF_writable) )
+            if ( iommu_legacy_map(hardware_domain, _dfn(i), _mfn(i),
+                                  PAGE_ORDER_4K,
+                                  IOMMUF_readable | IOMMUF_writable) )
                 break;
         if ( i != epfn )
         {
             while (i-- > old_max)
                 /* If statement to satisfy __must_check. */
-                if ( iommu_unmap(hardware_domain, _dfn(i), PAGE_ORDER_4K) )
+                if ( iommu_legacy_unmap(hardware_domain, _dfn(i),
+                                        PAGE_ORDER_4K) )
                     continue;
 
             goto destroy_m2p;
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index b67ae9e3f5..fd099a8f25 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1134,14 +1134,14 @@ map_grant_ref(
              !(old_pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) )
         {
             if ( !(kind & MAPKIND_WRITE) )
-                err = iommu_map(ld, _dfn(mfn_x(mfn)), mfn, 0,
-                                IOMMUF_readable | IOMMUF_writable);
+                err = iommu_legacy_map(ld, _dfn(mfn_x(mfn)), mfn, 0,
+                                       IOMMUF_readable | IOMMUF_writable);
         }
         else if ( act_pin && !old_pin )
         {
             if ( !kind )
-                err = iommu_map(ld, _dfn(mfn_x(mfn)), mfn, 0,
-                                IOMMUF_readable);
+                err = iommu_legacy_map(ld, _dfn(mfn_x(mfn)), mfn, 0,
+                                       IOMMUF_readable);
         }
         if ( err )
         {
@@ -1389,10 +1389,10 @@ unmap_common(
 
         kind = mapkind(lgt, rd, op->mfn);
         if ( !kind )
-            err = iommu_unmap(ld, _dfn(mfn_x(op->mfn)), 0);
+            err = iommu_legacy_unmap(ld, _dfn(mfn_x(op->mfn)), 0);
         else if ( !(kind & MAPKIND_WRITE) )
-            err = iommu_map(ld, _dfn(mfn_x(op->mfn)), op->mfn, 0,
-                            IOMMUF_readable);
+            err = iommu_legacy_map(ld, _dfn(mfn_x(op->mfn)), op->mfn, 0,
+                                   IOMMUF_readable);
 
         double_gt_unlock(lgt, rgt);
 
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 175bd62c11..7b668077d8 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -865,11 +865,11 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
 
         this_cpu(iommu_dont_flush_iotlb) = 0;
 
-        ret = iommu_iotlb_flush(d, _dfn(xatp->idx - done), done);
+        ret = iommu_flush(d, _dfn(xatp->idx - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
 
-        ret = iommu_iotlb_flush(d, _dfn(xatp->gpfn - done), done);
+        ret = iommu_flush(d, _dfn(xatp->gpfn - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
     }
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index c1cce08551..6d231bec94 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -226,8 +226,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, _dfn(dfn), _mfn(mfn),
-                                             mapping);
+            ret = iommu_legacy_map(d, _dfn(dfn), _mfn(mfn), mapping,
+                                   PAGE_ORDER_4K);
             if ( !rc )
                 rc = ret;
 
@@ -304,8 +304,8 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 }
 
-int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
-              unsigned int page_order, unsigned int flags)
+int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
+                     unsigned int page_order, unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     unsigned long i;
@@ -345,7 +345,7 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
     return rc;
 }
 
-int iommu_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
+int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     unsigned long i;
@@ -409,7 +409,7 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-int iommu_iotlb_flush(struct domain *d, dfn_t dfn, unsigned int page_count)
+int iommu_flush(struct domain *d, dfn_t dfn, unsigned int page_count)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -436,7 +436,7 @@ int iommu_iotlb_flush(struct domain *d, dfn_t dfn, unsigned int page_count)
     return rc;
 }
 
-int iommu_iotlb_flush_all(struct domain *d)
+int iommu_flush_all(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index d2fa5e2b25..8727e242e2 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -546,7 +546,7 @@ static int __must_check iommu_flush_iotlb_psi(struct iommu *iommu, u16 did,
     return status;
 }
 
-static int __must_check iommu_flush_all(void)
+static int __must_check flush_all_iommus(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
@@ -1310,7 +1310,7 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
     /* Make sure workarounds are applied before enabling the IOMMU(s). */
     arch_iommu_hwdom_init(d);
 
-    if ( iommu_flush_all() )
+    if ( flush_all_iommus() )
         printk(XENLOG_WARNING VTDPREFIX
                " IOMMU flush all failed for hardware domain\n");
 
@@ -2250,7 +2250,7 @@ static int __must_check init_vtd_hw(void)
         }
     }
 
-    return iommu_flush_all();
+    return flush_all_iommus();
 }
 
 static void __hwdom_init setup_hwdom_rmrr(struct domain *d)
@@ -2554,7 +2554,7 @@ static int __must_check vtd_suspend(void)
     if ( !iommu_enabled )
         return 0;
 
-    rc = iommu_flush_all();
+    rc = flush_all_iommus();
     if ( unlikely(rc) )
     {
         printk(XENLOG_WARNING VTDPREFIX
@@ -2602,7 +2602,7 @@ static void vtd_crash_shutdown(void)
     if ( !iommu_enabled )
         return;
 
-    if ( iommu_flush_all() )
+    if ( flush_all_iommus() )
         printk(XENLOG_WARNING VTDPREFIX
                " crash shutdown: IOMMU flush all failed\n");
 
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index c68a72279d..c1f3e2442e 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -46,7 +46,6 @@ int __init iommu_setup_hpet_msi(struct msi_desc *msi)
 
 int arch_iommu_populate_page_table(struct domain *d)
 {
-    const struct domain_iommu *hd = dom_iommu(d);
     struct page_info *page;
     int rc = 0, n = 0;
 
@@ -68,9 +67,9 @@ int arch_iommu_populate_page_table(struct domain *d)
             {
                 ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
                 BUG_ON(SHARED_M2P(gfn));
-                rc = hd->platform_ops->map_page(d, _dfn(gfn), _mfn(mfn),
-                                                IOMMUF_readable |
-                                                IOMMUF_writable);
+                rc = iommu_legacy_map(d, _dfn(gfn), _mfn(mfn),
+                                      PAGE_ORDER_4K, IOMMUF_readable |
+                                      IOMMUF_writable);
             }
             if ( rc )
             {
@@ -107,7 +106,7 @@ int arch_iommu_populate_page_table(struct domain *d)
     this_cpu(iommu_dont_flush_iotlb) = 0;
 
     if ( !rc )
-        rc = iommu_iotlb_flush_all(d);
+        rc = iommu_flush_all(d);
 
     if ( rc && rc != -ERESTART )
         iommu_teardown(d);
@@ -241,8 +240,8 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
         if ( paging_mode_translate(d) )
             rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
         else
-            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), PAGE_ORDER_4K,
-                           IOMMUF_readable | IOMMUF_writable);
+            rc = iommu_legacy_map(d, _dfn(pfn), _mfn(pfn), PAGE_ORDER_4K,
+                                  IOMMUF_readable | IOMMUF_writable);
         if ( rc )
             printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
                    d->domain_id, rc);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index da8294bac8..6773d605a9 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -88,17 +88,22 @@ int iommu_construct(struct domain *d);
 /* Function used internally, use iommu_domain_destroy */
 void iommu_teardown(struct domain *d);
 
-/* iommu_map_page() takes flags to direct the mapping operation. */
 #define _IOMMUF_readable 0
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
-int __must_check iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
-                           unsigned int page_order, unsigned int flags);
-int __must_check iommu_unmap(struct domain *d, dfn_t dfn,
-                             unsigned int page_order);
+
+int __must_check iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
+                                  unsigned int page_order,
+                                  unsigned int flags);
+int __must_check iommu_legacy_unmap(struct domain *d, dfn_t dfn,
+                                    unsigned int page_order);
+
 int __must_check iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
                                    unsigned int *flags);
+int __must_check iommu_flush(struct domain *d, dfn_t dfn,
+                             unsigned int page_count);
+int __must_check iommu_flush_all(struct domain *d);
 
 enum iommu_feature
 {
@@ -252,10 +257,6 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-int __must_check iommu_iotlb_flush(struct domain *d, dfn_t dfn,
-                                   unsigned int page_count);
-int __must_check iommu_iotlb_flush_all(struct domain *d);
-
 void iommu_dev_iotlb_flush_timeout(struct domain *d, struct pci_dev *pdev);
 
 /*
-- 
2.11.0



* [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations
  2018-12-03 17:40 [PATCH v2 0/4] iommu improvements Paul Durrant
  2018-12-03 17:40 ` [PATCH v2 1/4] amd-iommu: add flush iommu_ops Paul Durrant
  2018-12-03 17:40 ` [PATCH v2 2/4] iommu: rename wrapper functions Paul Durrant
@ 2018-12-03 17:40 ` Paul Durrant
  2018-12-04 15:16   ` Jan Beulich
  2018-12-03 17:40 ` [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order() Paul Durrant
  3 siblings, 1 reply; 18+ messages in thread
From: Paul Durrant @ 2018-12-03 17:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich, Brian Woods,
	Roger Pau Monné

This patch removes any implicit flushing that occurs in the implementation
of map and unmap operations and adds new iommu_map/unmap() wrapper
functions. To maintain the semantics of the iommu_legacy_map/unmap() wrapper
functions, these are modified to call the new wrapper functions and then
perform an explicit flush operation.

Because VT-d currently performs two different types of flush, depending on
whether a PTE is being modified or merely added (i.e. replacing a non-present
PTE), 'iommu flush flags' are defined by this patch, and the
iommu_ops map_page() and unmap_page() methods are modified to OR the type
of flush necessary for the PTE that has been populated or depopulated into
an accumulated flags value. The accumulated value can then be passed into
the explicit flush operation.
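
As an illustrative sketch only (not code from this series; map_range() is a
hypothetical helper, the usual Xen headers are assumed and error handling is
trimmed), a caller of the new interfaces accumulates flush flags across a
batch of map operations and then issues a single explicit flush:

    /* Map nr_pages 4k pages at dfn -> mfn, flushing once at the end. */
    static int map_range(struct domain *d, dfn_t dfn, mfn_t mfn,
                         unsigned int nr_pages)
    {
        unsigned int flush_flags = 0, i;
        int rc = 0;

        for ( i = 0; i < nr_pages; i++ )
        {
            rc = iommu_map(d, dfn_add(dfn, i), mfn_add(mfn, i),
                           PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable,
                           &flush_flags);
            if ( rc )
                break;
        }

        /* One flush covers whatever the map calls above accumulated. */
        if ( flush_flags )
        {
            int ret = iommu_flush(d, dfn, i, flush_flags);

            if ( !rc )
                rc = ret;
        }

        return rc;
    }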

The ARM SMMU implementations of map_page() and unmap_page() currently
perform no implicit flushing and therefore the modified methods do not
adjust the flush flags.

NOTE: The per-cpu 'iommu_dont_flush_iotlb' flag is respected by the
      iommu_legacy_map() wrapper function and therefore this now applies to
      all IOMMU implementations rather than just VT-d.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Brian Woods <brian.woods@amd.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>

v2:
 - Add the new iommu_map/unmap() and don't proliferate use of
   iommu_dont_flush_iotlb.
 - Use 'flush flags' instead of a 'iommu_flush_type'
 - Add a 'flush_flags' argument to iommu_flush() and modify the call-sites.

This code has only been compile-tested for ARM.
---
 xen/arch/arm/p2m.c                            | 11 ++++-
 xen/common/memory.c                           |  6 ++-
 xen/drivers/passthrough/amd/iommu_map.c       | 68 +++++++++++++++++----------
 xen/drivers/passthrough/arm/smmu.c            | 15 ++++--
 xen/drivers/passthrough/iommu.c               | 57 +++++++++++++++++-----
 xen/drivers/passthrough/vtd/iommu.c           | 33 ++++++++-----
 xen/drivers/passthrough/x86/iommu.c           | 19 +++++---
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  9 ++--
 xen/include/xen/iommu.h                       | 28 +++++++++--
 9 files changed, 173 insertions(+), 73 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index e8b7624492..74264711ce 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -971,8 +971,17 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 
     if ( need_iommu_pt_sync(p2m->domain) &&
          (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
+    {
+        unsigned int flush_flags = 0;
+
+        if ( lpae_is_valid(orig_pte) )
+            flush_flags |= IOMMU_FLUSHF_modified;
+        if ( lpae_is_valid(*entry) )
+            flush_flags |= IOMMU_FLUSHF_added;
+
         rc = iommu_flush(p2m->domain, _dfn(gfn_x(sgfn)),
-                         1UL << page_order);
+                         1UL << page_order, flush_flags);
+    }
     else
         rc = 0;
 
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 7b668077d8..f6ba852136 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -865,11 +865,13 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
 
         this_cpu(iommu_dont_flush_iotlb) = 0;
 
-        ret = iommu_flush(d, _dfn(xatp->idx - done), done);
+        ret = iommu_flush(d, _dfn(xatp->idx - done), done,
+                          IOMMU_FLUSHF_added | IOMMU_FLUSHF_modified);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
 
-        ret = iommu_flush(d, _dfn(xatp->gpfn - done), done);
+        ret = iommu_flush(d, _dfn(xatp->gpfn - done), done,
+                          IOMMU_FLUSHF_added | IOMMU_FLUSHF_modified);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
     }
diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index b6ddf211c6..de6ecb9f04 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -45,13 +45,15 @@ static void clear_iommu_pte_present(unsigned long l1_mfn, unsigned long dfn)
     unmap_domain_page(table);
 }
 
-static bool set_iommu_pde_present(uint32_t *pde, unsigned long next_mfn,
-                                  unsigned int next_level,
-                                  bool iw, bool ir)
+static unsigned int set_iommu_pde_present(uint32_t *pde,
+                                          unsigned long next_mfn,
+                                          unsigned int next_level, bool iw,
+                                          bool ir)
 {
     uint64_t maddr_next;
     uint32_t addr_lo, addr_hi, entry;
-    bool need_flush = false, old_present;
+    bool old_present;
+    unsigned int flush_flags = 0;
 
     maddr_next = __pfn_to_paddr(next_mfn);
 
@@ -84,7 +86,7 @@ static bool set_iommu_pde_present(uint32_t *pde, unsigned long next_mfn,
 
         if ( maddr_old != maddr_next || iw != old_w || ir != old_r ||
              old_level != next_level )
-            need_flush = true;
+            flush_flags = IOMMU_FLUSHF_modified;
     }
 
     addr_lo = maddr_next & DMA_32BIT_MASK;
@@ -121,24 +123,27 @@ static bool set_iommu_pde_present(uint32_t *pde, unsigned long next_mfn,
                          IOMMU_PDE_PRESENT_SHIFT, &entry);
     pde[0] = entry;
 
-    return need_flush;
+    return flush_flags;
 }
 
-static bool set_iommu_pte_present(unsigned long pt_mfn, unsigned long dfn,
-                                  unsigned long next_mfn, int pde_level,
-                                  bool iw, bool ir)
+static unsigned int set_iommu_pte_present(unsigned long pt_mfn,
+                                          unsigned long dfn,
+                                          unsigned long next_mfn,
+                                          int pde_level,
+                                          bool iw, bool ir)
 {
     uint64_t *table;
     uint32_t *pde;
-    bool need_flush;
+    unsigned int flush_flags;
 
     table = map_domain_page(_mfn(pt_mfn));
 
     pde = (uint32_t *)(table + pfn_to_pde_idx(dfn, pde_level));
 
-    need_flush = set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
+    flush_flags = set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
     unmap_domain_page(table);
-    return need_flush;
+
+    return flush_flags;
 }
 
 void amd_iommu_set_root_page_table(uint32_t *dte, uint64_t root_ptr,
@@ -522,9 +527,8 @@ static int update_paging_mode(struct domain *d, unsigned long dfn)
 }
 
 int amd_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
-                       unsigned int flags)
+                       unsigned int flags, unsigned int *flush_flags)
 {
-    bool need_flush;
     struct domain_iommu *hd = dom_iommu(d);
     int rc;
     unsigned long pt_mfn[7];
@@ -570,18 +574,17 @@ int amd_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
     }
 
     /* Install 4k mapping */
-    need_flush = set_iommu_pte_present(pt_mfn[1], dfn_x(dfn), mfn_x(mfn), 1,
-                                       !!(flags & IOMMUF_writable),
-                                       !!(flags & IOMMUF_readable));
-
-    if ( need_flush )
-        amd_iommu_flush_pages(d, dfn_x(dfn), 0);
+    *flush_flags |= set_iommu_pte_present(pt_mfn[1], dfn_x(dfn), mfn_x(mfn),
+                                          1, !!(flags & IOMMUF_writable),
+                                          !!(flags & IOMMUF_readable));
 
     spin_unlock(&hd->arch.mapping_lock);
+
     return 0;
 }
 
-int amd_iommu_unmap_page(struct domain *d, dfn_t dfn)
+int amd_iommu_unmap_page(struct domain *d, dfn_t dfn,
+                         unsigned int *flush_flags)
 {
     unsigned long pt_mfn[7];
     struct domain_iommu *hd = dom_iommu(d);
@@ -627,10 +630,10 @@ int amd_iommu_unmap_page(struct domain *d, dfn_t dfn)
 
     /* mark PTE as 'page not present' */
     clear_iommu_pte_present(pt_mfn[1], dfn_x(dfn));
+    *flush_flags |= IOMMU_FLUSHF_modified;
 
     spin_unlock(&hd->arch.mapping_lock);
 
-    amd_iommu_flush_pages(d, dfn_x(dfn), 0);
     return 0;
 }
 
@@ -645,11 +648,13 @@ static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
 }
 
 int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
-                                unsigned int page_count)
+                                unsigned int page_count,
+                                unsigned int flush_flags)
 {
     unsigned long dfn_l = dfn_x(dfn);
 
     ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
+    ASSERT(flush_flags & IOMMU_FLUSHF_modified);
 
     /* If the range wraps then just flush everything */
     if ( dfn_l + page_count < dfn_l )
@@ -692,6 +697,7 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
     unsigned long npages, i;
     unsigned long gfn;
     unsigned int flags = !!ir;
+    unsigned int flush_flags = 0;
     int rt = 0;
 
     if ( iw )
@@ -703,11 +709,21 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
     {
         unsigned long frame = gfn + i;
 
-        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags);
+        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags,
+                                &flush_flags);
         if ( rt != 0 )
-            return rt;
+            break;
     }
-    return 0;
+
+    /*
+     * The underlying implementation is void so the return value is
+     * meaningless and can hence be ignored.
+     */
+    while ( amd_iommu_flush_iotlb_pages(domain, _dfn(gfn), npages,
+                                        flush_flags) )
+        break;
+
+    return rt;
 }
 
 /* Share p2m table with iommu. */
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 9612c0fddc..5d12639e97 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2534,9 +2534,12 @@ static int __must_check arm_smmu_iotlb_flush_all(struct domain *d)
 	return 0;
 }
 
-static int __must_check arm_smmu_iotlb_flush(struct domain *d, dfn_t dfn,
-                                             unsigned int page_count)
+static int __must_check arm_smmu_iotlb_flush(
+	struct domain *d, dfn_t dfn, unsigned int page_count,
+	unsigned int flush_flags)
 {
+	ASSERT(flush_flags);
+
 	/* ARM SMMU v1 doesn't have flush by VMA and VMID */
 	return arm_smmu_iotlb_flush_all(d);
 }
@@ -2731,8 +2734,9 @@ static void arm_smmu_iommu_domain_teardown(struct domain *d)
 	xfree(xen_domain);
 }
 
-static int __must_check arm_smmu_map_page(struct domain *d, dfn_t dfn,
-					  mfn_t mfn, unsigned int flags)
+static int __must_check arm_smmu_map_page(
+	struct domain *d, dfn_t dfn, mfn_t mfn, unsigned int flags,
+	unsigned int *flush_flags)
 {
 	p2m_type_t t;
 
@@ -2761,7 +2765,8 @@ static int __must_check arm_smmu_map_page(struct domain *d, dfn_t dfn,
 				       0, t);
 }
 
-static int __must_check arm_smmu_unmap_page(struct domain *d, dfn_t dfn)
+static int __must_check arm_smmu_unmap_page(struct domain *d, dfn_t dfn,
+                                            unsigned int *flush_flags)
 {
 	/*
 	 * This function should only be used by gnttab code when the domain
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 6d231bec94..60862ddaa2 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -211,7 +211,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
     if ( need_iommu_pt_sync(d) )
     {
         struct page_info *page;
-        unsigned int i = 0;
+        unsigned int i = 0, flush_flags = 0;
         int rc = 0;
 
         page_list_for_each ( page, &d->page_list )
@@ -226,8 +226,9 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = iommu_legacy_map(d, _dfn(dfn), _mfn(mfn), mapping,
-                                   PAGE_ORDER_4K);
+            ret = iommu_map(d, _dfn(dfn), _mfn(mfn), mapping, 0,
+                            &flush_flags);
+
             if ( !rc )
                 rc = ret;
 
@@ -235,6 +236,9 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                 process_pending_softirqs();
         }
 
+        while ( !flush_flags && iommu_flush_all(d) )
+            break;
+
         if ( rc )
             printk(XENLOG_WARNING "d%d: IOMMU mapping failed: %d\n",
                    d->domain_id, rc);
@@ -304,8 +308,9 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 }
 
-int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
-                     unsigned int page_order, unsigned int flags)
+int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
+              unsigned int page_order, unsigned int flags,
+              unsigned int *flush_flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     unsigned long i;
@@ -320,7 +325,8 @@ int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
     for ( i = 0; i < (1ul << page_order); i++ )
     {
         rc = hd->platform_ops->map_page(d, dfn_add(dfn, i),
-                                        mfn_add(mfn, i), flags);
+                                        mfn_add(mfn, i), flags,
+                                        flush_flags);
 
         if ( likely(!rc) )
             continue;
@@ -333,7 +339,8 @@ int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
 
         while ( i-- )
             /* if statement to satisfy __must_check */
-            if ( hd->platform_ops->unmap_page(d, dfn_add(dfn, i)) )
+            if ( hd->platform_ops->unmap_page(d, dfn_add(dfn, i),
+                                              flush_flags) )
                 continue;
 
         if ( !is_hardware_domain(d) )
@@ -345,7 +352,20 @@ int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
     return rc;
 }
 
-int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
+int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
+                     unsigned int page_order, unsigned int flags)
+{
+    unsigned int flush_flags = 0;
+    int rc = iommu_map(d, dfn, mfn, page_order, flags, &flush_flags);
+
+    if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
+        rc = iommu_flush(d, dfn, (1u << page_order), flush_flags);
+
+    return rc;
+}
+
+int iommu_unmap(struct domain *d, dfn_t dfn, unsigned int page_order,
+                unsigned int *flush_flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     unsigned long i;
@@ -358,7 +378,8 @@ int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
 
     for ( i = 0; i < (1ul << page_order); i++ )
     {
-        int err = hd->platform_ops->unmap_page(d, dfn_add(dfn, i));
+        int err = hd->platform_ops->unmap_page(d, dfn_add(dfn, i),
+                                               flush_flags);
 
         if ( likely(!err) )
             continue;
@@ -381,6 +402,17 @@ int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
     return rc;
 }
 
+int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
+{
+    unsigned int flush_flags = 0;
+    int rc = iommu_unmap(d, dfn, page_order, &flush_flags);
+
+    if ( !rc )
+        rc = iommu_flush(d, dfn, (1u << page_order), flush_flags);
+
+    return rc;
+}
+
 int iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
                       unsigned int *flags)
 {
@@ -409,19 +441,20 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-int iommu_flush(struct domain *d, dfn_t dfn, unsigned int page_count)
+int iommu_flush(struct domain *d, dfn_t dfn, unsigned int page_count,
+                unsigned int flush_flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
 
     if ( !iommu_enabled || !hd->platform_ops ||
-         !hd->platform_ops->iotlb_flush || !page_count )
+         !hd->platform_ops->iotlb_flush || !page_count || !flush_flags )
         return 0;
 
     if ( dfn_eq(dfn, INVALID_DFN) )
         return -EINVAL;
 
-    rc = hd->platform_ops->iotlb_flush(d, dfn, page_count);
+    rc = hd->platform_ops->iotlb_flush(d, dfn, page_count, flush_flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 8727e242e2..e0326f9456 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -633,11 +633,14 @@ static int __must_check iommu_flush_iotlb(struct domain *d, dfn_t dfn,
 
 static int __must_check iommu_flush_iotlb_pages(struct domain *d,
                                                 dfn_t dfn,
-                                                unsigned int page_count)
+                                                unsigned int page_count,
+                                                unsigned int flush_flags)
 {
     ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
+    ASSERT(flush_flags);
 
-    return iommu_flush_iotlb(d, dfn, 1, page_count);
+    return iommu_flush_iotlb(d, dfn, flush_flags & IOMMU_FLUSHF_modified,
+                             page_count);
 }
 
 static int __must_check iommu_flush_iotlb_all(struct domain *d)
@@ -676,9 +679,6 @@ static int __must_check dma_pte_clear_one(struct domain *domain, u64 addr)
     spin_unlock(&hd->arch.mapping_lock);
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
-    if ( !this_cpu(iommu_dont_flush_iotlb) )
-        rc = iommu_flush_iotlb_pages(domain, daddr_to_dfn(addr), 1);
-
     unmap_vtd_domain_page(page);
 
     return rc;
@@ -1773,9 +1773,9 @@ static void iommu_domain_teardown(struct domain *d)
     spin_unlock(&hd->arch.mapping_lock);
 }
 
-static int __must_check intel_iommu_map_page(struct domain *d,
-                                             dfn_t dfn, mfn_t mfn,
-                                             unsigned int flags)
+static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn,
+                                             mfn_t mfn, unsigned int flags,
+                                             unsigned int *flush_flags)
 {
     struct domain_iommu *hd = dom_iommu(d);
     struct dma_pte *page, *pte, old, new = {};
@@ -1825,15 +1825,18 @@ static int __must_check intel_iommu_map_page(struct domain *d,
     spin_unlock(&hd->arch.mapping_lock);
     unmap_vtd_domain_page(page);
 
-    if ( !this_cpu(iommu_dont_flush_iotlb) )
-        rc = iommu_flush_iotlb(d, dfn, dma_pte_present(old), 1);
+    *flush_flags |= IOMMU_FLUSHF_added;
+    if ( dma_pte_present(old) )
+        *flush_flags |= IOMMU_FLUSHF_modified;
 
     return rc;
 }
 
-static int __must_check intel_iommu_unmap_page(struct domain *d,
-                                               dfn_t dfn)
+static int __must_check intel_iommu_unmap_page(struct domain *d, dfn_t dfn,
+                                               unsigned int *flush_flags)
 {
+    int rc;
+
     /* Do nothing if VT-d shares EPT page table */
     if ( iommu_use_hap_pt(d) )
         return 0;
@@ -1842,7 +1845,11 @@ static int __must_check intel_iommu_unmap_page(struct domain *d,
     if ( iommu_hwdom_passthrough && is_hardware_domain(d) )
         return 0;
 
-    return dma_pte_clear_one(d, dfn_to_daddr(dfn));
+    rc = dma_pte_clear_one(d, dfn_to_daddr(dfn));
+    if ( !rc )
+        *flush_flags |= IOMMU_FLUSHF_modified;
+
+    return rc;
 }
 
 static int intel_iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index c1f3e2442e..61841e6c79 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -49,7 +49,6 @@ int arch_iommu_populate_page_table(struct domain *d)
     struct page_info *page;
     int rc = 0, n = 0;
 
-    this_cpu(iommu_dont_flush_iotlb) = 1;
     spin_lock(&d->page_alloc_lock);
 
     if ( unlikely(d->is_dying) )
@@ -62,14 +61,15 @@ int arch_iommu_populate_page_table(struct domain *d)
         {
             unsigned long mfn = mfn_x(page_to_mfn(page));
             unsigned long gfn = mfn_to_gmfn(d, mfn);
+            unsigned int flush_flags = 0;
 
             if ( gfn != gfn_x(INVALID_GFN) )
             {
                 ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
                 BUG_ON(SHARED_M2P(gfn));
-                rc = iommu_legacy_map(d, _dfn(gfn), _mfn(mfn),
-                                      PAGE_ORDER_4K, IOMMUF_readable |
-                                      IOMMUF_writable);
+                rc = iommu_map(d, _dfn(gfn), _mfn(mfn),
+                               PAGE_ORDER_4K, IOMMUF_readable |
+                               IOMMUF_writable, &flush_flags);
             }
             if ( rc )
             {
@@ -103,7 +103,6 @@ int arch_iommu_populate_page_table(struct domain *d)
     }
 
     spin_unlock(&d->page_alloc_lock);
-    this_cpu(iommu_dont_flush_iotlb) = 0;
 
     if ( !rc )
         rc = iommu_flush_all(d);
@@ -206,6 +205,7 @@ static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
 void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
 {
     unsigned long i, top, max_pfn;
+    unsigned int flush_flags = 0;
 
     BUG_ON(!is_hardware_domain(d));
 
@@ -240,8 +240,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
         if ( paging_mode_translate(d) )
             rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
         else
-            rc = iommu_legacy_map(d, _dfn(pfn), _mfn(pfn), PAGE_ORDER_4K,
-                                  IOMMUF_readable | IOMMUF_writable);
+            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), PAGE_ORDER_4K,
+                           IOMMUF_readable | IOMMUF_writable,
+                           &flush_flags);
+
         if ( rc )
             printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
                    d->domain_id, rc);
@@ -249,6 +251,9 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
         if (!(i & 0xfffff))
             process_pending_softirqs();
     }
+
+    if ( !flush_flags && iommu_flush_all(d) )
+        return;
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 88715329ca..c5697565d6 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -53,15 +53,18 @@ int amd_iommu_update_ivrs_mapping_acpi(void);
 
 /* mapping functions */
 int __must_check amd_iommu_map_page(struct domain *d, dfn_t dfn,
-                                    mfn_t mfn, unsigned int flags);
-int __must_check amd_iommu_unmap_page(struct domain *d, dfn_t dfn);
+                                    mfn_t mfn, unsigned int flags,
+                                    unsigned int *flush_flags);
+int __must_check amd_iommu_unmap_page(struct domain *d, dfn_t dfn,
+                                      unsigned int *flush_flags);
 uint64_t amd_iommu_get_address_from_pte(void *entry);
 int __must_check amd_iommu_alloc_root(struct domain_iommu *hd);
 int amd_iommu_reserve_domain_unity_map(struct domain *domain,
                                        paddr_t phys_addr, unsigned long size,
                                        int iw, int ir);
 int __must_check amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
-                                             unsigned int page_count);
+                                             unsigned int page_count,
+                                             unsigned int flush_flags);
 int __must_check amd_iommu_flush_iotlb_all(struct domain *d);
 
 /* Share p2m table with iommu */
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6773d605a9..93b41cd648 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -93,6 +93,22 @@ void iommu_teardown(struct domain *d);
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
 
+enum
+{
+    _IOMMU_FLUSHF_added, /* no modified entries, just additional entries */
+    _IOMMU_FLUSHF_modified, /* modified entries */
+};
+
+#define IOMMU_FLUSHF_added (1u << _IOMMU_FLUSHF_added)
+#define IOMMU_FLUSHF_modified (1u << _IOMMU_FLUSHF_modified)
+
+int __must_check iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
+                           unsigned int page_order, unsigned int flags,
+                           unsigned int *flush_flags);
+int __must_check iommu_unmap(struct domain *d, dfn_t dfn,
+                             unsigned int page_order,
+                             unsigned int *flush_flags);
+
 int __must_check iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
                                   unsigned int page_order,
                                   unsigned int flags);
@@ -102,7 +118,8 @@ int __must_check iommu_legacy_unmap(struct domain *d, dfn_t dfn,
 int __must_check iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
                                    unsigned int *flags);
 int __must_check iommu_flush(struct domain *d, dfn_t dfn,
-                             unsigned int page_count);
+                             unsigned int page_count,
+                             unsigned int flush_flags);
 int __must_check iommu_flush_all(struct domain *d);
 
 enum iommu_feature
@@ -182,8 +199,10 @@ struct iommu_ops {
      * other by the caller in order to have meaningful results.
      */
     int __must_check (*map_page)(struct domain *d, dfn_t dfn, mfn_t mfn,
-                                 unsigned int flags);
-    int __must_check (*unmap_page)(struct domain *d, dfn_t dfn);
+                                 unsigned int flags,
+                                 unsigned int *flush_flags);
+    int __must_check (*unmap_page)(struct domain *d, dfn_t dfn,
+                                   unsigned int *flush_flags);
     int __must_check (*lookup_page)(struct domain *d, dfn_t dfn, mfn_t *mfn,
                                     unsigned int *flags);
 
@@ -198,7 +217,8 @@ struct iommu_ops {
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
     int __must_check (*iotlb_flush)(struct domain *d, dfn_t dfn,
-                                    unsigned int page_count);
+                                    unsigned int page_count,
+                                    unsigned int flush_flags);
     int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
     void (*dump_p2m_table)(struct domain *d);
-- 
2.11.0



* [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order()
  2018-12-03 17:40 [PATCH v2 0/4] iommu improvements Paul Durrant
                   ` (2 preceding siblings ...)
  2018-12-03 17:40 ` [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations Paul Durrant
@ 2018-12-03 17:40 ` Paul Durrant
  2018-12-04 15:20   ` Jan Beulich
  2018-12-04 15:51   ` Jan Beulich
  3 siblings, 2 replies; 18+ messages in thread
From: Paul Durrant @ 2018-12-03 17:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Paul Durrant, Jan Beulich,
	Roger Pau Monné

Now that the iommu_map() and iommu_unmap() operations take an order
parameter and elide flushing there's no strong reason why modifying MMIO
ranges in the p2m should be restricted to a 4k granularity simply because
the IOMMU is enabled but shared page tables are not in operation.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>

v2:
 - New in v2. (Adapted from a previously independent patch).
---
 xen/arch/x86/mm/p2m.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ed76e96d33..a9cfd1b2e4 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2059,13 +2059,12 @@ static unsigned int mmio_order(const struct domain *d,
                                unsigned long start_fn, unsigned long nr)
 {
     /*
-     * Note that the !iommu_use_hap_pt() here has three effects:
-     * - cover iommu_{,un}map_page() not having an "order" input yet,
+     * Note that the !hap_enabled() here has two effects:
      * - exclude shadow mode (which doesn't support large MMIO mappings),
      * - exclude PV guests, should execution reach this code for such.
      * So be careful when altering this.
      */
-    if ( !iommu_use_hap_pt(d) ||
+    if ( !hap_enabled(d) ||
          (start_fn & ((1UL << PAGE_ORDER_2M) - 1)) || !(nr >> PAGE_ORDER_2M) )
         return PAGE_ORDER_4K;
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] amd-iommu: add flush iommu_ops
  2018-12-03 17:40 ` [PATCH v2 1/4] amd-iommu: add flush iommu_ops Paul Durrant
@ 2018-12-04 14:24   ` Jan Beulich
  2018-12-04 14:56     ` Paul Durrant
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2018-12-04 14:24 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, Andrew Cooper, Suravee Suthikulpanit, xen-devel,
	Brian Woods, Roger Pau Monne

>>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> +static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
> +                                 unsigned int order)
> +{
> +    unsigned long start = dfn / (1u << order);
> +    unsigned long end = DIV_ROUND_UP(dfn + page_count, (1u << order));

Luckily this is not in generic code, so the anomaly at the upper address
space end is not going to surface, and in particular not cause ...

> +    ASSERT(end > start);

... this to trigger. I therefore nevertheless wonder whether it
wouldn't be better to use

    unsigned long start = dfn >> order;
    unsigned long end = ((dfn + page_count - 1) >> order) + 1;

instead.

> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -52,6 +52,11 @@ static inline bool_t dfn_eq(dfn_t x, dfn_t y)
>      return dfn_x(x) == dfn_x(y);
>  }
>  
> +static inline bool_t dfn_lt(dfn_t x, dfn_t y)
> +{
> +    return dfn_x(x) < dfn_x(y);
> +}

The revision log says this is gone ...

With it really gone, and irrespective of the other comment
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Of course one or both adjustments could be easily done while
committing, provided you agree and provided there's no other
need for a v3.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] iommu: rename wrapper functions
  2018-12-03 17:40 ` [PATCH v2 2/4] iommu: rename wrapper functions Paul Durrant
@ 2018-12-04 14:51   ` Jan Beulich
  2018-12-04 15:00     ` Paul Durrant
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2018-12-04 14:51 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jun Nakajima, xen-devel, Roger Pau Monne

>>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> A subsequent patch will add semantically different versions of
> iommu_map/unmap() so, in advance of that change, this patch renames the
> existing functions to iommu_legacy_map/unmap() and modifies all call-sites.

Hmm, this is the second rename in pretty short a period of time.
Have you considered simply requesting a revert of the earlier
rename? Or wait, that was a rename folded into the addition of
the order parameter. Still not very fortunate. Apparently I
wasn't fast enough to express my reservations against the
original suggestion.

> The patch also renames iommu_iotlb_flush[_all] to the shorter name(s)
> iommu_flush[_all] (also renaming an internal VT-d function to avoid a name
> clash) and co-locates the declarations with those of
> iommu_legacy_map/unmap().

But the "iotlb" part was there to distinguish from other kinds of
flushing). Furthermore the such renamed functions continue to
call iotlb_flush{,_all} hook functions.

> The only changes in this patch that are not purely cosmetic are in
> arch_iommu_populate_page_table() and iommu_hwdom_init(), which now call
> iommu_legacy_map() rather than calling the map_page() iommu_ops method
> directly.

I pretty strongly think this ought to be a separate change. First
and foremost because you add verbosity (in case of error) to
the first of these code paths. Additionally the extra overhead of
repeatedly executed conditionals and the extra function call may
end up being noticeable for sufficiently long loops in both cases.

> @@ -68,9 +67,9 @@ int arch_iommu_populate_page_table(struct domain *d)
>              {
>                  ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
>                  BUG_ON(SHARED_M2P(gfn));
> -                rc = hd->platform_ops->map_page(d, _dfn(gfn), _mfn(mfn),
> -                                                IOMMUF_readable |
> -                                                IOMMUF_writable);
> +                rc = iommu_legacy_map(d, _dfn(gfn), _mfn(mfn),
> +                                      PAGE_ORDER_4K, IOMMUF_readable |
> +                                      IOMMUF_writable);

The indentation here is now pretty misleading.

> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -88,17 +88,22 @@ int iommu_construct(struct domain *d);
>  /* Function used internally, use iommu_domain_destroy */
>  void iommu_teardown(struct domain *d);
>  
> -/* iommu_map_page() takes flags to direct the mapping operation. */
>  #define _IOMMUF_readable 0
>  #define IOMMUF_readable  (1u<<_IOMMUF_readable)
>  #define _IOMMUF_writable 1
>  #define IOMMUF_writable  (1u<<_IOMMUF_writable)

I'd prefer if the comment didn't go away altogether.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] amd-iommu: add flush iommu_ops
  2018-12-04 14:24   ` Jan Beulich
@ 2018-12-04 14:56     ` Paul Durrant
  0 siblings, 0 replies; 18+ messages in thread
From: Paul Durrant @ 2018-12-04 14:56 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Wei Liu, Andrew Cooper, Suravee Suthikulpanit, xen-devel,
	Brian Woods, Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 04 December 2018 14:25
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Brian Woods <brian.woods@amd.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; Wei
> Liu <wei.liu2@citrix.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: Re: [PATCH v2 1/4] amd-iommu: add flush iommu_ops
> 
> >>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> > +static unsigned long flush_count(unsigned long dfn, unsigned int
> page_count,
> > +                                 unsigned int order)
> > +{
> > +    unsigned long start = dfn / (1u << order);
> > +    unsigned long end = DIV_ROUND_UP(dfn + page_count, (1u << order));
> 
> Luckily this is not in generic code, so the anomaly at the upper address
> space end is not going to surface, and in particular not cause ...
> 
> > +    ASSERT(end > start);
> 
> ... this to trigger. I therefore nevertheless wonder whether it
> wouldn't be better to use
> 
>     unsigned long start = dfn >> order;
>     unsigned long end = ((dfn + page_count - 1) >> order) + 1;
> 
> instead.

Yes, that's much better... don't know why I didn't do it that way in the first place.
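
For v3 the whole helper would then be something like this (untested sketch,
assuming it simply returns the number of order-sized units to flush):

static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
                                 unsigned int order)
{
    unsigned long start = dfn >> order;
    unsigned long end = ((dfn + page_count - 1) >> order) + 1;

    ASSERT(end > start);
    return end - start;
}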

> 
> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -52,6 +52,11 @@ static inline bool_t dfn_eq(dfn_t x, dfn_t y)
> >      return dfn_x(x) == dfn_x(y);
> >  }
> >
> > +static inline bool_t dfn_lt(dfn_t x, dfn_t y)
> > +{
> > +    return dfn_x(x) < dfn_x(y);
> > +}
> 
> The revision log says this is gone ...
> 

Oh. I must have messed up my git commands and accidentally put it back during rebase.

> With it really gone, and irrespective of the other comment
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 

Thanks.

> Of course one or both adjustments could be easily done while
> committing, provided you agree and provided there's no other
> need for a v3.

Ok, your comments on patch #2 suggest there will probably be a v3, so I can fix this then.

  Paul

> 
> Jan
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] iommu: rename wrapper functions
  2018-12-04 14:51   ` Jan Beulich
@ 2018-12-04 15:00     ` Paul Durrant
  0 siblings, 0 replies; 18+ messages in thread
From: Paul Durrant @ 2018-12-04 15:00 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jun Nakajima, xen-devel,
	Ian Jackson, Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 04 December 2018 14:51
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; Wei
> Liu <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian
> Jackson <Ian.Jackson@citrix.com>; Jun Nakajima <jun.nakajima@intel.com>;
> Kevin Tian <kevin.tian@intel.com>; Stefano Stabellini
> <sstabellini@kernel.org>; xen-devel <xen-devel@lists.xenproject.org>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Tim (Xen.org)
> <tim@xen.org>
> Subject: Re: [PATCH v2 2/4] iommu: rename wrapper functions
> 
> >>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> > A subsequent patch will add semantically different versions of
> > iommu_map/unmap() so, in advance of that change, this patch renames the
> > existing functions to iommu_legacy_map/unmap() and modifies all call-
> sites.
> 
> Hmm, this is the second rename in pretty short a period of time.
> Have you considered simply requesting a revert of the earlier
> rename? Or wait, that was a rename folded into the addition of
> the order parameter. Still not very fortunate. Apparently I
> wasn't fast enough to express my reservations against the
> original suggestion.
> 
> > The patch also renames iommu_iotlb_flush[_all] to the shorter name(s)
> > iommu_flush[_all] (also renaming an internal VT-d function to avoid a
> name
> > clash) and co-locates the declarations with those of
> > iommu_legacy_map/unmap().
> 
> But the "iotlb" part was there to distinguish from other kinds of
> flushing). Furthermore the such renamed functions continue to
> call iotlb_flush{,_all} hook functions.

Ok. I'll leave these names alone.

> 
> > The only changes in this patch that are not purely cosmetic are in
> > arch_iommu_populate_page_table() and iommu_hwdom_init(), which now call
> > iommu_legacy_map() rather than calling the map_page() iommu_ops method
> > directly.
> 
> I pretty strongly think this ought to be a separate change. First
> and foremost because you add verbosity (in case of error) to
> the first of these code paths. Additionally the extra overhead of
> repeatedly executed conditionals and the extra function call may
> end up being noticeable for sufficiently long loops in both cases.
> 

Ok, I'll make this patch purely cosmetic and split those out.

> > @@ -68,9 +67,9 @@ int arch_iommu_populate_page_table(struct domain *d)
> >              {
> >                  ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
> >                  BUG_ON(SHARED_M2P(gfn));
> > -                rc = hd->platform_ops->map_page(d, _dfn(gfn),
> _mfn(mfn),
> > -                                                IOMMUF_readable |
> > -                                                IOMMUF_writable);
> > +                rc = iommu_legacy_map(d, _dfn(gfn), _mfn(mfn),
> > +                                      PAGE_ORDER_4K, IOMMUF_readable |
> > +                                      IOMMUF_writable);
> 
> The indentation here is now pretty misleading.
> 

Ok, I'll move the IOMMUF_readable down a line.

> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -88,17 +88,22 @@ int iommu_construct(struct domain *d);
> >  /* Function used internally, use iommu_domain_destroy */
> >  void iommu_teardown(struct domain *d);
> >
> > -/* iommu_map_page() takes flags to direct the mapping operation. */
> >  #define _IOMMUF_readable 0
> >  #define IOMMUF_readable  (1u<<_IOMMUF_readable)
> >  #define _IOMMUF_writable 1
> >  #define IOMMUF_writable  (1u<<_IOMMUF_writable)
> 
> I'd prefer if the comment didn't go away altogether.
> 

Ok. I didn't think it was particularly helpful but I'll re-instate it with a modified name.

  Paul

> Jan
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations
  2018-12-03 17:40 ` [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations Paul Durrant
@ 2018-12-04 15:16   ` Jan Beulich
  2018-12-04 15:36     ` Paul Durrant
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2018-12-04 15:16 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Suravee Suthikulpanit, xen-devel, Brian Woods,
	Roger Pau Monne

>>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> This patch removes any implicit flushing that occurs in the implementation
> of map and unmap operations and adds new iommu_map/unmap() wrapper
> functions. To maintain the semantics of the iommu_legacy_map/unmap() wrapper
> functions, these are modified to call the new wrapper functions and then
> perform an explicit flush operation.
> 
> Because VT-d currently performs two different types of flush dependent upon
> whether a PTE is being modified versus merely added (i.e. replacing a non-
> present PTE) 'iommu flush flags' are defined by this patch and the
> iommu_ops map_page() and unmap_page() methods are modified to OR the type
> of flush necessary for the PTE that has been populated or depopulated into
> an accumulated flags value. The accumulated value can then be passed into
> the explicit flush operation.
> 
> The ARM SMMU implementations of map_page() and unmap_page() currently
> perform no implicit flushing and therefore the modified methods do not
> adjust the flush flags.

Which, however, likely is wrong. If we mean the flushing to be initiated
by the arch- and vendor-independent wrappers, then all map/unmap
backends should indicate the needed kind of flush. Granted this can be
done later, if things are otherwise correct on Arm right now.

> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -971,8 +971,17 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>  
>      if ( need_iommu_pt_sync(p2m->domain) &&
>           (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
> +    {
> +        unsigned int flush_flags = 0;
> +
> +        if ( lpae_is_valid(orig_pte) )
> +            flush_flags |= IOMMU_FLUSHF_modified;
> +        if ( lpae_is_valid(*entry) )
> +            flush_flags |= IOMMU_FLUSHF_added;

Shouldn't this be "else if" with the meaning assigned to both
types? From an abstract perspective I'd also expect that for
a single mapping no more than one of the flags can come
back set (through the iommu_ops interface).

> @@ -84,7 +86,7 @@ static bool set_iommu_pde_present(uint32_t *pde, unsigned long next_mfn,
>  
>          if ( maddr_old != maddr_next || iw != old_w || ir != old_r ||
>               old_level != next_level )
> -            need_flush = true;
> +            flush_flags = IOMMU_FLUSHF_modified;

Why uniformly "modified"?

> @@ -645,11 +648,13 @@ static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
>  }
>  
>  int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
> -                                unsigned int page_count)
> +                                unsigned int page_count,
> +                                unsigned int flush_flags)
>  {
>      unsigned long dfn_l = dfn_x(dfn);
>  
>      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> +    ASSERT(flush_flags & IOMMU_FLUSHF_modified);

Is this valid? What if a map operation solely re-established what
was already there? Aiui in that case set_iommu_pde_present()
would always return zero. Or take this (seeing that the generic
wrapper has a zero check for the flush flags):

> @@ -692,6 +697,7 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
>      unsigned long npages, i;
>      unsigned long gfn;
>      unsigned int flags = !!ir;
> +    unsigned int flush_flags = 0;
>      int rt = 0;
>  
>      if ( iw )
> @@ -703,11 +709,21 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
>      {
>          unsigned long frame = gfn + i;
>  
> -        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags);
> +        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags,
> +                                &flush_flags);
>          if ( rt != 0 )
> -            return rt;
> +            break;
>      }
> -    return 0;
> +
> +    /*
> +     * The underlying implementation is void so the return value is
> +     * meaningless and can hence be ignored.
> +     */
> +    while ( amd_iommu_flush_iotlb_pages(domain, _dfn(gfn), npages,
> +                                        flush_flags) )
> +        break;

Nothing here guarantees flush_flags to be non-zero.

> @@ -235,6 +236,9 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
>                  process_pending_softirqs();
>          }
>  
> +        while ( !flush_flags && iommu_flush_all(d) )
> +            break;

Is there a comment missing here explaining the seemingly odd
loop?

> @@ -381,6 +402,17 @@ int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
>      return rc;
>  }
>  
> +int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
> +{
> +    unsigned int flush_flags = 0;
> +    int rc = iommu_unmap(d, dfn, page_order, &flush_flags);
> +
> +    if ( !rc )
> +        rc = iommu_flush(d, dfn, (1u << page_order), flush_flags);

No iommu_dont_flush_iotlb check needed here?

> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -633,11 +633,14 @@ static int __must_check iommu_flush_iotlb(struct domain *d, dfn_t dfn,
>  
>  static int __must_check iommu_flush_iotlb_pages(struct domain *d,
>                                                  dfn_t dfn,
> -                                                unsigned int page_count)
> +                                                unsigned int page_count,
> +                                                unsigned int flush_flags)
>  {
>      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> +    ASSERT(flush_flags);
>  
> -    return iommu_flush_iotlb(d, dfn, 1, page_count);
> +    return iommu_flush_iotlb(d, dfn, flush_flags & IOMMU_FLUSHF_modified,
> +                             page_count);

Why the restriction to "modified"?

> @@ -1825,15 +1825,18 @@ static int __must_check intel_iommu_map_page(struct domain *d,
>      spin_unlock(&hd->arch.mapping_lock);
>      unmap_vtd_domain_page(page);
>  
> -    if ( !this_cpu(iommu_dont_flush_iotlb) )
> -        rc = iommu_flush_iotlb(d, dfn, dma_pte_present(old), 1);
> +    *flush_flags |= IOMMU_FLUSHF_added;
> +    if ( dma_pte_present(old) )
> +        *flush_flags |= IOMMU_FLUSHF_modified;

See my earlier comment as to only one of them to get set for an
individual mapping.

> @@ -62,14 +61,15 @@ int arch_iommu_populate_page_table(struct domain *d)
>          {
>              unsigned long mfn = mfn_x(page_to_mfn(page));
>              unsigned long gfn = mfn_to_gmfn(d, mfn);
> +            unsigned int flush_flags = 0;
>  
>              if ( gfn != gfn_x(INVALID_GFN) )
>              {
>                  ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
>                  BUG_ON(SHARED_M2P(gfn));
> -                rc = iommu_legacy_map(d, _dfn(gfn), _mfn(mfn),
> -                                      PAGE_ORDER_4K, IOMMUF_readable |
> -                                      IOMMUF_writable);
> +                rc = iommu_map(d, _dfn(gfn), _mfn(mfn),
> +                               PAGE_ORDER_4K, IOMMUF_readable |
> +                               IOMMUF_writable, &flush_flags);
>              }
>              if ( rc )
>              {
> @@ -103,7 +103,6 @@ int arch_iommu_populate_page_table(struct domain *d)
>      }
>  
>      spin_unlock(&d->page_alloc_lock);
> -    this_cpu(iommu_dont_flush_iotlb) = 0;
>  
>      if ( !rc )
>          rc = iommu_flush_all(d);

Would be nice to have a comment here clarifying why flush_flags
doesn't get used.

> @@ -249,6 +251,9 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>          if (!(i & 0xfffff))
>              process_pending_softirqs();
>      }
> +
> +    if ( !flush_flags && iommu_flush_all(d) )
> +        return;
>  }

Again please attach a brief comment explaining the seemingly
strange construct.

> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -93,6 +93,22 @@ void iommu_teardown(struct domain *d);
>  #define _IOMMUF_writable 1
>  #define IOMMUF_writable  (1u<<_IOMMUF_writable)
>  
> +enum
> +{

Brace on the same line as "enum" please, just like for struct/union. When
they're named this helps finding the place where a certain type gets
(fully) declared.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order()
  2018-12-03 17:40 ` [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order() Paul Durrant
@ 2018-12-04 15:20   ` Jan Beulich
  2018-12-04 15:22     ` Paul Durrant
  2018-12-04 15:51   ` Jan Beulich
  1 sibling, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2018-12-04 15:20 UTC (permalink / raw)
  To: Paul Durrant
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> Now that the iommu_map() and iommu_unmap() operations take an order
> parameter and elide flushing there's no strong reason why modifying MMIO
> ranges in the p2m should be restricted to a 4k granularity simply because
> the IOMMU is enabled but shared page tables are not in operation.

I'm afraid the two improvements are not enough for this restriction
to be lifted: There's still no preemption in the processing of the
higher order values.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order()
  2018-12-04 15:20   ` Jan Beulich
@ 2018-12-04 15:22     ` Paul Durrant
  2018-12-04 15:36       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Durrant @ 2018-12-04 15:22 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Andrew Cooper, xen-devel, Wei Liu, George Dunlap, Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 04 December 2018 15:21
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: Re: [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared
> page tables in mmio_order()
> 
> >>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> > Now that the iommu_map() and iommu_unmap() operations take an order
> > parameter and elide flushing there's no strong reason why modifying MMIO
> > ranges in the p2m should be restricted to a 4k granularity simply
> because
> > the IOMMU is enabled but shared page tables are not in operation.
> 
> I'm afraid the two improvements are not enough for this restriction
> to be lifted: There's still no preemption in the processing of the
> higher order values.

Why? 1G orders are already ruled out and testing shows that 2M orders cause no problems on EPYC systems.

  Paul

> 
> Jan
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations
  2018-12-04 15:16   ` Jan Beulich
@ 2018-12-04 15:36     ` Paul Durrant
  2018-12-04 16:01       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Durrant @ 2018-12-04 15:36 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Suravee Suthikulpanit, xen-devel,
	Ian Jackson, Brian Woods, Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 04 December 2018 15:17
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Brian Woods <brian.woods@amd.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>; Julien Grall <julien.grall@arm.com>;
> Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Kevin
> Tian <kevin.tian@intel.com>; Stefano Stabellini <sstabellini@kernel.org>;
> xen-devel <xen-devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v2 3/4] iommu: elide flushing for higher order
> map/unmap operations
> 
> >>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> > This patch removes any implicit flushing that occurs in the
> implementation
> > of map and unmap operations and adds new iommu_map/unmap() wrapper
> > functions. To maintain the semantics of the iommu_legacy_map/unmap() wrapper
> > functions, these are modified to call the new wrapper functions and then
> > perform an explicit flush operation.
> >
> > Because VT-d currently performs two different types of flush dependent
> upon
> > whether a PTE is being modified versus merely added (i.e. replacing a
> non-
> > present PTE) 'iommu flush flags' are defined by this patch and the
> > iommu_ops map_page() and unmap_page() methods are modified to OR the
> type
> > of flush necessary for the PTE that has been populated or depopulated
> into
> > an accumulated flags value. The accumulated value can then be passed
> into
> > the explicit flush operation.
> >
> > The ARM SMMU implementations of map_page() and unmap_page() currently
> > perform no implicit flushing and therefore the modified methods do not
> > adjust the flush flags.
> 
> Which, however, likely is wrong. If we mean the flushing to be initiated
> by the arch- and vendor-independent wrappers, then all map/unmap
> backends should indicate the needed kind of flush. Granted this can be
> done later, if things are otherwise correct on Arm right now.
> 
> > --- a/xen/arch/arm/p2m.c
> > +++ b/xen/arch/arm/p2m.c
> > @@ -971,8 +971,17 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
> >
> >      if ( need_iommu_pt_sync(p2m->domain) &&
> >           (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
> > +    {
> > +        unsigned int flush_flags = 0;
> > +
> > +        if ( lpae_is_valid(orig_pte) )
> > +            flush_flags |= IOMMU_FLUSHF_modified;
> > +        if ( lpae_is_valid(*entry) )
> > +            flush_flags |= IOMMU_FLUSHF_added;
> 
> Shouldn't this be "else if" with the meaning assigned to both
> types? From an abstract perspective I'd also expect that for
> a single mapping no more than one of the flags can come
> back set (through the iommu_ops interface).

That's not how I see it. My rationale is:

- present PTE made non-present, or modified -> IOMMU_FLUSHF_modified
- new PTE value is present -> IOMMU_FLUSHF_added

So, a single op can set any combination of bits and thus the above code does not use 'else if'.

> 
> > @@ -84,7 +86,7 @@ static bool set_iommu_pde_present(uint32_t *pde,
> unsigned long next_mfn,
> >
> >          if ( maddr_old != maddr_next || iw != old_w || ir != old_r ||
> >               old_level != next_level )
> > -            need_flush = true;
> > +            flush_flags = IOMMU_FLUSHF_modified;
> 
> Why uniformly "modified"?

Because the AMD IOMMU does require flushing for a non-present -> present transition AFAICT. The old code certainly implies this.

> 
> > @@ -645,11 +648,13 @@ static unsigned long flush_count(unsigned long
> dfn, unsigned int page_count,
> >  }
> >
> >  int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
> > -                                unsigned int page_count)
> > +                                unsigned int page_count,
> > +                                unsigned int flush_flags)
> >  {
> >      unsigned long dfn_l = dfn_x(dfn);
> >
> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> > +    ASSERT(flush_flags & IOMMU_FLUSHF_modified);
> 
> Is this valid? What if a map operation solely re-established what
> was already there? Aiui in that case set_iommu_pde_present()
> would always return zero. Or take this (seeing that the generic
> wrapper has a zero check for the flush flags):

Yes, the ASSERT is there because this should never be called unless flush_flags != 0 (ensured by the wrapper) and the map code should only ever set IOMMU_FLUSHF_modified.

> 
> > @@ -692,6 +697,7 @@ int amd_iommu_reserve_domain_unity_map(struct domain
> *domain,
> >      unsigned long npages, i;
> >      unsigned long gfn;
> >      unsigned int flags = !!ir;
> > +    unsigned int flush_flags = 0;
> >      int rt = 0;
> >
> >      if ( iw )
> > @@ -703,11 +709,21 @@ int amd_iommu_reserve_domain_unity_map(struct
> domain *domain,
> >      {
> >          unsigned long frame = gfn + i;
> >
> > -        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame),
> flags);
> > +        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame),
> flags,
> > +                                &flush_flags);
> >          if ( rt != 0 )
> > -            return rt;
> > +            break;
> >      }
> > -    return 0;
> > +
> > +    /*
> > +     * The underlying implementation is void so the return value is
> > +     * meaningless and can hence be ignored.
> > +     */
> > +    while ( amd_iommu_flush_iotlb_pages(domain, _dfn(gfn), npages,
> > +                                        flush_flags) )
> > +        break;
> 
> Nothing here guarantees flush_flags to be non-zero.

Good point. I'll add a check.
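
i.e. something like (sketch only):

    while ( flush_flags &&
            amd_iommu_flush_iotlb_pages(domain, _dfn(gfn), npages,
                                        flush_flags) )
        break;

keeping the existing comment about the return value being meaningless.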

> 
> > @@ -235,6 +236,9 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
> >                  process_pending_softirqs();
> >          }
> >
> > +        while ( !flush_flags && iommu_flush_all(d) )
> > +            break;
> 
> Is there a comment missing here explaining the seemingly odd
> loop?

I'm merely using the construct you suggested, but I can add a comment.

> 
> > @@ -381,6 +402,17 @@ int iommu_legacy_unmap(struct domain *d, dfn_t dfn,
> unsigned int page_order)
> >      return rc;
> >  }
> >
> > +int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int
> page_order)
> > +{
> > +    unsigned int flush_flags = 0;
> > +    int rc = iommu_unmap(d, dfn, page_order, &flush_flags);
> > +
> > +    if ( !rc )
> > +        rc = iommu_flush(d, dfn, (1u << page_order), flush_flags);
> 
> No iommu_dont_flush_iotlb check needed here?

I thought the old VT-d unmap code ignored it, but I see it didn't, so yes, I do need to add the check.
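
i.e. the wrapper would end up roughly as follows (sketch only; the exact
shape can be settled in v3):

int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
{
    unsigned int flush_flags = 0;
    int rc = iommu_unmap(d, dfn, page_order, &flush_flags);

    /* Callers setting iommu_dont_flush_iotlb will flush explicitly later. */
    if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
        rc = iommu_flush(d, dfn, 1u << page_order, flush_flags);

    return rc;
}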

> 
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -633,11 +633,14 @@ static int __must_check iommu_flush_iotlb(struct
> domain *d, dfn_t dfn,
> >
> >  static int __must_check iommu_flush_iotlb_pages(struct domain *d,
> >                                                  dfn_t dfn,
> > -                                                unsigned int
> page_count)
> > +                                                unsigned int
> page_count,
> > +                                                unsigned int
> flush_flags)
> >  {
> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> > +    ASSERT(flush_flags);
> >
> > -    return iommu_flush_iotlb(d, dfn, 1, page_count);
> > +    return iommu_flush_iotlb(d, dfn, flush_flags &
> IOMMU_FLUSHF_modified,
> > +                             page_count);
> 
> Why the restriction to "modified"?

The parameter is a bool which should be true if an existing PTE was modified or false otherwise. I can make this !!(flush_flags & IOMMU_FLUSHF_modified) if you prefer.

> 
> > @@ -1825,15 +1825,18 @@ static int __must_check
> intel_iommu_map_page(struct domain *d,
> >      spin_unlock(&hd->arch.mapping_lock);
> >      unmap_vtd_domain_page(page);
> >
> > -    if ( !this_cpu(iommu_dont_flush_iotlb) )
> > -        rc = iommu_flush_iotlb(d, dfn, dma_pte_present(old), 1);
> > +    *flush_flags |= IOMMU_FLUSHF_added;
> > +    if ( dma_pte_present(old) )
> > +        *flush_flags |= IOMMU_FLUSHF_modified;
> 
> See my earlier comment as to only one of them to get set for an
> individual mapping.
> 
> > @@ -62,14 +61,15 @@ int arch_iommu_populate_page_table(struct domain *d)
> >          {
> >              unsigned long mfn = mfn_x(page_to_mfn(page));
> >              unsigned long gfn = mfn_to_gmfn(d, mfn);
> > +            unsigned int flush_flags = 0;
> >
> >              if ( gfn != gfn_x(INVALID_GFN) )
> >              {
> >                  ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
> >                  BUG_ON(SHARED_M2P(gfn));
> > -                rc = iommu_legacy_map(d, _dfn(gfn), _mfn(mfn),
> > -                                      PAGE_ORDER_4K, IOMMUF_readable |
> > -                                      IOMMUF_writable);
> > +                rc = iommu_map(d, _dfn(gfn), _mfn(mfn),
> > +                               PAGE_ORDER_4K, IOMMUF_readable |
> > +                               IOMMUF_writable, &flush_flags);
> >              }
> >              if ( rc )
> >              {
> > @@ -103,7 +103,6 @@ int arch_iommu_populate_page_table(struct domain *d)
> >      }
> >
> >      spin_unlock(&d->page_alloc_lock);
> > -    this_cpu(iommu_dont_flush_iotlb) = 0;
> >
> >      if ( !rc )
> >          rc = iommu_flush_all(d);
> 
> Would be nice to have a comment here clarifying why flush_flags
> doesn't get used.

Ok.
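
e.g. something along the lines of (wording is only a suggestion):

    /*
     * flush_flags is deliberately not consumed here: the full flush
     * below covers everything mapped by the loop above.
     */
    if ( !rc )
        rc = iommu_flush_all(d);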

> 
> > @@ -249,6 +251,9 @@ void __hwdom_init arch_iommu_hwdom_init(struct
> domain *d)
> >          if (!(i & 0xfffff))
> >              process_pending_softirqs();
> >      }
> > +
> > +    if ( !flush_flags && iommu_flush_all(d) )
> > +        return;
> >  }
> 
> Again please attach a brief comment explaining the seemingly
> strange construct.
> 

Ok.

> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -93,6 +93,22 @@ void iommu_teardown(struct domain *d);
> >  #define _IOMMUF_writable 1
> >  #define IOMMUF_writable  (1u<<_IOMMUF_writable)
> >
> > +enum
> > +{
> 
> Brace on the same line as "enum" please, just like for struct/union. When
> they're named this helps finding the place where a certain type gets
> (fully) declared.

Ok.

  Paul

> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order()
  2018-12-04 15:22     ` Paul Durrant
@ 2018-12-04 15:36       ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2018-12-04 15:36 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Andrew Cooper, xen-devel, Wei Liu, george.dunlap, Roger Pau Monne

>>> On 04.12.18 at 16:22, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 04 December 2018 15:21
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
>> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George Dunlap
>> <George.Dunlap@citrix.com>; xen-devel <xen-devel@lists.xenproject.org>
>> Subject: Re: [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared
>> page tables in mmio_order()
>> 
>> >>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
>> > Now that the iommu_map() and iommu_unmap() operations take an order
>> > parameter and elide flushing there's no strong reason why modifying MMIO
>> > ranges in the p2m should be restricted to a 4k granularity simply
>> because
>> > the IOMMU is enabled but shared page tables are not in operation.
>> 
>> I'm afraid the two improvements are not enough for this restriction
>> to be lifted: There's still no preemption in the processing of the
>> higher order values.
> 
> Why? 1G orders are already ruled out and testing shows that 2M orders cause 
> no problems on EPYC systems.

Hmm, yes, agreed.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order()
  2018-12-03 17:40 ` [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order() Paul Durrant
  2018-12-04 15:20   ` Jan Beulich
@ 2018-12-04 15:51   ` Jan Beulich
  1 sibling, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2018-12-04 15:51 UTC (permalink / raw)
  To: Paul Durrant
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> Now that the iommu_map() and iommu_unmap() operations take an order
> parameter and elide flushing there's no strong reason why modifying MMIO
> ranges in the p2m should be restricted to a 4k granularity simply because
> the IOMMU is enabled but shared page tables are not in operation.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Based on the other reply of yours:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations
  2018-12-04 15:36     ` Paul Durrant
@ 2018-12-04 16:01       ` Jan Beulich
  2018-12-04 16:53         ` Paul Durrant
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2018-12-04 16:01 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim Deegan, george.dunlap, Julien Grall,
	Suravee Suthikulpanit, xen-devel, IanJackson, Brian Woods,
	Roger Pau Monne

>>> On 04.12.18 at 16:36, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 04 December 2018 15:17
>> 
>> >>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
>> > --- a/xen/arch/arm/p2m.c
>> > +++ b/xen/arch/arm/p2m.c
>> > @@ -971,8 +971,17 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>> >
>> >      if ( need_iommu_pt_sync(p2m->domain) &&
>> >           (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
>> > +    {
>> > +        unsigned int flush_flags = 0;
>> > +
>> > +        if ( lpae_is_valid(orig_pte) )
>> > +            flush_flags |= IOMMU_FLUSHF_modified;
>> > +        if ( lpae_is_valid(*entry) )
>> > +            flush_flags |= IOMMU_FLUSHF_added;
>> 
>> Shouldn't this be "else if" with the meaning assigned to both
>> types? From an abstract perspective I'd also expect that for
>> a single mapping no more than one of the flags can come
>> back set (through the iommu_ops interface).
> 
> That's not how I see it. My rationale is:
> 
> - present PTE made non-present, or modified -> IOMMU_FLUSHF_modified
> - new PTE value is present -> IOMMU_FLUSHF_added
> 
> So, a single op can set any combination of bits and thus the above code does 
> not use 'else if'.

I can't fit this with the code comments:

enum
{
    _IOMMU_FLUSHF_added, /* no modified entries, just additional entries */
    _IOMMU_FLUSHF_modified, /* modified entries */
};

..., in particular the "no modified entries" part.

>> > @@ -84,7 +86,7 @@ static bool set_iommu_pde_present(uint32_t *pde,
>> unsigned long next_mfn,
>> >
>> >          if ( maddr_old != maddr_next || iw != old_w || ir != old_r ||
>> >               old_level != next_level )
>> > -            need_flush = true;
>> > +            flush_flags = IOMMU_FLUSHF_modified;
>> 
>> Why uniformly "modified"?
> 
> Because the AMD IOMMU does require flushing for a non-present -> present 
> transition AFAICT. The old code certainly implies this.

It is one thing what the flush function does with the value, but
another whether the modifying function "lies". I'm not opposed
to simplification, but then a comment needs to explain this.

>> > @@ -235,6 +236,9 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
>> >                  process_pending_softirqs();
>> >          }
>> >
>> > +        while ( !flush_flags && iommu_flush_all(d) )
>> > +            break;
>> 
>> Is there a comment missing here explaining the seemingly odd
>> loop?
> 
> I'm merely using the construct you suggested, but I can add a comment.

And I'm fine with the construct, but in the other place (where
we discussed this for the earlier version) there is a comment.

>> > --- a/xen/drivers/passthrough/vtd/iommu.c
>> > +++ b/xen/drivers/passthrough/vtd/iommu.c
>> > @@ -633,11 +633,14 @@ static int __must_check iommu_flush_iotlb(struct
>> domain *d, dfn_t dfn,
>> >
>> >  static int __must_check iommu_flush_iotlb_pages(struct domain *d,
>> >                                                  dfn_t dfn,
>> > -                                                unsigned int
>> page_count)
>> > +                                                unsigned int
>> page_count,
>> > +                                                unsigned int
>> flush_flags)
>> >  {
>> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
>> > +    ASSERT(flush_flags);
>> >
>> > -    return iommu_flush_iotlb(d, dfn, 1, page_count);
>> > +    return iommu_flush_iotlb(d, dfn, flush_flags &
>> IOMMU_FLUSHF_modified,
>> > +                             page_count);
>> 
>> Why the restriction to "modified"?
> 
> The parameter is a bool which should be true if an existing PTE was modified 
> or false otherwise. I can make this !!(flush_flags & IOMMU_FLUSHF_modified) if
> you prefer.

No, that wasn't my point. The question is why this isn't just
"flush_flags", without any masking. Iirc there are precautions
in the VT-d code to deal with hardware which may cache
non-present entries. In that case "added" requires flushing too.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations
  2018-12-04 16:01       ` Jan Beulich
@ 2018-12-04 16:53         ` Paul Durrant
  2018-12-04 17:20           ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Durrant @ 2018-12-04 16:53 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Suravee Suthikulpanit, Ian Jackson,
	xen-devel, Brian Woods, Roger Pau Monne

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> Of Jan Beulich
> Sent: 04 December 2018 16:02
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Kevin Tian <kevin.tian@intel.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wei.liu2@citrix.com>; Konrad Rzeszutek
> Wilk <konrad.wilk@oracle.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>;
> Tim (Xen.org) <tim@xen.org>; George Dunlap <George.Dunlap@citrix.com>;
> Julien Grall <julien.grall@arm.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>; xen-devel <xen-
> devel@lists.xenproject.org>; Ian Jackson <Ian.Jackson@citrix.com>; Brian
> Woods <brian.woods@amd.com>; Roger Pau Monne <roger.pau@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v2 3/4] iommu: elide flushing for higher
> order map/unmap operations
> 
> >>> On 04.12.18 at 16:36, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 04 December 2018 15:17
> >>
> >> >>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
> >> > --- a/xen/arch/arm/p2m.c
> >> > +++ b/xen/arch/arm/p2m.c
> >> > @@ -971,8 +971,17 @@ static int __p2m_set_entry(struct p2m_domain
> *p2m,
> >> >
> >> >      if ( need_iommu_pt_sync(p2m->domain) &&
> >> >           (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
> >> > +    {
> >> > +        unsigned int flush_flags = 0;
> >> > +
> >> > +        if ( lpae_is_valid(orig_pte) )
> >> > +            flush_flags |= IOMMU_FLUSHF_modified;
> >> > +        if ( lpae_is_valid(*entry) )
> >> > +            flush_flags |= IOMMU_FLUSHF_added;
> >>
> >> Shouldn't this be "else if" with the meaning assigned to both
> >> types? From an abstract perspective I'd also expect that for
> >> a single mapping no more than one of the flags can come
> >> back set (through the iommu_ops interface).
> >
> > That's not how I see it. My rationale is:
> >
> > - present PTE made non-present, or modified -> IOMMU_FLUSHF_modified
> > - new PTE value is present -> IOMMU_FLUSHF_added
> >
> > So, a single op can set any combination of bits and thus the above code
> does
> > not use 'else if'.
> 
> I can't fit this with the code comments:
> 
> enum
> {
>     _IOMMU_FLUSHF_added, /* no modified entries, just additional entries
> */
>     _IOMMU_FLUSHF_modified, /* modified entries */
> };
> 
> ..., in particular the "no modified entries" part.

That was supposed to emphasize that the other flag is needed in the case of a mapping that modifies an existing entry... I'll re-state it using a block comment.
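
Something like this, perhaps (wording is only a suggestion for v3, not final):

enum {
    /*
     * At least one new, previously non-present, entry has been written.
     * Only implementations that cache non-present entries need to act
     * on this alone.
     */
    _IOMMU_FLUSHF_added,
    /*
     * At least one previously present entry has been modified or
     * removed, so any cached translation for it must be invalidated.
     */
    _IOMMU_FLUSHF_modified,
};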

> 
> >> > @@ -84,7 +86,7 @@ static bool set_iommu_pde_present(uint32_t *pde,
> >> unsigned long next_mfn,
> >> >
> >> >          if ( maddr_old != maddr_next || iw != old_w || ir != old_r
> ||
> >> >               old_level != next_level )
> >> > -            need_flush = true;
> >> > +            flush_flags = IOMMU_FLUSHF_modified;
> >>
> >> Why uniformly "modified"?
> >
> > Because the AMD IOMMU does require flushing for a non-present -> present
> > transition AFAICT. The old code certainly implies this.
> 
> It is one thing what the flush function does with the value, but
> another whether the modifying function "lies". I'm not opposed
> to simplification, but then a comment needs to explain this.
> 

Ok. Maybe it is simpler not to omit requesting the 'added' flush here and then just ignore it in the flush method.

> >> > @@ -235,6 +236,9 @@ void __hwdom_init iommu_hwdom_init(struct domain
> *d)
> >> >                  process_pending_softirqs();
> >> >          }
> >> >
> >> > +        while ( !flush_flags && iommu_flush_all(d) )
> >> > +            break;
> >>
> >> Is there a comment missing here explaining the seemingly odd
> >> loop?
> >
> > I'm merely using the construct you suggested, but I can add a comment.
> 
> And I'm fine with the construct, but in the other place (where
> we discussed this for the earlier version) there is a comment.
>

Ok. I'll add a similar comment.
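
e.g. (exact wording to be decided):

        /*
         * The while/break construct is only there to consume the
         * __must_check return value; there is nothing useful to be
         * done with an error at this point.
         */
        while ( !flush_flags && iommu_flush_all(d) )
            break;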

> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> > @@ -633,11 +633,14 @@ static int __must_check
> iommu_flush_iotlb(struct
> >> domain *d, dfn_t dfn,
> >> >
> >> >  static int __must_check iommu_flush_iotlb_pages(struct domain *d,
> >> >                                                  dfn_t dfn,
> >> > -                                                unsigned int
> >> page_count)
> >> > +                                                unsigned int
> >> page_count,
> >> > +                                                unsigned int
> >> flush_flags)
> >> >  {
> >> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> >> > +    ASSERT(flush_flags);
> >> >
> >> > -    return iommu_flush_iotlb(d, dfn, 1, page_count);
> >> > +    return iommu_flush_iotlb(d, dfn, flush_flags &
> >> IOMMU_FLUSHF_modified,
> >> > +                             page_count);
> >>
> >> Why the restriction to "modified"?
> >
> > The parameter is a bool which should be true if an existing PTE was
> modified
> > or false otherwise. I can make this !!(flush_flags &
> IOMMU_FLUSHF_modified) if
> > you prefer.
> 
> No, that wasn't my point. The question is why this isn't just
> "flush_flags", without any masking. Iirc there are precautions
> in the VT-d code to deal with hardware which may cache
> non-present entries. In that case "added" requires flushing too.
> 

I don't understand. iommu_flush_iotlb()'s third argument is 'dma_old_pte_present' so that should be true iff IOMMU_FLUSHF_modified is set. IOMMU_FLUSHF_added is irrelevant to the implementation.

  Paul

> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations
  2018-12-04 16:53         ` Paul Durrant
@ 2018-12-04 17:20           ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2018-12-04 17:20 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim Deegan, george.dunlap, Julien Grall,
	Suravee Suthikulpanit, xen-devel, Ian Jackson, Brian Woods,
	Roger Pau Monne

>>> On 04.12.18 at 17:53, <Paul.Durrant@citrix.com> wrote:
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
>> Of Jan Beulich
>> Sent: 04 December 2018 16:02
>> 
>> >>> On 04.12.18 at 16:36, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 04 December 2018 15:17
>> >>
>> >> >>> On 03.12.18 at 18:40, <paul.durrant@citrix.com> wrote:
>> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
>> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
>> >> > @@ -633,11 +633,14 @@ static int __must_check
>> iommu_flush_iotlb(struct
>> >> domain *d, dfn_t dfn,
>> >> >
>> >> >  static int __must_check iommu_flush_iotlb_pages(struct domain *d,
>> >> >                                                  dfn_t dfn,
>> >> > -                                                unsigned int
>> >> page_count)
>> >> > +                                                unsigned int
>> >> page_count,
>> >> > +                                                unsigned int
>> >> flush_flags)
>> >> >  {
>> >> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
>> >> > +    ASSERT(flush_flags);
>> >> >
>> >> > -    return iommu_flush_iotlb(d, dfn, 1, page_count);
>> >> > +    return iommu_flush_iotlb(d, dfn, flush_flags &
>> >> IOMMU_FLUSHF_modified,
>> >> > +                             page_count);
>> >>
>> >> Why the restriction to "modified"?
>> >
>> > The parameter is a bool which should be true if an existing PTE was
>> modified
>> > or false otherwise. I can make this !!(flush_flags &
>> IOMMU_FLUSHF_modified) if
>> > you prefer.
>> 
>> No, that wasn't my point. The question is why this isn't just
>> "flush_flags", without any masking. Iirc there are precautions
>> in the VT-d code to deal with hardware which may cache
>> non-present entries. In that case "added" requires flushing too.
>> 
> 
> I don't understand. iommu_flush_iotlb()'s third argument is 
> 'dma_old_pte_present' so that should be true iff IOMMU_FLUSHF_modified is 
> set. IOMMU_FLUSHF_added is irrelevant to the implementation.

Oh, you're right. I'm sorry for the noise then.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-12-04 17:20 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-03 17:40 [PATCH v2 0/4] iommu improvements Paul Durrant
2018-12-03 17:40 ` [PATCH v2 1/4] amd-iommu: add flush iommu_ops Paul Durrant
2018-12-04 14:24   ` Jan Beulich
2018-12-04 14:56     ` Paul Durrant
2018-12-03 17:40 ` [PATCH v2 2/4] iommu: rename wrapper functions Paul Durrant
2018-12-04 14:51   ` Jan Beulich
2018-12-04 15:00     ` Paul Durrant
2018-12-03 17:40 ` [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations Paul Durrant
2018-12-04 15:16   ` Jan Beulich
2018-12-04 15:36     ` Paul Durrant
2018-12-04 16:01       ` Jan Beulich
2018-12-04 16:53         ` Paul Durrant
2018-12-04 17:20           ` Jan Beulich
2018-12-03 17:40 ` [PATCH v2 4/4] x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order() Paul Durrant
2018-12-04 15:20   ` Jan Beulich
2018-12-04 15:22     ` Paul Durrant
2018-12-04 15:36       ` Jan Beulich
2018-12-04 15:51   ` Jan Beulich
