* [PATCH v9 0/8] IOMMU cleanup
@ 2020-09-15  8:29 Paul Durrant
  2020-09-15  8:29 ` [PATCH v9 1/8] x86/iommu: convert VT-d code to use new page table allocator Paul Durrant
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Paul Durrant @ 2020-09-15  8:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Andrew Cooper, Bertrand Marquis, George Dunlap,
	Ian Jackson, Jan Beulich, Julien Grall, Jun Nakajima, Kevin Tian,
	Oleksandr Tyshchenko, Roger Pau Monné,
	Stefano Stabellini, Wei Liu

From: Paul Durrant <pdurrant@amazon.com>

Paul Durrant (8):
  x86/iommu: convert VT-d code to use new page table allocator
  iommu: remove unused iommu_ops method and tasklet
  iommu: flush I/O TLB if iommu_map() or iommu_unmap() fail
  iommu: make map and unmap take a page count, similar to flush
  remove remaining uses of iommu_legacy_map/unmap
  common/grant_table: batch flush I/O TLB
  iommu: remove the share_p2m operation
  iommu: stop calling IOMMU page tables 'p2m tables'

 xen/arch/x86/mm.c                           |  22 +-
 xen/arch/x86/mm/p2m-ept.c                   |  21 +-
 xen/arch/x86/mm/p2m-pt.c                    |  16 +-
 xen/arch/x86/mm/p2m.c                       |  27 ++-
 xen/arch/x86/x86_64/mm.c                    |  20 +-
 xen/common/grant_table.c                    | 208 ++++++++++++------
 xen/common/memory.c                         |   7 +-
 xen/drivers/passthrough/amd/iommu.h         |   2 +-
 xen/drivers/passthrough/amd/iommu_map.c     |   4 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  20 +-
 xen/drivers/passthrough/arm/ipmmu-vmsa.c    |   2 +-
 xen/drivers/passthrough/arm/smmu.c          |   2 +-
 xen/drivers/passthrough/iommu.c             | 118 +++-------
 xen/drivers/passthrough/vtd/extern.h        |   2 +-
 xen/drivers/passthrough/vtd/iommu.c         | 225 +++++++++-----------
 xen/drivers/passthrough/vtd/x86/vtd.c       |   2 +-
 xen/drivers/passthrough/x86/iommu.c         |   2 +-
 xen/include/xen/iommu.h                     |  36 +---
 18 files changed, 372 insertions(+), 364 deletions(-)
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Oleksandr Tyshchenko <olekstysh@gmail.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>
-- 
2.20.1




* [PATCH v9 1/8] x86/iommu: convert VT-d code to use new page table allocator
  2020-09-15  8:29 [PATCH v9 0/8] IOMMU cleanup Paul Durrant
@ 2020-09-15  8:29 ` Paul Durrant
  2020-09-17  6:33   ` Tian, Kevin
  2020-09-15  8:29 ` [PATCH v9 2/8] iommu: remove unused iommu_ops method and tasklet Paul Durrant
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Paul Durrant @ 2020-09-15  8:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Jan Beulich, Kevin Tian

From: Paul Durrant <pdurrant@amazon.com>

This patch converts the VT-d code to use the new IOMMU page table allocator
function. This allows all the freeing code to be removed (since it is now
handled by the general x86 code), which reduces TLB and cache thrashing and
shortens the code.

The scope of the mapping_lock in intel_iommu_quarantine_init() has also been
increased slightly; it should have always covered accesses to
'arch.vtd.pgd_maddr'.

NOTE: The common IOMMU code needs a slight modification to avoid scheduling the
      cleanup tasklet if the free_page_table() method is not present (since
      the tasklet would unconditionally call it).
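
For reference, the resulting allocation pattern is roughly as follows (a sketch
drawn from the hunks below; locking and error handling largely elided):

    struct page_info *pg = iommu_alloc_pgtable(d);

    if ( !pg )
        return -ENOMEM;

    /* VT-d continues to track the table by its machine address. */
    hd->arch.vtd.pgd_maddr = page_to_maddr(pg);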

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Cc: Kevin Tian <kevin.tian@intel.com>

v2:
 - New in v2 (split from "add common page-table allocator")
---
 xen/drivers/passthrough/iommu.c     |   6 +-
 xen/drivers/passthrough/vtd/iommu.c | 101 ++++++++++------------------
 2 files changed, 39 insertions(+), 68 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 1d644844ab..2b1db8022c 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -225,8 +225,10 @@ static void iommu_teardown(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
 
-    hd->platform_ops->teardown(d);
-    tasklet_schedule(&iommu_pt_cleanup_tasklet);
+    iommu_vcall(hd->platform_ops, teardown, d);
+
+    if ( hd->platform_ops->free_page_table )
+        tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
 void iommu_domain_destroy(struct domain *d)
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 94e0455a4d..607e8b5e65 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -265,10 +265,15 @@ static u64 addr_to_dma_page_maddr(struct domain *domain, u64 addr, int alloc)
 
     addr &= (((u64)1) << addr_width) - 1;
     ASSERT(spin_is_locked(&hd->arch.mapping_lock));
-    if ( !hd->arch.vtd.pgd_maddr &&
-         (!alloc ||
-          ((hd->arch.vtd.pgd_maddr = alloc_pgtable_maddr(1, hd->node)) == 0)) )
-        goto out;
+    if ( !hd->arch.vtd.pgd_maddr )
+    {
+        struct page_info *pg;
+
+        if ( !alloc || !(pg = iommu_alloc_pgtable(domain)) )
+            goto out;
+
+        hd->arch.vtd.pgd_maddr = page_to_maddr(pg);
+    }
 
     parent = (struct dma_pte *)map_vtd_domain_page(hd->arch.vtd.pgd_maddr);
     while ( level > 1 )
@@ -279,13 +284,16 @@ static u64 addr_to_dma_page_maddr(struct domain *domain, u64 addr, int alloc)
         pte_maddr = dma_pte_addr(*pte);
         if ( !pte_maddr )
         {
+            struct page_info *pg;
+
             if ( !alloc )
                 break;
 
-            pte_maddr = alloc_pgtable_maddr(1, hd->node);
-            if ( !pte_maddr )
+            pg = iommu_alloc_pgtable(domain);
+            if ( !pg )
                 break;
 
+            pte_maddr = page_to_maddr(pg);
             dma_set_pte_addr(*pte, pte_maddr);
 
             /*
@@ -675,45 +683,6 @@ static void dma_pte_clear_one(struct domain *domain, uint64_t addr,
     unmap_vtd_domain_page(page);
 }
 
-static void iommu_free_pagetable(u64 pt_maddr, int level)
-{
-    struct page_info *pg = maddr_to_page(pt_maddr);
-
-    if ( pt_maddr == 0 )
-        return;
-
-    PFN_ORDER(pg) = level;
-    spin_lock(&iommu_pt_cleanup_lock);
-    page_list_add_tail(pg, &iommu_pt_cleanup_list);
-    spin_unlock(&iommu_pt_cleanup_lock);
-}
-
-static void iommu_free_page_table(struct page_info *pg)
-{
-    unsigned int i, next_level = PFN_ORDER(pg) - 1;
-    u64 pt_maddr = page_to_maddr(pg);
-    struct dma_pte *pt_vaddr, *pte;
-
-    PFN_ORDER(pg) = 0;
-    pt_vaddr = (struct dma_pte *)map_vtd_domain_page(pt_maddr);
-
-    for ( i = 0; i < PTE_NUM; i++ )
-    {
-        pte = &pt_vaddr[i];
-        if ( !dma_pte_present(*pte) )
-            continue;
-
-        if ( next_level >= 1 )
-            iommu_free_pagetable(dma_pte_addr(*pte), next_level);
-
-        dma_clear_pte(*pte);
-        iommu_sync_cache(pte, sizeof(struct dma_pte));
-    }
-
-    unmap_vtd_domain_page(pt_vaddr);
-    free_pgtable_maddr(pt_maddr);
-}
-
 static int iommu_set_root_entry(struct vtd_iommu *iommu)
 {
     u32 sts;
@@ -1748,16 +1717,7 @@ static void iommu_domain_teardown(struct domain *d)
         xfree(mrmrr);
     }
 
-    ASSERT(is_iommu_enabled(d));
-
-    if ( iommu_use_hap_pt(d) )
-        return;
-
-    spin_lock(&hd->arch.mapping_lock);
-    iommu_free_pagetable(hd->arch.vtd.pgd_maddr,
-                         agaw_to_level(hd->arch.vtd.agaw));
     hd->arch.vtd.pgd_maddr = 0;
-    spin_unlock(&hd->arch.mapping_lock);
 }
 
 static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn,
@@ -2669,23 +2629,28 @@ static void vtd_dump_p2m_table(struct domain *d)
 static int __init intel_iommu_quarantine_init(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
+    struct page_info *pg;
     struct dma_pte *parent;
     unsigned int agaw = width_to_agaw(DEFAULT_DOMAIN_ADDRESS_WIDTH);
     unsigned int level = agaw_to_level(agaw);
-    int rc;
+    int rc = 0;
+
+    spin_lock(&hd->arch.mapping_lock);
 
     if ( hd->arch.vtd.pgd_maddr )
     {
         ASSERT_UNREACHABLE();
-        return 0;
+        goto out;
     }
 
-    spin_lock(&hd->arch.mapping_lock);
+    pg = iommu_alloc_pgtable(d);
 
-    hd->arch.vtd.pgd_maddr = alloc_pgtable_maddr(1, hd->node);
-    if ( !hd->arch.vtd.pgd_maddr )
+    rc = -ENOMEM;
+    if ( !pg )
         goto out;
 
+    hd->arch.vtd.pgd_maddr = page_to_maddr(pg);
+
     parent = map_vtd_domain_page(hd->arch.vtd.pgd_maddr);
     while ( level )
     {
@@ -2697,10 +2662,12 @@ static int __init intel_iommu_quarantine_init(struct domain *d)
          * page table pages, and the resulting allocations are always
          * zeroed.
          */
-        maddr = alloc_pgtable_maddr(1, hd->node);
-        if ( !maddr )
-            break;
+        pg = iommu_alloc_pgtable(d);
+
+        if ( !pg )
+            goto out;
 
+        maddr = page_to_maddr(pg);
         for ( offset = 0; offset < PTE_NUM; offset++ )
         {
             struct dma_pte *pte = &parent[offset];
@@ -2716,13 +2683,16 @@ static int __init intel_iommu_quarantine_init(struct domain *d)
     }
     unmap_vtd_domain_page(parent);
 
+    rc = 0;
+
  out:
     spin_unlock(&hd->arch.mapping_lock);
 
-    rc = iommu_flush_iotlb_all(d);
+    if ( !rc )
+        rc = iommu_flush_iotlb_all(d);
 
-    /* Pages leaked in failure case */
-    return level ? -ENOMEM : rc;
+    /* Pages may be leaked in failure case */
+    return rc;
 }
 
 static struct iommu_ops __initdata vtd_ops = {
@@ -2737,7 +2707,6 @@ static struct iommu_ops __initdata vtd_ops = {
     .map_page = intel_iommu_map_page,
     .unmap_page = intel_iommu_unmap_page,
     .lookup_page = intel_iommu_lookup_page,
-    .free_page_table = iommu_free_page_table,
     .reassign_device = reassign_device_ownership,
     .get_device_group_id = intel_iommu_group_id,
     .enable_x2apic = intel_iommu_enable_eim,
-- 
2.20.1




* [PATCH v9 2/8] iommu: remove unused iommu_ops method and tasklet
  2020-09-15  8:29 [PATCH v9 0/8] IOMMU cleanup Paul Durrant
  2020-09-15  8:29 ` [PATCH v9 1/8] x86/iommu: convert VT-d code to use new page table allocator Paul Durrant
@ 2020-09-15  8:29 ` Paul Durrant
  2020-09-15  8:29 ` [PATCH v9 3/8] iommu: flush I/O TLB if iommu_map() or iommu_unmap() fail Paul Durrant
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Paul Durrant @ 2020-09-15  8:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Jan Beulich

From: Paul Durrant <pdurrant@amazon.com>

The VT-d and AMD IOMMU implementations both use the general x86 IOMMU page
table allocator, and ARM always shares page tables with the CPU. Hence there is
no need to retain the free_page_table() method or the tasklet which invokes it.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---

v2:
  - New in v2 (split from "add common page-table allocator")
---
 xen/drivers/passthrough/iommu.c | 25 -------------------------
 xen/include/xen/iommu.h         |  2 --
 2 files changed, 27 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 2b1db8022c..660dc5deb2 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -49,10 +49,6 @@ bool_t __read_mostly amd_iommu_perdev_intremap = 1;
 
 DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
 
-DEFINE_SPINLOCK(iommu_pt_cleanup_lock);
-PAGE_LIST_HEAD(iommu_pt_cleanup_list);
-static struct tasklet iommu_pt_cleanup_tasklet;
-
 static int __init parse_iommu_param(const char *s)
 {
     const char *ss;
@@ -226,9 +222,6 @@ static void iommu_teardown(struct domain *d)
     struct domain_iommu *hd = dom_iommu(d);
 
     iommu_vcall(hd->platform_ops, teardown, d);
-
-    if ( hd->platform_ops->free_page_table )
-        tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
 void iommu_domain_destroy(struct domain *d)
@@ -368,23 +361,6 @@ int iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
     return iommu_call(hd->platform_ops, lookup_page, d, dfn, mfn, flags);
 }
 
-static void iommu_free_pagetables(void *unused)
-{
-    do {
-        struct page_info *pg;
-
-        spin_lock(&iommu_pt_cleanup_lock);
-        pg = page_list_remove_head(&iommu_pt_cleanup_list);
-        spin_unlock(&iommu_pt_cleanup_lock);
-        if ( !pg )
-            return;
-        iommu_vcall(iommu_get_ops(), free_page_table, pg);
-    } while ( !softirq_pending(smp_processor_id()) );
-
-    tasklet_schedule_on_cpu(&iommu_pt_cleanup_tasklet,
-                            cpumask_cycle(smp_processor_id(), &cpu_online_map));
-}
-
 int iommu_iotlb_flush(struct domain *d, dfn_t dfn, unsigned int page_count,
                       unsigned int flush_flags)
 {
@@ -508,7 +484,6 @@ int __init iommu_setup(void)
 #ifndef iommu_intremap
         printk("Interrupt remapping %sabled\n", iommu_intremap ? "en" : "dis");
 #endif
-        tasklet_init(&iommu_pt_cleanup_tasklet, iommu_free_pagetables, NULL);
     }
 
     return rc;
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 3272874958..1831dc66b0 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -263,8 +263,6 @@ struct iommu_ops {
     int __must_check (*lookup_page)(struct domain *d, dfn_t dfn, mfn_t *mfn,
                                     unsigned int *flags);
 
-    void (*free_page_table)(struct page_info *);
-
 #ifdef CONFIG_X86
     int (*enable_x2apic)(void);
     void (*disable_x2apic)(void);
-- 
2.20.1




* [PATCH v9 3/8] iommu: flush I/O TLB if iommu_map() or iommu_unmap() fail
  2020-09-15  8:29 [PATCH v9 0/8] IOMMU cleanup Paul Durrant
  2020-09-15  8:29 ` [PATCH v9 1/8] x86/iommu: convert VT-d code to use new page table allocator Paul Durrant
  2020-09-15  8:29 ` [PATCH v9 2/8] iommu: remove unused iommu_ops method and tasklet Paul Durrant
@ 2020-09-15  8:29 ` Paul Durrant
  2020-09-15  8:29 ` [PATCH v9 4/8] iommu: make map and unmap take a page count, similar to flush Paul Durrant
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Paul Durrant @ 2020-09-15  8:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Jan Beulich

From: Paul Durrant <pdurrant@amazon.com>

This patch adds a full I/O TLB flush to the error paths of iommu_map() and
iommu_unmap().

Without this change, callers need constructs such as:

rc = iommu_map/unmap(...)
err = iommu_flush(...)
if ( !rc )
  rc = err;

With this change, it can be simplified to:

rc = iommu_map/unmap(...)
if ( !rc )
  rc = iommu_flush(...)

because, if the map or unmap fails, the flush will be unnecessary. This saves
a stack variable and generally makes the call sites tidier.
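
The caller-side flush can be skipped on failure because iommu_map() and
iommu_unmap() now perform it themselves; the added error handling amounts to
the following (a sketch of the hunks below):

    /*
     * If something went wrong with a multi-page operation, flush everything
     * and clear the flush flags.
     */
    if ( page_order && unlikely(rc) &&
         !iommu_iotlb_flush_all(d, *flush_flags) )
        *flush_flags = 0;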

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---

v6:
 - Remove stray blank

v5:
 - Avoid unnecessary flushing if 'page_order' is 0
 - Add missing indirection on 'flush_flags'

v2:
 - New in v2
---
 xen/drivers/passthrough/iommu.c | 34 +++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 660dc5deb2..eb65631d59 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -274,6 +274,13 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
         break;
     }
 
+    /*
+     * Something went wrong so, if we were dealing with more than a single
+     * page, flush everything and clear flush flags.
+     */
+    if ( page_order && unlikely(rc) && !iommu_iotlb_flush_all(d, *flush_flags) )
+        *flush_flags = 0;
+
     return rc;
 }
 
@@ -283,14 +290,8 @@ int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
     unsigned int flush_flags = 0;
     int rc = iommu_map(d, dfn, mfn, page_order, flags, &flush_flags);
 
-    if ( !this_cpu(iommu_dont_flush_iotlb) )
-    {
-        int err = iommu_iotlb_flush(d, dfn, (1u << page_order),
-                                    flush_flags);
-
-        if ( !rc )
-            rc = err;
-    }
+    if ( !this_cpu(iommu_dont_flush_iotlb) && !rc )
+        rc = iommu_iotlb_flush(d, dfn, (1u << page_order), flush_flags);
 
     return rc;
 }
@@ -330,6 +331,13 @@ int iommu_unmap(struct domain *d, dfn_t dfn, unsigned int page_order,
         }
     }
 
+    /*
+     * Something went wrong so, if we were dealing with more than a single
+     * page, flush everything and clear flush flags.
+     */
+    if ( page_order && unlikely(rc) && !iommu_iotlb_flush_all(d, *flush_flags) )
+        *flush_flags = 0;
+
     return rc;
 }
 
@@ -338,14 +346,8 @@ int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
     unsigned int flush_flags = 0;
     int rc = iommu_unmap(d, dfn, page_order, &flush_flags);
 
-    if ( !this_cpu(iommu_dont_flush_iotlb) )
-    {
-        int err = iommu_iotlb_flush(d, dfn, (1u << page_order),
-                                    flush_flags);
-
-        if ( !rc )
-            rc = err;
-    }
+    if ( !this_cpu(iommu_dont_flush_iotlb) && !rc )
+        rc = iommu_iotlb_flush(d, dfn, (1u << page_order), flush_flags);
 
     return rc;
 }
-- 
2.20.1




* [PATCH v9 4/8] iommu: make map and unmap take a page count, similar to flush
  2020-09-15  8:29 [PATCH v9 0/8] IOMMU cleanup Paul Durrant
                   ` (2 preceding siblings ...)
  2020-09-15  8:29 ` [PATCH v9 3/8] iommu: flush I/O TLB if iommu_map() or iommu_unmap() fail Paul Durrant
@ 2020-09-15  8:29 ` Paul Durrant
  2020-09-17  6:38   ` Tian, Kevin
  2020-09-17 16:51   ` Wei Liu
  2020-09-15  8:29 ` [PATCH v9 5/8] remove remaining uses of iommu_legacy_map/unmap Paul Durrant
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 19+ messages in thread
From: Paul Durrant @ 2020-09-15  8:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Jan Beulich, Julien Grall, Andrew Cooper, Wei Liu,
	Roger Pau Monné,
	George Dunlap, Ian Jackson, Stefano Stabellini, Jun Nakajima,
	Kevin Tian, Bertrand Marquis, Oleksandr Tyshchenko

From: Paul Durrant <pdurrant@amazon.com>

At the moment, iommu_map() and iommu_unmap() take a page order rather than a
count, whereas iommu_iotlb_flush() takes a page count rather than an order.
This patch makes them consistent with each other, opting for a page count,
since CPU page orders are not necessarily the same as those of an IOMMU.

NOTE: The 'page_count' parameter is also made an unsigned long in all the
      aforementioned functions.
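
The resulting interface takes a count directly, e.g. (a sketch drawn from the
hunks below):

    int __must_check iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
                               unsigned long page_count, unsigned int flags,
                               unsigned int *flush_flags);

so an order-based call site now simply passes '1ul << PAGE_ORDER_4K' (or
whatever count is appropriate) instead of the order.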

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien@xen.org>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>
Cc: Oleksandr Tyshchenko <olekstysh@gmail.com>

v8:
 - Fix IPMMU-VMSA code too

v7:
 - Fix ARM build

v6:
 - Fix missing conversion to unsigned long in AMD code
 - Fixed unconverted call to iommu_legacy_unmap() in x86/mm.c
 - s/1ul/1 in the grant table code

v5:
 - Re-worked to just use a page count, rather than both a count and an order

v2:
 - New in v2
---
 xen/arch/x86/mm.c                        |  8 ++++--
 xen/arch/x86/mm/p2m-ept.c                |  6 ++--
 xen/arch/x86/mm/p2m-pt.c                 |  4 +--
 xen/arch/x86/mm/p2m.c                    |  5 ++--
 xen/arch/x86/x86_64/mm.c                 |  4 +--
 xen/common/grant_table.c                 |  6 ++--
 xen/drivers/passthrough/amd/iommu.h      |  2 +-
 xen/drivers/passthrough/amd/iommu_map.c  |  4 +--
 xen/drivers/passthrough/arm/ipmmu-vmsa.c |  2 +-
 xen/drivers/passthrough/arm/smmu.c       |  2 +-
 xen/drivers/passthrough/iommu.c          | 35 +++++++++++-------------
 xen/drivers/passthrough/vtd/iommu.c      |  4 +--
 xen/drivers/passthrough/x86/iommu.c      |  2 +-
 xen/include/xen/iommu.h                  | 12 ++++----
 14 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 42a6dc9ba4..9447dabc50 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2441,7 +2441,7 @@ static int cleanup_page_mappings(struct page_info *page)
 
         if ( d && unlikely(need_iommu_pt_sync(d)) && is_pv_domain(d) )
         {
-            int rc2 = iommu_legacy_unmap(d, _dfn(mfn), PAGE_ORDER_4K);
+            int rc2 = iommu_legacy_unmap(d, _dfn(mfn), 1u << PAGE_ORDER_4K);
 
             if ( !rc )
                 rc = rc2;
@@ -2968,9 +2968,11 @@ static int _get_page_type(struct page_info *page, unsigned long type,
             mfn_t mfn = page_to_mfn(page);
 
             if ( (x & PGT_type_mask) == PGT_writable_page )
-                rc = iommu_legacy_unmap(d, _dfn(mfn_x(mfn)), PAGE_ORDER_4K);
+                rc = iommu_legacy_unmap(d, _dfn(mfn_x(mfn)),
+                                        1ul << PAGE_ORDER_4K);
             else
-                rc = iommu_legacy_map(d, _dfn(mfn_x(mfn)), mfn, PAGE_ORDER_4K,
+                rc = iommu_legacy_map(d, _dfn(mfn_x(mfn)), mfn,
+                                      1ul << PAGE_ORDER_4K,
                                       IOMMUF_readable | IOMMUF_writable);
 
             if ( unlikely(rc) )
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index b8154a7ecc..12cf38f6eb 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -843,14 +843,14 @@ out:
          need_modify_vtd_table )
     {
         if ( iommu_use_hap_pt(d) )
-            rc = iommu_iotlb_flush(d, _dfn(gfn), (1u << order),
+            rc = iommu_iotlb_flush(d, _dfn(gfn), 1ul << order,
                                    (iommu_flags ? IOMMU_FLUSHF_added : 0) |
                                    (vtd_pte_present ? IOMMU_FLUSHF_modified
                                                     : 0));
         else if ( need_iommu_pt_sync(d) )
             rc = iommu_flags ?
-                iommu_legacy_map(d, _dfn(gfn), mfn, order, iommu_flags) :
-                iommu_legacy_unmap(d, _dfn(gfn), order);
+                iommu_legacy_map(d, _dfn(gfn), mfn, 1ul << order, iommu_flags) :
+                iommu_legacy_unmap(d, _dfn(gfn), 1ul << order);
     }
 
     unmap_domain_page(table);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index badb26bc34..3af51be78e 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -679,9 +679,9 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
     if ( need_iommu_pt_sync(p2m->domain) &&
          (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) )
         rc = iommu_pte_flags
-             ? iommu_legacy_map(d, _dfn(gfn), mfn, page_order,
+             ? iommu_legacy_map(d, _dfn(gfn), mfn, 1ul << page_order,
                                 iommu_pte_flags)
-             : iommu_legacy_unmap(d, _dfn(gfn), page_order);
+             : iommu_legacy_unmap(d, _dfn(gfn), 1ul << page_order);
 
     /*
      * Free old intermediate tables if necessary.  This has to be the
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index db7bde0230..928344be30 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1352,7 +1352,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
     {
         if ( !is_iommu_enabled(d) )
             return 0;
-        return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
+        return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l),
+                                1ul << PAGE_ORDER_4K,
                                 IOMMUF_readable | IOMMUF_writable);
     }
 
@@ -1443,7 +1444,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
     {
         if ( !is_iommu_enabled(d) )
             return 0;
-        return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
+        return iommu_legacy_unmap(d, _dfn(gfn_l), 1ul << PAGE_ORDER_4K);
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 0d1aadbfce..bce1561e1a 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1286,7 +1286,7 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     {
         for ( i = spfn; i < epfn; i++ )
             if ( iommu_legacy_map(hardware_domain, _dfn(i), _mfn(i),
-                                  PAGE_ORDER_4K,
+                                  1ul << PAGE_ORDER_4K,
                                   IOMMUF_readable | IOMMUF_writable) )
                 break;
         if ( i != epfn )
@@ -1294,7 +1294,7 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
             while (i-- > old_max)
                 /* If statement to satisfy __must_check. */
                 if ( iommu_legacy_unmap(hardware_domain, _dfn(i),
-                                        PAGE_ORDER_4K) )
+                                        1ul << PAGE_ORDER_4K) )
                     continue;
 
             goto destroy_m2p;
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 9f0cae52c0..a5d3ed8bda 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1225,7 +1225,7 @@ map_grant_ref(
             kind = IOMMUF_readable;
         else
             kind = 0;
-        if ( kind && iommu_legacy_map(ld, _dfn(mfn_x(mfn)), mfn, 0, kind) )
+        if ( kind && iommu_legacy_map(ld, _dfn(mfn_x(mfn)), mfn, 1, kind) )
         {
             double_gt_unlock(lgt, rgt);
             rc = GNTST_general_error;
@@ -1479,9 +1479,9 @@ unmap_common(
 
         kind = mapkind(lgt, rd, op->mfn);
         if ( !kind )
-            err = iommu_legacy_unmap(ld, _dfn(mfn_x(op->mfn)), 0);
+            err = iommu_legacy_unmap(ld, _dfn(mfn_x(op->mfn)), 1);
         else if ( !(kind & MAPKIND_WRITE) )
-            err = iommu_legacy_map(ld, _dfn(mfn_x(op->mfn)), op->mfn, 0,
+            err = iommu_legacy_map(ld, _dfn(mfn_x(op->mfn)), op->mfn, 1,
                                    IOMMUF_readable);
 
         double_gt_unlock(lgt, rgt);
diff --git a/xen/drivers/passthrough/amd/iommu.h b/xen/drivers/passthrough/amd/iommu.h
index e2d174f3b4..f1f0415469 100644
--- a/xen/drivers/passthrough/amd/iommu.h
+++ b/xen/drivers/passthrough/amd/iommu.h
@@ -231,7 +231,7 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
                                        paddr_t phys_addr, unsigned long size,
                                        int iw, int ir);
 int __must_check amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
-                                             unsigned int page_count,
+                                             unsigned long page_count,
                                              unsigned int flush_flags);
 int __must_check amd_iommu_flush_iotlb_all(struct domain *d);
 
diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 54b991294a..0cb948d114 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -351,7 +351,7 @@ int amd_iommu_unmap_page(struct domain *d, dfn_t dfn,
     return 0;
 }
 
-static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
+static unsigned long flush_count(unsigned long dfn, unsigned long page_count,
                                  unsigned int order)
 {
     unsigned long start = dfn >> order;
@@ -362,7 +362,7 @@ static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
 }
 
 int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
-                                unsigned int page_count,
+                                unsigned long page_count,
                                 unsigned int flush_flags)
 {
     unsigned long dfn_l = dfn_x(dfn);
diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
index b2a65dfaaf..346165c3fa 100644
--- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -945,7 +945,7 @@ static int __must_check ipmmu_iotlb_flush_all(struct domain *d)
 }
 
 static int __must_check ipmmu_iotlb_flush(struct domain *d, dfn_t dfn,
-                                          unsigned int page_count,
+                                          unsigned long page_count,
                                           unsigned int flush_flags)
 {
     ASSERT(flush_flags);
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 94662a8501..06f9bda47d 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2534,7 +2534,7 @@ static int __must_check arm_smmu_iotlb_flush_all(struct domain *d)
 }
 
 static int __must_check arm_smmu_iotlb_flush(struct domain *d, dfn_t dfn,
-					     unsigned int page_count,
+					     unsigned long page_count,
 					     unsigned int flush_flags)
 {
 	ASSERT(flush_flags);
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index eb65631d59..87f9a857bb 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -235,7 +235,7 @@ void iommu_domain_destroy(struct domain *d)
 }
 
 int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
-              unsigned int page_order, unsigned int flags,
+              unsigned long page_count, unsigned int flags,
               unsigned int *flush_flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -245,10 +245,7 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
     if ( !is_iommu_enabled(d) )
         return 0;
 
-    ASSERT(IS_ALIGNED(dfn_x(dfn), (1ul << page_order)));
-    ASSERT(IS_ALIGNED(mfn_x(mfn), (1ul << page_order)));
-
-    for ( i = 0; i < (1ul << page_order); i++ )
+    for ( i = 0; i < page_count; i++ )
     {
         rc = iommu_call(hd->platform_ops, map_page, d, dfn_add(dfn, i),
                         mfn_add(mfn, i), flags, flush_flags);
@@ -278,25 +275,26 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
      * Something went wrong so, if we were dealing with more than a single
      * page, flush everything and clear flush flags.
      */
-    if ( page_order && unlikely(rc) && !iommu_iotlb_flush_all(d, *flush_flags) )
+    if ( page_count > 1 && unlikely(rc) &&
+         !iommu_iotlb_flush_all(d, *flush_flags) )
         *flush_flags = 0;
 
     return rc;
 }
 
 int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
-                     unsigned int page_order, unsigned int flags)
+                     unsigned long page_count, unsigned int flags)
 {
     unsigned int flush_flags = 0;
-    int rc = iommu_map(d, dfn, mfn, page_order, flags, &flush_flags);
+    int rc = iommu_map(d, dfn, mfn, page_count, flags, &flush_flags);
 
     if ( !this_cpu(iommu_dont_flush_iotlb) && !rc )
-        rc = iommu_iotlb_flush(d, dfn, (1u << page_order), flush_flags);
+        rc = iommu_iotlb_flush(d, dfn, page_count, flush_flags);
 
     return rc;
 }
 
-int iommu_unmap(struct domain *d, dfn_t dfn, unsigned int page_order,
+int iommu_unmap(struct domain *d, dfn_t dfn, unsigned long page_count,
                 unsigned int *flush_flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -306,9 +304,7 @@ int iommu_unmap(struct domain *d, dfn_t dfn, unsigned int page_order,
     if ( !is_iommu_enabled(d) )
         return 0;
 
-    ASSERT(IS_ALIGNED(dfn_x(dfn), (1ul << page_order)));
-
-    for ( i = 0; i < (1ul << page_order); i++ )
+    for ( i = 0; i < page_count; i++ )
     {
         int err = iommu_call(hd->platform_ops, unmap_page, d, dfn_add(dfn, i),
                              flush_flags);
@@ -335,19 +331,20 @@ int iommu_unmap(struct domain *d, dfn_t dfn, unsigned int page_order,
      * Something went wrong so, if we were dealing with more than a single
      * page, flush everything and clear flush flags.
      */
-    if ( page_order && unlikely(rc) && !iommu_iotlb_flush_all(d, *flush_flags) )
+    if ( page_count > 1 && unlikely(rc) &&
+         !iommu_iotlb_flush_all(d, *flush_flags) )
         *flush_flags = 0;
 
     return rc;
 }
 
-int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
+int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned long page_count)
 {
     unsigned int flush_flags = 0;
-    int rc = iommu_unmap(d, dfn, page_order, &flush_flags);
+    int rc = iommu_unmap(d, dfn, page_count, &flush_flags);
 
     if ( !this_cpu(iommu_dont_flush_iotlb) && !rc )
-        rc = iommu_iotlb_flush(d, dfn, (1u << page_order), flush_flags);
+        rc = iommu_iotlb_flush(d, dfn, page_count, flush_flags);
 
     return rc;
 }
@@ -363,7 +360,7 @@ int iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
     return iommu_call(hd->platform_ops, lookup_page, d, dfn, mfn, flags);
 }
 
-int iommu_iotlb_flush(struct domain *d, dfn_t dfn, unsigned int page_count,
+int iommu_iotlb_flush(struct domain *d, dfn_t dfn, unsigned long page_count,
                       unsigned int flush_flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -382,7 +379,7 @@ int iommu_iotlb_flush(struct domain *d, dfn_t dfn, unsigned int page_count,
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU IOTLB flush failed: %d, dfn %"PRI_dfn", page count %u flags %x\n",
+                   "d%d: IOMMU IOTLB flush failed: %d, dfn %"PRI_dfn", page count %lu flags %x\n",
                    d->domain_id, rc, dfn_x(dfn), page_count, flush_flags);
 
         if ( !is_hardware_domain(d) )
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 607e8b5e65..68cf0e535a 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -584,7 +584,7 @@ static int __must_check iommu_flush_all(void)
 
 static int __must_check iommu_flush_iotlb(struct domain *d, dfn_t dfn,
                                           bool_t dma_old_pte_present,
-                                          unsigned int page_count)
+                                          unsigned long page_count)
 {
     struct domain_iommu *hd = dom_iommu(d);
     struct acpi_drhd_unit *drhd;
@@ -632,7 +632,7 @@ static int __must_check iommu_flush_iotlb(struct domain *d, dfn_t dfn,
 
 static int __must_check iommu_flush_iotlb_pages(struct domain *d,
                                                 dfn_t dfn,
-                                                unsigned int page_count,
+                                                unsigned long page_count,
                                                 unsigned int flush_flags)
 {
     ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index aea07e47c4..f17b1820f4 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -244,7 +244,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
         else if ( paging_mode_translate(d) )
             rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
         else
-            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), PAGE_ORDER_4K,
+            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), 1ul << PAGE_ORDER_4K,
                            IOMMUF_readable | IOMMUF_writable, &flush_flags);
 
         if ( rc )
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 1831dc66b0..13f68dc93d 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -146,23 +146,23 @@ enum
 #define IOMMU_FLUSHF_modified (1u << _IOMMU_FLUSHF_modified)
 
 int __must_check iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
-                           unsigned int page_order, unsigned int flags,
+                           unsigned long page_count, unsigned int flags,
                            unsigned int *flush_flags);
 int __must_check iommu_unmap(struct domain *d, dfn_t dfn,
-                             unsigned int page_order,
+                             unsigned long page_count,
                              unsigned int *flush_flags);
 
 int __must_check iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
-                                  unsigned int page_order,
+                                  unsigned long page_count,
                                   unsigned int flags);
 int __must_check iommu_legacy_unmap(struct domain *d, dfn_t dfn,
-                                    unsigned int page_order);
+                                    unsigned long page_count);
 
 int __must_check iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
                                    unsigned int *flags);
 
 int __must_check iommu_iotlb_flush(struct domain *d, dfn_t dfn,
-                                   unsigned int page_count,
+                                   unsigned long page_count,
                                    unsigned int flush_flags);
 int __must_check iommu_iotlb_flush_all(struct domain *d,
                                        unsigned int flush_flags);
@@ -281,7 +281,7 @@ struct iommu_ops {
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
     int __must_check (*iotlb_flush)(struct domain *d, dfn_t dfn,
-                                    unsigned int page_count,
+                                    unsigned long page_count,
                                     unsigned int flush_flags);
     int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
-- 
2.20.1




* [PATCH v9 5/8] remove remaining uses of iommu_legacy_map/unmap
  2020-09-15  8:29 [PATCH v9 0/8] IOMMU cleanup Paul Durrant
                   ` (3 preceding siblings ...)
  2020-09-15  8:29 ` [PATCH v9 4/8] iommu: make map and unmap take a page count, similar to flush Paul Durrant
@ 2020-09-15  8:29 ` Paul Durrant
  2020-09-16  7:53   ` Julien Grall
  2020-09-17  7:40   ` Tian, Kevin
  2020-09-15  8:29 ` [PATCH v9 6/8] common/grant_table: batch flush I/O TLB Paul Durrant
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 19+ messages in thread
From: Paul Durrant @ 2020-09-15  8:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Jan Beulich, Andrew Cooper, Wei Liu,
	Roger Pau Monné,
	George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Jun Nakajima, Kevin Tian

From: Paul Durrant <pdurrant@amazon.com>

The 'legacy' functions do implicit flushing, so amend their callers to do the
appropriate flushing explicitly.

Unfortunately, because of the structure of the P2M code, we cannot remove
the per-CPU 'iommu_dont_flush_iotlb' global and the optimization it
facilitates. Checking of this flag is now done only in relevant callers of
iommu_iotlb_flush(). Also, 'iommu_dont_flush_iotlb' is now declared
as bool (rather than bool_t) and setting/clearing it are no longer pointlessly
gated on is_iommu_enabled() returning true. (Arguably it is also pointless to
gate the call to iommu_iotlb_flush() on that condition - since it is a no-op
in that case - but the if clause allows the scope of a stack variable to be
restricted).

NOTE: The code in memory_add() now sets 'ret' if iommu_map() or
      iommu_iotlb_flush() fails.
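
The typical replacement pattern at a call site is therefore (a sketch drawn
from the hunks below):

    unsigned int flush_flags = 0;

    rc = iommu_map(d, dfn, mfn, 1ul << PAGE_ORDER_4K,
                   IOMMUF_readable | IOMMUF_writable, &flush_flags);
    if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
        rc = iommu_iotlb_flush(d, dfn, 1ul << PAGE_ORDER_4K, flush_flags);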

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v9:
 - Moved check of 'iommu_dont_flush_iotlb' out of iommu_iotlb_flush() and
   into callers to avoid re-introducing a problem on Arm
 - Dropped Jan's R-b due to change

v6:
 - Fix formatting problem in memory_add()

v5:
 - Re-base
 - Removed failure case on overflow of unsigned int as it is no longer
   necessary

v3:
 - Same as v2; elected to implement batch flushing in the grant table code as
   a subsequent patch

v2:
 - Shorten the diff (mainly because of a prior patch introducing automatic
   flush-on-fail into iommu_map() and iommu_unmap())
---
 xen/arch/x86/mm.c               | 24 +++++++++++++++++-------
 xen/arch/x86/mm/p2m-ept.c       | 21 +++++++++++++--------
 xen/arch/x86/mm/p2m-pt.c        | 16 ++++++++++++----
 xen/arch/x86/mm/p2m.c           | 25 +++++++++++++++++--------
 xen/arch/x86/x86_64/mm.c        | 20 +++++++-------------
 xen/common/grant_table.c        | 29 ++++++++++++++++++++++-------
 xen/common/memory.c             |  7 +++----
 xen/drivers/passthrough/iommu.c | 23 -----------------------
 xen/include/xen/iommu.h         | 21 +++++----------------
 9 files changed, 96 insertions(+), 90 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 9447dabc50..fec8a19920 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2441,10 +2441,16 @@ static int cleanup_page_mappings(struct page_info *page)
 
         if ( d && unlikely(need_iommu_pt_sync(d)) && is_pv_domain(d) )
         {
-            int rc2 = iommu_legacy_unmap(d, _dfn(mfn), 1u << PAGE_ORDER_4K);
+            unsigned int flush_flags = 0;
+            int err;
+
+            err = iommu_unmap(d, _dfn(mfn), 1ul << PAGE_ORDER_4K, &flush_flags);
+            if ( !err && !this_cpu(iommu_dont_flush_iotlb) )
+                err = iommu_iotlb_flush(d, _dfn(mfn), 1ul << PAGE_ORDER_4K,
+                                        flush_flags);
 
             if ( !rc )
-                rc = rc2;
+                rc = err;
         }
 
         if ( likely(!is_special_page(page)) )
@@ -2966,14 +2972,18 @@ static int _get_page_type(struct page_info *page, unsigned long type,
         if ( d && unlikely(need_iommu_pt_sync(d)) && is_pv_domain(d) )
         {
             mfn_t mfn = page_to_mfn(page);
+            dfn_t dfn = _dfn(mfn_x(mfn));
+            unsigned int flush_flags = 0;
 
             if ( (x & PGT_type_mask) == PGT_writable_page )
-                rc = iommu_legacy_unmap(d, _dfn(mfn_x(mfn)),
-                                        1ul << PAGE_ORDER_4K);
+                rc = iommu_unmap(d, dfn, 1ul << PAGE_ORDER_4K, &flush_flags);
             else
-                rc = iommu_legacy_map(d, _dfn(mfn_x(mfn)), mfn,
-                                      1ul << PAGE_ORDER_4K,
-                                      IOMMUF_readable | IOMMUF_writable);
+                rc = iommu_map(d, dfn, mfn, 1ul << PAGE_ORDER_4K,
+                               IOMMUF_readable | IOMMUF_writable, &flush_flags);
+
+            if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
+                rc = iommu_iotlb_flush(d, dfn, 1ul << PAGE_ORDER_4K,
+                                       flush_flags);
 
             if ( unlikely(rc) )
             {
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 12cf38f6eb..71473fd31d 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -842,15 +842,20 @@ out:
     if ( rc == 0 && p2m_is_hostp2m(p2m) &&
          need_modify_vtd_table )
     {
-        if ( iommu_use_hap_pt(d) )
-            rc = iommu_iotlb_flush(d, _dfn(gfn), 1ul << order,
-                                   (iommu_flags ? IOMMU_FLUSHF_added : 0) |
-                                   (vtd_pte_present ? IOMMU_FLUSHF_modified
-                                                    : 0));
-        else if ( need_iommu_pt_sync(d) )
+        unsigned int flush_flags = 0;
+
+        if ( need_iommu_pt_sync(d) )
             rc = iommu_flags ?
-                iommu_legacy_map(d, _dfn(gfn), mfn, 1ul << order, iommu_flags) :
-                iommu_legacy_unmap(d, _dfn(gfn), 1ul << order);
+                iommu_map(d, _dfn(gfn), mfn, 1ul << order, iommu_flags,
+                          &flush_flags) :
+                iommu_unmap(d, _dfn(gfn), 1ul << order, &flush_flags);
+        else if ( iommu_use_hap_pt(d) )
+            flush_flags =
+                (iommu_flags ? IOMMU_FLUSHF_added : 0) |
+                (vtd_pte_present ? IOMMU_FLUSHF_modified : 0);
+
+        if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
+            rc = iommu_iotlb_flush(d, _dfn(gfn), 1ul << order, flush_flags);
     }
 
     unmap_domain_page(table);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 3af51be78e..96cacb6748 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -678,10 +678,18 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
 
     if ( need_iommu_pt_sync(p2m->domain) &&
          (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) )
-        rc = iommu_pte_flags
-             ? iommu_legacy_map(d, _dfn(gfn), mfn, 1ul << page_order,
-                                iommu_pte_flags)
-             : iommu_legacy_unmap(d, _dfn(gfn), 1ul << page_order);
+    {
+        unsigned int flush_flags = 0;
+
+        rc = iommu_pte_flags ?
+            iommu_map(d, _dfn(gfn), mfn, 1ul << page_order, iommu_pte_flags,
+                      &flush_flags) :
+            iommu_unmap(d, _dfn(gfn), 1ul << page_order, &flush_flags);
+
+        if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
+            rc = iommu_iotlb_flush(d, _dfn(gfn), 1ul << page_order,
+                                   flush_flags);
+    }
 
     /*
      * Free old intermediate tables if necessary.  This has to be the
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 928344be30..01ff92862d 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1350,11 +1350,15 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
 
     if ( !paging_mode_translate(p2m->domain) )
     {
-        if ( !is_iommu_enabled(d) )
-            return 0;
-        return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l),
-                                1ul << PAGE_ORDER_4K,
-                                IOMMUF_readable | IOMMUF_writable);
+        unsigned int flush_flags = 0;
+
+        ret = iommu_map(d, _dfn(gfn_l), _mfn(gfn_l), 1ul << PAGE_ORDER_4K,
+                        IOMMUF_readable | IOMMUF_writable, &flush_flags);
+        if ( !ret )
+            ret = iommu_iotlb_flush(d, _dfn(gfn_l), 1ul << PAGE_ORDER_4K,
+                                    flush_flags);
+
+        return ret;
     }
 
     gfn_lock(p2m, gfn, 0);
@@ -1442,9 +1446,14 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
 
     if ( !paging_mode_translate(d) )
     {
-        if ( !is_iommu_enabled(d) )
-            return 0;
-        return iommu_legacy_unmap(d, _dfn(gfn_l), 1ul << PAGE_ORDER_4K);
+        unsigned int flush_flags = 0;
+
+        ret = iommu_unmap(d, _dfn(gfn_l), 1ul << PAGE_ORDER_4K, &flush_flags);
+        if ( !ret )
+            ret = iommu_iotlb_flush(d, _dfn(gfn_l), 1ul << PAGE_ORDER_4K,
+                                    flush_flags);
+
+        return ret;
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index bce1561e1a..7e9d165449 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1284,21 +1284,15 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
          !iommu_use_hap_pt(hardware_domain) &&
          !need_iommu_pt_sync(hardware_domain) )
     {
-        for ( i = spfn; i < epfn; i++ )
-            if ( iommu_legacy_map(hardware_domain, _dfn(i), _mfn(i),
-                                  1ul << PAGE_ORDER_4K,
-                                  IOMMUF_readable | IOMMUF_writable) )
-                break;
-        if ( i != epfn )
-        {
-            while (i-- > old_max)
-                /* If statement to satisfy __must_check. */
-                if ( iommu_legacy_unmap(hardware_domain, _dfn(i),
-                                        1ul << PAGE_ORDER_4K) )
-                    continue;
+        unsigned int flush_flags = 0;
+        unsigned long n = epfn - spfn;
 
+        ret = iommu_map(hardware_domain, _dfn(i), _mfn(i), n,
+                        IOMMUF_readable | IOMMUF_writable, &flush_flags);
+        if ( !ret )
+            ret = iommu_iotlb_flush(hardware_domain, _dfn(i), n, flush_flags);
+        if ( ret )
             goto destroy_m2p;
-        }
     }
 
     /* We can't revert any more */
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index a5d3ed8bda..beb6b2d40d 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1225,11 +1225,22 @@ map_grant_ref(
             kind = IOMMUF_readable;
         else
             kind = 0;
-        if ( kind && iommu_legacy_map(ld, _dfn(mfn_x(mfn)), mfn, 1, kind) )
+
+        if ( kind )
         {
-            double_gt_unlock(lgt, rgt);
-            rc = GNTST_general_error;
-            goto undo_out;
+            dfn_t dfn = _dfn(mfn_x(mfn));
+            unsigned int flush_flags = 0;
+            int err;
+
+            err = iommu_map(ld, dfn, mfn, 1, kind, &flush_flags);
+            if ( !err )
+                err = iommu_iotlb_flush(ld, dfn, 1, flush_flags);
+            if ( err )
+            {
+                double_gt_unlock(lgt, rgt);
+                rc = GNTST_general_error;
+                goto undo_out;
+            }
         }
     }
 
@@ -1473,19 +1484,23 @@ unmap_common(
     if ( rc == GNTST_okay && gnttab_need_iommu_mapping(ld) )
     {
         unsigned int kind;
+        dfn_t dfn = _dfn(mfn_x(op->mfn));
+        unsigned int flush_flags = 0;
         int err = 0;
 
         double_gt_lock(lgt, rgt);
 
         kind = mapkind(lgt, rd, op->mfn);
         if ( !kind )
-            err = iommu_legacy_unmap(ld, _dfn(mfn_x(op->mfn)), 1);
+            err = iommu_unmap(ld, dfn, 1, &flush_flags);
         else if ( !(kind & MAPKIND_WRITE) )
-            err = iommu_legacy_map(ld, _dfn(mfn_x(op->mfn)), op->mfn, 1,
-                                   IOMMUF_readable);
+            err = iommu_map(ld, dfn, op->mfn, 1, IOMMUF_readable,
+                            &flush_flags);
 
         double_gt_unlock(lgt, rgt);
 
+        if ( !err )
+            err = iommu_iotlb_flush(ld, dfn, 1, flush_flags);
         if ( err )
             rc = GNTST_general_error;
     }
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 714077c1e5..fedbd9019e 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -824,8 +824,7 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
     xatp->gpfn += start;
     xatp->size -= start;
 
-    if ( is_iommu_enabled(d) )
-       this_cpu(iommu_dont_flush_iotlb) = 1;
+    this_cpu(iommu_dont_flush_iotlb) = true;
 
     while ( xatp->size > done )
     {
@@ -845,12 +844,12 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
         }
     }
 
+    this_cpu(iommu_dont_flush_iotlb) = false;
+
     if ( is_iommu_enabled(d) )
     {
         int ret;
 
-        this_cpu(iommu_dont_flush_iotlb) = 0;
-
         ret = iommu_iotlb_flush(d, _dfn(xatp->idx - done), done,
                                 IOMMU_FLUSHF_added | IOMMU_FLUSHF_modified);
         if ( unlikely(ret) && rc >= 0 )
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 87f9a857bb..a9da4d2b06 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -282,18 +282,6 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
     return rc;
 }
 
-int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
-                     unsigned long page_count, unsigned int flags)
-{
-    unsigned int flush_flags = 0;
-    int rc = iommu_map(d, dfn, mfn, page_count, flags, &flush_flags);
-
-    if ( !this_cpu(iommu_dont_flush_iotlb) && !rc )
-        rc = iommu_iotlb_flush(d, dfn, page_count, flush_flags);
-
-    return rc;
-}
-
 int iommu_unmap(struct domain *d, dfn_t dfn, unsigned long page_count,
                 unsigned int *flush_flags)
 {
@@ -338,17 +326,6 @@ int iommu_unmap(struct domain *d, dfn_t dfn, unsigned long page_count,
     return rc;
 }
 
-int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned long page_count)
-{
-    unsigned int flush_flags = 0;
-    int rc = iommu_unmap(d, dfn, page_count, &flush_flags);
-
-    if ( !this_cpu(iommu_dont_flush_iotlb) && !rc )
-        rc = iommu_iotlb_flush(d, dfn, page_count, flush_flags);
-
-    return rc;
-}
-
 int iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
                       unsigned int *flags)
 {
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 13f68dc93d..a2eefe8582 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -151,16 +151,8 @@ int __must_check iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
 int __must_check iommu_unmap(struct domain *d, dfn_t dfn,
                              unsigned long page_count,
                              unsigned int *flush_flags);
-
-int __must_check iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
-                                  unsigned long page_count,
-                                  unsigned int flags);
-int __must_check iommu_legacy_unmap(struct domain *d, dfn_t dfn,
-                                    unsigned long page_count);
-
 int __must_check iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
                                    unsigned int *flags);
-
 int __must_check iommu_iotlb_flush(struct domain *d, dfn_t dfn,
                                    unsigned long page_count,
                                    unsigned int flush_flags);
@@ -369,15 +361,12 @@ void iommu_dev_iotlb_flush_timeout(struct domain *d, struct pci_dev *pdev);
 
 /*
  * The purpose of the iommu_dont_flush_iotlb optional cpu flag is to
- * avoid unecessary iotlb_flush in the low level IOMMU code.
- *
- * iommu_map_page/iommu_unmap_page must flush the iotlb but somethimes
- * this operation can be really expensive. This flag will be set by the
- * caller to notify the low level IOMMU code to avoid the iotlb flushes.
- * iommu_iotlb_flush/iommu_iotlb_flush_all will be explicitly called by
- * the caller.
+ * avoid unnecessary IOMMU flushing while updating the P2M.
+ * Setting the value to true will cause iommu_iotlb_flush() to return without
+ * actually performing a flush. A batch flush must therefore be done by the
+ * calling code after setting the value back to false.
  */
-DECLARE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
+DECLARE_PER_CPU(bool, iommu_dont_flush_iotlb);
 
 extern struct spinlock iommu_pt_cleanup_lock;
 extern struct page_list_head iommu_pt_cleanup_list;
-- 
2.20.1




* [PATCH v9 6/8] common/grant_table: batch flush I/O TLB
  2020-09-15  8:29 [PATCH v9 0/8] IOMMU cleanup Paul Durrant
                   ` (4 preceding siblings ...)
  2020-09-15  8:29 ` [PATCH v9 5/8] remove remaining uses of iommu_legacy_map/unmap Paul Durrant
@ 2020-09-15  8:29 ` Paul Durrant
  2020-09-16  8:11   ` Julien Grall
  2020-09-17 16:51   ` Wei Liu
  2020-09-15  8:29 ` [PATCH v9 7/8] iommu: remove the share_p2m operation Paul Durrant
  2020-09-15  8:29 ` [PATCH v9 8/8] iommu: stop calling IOMMU page tables 'p2m tables' Paul Durrant
  7 siblings, 2 replies; 19+ messages in thread
From: Paul Durrant @ 2020-09-15  8:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Jan Beulich, Andrew Cooper, George Dunlap,
	Ian Jackson, Julien Grall, Stefano Stabellini, Wei Liu

From: Paul Durrant <pdurrant@amazon.com>

This patch avoids calling iommu_iotlb_flush() for each individual GNTTABOP and
instead calls iommu_iotlb_flush_all() at the end of a batch. This should mean
non-singleton map/unmap operations perform better.

NOTE: A batch is the number of operations done before a pre-emption check and,
      in the case of unmap, a TLB flush.
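
In outline, each batch accumulates flush flags and issues a single flush at the
end (a sketch of the hunks below; guest copy-in/out and preemption handling
elided):

    unsigned int flush_flags = 0;

    for ( i = 0; i < c; i++ )
        map_grant_ref(&op, c == 1, &flush_flags); /* singletons flush per-op */

    if ( c > 1 )
    {
        int err = iommu_iotlb_flush_all(current->domain, flush_flags);

        if ( !rc )
            rc = err;
    }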

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>

v6:
 - Fix spelling of 'preemption'
 - Drop unneeded 'currd' stack variable

v5:
 - Add batching to gnttab_map_grant_ref() to handle flushing before pre-
   emption check
 - Maintain per-op flushing in the case of singletons

v3:
 - New in v3
---
 xen/common/grant_table.c | 199 ++++++++++++++++++++++++++-------------
 1 file changed, 133 insertions(+), 66 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index beb6b2d40d..1e3d7a2d33 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -241,7 +241,13 @@ struct gnttab_unmap_common {
     grant_ref_t ref;
 };
 
-/* Number of unmap operations that are done between each tlb flush */
+/* Number of map operations that are done between each preemption check */
+#define GNTTAB_MAP_BATCH_SIZE 32
+
+/*
+ * Number of unmap operations that are done between each tlb flush and
+ * preemption check.
+ */
 #define GNTTAB_UNMAP_BATCH_SIZE 32
 
 
@@ -979,7 +985,7 @@ static unsigned int mapkind(
 
 static void
 map_grant_ref(
-    struct gnttab_map_grant_ref *op)
+    struct gnttab_map_grant_ref *op, bool do_flush, unsigned int *flush_flags)
 {
     struct domain *ld, *rd, *owner = NULL;
     struct grant_table *lgt, *rgt;
@@ -1229,12 +1235,11 @@ map_grant_ref(
         if ( kind )
         {
             dfn_t dfn = _dfn(mfn_x(mfn));
-            unsigned int flush_flags = 0;
             int err;
 
-            err = iommu_map(ld, dfn, mfn, 1, kind, &flush_flags);
-            if ( !err )
-                err = iommu_iotlb_flush(ld, dfn, 1, flush_flags);
+            err = iommu_map(ld, dfn, mfn, 1, kind, flush_flags);
+            if ( do_flush && !err )
+                err = iommu_iotlb_flush(ld, dfn, 1, *flush_flags);
             if ( err )
             {
                 double_gt_unlock(lgt, rgt);
@@ -1319,29 +1324,59 @@ static long
 gnttab_map_grant_ref(
     XEN_GUEST_HANDLE_PARAM(gnttab_map_grant_ref_t) uop, unsigned int count)
 {
-    int i;
-    struct gnttab_map_grant_ref op;
+    unsigned int done = 0;
+    int rc = 0;
 
-    for ( i = 0; i < count; i++ )
+    while ( count )
     {
-        if ( i && hypercall_preempt_check() )
-            return i;
+        unsigned int i, c = min_t(unsigned int, count, GNTTAB_MAP_BATCH_SIZE);
+        unsigned int flush_flags = 0;
 
-        if ( unlikely(__copy_from_guest_offset(&op, uop, i, 1)) )
-            return -EFAULT;
+        for ( i = 0; i < c; i++ )
+        {
+            struct gnttab_map_grant_ref op;
 
-        map_grant_ref(&op);
+            if ( unlikely(__copy_from_guest(&op, uop, 1)) )
+            {
+                rc = -EFAULT;
+                break;
+            }
 
-        if ( unlikely(__copy_to_guest_offset(uop, i, &op, 1)) )
-            return -EFAULT;
+            map_grant_ref(&op, c == 1, &flush_flags);
+
+            if ( unlikely(__copy_to_guest(uop, &op, 1)) )
+            {
+                rc = -EFAULT;
+                break;
+            }
+
+            guest_handle_add_offset(uop, 1);
+        }
+
+        if ( c > 1 )
+        {
+            int err = iommu_iotlb_flush_all(current->domain, flush_flags);
+
+            if ( !rc )
+                rc = err;
+        }
+
+        if ( rc )
+            break;
+
+        count -= c;
+        done += c;
+
+        if ( count && hypercall_preempt_check() )
+            return done;
     }
 
-    return 0;
+    return rc;
 }
 
 static void
 unmap_common(
-    struct gnttab_unmap_common *op)
+    struct gnttab_unmap_common *op, bool do_flush, unsigned int *flush_flags)
 {
     domid_t          dom;
     struct domain   *ld, *rd;
@@ -1485,22 +1520,20 @@ unmap_common(
     {
         unsigned int kind;
         dfn_t dfn = _dfn(mfn_x(op->mfn));
-        unsigned int flush_flags = 0;
         int err = 0;
 
         double_gt_lock(lgt, rgt);
 
         kind = mapkind(lgt, rd, op->mfn);
         if ( !kind )
-            err = iommu_unmap(ld, dfn, 1, &flush_flags);
+            err = iommu_unmap(ld, dfn, 1, flush_flags);
         else if ( !(kind & MAPKIND_WRITE) )
-            err = iommu_map(ld, dfn, op->mfn, 1, IOMMUF_readable,
-                            &flush_flags);
+            err = iommu_map(ld, dfn, op->mfn, 1, IOMMUF_readable, flush_flags);
 
         double_gt_unlock(lgt, rgt);
 
-        if ( !err )
-            err = iommu_iotlb_flush(ld, dfn, 1, flush_flags);
+        if ( do_flush && !err )
+            err = iommu_iotlb_flush(ld, dfn, 1, *flush_flags);
         if ( err )
             rc = GNTST_general_error;
     }
@@ -1599,8 +1632,8 @@ unmap_common_complete(struct gnttab_unmap_common *op)
 
 static void
 unmap_grant_ref(
-    struct gnttab_unmap_grant_ref *op,
-    struct gnttab_unmap_common *common)
+    struct gnttab_unmap_grant_ref *op, struct gnttab_unmap_common *common,
+    bool do_flush, unsigned int *flush_flags)
 {
     common->host_addr = op->host_addr;
     common->dev_bus_addr = op->dev_bus_addr;
@@ -1612,7 +1645,7 @@ unmap_grant_ref(
     common->rd = NULL;
     common->mfn = INVALID_MFN;
 
-    unmap_common(common);
+    unmap_common(common, do_flush, flush_flags);
     op->status = common->status;
 }
 
@@ -1621,31 +1654,55 @@ static long
 gnttab_unmap_grant_ref(
     XEN_GUEST_HANDLE_PARAM(gnttab_unmap_grant_ref_t) uop, unsigned int count)
 {
-    int i, c, partial_done, done = 0;
-    struct gnttab_unmap_grant_ref op;
-    struct gnttab_unmap_common common[GNTTAB_UNMAP_BATCH_SIZE];
+    struct domain *currd = current->domain;
+    unsigned int done = 0;
+    int rc = 0;
 
-    while ( count != 0 )
+    while ( count )
     {
-        c = min(count, (unsigned int)GNTTAB_UNMAP_BATCH_SIZE);
-        partial_done = 0;
+        struct gnttab_unmap_common common[GNTTAB_UNMAP_BATCH_SIZE];
+        unsigned int i, c, partial_done = 0;
+        unsigned int flush_flags = 0;
+
+        c = min_t(unsigned int, count, GNTTAB_UNMAP_BATCH_SIZE);
 
         for ( i = 0; i < c; i++ )
         {
+            struct gnttab_unmap_grant_ref op;
+
             if ( unlikely(__copy_from_guest(&op, uop, 1)) )
-                goto fault;
-            unmap_grant_ref(&op, &common[i]);
+            {
+                rc = -EFAULT;
+                break;
+            }
+
+            unmap_grant_ref(&op, &common[i], c == 1, &flush_flags);
             ++partial_done;
+
             if ( unlikely(__copy_field_to_guest(uop, &op, status)) )
-                goto fault;
+            {
+                rc = -EFAULT;
+                break;
+            }
+
             guest_handle_add_offset(uop, 1);
         }
 
-        gnttab_flush_tlb(current->domain);
+        gnttab_flush_tlb(currd);
+        if ( c > 1 )
+        {
+            int err = iommu_iotlb_flush_all(currd, flush_flags);
+
+            if ( !rc )
+                rc = err;
+        }
 
         for ( i = 0; i < partial_done; i++ )
             unmap_common_complete(&common[i]);
 
+        if ( rc )
+            break;
+
         count -= c;
         done += c;
 
@@ -1653,20 +1710,13 @@ gnttab_unmap_grant_ref(
             return done;
     }
 
-    return 0;
-
-fault:
-    gnttab_flush_tlb(current->domain);
-
-    for ( i = 0; i < partial_done; i++ )
-        unmap_common_complete(&common[i]);
-    return -EFAULT;
+    return rc;
 }
 
 static void
 unmap_and_replace(
-    struct gnttab_unmap_and_replace *op,
-    struct gnttab_unmap_common *common)
+    struct gnttab_unmap_and_replace *op, struct gnttab_unmap_common *common,
+    bool do_flush, unsigned int *flush_flags)
 {
     common->host_addr = op->host_addr;
     common->new_addr = op->new_addr;
@@ -1678,7 +1728,7 @@ unmap_and_replace(
     common->rd = NULL;
     common->mfn = INVALID_MFN;
 
-    unmap_common(common);
+    unmap_common(common, do_flush, flush_flags);
     op->status = common->status;
 }
 
@@ -1686,31 +1736,55 @@ static long
 gnttab_unmap_and_replace(
     XEN_GUEST_HANDLE_PARAM(gnttab_unmap_and_replace_t) uop, unsigned int count)
 {
-    int i, c, partial_done, done = 0;
-    struct gnttab_unmap_and_replace op;
-    struct gnttab_unmap_common common[GNTTAB_UNMAP_BATCH_SIZE];
+    struct domain *currd = current->domain;
+    unsigned int done = 0;
+    int rc = 0;
 
-    while ( count != 0 )
+    while ( count )
     {
-        c = min(count, (unsigned int)GNTTAB_UNMAP_BATCH_SIZE);
-        partial_done = 0;
+        struct gnttab_unmap_common common[GNTTAB_UNMAP_BATCH_SIZE];
+        unsigned int i, c, partial_done = 0;
+        unsigned int flush_flags = 0;
+
+        c = min_t(unsigned int, count, GNTTAB_UNMAP_BATCH_SIZE);
 
         for ( i = 0; i < c; i++ )
         {
+            struct gnttab_unmap_and_replace op;
+
             if ( unlikely(__copy_from_guest(&op, uop, 1)) )
-                goto fault;
-            unmap_and_replace(&op, &common[i]);
+            {
+                rc = -EFAULT;
+                break;
+            }
+
+            unmap_and_replace(&op, &common[i], c == 1, &flush_flags);
             ++partial_done;
+
             if ( unlikely(__copy_field_to_guest(uop, &op, status)) )
-                goto fault;
+            {
+                rc = -EFAULT;
+                break;
+            }
+
             guest_handle_add_offset(uop, 1);
         }
 
-        gnttab_flush_tlb(current->domain);
+        gnttab_flush_tlb(currd);
+        if ( c > 1 )
+        {
+            int err = iommu_iotlb_flush_all(currd, flush_flags);
+
+            if ( !rc )
+                rc = err;
+        }
 
         for ( i = 0; i < partial_done; i++ )
             unmap_common_complete(&common[i]);
 
+        if ( rc )
+            break;
+
         count -= c;
         done += c;
 
@@ -1718,14 +1792,7 @@ gnttab_unmap_and_replace(
             return done;
     }
 
-    return 0;
-
-fault:
-    gnttab_flush_tlb(current->domain);
-
-    for ( i = 0; i < partial_done; i++ )
-        unmap_common_complete(&common[i]);
-    return -EFAULT;
+    return rc;
 }
 
 static int
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v9 7/8] iommu: remove the share_p2m operation
  2020-09-15  8:29 [PATCH v9 0/8] IOMMU cleanup Paul Durrant
                   ` (5 preceding siblings ...)
  2020-09-15  8:29 ` [PATCH v9 6/8] common/grant_table: batch flush I/O TLB Paul Durrant
@ 2020-09-15  8:29 ` Paul Durrant
  2020-09-17  7:19   ` Tian, Kevin
  2020-09-17  7:24   ` Tian, Kevin
  2020-09-15  8:29 ` [PATCH v9 8/8] iommu: stop calling IOMMU page tables 'p2m tables' Paul Durrant
  7 siblings, 2 replies; 19+ messages in thread
From: Paul Durrant @ 2020-09-15  8:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	Kevin Tian

From: Paul Durrant <pdurrant@amazon.com>

Sharing of HAP tables is now VT-d specific so the operation is never defined
for AMD IOMMU any more. There's also no need to pro-actively set vtd.pgd_maddr
when using shared EPT as it is straightforward to simply define a helper
function to return the appropriate value in the shared and non-shared cases.

NOTE: This patch also modifies unmap_vtd_domain_page() to take a const
      pointer since the only thing it calls, unmap_domain_page(), also takes
      a const pointer.
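
Condensed, the decision the new helper centralises is sketched below; the
function name is illustrative, and the real domain_pgd_maddr() in the diff
additionally allocates the top-level table on demand and skips top levels for
2- and 3-level DRHDs.

/* Illustrative condensation of the shared/non-shared decision. */
static uint64_t pgd_maddr_sketch(struct domain *d)
{
    const struct domain_iommu *hd = dom_iommu(d);

    if ( iommu_use_hap_pt(d) )
        /* Shared EPT: the IOMMU root is simply the P2M root. */
        return pagetable_get_paddr(p2m_get_pagetable(p2m_get_hostp2m(d)));

    /* Otherwise the VT-d specific root (lazily allocated in the real code). */
    return hd->arch.vtd.pgd_maddr;
}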

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v6:
 - Adjust code to return P2M paddr
 - Add removed comment back in

v5:
 - Pass 'nr_pt_levels' into domain_pgd_maddr() directly

v2:
 - Put the PGD level adjust into the helper function too, since it is
   irrelevant in the shared EPT case
---
 xen/arch/x86/mm/p2m.c                 |  3 -
 xen/drivers/passthrough/iommu.c       |  8 ---
 xen/drivers/passthrough/vtd/extern.h  |  2 +-
 xen/drivers/passthrough/vtd/iommu.c   | 90 +++++++++++++++------------
 xen/drivers/passthrough/vtd/x86/vtd.c |  2 +-
 xen/include/xen/iommu.h               |  3 -
 6 files changed, 52 insertions(+), 56 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 01ff92862d..d382199c88 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -726,9 +726,6 @@ int p2m_alloc_table(struct p2m_domain *p2m)
 
     p2m->phys_table = pagetable_from_mfn(top_mfn);
 
-    if ( hap_enabled(d) )
-        iommu_share_p2m_table(d);
-
     p2m_unlock(p2m);
     return 0;
 }
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index a9da4d2b06..90748062e5 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -500,14 +500,6 @@ int iommu_do_domctl(
     return ret;
 }
 
-void iommu_share_p2m_table(struct domain* d)
-{
-    ASSERT(hap_enabled(d));
-
-    if ( iommu_use_hap_pt(d) )
-        iommu_get_ops()->share_p2m(d);
-}
-
 void iommu_crash_shutdown(void)
 {
     if ( !iommu_crash_disable )
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index ada3c3098c..9cf5b578c9 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -72,7 +72,7 @@ void flush_all_cache(void);
 uint64_t alloc_pgtable_maddr(unsigned long npages, nodeid_t node);
 void free_pgtable_maddr(u64 maddr);
 void *map_vtd_domain_page(u64 maddr);
-void unmap_vtd_domain_page(void *va);
+void unmap_vtd_domain_page(const void *va);
 int domain_context_mapping_one(struct domain *domain, struct vtd_iommu *iommu,
                                u8 bus, u8 devfn, const struct pci_dev *);
 int domain_context_unmap_one(struct domain *domain, struct vtd_iommu *iommu,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 68cf0e535a..58d4550a4c 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -318,6 +318,48 @@ static u64 addr_to_dma_page_maddr(struct domain *domain, u64 addr, int alloc)
     return pte_maddr;
 }
 
+static uint64_t domain_pgd_maddr(struct domain *d, unsigned int nr_pt_levels)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+    uint64_t pgd_maddr;
+    unsigned int agaw;
+
+    ASSERT(spin_is_locked(&hd->arch.mapping_lock));
+
+    if ( iommu_use_hap_pt(d) )
+    {
+        pagetable_t pgt = p2m_get_pagetable(p2m_get_hostp2m(d));
+
+        return pagetable_get_paddr(pgt);
+    }
+
+    if ( !hd->arch.vtd.pgd_maddr )
+    {
+        /* Ensure we have pagetables allocated down to leaf PTE. */
+        addr_to_dma_page_maddr(d, 0, 1);
+
+        if ( !hd->arch.vtd.pgd_maddr )
+            return 0;
+    }
+
+    pgd_maddr = hd->arch.vtd.pgd_maddr;
+
+    /* Skip top levels of page tables for 2- and 3-level DRHDs. */
+    for ( agaw = level_to_agaw(4);
+          agaw != level_to_agaw(nr_pt_levels);
+          agaw-- )
+    {
+        const struct dma_pte *p = map_vtd_domain_page(pgd_maddr);
+
+        pgd_maddr = dma_pte_addr(*p);
+        unmap_vtd_domain_page(p);
+        if ( !pgd_maddr )
+            return 0;
+    }
+
+    return pgd_maddr;
+}
+
 static void iommu_flush_write_buffer(struct vtd_iommu *iommu)
 {
     u32 val;
@@ -1286,7 +1328,7 @@ int domain_context_mapping_one(
     struct context_entry *context, *context_entries;
     u64 maddr, pgd_maddr;
     u16 seg = iommu->drhd->segment;
-    int agaw, rc, ret;
+    int rc, ret;
     bool_t flush_dev_iotlb;
 
     ASSERT(pcidevs_locked());
@@ -1340,37 +1382,18 @@ int domain_context_mapping_one(
     if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
     {
         context_set_translation_type(*context, CONTEXT_TT_PASS_THRU);
-        agaw = level_to_agaw(iommu->nr_pt_levels);
     }
     else
     {
         spin_lock(&hd->arch.mapping_lock);
 
-        /* Ensure we have pagetables allocated down to leaf PTE. */
-        if ( hd->arch.vtd.pgd_maddr == 0 )
-        {
-            addr_to_dma_page_maddr(domain, 0, 1);
-            if ( hd->arch.vtd.pgd_maddr == 0 )
-            {
-            nomem:
-                spin_unlock(&hd->arch.mapping_lock);
-                spin_unlock(&iommu->lock);
-                unmap_vtd_domain_page(context_entries);
-                return -ENOMEM;
-            }
-        }
-
-        /* Skip top levels of page tables for 2- and 3-level DRHDs. */
-        pgd_maddr = hd->arch.vtd.pgd_maddr;
-        for ( agaw = level_to_agaw(4);
-              agaw != level_to_agaw(iommu->nr_pt_levels);
-              agaw-- )
+        pgd_maddr = domain_pgd_maddr(domain, iommu->nr_pt_levels);
+        if ( !pgd_maddr )
         {
-            struct dma_pte *p = map_vtd_domain_page(pgd_maddr);
-            pgd_maddr = dma_pte_addr(*p);
-            unmap_vtd_domain_page(p);
-            if ( pgd_maddr == 0 )
-                goto nomem;
+            spin_unlock(&hd->arch.mapping_lock);
+            spin_unlock(&iommu->lock);
+            unmap_vtd_domain_page(context_entries);
+            return -ENOMEM;
         }
 
         context_set_address_root(*context, pgd_maddr);
@@ -1389,7 +1412,7 @@ int domain_context_mapping_one(
         return -EFAULT;
     }
 
-    context_set_address_width(*context, agaw);
+    context_set_address_width(*context, level_to_agaw(iommu->nr_pt_levels));
     context_set_fault_enable(*context);
     context_set_present(*context);
     iommu_sync_cache(context, sizeof(struct context_entry));
@@ -1848,18 +1871,6 @@ static int __init vtd_ept_page_compatible(struct vtd_iommu *iommu)
            (ept_has_1gb(ept_cap) && opt_hap_1gb) <= cap_sps_1gb(vtd_cap);
 }
 
-/*
- * set VT-d page table directory to EPT table if allowed
- */
-static void iommu_set_pgd(struct domain *d)
-{
-    mfn_t pgd_mfn;
-
-    pgd_mfn = pagetable_get_mfn(p2m_get_pagetable(p2m_get_hostp2m(d)));
-    dom_iommu(d)->arch.vtd.pgd_maddr =
-        pagetable_get_paddr(pagetable_from_mfn(pgd_mfn));
-}
-
 static int rmrr_identity_mapping(struct domain *d, bool_t map,
                                  const struct acpi_rmrr_unit *rmrr,
                                  u32 flag)
@@ -2719,7 +2730,6 @@ static struct iommu_ops __initdata vtd_ops = {
     .adjust_irq_affinities = adjust_vtd_irq_affinities,
     .suspend = vtd_suspend,
     .resume = vtd_resume,
-    .share_p2m = iommu_set_pgd,
     .crash_shutdown = vtd_crash_shutdown,
     .iotlb_flush = iommu_flush_iotlb_pages,
     .iotlb_flush_all = iommu_flush_iotlb_all,
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index bbe358dc36..6681dccd69 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -42,7 +42,7 @@ void *map_vtd_domain_page(u64 maddr)
     return map_domain_page(_mfn(paddr_to_pfn(maddr)));
 }
 
-void unmap_vtd_domain_page(void *va)
+void unmap_vtd_domain_page(const void *va)
 {
     unmap_domain_page(va);
 }
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index a2eefe8582..373145266f 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -270,7 +270,6 @@ struct iommu_ops {
 
     int __must_check (*suspend)(void);
     void (*resume)(void);
-    void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
     int __must_check (*iotlb_flush)(struct domain *d, dfn_t dfn,
                                     unsigned long page_count,
@@ -347,8 +346,6 @@ void iommu_resume(void);
 void iommu_crash_shutdown(void);
 int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
 
-void iommu_share_p2m_table(struct domain *d);
-
 #ifdef CONFIG_HAS_PCI
 int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
                         XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v9 8/8] iommu: stop calling IOMMU page tables 'p2m tables'
  2020-09-15  8:29 [PATCH v9 0/8] IOMMU cleanup Paul Durrant
                   ` (6 preceding siblings ...)
  2020-09-15  8:29 ` [PATCH v9 7/8] iommu: remove the share_p2m operation Paul Durrant
@ 2020-09-15  8:29 ` Paul Durrant
  2020-09-17  7:25   ` Tian, Kevin
  7 siblings, 1 reply; 19+ messages in thread
From: Paul Durrant @ 2020-09-15  8:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Jan Beulich, Andrew Cooper, Kevin Tian

From: Paul Durrant <pdurrant@amazon.com>

It's confusing and not consistent with the terminology introduced with 'dfn_t'.
Just call them IOMMU page tables.

Also remove a pointless check of the 'acpi_drhd_units' list in
vtd_dump_page_tables(). If the list is empty then IOMMU mappings would
not have been enabled for the domain in the first place.

NOTE: All calls to printk() have also been removed from
      iommu_dump_page_tables(); the implementation-specific code is now
      responsible for all output.
      The check for the global 'iommu_enabled' has also been replaced by an
      ASSERT since iommu_dump_page_tables() is not registered as a key handler
      unless IOMMU mappings are enabled.
      Error messages are now prefixed with the name of the relevant IOMMU
      implementation.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v6:
 - Cosmetic adjustment
 - Drop use of __func__

v5:
 - Make sure domain id is in the output
 - Use VTDPREFIX in output for consistency

v2:
 - Moved all output into implementation specific code
---
 xen/drivers/passthrough/amd/pci_amd_iommu.c | 20 +++++++-------
 xen/drivers/passthrough/iommu.c             | 21 ++++-----------
 xen/drivers/passthrough/vtd/iommu.c         | 30 ++++++++++++---------
 xen/include/xen/iommu.h                     |  2 +-
 4 files changed, 33 insertions(+), 40 deletions(-)

diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 3390c22cf3..5fe9dc849d 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -491,8 +491,8 @@ static int amd_iommu_group_id(u16 seg, u8 bus, u8 devfn)
 
 #include <asm/io_apic.h>
 
-static void amd_dump_p2m_table_level(struct page_info* pg, int level, 
-                                     paddr_t gpa, int indent)
+static void amd_dump_page_table_level(struct page_info *pg, int level,
+                                      paddr_t gpa, int indent)
 {
     paddr_t address;
     struct amd_iommu_pte *table_vaddr;
@@ -504,7 +504,7 @@ static void amd_dump_p2m_table_level(struct page_info* pg, int level,
     table_vaddr = __map_domain_page(pg);
     if ( table_vaddr == NULL )
     {
-        printk("Failed to map IOMMU domain page %"PRIpaddr"\n", 
+        printk("AMD IOMMU failed to map domain page %"PRIpaddr"\n",
                 page_to_maddr(pg));
         return;
     }
@@ -521,7 +521,7 @@ static void amd_dump_p2m_table_level(struct page_info* pg, int level,
 
         if ( pde->next_level && (pde->next_level != (level - 1)) )
         {
-            printk("IOMMU p2m table error. next_level = %d, expected %d\n",
+            printk("AMD IOMMU table error. next_level = %d, expected %d\n",
                    pde->next_level, level - 1);
 
             continue;
@@ -529,7 +529,7 @@ static void amd_dump_p2m_table_level(struct page_info* pg, int level,
 
         address = gpa + amd_offset_level_address(index, level);
         if ( pde->next_level >= 1 )
-            amd_dump_p2m_table_level(
+            amd_dump_page_table_level(
                 mfn_to_page(_mfn(pde->mfn)), pde->next_level,
                 address, indent + 1);
         else
@@ -542,16 +542,16 @@ static void amd_dump_p2m_table_level(struct page_info* pg, int level,
     unmap_domain_page(table_vaddr);
 }
 
-static void amd_dump_p2m_table(struct domain *d)
+static void amd_dump_page_tables(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
 
     if ( !hd->arch.amd.root_table )
         return;
 
-    printk("p2m table has %u levels\n", hd->arch.amd.paging_mode);
-    amd_dump_p2m_table_level(hd->arch.amd.root_table,
-                             hd->arch.amd.paging_mode, 0, 0);
+    printk("AMD IOMMU %pd table has %u levels\n", d, hd->arch.amd.paging_mode);
+    amd_dump_page_table_level(hd->arch.amd.root_table,
+                              hd->arch.amd.paging_mode, 0, 0);
 }
 
 static const struct iommu_ops __initconstrel _iommu_ops = {
@@ -578,7 +578,7 @@ static const struct iommu_ops __initconstrel _iommu_ops = {
     .suspend = amd_iommu_suspend,
     .resume = amd_iommu_resume,
     .crash_shutdown = amd_iommu_crash_shutdown,
-    .dump_p2m_table = amd_dump_p2m_table,
+    .dump_page_tables = amd_dump_page_tables,
 };
 
 static const struct iommu_init_ops __initconstrel _iommu_init_ops = {
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 90748062e5..8fae77b593 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -22,7 +22,7 @@
 #include <xen/keyhandler.h>
 #include <xsm/xsm.h>
 
-static void iommu_dump_p2m_table(unsigned char key);
+static void iommu_dump_page_tables(unsigned char key);
 
 unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
 integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
@@ -212,7 +212,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
     if ( !is_iommu_enabled(d) )
         return;
 
-    register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
+    register_keyhandler('o', &iommu_dump_page_tables, "dump iommu page tables", 0);
 
     hd->platform_ops->hwdom_init(d);
 }
@@ -535,16 +535,12 @@ bool_t iommu_has_feature(struct domain *d, enum iommu_feature feature)
     return is_iommu_enabled(d) && test_bit(feature, dom_iommu(d)->features);
 }
 
-static void iommu_dump_p2m_table(unsigned char key)
+static void iommu_dump_page_tables(unsigned char key)
 {
     struct domain *d;
     const struct iommu_ops *ops;
 
-    if ( !iommu_enabled )
-    {
-        printk("IOMMU not enabled!\n");
-        return;
-    }
+    ASSERT(iommu_enabled);
 
     ops = iommu_get_ops();
 
@@ -555,14 +551,7 @@ static void iommu_dump_p2m_table(unsigned char key)
         if ( is_hardware_domain(d) || !is_iommu_enabled(d) )
             continue;
 
-        if ( iommu_use_hap_pt(d) )
-        {
-            printk("\ndomain%d IOMMU p2m table shared with MMU: \n", d->domain_id);
-            continue;
-        }
-
-        printk("\ndomain%d IOMMU p2m table: \n", d->domain_id);
-        ops->dump_p2m_table(d);
+        ops->dump_page_tables(d);
     }
 
     rcu_read_unlock(&domlist_read_lock);
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 58d4550a4c..faf258e699 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2582,8 +2582,8 @@ static void vtd_resume(void)
     }
 }
 
-static void vtd_dump_p2m_table_level(paddr_t pt_maddr, int level, paddr_t gpa, 
-                                     int indent)
+static void vtd_dump_page_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
+                                      int indent)
 {
     paddr_t address;
     int i;
@@ -2596,7 +2596,8 @@ static void vtd_dump_p2m_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
     pt_vaddr = map_vtd_domain_page(pt_maddr);
     if ( pt_vaddr == NULL )
     {
-        printk("Failed to map VT-D domain page %"PRIpaddr"\n", pt_maddr);
+        printk(VTDPREFIX " failed to map domain page %"PRIpaddr"\n",
+               pt_maddr);
         return;
     }
 
@@ -2612,8 +2613,8 @@ static void vtd_dump_p2m_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
 
         address = gpa + offset_level_address(i, level);
         if ( next_level >= 1 ) 
-            vtd_dump_p2m_table_level(dma_pte_addr(*pte), next_level, 
-                                     address, indent + 1);
+            vtd_dump_page_table_level(dma_pte_addr(*pte), next_level,
+                                      address, indent + 1);
         else
             printk("%*sdfn: %08lx mfn: %08lx\n",
                    indent, "",
@@ -2624,17 +2625,20 @@ static void vtd_dump_p2m_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
     unmap_vtd_domain_page(pt_vaddr);
 }
 
-static void vtd_dump_p2m_table(struct domain *d)
+static void vtd_dump_page_tables(struct domain *d)
 {
-    const struct domain_iommu *hd;
+    const struct domain_iommu *hd = dom_iommu(d);
 
-    if ( list_empty(&acpi_drhd_units) )
+    if ( iommu_use_hap_pt(d) )
+    {
+        printk(VTDPREFIX " %pd sharing EPT table\n", d);
         return;
+    }
 
-    hd = dom_iommu(d);
-    printk("p2m table has %d levels\n", agaw_to_level(hd->arch.vtd.agaw));
-    vtd_dump_p2m_table_level(hd->arch.vtd.pgd_maddr,
-                             agaw_to_level(hd->arch.vtd.agaw), 0, 0);
+    printk(VTDPREFIX" %pd table has %d levels\n", d,
+           agaw_to_level(hd->arch.vtd.agaw));
+    vtd_dump_page_table_level(hd->arch.vtd.pgd_maddr,
+                              agaw_to_level(hd->arch.vtd.agaw), 0, 0);
 }
 
 static int __init intel_iommu_quarantine_init(struct domain *d)
@@ -2734,7 +2738,7 @@ static struct iommu_ops __initdata vtd_ops = {
     .iotlb_flush = iommu_flush_iotlb_pages,
     .iotlb_flush_all = iommu_flush_iotlb_all,
     .get_reserved_device_memory = intel_iommu_get_reserved_device_memory,
-    .dump_p2m_table = vtd_dump_p2m_table,
+    .dump_page_tables = vtd_dump_page_tables,
 };
 
 const struct iommu_init_ops __initconstrel intel_iommu_init_ops = {
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 373145266f..7a539522b1 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -276,7 +276,7 @@ struct iommu_ops {
                                     unsigned int flush_flags);
     int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
-    void (*dump_p2m_table)(struct domain *d);
+    void (*dump_page_tables)(struct domain *d);
 
 #ifdef CONFIG_HAS_DEVICE_TREE
     /*
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v9 5/8] remove remaining uses of iommu_legacy_map/unmap
  2020-09-15  8:29 ` [PATCH v9 5/8] remove remaining uses of iommu_legacy_map/unmap Paul Durrant
@ 2020-09-16  7:53   ` Julien Grall
  2020-09-17  7:40   ` Tian, Kevin
  1 sibling, 0 replies; 19+ messages in thread
From: Julien Grall @ 2020-09-16  7:53 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Paul Durrant, Jan Beulich, Andrew Cooper, Wei Liu,
	Roger Pau Monné,
	George Dunlap, Ian Jackson, Stefano Stabellini, Jun Nakajima,
	Kevin Tian

Hi Paul,

On 15/09/2020 09:29, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> The 'legacy' functions do implicit flushing so amend the callers to do the
> appropriate flushing.
> 
> Unfortunately, because of the structure of the P2M code, we cannot remove
> the per-CPU 'iommu_dont_flush_iotlb' global and the optimization it
> facilitates. Checking of this flag is now done only in relevant callers of
> iommu_iotlb_flush(). Also, 'iommu_dont_flush_iotlb' is now declared
> as bool (rather than bool_t) and setting/clearing it are no longer pointlessly
> gated on is_iommu_enabled() returning true. (Arguably it is also pointless to
> gate the call to iommu_iotlb_flush() on that condition - since it is a no-op
> in that case - but the if clause allows the scope of a stack variable to be
> restricted).
> 
> NOTE: The code in memory_add() now sets 'ret' if iommu_map() or
>        iommu_iotlb_flush() fails.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>

The changes in common code looks good to me:

Acked-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v9 6/8] common/grant_table: batch flush I/O TLB
  2020-09-15  8:29 ` [PATCH v9 6/8] common/grant_table: batch flush I/O TLB Paul Durrant
@ 2020-09-16  8:11   ` Julien Grall
  2020-09-17 16:51   ` Wei Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Julien Grall @ 2020-09-16  8:11 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Paul Durrant, Jan Beulich, Andrew Cooper, George Dunlap,
	Ian Jackson, Stefano Stabellini, Wei Liu

Hi Paul,

On 15/09/2020 09:29, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> This patch avoids calling iommu_iotlb_flush() for each individual GNTTABOP and
> instead calls iommu_iotlb_flush_all() at the end of a batch. This should mean
> non-singleton map/unmap operations perform better.
> 
> NOTE: A batch is the number of operations done before a pre-emption check and,
>        in the case of unmap, a TLB flush.
> 
> Suggested-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v9 1/8] x86/iommu: convert VT-d code to use new page table allocator
  2020-09-15  8:29 ` [PATCH v9 1/8] x86/iommu: convert VT-d code to use new page table allocator Paul Durrant
@ 2020-09-17  6:33   ` Tian, Kevin
  0 siblings, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2020-09-17  6:33 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Paul Durrant, Jan Beulich

> From: Paul Durrant <paul@xen.org>
> Sent: Tuesday, September 15, 2020 4:29 PM
> 
> From: Paul Durrant <pdurrant@amazon.com>
> 
> This patch converts the VT-d code to use the new IOMMU page table
> allocator
> function. This allows all the free-ing code to be removed (since it is now
> handled by the general x86 code) which reduces TLB and cache thrashing as
> well
> as shortening the code.
> 
> The scope of the mapping_lock in intel_iommu_quarantine_init() has also
> been
> increased slightly; it should have always covered accesses to
> 'arch.vtd.pgd_maddr'.
> 
> NOTE: The common IOMMU needs a slight modification to avoid scheduling
> the
>       cleanup tasklet if the free_page_table() method is not present (since
>       the tasklet will unconditionally call it).
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
> Cc: Kevin Tian <kevin.tian@intel.com>
> 
> v2:
>  - New in v2 (split from "add common page-table allocator")
> ---
>  xen/drivers/passthrough/iommu.c     |   6 +-
>  xen/drivers/passthrough/vtd/iommu.c | 101 ++++++++++------------------
>  2 files changed, 39 insertions(+), 68 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index 1d644844ab..2b1db8022c 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -225,8 +225,10 @@ static void iommu_teardown(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
> 
> -    hd->platform_ops->teardown(d);
> -    tasklet_schedule(&iommu_pt_cleanup_tasklet);
> +    iommu_vcall(hd->platform_ops, teardown, d);
> +
> +    if ( hd->platform_ops->free_page_table )
> +        tasklet_schedule(&iommu_pt_cleanup_tasklet);
>  }
> 
>  void iommu_domain_destroy(struct domain *d)
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 94e0455a4d..607e8b5e65 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -265,10 +265,15 @@ static u64 addr_to_dma_page_maddr(struct
> domain *domain, u64 addr, int alloc)
> 
>      addr &= (((u64)1) << addr_width) - 1;
>      ASSERT(spin_is_locked(&hd->arch.mapping_lock));
> -    if ( !hd->arch.vtd.pgd_maddr &&
> -         (!alloc ||
> -          ((hd->arch.vtd.pgd_maddr = alloc_pgtable_maddr(1, hd->node)) ==
> 0)) )
> -        goto out;
> +    if ( !hd->arch.vtd.pgd_maddr )
> +    {
> +        struct page_info *pg;
> +
> +        if ( !alloc || !(pg = iommu_alloc_pgtable(domain)) )
> +            goto out;
> +
> +        hd->arch.vtd.pgd_maddr = page_to_maddr(pg);
> +    }
> 
>      parent = (struct dma_pte *)map_vtd_domain_page(hd-
> >arch.vtd.pgd_maddr);
>      while ( level > 1 )
> @@ -279,13 +284,16 @@ static u64 addr_to_dma_page_maddr(struct
> domain *domain, u64 addr, int alloc)
>          pte_maddr = dma_pte_addr(*pte);
>          if ( !pte_maddr )
>          {
> +            struct page_info *pg;
> +
>              if ( !alloc )
>                  break;
> 
> -            pte_maddr = alloc_pgtable_maddr(1, hd->node);
> -            if ( !pte_maddr )
> +            pg = iommu_alloc_pgtable(domain);
> +            if ( !pg )
>                  break;
> 
> +            pte_maddr = page_to_maddr(pg);
>              dma_set_pte_addr(*pte, pte_maddr);
> 
>              /*
> @@ -675,45 +683,6 @@ static void dma_pte_clear_one(struct domain
> *domain, uint64_t addr,
>      unmap_vtd_domain_page(page);
>  }
> 
> -static void iommu_free_pagetable(u64 pt_maddr, int level)
> -{
> -    struct page_info *pg = maddr_to_page(pt_maddr);
> -
> -    if ( pt_maddr == 0 )
> -        return;
> -
> -    PFN_ORDER(pg) = level;
> -    spin_lock(&iommu_pt_cleanup_lock);
> -    page_list_add_tail(pg, &iommu_pt_cleanup_list);
> -    spin_unlock(&iommu_pt_cleanup_lock);
> -}
> -
> -static void iommu_free_page_table(struct page_info *pg)
> -{
> -    unsigned int i, next_level = PFN_ORDER(pg) - 1;
> -    u64 pt_maddr = page_to_maddr(pg);
> -    struct dma_pte *pt_vaddr, *pte;
> -
> -    PFN_ORDER(pg) = 0;
> -    pt_vaddr = (struct dma_pte *)map_vtd_domain_page(pt_maddr);
> -
> -    for ( i = 0; i < PTE_NUM; i++ )
> -    {
> -        pte = &pt_vaddr[i];
> -        if ( !dma_pte_present(*pte) )
> -            continue;
> -
> -        if ( next_level >= 1 )
> -            iommu_free_pagetable(dma_pte_addr(*pte), next_level);
> -
> -        dma_clear_pte(*pte);
> -        iommu_sync_cache(pte, sizeof(struct dma_pte));
> -    }
> -
> -    unmap_vtd_domain_page(pt_vaddr);
> -    free_pgtable_maddr(pt_maddr);
> -}
> -
>  static int iommu_set_root_entry(struct vtd_iommu *iommu)
>  {
>      u32 sts;
> @@ -1748,16 +1717,7 @@ static void iommu_domain_teardown(struct
> domain *d)
>          xfree(mrmrr);
>      }
> 
> -    ASSERT(is_iommu_enabled(d));
> -
> -    if ( iommu_use_hap_pt(d) )
> -        return;
> -
> -    spin_lock(&hd->arch.mapping_lock);
> -    iommu_free_pagetable(hd->arch.vtd.pgd_maddr,
> -                         agaw_to_level(hd->arch.vtd.agaw));
>      hd->arch.vtd.pgd_maddr = 0;
> -    spin_unlock(&hd->arch.mapping_lock);
>  }
> 
>  static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn,
> @@ -2669,23 +2629,28 @@ static void vtd_dump_p2m_table(struct domain
> *d)
>  static int __init intel_iommu_quarantine_init(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
> +    struct page_info *pg;
>      struct dma_pte *parent;
>      unsigned int agaw =
> width_to_agaw(DEFAULT_DOMAIN_ADDRESS_WIDTH);
>      unsigned int level = agaw_to_level(agaw);
> -    int rc;
> +    int rc = 0;
> +
> +    spin_lock(&hd->arch.mapping_lock);
> 
>      if ( hd->arch.vtd.pgd_maddr )
>      {
>          ASSERT_UNREACHABLE();
> -        return 0;
> +        goto out;
>      }
> 
> -    spin_lock(&hd->arch.mapping_lock);
> +    pg = iommu_alloc_pgtable(d);
> 
> -    hd->arch.vtd.pgd_maddr = alloc_pgtable_maddr(1, hd->node);
> -    if ( !hd->arch.vtd.pgd_maddr )
> +    rc = -ENOMEM;
> +    if ( !pg )
>          goto out;
> 
> +    hd->arch.vtd.pgd_maddr = page_to_maddr(pg);
> +
>      parent = map_vtd_domain_page(hd->arch.vtd.pgd_maddr);
>      while ( level )
>      {
> @@ -2697,10 +2662,12 @@ static int __init
> intel_iommu_quarantine_init(struct domain *d)
>           * page table pages, and the resulting allocations are always
>           * zeroed.
>           */
> -        maddr = alloc_pgtable_maddr(1, hd->node);
> -        if ( !maddr )
> -            break;
> +        pg = iommu_alloc_pgtable(d);
> +
> +        if ( !pg )
> +            goto out;
> 
> +        maddr = page_to_maddr(pg);
>          for ( offset = 0; offset < PTE_NUM; offset++ )
>          {
>              struct dma_pte *pte = &parent[offset];
> @@ -2716,13 +2683,16 @@ static int __init
> intel_iommu_quarantine_init(struct domain *d)
>      }
>      unmap_vtd_domain_page(parent);
> 
> +    rc = 0;
> +
>   out:
>      spin_unlock(&hd->arch.mapping_lock);
> 
> -    rc = iommu_flush_iotlb_all(d);
> +    if ( !rc )
> +        rc = iommu_flush_iotlb_all(d);
> 
> -    /* Pages leaked in failure case */
> -    return level ? -ENOMEM : rc;
> +    /* Pages may be leaked in failure case */
> +    return rc;
>  }
> 
>  static struct iommu_ops __initdata vtd_ops = {
> @@ -2737,7 +2707,6 @@ static struct iommu_ops __initdata vtd_ops = {
>      .map_page = intel_iommu_map_page,
>      .unmap_page = intel_iommu_unmap_page,
>      .lookup_page = intel_iommu_lookup_page,
> -    .free_page_table = iommu_free_page_table,
>      .reassign_device = reassign_device_ownership,
>      .get_device_group_id = intel_iommu_group_id,
>      .enable_x2apic = intel_iommu_enable_eim,
> --
> 2.20.1



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v9 4/8] iommu: make map and unmap take a page count, similar to flush
  2020-09-15  8:29 ` [PATCH v9 4/8] iommu: make map and unmap take a page count, similar to flush Paul Durrant
@ 2020-09-17  6:38   ` Tian, Kevin
  2020-09-17 16:51   ` Wei Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2020-09-17  6:38 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Paul Durrant, Jan Beulich, Julien Grall, Andrew Cooper, Wei Liu,
	Roger Pau Monné,
	George Dunlap, Ian Jackson, Stefano Stabellini, Nakajima, Jun,
	Bertrand Marquis, Oleksandr Tyshchenko

> From: Paul Durrant <paul@xen.org>
> Sent: Tuesday, September 15, 2020 4:30 PM
> 
> From: Paul Durrant <pdurrant@amazon.com>
> 
> At the moment iommu_map() and iommu_unmap() take a page order rather
> than a
> count, whereas iommu_iotlb_flush() takes a page count rather than an order.
> This patch makes them consistent with each other, opting for a page count
> since
> CPU page orders are not necessarily the same as those of an IOMMU.
> 
> NOTE: The 'page_count' parameter is also made an unsigned long in all the
>       aforementioned functions.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Julien Grall <julien@xen.org>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v9 7/8] iommu: remove the share_p2m operation
  2020-09-15  8:29 ` [PATCH v9 7/8] iommu: remove the share_p2m operation Paul Durrant
@ 2020-09-17  7:19   ` Tian, Kevin
  2020-09-17  7:24   ` Tian, Kevin
  1 sibling, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2020-09-17  7:19 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Paul Durrant, Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné

> From: Paul Durrant <paul@xen.org>
> Sent: Tuesday, September 15, 2020 4:30 PM
> 
> From: Paul Durrant <pdurrant@amazon.com>
> 
> Sharing of HAP tables is now VT-d specific so the operation is never defined
> for AMD IOMMU any more. There's also no need to pro-actively set
> vtd.pgd_maddr
> when using shared EPT as it is straightforward to simply define a helper
> function to return the appropriate value in the shared and non-shared cases.

vtd_dump_p2m_table then also needs to use this helper, unless I overlooked
something.

Thanks
Kevin

> 
> NOTE: This patch also modifies unmap_vtd_domain_page() to take a const
>       pointer since the only thing it calls, unmap_domain_page(), also takes
>       a const pointer.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Wei Liu <wl@xen.org>
> Cc: "Roger Pau Monné" <roger.pau@citrix.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> 
> v6:
>  - Adjust code to return P2M paddr
>  - Add removed comment back in
> 
> v5:
>  - Pass 'nr_pt_levels' into domain_pgd_maddr() directly
> 
> v2:
>  - Put the PGD level adjust into the helper function too, since it is
>    irrelevant in the shared EPT case
> ---
>  xen/arch/x86/mm/p2m.c                 |  3 -
>  xen/drivers/passthrough/iommu.c       |  8 ---
>  xen/drivers/passthrough/vtd/extern.h  |  2 +-
>  xen/drivers/passthrough/vtd/iommu.c   | 90 +++++++++++++++------------
>  xen/drivers/passthrough/vtd/x86/vtd.c |  2 +-
>  xen/include/xen/iommu.h               |  3 -
>  6 files changed, 52 insertions(+), 56 deletions(-)
> 
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index 01ff92862d..d382199c88 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -726,9 +726,6 @@ int p2m_alloc_table(struct p2m_domain *p2m)
> 
>      p2m->phys_table = pagetable_from_mfn(top_mfn);
> 
> -    if ( hap_enabled(d) )
> -        iommu_share_p2m_table(d);
> -
>      p2m_unlock(p2m);
>      return 0;
>  }
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index a9da4d2b06..90748062e5 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -500,14 +500,6 @@ int iommu_do_domctl(
>      return ret;
>  }
> 
> -void iommu_share_p2m_table(struct domain* d)
> -{
> -    ASSERT(hap_enabled(d));
> -
> -    if ( iommu_use_hap_pt(d) )
> -        iommu_get_ops()->share_p2m(d);
> -}
> -
>  void iommu_crash_shutdown(void)
>  {
>      if ( !iommu_crash_disable )
> diff --git a/xen/drivers/passthrough/vtd/extern.h
> b/xen/drivers/passthrough/vtd/extern.h
> index ada3c3098c..9cf5b578c9 100644
> --- a/xen/drivers/passthrough/vtd/extern.h
> +++ b/xen/drivers/passthrough/vtd/extern.h
> @@ -72,7 +72,7 @@ void flush_all_cache(void);
>  uint64_t alloc_pgtable_maddr(unsigned long npages, nodeid_t node);
>  void free_pgtable_maddr(u64 maddr);
>  void *map_vtd_domain_page(u64 maddr);
> -void unmap_vtd_domain_page(void *va);
> +void unmap_vtd_domain_page(const void *va);
>  int domain_context_mapping_one(struct domain *domain, struct
> vtd_iommu *iommu,
>                                 u8 bus, u8 devfn, const struct pci_dev *);
>  int domain_context_unmap_one(struct domain *domain, struct vtd_iommu
> *iommu,
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 68cf0e535a..58d4550a4c 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -318,6 +318,48 @@ static u64 addr_to_dma_page_maddr(struct
> domain *domain, u64 addr, int alloc)
>      return pte_maddr;
>  }
> 
> +static uint64_t domain_pgd_maddr(struct domain *d, unsigned int
> nr_pt_levels)
> +{
> +    struct domain_iommu *hd = dom_iommu(d);
> +    uint64_t pgd_maddr;
> +    unsigned int agaw;
> +
> +    ASSERT(spin_is_locked(&hd->arch.mapping_lock));
> +
> +    if ( iommu_use_hap_pt(d) )
> +    {
> +        pagetable_t pgt = p2m_get_pagetable(p2m_get_hostp2m(d));
> +
> +        return pagetable_get_paddr(pgt);
> +    }
> +
> +    if ( !hd->arch.vtd.pgd_maddr )
> +    {
> +        /* Ensure we have pagetables allocated down to leaf PTE. */
> +        addr_to_dma_page_maddr(d, 0, 1);
> +
> +        if ( !hd->arch.vtd.pgd_maddr )
> +            return 0;
> +    }
> +
> +    pgd_maddr = hd->arch.vtd.pgd_maddr;
> +
> +    /* Skip top levels of page tables for 2- and 3-level DRHDs. */
> +    for ( agaw = level_to_agaw(4);
> +          agaw != level_to_agaw(nr_pt_levels);
> +          agaw-- )
> +    {
> +        const struct dma_pte *p = map_vtd_domain_page(pgd_maddr);
> +
> +        pgd_maddr = dma_pte_addr(*p);
> +        unmap_vtd_domain_page(p);
> +        if ( !pgd_maddr )
> +            return 0;
> +    }
> +
> +    return pgd_maddr;
> +}
> +
>  static void iommu_flush_write_buffer(struct vtd_iommu *iommu)
>  {
>      u32 val;
> @@ -1286,7 +1328,7 @@ int domain_context_mapping_one(
>      struct context_entry *context, *context_entries;
>      u64 maddr, pgd_maddr;
>      u16 seg = iommu->drhd->segment;
> -    int agaw, rc, ret;
> +    int rc, ret;
>      bool_t flush_dev_iotlb;
> 
>      ASSERT(pcidevs_locked());
> @@ -1340,37 +1382,18 @@ int domain_context_mapping_one(
>      if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
>      {
>          context_set_translation_type(*context, CONTEXT_TT_PASS_THRU);
> -        agaw = level_to_agaw(iommu->nr_pt_levels);
>      }
>      else
>      {
>          spin_lock(&hd->arch.mapping_lock);
> 
> -        /* Ensure we have pagetables allocated down to leaf PTE. */
> -        if ( hd->arch.vtd.pgd_maddr == 0 )
> -        {
> -            addr_to_dma_page_maddr(domain, 0, 1);
> -            if ( hd->arch.vtd.pgd_maddr == 0 )
> -            {
> -            nomem:
> -                spin_unlock(&hd->arch.mapping_lock);
> -                spin_unlock(&iommu->lock);
> -                unmap_vtd_domain_page(context_entries);
> -                return -ENOMEM;
> -            }
> -        }
> -
> -        /* Skip top levels of page tables for 2- and 3-level DRHDs. */
> -        pgd_maddr = hd->arch.vtd.pgd_maddr;
> -        for ( agaw = level_to_agaw(4);
> -              agaw != level_to_agaw(iommu->nr_pt_levels);
> -              agaw-- )
> +        pgd_maddr = domain_pgd_maddr(domain, iommu->nr_pt_levels);
> +        if ( !pgd_maddr )
>          {
> -            struct dma_pte *p = map_vtd_domain_page(pgd_maddr);
> -            pgd_maddr = dma_pte_addr(*p);
> -            unmap_vtd_domain_page(p);
> -            if ( pgd_maddr == 0 )
> -                goto nomem;
> +            spin_unlock(&hd->arch.mapping_lock);
> +            spin_unlock(&iommu->lock);
> +            unmap_vtd_domain_page(context_entries);
> +            return -ENOMEM;
>          }
> 
>          context_set_address_root(*context, pgd_maddr);
> @@ -1389,7 +1412,7 @@ int domain_context_mapping_one(
>          return -EFAULT;
>      }
> 
> -    context_set_address_width(*context, agaw);
> +    context_set_address_width(*context, level_to_agaw(iommu-
> >nr_pt_levels));
>      context_set_fault_enable(*context);
>      context_set_present(*context);
>      iommu_sync_cache(context, sizeof(struct context_entry));
> @@ -1848,18 +1871,6 @@ static int __init vtd_ept_page_compatible(struct
> vtd_iommu *iommu)
>             (ept_has_1gb(ept_cap) && opt_hap_1gb) <= cap_sps_1gb(vtd_cap);
>  }
> 
> -/*
> - * set VT-d page table directory to EPT table if allowed
> - */
> -static void iommu_set_pgd(struct domain *d)
> -{
> -    mfn_t pgd_mfn;
> -
> -    pgd_mfn =
> pagetable_get_mfn(p2m_get_pagetable(p2m_get_hostp2m(d)));
> -    dom_iommu(d)->arch.vtd.pgd_maddr =
> -        pagetable_get_paddr(pagetable_from_mfn(pgd_mfn));
> -}
> -
>  static int rmrr_identity_mapping(struct domain *d, bool_t map,
>                                   const struct acpi_rmrr_unit *rmrr,
>                                   u32 flag)
> @@ -2719,7 +2730,6 @@ static struct iommu_ops __initdata vtd_ops = {
>      .adjust_irq_affinities = adjust_vtd_irq_affinities,
>      .suspend = vtd_suspend,
>      .resume = vtd_resume,
> -    .share_p2m = iommu_set_pgd,
>      .crash_shutdown = vtd_crash_shutdown,
>      .iotlb_flush = iommu_flush_iotlb_pages,
>      .iotlb_flush_all = iommu_flush_iotlb_all,
> diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c
> b/xen/drivers/passthrough/vtd/x86/vtd.c
> index bbe358dc36..6681dccd69 100644
> --- a/xen/drivers/passthrough/vtd/x86/vtd.c
> +++ b/xen/drivers/passthrough/vtd/x86/vtd.c
> @@ -42,7 +42,7 @@ void *map_vtd_domain_page(u64 maddr)
>      return map_domain_page(_mfn(paddr_to_pfn(maddr)));
>  }
> 
> -void unmap_vtd_domain_page(void *va)
> +void unmap_vtd_domain_page(const void *va)
>  {
>      unmap_domain_page(va);
>  }
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index a2eefe8582..373145266f 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -270,7 +270,6 @@ struct iommu_ops {
> 
>      int __must_check (*suspend)(void);
>      void (*resume)(void);
> -    void (*share_p2m)(struct domain *d);
>      void (*crash_shutdown)(void);
>      int __must_check (*iotlb_flush)(struct domain *d, dfn_t dfn,
>                                      unsigned long page_count,
> @@ -347,8 +346,6 @@ void iommu_resume(void);
>  void iommu_crash_shutdown(void);
>  int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
> 
> -void iommu_share_p2m_table(struct domain *d);
> -
>  #ifdef CONFIG_HAS_PCI
>  int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
>                          XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
> --
> 2.20.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v9 7/8] iommu: remove the share_p2m operation
  2020-09-15  8:29 ` [PATCH v9 7/8] iommu: remove the share_p2m operation Paul Durrant
  2020-09-17  7:19   ` Tian, Kevin
@ 2020-09-17  7:24   ` Tian, Kevin
  1 sibling, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2020-09-17  7:24 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Paul Durrant, Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné

> From: Tian, Kevin
> Sent: Thursday, September 17, 2020 3:20 PM
> 
> > From: Paul Durrant <paul@xen.org>
> > Sent: Tuesday, September 15, 2020 4:30 PM
> >
> > From: Paul Durrant <pdurrant@amazon.com>
> >
> > Sharing of HAP tables is now VT-d specific so the operation is never defined
> > for AMD IOMMU any more. There's also no need to pro-actively set
> > vtd.pgd_maddr
> > when using shared EPT as it is straightforward to simply define a helper
> > function to return the appropriate value in the shared and non-shared
> cases.
> 
> vtd_dump_p2m_table then also needs to use this helper if I didn't overlook.
> 

Please disregard my previous comment. When HAP sharing is in use, that
callback is never invoked. So:

	Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> 
> >
> > NOTE: This patch also modifies unmap_vtd_domain_page() to take a const
> >       pointer since the only thing it calls, unmap_domain_page(), also takes
> >       a const pointer.
> >
> > Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> > Reviewed-by: Jan Beulich <jbeulich@suse.com>
> > ---
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: George Dunlap <george.dunlap@citrix.com>
> > Cc: Wei Liu <wl@xen.org>
> > Cc: "Roger Pau Monné" <roger.pau@citrix.com>
> > Cc: Kevin Tian <kevin.tian@intel.com>
> >
> > v6:
> >  - Adjust code to return P2M paddr
> >  - Add removed comment back in
> >
> > v5:
> >  - Pass 'nr_pt_levels' into domain_pgd_maddr() directly
> >
> > v2:
> >  - Put the PGD level adjust into the helper function too, since it is
> >    irrelevant in the shared EPT case
> > ---
> >  xen/arch/x86/mm/p2m.c                 |  3 -
> >  xen/drivers/passthrough/iommu.c       |  8 ---
> >  xen/drivers/passthrough/vtd/extern.h  |  2 +-
> >  xen/drivers/passthrough/vtd/iommu.c   | 90 +++++++++++++++------------
> >  xen/drivers/passthrough/vtd/x86/vtd.c |  2 +-
> >  xen/include/xen/iommu.h               |  3 -
> >  6 files changed, 52 insertions(+), 56 deletions(-)
> >
> > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> > index 01ff92862d..d382199c88 100644
> > --- a/xen/arch/x86/mm/p2m.c
> > +++ b/xen/arch/x86/mm/p2m.c
> > @@ -726,9 +726,6 @@ int p2m_alloc_table(struct p2m_domain *p2m)
> >
> >      p2m->phys_table = pagetable_from_mfn(top_mfn);
> >
> > -    if ( hap_enabled(d) )
> > -        iommu_share_p2m_table(d);
> > -
> >      p2m_unlock(p2m);
> >      return 0;
> >  }
> > diff --git a/xen/drivers/passthrough/iommu.c
> > b/xen/drivers/passthrough/iommu.c
> > index a9da4d2b06..90748062e5 100644
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -500,14 +500,6 @@ int iommu_do_domctl(
> >      return ret;
> >  }
> >
> > -void iommu_share_p2m_table(struct domain* d)
> > -{
> > -    ASSERT(hap_enabled(d));
> > -
> > -    if ( iommu_use_hap_pt(d) )
> > -        iommu_get_ops()->share_p2m(d);
> > -}
> > -
> >  void iommu_crash_shutdown(void)
> >  {
> >      if ( !iommu_crash_disable )
> > diff --git a/xen/drivers/passthrough/vtd/extern.h
> > b/xen/drivers/passthrough/vtd/extern.h
> > index ada3c3098c..9cf5b578c9 100644
> > --- a/xen/drivers/passthrough/vtd/extern.h
> > +++ b/xen/drivers/passthrough/vtd/extern.h
> > @@ -72,7 +72,7 @@ void flush_all_cache(void);
> >  uint64_t alloc_pgtable_maddr(unsigned long npages, nodeid_t node);
> >  void free_pgtable_maddr(u64 maddr);
> >  void *map_vtd_domain_page(u64 maddr);
> > -void unmap_vtd_domain_page(void *va);
> > +void unmap_vtd_domain_page(const void *va);
> >  int domain_context_mapping_one(struct domain *domain, struct vtd_iommu *iommu,
> >                                 u8 bus, u8 devfn, const struct pci_dev *);
> >  int domain_context_unmap_one(struct domain *domain, struct vtd_iommu *iommu,
> > diff --git a/xen/drivers/passthrough/vtd/iommu.c
> > b/xen/drivers/passthrough/vtd/iommu.c
> > index 68cf0e535a..58d4550a4c 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -318,6 +318,48 @@ static u64 addr_to_dma_page_maddr(struct
> > domain *domain, u64 addr, int alloc)
> >      return pte_maddr;
> >  }
> >
> > +static uint64_t domain_pgd_maddr(struct domain *d, unsigned int nr_pt_levels)
> > +{
> > +    struct domain_iommu *hd = dom_iommu(d);
> > +    uint64_t pgd_maddr;
> > +    unsigned int agaw;
> > +
> > +    ASSERT(spin_is_locked(&hd->arch.mapping_lock));
> > +
> > +    if ( iommu_use_hap_pt(d) )
> > +    {
> > +        pagetable_t pgt = p2m_get_pagetable(p2m_get_hostp2m(d));
> > +
> > +        return pagetable_get_paddr(pgt);
> > +    }
> > +
> > +    if ( !hd->arch.vtd.pgd_maddr )
> > +    {
> > +        /* Ensure we have pagetables allocated down to leaf PTE. */
> > +        addr_to_dma_page_maddr(d, 0, 1);
> > +
> > +        if ( !hd->arch.vtd.pgd_maddr )
> > +            return 0;
> > +    }
> > +
> > +    pgd_maddr = hd->arch.vtd.pgd_maddr;
> > +
> > +    /* Skip top levels of page tables for 2- and 3-level DRHDs. */
> > +    for ( agaw = level_to_agaw(4);
> > +          agaw != level_to_agaw(nr_pt_levels);
> > +          agaw-- )
> > +    {
> > +        const struct dma_pte *p = map_vtd_domain_page(pgd_maddr);
> > +
> > +        pgd_maddr = dma_pte_addr(*p);
> > +        unmap_vtd_domain_page(p);
> > +        if ( !pgd_maddr )
> > +            return 0;
> > +    }
> > +
> > +    return pgd_maddr;
> > +}
> > +
> >  static void iommu_flush_write_buffer(struct vtd_iommu *iommu)
> >  {
> >      u32 val;
> > @@ -1286,7 +1328,7 @@ int domain_context_mapping_one(
> >      struct context_entry *context, *context_entries;
> >      u64 maddr, pgd_maddr;
> >      u16 seg = iommu->drhd->segment;
> > -    int agaw, rc, ret;
> > +    int rc, ret;
> >      bool_t flush_dev_iotlb;
> >
> >      ASSERT(pcidevs_locked());
> > @@ -1340,37 +1382,18 @@ int domain_context_mapping_one(
> >      if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
> >      {
> >          context_set_translation_type(*context, CONTEXT_TT_PASS_THRU);
> > -        agaw = level_to_agaw(iommu->nr_pt_levels);
> >      }
> >      else
> >      {
> >          spin_lock(&hd->arch.mapping_lock);
> >
> > -        /* Ensure we have pagetables allocated down to leaf PTE. */
> > -        if ( hd->arch.vtd.pgd_maddr == 0 )
> > -        {
> > -            addr_to_dma_page_maddr(domain, 0, 1);
> > -            if ( hd->arch.vtd.pgd_maddr == 0 )
> > -            {
> > -            nomem:
> > -                spin_unlock(&hd->arch.mapping_lock);
> > -                spin_unlock(&iommu->lock);
> > -                unmap_vtd_domain_page(context_entries);
> > -                return -ENOMEM;
> > -            }
> > -        }
> > -
> > -        /* Skip top levels of page tables for 2- and 3-level DRHDs. */
> > -        pgd_maddr = hd->arch.vtd.pgd_maddr;
> > -        for ( agaw = level_to_agaw(4);
> > -              agaw != level_to_agaw(iommu->nr_pt_levels);
> > -              agaw-- )
> > +        pgd_maddr = domain_pgd_maddr(domain, iommu->nr_pt_levels);
> > +        if ( !pgd_maddr )
> >          {
> > -            struct dma_pte *p = map_vtd_domain_page(pgd_maddr);
> > -            pgd_maddr = dma_pte_addr(*p);
> > -            unmap_vtd_domain_page(p);
> > -            if ( pgd_maddr == 0 )
> > -                goto nomem;
> > +            spin_unlock(&hd->arch.mapping_lock);
> > +            spin_unlock(&iommu->lock);
> > +            unmap_vtd_domain_page(context_entries);
> > +            return -ENOMEM;
> >          }
> >
> >          context_set_address_root(*context, pgd_maddr);
> > @@ -1389,7 +1412,7 @@ int domain_context_mapping_one(
> >          return -EFAULT;
> >      }
> >
> > -    context_set_address_width(*context, agaw);
> > +    context_set_address_width(*context, level_to_agaw(iommu->nr_pt_levels));
> >      context_set_fault_enable(*context);
> >      context_set_present(*context);
> >      iommu_sync_cache(context, sizeof(struct context_entry));
> > @@ -1848,18 +1871,6 @@ static int __init vtd_ept_page_compatible(struct
> > vtd_iommu *iommu)
> >             (ept_has_1gb(ept_cap) && opt_hap_1gb) <= cap_sps_1gb(vtd_cap);
> >  }
> >
> > -/*
> > - * set VT-d page table directory to EPT table if allowed
> > - */
> > -static void iommu_set_pgd(struct domain *d)
> > -{
> > -    mfn_t pgd_mfn;
> > -
> > -    pgd_mfn = pagetable_get_mfn(p2m_get_pagetable(p2m_get_hostp2m(d)));
> > -    dom_iommu(d)->arch.vtd.pgd_maddr =
> > -        pagetable_get_paddr(pagetable_from_mfn(pgd_mfn));
> > -}
> > -
> >  static int rmrr_identity_mapping(struct domain *d, bool_t map,
> >                                   const struct acpi_rmrr_unit *rmrr,
> >                                   u32 flag)
> > @@ -2719,7 +2730,6 @@ static struct iommu_ops __initdata vtd_ops = {
> >      .adjust_irq_affinities = adjust_vtd_irq_affinities,
> >      .suspend = vtd_suspend,
> >      .resume = vtd_resume,
> > -    .share_p2m = iommu_set_pgd,
> >      .crash_shutdown = vtd_crash_shutdown,
> >      .iotlb_flush = iommu_flush_iotlb_pages,
> >      .iotlb_flush_all = iommu_flush_iotlb_all,
> > diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c
> > b/xen/drivers/passthrough/vtd/x86/vtd.c
> > index bbe358dc36..6681dccd69 100644
> > --- a/xen/drivers/passthrough/vtd/x86/vtd.c
> > +++ b/xen/drivers/passthrough/vtd/x86/vtd.c
> > @@ -42,7 +42,7 @@ void *map_vtd_domain_page(u64 maddr)
> >      return map_domain_page(_mfn(paddr_to_pfn(maddr)));
> >  }
> >
> > -void unmap_vtd_domain_page(void *va)
> > +void unmap_vtd_domain_page(const void *va)
> >  {
> >      unmap_domain_page(va);
> >  }
> > diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> > index a2eefe8582..373145266f 100644
> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -270,7 +270,6 @@ struct iommu_ops {
> >
> >      int __must_check (*suspend)(void);
> >      void (*resume)(void);
> > -    void (*share_p2m)(struct domain *d);
> >      void (*crash_shutdown)(void);
> >      int __must_check (*iotlb_flush)(struct domain *d, dfn_t dfn,
> >                                      unsigned long page_count,
> > @@ -347,8 +346,6 @@ void iommu_resume(void);
> >  void iommu_crash_shutdown(void);
> >  int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
> >
> > -void iommu_share_p2m_table(struct domain *d);
> > -
> >  #ifdef CONFIG_HAS_PCI
> >  int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
> >                          XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
> > --
> > 2.20.1



* RE: [PATCH v9 8/8] iommu: stop calling IOMMU page tables 'p2m tables'
  2020-09-15  8:29 ` [PATCH v9 8/8] iommu: stop calling IOMMU page tables 'p2m tables' Paul Durrant
@ 2020-09-17  7:25   ` Tian, Kevin
  0 siblings, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2020-09-17  7:25 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Paul Durrant, Jan Beulich, Andrew Cooper

> From: Paul Durrant <paul@xen.org>
> Sent: Tuesday, September 15, 2020 4:30 PM
> 
> From: Paul Durrant <pdurrant@amazon.com>
> 
> It's confusing and not consistent with the terminology introduced with 'dfn_t'.
> Just call them IOMMU page tables.
> 
> Also remove a pointless check of the 'acpi_drhd_units' list in
> vtd_dump_page_table_level(). If the list is empty then IOMMU mappings would
> not have been enabled for the domain in the first place.
> 
> NOTE: All calls to printk() have also been removed from
>       iommu_dump_page_tables(); the implementation specific code is now
>       responsible for all output.
>       The check for the global 'iommu_enabled' has also been replaced by an
>       ASSERT since iommu_dump_page_tables() is not registered as a key
>       handler unless IOMMU mappings are enabled.
>       Error messages are now prefixed with the name of the function.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> 
> v6:
>  - Cosmetic adjustment
>  - Drop use of __func__
> 
> v5:
>  - Make sure domain id is in the output
>  - Use VTDPREFIX in output for consistency
> 
> v2:
>  - Moved all output into implementation specific code
> ---
>  xen/drivers/passthrough/amd/pci_amd_iommu.c | 20 +++++++-------
>  xen/drivers/passthrough/iommu.c             | 21 ++++-----------
>  xen/drivers/passthrough/vtd/iommu.c         | 30 ++++++++++++---------
>  xen/include/xen/iommu.h                     |  2 +-
>  4 files changed, 33 insertions(+), 40 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> index 3390c22cf3..5fe9dc849d 100644
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -491,8 +491,8 @@ static int amd_iommu_group_id(u16 seg, u8 bus, u8
> devfn)
> 
>  #include <asm/io_apic.h>
> 
> -static void amd_dump_p2m_table_level(struct page_info* pg, int level,
> -                                     paddr_t gpa, int indent)
> +static void amd_dump_page_table_level(struct page_info *pg, int level,
> +                                      paddr_t gpa, int indent)
>  {
>      paddr_t address;
>      struct amd_iommu_pte *table_vaddr;
> @@ -504,7 +504,7 @@ static void amd_dump_p2m_table_level(struct
> page_info* pg, int level,
>      table_vaddr = __map_domain_page(pg);
>      if ( table_vaddr == NULL )
>      {
> -        printk("Failed to map IOMMU domain page %"PRIpaddr"\n",
> +        printk("AMD IOMMU failed to map domain page %"PRIpaddr"\n",
>                  page_to_maddr(pg));
>          return;
>      }
> @@ -521,7 +521,7 @@ static void amd_dump_p2m_table_level(struct
> page_info* pg, int level,
> 
>          if ( pde->next_level && (pde->next_level != (level - 1)) )
>          {
> -            printk("IOMMU p2m table error. next_level = %d, expected %d\n",
> +            printk("AMD IOMMU table error. next_level = %d, expected %d\n",
>                     pde->next_level, level - 1);
> 
>              continue;
> @@ -529,7 +529,7 @@ static void amd_dump_p2m_table_level(struct
> page_info* pg, int level,
> 
>          address = gpa + amd_offset_level_address(index, level);
>          if ( pde->next_level >= 1 )
> -            amd_dump_p2m_table_level(
> +            amd_dump_page_table_level(
>                  mfn_to_page(_mfn(pde->mfn)), pde->next_level,
>                  address, indent + 1);
>          else
> @@ -542,16 +542,16 @@ static void amd_dump_p2m_table_level(struct
> page_info* pg, int level,
>      unmap_domain_page(table_vaddr);
>  }
> 
> -static void amd_dump_p2m_table(struct domain *d)
> +static void amd_dump_page_tables(struct domain *d)
>  {
>      const struct domain_iommu *hd = dom_iommu(d);
> 
>      if ( !hd->arch.amd.root_table )
>          return;
> 
> -    printk("p2m table has %u levels\n", hd->arch.amd.paging_mode);
> -    amd_dump_p2m_table_level(hd->arch.amd.root_table,
> -                             hd->arch.amd.paging_mode, 0, 0);
> +    printk("AMD IOMMU %pd table has %u levels\n", d, hd-
> >arch.amd.paging_mode);
> +    amd_dump_page_table_level(hd->arch.amd.root_table,
> +                              hd->arch.amd.paging_mode, 0, 0);
>  }
> 
>  static const struct iommu_ops __initconstrel _iommu_ops = {
> @@ -578,7 +578,7 @@ static const struct iommu_ops __initconstrel
> _iommu_ops = {
>      .suspend = amd_iommu_suspend,
>      .resume = amd_iommu_resume,
>      .crash_shutdown = amd_iommu_crash_shutdown,
> -    .dump_p2m_table = amd_dump_p2m_table,
> +    .dump_page_tables = amd_dump_page_tables,
>  };
> 
>  static const struct iommu_init_ops __initconstrel _iommu_init_ops = {
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index 90748062e5..8fae77b593 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -22,7 +22,7 @@
>  #include <xen/keyhandler.h>
>  #include <xsm/xsm.h>
> 
> -static void iommu_dump_p2m_table(unsigned char key);
> +static void iommu_dump_page_tables(unsigned char key);
> 
>  unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
>  integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
> @@ -212,7 +212,7 @@ void __hwdom_init iommu_hwdom_init(struct
> domain *d)
>      if ( !is_iommu_enabled(d) )
>          return;
> 
> -    register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
> +    register_keyhandler('o', &iommu_dump_page_tables, "dump iommu page tables", 0);
> 
>      hd->platform_ops->hwdom_init(d);
>  }
> @@ -535,16 +535,12 @@ bool_t iommu_has_feature(struct domain *d,
> enum iommu_feature feature)
>      return is_iommu_enabled(d) && test_bit(feature, dom_iommu(d)->features);
>  }
> 
> -static void iommu_dump_p2m_table(unsigned char key)
> +static void iommu_dump_page_tables(unsigned char key)
>  {
>      struct domain *d;
>      const struct iommu_ops *ops;
> 
> -    if ( !iommu_enabled )
> -    {
> -        printk("IOMMU not enabled!\n");
> -        return;
> -    }
> +    ASSERT(iommu_enabled);
> 
>      ops = iommu_get_ops();
> 
> @@ -555,14 +551,7 @@ static void iommu_dump_p2m_table(unsigned char
> key)
>          if ( is_hardware_domain(d) || !is_iommu_enabled(d) )
>              continue;
> 
> -        if ( iommu_use_hap_pt(d) )
> -        {
> -            printk("\ndomain%d IOMMU p2m table shared with MMU: \n", d-
> >domain_id);
> -            continue;
> -        }
> -
> -        printk("\ndomain%d IOMMU p2m table: \n", d->domain_id);
> -        ops->dump_p2m_table(d);
> +        ops->dump_page_tables(d);
>      }
> 
>      rcu_read_unlock(&domlist_read_lock);
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 58d4550a4c..faf258e699 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2582,8 +2582,8 @@ static void vtd_resume(void)
>      }
>  }
> 
> -static void vtd_dump_p2m_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
> -                                     int indent)
> +static void vtd_dump_page_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
> +                                      int indent)
>  {
>      paddr_t address;
>      int i;
> @@ -2596,7 +2596,8 @@ static void vtd_dump_p2m_table_level(paddr_t
> pt_maddr, int level, paddr_t gpa,
>      pt_vaddr = map_vtd_domain_page(pt_maddr);
>      if ( pt_vaddr == NULL )
>      {
> -        printk("Failed to map VT-D domain page %"PRIpaddr"\n", pt_maddr);
> +        printk(VTDPREFIX " failed to map domain page %"PRIpaddr"\n",
> +               pt_maddr);
>          return;
>      }
> 
> @@ -2612,8 +2613,8 @@ static void vtd_dump_p2m_table_level(paddr_t
> pt_maddr, int level, paddr_t gpa,
> 
>          address = gpa + offset_level_address(i, level);
>          if ( next_level >= 1 )
> -            vtd_dump_p2m_table_level(dma_pte_addr(*pte), next_level,
> -                                     address, indent + 1);
> +            vtd_dump_page_table_level(dma_pte_addr(*pte), next_level,
> +                                      address, indent + 1);
>          else
>              printk("%*sdfn: %08lx mfn: %08lx\n",
>                     indent, "",
> @@ -2624,17 +2625,20 @@ static void vtd_dump_p2m_table_level(paddr_t
> pt_maddr, int level, paddr_t gpa,
>      unmap_vtd_domain_page(pt_vaddr);
>  }
> 
> -static void vtd_dump_p2m_table(struct domain *d)
> +static void vtd_dump_page_tables(struct domain *d)
>  {
> -    const struct domain_iommu *hd;
> +    const struct domain_iommu *hd = dom_iommu(d);
> 
> -    if ( list_empty(&acpi_drhd_units) )
> +    if ( iommu_use_hap_pt(d) )
> +    {
> +        printk(VTDPREFIX " %pd sharing EPT table\n", d);
>          return;
> +    }
> 
> -    hd = dom_iommu(d);
> -    printk("p2m table has %d levels\n", agaw_to_level(hd->arch.vtd.agaw));
> -    vtd_dump_p2m_table_level(hd->arch.vtd.pgd_maddr,
> -                             agaw_to_level(hd->arch.vtd.agaw), 0, 0);
> +    printk(VTDPREFIX" %pd table has %d levels\n", d,
> +           agaw_to_level(hd->arch.vtd.agaw));
> +    vtd_dump_page_table_level(hd->arch.vtd.pgd_maddr,
> +                              agaw_to_level(hd->arch.vtd.agaw), 0, 0);
>  }
> 
>  static int __init intel_iommu_quarantine_init(struct domain *d)
> @@ -2734,7 +2738,7 @@ static struct iommu_ops __initdata vtd_ops = {
>      .iotlb_flush = iommu_flush_iotlb_pages,
>      .iotlb_flush_all = iommu_flush_iotlb_all,
>      .get_reserved_device_memory = intel_iommu_get_reserved_device_memory,
> -    .dump_p2m_table = vtd_dump_p2m_table,
> +    .dump_page_tables = vtd_dump_page_tables,
>  };
> 
>  const struct iommu_init_ops __initconstrel intel_iommu_init_ops = {
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 373145266f..7a539522b1 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -276,7 +276,7 @@ struct iommu_ops {
>                                      unsigned int flush_flags);
>      int __must_check (*iotlb_flush_all)(struct domain *d);
>      int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
> -    void (*dump_p2m_table)(struct domain *d);
> +    void (*dump_page_tables)(struct domain *d);
> 
>  #ifdef CONFIG_HAS_DEVICE_TREE
>      /*
> --
> 2.20.1




* RE: [PATCH v9 5/8] remove remaining uses of iommu_legacy_map/unmap
  2020-09-15  8:29 ` [PATCH v9 5/8] remove remaining uses of iommu_legacy_map/unmap Paul Durrant
  2020-09-16  7:53   ` Julien Grall
@ 2020-09-17  7:40   ` Tian, Kevin
  1 sibling, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2020-09-17  7:40 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Paul Durrant, Jan Beulich, Andrew Cooper, Wei Liu,
	Roger Pau Monné,
	George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Nakajima, Jun

> From: Paul Durrant <paul@xen.org>
> Sent: Tuesday, September 15, 2020 4:30 PM
> 
> From: Paul Durrant <pdurrant@amazon.com>
> 
> The 'legacy' functions do implicit flushing so amend the callers to do the
> appropriate flushing.
> 
> Unfortunately, because of the structure of the P2M code, we cannot remove
> the per-CPU 'iommu_dont_flush_iotlb' global and the optimization it
> facilitates. Checking of this flag is now done only in relevant callers of
> iommu_iotlb_flush(). Also, 'iommu_dont_flush_iotlb' is now declared
> as bool (rather than bool_t) and setting/clearing it are no longer pointlessly
> gated on is_iommu_enabled() returning true. (Arguably it is also pointless to
> gate the call to iommu_iotlb_flush() on that condition - since it is a no-op
> in that case - but the if clause allows the scope of a stack variable to be
> restricted).
> 
> NOTE: The code in memory_add() now sets 'ret' if iommu_map() or
>       iommu_iotlb_flush() fails.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Wei Liu <wl@xen.org>
> Cc: "Roger Pau Monné" <roger.pau@citrix.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Julien Grall <julien@xen.org>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> 
> v9:
>  - Moved check of 'iommu_dont_flush_iotlb' out of iommu_iotlb_flush() and
>    into callers to avoid re-introducing a problem on Arm
>  - Dropped Jan's R-b due to change
> 
> v6:
>  - Fix formatting problem in memory_add()
> 
> v5:
>  - Re-base
>  - Removed failure case on overflow of unsigned int as it is no longer
>    necessary
> 
> v3:
>  - Same as v2; elected to implement batch flushing in the grant table code as
>    a subsequent patch
> 
> v2:
>  - Shorten the diff (mainly because of a prior patch introducing automatic
>    flush-on-fail into iommu_map() and iommu_unmap())
> ---
>  xen/arch/x86/mm.c               | 24 +++++++++++++++++-------
>  xen/arch/x86/mm/p2m-ept.c       | 21 +++++++++++++--------
>  xen/arch/x86/mm/p2m-pt.c        | 16 ++++++++++++----
>  xen/arch/x86/mm/p2m.c           | 25 +++++++++++++++++--------
>  xen/arch/x86/x86_64/mm.c        | 20 +++++++-------------
>  xen/common/grant_table.c        | 29 ++++++++++++++++++++++-------
>  xen/common/memory.c             |  7 +++----
>  xen/drivers/passthrough/iommu.c | 23 -----------------------
>  xen/include/xen/iommu.h         | 21 +++++----------------
>  9 files changed, 96 insertions(+), 90 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 9447dabc50..fec8a19920 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -2441,10 +2441,16 @@ static int cleanup_page_mappings(struct
> page_info *page)
> 
>          if ( d && unlikely(need_iommu_pt_sync(d)) && is_pv_domain(d) )
>          {
> > -            int rc2 = iommu_legacy_unmap(d, _dfn(mfn), 1u << PAGE_ORDER_4K);
> +            unsigned int flush_flags = 0;
> +            int err;
> +
> > +            err = iommu_unmap(d, _dfn(mfn), 1ul << PAGE_ORDER_4K, &flush_flags);
> +            if ( !err && !this_cpu(iommu_dont_flush_iotlb) )
> +                err = iommu_iotlb_flush(d, _dfn(mfn), 1ul << PAGE_ORDER_4K,
> +                                        flush_flags);
> 
>              if ( !rc )
> -                rc = rc2;
> +                rc = err;
>          }
> 
>          if ( likely(!is_special_page(page)) )
> @@ -2966,14 +2972,18 @@ static int _get_page_type(struct page_info *page,
> unsigned long type,
>          if ( d && unlikely(need_iommu_pt_sync(d)) && is_pv_domain(d) )
>          {
>              mfn_t mfn = page_to_mfn(page);
> +            dfn_t dfn = _dfn(mfn_x(mfn));
> +            unsigned int flush_flags = 0;
> 
>              if ( (x & PGT_type_mask) == PGT_writable_page )
> -                rc = iommu_legacy_unmap(d, _dfn(mfn_x(mfn)),
> -                                        1ul << PAGE_ORDER_4K);
> +                rc = iommu_unmap(d, dfn, 1ul << PAGE_ORDER_4K, &flush_flags);
>              else
> -                rc = iommu_legacy_map(d, _dfn(mfn_x(mfn)), mfn,
> -                                      1ul << PAGE_ORDER_4K,
> -                                      IOMMUF_readable | IOMMUF_writable);
> +                rc = iommu_map(d, dfn, mfn, 1ul << PAGE_ORDER_4K,
> +                               IOMMUF_readable | IOMMUF_writable, &flush_flags);
> +
> +            if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
> +                rc = iommu_iotlb_flush(d, dfn, 1ul << PAGE_ORDER_4K,
> +                                       flush_flags);
> 
>              if ( unlikely(rc) )
>              {
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index 12cf38f6eb..71473fd31d 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -842,15 +842,20 @@ out:
>      if ( rc == 0 && p2m_is_hostp2m(p2m) &&
>           need_modify_vtd_table )
>      {
> -        if ( iommu_use_hap_pt(d) )
> -            rc = iommu_iotlb_flush(d, _dfn(gfn), 1ul << order,
> -                                   (iommu_flags ? IOMMU_FLUSHF_added : 0) |
> -                                   (vtd_pte_present ? IOMMU_FLUSHF_modified
> -                                                    : 0));
> -        else if ( need_iommu_pt_sync(d) )
> +        unsigned int flush_flags = 0;
> +
> +        if ( need_iommu_pt_sync(d) )
>              rc = iommu_flags ?
> -                iommu_legacy_map(d, _dfn(gfn), mfn, 1ul << order, iommu_flags) :
> -                iommu_legacy_unmap(d, _dfn(gfn), 1ul << order);
> +                iommu_map(d, _dfn(gfn), mfn, 1ul << order, iommu_flags,
> +                          &flush_flags) :
> +                iommu_unmap(d, _dfn(gfn), 1ul << order, &flush_flags);
> +        else if ( iommu_use_hap_pt(d) )
> +            flush_flags =
> +                (iommu_flags ? IOMMU_FLUSHF_added : 0) |
> +                (vtd_pte_present ? IOMMU_FLUSHF_modified : 0);
> +
> +        if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
> +            rc = iommu_iotlb_flush(d, _dfn(gfn), 1ul << order, flush_flags);
>      }
> 
>      unmap_domain_page(table);
> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
> index 3af51be78e..96cacb6748 100644
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -678,10 +678,18 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
> gfn_t gfn_, mfn_t mfn,
> 
>      if ( need_iommu_pt_sync(p2m->domain) &&
>           (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) )
> -        rc = iommu_pte_flags
> -             ? iommu_legacy_map(d, _dfn(gfn), mfn, 1ul << page_order,
> -                                iommu_pte_flags)
> -             : iommu_legacy_unmap(d, _dfn(gfn), 1ul << page_order);
> +    {
> +        unsigned int flush_flags = 0;
> +
> +        rc = iommu_pte_flags ?
> +            iommu_map(d, _dfn(gfn), mfn, 1ul << page_order, iommu_pte_flags,
> +                      &flush_flags) :
> +            iommu_unmap(d, _dfn(gfn), 1ul << page_order, &flush_flags);
> +
> +        if ( !rc && !this_cpu(iommu_dont_flush_iotlb) )
> +            rc = iommu_iotlb_flush(d, _dfn(gfn), 1ul << page_order,
> +                                   flush_flags);
> +    }
> 
>      /*
>       * Free old intermediate tables if necessary.  This has to be the
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index 928344be30..01ff92862d 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1350,11 +1350,15 @@ int set_identity_p2m_entry(struct domain *d,
> unsigned long gfn_l,
> 
>      if ( !paging_mode_translate(p2m->domain) )
>      {
> -        if ( !is_iommu_enabled(d) )
> -            return 0;
> -        return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l),
> -                                1ul << PAGE_ORDER_4K,
> -                                IOMMUF_readable | IOMMUF_writable);
> +        unsigned int flush_flags = 0;
> +
> +        ret = iommu_map(d, _dfn(gfn_l), _mfn(gfn_l), 1ul << PAGE_ORDER_4K,
> +                        IOMMUF_readable | IOMMUF_writable, &flush_flags);
> +        if ( !ret )
> +            ret = iommu_iotlb_flush(d, _dfn(gfn_l), 1ul << PAGE_ORDER_4K,
> +                                    flush_flags);
> +
> +        return ret;
>      }
> 
>      gfn_lock(p2m, gfn, 0);
> @@ -1442,9 +1446,14 @@ int clear_identity_p2m_entry(struct domain *d,
> unsigned long gfn_l)
> 
>      if ( !paging_mode_translate(d) )
>      {
> -        if ( !is_iommu_enabled(d) )
> -            return 0;
> -        return iommu_legacy_unmap(d, _dfn(gfn_l), 1ul << PAGE_ORDER_4K);
> +        unsigned int flush_flags = 0;
> +
> > +        ret = iommu_unmap(d, _dfn(gfn_l), 1ul << PAGE_ORDER_4K, &flush_flags);
> +        if ( !ret )
> +            ret = iommu_iotlb_flush(d, _dfn(gfn_l), 1ul << PAGE_ORDER_4K,
> +                                    flush_flags);
> +
> +        return ret;
>      }
> 
>      gfn_lock(p2m, gfn, 0);
> diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
> index bce1561e1a..7e9d165449 100644
> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -1284,21 +1284,15 @@ int memory_add(unsigned long spfn, unsigned
> long epfn, unsigned int pxm)
>           !iommu_use_hap_pt(hardware_domain) &&
>           !need_iommu_pt_sync(hardware_domain) )
>      {
> -        for ( i = spfn; i < epfn; i++ )
> -            if ( iommu_legacy_map(hardware_domain, _dfn(i), _mfn(i),
> -                                  1ul << PAGE_ORDER_4K,
> -                                  IOMMUF_readable | IOMMUF_writable) )
> -                break;
> -        if ( i != epfn )
> -        {
> -            while (i-- > old_max)
> -                /* If statement to satisfy __must_check. */
> -                if ( iommu_legacy_unmap(hardware_domain, _dfn(i),
> -                                        1ul << PAGE_ORDER_4K) )
> -                    continue;
> +        unsigned int flush_flags = 0;
> +        unsigned long n = epfn - spfn;
> 
> +        ret = iommu_map(hardware_domain, _dfn(i), _mfn(i), n,
> +                        IOMMUF_readable | IOMMUF_writable, &flush_flags);
> +        if ( !ret )
> +            ret = iommu_iotlb_flush(hardware_domain, _dfn(i), n, flush_flags);
> +        if ( ret )
>              goto destroy_m2p;
> -        }
>      }
> 
>      /* We can't revert any more */
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index a5d3ed8bda..beb6b2d40d 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -1225,11 +1225,22 @@ map_grant_ref(
>              kind = IOMMUF_readable;
>          else
>              kind = 0;
> -        if ( kind && iommu_legacy_map(ld, _dfn(mfn_x(mfn)), mfn, 1, kind) )
> +
> +        if ( kind )
>          {
> -            double_gt_unlock(lgt, rgt);
> -            rc = GNTST_general_error;
> -            goto undo_out;
> +            dfn_t dfn = _dfn(mfn_x(mfn));
> +            unsigned int flush_flags = 0;
> +            int err;
> +
> +            err = iommu_map(ld, dfn, mfn, 1, kind, &flush_flags);
> +            if ( !err )
> +                err = iommu_iotlb_flush(ld, dfn, 1, flush_flags);
> +            if ( err )
> +            {
> +                double_gt_unlock(lgt, rgt);
> +                rc = GNTST_general_error;
> +                goto undo_out;
> +            }
>          }
>      }
> 
> @@ -1473,19 +1484,23 @@ unmap_common(
>      if ( rc == GNTST_okay && gnttab_need_iommu_mapping(ld) )
>      {
>          unsigned int kind;
> +        dfn_t dfn = _dfn(mfn_x(op->mfn));
> +        unsigned int flush_flags = 0;
>          int err = 0;
> 
>          double_gt_lock(lgt, rgt);
> 
>          kind = mapkind(lgt, rd, op->mfn);
>          if ( !kind )
> -            err = iommu_legacy_unmap(ld, _dfn(mfn_x(op->mfn)), 1);
> +            err = iommu_unmap(ld, dfn, 1, &flush_flags);
>          else if ( !(kind & MAPKIND_WRITE) )
> -            err = iommu_legacy_map(ld, _dfn(mfn_x(op->mfn)), op->mfn, 1,
> -                                   IOMMUF_readable);
> +            err = iommu_map(ld, dfn, op->mfn, 1, IOMMUF_readable,
> +                            &flush_flags);
> 
>          double_gt_unlock(lgt, rgt);
> 
> +        if ( !err )
> +            err = iommu_iotlb_flush(ld, dfn, 1, flush_flags);
>          if ( err )
>              rc = GNTST_general_error;
>      }
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 714077c1e5..fedbd9019e 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -824,8 +824,7 @@ int xenmem_add_to_physmap(struct domain *d,
> struct xen_add_to_physmap *xatp,
>      xatp->gpfn += start;
>      xatp->size -= start;
> 
> -    if ( is_iommu_enabled(d) )
> -       this_cpu(iommu_dont_flush_iotlb) = 1;
> +    this_cpu(iommu_dont_flush_iotlb) = true;
> 
>      while ( xatp->size > done )
>      {
> @@ -845,12 +844,12 @@ int xenmem_add_to_physmap(struct domain *d,
> struct xen_add_to_physmap *xatp,
>          }
>      }
> 
> +    this_cpu(iommu_dont_flush_iotlb) = false;
> +
>      if ( is_iommu_enabled(d) )
>      {
>          int ret;
> 
> -        this_cpu(iommu_dont_flush_iotlb) = 0;
> -
>          ret = iommu_iotlb_flush(d, _dfn(xatp->idx - done), done,
>                                  IOMMU_FLUSHF_added | IOMMU_FLUSHF_modified);
>          if ( unlikely(ret) && rc >= 0 )
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index 87f9a857bb..a9da4d2b06 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -282,18 +282,6 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t
> mfn,
>      return rc;
>  }
> 
> -int iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
> -                     unsigned long page_count, unsigned int flags)
> -{
> -    unsigned int flush_flags = 0;
> -    int rc = iommu_map(d, dfn, mfn, page_count, flags, &flush_flags);
> -
> -    if ( !this_cpu(iommu_dont_flush_iotlb) && !rc )
> -        rc = iommu_iotlb_flush(d, dfn, page_count, flush_flags);
> -
> -    return rc;
> -}
> -
>  int iommu_unmap(struct domain *d, dfn_t dfn, unsigned long page_count,
>                  unsigned int *flush_flags)
>  {
> @@ -338,17 +326,6 @@ int iommu_unmap(struct domain *d, dfn_t dfn,
> unsigned long page_count,
>      return rc;
>  }
> 
> > -int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned long page_count)
> -{
> -    unsigned int flush_flags = 0;
> -    int rc = iommu_unmap(d, dfn, page_count, &flush_flags);
> -
> -    if ( !this_cpu(iommu_dont_flush_iotlb) && !rc )
> -        rc = iommu_iotlb_flush(d, dfn, page_count, flush_flags);
> -
> -    return rc;
> -}
> -
>  int iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
>                        unsigned int *flags)
>  {
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 13f68dc93d..a2eefe8582 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -151,16 +151,8 @@ int __must_check iommu_map(struct domain *d,
> dfn_t dfn, mfn_t mfn,
>  int __must_check iommu_unmap(struct domain *d, dfn_t dfn,
>                               unsigned long page_count,
>                               unsigned int *flush_flags);
> -
> > -int __must_check iommu_legacy_map(struct domain *d, dfn_t dfn, mfn_t mfn,
> -                                  unsigned long page_count,
> -                                  unsigned int flags);
> -int __must_check iommu_legacy_unmap(struct domain *d, dfn_t dfn,
> -                                    unsigned long page_count);
> -
> >  int __must_check iommu_lookup_page(struct domain *d, dfn_t dfn, mfn_t *mfn,
>                                     unsigned int *flags);
> -
>  int __must_check iommu_iotlb_flush(struct domain *d, dfn_t dfn,
>                                     unsigned long page_count,
>                                     unsigned int flush_flags);
> @@ -369,15 +361,12 @@ void iommu_dev_iotlb_flush_timeout(struct
> domain *d, struct pci_dev *pdev);
> 
>  /*
>   * The purpose of the iommu_dont_flush_iotlb optional cpu flag is to
> - * avoid unecessary iotlb_flush in the low level IOMMU code.
> - *
> - * iommu_map_page/iommu_unmap_page must flush the iotlb but
> somethimes
> - * this operation can be really expensive. This flag will be set by the
> - * caller to notify the low level IOMMU code to avoid the iotlb flushes.
> - * iommu_iotlb_flush/iommu_iotlb_flush_all will be explicitly called by
> - * the caller.
> + * avoid unnecessary IOMMU flushing while updating the P2M.
> + * Setting the value to true will cause iommu_iotlb_flush() to return without
> + * actually performing a flush. A batch flush must therefore be done by the
> + * calling code after setting the value back to false.
>   */
> -DECLARE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
> +DECLARE_PER_CPU(bool, iommu_dont_flush_iotlb);
> 
>  extern struct spinlock iommu_pt_cleanup_lock;
>  extern struct page_list_head iommu_pt_cleanup_list;
> --
> 2.20.1



* Re: [PATCH v9 4/8] iommu: make map and unmap take a page count, similar to flush
  2020-09-15  8:29 ` [PATCH v9 4/8] iommu: make map and unmap take a page count, similar to flush Paul Durrant
  2020-09-17  6:38   ` Tian, Kevin
@ 2020-09-17 16:51   ` Wei Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Wei Liu @ 2020-09-17 16:51 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Paul Durrant, Jan Beulich, Julien Grall,
	Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Ian Jackson, Stefano Stabellini, Jun Nakajima,
	Kevin Tian, Bertrand Marquis, Oleksandr Tyshchenko

On Tue, Sep 15, 2020 at 09:29:32AM +0100, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> At the moment iommu_map() and iommu_unmap() take a page order rather than a
> count, whereas iommu_iotlb_flush() takes a page count rather than an order.
> This patch makes them consistent with each other, opting for a page count since
> CPU page orders are not necessarily the same as those of an IOMMU.
> 
> NOTE: The 'page_count' parameter is also made an unsigned long in all the
>       aforementioned functions.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Julien Grall <julien@xen.org>

Reviewed-by: Wei Liu <wl@xen.org>
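
For illustration, a minimal sketch of the shape callers take once the later
patches in this series drop the legacy wrappers; the call site below is
hypothetical and its error handling is simplified, but the signatures match
the hunks quoted elsewhere in this thread:

    unsigned int flush_flags = 0;
    int rc;

    /* '1ul << PAGE_ORDER_4K' is now a page count, not a page order. */
    rc = iommu_map(d, dfn, mfn, 1ul << PAGE_ORDER_4K,
                   IOMMUF_readable | IOMMUF_writable, &flush_flags);
    if ( !rc )
        rc = iommu_iotlb_flush(d, dfn, 1ul << PAGE_ORDER_4K, flush_flags);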



* Re: [PATCH v9 6/8] common/grant_table: batch flush I/O TLB
  2020-09-15  8:29 ` [PATCH v9 6/8] common/grant_table: batch flush I/O TLB Paul Durrant
  2020-09-16  8:11   ` Julien Grall
@ 2020-09-17 16:51   ` Wei Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Wei Liu @ 2020-09-17 16:51 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Paul Durrant, Jan Beulich, Andrew Cooper,
	George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Wei Liu

On Tue, Sep 15, 2020 at 09:29:34AM +0100, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> This patch avoids calling iommu_iotlb_flush() for each individual GNTTABOP and
> instead calls iommu_iotlb_flush_all() at the end of a batch. This should mean
> non-singleton map/unmap operations perform better.
> 
> NOTE: A batch is the number of operations done before a pre-emption check and,
>       in the case of unmap, a TLB flush.
> 
> Suggested-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wl@xen.org>
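
As a rough sketch of the batching described above; the per-operation helper
do_one_gnttab_op(), the 'ops' array and 'batch_size' are placeholders rather
than the actual grant table code, and the final flush assumes the
iommu_iotlb_flush_all(d, flush_flags) wrapper form:

    unsigned int flush_flags = 0;
    unsigned int i;
    int rc = 0;

    for ( i = 0; i < batch_size; i++ )
    {
        /* Each op only accumulates the flush flags it would have needed. */
        rc = do_one_gnttab_op(d, &ops[i], &flush_flags);
        if ( rc )
            break;
    }

    /* One flush at the end of the batch covers all the updated mappings. */
    if ( flush_flags )
        rc = iommu_iotlb_flush_all(d, flush_flags);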



Thread overview: 19+ messages
2020-09-15  8:29 [PATCH v9 0/8] IOMMU cleanup Paul Durrant
2020-09-15  8:29 ` [PATCH v9 1/8] x86/iommu: convert VT-d code to use new page table allocator Paul Durrant
2020-09-17  6:33   ` Tian, Kevin
2020-09-15  8:29 ` [PATCH v9 2/8] iommu: remove unused iommu_ops method and tasklet Paul Durrant
2020-09-15  8:29 ` [PATCH v9 3/8] iommu: flush I/O TLB if iommu_map() or iommu_unmap() fail Paul Durrant
2020-09-15  8:29 ` [PATCH v9 4/8] iommu: make map and unmap take a page count, similar to flush Paul Durrant
2020-09-17  6:38   ` Tian, Kevin
2020-09-17 16:51   ` Wei Liu
2020-09-15  8:29 ` [PATCH v9 5/8] remove remaining uses of iommu_legacy_map/unmap Paul Durrant
2020-09-16  7:53   ` Julien Grall
2020-09-17  7:40   ` Tian, Kevin
2020-09-15  8:29 ` [PATCH v9 6/8] common/grant_table: batch flush I/O TLB Paul Durrant
2020-09-16  8:11   ` Julien Grall
2020-09-17 16:51   ` Wei Liu
2020-09-15  8:29 ` [PATCH v9 7/8] iommu: remove the share_p2m operation Paul Durrant
2020-09-17  7:19   ` Tian, Kevin
2020-09-17  7:24   ` Tian, Kevin
2020-09-15  8:29 ` [PATCH v9 8/8] iommu: stop calling IOMMU page tables 'p2m tables' Paul Durrant
2020-09-17  7:25   ` Tian, Kevin
