* [PATCH v4 00/15] paravirtual IOMMU interface
@ 2018-08-01 13:40 Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible" Paul Durrant
                   ` (14 more replies)
  0 siblings, 15 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, Andrew Cooper, Ian Jackson, George Dunlap,
	Tim Deegan, Julien Grall, Paul Durrant, Tamas K Lengyel,
	Jan Beulich, Daniel De Graaf, Brian Woods, Suravee Suthikulpanit

The idea of a paravirtual IOMMU interface was last discussed on xen-devel
several years ago and converged on a draft specification [1]. An RFC patch
series implementing it was also posted at the time, but that work was never
followed through.

In this patch series I have tried to simplify the interface and have
therefore moved away from the draft specification. There is no new
specification yet, but hopefully the interface in the introduced iommu_op
header file is understandable without one.

[1] https://lists.xenproject.org/archives/html/xen-devel/2016-02/msg01428.html

Paul Durrant (15):
  re-work commit 3e06b989 "IOMMU: make page table population
    preemptible"...
  iommu: introduce the concept of BFN...
  iommu: make use of type-safe BFN and MFN in exported functions
  iommu: push use of type-safe BFN and MFN into iommu_ops
  iommu: don't domain_crash() inside iommu_map/unmap_page()
  public / x86: introduce __HYPERCALL_iommu_op
  iommu: track reserved ranges using a rangeset
  x86: add iommu_op to query reserved ranges
  vtd: add lookup_page method to iommu_ops
  mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt()
  x86: add iommu_op to enable modification of IOMMU mappings
  memory: add get_paged_gfn() as a wrapper...
  x86: add iommu_ops to modify and flush IOMMU mappings
  x86: extend the map and unmap iommu_ops to support grant references

 tools/flask/policy/modules/xen.if             |   1 +
 xen/arch/arm/p2m.c                            |   9 +-
 xen/arch/x86/Makefile                         |   1 +
 xen/arch/x86/domain.c                         |   6 -
 xen/arch/x86/hvm/emulate.c                    |  32 +-
 xen/arch/x86/hvm/hvm.c                        |  16 +-
 xen/arch/x86/hvm/hypercall.c                  |   1 +
 xen/arch/x86/hvm/mtrr.c                       |   5 +-
 xen/arch/x86/hypercall.c                      |   1 +
 xen/arch/x86/iommu_op.c                       | 558 ++++++++++++++++++++++++++
 xen/arch/x86/mm.c                             |  15 +-
 xen/arch/x86/mm/mem_sharing.c                 |   2 +-
 xen/arch/x86/mm/p2m-ept.c                     |  19 +-
 xen/arch/x86/mm/p2m-pod.c                     |   3 +-
 xen/arch/x86/mm/p2m-pt.c                      |  52 ++-
 xen/arch/x86/mm/p2m.c                         |  44 +-
 xen/arch/x86/mm/paging.c                      |   2 +-
 xen/arch/x86/pv/hypercall.c                   |   1 +
 xen/arch/x86/x86_64/mm.c                      |   8 +-
 xen/common/grant_table.c                      | 201 ++++++++--
 xen/common/memory.c                           |  69 +++-
 xen/common/vm_event.c                         |   2 +-
 xen/drivers/passthrough/amd/iommu_cmd.c       |  18 +-
 xen/drivers/passthrough/amd/iommu_map.c       |  86 ++--
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   6 +-
 xen/drivers/passthrough/arm/smmu.c            |  20 +-
 xen/drivers/passthrough/device_tree.c         |  19 +-
 xen/drivers/passthrough/iommu.c               |  84 ++--
 xen/drivers/passthrough/pci.c                 |   6 +-
 xen/drivers/passthrough/vtd/iommu.c           |  90 +++--
 xen/drivers/passthrough/vtd/iommu.h           |   3 +
 xen/drivers/passthrough/vtd/x86/vtd.c         |  23 +-
 xen/drivers/passthrough/x86/iommu.c           |  83 ++--
 xen/include/Makefile                          |   2 +
 xen/include/asm-arm/grant_table.h             |   2 +-
 xen/include/asm-arm/iommu.h                   |   2 +-
 xen/include/asm-arm/p2m.h                     |   3 +
 xen/include/asm-x86/grant_table.h             |   2 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   8 +-
 xen/include/asm-x86/iommu.h                   |   6 +-
 xen/include/asm-x86/p2m.h                     |   2 +
 xen/include/public/iommu_op.h                 | 171 ++++++++
 xen/include/public/xen.h                      |   1 +
 xen/include/xen/grant_table.h                 |   7 +
 xen/include/xen/hypercall.h                   |  12 +
 xen/include/xen/iommu.h                       |  66 ++-
 xen/include/xen/mm.h                          |  10 +-
 xen/include/xen/sched.h                       |  12 +-
 xen/include/xlat.lst                          |   6 +
 xen/include/xsm/dummy.h                       |   6 +
 xen/include/xsm/xsm.h                         |   6 +
 xen/xsm/dummy.c                               |   1 +
 xen/xsm/flask/hooks.c                         |   6 +
 xen/xsm/flask/policy/access_vectors           |   2 +
 54 files changed, 1459 insertions(+), 360 deletions(-)
 create mode 100644 xen/arch/x86/iommu_op.c
 create mode 100644 xen/include/public/iommu_op.h
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Brian Woods <brian.woods@amd.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
-- 
2.11.0




* [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible"...
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 16:15   ` Roger Pau Monné
  2018-08-02  7:19   ` Jan Beulich
  2018-08-01 13:40 ` [PATCH v4 02/15] iommu: introduce the concept of BFN Paul Durrant
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

...to simplify the implementation and turn need_iommu back into a boolean.

As noted in [1], the tri-state nature of need_iommu after commit 3e06b989 is
confusing, as is the implementation of pre-emption using relmem_list.

This patch instead uses a simple count of pages already populated, stored in
the x86 variant of struct arch_iommu, and skips over that number of pages
when arch_iommu_populate_page_table() is re-invoked after pre-emption.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg01870.html
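
As an aside, the resume-by-count pattern adopted here can be sketched in
isolation as follows. This is a hedged, self-contained illustration rather
than Xen code; all names (fake_domain, populate(), PREEMPT_EVERY, etc.) are
hypothetical stand-ins for hd->arch.populated and
arch_iommu_populate_page_table():

    #include <stdio.h>

    #define PREEMPT_EVERY 4        /* pretend we must yield every 4 mappings */
    #define ERESTART      1

    struct fake_domain {
        unsigned long populated;   /* analogous to hd->arch.populated */
    };

    /* Map pages[0..n), skipping entries a previous pass already handled. */
    static int populate(struct fake_domain *d, const int *pages, int n)
    {
        unsigned long skip = d->populated;

        for ( int i = 0; i < n; i++ )
        {
            if ( pages[i] < 0 )
                continue;              /* unmappable: never counted or skipped */
            if ( skip && skip-- )
                continue;              /* mapped on an earlier, preempted pass */

            printf("mapping page %d\n", pages[i]);

            if ( !(++d->populated % PREEMPT_EVERY) )
                return -ERESTART;      /* caller re-invokes to continue */
        }

        return 0;
    }

    int main(void)
    {
        const int pages[] = { 1, -1, 2, 3, 4, 5, -1, 6 };
        struct fake_domain d = { 0 };

        while ( populate(&d, pages, 8) == -ERESTART )
            ;                          /* stands in for hypercall continuation */

        return 0;
    }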

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>

v4:
 - New in v4.
---
 xen/arch/x86/domain.c                 |  6 ---
 xen/arch/x86/mm/p2m-pod.c             |  3 +-
 xen/drivers/passthrough/device_tree.c | 19 ++++-----
 xen/drivers/passthrough/iommu.c       |  8 ++--
 xen/drivers/passthrough/x86/iommu.c   | 79 ++++++++++++++---------------------
 xen/include/asm-x86/iommu.h           |  1 +
 xen/include/xen/sched.h               |  4 +-
 7 files changed, 48 insertions(+), 72 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index b5bb0f3b22..e68591d791 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1983,12 +1983,6 @@ int domain_relinquish_resources(struct domain *d)
         }
 
         d->arch.relmem = RELMEM_xen;
-
-        spin_lock(&d->page_alloc_lock);
-        page_list_splice(&d->arch.relmem_list, &d->page_list);
-        INIT_PAGE_LIST_HEAD(&d->arch.relmem_list);
-        spin_unlock(&d->page_alloc_lock);
-
         /* Fallthrough. Relinquish every page of memory. */
     case RELMEM_xen:
         ret = relinquish_memory(d, &d->xenpage_list, ~0UL);
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 631e9aec33..80f5ab1ea4 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -462,8 +462,7 @@ p2m_pod_offline_or_broken_hit(struct page_info *p)
 
 pod_hit:
     lock_page_alloc(p2m);
-    /* Insertion must be at list head (see iommu_populate_page_table()). */
-    page_list_add(p, &d->arch.relmem_list);
+    page_list_add_tail(p, &d->arch.relmem_list);
     unlock_page_alloc(p2m);
     pod_unlock(p2m);
     return 1;
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 421f003438..671c8db1d0 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -40,17 +40,14 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
     if ( !list_empty(&dev->domain_list) )
         goto fail;
 
-    if ( need_iommu(d) <= 0 )
-    {
-        /*
-         * The hwdom is forced to use IOMMU for protecting assigned
-         * device. Therefore the IOMMU data is already set up.
-         */
-        ASSERT(!is_hardware_domain(d));
-        rc = iommu_construct(d);
-        if ( rc )
-            goto fail;
-    }
+    /*
+     * The hwdom is forced to use IOMMU for protecting assigned
+     * device. Therefore the IOMMU data is already set up.
+     */
+    ASSERT(!is_hardware_domain(d) || need_iommu(d));
+    rc = iommu_construct(d);
+    if ( rc )
+        goto fail;
 
     /* The flag field doesn't matter to DT device. */
     rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev), 0);
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 70d218f910..49974e5634 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -214,14 +214,14 @@ void iommu_teardown(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
 
-    d->need_iommu = 0;
+    d->need_iommu = false;
     hd->platform_ops->teardown(d);
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
 int iommu_construct(struct domain *d)
 {
-    if ( need_iommu(d) > 0 )
+    if ( need_iommu(d) )
         return 0;
 
     if ( !iommu_use_hap_pt(d) )
@@ -233,7 +233,7 @@ int iommu_construct(struct domain *d)
             return rc;
     }
 
-    d->need_iommu = 1;
+    d->need_iommu = true;
     /*
      * There may be dirty cache lines when a device is assigned
      * and before need_iommu(d) becoming true, this will cause
@@ -493,7 +493,7 @@ static void iommu_dump_p2m_table(unsigned char key)
     ops = iommu_get_ops();
     for_each_domain(d)
     {
-        if ( is_hardware_domain(d) || need_iommu(d) <= 0 )
+        if ( is_hardware_domain(d) || !need_iommu(d) )
             continue;
 
         if ( iommu_use_hap_pt(d) )
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 68182afd91..22feba572b 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -41,62 +41,47 @@ int __init iommu_setup_hpet_msi(struct msi_desc *msi)
 
 int arch_iommu_populate_page_table(struct domain *d)
 {
-    const struct domain_iommu *hd = dom_iommu(d);
+    struct domain_iommu *hd = dom_iommu(d);
+    unsigned long skip = hd->arch.populated;
     struct page_info *page;
-    int rc = 0, n = 0;
+    int rc = 0;
 
-    d->need_iommu = -1;
+    if ( unlikely(d->is_dying) )
+        return -ESRCH;
 
     this_cpu(iommu_dont_flush_iotlb) = 1;
     spin_lock(&d->page_alloc_lock);
 
-    if ( unlikely(d->is_dying) )
-        rc = -ESRCH;
-
-    while ( !rc && (page = page_list_remove_head(&d->page_list)) )
+    page_list_for_each ( page, &d->page_list )
     {
-        if ( is_hvm_domain(d) ||
-            (page->u.inuse.type_info & PGT_type_mask) == PGT_writable_page )
-        {
-            unsigned long mfn = mfn_x(page_to_mfn(page));
-            unsigned long gfn = mfn_to_gmfn(d, mfn);
-
-            if ( gfn != gfn_x(INVALID_GFN) )
-            {
-                ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
-                BUG_ON(SHARED_M2P(gfn));
-                rc = hd->platform_ops->map_page(d, gfn, mfn,
-                                                IOMMUF_readable |
-                                                IOMMUF_writable);
-            }
-            if ( rc )
-            {
-                page_list_add(page, &d->page_list);
-                break;
-            }
-        }
-        page_list_add_tail(page, &d->arch.relmem_list);
-        if ( !(++n & 0xff) && !page_list_empty(&d->page_list) &&
-             hypercall_preempt_check() )
-            rc = -ERESTART;
-    }
+        unsigned long mfn;
+        unsigned long gfn;
 
-    if ( !rc )
-    {
-        /*
-         * The expectation here is that generally there are many normal pages
-         * on relmem_list (the ones we put there) and only few being in an
-         * offline/broken state. The latter ones are always at the head of the
-         * list. Hence we first move the whole list, and then move back the
-         * first few entries.
-         */
-        page_list_move(&d->page_list, &d->arch.relmem_list);
-        while ( !page_list_empty(&d->page_list) &&
-                (page = page_list_first(&d->page_list),
-                 (page->count_info & (PGC_state|PGC_broken))) )
+        if ( !is_hvm_domain(d) &&
+            (page->u.inuse.type_info & PGT_type_mask) != PGT_writable_page )
+            continue;
+
+        mfn = mfn_x(page_to_mfn(page));
+        gfn = mfn_to_gmfn(d, mfn);
+
+        if ( gfn == gfn_x(INVALID_GFN) )
+            continue;
+
+        if ( skip && skip-- ) /* make sure this doesn't underflow */
+            continue;
+
+        ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
+        BUG_ON(SHARED_M2P(gfn));
+        rc = hd->platform_ops->map_page(d, gfn, mfn,
+                                        IOMMUF_readable |
+                                        IOMMUF_writable);
+        if ( rc )
+            break;
+
+        if ( !(hd->arch.populated++ & 0xff) && hypercall_preempt_check() )
         {
-            page_list_del(page, &d->page_list);
-            page_list_add_tail(page, &d->arch.relmem_list);
+            rc = -ERESTART;
+            break;
         }
     }
 
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 14ad0489a6..0dcead3b6c 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -37,6 +37,7 @@ struct arch_iommu
     int agaw;     /* adjusted guest address width, 0 is level 2 30-bit */
     u64 iommu_bitmap;              /* bitmap of iommu(s) that the domain uses */
     struct list_head mapped_rmrrs;
+    unsigned long populated;
 
     /* amd iommu support */
     int paging_mode;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 851f11ecf7..5b9820e4e1 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -372,8 +372,8 @@ struct domain
 #ifdef CONFIG_HAS_PASSTHROUGH
     struct domain_iommu iommu;
 
-    /* Does this guest need iommu mappings (-1 meaning "being set up")? */
-    s8               need_iommu;
+    /* Does this guest need iommu mappings? */
+    bool             need_iommu;
 #endif
     /* is node-affinity automatically computed? */
     bool             auto_node_affinity;
-- 
2.11.0




* [PATCH v4 02/15] iommu: introduce the concept of BFN...
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible" Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 03/15] iommu: make use of type-safe BFN and MFN in exported functions Paul Durrant
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Suravee Suthikulpanit,
	Julien Grall, Paul Durrant, Jan Beulich

...meaning 'bus frame number', i.e. a frame number mapped in the IOMMU
rather than the MMU.

This patch is a largely cosmetic change that replaces the terms 'gfn' and
'gaddr' with 'bfn' and 'baddr' in all the places where the frame number or
address relates to the IOMMU rather than the MMU.

The parts that are not purely cosmetic are:

 - the introduction of a type-safe declaration of bfn_t and definition of
   INVALID_BFN to make the substitution of gfn_x(INVALID_GFN) mechanical.
 - the introduction of __bfn_to_baddr and __baddr_to_bfn (and type-safe
   variants without the leading __) with some use of the former.

Subsequent patches will convert code to make use of type-safe BFNs.
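
For reference, the new helpers compose along the following lines. This is a
minimal usage sketch assuming the definitions added to xen/include/xen/iommu.h
in the hunk below; the function example_roundtrip() is hypothetical and not
part of the series:

    /* Illustrative only: round-trip between a bus frame number and address. */
    static bool example_roundtrip(bfn_t bfn)
    {
        baddr_t baddr = bfn_to_baddr(bfn);   /* bfn_x(bfn) << IOMMU_PAGE_SHIFT */

        return bfn_x(bfn) != bfn_x(INVALID_BFN) &&
               bfn_x(baddr_to_bfn(baddr)) == bfn_x(bfn);
    }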

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v3:
 - Get rid of intermediate 'frame' variables again.

v2:
 - Addressed comments from Jan.
---
 xen/drivers/passthrough/amd/iommu_cmd.c     | 18 +++----
 xen/drivers/passthrough/amd/iommu_map.c     | 76 ++++++++++++++---------------
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/arm/smmu.c          | 16 +++---
 xen/drivers/passthrough/iommu.c             | 28 +++++------
 xen/drivers/passthrough/vtd/iommu.c         | 30 ++++++------
 xen/include/xen/iommu.h                     | 38 ++++++++++++---
 xen/include/xen/mm.h                        | 10 +++-
 8 files changed, 125 insertions(+), 93 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_cmd.c b/xen/drivers/passthrough/amd/iommu_cmd.c
index 08247fa354..f93becd6e1 100644
--- a/xen/drivers/passthrough/amd/iommu_cmd.c
+++ b/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -284,7 +284,7 @@ void invalidate_iommu_all(struct amd_iommu *iommu)
 }
 
 void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
-                           uint64_t gaddr, unsigned int order)
+                           baddr_t baddr, unsigned int order)
 {
     unsigned long flags;
     struct amd_iommu *iommu;
@@ -315,12 +315,12 @@ void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
 
     /* send INVALIDATE_IOTLB_PAGES command */
     spin_lock_irqsave(&iommu->lock, flags);
-    invalidate_iotlb_pages(iommu, maxpend, 0, queueid, gaddr, req_id, order);
+    invalidate_iotlb_pages(iommu, maxpend, 0, queueid, baddr, req_id, order);
     flush_command_buffer(iommu);
     spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
-static void amd_iommu_flush_all_iotlbs(struct domain *d, uint64_t gaddr,
+static void amd_iommu_flush_all_iotlbs(struct domain *d, baddr_t baddr,
                                        unsigned int order)
 {
     struct pci_dev *pdev;
@@ -333,7 +333,7 @@ static void amd_iommu_flush_all_iotlbs(struct domain *d, uint64_t gaddr,
         u8 devfn = pdev->devfn;
 
         do {
-            amd_iommu_flush_iotlb(devfn, pdev, gaddr, order);
+            amd_iommu_flush_iotlb(devfn, pdev, baddr, order);
             devfn += pdev->phantom_stride;
         } while ( devfn != pdev->devfn &&
                   PCI_SLOT(devfn) == PCI_SLOT(pdev->devfn) );
@@ -342,7 +342,7 @@ static void amd_iommu_flush_all_iotlbs(struct domain *d, uint64_t gaddr,
 
 /* Flush iommu cache after p2m changes. */
 static void _amd_iommu_flush_pages(struct domain *d,
-                                   uint64_t gaddr, unsigned int order)
+                                   baddr_t baddr, unsigned int order)
 {
     unsigned long flags;
     struct amd_iommu *iommu;
@@ -352,13 +352,13 @@ static void _amd_iommu_flush_pages(struct domain *d,
     for_each_amd_iommu ( iommu )
     {
         spin_lock_irqsave(&iommu->lock, flags);
-        invalidate_iommu_pages(iommu, gaddr, dom_id, order);
+        invalidate_iommu_pages(iommu, baddr, dom_id, order);
         flush_command_buffer(iommu);
         spin_unlock_irqrestore(&iommu->lock, flags);
     }
 
     if ( ats_enabled )
-        amd_iommu_flush_all_iotlbs(d, gaddr, order);
+        amd_iommu_flush_all_iotlbs(d, baddr, order);
 }
 
 void amd_iommu_flush_all_pages(struct domain *d)
@@ -367,9 +367,9 @@ void amd_iommu_flush_all_pages(struct domain *d)
 }
 
 void amd_iommu_flush_pages(struct domain *d,
-                           unsigned long gfn, unsigned int order)
+                           unsigned long bfn, unsigned int order)
 {
-    _amd_iommu_flush_pages(d, (uint64_t) gfn << PAGE_SHIFT, order);
+    _amd_iommu_flush_pages(d, __bfn_to_baddr(bfn), order);
 }
 
 void amd_iommu_flush_device(struct amd_iommu *iommu, uint16_t bdf)
diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 70b4345b37..4deab9cd2f 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -35,12 +35,12 @@ static unsigned int pfn_to_pde_idx(unsigned long pfn, unsigned int level)
     return idx;
 }
 
-void clear_iommu_pte_present(unsigned long l1_mfn, unsigned long gfn)
+void clear_iommu_pte_present(unsigned long l1_mfn, unsigned long bfn)
 {
     u64 *table, *pte;
 
     table = map_domain_page(_mfn(l1_mfn));
-    pte = table + pfn_to_pde_idx(gfn, IOMMU_PAGING_MODE_LEVEL_1);
+    pte = table + pfn_to_pde_idx(bfn, IOMMU_PAGING_MODE_LEVEL_1);
     *pte = 0;
     unmap_domain_page(table);
 }
@@ -104,7 +104,7 @@ static bool_t set_iommu_pde_present(u32 *pde, unsigned long next_mfn,
     return need_flush;
 }
 
-static bool_t set_iommu_pte_present(unsigned long pt_mfn, unsigned long gfn, 
+static bool_t set_iommu_pte_present(unsigned long pt_mfn, unsigned long bfn,
                                     unsigned long next_mfn, int pde_level, 
                                     bool_t iw, bool_t ir)
 {
@@ -114,7 +114,7 @@ static bool_t set_iommu_pte_present(unsigned long pt_mfn, unsigned long gfn,
 
     table = map_domain_page(_mfn(pt_mfn));
 
-    pde = (u32*)(table + pfn_to_pde_idx(gfn, pde_level));
+    pde = (u32*)(table + pfn_to_pde_idx(bfn, pde_level));
 
     need_flush = set_iommu_pde_present(pde, next_mfn, 
                                        IOMMU_PAGING_MODE_LEVEL_0, iw, ir);
@@ -331,7 +331,7 @@ static void set_pde_count(u64 *pde, unsigned int count)
  * otherwise increase pde count if mfn is contigous with mfn - 1
  */
 static int iommu_update_pde_count(struct domain *d, unsigned long pt_mfn,
-                                  unsigned long gfn, unsigned long mfn,
+                                  unsigned long bfn, unsigned long mfn,
                                   unsigned int merge_level)
 {
     unsigned int pde_count, next_level;
@@ -347,7 +347,7 @@ static int iommu_update_pde_count(struct domain *d, unsigned long pt_mfn,
 
     /* get pde at merge level */
     table = map_domain_page(_mfn(pt_mfn));
-    pde = table + pfn_to_pde_idx(gfn, merge_level);
+    pde = table + pfn_to_pde_idx(bfn, merge_level);
 
     /* get page table of next level */
     ntable_maddr = amd_iommu_get_next_table_from_pte((u32*)pde);
@@ -362,7 +362,7 @@ static int iommu_update_pde_count(struct domain *d, unsigned long pt_mfn,
     mask = (1ULL<< (PTE_PER_TABLE_SHIFT * next_level)) - 1;
 
     if ( ((first_mfn & mask) == 0) &&
-         (((gfn & mask) | first_mfn) == mfn) )
+         (((bfn & mask) | first_mfn) == mfn) )
     {
         pde_count = get_pde_count(*pde);
 
@@ -387,7 +387,7 @@ out:
 }
 
 static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
-                             unsigned long gfn, unsigned int flags,
+                             unsigned long bfn, unsigned int flags,
                              unsigned int merge_level)
 {
     u64 *table, *pde, *ntable;
@@ -398,7 +398,7 @@ static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
     ASSERT( spin_is_locked(&hd->arch.mapping_lock) && pt_mfn );
 
     table = map_domain_page(_mfn(pt_mfn));
-    pde = table + pfn_to_pde_idx(gfn, merge_level);
+    pde = table + pfn_to_pde_idx(bfn, merge_level);
 
     /* get first mfn */
     ntable_mfn = amd_iommu_get_next_table_from_pte((u32*)pde) >> PAGE_SHIFT;
@@ -436,7 +436,7 @@ static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
  * {Re, un}mapping super page frames causes re-allocation of io
  * page tables.
  */
-static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn, 
+static int iommu_pde_from_bfn(struct domain *d, unsigned long pfn,
                               unsigned long pt_mfn[])
 {
     u64 *pde, *next_table_vaddr;
@@ -477,11 +477,11 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
              next_table_mfn != 0 )
         {
             int i;
-            unsigned long mfn, gfn;
+            unsigned long mfn, bfn;
             unsigned int page_sz;
 
             page_sz = 1 << (PTE_PER_TABLE_SHIFT * (next_level - 1));
-            gfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
+            bfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
             mfn = next_table_mfn;
 
             /* allocate lower level page table */
@@ -499,10 +499,10 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
 
             for ( i = 0; i < PTE_PER_TABLE_SIZE; i++ )
             {
-                set_iommu_pte_present(next_table_mfn, gfn, mfn, next_level,
+                set_iommu_pte_present(next_table_mfn, bfn, mfn, next_level,
                                       !!IOMMUF_writable, !!IOMMUF_readable);
                 mfn += page_sz;
-                gfn += page_sz;
+                bfn += page_sz;
              }
 
             amd_iommu_flush_all_pages(d);
@@ -540,7 +540,7 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
     return 0;
 }
 
-static int update_paging_mode(struct domain *d, unsigned long gfn)
+static int update_paging_mode(struct domain *d, unsigned long bfn)
 {
     u16 bdf;
     void *device_entry;
@@ -554,13 +554,13 @@ static int update_paging_mode(struct domain *d, unsigned long gfn)
     unsigned long old_root_mfn;
     struct domain_iommu *hd = dom_iommu(d);
 
-    if ( gfn == gfn_x(INVALID_GFN) )
+    if ( bfn == bfn_x(INVALID_BFN) )
         return -EADDRNOTAVAIL;
-    ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
+    ASSERT(!(bfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
 
     level = hd->arch.paging_mode;
     old_root = hd->arch.root_table;
-    offset = gfn >> (PTE_PER_TABLE_SHIFT * (level - 1));
+    offset = bfn >> (PTE_PER_TABLE_SHIFT * (level - 1));
 
     ASSERT(spin_is_locked(&hd->arch.mapping_lock) && is_hvm_domain(d));
 
@@ -631,7 +631,7 @@ static int update_paging_mode(struct domain *d, unsigned long gfn)
     return 0;
 }
 
-int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
+int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
                        unsigned int flags)
 {
     bool_t need_flush = 0;
@@ -651,34 +651,34 @@ int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     if ( rc )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Root table alloc failed, gfn = %lx\n", gfn);
+        AMD_IOMMU_DEBUG("Root table alloc failed, bfn = %lx\n", bfn);
         domain_crash(d);
         return rc;
     }
 
     /* Since HVM domain is initialized with 2 level IO page table,
-     * we might need a deeper page table for lager gfn now */
+     * we might need a deeper page table for wider bfn now */
     if ( is_hvm_domain(d) )
     {
-        if ( update_paging_mode(d, gfn) )
+        if ( update_paging_mode(d, bfn) )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed gfn = %lx\n", gfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
             domain_crash(d);
             return -EFAULT;
         }
     }
 
-    if ( iommu_pde_from_gfn(d, gfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry gfn = %lx\n", gfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
         domain_crash(d);
         return -EFAULT;
     }
 
     /* Install 4k mapping first */
-    need_flush = set_iommu_pte_present(pt_mfn[1], gfn, mfn, 
+    need_flush = set_iommu_pte_present(pt_mfn[1], bfn, mfn,
                                        IOMMU_PAGING_MODE_LEVEL_1,
                                        !!(flags & IOMMUF_writable),
                                        !!(flags & IOMMUF_readable));
@@ -690,7 +690,7 @@ int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     /* 4K mapping for PV guests never changes, 
      * no need to flush if we trust non-present bits */
     if ( is_hvm_domain(d) )
-        amd_iommu_flush_pages(d, gfn, 0);
+        amd_iommu_flush_pages(d, bfn, 0);
 
     for ( merge_level = IOMMU_PAGING_MODE_LEVEL_2;
           merge_level <= hd->arch.paging_mode; merge_level++ )
@@ -698,15 +698,15 @@ int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
         if ( pt_mfn[merge_level] == 0 )
             break;
         if ( !iommu_update_pde_count(d, pt_mfn[merge_level],
-                                     gfn, mfn, merge_level) )
+                                     bfn, mfn, merge_level) )
             break;
 
-        if ( iommu_merge_pages(d, pt_mfn[merge_level], gfn, 
+        if ( iommu_merge_pages(d, pt_mfn[merge_level], bfn,
                                flags, merge_level) )
         {
             spin_unlock(&hd->arch.mapping_lock);
             AMD_IOMMU_DEBUG("Merge iommu page failed at level %d, "
-                            "gfn = %lx mfn = %lx\n", merge_level, gfn, mfn);
+                            "bfn = %lx mfn = %lx\n", merge_level, bfn, mfn);
             domain_crash(d);
             return -EFAULT;
         }
@@ -720,7 +720,7 @@ out:
     return 0;
 }
 
-int amd_iommu_unmap_page(struct domain *d, unsigned long gfn)
+int amd_iommu_unmap_page(struct domain *d, unsigned long bfn)
 {
     unsigned long pt_mfn[7];
     struct domain_iommu *hd = dom_iommu(d);
@@ -739,34 +739,34 @@ int amd_iommu_unmap_page(struct domain *d, unsigned long gfn)
     }
 
     /* Since HVM domain is initialized with 2 level IO page table,
-     * we might need a deeper page table for lager gfn now */
+     * we might need a deeper page table for lager bfn now */
     if ( is_hvm_domain(d) )
     {
-        int rc = update_paging_mode(d, gfn);
+        int rc = update_paging_mode(d, bfn);
 
         if ( rc )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed gfn = %lx\n", gfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
             if ( rc != -EADDRNOTAVAIL )
                 domain_crash(d);
             return rc;
         }
     }
 
-    if ( iommu_pde_from_gfn(d, gfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry gfn = %lx\n", gfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
         domain_crash(d);
         return -EFAULT;
     }
 
     /* mark PTE as 'page not present' */
-    clear_iommu_pte_present(pt_mfn[1], gfn);
+    clear_iommu_pte_present(pt_mfn[1], bfn);
     spin_unlock(&hd->arch.mapping_lock);
 
-    amd_iommu_flush_pages(d, gfn, 0);
+    amd_iommu_flush_pages(d, bfn, 0);
 
     return 0;
 }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 12d2695b89..d608631e6e 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -578,7 +578,7 @@ static void amd_dump_p2m_table_level(struct page_info* pg, int level,
                 maddr_to_page(next_table_maddr), next_level,
                 address, indent + 1);
         else
-            printk("%*sgfn: %08lx  mfn: %08lx\n",
+            printk("%*sbfn: %08lx  mfn: %08lx\n",
                    indent, "",
                    (unsigned long)PFN_DOWN(address),
                    (unsigned long)PFN_DOWN(next_table_maddr));
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 74c09b0991..1e4d561b47 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2551,7 +2551,7 @@ static int __must_check arm_smmu_iotlb_flush_all(struct domain *d)
 }
 
 static int __must_check arm_smmu_iotlb_flush(struct domain *d,
-                                             unsigned long gfn,
+                                             unsigned long bfn,
                                              unsigned int page_count)
 {
 	/* ARM SMMU v1 doesn't have flush by VMA and VMID */
@@ -2737,7 +2737,7 @@ static void arm_smmu_iommu_domain_teardown(struct domain *d)
 	xfree(xen_domain);
 }
 
-static int __must_check arm_smmu_map_page(struct domain *d, unsigned long gfn,
+static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
 			unsigned long mfn, unsigned int flags)
 {
 	p2m_type_t t;
@@ -2748,10 +2748,10 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long gfn,
 	 * protected by an IOMMU, Xen needs to add a 1:1 mapping in the domain
 	 * p2m to allow DMA request to work.
 	 * This is only valid when the domain is directed mapped. Hence this
-	 * function should only be used by gnttab code with gfn == mfn.
+	 * function should only be used by gnttab code with gfn == mfn == bfn.
 	 */
 	BUG_ON(!is_domain_direct_mapped(d));
-	BUG_ON(mfn != gfn);
+	BUG_ON(mfn != bfn);
 
 	/* We only support readable and writable flags */
 	if (!(flags & (IOMMUF_readable | IOMMUF_writable)))
@@ -2763,19 +2763,19 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long gfn,
 	 * The function guest_physmap_add_entry replaces the current mapping
 	 * if there is already one...
 	 */
-	return guest_physmap_add_entry(d, _gfn(gfn), _mfn(mfn), 0, t);
+	return guest_physmap_add_entry(d, _gfn(bfn), _mfn(bfn), 0, t);
 }
 
-static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long gfn)
+static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long bfn)
 {
 	/*
 	 * This function should only be used by gnttab code when the domain
-	 * is direct mapped
+	 * is direct mapped (i.e. gfn == mfn == bfn).
 	 */
 	if ( !is_domain_direct_mapped(d) )
 		return -EINVAL;
 
-	return guest_physmap_remove_page(d, _gfn(gfn), _mfn(gfn), 0);
+	return guest_physmap_remove_page(d, _gfn(bfn), _mfn(bfn), 0);
 }
 
 static const struct iommu_ops arm_smmu_iommu_ops = {
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 49974e5634..0faab18e88 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
         page_list_for_each ( page, &d->page_list )
         {
             unsigned long mfn = mfn_x(page_to_mfn(page));
-            unsigned long gfn = mfn_to_gmfn(d, mfn);
+            unsigned long bfn = mfn_to_gmfn(d, mfn);
             unsigned int mapping = IOMMUF_readable;
             int ret;
 
@@ -194,7 +194,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, gfn, mfn, mapping);
+            ret = hd->platform_ops->map_page(d, bfn, mfn, mapping);
             if ( !rc )
                 rc = ret;
 
@@ -255,7 +255,7 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 }
 
-int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
+int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
                    unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -264,13 +264,13 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, bfn, mfn, flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU mapping gfn %#lx to mfn %#lx failed: %d\n",
-                   d->domain_id, gfn, mfn, rc);
+                   "d%d: IOMMU mapping bfn %#lx to mfn %#lx failed: %d\n",
+                   d->domain_id, bfn, mfn, rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -279,7 +279,7 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     return rc;
 }
 
-int iommu_unmap_page(struct domain *d, unsigned long gfn)
+int iommu_unmap_page(struct domain *d, unsigned long bfn)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -287,13 +287,13 @@ int iommu_unmap_page(struct domain *d, unsigned long gfn)
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, gfn);
+    rc = hd->platform_ops->unmap_page(d, bfn);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU unmapping gfn %#lx failed: %d\n",
-                   d->domain_id, gfn, rc);
+                   "d%d: IOMMU unmapping bfn %#lx failed: %d\n",
+                   d->domain_id, bfn, rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -319,7 +319,7 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-int iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+int iommu_iotlb_flush(struct domain *d, unsigned long bfn,
                       unsigned int page_count)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -328,13 +328,13 @@ int iommu_iotlb_flush(struct domain *d, unsigned long gfn,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, gfn, page_count);
+    rc = hd->platform_ops->iotlb_flush(d, bfn, page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU IOTLB flush failed: %d, gfn %#lx, page count %u\n",
-                   d->domain_id, rc, gfn, page_count);
+                   "d%d: IOMMU IOTLB flush failed: %d, bfn %#lx, page count %u\n",
+                   d->domain_id, rc, bfn, page_count);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1710256823..48f62e0e8d 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -585,7 +585,7 @@ static int __must_check iommu_flush_all(void)
 }
 
 static int __must_check iommu_flush_iotlb(struct domain *d,
-                                          unsigned long gfn,
+                                          unsigned long bfn,
                                           bool_t dma_old_pte_present,
                                           unsigned int page_count)
 {
@@ -612,12 +612,12 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
         if ( iommu_domid == -1 )
             continue;
 
-        if ( page_count != 1 || gfn == gfn_x(INVALID_GFN) )
+        if ( page_count != 1 || bfn == bfn_x(INVALID_BFN) )
             rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
                                        0, flush_dev_iotlb);
         else
             rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                       (paddr_t)gfn << PAGE_SHIFT_4K,
+                                       __bfn_to_baddr(bfn),
                                        PAGE_ORDER_4K,
                                        !dma_old_pte_present,
                                        flush_dev_iotlb);
@@ -633,15 +633,15 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
 }
 
 static int __must_check iommu_flush_iotlb_pages(struct domain *d,
-                                                unsigned long gfn,
+                                                unsigned long bfn,
                                                 unsigned int page_count)
 {
-    return iommu_flush_iotlb(d, gfn, 1, page_count);
+    return iommu_flush_iotlb(d, bfn, 1, page_count);
 }
 
 static int __must_check iommu_flush_iotlb_all(struct domain *d)
 {
-    return iommu_flush_iotlb(d, gfn_x(INVALID_GFN), 0, 0);
+    return iommu_flush_iotlb(d, bfn_x(INVALID_BFN), 0, 0);
 }
 
 /* clear one page's page table */
@@ -1767,7 +1767,7 @@ static void iommu_domain_teardown(struct domain *d)
 }
 
 static int __must_check intel_iommu_map_page(struct domain *d,
-                                             unsigned long gfn,
+                                             unsigned long bfn,
                                              unsigned long mfn,
                                              unsigned int flags)
 {
@@ -1786,14 +1786,14 @@ static int __must_check intel_iommu_map_page(struct domain *d,
 
     spin_lock(&hd->arch.mapping_lock);
 
-    pg_maddr = addr_to_dma_page_maddr(d, (paddr_t)gfn << PAGE_SHIFT_4K, 1);
+    pg_maddr = addr_to_dma_page_maddr(d, __bfn_to_baddr(bfn), 1);
     if ( pg_maddr == 0 )
     {
         spin_unlock(&hd->arch.mapping_lock);
         return -ENOMEM;
     }
     page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
-    pte = page + (gfn & LEVEL_MASK);
+    pte = page + (bfn & LEVEL_MASK);
     old = *pte;
     dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K);
     dma_set_pte_prot(new,
@@ -1817,22 +1817,22 @@ static int __must_check intel_iommu_map_page(struct domain *d,
     unmap_vtd_domain_page(page);
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        rc = iommu_flush_iotlb(d, gfn, dma_pte_present(old), 1);
+        rc = iommu_flush_iotlb(d, bfn, dma_pte_present(old), 1);
 
     return rc;
 }
 
 static int __must_check intel_iommu_unmap_page(struct domain *d,
-                                               unsigned long gfn)
+                                               unsigned long bfn)
 {
     /* Do nothing if hardware domain and iommu supports pass thru. */
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
+    return dma_pte_clear_one(d, __bfn_to_baddr(bfn));
 }
 
-int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
                     int order, int present)
 {
     struct acpi_drhd_unit *drhd;
@@ -1856,7 +1856,7 @@ int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
             continue;
 
         rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                   (paddr_t)gfn << PAGE_SHIFT_4K,
+                                   __bfn_to_baddr(bfn),
                                    order, !present, flush_dev_iotlb);
         if ( rc > 0 )
         {
@@ -2626,7 +2626,7 @@ static void vtd_dump_p2m_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
             vtd_dump_p2m_table_level(dma_pte_addr(*pte), next_level, 
                                      address, indent + 1);
         else
-            printk("%*sgfn: %08lx mfn: %08lx\n",
+            printk("%*sbfn: %08lx mfn: %08lx\n",
                    indent, "",
                    (unsigned long)(address >> PAGE_SHIFT_4K),
                    (unsigned long)(dma_pte_addr(*pte) >> PAGE_SHIFT_4K));
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6b42e3b876..14ada0c14e 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -23,11 +23,37 @@
 #include <xen/page-defs.h>
 #include <xen/spinlock.h>
 #include <xen/pci.h>
+#include <xen/typesafe.h>
 #include <public/hvm/ioreq.h>
 #include <public/domctl.h>
 #include <asm/device.h>
 #include <asm/iommu.h>
 
+TYPE_SAFE(uint64_t, bfn);
+#define PRI_bfn     PRIx64
+#define INVALID_BFN _bfn(~0UL)
+
+#ifndef bfn_t
+#define bfn_t /* Grep fodder: bfn_t, _bfn() and bfn_x() are defined above */
+#define _bfn
+#define bfn_x
+#undef bfn_t
+#undef _bfn
+#undef bfn_x
+#endif
+
+#define IOMMU_PAGE_SHIFT 12
+#define IOMMU_PAGE_SIZE  (_AC(1,L) << IOMMU_PAGE_SHIFT)
+#define IOMMU_PAGE_MASK  (~(IOMMU_PAGE_SIZE - 1))
+
+typedef uint64_t baddr_t;
+
+#define __bfn_to_baddr(bfn) ((baddr_t)(bfn) << IOMMU_PAGE_SHIFT)
+#define __baddr_to_bfn(baddr) ((uint64_t)(baddr >> IOMMU_PAGE_SHIFT))
+
+#define bfn_to_baddr(bfn) __bfn_to_baddr(bfn_x(bfn))
+#define baddr_to_bfn(baddr) _bfn(__baddr_to_bfn(baddr))
+
 extern bool_t iommu_enable, iommu_enabled;
 extern bool_t force_iommu, iommu_dom0_strict, iommu_verbose;
 extern bool_t iommu_workaround_bios_bug, iommu_igfx, iommu_passthrough;
@@ -60,9 +86,9 @@ void iommu_teardown(struct domain *d);
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
-int __must_check iommu_map_page(struct domain *d, unsigned long gfn,
+int __must_check iommu_map_page(struct domain *d, unsigned long bfn,
                                 unsigned long mfn, unsigned int flags);
-int __must_check iommu_unmap_page(struct domain *d, unsigned long gfn);
+int __must_check iommu_unmap_page(struct domain *d, unsigned long bfn);
 
 enum iommu_feature
 {
@@ -152,9 +178,9 @@ struct iommu_ops {
 #endif /* HAS_PCI */
 
     void (*teardown)(struct domain *d);
-    int __must_check (*map_page)(struct domain *d, unsigned long gfn,
+    int __must_check (*map_page)(struct domain *d, unsigned long bfn,
                                  unsigned long mfn, unsigned int flags);
-    int __must_check (*unmap_page)(struct domain *d, unsigned long gfn);
+    int __must_check (*unmap_page)(struct domain *d, unsigned long bfn);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
@@ -165,7 +191,7 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
-    int __must_check (*iotlb_flush)(struct domain *d, unsigned long gfn,
+    int __must_check (*iotlb_flush)(struct domain *d, unsigned long bfn,
                                     unsigned int page_count);
     int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
@@ -187,7 +213,7 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-int __must_check iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+int __must_check iommu_iotlb_flush(struct domain *d, unsigned long bfn,
                                    unsigned int page_count);
 int __must_check iommu_iotlb_flush_all(struct domain *d);
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 24654e8e22..4ebf91f17e 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -14,8 +14,9 @@
  * should be adhered to.
  *
  * mfn: Machine Frame Number
- *   The values Xen puts into its own pagetables.  This is the host physical
- *   memory address space with RAM, MMIO etc.
+ *   The values Xen puts into its own pagetables, 2nd stage pagetables (where
+ *   hardware assisted 2nd stage translation is used) or IOMMU page tables.
+ *   This is the host physical memory address space with RAM, MMIO etc.
  *
  * gfn: Guest Frame Number
  *   The values a guest puts in its own pagetables.  For an auto-translated
@@ -26,6 +27,11 @@
  *   A linear idea of a guest physical address space. For an auto-translated
  *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
  *
+ * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
+ *   The linear frame numbers of IOMMU address space. All initiators for a
+ *   given domain share a single IOMMU address space and, by default, Xen will
+ *   ensure bfn == pfn.
+ *
  * WARNING: Some of these terms have changed over time while others have been
  * used inconsistently, meaning that a lot of existing code does not match the
  * definitions above.  New code should use these terms as described here, and
-- 
2.11.0




* [PATCH v4 03/15] iommu: make use of type-safe BFN and MFN in exported functions
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible" Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 02/15] iommu: introduce the concept of BFN Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 04/15] iommu: push use of type-safe BFN and MFN into iommu_ops Paul Durrant
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Jun Nakajima, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Paul Durrant, Jan Beulich

This patch modifies the declarations of the entry points to the IOMMU
sub-system to use bfn_t and mfn_t in place of unsigned long. A subsequent
patch will similarly modify the methods in the iommu_ops structure.
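
In practice this means call sites change roughly as follows (an illustrative
summary of the hunks below, not a literal excerpt; gfn and mfn here are plain
unsigned long values, as at the existing callers):

    /* Before: raw unsigned longs for both frame numbers. */
    rc = iommu_map_page(d, gfn, mfn, IOMMUF_readable | IOMMUF_writable);

    /* After: the caller wraps the values in type-safe bfn_t/mfn_t. */
    rc = iommu_map_page(d, _bfn(gfn), _mfn(mfn),
                        IOMMUF_readable | IOMMUF_writable);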

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>

v3:
 - Removed most of the uses of an intermediate 'frame' variable.

v2:
 - Addressed comments from Jan.
 - Use intermediate 'frame' variable to avoid directly encapsulating
   mfn or gfn values as bfns.
---
 xen/arch/arm/p2m.c                    |  3 ++-
 xen/arch/x86/mm.c                     | 10 ++++----
 xen/arch/x86/mm/p2m-ept.c             | 10 +++++---
 xen/arch/x86/mm/p2m-pt.c              | 45 ++++++++++++++++++++---------------
 xen/arch/x86/mm/p2m.c                 | 16 ++++++++-----
 xen/arch/x86/x86_64/mm.c              |  5 ++--
 xen/common/grant_table.c              | 12 +++++-----
 xen/common/memory.c                   |  4 ++--
 xen/drivers/passthrough/iommu.c       | 25 ++++++++++---------
 xen/drivers/passthrough/vtd/x86/vtd.c |  3 ++-
 xen/include/xen/iommu.h               | 14 +++++++----
 11 files changed, 85 insertions(+), 62 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 14791388ad..d719db1e30 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -953,7 +953,8 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 
     if ( need_iommu(p2m->domain) &&
          (lpae_valid(orig_pte) || lpae_valid(*entry)) )
-        rc = iommu_iotlb_flush(p2m->domain, gfn_x(sgfn), 1UL << page_order);
+        rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
+                               1UL << page_order);
     else
         rc = 0;
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a1a1f5f3c3..99070f916d 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2729,14 +2729,14 @@ static int _get_page_type(struct page_info *page, unsigned long type,
         struct domain *d = page_get_owner(page);
         if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
         {
-            gfn_t gfn = _gfn(mfn_to_gmfn(d, mfn_x(page_to_mfn(page))));
+            mfn_t mfn = page_to_mfn(page);
 
             if ( (x & PGT_type_mask) == PGT_writable_page )
-                iommu_ret = iommu_unmap_page(d, gfn_x(gfn));
+                iommu_ret = iommu_unmap_page(d, _bfn(mfn_x(mfn)));
             else if ( type == PGT_writable_page )
-                iommu_ret = iommu_map_page(d, gfn_x(gfn),
-                                           mfn_x(page_to_mfn(page)),
-                                           IOMMUF_readable|IOMMUF_writable);
+                iommu_ret = iommu_map_page(d, _bfn(mfn_x(mfn)), mfn,
+                                           IOMMUF_readable |
+                                           IOMMUF_writable);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 14b593923b..2089b5232d 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -870,15 +870,19 @@ out:
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
         else
         {
+            bfn_t bfn = _bfn(gfn);
+
             if ( iommu_flags )
                 for ( i = 0; i < (1 << order); i++ )
                 {
-                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+                    rc = iommu_map_page(d, bfn_add(bfn, i),
+                                        mfn_add(mfn, i), iommu_flags);
                     if ( unlikely(rc) )
                     {
                         while ( i-- )
                             /* If statement to satisfy __must_check. */
-                            if ( iommu_unmap_page(p2m->domain, gfn + i) )
+                            if ( iommu_unmap_page(p2m->domain,
+                                                  bfn_add(bfn, i)) )
                                 continue;
 
                         break;
@@ -887,7 +891,7 @@ out:
             else
                 for ( i = 0; i < (1 << order); i++ )
                 {
-                    ret = iommu_unmap_page(d, gfn + i);
+                    ret = iommu_unmap_page(d, bfn_add(bfn, i));
                     if ( !rc )
                         rc = ret;
                 }
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index b8c5d2ed26..a441af388a 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -687,29 +687,36 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
             if ( iommu_old_flags )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
-        else if ( iommu_pte_flags )
-            for ( i = 0; i < (1UL << page_order); i++ )
-            {
-                rc = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
-                                    iommu_pte_flags);
-                if ( unlikely(rc) )
+        else
+        {
+            bfn_t bfn = _bfn(gfn);
+
+            if ( iommu_pte_flags )
+                for ( i = 0; i < (1UL << page_order); i++ )
                 {
-                    while ( i-- )
-                        /* If statement to satisfy __must_check. */
-                        if ( iommu_unmap_page(p2m->domain, gfn + i) )
-                            continue;
+                    rc = iommu_map_page(p2m->domain, bfn_add(bfn, i),
+                                        mfn_add(mfn, i), iommu_pte_flags);
+                    if ( unlikely(rc) )
+                    {
+                        while ( i-- )
+                            /* If statement to satisfy __must_check. */
+                            if ( iommu_unmap_page(p2m->domain,
+                                                  bfn_add(bfn, i)) )
+                                continue;
 
-                    break;
+                        break;
+                    }
                 }
-            }
-        else
-            for ( i = 0; i < (1UL << page_order); i++ )
-            {
-                int ret = iommu_unmap_page(p2m->domain, gfn + i);
+            else
+                for ( i = 0; i < (1UL << page_order); i++ )
+                {
+                    int ret = iommu_unmap_page(p2m->domain,
+                                               bfn_add(bfn, i));
 
-                if ( !rc )
-                    rc = ret;
-            }
+                    if ( !rc )
+                        rc = ret;
+                }
+        }
     }
 
     /*
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 8e9fbb5a14..fbf67def50 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -715,9 +715,11 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
 
         if ( need_iommu(p2m->domain) )
         {
+            bfn_t bfn = _bfn(mfn);
+
             for ( i = 0; i < (1 << page_order); i++ )
             {
-                int ret = iommu_unmap_page(p2m->domain, mfn + i);
+                int ret = iommu_unmap_page(p2m->domain, bfn_add(bfn, i));
 
                 if ( !rc )
                     rc = ret;
@@ -774,16 +776,17 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
     {
         if ( need_iommu(d) && t == p2m_ram_rw )
         {
+            bfn_t bfn = _bfn(mfn_x(mfn));
+
             for ( i = 0; i < (1 << page_order); i++ )
             {
-                rc = iommu_map_page(d, mfn_x(mfn_add(mfn, i)),
-                                    mfn_x(mfn_add(mfn, i)),
+                rc = iommu_map_page(d, bfn_add(bfn, i), mfn_add(mfn, i),
                                     IOMMUF_readable|IOMMUF_writable);
                 if ( rc != 0 )
                 {
                     while ( i-- > 0 )
                         /* If statement to satisfy __must_check. */
-                        if ( iommu_unmap_page(d, mfn_x(mfn_add(mfn, i))) )
+                        if ( iommu_unmap_page(d, bfn_add(bfn, i)) )
                             continue;
 
                     return rc;
@@ -1158,7 +1161,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_map_page(d, gfn_l, gfn_l, IOMMUF_readable|IOMMUF_writable);
+        return iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
+                              IOMMUF_readable | IOMMUF_writable);
     }
 
     gfn_lock(p2m, gfn, 0);
@@ -1248,7 +1252,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_unmap_page(d, gfn_l);
+        return iommu_unmap_page(d, _bfn(gfn_l));
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index cca4ae926e..cc58e4cef4 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1429,13 +1429,14 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
     {
         for ( i = spfn; i < epfn; i++ )
-            if ( iommu_map_page(hardware_domain, i, i, IOMMUF_readable|IOMMUF_writable) )
+            if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
+                                IOMMUF_readable | IOMMUF_writable) )
                 break;
         if ( i != epfn )
         {
             while (i-- > old_max)
                 /* If statement to satisfy __must_check. */
-                if ( iommu_unmap_page(hardware_domain, i) )
+                if ( iommu_unmap_page(hardware_domain, _bfn(i)) )
                     continue;
 
             goto destroy_m2p;
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index d9ec711c73..a83c8832af 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1135,13 +1135,13 @@ map_grant_ref(
              !(old_pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) )
         {
             if ( !(kind & MAPKIND_WRITE) )
-                err = iommu_map_page(ld, mfn_x(mfn), mfn_x(mfn),
-                                     IOMMUF_readable|IOMMUF_writable);
+                err = iommu_map_page(ld, _bfn(mfn_x(mfn)), mfn,
+                                     IOMMUF_readable | IOMMUF_writable);
         }
         else if ( act_pin && !old_pin )
         {
             if ( !kind )
-                err = iommu_map_page(ld, mfn_x(mfn), mfn_x(mfn),
+                err = iommu_map_page(ld, _bfn(mfn_x(mfn)), mfn,
                                      IOMMUF_readable);
         }
         if ( err )
@@ -1404,10 +1404,10 @@ unmap_common(
 
         kind = mapkind(lgt, rd, op->mfn);
         if ( !kind )
-            err = iommu_unmap_page(ld, mfn_x(op->mfn));
+            err = iommu_unmap_page(ld, _bfn(mfn_x(op->mfn)));
         else if ( !(kind & MAPKIND_WRITE) )
-            err = iommu_map_page(ld, mfn_x(op->mfn),
-                                 mfn_x(op->mfn), IOMMUF_readable);
+            err = iommu_map_page(ld, _bfn(mfn_x(op->mfn)), op->mfn,
+                                 IOMMUF_readable);
 
         double_gt_unlock(lgt, rgt);
 
diff --git a/xen/common/memory.c b/xen/common/memory.c
index e29d596727..a5656743c2 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -833,11 +833,11 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
 
         this_cpu(iommu_dont_flush_iotlb) = 0;
 
-        ret = iommu_iotlb_flush(d, xatp->idx - done, done);
+        ret = iommu_iotlb_flush(d, _bfn(xatp->idx - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
 
-        ret = iommu_iotlb_flush(d, xatp->gpfn - done, done);
+        ret = iommu_iotlb_flush(d, _bfn(xatp->gpfn - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
     }
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 0faab18e88..303ac56e08 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -255,7 +255,7 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 }
 
-int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
+int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
                    unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -264,13 +264,13 @@ int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, bfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, bfn_x(bfn), mfn_x(mfn), flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU mapping bfn %#lx to mfn %#lx failed: %d\n",
-                   d->domain_id, bfn, mfn, rc);
+                   "d%d: IOMMU mapping bfn %"PRI_bfn" to mfn %"PRI_mfn" failed: %d\n",
+                   d->domain_id, bfn_x(bfn), mfn_x(mfn), rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -279,7 +279,7 @@ int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     return rc;
 }
 
-int iommu_unmap_page(struct domain *d, unsigned long bfn)
+int iommu_unmap_page(struct domain *d, bfn_t bfn)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -287,13 +287,13 @@ int iommu_unmap_page(struct domain *d, unsigned long bfn)
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, bfn);
+    rc = hd->platform_ops->unmap_page(d, bfn_x(bfn));
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU unmapping bfn %#lx failed: %d\n",
-                   d->domain_id, bfn, rc);
+                   "d%d: IOMMU unmapping bfn %"PRI_bfn" failed: %d\n",
+                   d->domain_id, bfn_x(bfn), rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -319,8 +319,7 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-int iommu_iotlb_flush(struct domain *d, unsigned long bfn,
-                      unsigned int page_count)
+int iommu_iotlb_flush(struct domain *d, bfn_t bfn, unsigned int page_count)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -328,13 +327,13 @@ int iommu_iotlb_flush(struct domain *d, unsigned long bfn,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, bfn, page_count);
+    rc = hd->platform_ops->iotlb_flush(d, bfn_x(bfn), page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU IOTLB flush failed: %d, bfn %#lx, page count %u\n",
-                   d->domain_id, rc, bfn, page_count);
+                   "d%d: IOMMU IOTLB flush failed: %d, bfn %"PRI_bfn", page count %u\n",
+                   d->domain_id, rc, bfn_x(bfn), page_count);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index cc2bfea162..dc37dce4b6 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -155,7 +155,8 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
         tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
         for ( j = 0; j < tmp; j++ )
         {
-            int ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
+            int ret = iommu_map_page(d, _bfn(pfn * tmp + j),
+                                     _mfn(pfn * tmp + j),
                                      IOMMUF_readable|IOMMUF_writable);
 
             if ( !rc )
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 14ada0c14e..a3c36c1148 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -24,6 +24,7 @@
 #include <xen/spinlock.h>
 #include <xen/pci.h>
 #include <xen/typesafe.h>
+#include <xen/mm.h>
 #include <public/hvm/ioreq.h>
 #include <public/domctl.h>
 #include <asm/device.h>
@@ -42,6 +43,11 @@ TYPE_SAFE(uint64_t, bfn);
 #undef bfn_x
 #endif
 
+static inline bfn_t bfn_add(bfn_t bfn, unsigned long i)
+{
+    return _bfn(bfn_x(bfn) + i);
+}
+
 #define IOMMU_PAGE_SHIFT 12
 #define IOMMU_PAGE_SIZE  (_AC(1,L) << IOMMU_PAGE_SHIFT)
 #define IOMMU_PAGE_MASK  (~(IOMMU_PAGE_SIZE - 1))
@@ -86,9 +92,9 @@ void iommu_teardown(struct domain *d);
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
-int __must_check iommu_map_page(struct domain *d, unsigned long bfn,
-                                unsigned long mfn, unsigned int flags);
-int __must_check iommu_unmap_page(struct domain *d, unsigned long bfn);
+int __must_check iommu_map_page(struct domain *d, bfn_t bfn,
+                                mfn_t mfn, unsigned int flags);
+int __must_check iommu_unmap_page(struct domain *d, bfn_t bfn);
 
 enum iommu_feature
 {
@@ -213,7 +219,7 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-int __must_check iommu_iotlb_flush(struct domain *d, unsigned long bfn,
+int __must_check iommu_iotlb_flush(struct domain *d, bfn_t bfn,
                                    unsigned int page_count);
 int __must_check iommu_iotlb_flush_all(struct domain *d);
 
-- 
2.11.0



* [PATCH v4 04/15] iommu: push use of type-safe BFN and MFN into iommu_ops
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (2 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 03/15] iommu: make use of type-safe BFN and MFN in exported functions Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Suravee Suthikulpanit, Andrew Cooper, George Dunlap,
	Paul Durrant, Jan Beulich

This patch modifies the methods in struct iommu_ops to use type-safe BFN
and MFN. This follows on from the prior patch that modified the functions
exported in xen/iommu.h.
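
For reference, the typesafe wrappers hide the raw integer inside a
single-member struct (when type-safety checking is enabled) so that the
compiler rejects accidental mixing of BFN, MFN and GFN values. The following
is only a sketch of that pattern, not a verbatim copy of xen/typesafe.h,
though the bfn_add() and bfn_eq() helpers are exactly as introduced by this
series:

    /* Illustrative sketch of the typesafe BFN pattern (not the in-tree macro). */
    typedef struct { uint64_t bfn; } bfn_t;

    static inline bfn_t _bfn(uint64_t n)    { return (bfn_t) { n }; }
    static inline uint64_t bfn_x(bfn_t bfn) { return bfn.bfn; }

    /* Helpers layered on top by this series. */
    static inline bfn_t bfn_add(bfn_t bfn, unsigned long i)
    {
        return _bfn(bfn_x(bfn) + i);
    }

    static inline bool_t bfn_eq(bfn_t x, bfn_t y)
    {
        return bfn_x(x) == bfn_x(y);
    }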

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>

v3:
 - Remove some use of intermediate 'frame' variables.

v2:
 - Addressed comments from Jan.
 - Extend use of intermediate 'frame' variable to avoid directly
   encapsulating gfn values as bfns or vice versa.
---
 xen/drivers/passthrough/amd/iommu_map.c       | 46 ++++++++++++++++-----------
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  2 +-
 xen/drivers/passthrough/arm/smmu.c            | 16 +++++-----
 xen/drivers/passthrough/iommu.c               |  9 +++---
 xen/drivers/passthrough/vtd/iommu.c           | 26 +++++++--------
 xen/drivers/passthrough/vtd/x86/vtd.c         |  7 ++--
 xen/drivers/passthrough/x86/iommu.c           |  2 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  8 ++---
 xen/include/xen/iommu.h                       | 13 +++++---
 9 files changed, 72 insertions(+), 57 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 4deab9cd2f..5a9a0af320 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -631,7 +631,7 @@ static int update_paging_mode(struct domain *d, unsigned long bfn)
     return 0;
 }
 
-int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
+int amd_iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
                        unsigned int flags)
 {
     bool_t need_flush = 0;
@@ -651,7 +651,8 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     if ( rc )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Root table alloc failed, bfn = %lx\n", bfn);
+        AMD_IOMMU_DEBUG("Root table alloc failed, bfn = %"PRI_bfn"\n",
+                        bfn_x(bfn));
         domain_crash(d);
         return rc;
     }
@@ -660,25 +661,27 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
      * we might need a deeper page table for wider bfn now */
     if ( is_hvm_domain(d) )
     {
-        if ( update_paging_mode(d, bfn) )
+        if ( update_paging_mode(d, bfn_x(bfn)) )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %"PRI_bfn"\n",
+                            bfn_x(bfn));
             domain_crash(d);
             return -EFAULT;
         }
     }
 
-    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn_x(bfn), pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %"PRI_bfn"\n",
+                        bfn_x(bfn));
         domain_crash(d);
         return -EFAULT;
     }
 
     /* Install 4k mapping first */
-    need_flush = set_iommu_pte_present(pt_mfn[1], bfn, mfn,
+    need_flush = set_iommu_pte_present(pt_mfn[1], bfn_x(bfn), mfn_x(mfn),
                                        IOMMU_PAGING_MODE_LEVEL_1,
                                        !!(flags & IOMMUF_writable),
                                        !!(flags & IOMMUF_readable));
@@ -690,7 +693,7 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     /* 4K mapping for PV guests never changes, 
      * no need to flush if we trust non-present bits */
     if ( is_hvm_domain(d) )
-        amd_iommu_flush_pages(d, bfn, 0);
+        amd_iommu_flush_pages(d, bfn_x(bfn), 0);
 
     for ( merge_level = IOMMU_PAGING_MODE_LEVEL_2;
           merge_level <= hd->arch.paging_mode; merge_level++ )
@@ -698,15 +701,16 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
         if ( pt_mfn[merge_level] == 0 )
             break;
         if ( !iommu_update_pde_count(d, pt_mfn[merge_level],
-                                     bfn, mfn, merge_level) )
+                                     bfn_x(bfn), mfn_x(mfn), merge_level) )
             break;
 
-        if ( iommu_merge_pages(d, pt_mfn[merge_level], bfn,
+        if ( iommu_merge_pages(d, pt_mfn[merge_level], bfn_x(bfn),
                                flags, merge_level) )
         {
             spin_unlock(&hd->arch.mapping_lock);
             AMD_IOMMU_DEBUG("Merge iommu page failed at level %d, "
-                            "bfn = %lx mfn = %lx\n", merge_level, bfn, mfn);
+                            "bfn = %"PRI_bfn" mfn = %"PRI_mfn"\n",
+                            merge_level, bfn_x(bfn), mfn_x(mfn));
             domain_crash(d);
             return -EFAULT;
         }
@@ -720,7 +724,7 @@ out:
     return 0;
 }
 
-int amd_iommu_unmap_page(struct domain *d, unsigned long bfn)
+int amd_iommu_unmap_page(struct domain *d, bfn_t bfn)
 {
     unsigned long pt_mfn[7];
     struct domain_iommu *hd = dom_iommu(d);
@@ -742,31 +746,33 @@ int amd_iommu_unmap_page(struct domain *d, unsigned long bfn)
      * we might need a deeper page table for lager bfn now */
     if ( is_hvm_domain(d) )
     {
-        int rc = update_paging_mode(d, bfn);
+        int rc = update_paging_mode(d, bfn_x(bfn));
 
         if ( rc )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %"PRI_bfn"\n",
+                            bfn_x(bfn));
             if ( rc != -EADDRNOTAVAIL )
                 domain_crash(d);
             return rc;
         }
     }
 
-    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn_x(bfn), pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %"PRI_bfn"\n",
+                        bfn_x(bfn));
         domain_crash(d);
         return -EFAULT;
     }
 
     /* mark PTE as 'page not present' */
-    clear_iommu_pte_present(pt_mfn[1], bfn);
+    clear_iommu_pte_present(pt_mfn[1], bfn_x(bfn));
     spin_unlock(&hd->arch.mapping_lock);
 
-    amd_iommu_flush_pages(d, bfn, 0);
+    amd_iommu_flush_pages(d, bfn_x(bfn), 0);
 
     return 0;
 }
@@ -787,7 +793,9 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
     gfn = phys_addr >> PAGE_SHIFT;
     for ( i = 0; i < npages; i++ )
     {
-        rt = amd_iommu_map_page(domain, gfn +i, gfn +i, flags);
+        unsigned long frame = gfn + i;
+
+        rt = amd_iommu_map_page(domain, _bfn(frame), _mfn(frame), flags);
         if ( rt != 0 )
             return rt;
     }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index d608631e6e..eea22c3d0d 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -271,7 +271,7 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
              */
             if ( mfn_valid(_mfn(pfn)) )
             {
-                int ret = amd_iommu_map_page(d, pfn, pfn,
+                int ret = amd_iommu_map_page(d, _bfn(pfn), _mfn(pfn),
                                              IOMMUF_readable|IOMMUF_writable);
 
                 if ( !rc )
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 1e4d561b47..221b62a59c 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2550,8 +2550,7 @@ static int __must_check arm_smmu_iotlb_flush_all(struct domain *d)
 	return 0;
 }
 
-static int __must_check arm_smmu_iotlb_flush(struct domain *d,
-                                             unsigned long bfn,
+static int __must_check arm_smmu_iotlb_flush(struct domain *d, bfn_t bfn,
                                              unsigned int page_count)
 {
 	/* ARM SMMU v1 doesn't have flush by VMA and VMID */
@@ -2737,8 +2736,8 @@ static void arm_smmu_iommu_domain_teardown(struct domain *d)
 	xfree(xen_domain);
 }
 
-static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
-			unsigned long mfn, unsigned int flags)
+static int __must_check arm_smmu_map_page(struct domain *d, bfn_t bfn,
+					  mfn_t mfn, unsigned int flags)
 {
 	p2m_type_t t;
 
@@ -2751,7 +2750,7 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
 	 * function should only be used by gnttab code with gfn == mfn == bfn.
 	 */
 	BUG_ON(!is_domain_direct_mapped(d));
-	BUG_ON(mfn != bfn);
+	BUG_ON(mfn_x(mfn) != bfn_x(bfn));
 
 	/* We only support readable and writable flags */
 	if (!(flags & (IOMMUF_readable | IOMMUF_writable)))
@@ -2763,10 +2762,11 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
 	 * The function guest_physmap_add_entry replaces the current mapping
 	 * if there is already one...
 	 */
-	return guest_physmap_add_entry(d, _gfn(bfn), _mfn(bfn), 0, t);
+	return guest_physmap_add_entry(d, _gfn(bfn_x(bfn)), _mfn(bfn_x(bfn)),
+				       0, t);
 }
 
-static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long bfn)
+static int __must_check arm_smmu_unmap_page(struct domain *d, bfn_t bfn)
 {
 	/*
 	 * This function should only be used by gnttab code when the domain
@@ -2775,7 +2775,7 @@ static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long bfn)
 	if ( !is_domain_direct_mapped(d) )
 		return -EINVAL;
 
-	return guest_physmap_remove_page(d, _gfn(bfn), _mfn(bfn), 0);
+	return guest_physmap_remove_page(d, _gfn(bfn_x(bfn)), _mfn(bfn_x(bfn)), 0);
 }
 
 static const struct iommu_ops arm_smmu_iommu_ops = {
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 303ac56e08..1f32958816 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -194,7 +194,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, bfn, mfn, mapping);
+            ret = hd->platform_ops->map_page(d, _bfn(bfn), _mfn(mfn),
+                                             mapping);
             if ( !rc )
                 rc = ret;
 
@@ -264,7 +265,7 @@ int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, bfn_x(bfn), mfn_x(mfn), flags);
+    rc = hd->platform_ops->map_page(d, bfn, mfn, flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -287,7 +288,7 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, bfn_x(bfn));
+    rc = hd->platform_ops->unmap_page(d, bfn);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -327,7 +328,7 @@ int iommu_iotlb_flush(struct domain *d, bfn_t bfn, unsigned int page_count)
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, bfn_x(bfn), page_count);
+    rc = hd->platform_ops->iotlb_flush(d, bfn, page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 48f62e0e8d..c9f50f04ad 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -584,8 +584,7 @@ static int __must_check iommu_flush_all(void)
     return rc;
 }
 
-static int __must_check iommu_flush_iotlb(struct domain *d,
-                                          unsigned long bfn,
+static int __must_check iommu_flush_iotlb(struct domain *d, bfn_t bfn,
                                           bool_t dma_old_pte_present,
                                           unsigned int page_count)
 {
@@ -612,12 +611,12 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
         if ( iommu_domid == -1 )
             continue;
 
-        if ( page_count != 1 || bfn == bfn_x(INVALID_BFN) )
+        if ( page_count != 1 || bfn_eq(bfn, INVALID_BFN) )
             rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
                                        0, flush_dev_iotlb);
         else
             rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                       __bfn_to_baddr(bfn),
+                                       bfn_to_baddr(bfn),
                                        PAGE_ORDER_4K,
                                        !dma_old_pte_present,
                                        flush_dev_iotlb);
@@ -633,7 +632,7 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
 }
 
 static int __must_check iommu_flush_iotlb_pages(struct domain *d,
-                                                unsigned long bfn,
+                                                bfn_t bfn,
                                                 unsigned int page_count)
 {
     return iommu_flush_iotlb(d, bfn, 1, page_count);
@@ -641,7 +640,7 @@ static int __must_check iommu_flush_iotlb_pages(struct domain *d,
 
 static int __must_check iommu_flush_iotlb_all(struct domain *d)
 {
-    return iommu_flush_iotlb(d, bfn_x(INVALID_BFN), 0, 0);
+    return iommu_flush_iotlb(d, INVALID_BFN, 0, 0);
 }
 
 /* clear one page's page table */
@@ -676,7 +675,7 @@ static int __must_check dma_pte_clear_one(struct domain *domain, u64 addr)
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        rc = iommu_flush_iotlb_pages(domain, addr >> PAGE_SHIFT_4K, 1);
+        rc = iommu_flush_iotlb_pages(domain, baddr_to_bfn(addr), 1);
 
     unmap_vtd_domain_page(page);
 
@@ -1767,8 +1766,7 @@ static void iommu_domain_teardown(struct domain *d)
 }
 
 static int __must_check intel_iommu_map_page(struct domain *d,
-                                             unsigned long bfn,
-                                             unsigned long mfn,
+                                             bfn_t bfn, mfn_t mfn,
                                              unsigned int flags)
 {
     struct domain_iommu *hd = dom_iommu(d);
@@ -1786,16 +1784,16 @@ static int __must_check intel_iommu_map_page(struct domain *d,
 
     spin_lock(&hd->arch.mapping_lock);
 
-    pg_maddr = addr_to_dma_page_maddr(d, __bfn_to_baddr(bfn), 1);
+    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 1);
     if ( pg_maddr == 0 )
     {
         spin_unlock(&hd->arch.mapping_lock);
         return -ENOMEM;
     }
     page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
-    pte = page + (bfn & LEVEL_MASK);
+    pte = page + (bfn_x(bfn) & LEVEL_MASK);
     old = *pte;
-    dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K);
+    dma_set_pte_addr(new, mfn_to_maddr(mfn));
     dma_set_pte_prot(new,
                      ((flags & IOMMUF_readable) ? DMA_PTE_READ  : 0) |
                      ((flags & IOMMUF_writable) ? DMA_PTE_WRITE : 0));
@@ -1823,13 +1821,13 @@ static int __must_check intel_iommu_map_page(struct domain *d,
 }
 
 static int __must_check intel_iommu_unmap_page(struct domain *d,
-                                               unsigned long bfn)
+                                               bfn_t bfn)
 {
     /* Do nothing if hardware domain and iommu supports pass thru. */
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    return dma_pte_clear_one(d, __bfn_to_baddr(bfn));
+    return dma_pte_clear_one(d, bfn_to_baddr(bfn));
 }
 
 int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index dc37dce4b6..6fed4a92cb 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -121,6 +121,8 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
     {
         unsigned long pfn = pdx_to_pfn(i);
         bool map;
+        bfn_t bfn;
+        mfn_t mfn;
         int rc = 0;
 
         /*
@@ -153,10 +155,11 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
             continue;
 
         tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
+        bfn = _bfn(pfn * tmp);
+        mfn = _mfn(pfn * tmp);
         for ( j = 0; j < tmp; j++ )
         {
-            int ret = iommu_map_page(d, _bfn(pfn * tmp + j),
-                                     _mfn(pfn * tmp + j),
+            int ret = iommu_map_page(d, bfn_add(bfn, j), mfn_add(mfn, j),
                                      IOMMUF_readable|IOMMUF_writable);
 
             if ( !rc )
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 22feba572b..18cc449be5 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -72,7 +72,7 @@ int arch_iommu_populate_page_table(struct domain *d)
 
         ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
         BUG_ON(SHARED_M2P(gfn));
-        rc = hd->platform_ops->map_page(d, gfn, mfn,
+        rc = hd->platform_ops->map_page(d, _bfn(gfn), _mfn(mfn),
                                         IOMMUF_readable |
                                         IOMMUF_writable);
         if ( rc )
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 99bc21c7b3..dce9ed6b83 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -52,9 +52,9 @@ int amd_iommu_init(void);
 int amd_iommu_update_ivrs_mapping_acpi(void);
 
 /* mapping functions */
-int __must_check amd_iommu_map_page(struct domain *d, unsigned long gfn,
-                                    unsigned long mfn, unsigned int flags);
-int __must_check amd_iommu_unmap_page(struct domain *d, unsigned long gfn);
+int __must_check amd_iommu_map_page(struct domain *d, bfn_t bfn,
+                                    mfn_t mfn, unsigned int flags);
+int __must_check amd_iommu_unmap_page(struct domain *d, bfn_t bfn);
 u64 amd_iommu_get_next_table_from_pte(u32 *entry);
 int __must_check amd_iommu_alloc_root(struct domain_iommu *hd);
 int amd_iommu_reserve_domain_unity_map(struct domain *domain,
@@ -77,7 +77,7 @@ void iommu_dte_set_guest_cr3(u32 *dte, u16 dom_id, u64 gcr3,
 
 /* send cmd to iommu */
 void amd_iommu_flush_all_pages(struct domain *d);
-void amd_iommu_flush_pages(struct domain *d, unsigned long gfn,
+void amd_iommu_flush_pages(struct domain *d, unsigned long bfn,
                            unsigned int order);
 void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
                            uint64_t gaddr, unsigned int order);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index a3c36c1148..624784fec8 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -48,6 +48,11 @@ static inline bfn_t bfn_add(bfn_t bfn, unsigned long i)
     return _bfn(bfn_x(bfn) + i);
 }
 
+static inline bool_t bfn_eq(bfn_t x, bfn_t y)
+{
+    return bfn_x(x) == bfn_x(y);
+}
+
 #define IOMMU_PAGE_SHIFT 12
 #define IOMMU_PAGE_SIZE  (_AC(1,L) << IOMMU_PAGE_SHIFT)
 #define IOMMU_PAGE_MASK  (~(IOMMU_PAGE_SIZE - 1))
@@ -184,9 +189,9 @@ struct iommu_ops {
 #endif /* HAS_PCI */
 
     void (*teardown)(struct domain *d);
-    int __must_check (*map_page)(struct domain *d, unsigned long bfn,
-                                 unsigned long mfn, unsigned int flags);
-    int __must_check (*unmap_page)(struct domain *d, unsigned long bfn);
+    int __must_check (*map_page)(struct domain *d, bfn_t bfn, mfn_t mfn,
+                                 unsigned int flags);
+    int __must_check (*unmap_page)(struct domain *d, bfn_t bfn);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
@@ -197,7 +202,7 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
-    int __must_check (*iotlb_flush)(struct domain *d, unsigned long bfn,
+    int __must_check (*iotlb_flush)(struct domain *d, bfn_t bfn,
                                     unsigned int page_count);
     int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
-- 
2.11.0



* [PATCH v4 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page()
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (3 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 04/15] iommu: push use of type-safe BFN and MFN into iommu_ops Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 06/15] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Jun Nakajima, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Paul Durrant, Jan Beulich

Turn iommu_map/unmap_page() into straightforward wrappers that check the
existence of the relevant iommu_op and call through to it. This makes them
usable by the PV IOMMU code delivered in subsequent patches.

Leave the decision on whether to invoke domain_crash() up to the caller.
This has the added benefit that the (module/line number) message that
domain_crash() emits will be more indicative of where the problem lies.

NOTE: This patch includes one bit of clean-up in set_identity_p2m_entry()
      replacing use of p2m->domain with the domain pointer passed into the
      function.
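
The resulting caller-side pattern is a check of the returned error followed
by an explicit crash of guest (non-hardware) domains, e.g. in
set_identity_p2m_entry() as per the hunk below:

    ret = iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
                         IOMMUF_readable | IOMMUF_writable);
    if ( unlikely(ret) && !is_hardware_domain(d) )
        domain_crash(d); /* the crash is now reported against this call site */

    return ret;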

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: George Dunlap <George.Dunlap@eu.citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>

v2:
 - New in v2.
---
 xen/arch/arm/p2m.c                  |  4 ++++
 xen/arch/x86/mm.c                   |  3 +++
 xen/arch/x86/mm/p2m-ept.c           |  3 +++
 xen/arch/x86/mm/p2m-pt.c            |  3 +++
 xen/arch/x86/mm/p2m.c               | 24 ++++++++++++++++++++----
 xen/common/grant_table.c            |  8 ++++++++
 xen/common/memory.c                 |  3 +++
 xen/drivers/passthrough/iommu.c     | 12 ------------
 xen/drivers/passthrough/x86/iommu.c |  4 ++++
 9 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index d719db1e30..eb39861b73 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -953,8 +953,12 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 
     if ( need_iommu(p2m->domain) &&
          (lpae_valid(orig_pte) || lpae_valid(*entry)) )
+    {
         rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
                                1UL << page_order);
+        if ( unlikely(rc) && !is_hardware_domain(p2m->domain) )
+            domain_crash(p2m->domain);
+    }
     else
         rc = 0;
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 99070f916d..08878574f3 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2737,6 +2737,9 @@ static int _get_page_type(struct page_info *page, unsigned long type,
                 iommu_ret = iommu_map_page(d, _bfn(mfn_x(mfn)), mfn,
                                            IOMMUF_readable |
                                            IOMMUF_writable);
+
+            if ( unlikely(iommu_ret) && !is_hardware_domain(d) )
+                domain_crash(d);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 2089b5232d..33e77903d6 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -895,6 +895,9 @@ out:
                     if ( !rc )
                         rc = ret;
                 }
+
+            if ( unlikely(rc) && !is_hardware_domain(d) )
+                domain_crash(d);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index a441af388a..9ff0b3fe7a 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -717,6 +717,9 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
                         rc = ret;
                 }
         }
+
+        if ( unlikely(rc) && !is_hardware_domain(p2m->domain) )
+            domain_crash(p2m->domain);
     }
 
     /*
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fbf67def50..036f52f4f1 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -724,6 +724,9 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
                 if ( !rc )
                     rc = ret;
             }
+
+            if ( unlikely(rc) && !is_hardware_domain(p2m->domain) )
+                domain_crash(p2m->domain);
         }
 
         return rc;
@@ -789,6 +792,9 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
                         if ( iommu_unmap_page(d, bfn_add(bfn, i)) )
                             continue;
 
+                    if ( !is_hardware_domain(d) )
+                        domain_crash(d);
+
                     return rc;
                 }
             }
@@ -1157,12 +1163,17 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int ret;
 
-    if ( !paging_mode_translate(p2m->domain) )
+    if ( !paging_mode_translate(d) )
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
-                              IOMMUF_readable | IOMMUF_writable);
+
+        ret = iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
+                             IOMMUF_readable | IOMMUF_writable);
+        if ( unlikely(ret) && !is_hardware_domain(d) )
+            domain_crash(d);
+
+        return ret;
     }
 
     gfn_lock(p2m, gfn, 0);
@@ -1252,7 +1263,12 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_unmap_page(d, _bfn(gfn_l));
+
+        ret = iommu_unmap_page(d, _bfn(gfn_l));
+        if ( unlikely(ret) && !is_hardware_domain(d) )
+            domain_crash(d);
+
+        return ret;
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index a83c8832af..1840b656c9 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1146,6 +1146,9 @@ map_grant_ref(
         }
         if ( err )
         {
+            if ( !is_hardware_domain(ld) )
+                domain_crash(ld);
+
             double_gt_unlock(lgt, rgt);
             rc = GNTST_general_error;
             goto undo_out;
@@ -1412,7 +1415,12 @@ unmap_common(
         double_gt_unlock(lgt, rgt);
 
         if ( err )
+        {
+            if ( !is_hardware_domain(ld) )
+                domain_crash(ld);
+
             rc = GNTST_general_error;
+        }
     }
 
     /* If just unmapped a writable mapping, mark as dirtied */
diff --git a/xen/common/memory.c b/xen/common/memory.c
index a5656743c2..0a34677cc3 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -840,6 +840,9 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
         ret = iommu_iotlb_flush(d, _bfn(xatp->gpfn - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
+
+        if ( unlikely(rc < 0) && !is_hardware_domain(d) )
+            domain_crash(d);
     }
 #endif
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 1f32958816..21e6886a3f 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -272,9 +272,6 @@ int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
             printk(XENLOG_ERR
                    "d%d: IOMMU mapping bfn %"PRI_bfn" to mfn %"PRI_mfn" failed: %d\n",
                    d->domain_id, bfn_x(bfn), mfn_x(mfn), rc);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
@@ -295,9 +292,6 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
             printk(XENLOG_ERR
                    "d%d: IOMMU unmapping bfn %"PRI_bfn" failed: %d\n",
                    d->domain_id, bfn_x(bfn), rc);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
@@ -335,9 +329,6 @@ int iommu_iotlb_flush(struct domain *d, bfn_t bfn, unsigned int page_count)
             printk(XENLOG_ERR
                    "d%d: IOMMU IOTLB flush failed: %d, bfn %"PRI_bfn", page count %u\n",
                    d->domain_id, rc, bfn_x(bfn), page_count);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
@@ -358,9 +349,6 @@ int iommu_iotlb_flush_all(struct domain *d)
             printk(XENLOG_ERR
                    "d%d: IOMMU IOTLB flush all failed: %d\n",
                    d->domain_id, rc);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 18cc449be5..673998b037 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -89,7 +89,11 @@ int arch_iommu_populate_page_table(struct domain *d)
     this_cpu(iommu_dont_flush_iotlb) = 0;
 
     if ( !rc )
+    {
         rc = iommu_iotlb_flush_all(d);
+        if ( unlikely(rc) )
+            domain_crash(d);
+    }
 
     if ( rc && rc != -ERESTART )
         iommu_teardown(d);
-- 
2.11.0



* [PATCH v4 06/15] public / x86: introduce __HYPERCALL_iommu_op
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (4 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 07/15] iommu: track reserved ranges using a rangeset Paul Durrant
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Paul Durrant, Jan Beulich,
	Daniel De Graaf

This patch introduces the boilerplate for a new hypercall to allow a
domain to control IOMMU mappings for its own pages.

Whilst there is duplication of code between the native and compat entry
points, which appears ripe for some form of consolidation, I think it is
better to maintain the separation as-is because the compat entry point
will necessarily gain complexity in subsequent patches.

NOTE: This hypercall is only implemented for x86 and is currently
      restricted by XSM to dom0. Its scope can be expanded in future
      if need be.
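
To illustrate the calling convention, a hypothetical guest-side invocation
might look like the sketch below. The HYPERVISOR_iommu_op() wrapper and
set_xen_guest_handle() belong to guest code and are not part of this series,
and since no op types are defined yet any op value simply comes back with
-EOPNOTSUPP in the status field:

    struct xen_iommu_op op = {
        .op = 0,                       /* no op types defined yet */
    };
    struct xen_iommu_op_buf buf = {
        .size = sizeof(op),
    };
    int rc;

    set_xen_guest_handle(buf.h, &op);  /* point the buffer at the op struct */

    rc = HYPERVISOR_iommu_op(1, &buf); /* nr_bufs == 1 */
    if ( !rc )
        rc = op.status;                /* per-op status is returned in-band */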

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>

v3:
 - Push op code into XSM check.

v2:
 - Get rid of the can_control_iommu() function, leaving this patch as pure
   boilerplate.
 - Re-structure the hypercall to use a buffer array, similar to that used
   by __HYPERCALL_dm_op, to allow for future expansion of op structure
   without affecting binary compatibility.
 - Drop use of __ in public header guard.
---
 tools/flask/policy/modules/xen.if   |   1 +
 xen/arch/x86/Makefile               |   1 +
 xen/arch/x86/hvm/hypercall.c        |   1 +
 xen/arch/x86/hypercall.c            |   1 +
 xen/arch/x86/iommu_op.c             | 184 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/pv/hypercall.c         |   1 +
 xen/include/Makefile                |   2 +
 xen/include/public/iommu_op.h       |  64 +++++++++++++
 xen/include/public/xen.h            |   1 +
 xen/include/xen/hypercall.h         |  12 +++
 xen/include/xlat.lst                |   2 +
 xen/include/xsm/dummy.h             |   6 ++
 xen/include/xsm/xsm.h               |   6 ++
 xen/xsm/dummy.c                     |   1 +
 xen/xsm/flask/hooks.c               |   6 ++
 xen/xsm/flask/policy/access_vectors |   2 +
 16 files changed, 291 insertions(+)
 create mode 100644 xen/arch/x86/iommu_op.c
 create mode 100644 xen/include/public/iommu_op.h

diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
index 7aefd0061e..5161611471 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -60,6 +60,7 @@ define(`create_domain_common', `
 	allow $1 $2:grant setup;
 	allow $1 $2:hvm { getparam hvmctl sethvmc
 			setparam nested altp2mhvm altp2mhvm_op dm };
+	allow $1 $2:resource control_iommu;
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 17e7d3fa34..35bfaf7e8a 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -33,6 +33,7 @@ obj-$(CONFIG_CRASH_DEBUG) += gdbstub.o
 obj-y += hypercall.o
 obj-y += i387.o
 obj-y += i8259.o
+obj-y += iommu_op.o
 obj-y += io_apic.o
 obj-$(CONFIG_LIVEPATCH) += alternative.o livepatch.o
 obj-y += msi.o
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 85eacd7d33..3574966827 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -137,6 +137,7 @@ static const hypercall_table_t hvm_hypercall_table[] = {
     COMPAT_CALL(mmuext_op),
     HYPERCALL(xenpmu_op),
     COMPAT_CALL(dm_op),
+    COMPAT_CALL(iommu_op),
     HYPERCALL(arch_1)
 };
 
diff --git a/xen/arch/x86/hypercall.c b/xen/arch/x86/hypercall.c
index 90e88c1d2c..045753e702 100644
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -68,6 +68,7 @@ const hypercall_args_t hypercall_args_table[NR_hypercalls] =
     ARGS(xenpmu_op, 2),
     ARGS(dm_op, 3),
     ARGS(mca, 1),
+    ARGS(iommu_op, 2),
     ARGS(arch_1, 1),
 };
 
diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
new file mode 100644
index 0000000000..744c0fce27
--- /dev/null
+++ b/xen/arch/x86/iommu_op.c
@@ -0,0 +1,184 @@
+/******************************************************************************
+ * x86/iommu_op.c
+ *
+ * Paravirtualised IOMMU functionality
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2018 Citrix Systems Inc
+ */
+
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <xen/hypercall.h>
+
+static void iommu_op(xen_iommu_op_t *op)
+{
+    switch ( op->op )
+    {
+    default:
+        op->status = -EOPNOTSUPP;
+        break;
+    }
+}
+
+int do_one_iommu_op(xen_iommu_op_buf_t *buf)
+{
+    xen_iommu_op_t op;
+    int rc;
+
+    if ( buf->size < sizeof(op) )
+        return -EFAULT;
+
+    if ( copy_from_guest((void *)&op, buf->h, sizeof(op)) )
+        return -EFAULT;
+
+    if ( op.pad )
+        return -EINVAL;
+
+    rc = xsm_iommu_op(XSM_PRIV, current->domain, op.op);
+    if ( rc )
+        return rc;
+
+    iommu_op(&op);
+
+    if ( __copy_field_to_guest(guest_handle_cast(buf->h, xen_iommu_op_t),
+                               &op, status) )
+        return -EFAULT;
+
+    return 0;
+}
+
+long do_iommu_op(unsigned int nr_bufs,
+                 XEN_GUEST_HANDLE_PARAM(xen_iommu_op_buf_t) bufs)
+{
+    unsigned int i;
+    long rc = 0;
+
+    for ( i = 0; i < nr_bufs; i++ )
+    {
+        xen_iommu_op_buf_t buf;
+
+        if ( ((i & 0xff) == 0xff) && hypercall_preempt_check() )
+        {
+            rc = i;
+            break;
+        }
+
+        if ( copy_from_guest_offset(&buf, bufs, i, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = do_one_iommu_op(&buf);
+        if ( rc )
+            break;
+    }
+
+    if ( rc > 0 )
+    {
+        ASSERT(rc < nr_bufs);
+        nr_bufs -= rc;
+        guest_handle_add_offset(bufs, rc);
+
+        rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
+                                           "ih", nr_bufs, bufs);
+    }
+
+    return rc;
+}
+
+int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
+{
+    compat_iommu_op_t cmp;
+    xen_iommu_op_t nat;
+    int rc;
+
+    if ( buf->size < sizeof(cmp) )
+        return -EFAULT;
+
+    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
+        return -EFAULT;
+
+    if ( cmp.pad )
+        return -EINVAL;
+
+    rc = xsm_iommu_op(XSM_PRIV, current->domain, cmp.op);
+    if ( rc )
+        return rc;
+
+    XLAT_iommu_op(&nat, &cmp);
+
+    iommu_op(&nat);
+
+    XLAT_iommu_op(&cmp, &nat);
+
+    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
+                                                   compat_iommu_op_t),
+                                &cmp, status) )
+        return -EFAULT;
+
+    return 0;
+}
+
+int compat_iommu_op(unsigned int nr_bufs,
+                    XEN_GUEST_HANDLE_PARAM(compat_iommu_op_buf_t) bufs)
+{
+    unsigned int i;
+    long rc = 0;
+
+    for ( i = 0; i < nr_bufs; i++ )
+    {
+        compat_iommu_op_buf_t buf;
+
+        if ( ((i & 0xff) == 0xff) && hypercall_preempt_check() )
+        {
+            rc = i;
+            break;
+        }
+
+        if ( copy_from_guest_offset(&buf, bufs, i, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = compat_one_iommu_op(&buf);
+        if ( rc )
+            break;
+    }
+
+    if ( rc > 0 )
+    {
+        ASSERT(rc < nr_bufs);
+        nr_bufs -= rc;
+        guest_handle_add_offset(bufs, rc);
+
+        rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
+                                           "ih", nr_bufs, bufs);
+    }
+
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index bbc3011d1a..d23f9af42f 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -80,6 +80,7 @@ const hypercall_table_t pv_hypercall_table[] = {
     HYPERCALL(xenpmu_op),
     COMPAT_CALL(dm_op),
     HYPERCALL(mca),
+    COMPAT_CALL(iommu_op),
     HYPERCALL(arch_1),
 };
 
diff --git a/xen/include/Makefile b/xen/include/Makefile
index df04182965..af54d8833f 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -11,6 +11,7 @@ headers-y := \
     compat/features.h \
     compat/grant_table.h \
     compat/kexec.h \
+    compat/iommu_op.h \
     compat/memory.h \
     compat/nmi.h \
     compat/physdev.h \
@@ -29,6 +30,7 @@ headers-$(CONFIG_X86)     += compat/arch-x86/xen-$(compat-arch-y).h
 headers-$(CONFIG_X86)     += compat/hvm/dm_op.h
 headers-$(CONFIG_X86)     += compat/hvm/hvm_op.h
 headers-$(CONFIG_X86)     += compat/hvm/hvm_vcpu.h
+headers-$(CONFIG_X86)     += compat/iommu_op.h
 headers-y                 += compat/arch-$(compat-arch-y).h compat/pmu.h compat/xlat.h
 headers-$(CONFIG_FLASK)   += compat/xsm/flask_op.h
 
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
new file mode 100644
index 0000000000..c3b68f665a
--- /dev/null
+++ b/xen/include/public/iommu_op.h
@@ -0,0 +1,64 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (C) 2018 Citrix Systems Inc
+ */
+
+#ifndef XEN_PUBLIC_IOMMU_OP_H
+#define XEN_PUBLIC_IOMMU_OP_H
+
+#include "xen.h"
+
+struct xen_iommu_op {
+    uint16_t op;    /* op type */
+    uint16_t pad;
+    int32_t status; /* op completion status: */
+                    /* 0 for success otherwise, negative errno */
+};
+typedef struct xen_iommu_op xen_iommu_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_t);
+
+struct xen_iommu_op_buf {
+    XEN_GUEST_HANDLE(void) h;
+    xen_ulong_t size;
+};
+typedef struct xen_iommu_op_buf xen_iommu_op_buf_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_buf_t);
+
+/* ` enum neg_errnoval
+ * ` HYPERVISOR_iommu_op(unsigned int nr_bufs,
+ * `                     xen_iommu_op_buf_t bufs[])
+ * `
+ *
+ * @nr_bufs is the number of buffers in the @bufs array.
+ * @bufs points to an array of buffers where each contains a struct
+ * xen_iommu_op.
+ */
+
+#endif /* XEN_PUBLIC_IOMMU_OP_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
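
As a reading aid (not part of the patch itself), a guest might invoke the
new hypercall roughly as follows. This is only a sketch: the
HYPERVISOR_iommu_op() wrapper and set_xen_guest_handle() are assumed to be
provided by the guest's Xen headers, and no op types are defined until a
later patch in the series.

    struct xen_iommu_op op = {
        .op = 0,             /* op types are added by later patches */
    };
    struct xen_iommu_op_buf buf;
    int rc;

    set_xen_guest_handle(buf.h, &op);  /* each buffer wraps one op struct */
    buf.size = sizeof(op);

    rc = HYPERVISOR_iommu_op(1, &buf); /* nr_bufs == 1 */
    if ( rc == 0 )
        rc = op.status;                /* per-op status is written back */
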
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index fb1df8f293..68b0968e7d 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -121,6 +121,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
 #define __HYPERVISOR_xenpmu_op            40
 #define __HYPERVISOR_dm_op                41
+#define __HYPERVISOR_iommu_op             42
 
 /* Architecture-specific hypercall definitions. */
 #define __HYPERVISOR_arch_0               48
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index cc99aea57d..2ebc999f4b 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -16,6 +16,7 @@
 #include <public/version.h>
 #include <public/pmu.h>
 #include <public/hvm/dm_op.h>
+#include <public/iommu_op.h>
 #include <asm/hypercall.h>
 #include <xsm/xsm.h>
 
@@ -148,6 +149,10 @@ do_dm_op(
     unsigned int nr_bufs,
     XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs);
 
+extern long
+do_iommu_op(unsigned int nr_bufs,
+            XEN_GUEST_HANDLE_PARAM(xen_iommu_op_buf_t) bufs);
+
 #ifdef CONFIG_COMPAT
 
 extern int
@@ -205,6 +210,13 @@ compat_dm_op(
     unsigned int nr_bufs,
     XEN_GUEST_HANDLE_PARAM(void) bufs);
 
+#include <compat/iommu_op.h>
+
+DEFINE_XEN_GUEST_HANDLE(compat_iommu_op_buf_t);
+extern int
+compat_iommu_op(unsigned int nr_bufs,
+                XEN_GUEST_HANDLE_PARAM(compat_iommu_op_buf_t) bufs);
+
 #endif
 
 void arch_get_xen_caps(xen_capabilities_info_t *info);
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 527332054a..3b15c18c4e 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -77,6 +77,8 @@
 ?	vcpu_hvm_context		hvm/hvm_vcpu.h
 ?	vcpu_hvm_x86_32			hvm/hvm_vcpu.h
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
+!	iommu_op			iommu_op.h
+!	iommu_op_buf			iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
 !	kexec_range			kexec.h
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index ff6b2dbf39..8891da0759 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -715,6 +715,12 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_iommu_op(XSM_DEFAULT_ARG struct domain *d, unsigned int op)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
 #endif /* CONFIG_X86 */
 
 #include <public/version.h>
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index f0c6fc7e29..c47c1418eb 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -178,6 +178,7 @@ struct xsm_operations {
     int (*ioport_mapping) (struct domain *d, uint32_t s, uint32_t e, uint8_t allow);
     int (*pmu_op) (struct domain *d, unsigned int op);
     int (*dm_op) (struct domain *d);
+    int (*iommu_op) (struct domain *d, unsigned int op);
 #endif
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
@@ -686,6 +687,11 @@ static inline int xsm_dm_op(xsm_default_t def, struct domain *d)
     return xsm_ops->dm_op(d);
 }
 
+static inline int xsm_iommu_op(xsm_default_t def, struct domain *d, unsigned int op)
+{
+    return xsm_ops->iommu_op(d, op);
+}
+
 #endif /* CONFIG_X86 */
 
 static inline int xsm_xen_version (xsm_default_t def, uint32_t op)
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 6e751199ee..02512a1566 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -155,6 +155,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, ioport_mapping);
     set_to_dummy_if_null(ops, pmu_op);
     set_to_dummy_if_null(ops, dm_op);
+    set_to_dummy_if_null(ops, iommu_op);
 #endif
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 78bc32602e..a04786106f 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1672,6 +1672,11 @@ static int flask_dm_op(struct domain *d)
     return current_has_perm(d, SECCLASS_HVM, HVM__DM);
 }
 
+static int flask_iommu_op(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_RESOURCE, RESOURCE__CONTROL_IOMMU);
+}
+
 #endif /* CONFIG_X86 */
 
 static int flask_xen_version (uint32_t op)
@@ -1850,6 +1855,7 @@ static struct xsm_operations flask_ops = {
     .ioport_mapping = flask_ioport_mapping,
     .pmu_op = flask_pmu_op,
     .dm_op = flask_dm_op,
+    .iommu_op = flask_iommu_op,
 #endif
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index c5d85486d0..0c894b733e 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -473,6 +473,8 @@ class resource
 # checked for PHYSDEVOP_setup_gsi (target IRQ)
 # checked for PHYSDEVOP_pci_mmcfg_reserved (target xen_t)
     setup
+# checked for IOMMU_OP
+    control_iommu
 }
 
 # Class security describes the FLASK security server itself; these operations
-- 
2.11.0



* [PATCH v4 07/15] iommu: track reserved ranges using a rangeset
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (5 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 06/15] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 08/15] x86: add iommu_op to query reserved ranges Paul Durrant
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, Paul Durrant, Jan Beulich

Ranges that should be considered reserved in the IOMMU are not necessarily
limited to RMRRs. If iommu_inclusive_mapping is set, then any frame number
falling within an E820 reserved region should also be considered reserved
in the IOMMU.

This patch adds a rangeset to the domain_iommu structure that is then used
to track all reserved ranges. This will be needed by a subsequent patch
to test whether it is safe to modify a particular IOMMU entry.
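
The intended usage pattern is roughly as sketched below: populate the
rangeset as reserved regions are discovered, then consult it before
honouring any modification request. (Sketch only: rangeset_contains_range()
and the -EPERM return value are illustrative assumptions, not hunks from
this series.)

    /* At RMRR / inclusive-mapping setup time: record the reserved BFNs. */
    rc = rangeset_add_range(hd->reserved_ranges, base_pfn, end_pfn);

    /* Later, before modifying a single IOMMU entry on a guest's behalf: */
    if ( rangeset_contains_range(dom_iommu(d)->reserved_ranges, bfn, bfn) )
        return -EPERM;   /* reserved entries must not be touched */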

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v2:
 - New in v2.
---
 xen/drivers/passthrough/iommu.c       | 10 +++++++++-
 xen/drivers/passthrough/vtd/iommu.c   | 20 +++++++++++++-------
 xen/drivers/passthrough/vtd/x86/vtd.c | 17 ++++++++++++++++-
 xen/include/xen/iommu.h               |  6 ++++++
 4 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 21e6886a3f..b10a37e5d7 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -147,6 +147,10 @@ int iommu_domain_init(struct domain *d)
     if ( !iommu_enabled )
         return 0;
 
+    hd->reserved_ranges = rangeset_new(d, NULL, 0);
+    if ( !hd->reserved_ranges )
+        return -ENOMEM;
+
     hd->platform_ops = iommu_get_ops();
     return hd->platform_ops->init(d);
 }
@@ -248,12 +252,16 @@ int iommu_construct(struct domain *d)
 
 void iommu_domain_destroy(struct domain *d)
 {
-    if ( !iommu_enabled || !dom_iommu(d)->platform_ops )
+    const struct domain_iommu *hd = dom_iommu(d);
+
+    if ( !iommu_enabled || !hd->platform_ops )
         return;
 
     iommu_teardown(d);
 
     arch_iommu_domain_destroy(d);
+
+    rangeset_destroy(hd->reserved_ranges);
 }
 
 int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index c9f50f04ad..282e227414 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1910,6 +1910,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
     unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> PAGE_SHIFT_4K;
     struct mapped_rmrr *mrmrr;
     struct domain_iommu *hd = dom_iommu(d);
+    int err = 0;
 
     ASSERT(pcidevs_locked());
     ASSERT(rmrr->base_address < rmrr->end_address);
@@ -1923,8 +1924,6 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
         if ( mrmrr->base == rmrr->base_address &&
              mrmrr->end == rmrr->end_address )
         {
-            int ret = 0;
-
             if ( map )
             {
                 ++mrmrr->count;
@@ -1934,28 +1933,35 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
             if ( --mrmrr->count )
                 return 0;
 
-            while ( base_pfn < end_pfn )
+            err = rangeset_remove_range(hd->reserved_ranges,
+                                        base_pfn, end_pfn);
+            while ( !err && base_pfn < end_pfn )
             {
                 if ( clear_identity_p2m_entry(d, base_pfn) )
-                    ret = -ENXIO;
+                    err = -ENXIO;
+
                 base_pfn++;
             }
 
             list_del(&mrmrr->list);
             xfree(mrmrr);
-            return ret;
+            return err;
         }
     }
 
     if ( !map )
         return -ENOENT;
 
+    err = rangeset_add_range(hd->reserved_ranges, base_pfn, end_pfn);
+    if ( err )
+        return err;
+
     while ( base_pfn < end_pfn )
     {
-        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
-
+        err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
         if ( err )
             return err;
+
         base_pfn++;
     }
 
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index 6fed4a92cb..032412b8c6 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -164,10 +164,25 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
 
             if ( !rc )
                rc = ret;
+
+            /*
+             * The only reason a reserved page would be mapped is that
+             * iommu_inclusive_mapping is set, in which case the BFN
+             * needs to be marked as reserved in the IOMMU.
+             */
+            if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) )
+            {
+                ASSERT(iommu_inclusive_mapping);
+
+                ret = rangeset_add_singleton(dom_iommu(d)->reserved_ranges,
+                                             bfn_x(bfn));
+                if ( !rc )
+                    rc = ret;
+            }
         }
 
         if ( rc )
-            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping failed: %d\n",
+            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping/reservation failed: %d\n",
                    d->domain_id, rc);
 
         if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 624784fec8..cc0be81b4e 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -122,6 +122,12 @@ struct domain_iommu {
 
     /* Features supported by the IOMMU */
     DECLARE_BITMAP(features, IOMMU_FEAT_count);
+
+    /*
+     * BFN ranges that are reserved in the domain IOMMU mappings and
+     * must not be modified after initialization.
+     */
+    struct rangeset *reserved_ranges;
 };
 
 #define dom_iommu(d)              (&(d)->iommu)
-- 
2.11.0



* [PATCH v4 08/15] x86: add iommu_op to query reserved ranges
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (6 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 07/15] iommu: track reserved ranges using a rangeset Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 09/15] vtd: add lookup_page method to iommu_ops Paul Durrant
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Paul Durrant, Jan Beulich

This patch adds an iommu_op to allow the domain IOMMU reserved ranges to be
queried by the guest.

NOTE: The number of reserved ranges is determined by system firmware, in
      conjunction with Xen command line options, and is expected to be
      small. Thus, to avoid over-complicating the code, there is no
      pre-emption check within the operation.
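
A guest would typically issue the query with a buffer it has sized itself
and retry if -ENOBUFS comes back. A sketch (hypercall plumbing from patch
#6 elided; the array size is arbitrary):

    xen_iommu_reserved_range_t ranges[8];
    struct xen_iommu_op op = {
        .op = XEN_IOMMUOP_query_reserved,
    };

    op.u.query_reserved.nr_entries = sizeof(ranges) / sizeof(ranges[0]);
    set_xen_guest_handle(op.u.query_reserved.ranges, ranges);
    /* A NULL handle with nr_entries == 0 would just return the count. */

    /* ... wrap 'op' in a xen_iommu_op_buf_t and issue HYPERVISOR_iommu_op() ... */

    if ( op.status == 0 )
        /* ranges[0 .. nr_entries - 1] describe the reserved BFN ranges */;
    else if ( op.status == -ENOBUFS )
        /* nr_entries holds the real count; retry with a larger buffer */;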

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v4:
 - Make xen_bfn_t strictly 64 bits wide and drop associated compat
   translation.

v3:
 - Avoid speculation beyond array bounds check.

v2:
 - Re-implemented for v2 based on new rangeset.
---
 xen/arch/x86/iommu_op.c       | 164 ++++++++++++++++++++++++++++++++++++++++--
 xen/include/public/iommu_op.h |  39 ++++++++++
 xen/include/xlat.lst          |   2 +
 3 files changed, 199 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
index 744c0fce27..bcfcd49102 100644
--- a/xen/arch/x86/iommu_op.c
+++ b/xen/arch/x86/iommu_op.c
@@ -22,11 +22,70 @@
 #include <xen/event.h>
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
+#include <xen/nospec.h>
+
+struct get_reserved_ctxt {
+    unsigned int max_entries;
+    unsigned int nr_entries;
+    XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
+};
+
+static int get_reserved(unsigned long s, unsigned long e, void *arg)
+{
+    struct get_reserved_ctxt *ctxt = arg;
+
+    if ( ctxt->nr_entries < ctxt->max_entries )
+    {
+        xen_iommu_reserved_range_t range = {
+            .start_bfn = s,
+            .nr_frames = e - s,
+        };
+
+        if ( copy_to_guest_offset(ctxt->ranges, ctxt->nr_entries, &range,
+                                  1) )
+            return -EFAULT;
+    }
+
+    ctxt->nr_entries++;
+    return 0;
+}
+
+static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
+{
+    struct domain *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    struct get_reserved_ctxt ctxt = {
+        .max_entries = op->nr_entries,
+        .ranges = op->ranges,
+    };
+    int rc;
+
+    if ( op->pad )
+        return -EINVAL;
+
+    rc = rangeset_report_ranges(iommu->reserved_ranges, 0, ~0ul,
+                                get_reserved, &ctxt);
+    if ( rc )
+        return rc;
+
+    /* Pass back the actual number of reserved ranges */
+    op->nr_entries = ctxt.nr_entries;
+
+    if ( !guest_handle_is_null(ctxt.ranges) &&
+         ctxt.nr_entries > ctxt.max_entries )
+        return -ENOBUFS;
+
+    return 0;
+}
 
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
     {
+    case XEN_IOMMUOP_query_reserved:
+        op->status = iommu_op_query_reserved(&op->u.query_reserved);
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -35,13 +94,20 @@ static void iommu_op(xen_iommu_op_t *op)
 
 int do_one_iommu_op(xen_iommu_op_buf_t *buf)
 {
-    xen_iommu_op_t op;
+    xen_iommu_op_t op = {};
+    size_t offset;
+    static const size_t op_size[] = {
+        [XEN_IOMMUOP_query_reserved] = sizeof(struct xen_iommu_op_query_reserved),
+    };
+    size_t size;
     int rc;
 
-    if ( buf->size < sizeof(op) )
+    offset = offsetof(struct xen_iommu_op, u);
+
+    if ( buf->size < offset )
         return -EFAULT;
 
-    if ( copy_from_guest((void *)&op, buf->h, sizeof(op)) )
+    if ( copy_from_guest((void *)&op, buf->h, offset) )
         return -EFAULT;
 
     if ( op.pad )
@@ -51,6 +117,16 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     if ( rc )
         return rc;
 
+    if ( op.op >= ARRAY_SIZE(op_size) )
+        return -EOPNOTSUPP;
+
+    size = op_size[array_index_nospec(op.op, ARRAY_SIZE(op_size))];
+    if ( buf->size < offset + size )
+        return -EFAULT;
+
+    if ( copy_from_guest_offset((void *)&op.u, buf->h, offset, size) )
+        return -EFAULT;
+
     iommu_op(&op);
 
     if ( __copy_field_to_guest(guest_handle_cast(buf->h, xen_iommu_op_t),
@@ -100,16 +176,27 @@ long do_iommu_op(unsigned int nr_bufs,
     return rc;
 }
 
+CHECK_iommu_reserved_range;
+
 int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
 {
-    compat_iommu_op_t cmp;
+    compat_iommu_op_t cmp = {};
+    size_t offset;
+    static const size_t op_size[] = {
+        [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
+    };
+    size_t size;
     xen_iommu_op_t nat;
+    unsigned int u;
+    int32_t status;
     int rc;
 
-    if ( buf->size < sizeof(cmp) )
+    offset = offsetof(struct compat_iommu_op, u);
+
+    if ( buf->size < offset )
         return -EFAULT;
 
-    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
+    if ( copy_from_compat((void *)&cmp, buf->h, offset) )
         return -EFAULT;
 
     if ( cmp.pad )
@@ -119,17 +206,82 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     if ( rc )
         return rc;
 
+    if ( cmp.op >= ARRAY_SIZE(op_size) )
+        return -EOPNOTSUPP;
+
+    size = op_size[array_index_nospec(cmp.op, ARRAY_SIZE(op_size))];
+    if ( buf->size < offset + size )
+        return -EFAULT;
+
+    if ( copy_from_compat_offset((void *)&cmp.u, buf->h, offset, size) )
+        return -EFAULT;
+
+    /*
+     * The xlat magic doesn't quite know how to handle the union so
+     * we need to fix things up here.
+     */
+#define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
+    u = cmp.op;
+
+#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
+    do                                                                \
+    {                                                                 \
+        if ( !compat_handle_is_null((_s_)->ranges) )                  \
+        {                                                             \
+            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;     \
+            xen_iommu_reserved_range_t *ranges =                      \
+                (void *)(nr_entries + 1);                             \
+                                                                      \
+            if ( sizeof(*nr_entries) +                                \
+                 (sizeof(*ranges) * (_s_)->nr_entries) >              \
+                 COMPAT_ARG_XLAT_SIZE )                               \
+                return -E2BIG;                                        \
+                                                                      \
+            *nr_entries = (_s_)->nr_entries;                          \
+            set_xen_guest_handle((_d_)->ranges, ranges);              \
+        }                                                             \
+        else                                                          \
+            set_xen_guest_handle((_d_)->ranges, NULL);                \
+    } while (false)
+
     XLAT_iommu_op(&nat, &cmp);
 
+#undef XLAT_iommu_op_query_reserved_HNDL_ranges
+
     iommu_op(&nat);
 
+    status = nat.status;
+
+#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)               \
+    do                                                                   \
+    {                                                                    \
+        if ( !compat_handle_is_null((_d_)->ranges) )                     \
+        {                                                                \
+            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;        \
+            compat_iommu_reserved_range_t *ranges =                      \
+                (void *)(nr_entries + 1);                                \
+            unsigned int nr =                                            \
+                min_t(unsigned int, (_d_)->nr_entries, *nr_entries);     \
+                                                                         \
+            if ( __copy_to_compat_offset((_d_)->ranges, 0, ranges, nr) ) \
+                status = -EFAULT;                                        \
+        }                                                                \
+    } while (false)
+
     XLAT_iommu_op(&cmp, &nat);
 
+    /* status will have been modified if __copy_to_compat_offset() failed */
+    cmp.status = status;
+
+#undef XLAT_iommu_op_query_reserved_HNDL_ranges
+
     if ( __copy_field_to_compat(compat_handle_cast(buf->h,
                                                    compat_iommu_op_t),
                                 &cmp, status) )
         return -EFAULT;
 
+#undef XLAT_iommu_op_u_query_reserved
+
     return 0;
 }
 
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index c3b68f665a..ade404a877 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -25,11 +25,50 @@
 
 #include "xen.h"
 
+typedef uint64_t xen_bfn_t;
+
+/* Structure describing a single range reserved in the IOMMU */
+struct xen_iommu_reserved_range {
+    xen_bfn_t start_bfn;
+    unsigned int nr_frames;
+    unsigned int pad;
+};
+typedef struct xen_iommu_reserved_range xen_iommu_reserved_range_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_reserved_range_t);
+
+/*
+ * XEN_IOMMUOP_query_reserved: Query ranges reserved in the IOMMU.
+ */
+#define XEN_IOMMUOP_query_reserved 1
+
+struct xen_iommu_op_query_reserved {
+    /*
+     * IN/OUT - On entry this is the number of entries available
+     *          in the ranges array below.
+     *          On exit this is the actual number of reserved ranges.
+     */
+    unsigned int nr_entries;
+    unsigned int pad;
+    /*
+     * OUT - This array is populated with reserved ranges. If it is
+     *       not sufficiently large then available entries are populated,
+     *       but the op status code will be set to -ENOBUFS.
+     *       It is permissible to set this to NULL if nr_entries is also
+     *       set to zero. In this case, on exit, nr_entries will still be
+     *       set to the actual number of reserved ranges but the status
+     *       code will be set to zero.
+     */
+    XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
+};
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
     int32_t status; /* op completion status: */
                     /* 0 for success, otherwise negative errno */
+    union {
+        struct xen_iommu_op_query_reserved query_reserved;
+    } u;
 };
 typedef struct xen_iommu_op xen_iommu_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_t);
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 3b15c18c4e..d2f9b1034b 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -79,6 +79,8 @@
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
 !	iommu_op			iommu_op.h
 !	iommu_op_buf			iommu_op.h
+!	iommu_op_query_reserved		iommu_op.h
+?	iommu_reserved_range		iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
 !	kexec_range			kexec.h
-- 
2.11.0



* [PATCH v4 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (7 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 08/15] x86: add iommu_op to query reserved ranges Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, Paul Durrant, George Dunlap, Jan Beulich

This patch adds a new method to the VT-d IOMMU implementation to find the
MFN currently mapped by the specified BFN, along with a wrapper function
in generic IOMMU code that calls the implementation if it exists.

This functionality will be used by a subsequent patch.
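
For illustration, a caller inside the hypervisor would use the new wrapper
along these lines (sketch only; the real caller appears in a later patch):

    mfn_t mfn;
    unsigned int flags;
    int rc = iommu_lookup_page(d, bfn, &mfn, &flags);

    if ( rc == 0 && (flags & IOMMUF_writable) )
        /* bfn is currently mapped, writable, onto mfn in d's IOMMU context */;
    else if ( rc == -ENOENT )
        /* no mapping is present for bfn */;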

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: George Dunlap <george.dunlap@citrix.com>

v3:
 - Addressed comments from George.

v2:
 - Addressed some comments from Jan.
---
 xen/drivers/passthrough/iommu.c     | 11 +++++++++++
 xen/drivers/passthrough/vtd/iommu.c | 34 ++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/vtd/iommu.h |  3 +++
 xen/include/xen/iommu.h             |  4 ++++
 4 files changed, 52 insertions(+)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index b10a37e5d7..9b7baca93f 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
     return rc;
 }
 
+int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                      unsigned int *flags)
+{
+    const struct domain_iommu *hd = dom_iommu(d);
+
+    if ( !iommu_enabled || !hd->platform_ops )
+        return -EOPNOTSUPP;
+
+    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
+}
+
 static void iommu_free_pagetables(unsigned long unused)
 {
     do {
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 282e227414..8cd3b59aa0 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1830,6 +1830,39 @@ static int __must_check intel_iommu_unmap_page(struct domain *d,
     return dma_pte_clear_one(d, bfn_to_baddr(bfn));
 }
 
+static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                                   unsigned int *flags)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+    struct dma_pte *page = NULL, *pte = NULL, val;
+    u64 pg_maddr;
+
+    spin_lock(&hd->arch.mapping_lock);
+
+    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
+    if ( pg_maddr == 0 )
+    {
+        spin_unlock(&hd->arch.mapping_lock);
+        return -ENOMEM;
+    }
+
+    page = map_vtd_domain_page(pg_maddr);
+    pte = page + (bfn_x(bfn) & LEVEL_MASK);
+    val = *pte;
+
+    unmap_vtd_domain_page(page);
+    spin_unlock(&hd->arch.mapping_lock);
+
+    if ( !dma_pte_present(val) )
+        return -ENOENT;
+
+    *mfn = maddr_to_mfn(dma_pte_addr(val));
+    *flags = dma_pte_read(val) ? IOMMUF_readable : 0;
+    *flags |= dma_pte_write(val) ? IOMMUF_writable : 0;
+
+    return 0;
+}
+
 int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
                     int order, int present)
 {
@@ -2661,6 +2694,7 @@ const struct iommu_ops intel_iommu_ops = {
     .teardown = iommu_domain_teardown,
     .map_page = intel_iommu_map_page,
     .unmap_page = intel_iommu_unmap_page,
+    .lookup_page = intel_iommu_lookup_page,
     .free_page_table = iommu_free_page_table,
     .reassign_device = reassign_device_ownership,
     .get_device_group_id = intel_iommu_group_id,
diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 72c1a2e3cd..47bdfcb5ea 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -272,6 +272,9 @@ struct dma_pte {
 #define dma_set_pte_prot(p, prot) do { \
         (p).val = ((p).val & ~DMA_PTE_PROT) | ((prot) & DMA_PTE_PROT); \
     } while (0)
+#define dma_pte_prot(p) ((p).val & DMA_PTE_PROT)
+#define dma_pte_read(p) (dma_pte_prot(p) & DMA_PTE_READ)
+#define dma_pte_write(p) (dma_pte_prot(p) & DMA_PTE_WRITE)
 #define dma_pte_addr(p) ((p).val & PADDR_MASK & PAGE_MASK_4K)
 #define dma_set_pte_addr(p, addr) do {\
             (p).val |= ((addr) & PAGE_MASK_4K); } while (0)
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index cc0be81b4e..7c5d46df81 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -100,6 +100,8 @@ void iommu_teardown(struct domain *d);
 int __must_check iommu_map_page(struct domain *d, bfn_t bfn,
                                 mfn_t mfn, unsigned int flags);
 int __must_check iommu_unmap_page(struct domain *d, bfn_t bfn);
+int __must_check iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                                   unsigned int *flags);
 
 enum iommu_feature
 {
@@ -198,6 +200,8 @@ struct iommu_ops {
     int __must_check (*map_page)(struct domain *d, bfn_t bfn, mfn_t mfn,
                                  unsigned int flags);
     int __must_check (*unmap_page)(struct domain *d, bfn_t bfn);
+    int __must_check (*lookup_page)(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                                    unsigned int *flags);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
-- 
2.11.0



* [PATCH v4 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (8 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 09/15] vtd: add lookup_page method to iommu_ops Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt() Paul Durrant
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Jun Nakajima, George Dunlap,
	Andrew Cooper, Julien Grall, Paul Durrant, Jan Beulich

The name 'iommu_use_hap_pt' suggests that the P2M table is in use as the
domain's IOMMU pagetable which, prior to this patch, was not strictly true
since the macro did not test whether the domain actually has IOMMU
mappings.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>

v4:
 - New in v4.
---
 xen/arch/x86/mm/p2m-ept.c       | 6 +++---
 xen/arch/x86/mm/p2m-pt.c        | 6 +++---
 xen/arch/x86/mm/p2m.c           | 2 +-
 xen/drivers/passthrough/iommu.c | 2 +-
 xen/include/asm-arm/iommu.h     | 2 +-
 xen/include/asm-x86/iommu.h     | 5 +++--
 6 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 33e77903d6..e52dbb6203 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -863,12 +863,12 @@ out:
         ept_sync_domain(p2m);
 
     /* For host p2m, may need to change VT-d page table.*/
-    if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
+    if ( rc == 0 && p2m_is_hostp2m(p2m) &&
          need_modify_vtd_table )
     {
-        if ( iommu_hap_pt_share )
+        if ( iommu_use_hap_pt(d) )
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
-        else
+        else if ( need_iommu(d) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 9ff0b3fe7a..9255194070 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -677,8 +677,8 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
          && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) )
         p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1;
 
-    if ( iommu_enabled && need_iommu(p2m->domain) &&
-         (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) )
+    if ( iommu_enabled && (iommu_old_flags != iommu_pte_flags ||
+                           old_mfn != mfn_x(mfn)) )
     {
         ASSERT(rc == 0);
 
@@ -687,7 +687,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
             if ( iommu_old_flags )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
-        else
+        else if ( need_iommu(p2m->domain) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 036f52f4f1..7465489074 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2092,7 +2092,7 @@ static unsigned int mmio_order(const struct domain *d,
      * - exclude PV guests, should execution reach this code for such.
      * So be careful when altering this.
      */
-    if ( !need_iommu(d) || !iommu_use_hap_pt(d) ||
+    if ( !iommu_use_hap_pt(d) ||
          (start_fn & ((1UL << PAGE_ORDER_2M) - 1)) || !(nr >> PAGE_ORDER_2M) )
         return PAGE_ORDER_4K;
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 9b7baca93f..ad8ffcfcb2 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -453,7 +453,7 @@ int iommu_do_domctl(
 
 void iommu_share_p2m_table(struct domain* d)
 {
-    if ( iommu_enabled && iommu_use_hap_pt(d) )
+    if ( iommu_use_hap_pt(d) )
         iommu_get_ops()->share_p2m(d);
 }
 
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 57d9b1e14a..8d1506c6f7 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -21,7 +21,7 @@ struct arch_iommu
 };
 
 /* Always share P2M Table between the CPU and the IOMMU */
-#define iommu_use_hap_pt(d) (1)
+#define iommu_use_hap_pt(d) (need_iommu(d))
 
 const struct iommu_ops *iommu_get_ops(void);
 void __init iommu_set_ops(const struct iommu_ops *ops);
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 0dcead3b6c..9216c09162 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -78,8 +78,9 @@ static inline int iommu_hardware_setup(void)
     return -ENODEV;
 }
 
-/* Does this domain have a P2M table we can use as its IOMMU pagetable? */
-#define iommu_use_hap_pt(d) (hap_enabled(d) && iommu_hap_pt_share)
+/* Are we using the domain P2M table as its IOMMU pagetable? */
+#define iommu_use_hap_pt(d) \
+    (hap_enabled(d) && need_iommu(d) && iommu_hap_pt_share)
 
 void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg);
-- 
2.11.0



* [PATCH v4 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt()
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (9 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 12/15] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Paul Durrant, Tamas K Lengyel,
	Jan Beulich, Brian Woods, Suravee Suthikulpanit

The name 'need_iommu' is a little confusing: it suggests that a domain
needs to use the IOMMU but that something is not yet set up, when in fact
the flag does not become true until IOMMU mappings for the domain have
been created.

Two different meanings are also inferred from it in various places in the
code:

- Some callers want to test whether a domain has IOMMU mappings
- Some callers want to test whether they need to synchronize the domain's
  P2M and IOMMU mappings

This patch therefore creates two new boolean flags, 'has_iommu_pt' and
'sync_iommu_pt' to convey those meanings separately and creates the
corresponding macros. All callers of need_iommu() are then modified to
use the macro appropriate to what they are trying to test.

NOTE: The test of need_iommu(d) in the AMD IOMMU initialization code has
      been replaced with a test of iommu_dom0_strict since this more
      accurately reflects the meaning of the test and brings it into
      line with a similar test in the Intel VT-d code.
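
To illustrate the distinction between the two new tests (hypothetical call
sites, not hunks from this patch):

    /* "Does this domain have IOMMU mappings at all?" */
    if ( has_iommu_pt(d) )
        /* e.g. refuse to enable global log-dirty mode or mem paging */;

    /* "Must this P2M update also be reflected in the IOMMU page tables?" */
    if ( sync_iommu_pt(d) )
        rc = iommu_map_page(d, bfn, mfn, flags);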

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Brian Woods <brian.woods@amd.com>

v4:
 - New in v4.
---
 xen/arch/arm/p2m.c                          |  2 +-
 xen/arch/x86/hvm/mtrr.c                     |  5 +++--
 xen/arch/x86/mm.c                           |  2 +-
 xen/arch/x86/mm/mem_sharing.c               |  2 +-
 xen/arch/x86/mm/p2m-ept.c                   |  2 +-
 xen/arch/x86/mm/p2m-pt.c                    |  2 +-
 xen/arch/x86/mm/p2m.c                       |  8 ++++----
 xen/arch/x86/mm/paging.c                    |  2 +-
 xen/arch/x86/x86_64/mm.c                    |  3 ++-
 xen/common/memory.c                         |  6 +++---
 xen/common/vm_event.c                       |  2 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/device_tree.c       |  2 +-
 xen/drivers/passthrough/iommu.c             | 17 ++++++++++-------
 xen/drivers/passthrough/pci.c               |  6 +++---
 xen/include/asm-arm/grant_table.h           |  2 +-
 xen/include/asm-arm/iommu.h                 |  2 +-
 xen/include/asm-x86/grant_table.h           |  2 +-
 xen/include/asm-x86/iommu.h                 |  2 +-
 xen/include/xen/sched.h                     | 12 ++++++++----
 20 files changed, 46 insertions(+), 37 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index eb39861b73..c972b72baf 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -951,7 +951,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
     if ( lpae_valid(orig_pte) && entry->p2m.base != orig_pte.p2m.base )
         p2m_free_entry(p2m, orig_pte, level);
 
-    if ( need_iommu(p2m->domain) &&
+    if ( sync_iommu_pt(p2m->domain) &&
          (lpae_valid(orig_pte) || lpae_valid(*entry)) )
     {
         rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index c502dda693..61131d4b61 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -824,7 +824,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr,
 
 void memory_type_changed(struct domain *d)
 {
-    if ( need_iommu(d) && d->vcpu && d->vcpu[0] )
+    if ( (sync_iommu_pt(d) || iommu_use_hap_pt(d)) &&
+         d->vcpu && d->vcpu[0] )
     {
         p2m_memory_type_changed(d);
         flush_all(FLUSH_CACHE);
@@ -872,7 +873,7 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn,
         return MTRR_TYPE_UNCACHABLE;
     }
 
-    if ( !need_iommu(d) && !cache_flush_permitted(d) )
+    if ( !has_iommu_pt(d) && !cache_flush_permitted(d) )
     {
         *ipat = 1;
         return MTRR_TYPE_WRBACK;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 08878574f3..462398096f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2727,7 +2727,7 @@ static int _get_page_type(struct page_info *page, unsigned long type,
     {
         /* Special pages should not be accessible from devices. */
         struct domain *d = page_get_owner(page);
-        if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
+        if ( d && is_pv_domain(d) && unlikely(sync_iommu_pt(d)) )
         {
             mfn_t mfn = page_to_mfn(page);
 
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index fad8a9df13..c54845275f 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1610,7 +1610,7 @@ int mem_sharing_domctl(struct domain *d, struct xen_domctl_mem_sharing_op *mec)
         case XEN_DOMCTL_MEM_SHARING_CONTROL:
         {
             rc = 0;
-            if ( unlikely(need_iommu(d) && mec->u.enable) )
+            if ( unlikely(has_iommu_pt(d) && mec->u.enable) )
                 rc = -EXDEV;
             else
                 d->arch.hvm_domain.mem_sharing_enabled = mec->u.enable;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index e52dbb6203..d90424a9a3 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -868,7 +868,7 @@ out:
     {
         if ( iommu_use_hap_pt(d) )
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
-        else if ( need_iommu(d) )
+        else if ( sync_iommu_pt(d) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 9255194070..e0ae6bf20f 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -687,7 +687,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
             if ( iommu_old_flags )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
-        else if ( need_iommu(p2m->domain) )
+        else if ( sync_iommu_pt(p2m->domain) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 7465489074..4ffce37c45 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -713,7 +713,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
     {
         int rc = 0;
 
-        if ( need_iommu(p2m->domain) )
+        if ( sync_iommu_pt(p2m->domain) )
         {
             bfn_t bfn = _bfn(mfn);
 
@@ -777,7 +777,7 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
 
     if ( !paging_mode_translate(d) )
     {
-        if ( need_iommu(d) && t == p2m_ram_rw )
+        if ( sync_iommu_pt(d) && t == p2m_ram_rw )
         {
             bfn_t bfn = _bfn(mfn_x(mfn));
 
@@ -1165,7 +1165,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
 
     if ( !paging_mode_translate(d) )
     {
-        if ( !need_iommu(d) )
+        if ( !sync_iommu_pt(d) )
             return 0;
 
         ret = iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
@@ -1261,7 +1261,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
 
     if ( !paging_mode_translate(d) )
     {
-        if ( !need_iommu(d) )
+        if ( !sync_iommu_pt(d) )
             return 0;
 
         ret = iommu_unmap_page(d, _bfn(gfn_l));
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 2b0445ffe9..d54f6dd496 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -213,7 +213,7 @@ int paging_log_dirty_enable(struct domain *d, bool_t log_global)
 {
     int ret;
 
-    if ( need_iommu(d) && log_global )
+    if ( has_iommu_pt(d) && log_global )
     {
         /*
          * Refuse to turn on global log-dirty mode
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index cc58e4cef4..29c3050a7f 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1426,7 +1426,8 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     if ( ret )
         goto destroy_m2p;
 
-    if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
+    if ( iommu_enabled && !iommu_passthrough &&
+         !sync_iommu_pt(hardware_domain) )
     {
         for ( i = spfn; i < epfn; i++ )
             if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 0a34677cc3..ad7aa09a5c 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -804,8 +804,8 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
     xatp->size -= start;
 
 #ifdef CONFIG_HAS_PASSTHROUGH
-    if ( need_iommu(d) )
-        this_cpu(iommu_dont_flush_iotlb) = 1;
+    if ( sync_iommu_pt(d) || iommu_use_hap_pt(d) )
+       this_cpu(iommu_dont_flush_iotlb) = 1;
 #endif
 
     while ( xatp->size > done )
@@ -827,7 +827,7 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
     }
 
 #ifdef CONFIG_HAS_PASSTHROUGH
-    if ( need_iommu(d) )
+    if ( sync_iommu_pt(d) || iommu_use_hap_pt(d) )
     {
         int ret;
 
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 144ab81c86..ec3dfd1dae 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -644,7 +644,7 @@ int vm_event_domctl(struct domain *d, struct xen_domctl_vm_event_op *vec,
 
             /* No paging if iommu is used */
             rc = -EMLINK;
-            if ( unlikely(need_iommu(d)) )
+            if ( unlikely(has_iommu_pt(d)) )
                 break;
 
             rc = -EXDEV;
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index eea22c3d0d..752751b103 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -256,7 +256,7 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
     if ( allocate_domain_resources(dom_iommu(d)) )
         BUG();
 
-    if ( !iommu_passthrough && !need_iommu(d) )
+    if ( !iommu_passthrough && !iommu_dom0_strict )
     {
         int rc = 0;
 
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 671c8db1d0..1aae1a1f93 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -44,7 +44,7 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
      * The hwdom is forced to use IOMMU for protecting assigned
      * device. Therefore the IOMMU data is already set up.
      */
-    ASSERT(!is_hardware_domain(d) || need_iommu(d));
+    ASSERT(!is_hardware_domain(d) || has_iommu_pt(d));
     rc = iommu_construct(d);
     if ( rc )
         goto fail;
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index ad8ffcfcb2..caf3d125ae 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -179,8 +179,9 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
         return;
 
     register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
-    d->need_iommu = !!iommu_dom0_strict;
-    if ( need_iommu(d) && !iommu_use_hap_pt(d) )
+    d->has_iommu_pt = true;
+    d->sync_iommu_pt = !!iommu_dom0_strict && !iommu_use_hap_pt(d);
+    if ( sync_iommu_pt(d) )
     {
         struct page_info *page;
         unsigned int i = 0;
@@ -219,14 +220,14 @@ void iommu_teardown(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
 
-    d->need_iommu = false;
+    d->has_iommu_pt = false;
     hd->platform_ops->teardown(d);
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
 int iommu_construct(struct domain *d)
 {
-    if ( need_iommu(d) )
+    if ( has_iommu_pt(d) )
         return 0;
 
     if ( !iommu_use_hap_pt(d) )
@@ -236,12 +237,14 @@ int iommu_construct(struct domain *d)
         rc = arch_iommu_populate_page_table(d);
         if ( rc )
             return rc;
+
+        d->sync_iommu_pt = true;
     }
 
-    d->need_iommu = true;
+    d->has_iommu_pt = true;
     /*
      * There may be dirty cache lines when a device is assigned
-     * and before need_iommu(d) becoming true, this will cause
+     * and before has_iommu_pt(d) becoming true, this will cause
      * memory_type_changed lose effect if memory type changes.
      * Call memory_type_changed here to amend this.
      */
@@ -500,7 +503,7 @@ static void iommu_dump_p2m_table(unsigned char key)
     ops = iommu_get_ops();
     for_each_domain(d)
     {
-        if ( is_hardware_domain(d) || !need_iommu(d) )
+        if ( is_hardware_domain(d) || !has_iommu_pt(d) )
             continue;
 
         if ( iommu_use_hap_pt(d) )
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index c4890a4295..3d3ad484e7 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1416,7 +1416,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
 
     /* Prevent device assign if mem paging or mem sharing have been 
      * enabled for this domain */
-    if ( unlikely(!need_iommu(d) &&
+    if ( unlikely(!has_iommu_pt(d) &&
             (d->arch.hvm_domain.mem_sharing_enabled ||
              vm_event_check_ring(d->vm_event_paging) ||
              p2m_get_hostp2m(d)->global_logdirty)) )
@@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
     }
 
  done:
-    if ( !has_arch_pdevs(d) && need_iommu(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
         iommu_teardown(d);
     pcidevs_unlock();
 
@@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 
     pdev->fault.count = 0;
 
-    if ( !has_arch_pdevs(d) && need_iommu(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
         iommu_teardown(d);
 
     return ret;
diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
index 9c2c815526..f01c5e36c9 100644
--- a/xen/include/asm-arm/grant_table.h
+++ b/xen/include/asm-arm/grant_table.h
@@ -86,7 +86,7 @@ static inline unsigned int gnttab_dom0_max(void)
     gfn_x(((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
 
 #define gnttab_need_iommu_mapping(d)                    \
-    (is_domain_direct_mapped(d) && need_iommu(d))
+    (is_domain_direct_mapped(d) && sync_iommu_pt(d))
 
 #endif /* __ASM_GRANT_TABLE_H__ */
 /*
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 8d1506c6f7..f6df32f860 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -21,7 +21,7 @@ struct arch_iommu
 };
 
 /* Always share P2M Table between the CPU and the IOMMU */
-#define iommu_use_hap_pt(d) (need_iommu(d))
+#define iommu_use_hap_pt(d) (has_iommu_pt(d))
 
 const struct iommu_ops *iommu_get_ops(void);
 void __init iommu_set_ops(const struct iommu_ops *ops);
diff --git a/xen/include/asm-x86/grant_table.h b/xen/include/asm-x86/grant_table.h
index 76ec5dda2c..80e64b74b7 100644
--- a/xen/include/asm-x86/grant_table.h
+++ b/xen/include/asm-x86/grant_table.h
@@ -99,6 +99,6 @@ static inline void gnttab_clear_flag(unsigned int nr, uint16_t *st)
 #define gnttab_release_host_mappings(domain) ( paging_mode_external(domain) )
 
 #define gnttab_need_iommu_mapping(d)                \
-    (!paging_mode_translate(d) && need_iommu(d))
+    (!paging_mode_translate(d) && sync_iommu_pt(d))
 
 #endif /* __ASM_GRANT_TABLE_H__ */
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 9216c09162..963829a5a2 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -80,7 +80,7 @@ static inline int iommu_hardware_setup(void)
 
 /* Are we using the domain P2M table as its IOMMU pagetable? */
 #define iommu_use_hap_pt(d) \
-    (hap_enabled(d) && need_iommu(d) && iommu_hap_pt_share)
+    (hap_enabled(d) && has_iommu_pt(d) && iommu_hap_pt_share)
 
 void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 5b9820e4e1..66da6d6b23 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -372,8 +372,10 @@ struct domain
 #ifdef CONFIG_HAS_PASSTHROUGH
     struct domain_iommu iommu;
 
-    /* Does this guest need iommu mappings? */
-    bool             need_iommu;
+    /* Does this guest have iommu mappings? */
+    bool             has_iommu_pt;
+    /* Does this guest need synchronization of iommu mappings? */
+    bool             sync_iommu_pt;
 #endif
     /* is node-affinity automatically computed? */
     bool             auto_node_affinity;
@@ -894,9 +896,11 @@ void watchdog_domain_destroy(struct domain *d);
 #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
                            cpumask_weight((v)->cpu_hard_affinity) == 1)
 #ifdef CONFIG_HAS_PASSTHROUGH
-#define need_iommu(d)    ((d)->need_iommu)
+#define has_iommu_pt(d) ((d)->has_iommu_pt)
+#define sync_iommu_pt(d) ((d)->sync_iommu_pt)
 #else
-#define need_iommu(d)    (0)
+#define has_iommu_pt(d) (0)
+#define sync_iommu_pt(d) (0)
 #endif
 
 static inline bool is_vcpu_online(const struct vcpu *v)
-- 
2.11.0



* [PATCH v4 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (10 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt() Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 13/15] memory: add get_paged_gfn() as a wrapper Paul Durrant
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch adds an iommu_op which checks whether it is possible or
safe for a domain to modify its own IOMMU mappings and, if so, creates
a rangeset to track modifications.

NOTE: The actual map and unmap operations are introduced by subsequent
      patches.
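
A guest that wants to manage its own mappings therefore issues this op
once, up front (sketch; hypercall plumbing as in patch #6):

    struct xen_iommu_op op = {
        .op = XEN_IOMMUOP_enable_modification,
    };

    /* ... wrap 'op' in a xen_iommu_op_buf_t and issue HYPERVISOR_iommu_op() ... */

    if ( op.status == 0 )
        /* the map/unmap ops added by later patches may now be used */;
    else if ( op.status == -EACCES )
        /* mappings are shared with the P2M or translation is disabled */;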

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v4:
 - Set sync_iommu_pt to false instead of need_iommu.

v2:
 - New in v2.
---
 xen/arch/x86/iommu_op.c         | 42 +++++++++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/iommu.c |  2 +-
 xen/drivers/passthrough/pci.c   |  4 ++--
 xen/include/public/iommu_op.h   |  6 ++++++
 xen/include/xen/iommu.h         |  3 +++
 5 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
index bcfcd49102..b29547bffd 100644
--- a/xen/arch/x86/iommu_op.c
+++ b/xen/arch/x86/iommu_op.c
@@ -78,6 +78,42 @@ static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
     return 0;
 }
 
+static int iommu_op_enable_modification(void)
+{
+    struct domain *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    const struct iommu_ops *ops = iommu->platform_ops;
+
+    /* Has modification already been enabled? */
+    if ( iommu->iommu_op_ranges )
+        return 0;
+
+    /*
+     * The IOMMU mappings cannot be modified if:
+     * - the IOMMU is not enabled or,
+     * - the current domain is dom0 and translation is disabled or,
+     * - HAP is enabled and the IOMMU shares the mappings.
+     */
+    if ( !iommu_enabled ||
+         (is_hardware_domain(currd) && iommu_passthrough) ||
+         iommu_use_hap_pt(currd) )
+        return -EACCES;
+
+    /*
+     * The IOMMU implementation must provide the lookup method if
+     * modification of the mappings is to be supported.
+     */
+    if ( !ops->lookup_page )
+        return -EOPNOTSUPP;
+
+    iommu->iommu_op_ranges = rangeset_new(currd, NULL, 0);
+    if ( !iommu->iommu_op_ranges )
+        return -ENOMEM;
+
+    currd->sync_iommu_pt = false; /* Disable synchronization */
+    return 0;
+}
+
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
@@ -86,6 +122,10 @@ static void iommu_op(xen_iommu_op_t *op)
         op->status = iommu_op_query_reserved(&op->u.query_reserved);
         break;
 
+    case XEN_IOMMUOP_enable_modification:
+        op->status = iommu_op_enable_modification();
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -98,6 +138,7 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     size_t offset;
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct xen_iommu_op_query_reserved),
+        [XEN_IOMMUOP_enable_modification] = 0,
     };
     size_t size;
     int rc;
@@ -184,6 +225,7 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     size_t offset;
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
+        [XEN_IOMMUOP_enable_modification] = 0,
     };
     size_t size;
     xen_iommu_op_t nat;
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index caf3d125ae..8f635a5cdb 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -26,7 +26,6 @@ static void iommu_dump_p2m_table(unsigned char key);
 
 unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
 integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
-
 /*
  * The 'iommu' parameter enables the IOMMU.  Optional comma separated
  * value may contain:
@@ -265,6 +264,7 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 
     rangeset_destroy(hd->reserved_ranges);
+    rangeset_destroy(hd->iommu_op_ranges);
 }
 
 int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 3d3ad484e7..d4033af41a 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
     }
 
  done:
-    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd->iommu_op_ranges )
         iommu_teardown(d);
     pcidevs_unlock();
 
@@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 
     pdev->fault.count = 0;
 
-    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd->iommu_op_ranges )
         iommu_teardown(d);
 
     return ret;
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index ade404a877..9bf74bd007 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -61,6 +61,12 @@ struct xen_iommu_op_query_reserved {
     XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
 };
 
+/*
+ * XEN_IOMMUOP_enable_modification: Enable operations that modify IOMMU
+ *                                  mappings.
+ */
+#define XEN_IOMMUOP_enable_modification 2
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 7c5d46df81..08b163cbcb 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -130,6 +130,9 @@ struct domain_iommu {
      * must not be modified after initialization.
      */
     struct rangeset *reserved_ranges;
+
+    /* Ranges under the control of iommu_op */
+    struct rangeset *iommu_op_ranges;
 };
 
 #define dom_iommu(d)              (&(d)->iommu)
-- 
2.11.0



* [PATCH v4 13/15] memory: add get_paged_gfn() as a wrapper...
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (11 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 12/15] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 14/15] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 15/15] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

...for some uses of get_page_from_gfn().

There are many occurrences of the following pattern in the code:

    q = <readonly look-up> ? P2M_ALLOC : P2M_UNSHARE;
    page = get_page_from_gfn(d, gfn, &p2mt, q);

    if ( p2m_is_paging(p2mt) )
    {
        if ( page )
            put_page(page);

        p2m_mem_paging_populate(d, gfn);
        return <-EAGAIN or equivalent>;
    }

    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
    {
        if ( page )
            put_page(page);

        return <-EAGAIN or equivalent>;
    }

    if ( !page )
        return <-EINVAL or equivalent>;

    if ( !p2m_is_ram(p2mt) ||
         (!<readonly look-up> && p2m_is_readonly(p2mt)) )
    {
        put_page(page);
        return <-EINVAL or equivalent>;
    }

There are some small differences in exactly how the occurrences are coded,
but the desired semantics are the same.

This patch introduces a new common implementation of this code in
get_paged_gfn() and then converts the various open-coded patterns into
calls to this new function.
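
For reference (this snippet is illustrative, not part of the patch), a
caller of the new helper - with d, gfn and readonly being whatever the
call site already has to hand - then reduces to:

    struct page_info *page;
    p2m_type_t p2mt;
    int rc;

    rc = get_paged_gfn(d, gfn, readonly, &p2mt, &page);
    if ( rc )
        return rc; /* -EAGAIN for paging/sharing, -EINVAL otherwise */

    /* ... use page; a reference is held ... */

    put_page(page);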

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v3:
 - Addressed comments from George.

v2:
 - New in v2.
---
 xen/arch/x86/hvm/emulate.c | 32 ++++++--------------------
 xen/arch/x86/hvm/hvm.c     | 16 ++-----------
 xen/common/grant_table.c   | 38 +++++++++----------------------
 xen/common/memory.c        | 56 +++++++++++++++++++++++++++++++++++++---------
 xen/include/asm-arm/p2m.h  |  3 +++
 xen/include/asm-x86/p2m.h  |  2 ++
 6 files changed, 71 insertions(+), 76 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8385c62145..c26281ea1c 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -332,34 +332,16 @@ static int hvmemul_do_io_buffer(
 
 static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
 {
-    struct domain *curr_d = current->domain;
-    p2m_type_t p2mt;
-
-    *page = get_page_from_gfn(curr_d, gmfn, &p2mt, P2M_UNSHARE);
-
-    if ( *page == NULL )
-        return X86EMUL_UNHANDLEABLE;
-
-    if ( p2m_is_paging(p2mt) )
-    {
-        put_page(*page);
-        p2m_mem_paging_populate(curr_d, gmfn);
-        return X86EMUL_RETRY;
-    }
-
-    if ( p2m_is_shared(p2mt) )
+    switch ( get_paged_gfn(current->domain, _gfn(gmfn), false, NULL, page) )
     {
-        put_page(*page);
+    case -EAGAIN:
         return X86EMUL_RETRY;
-    }
-
-    /* This code should not be reached if the gmfn is not RAM */
-    if ( p2m_is_mmio(p2mt) )
-    {
-        domain_crash(curr_d);
-
-        put_page(*page);
+    case -EINVAL:
         return X86EMUL_UNHANDLEABLE;
+    default:
+        ASSERT_UNREACHABLE();
+    case 0:
+        break;
     }
 
     return X86EMUL_OKAY;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 72c51faecb..03430e6f07 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2557,24 +2557,12 @@ static void *_hvm_map_guest_frame(unsigned long gfn, bool_t permanent,
                                   bool_t *writable)
 {
     void *map;
-    p2m_type_t p2mt;
     struct page_info *page;
     struct domain *d = current->domain;
+    p2m_type_t p2mt;
 
-    page = get_page_from_gfn(d, gfn, &p2mt,
-                             writable ? P2M_UNSHARE : P2M_ALLOC);
-    if ( (p2m_is_shared(p2mt) && writable) || !page )
-    {
-        if ( page )
-            put_page(page);
-        return NULL;
-    }
-    if ( p2m_is_paging(p2mt) )
-    {
-        put_page(page);
-        p2m_mem_paging_populate(d, gfn);
+    if ( get_paged_gfn(d, _gfn(gfn), !writable, &p2mt, &page) )
         return NULL;
-    }
 
     if ( writable )
     {
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 1840b656c9..cc7080bc98 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -366,39 +366,23 @@ static int get_paged_frame(unsigned long gfn, mfn_t *mfn,
                            struct page_info **page, bool readonly,
                            struct domain *rd)
 {
-    int rc = GNTST_okay;
-    p2m_type_t p2mt;
-
-    *mfn = INVALID_MFN;
-    *page = get_page_from_gfn(rd, gfn, &p2mt,
-                              readonly ? P2M_ALLOC : P2M_UNSHARE);
-    if ( !*page )
-    {
-#ifdef P2M_SHARED_TYPES
-        if ( p2m_is_shared(p2mt) )
-            return GNTST_eagain;
-#endif
-#ifdef P2M_PAGES_TYPES
-        if ( p2m_is_paging(p2mt) )
-        {
-            p2m_mem_paging_populate(rd, gfn);
-            return GNTST_eagain;
-        }
-#endif
-        return GNTST_bad_page;
-    }
+    int rc;
 
-    if ( p2m_is_foreign(p2mt) )
+    rc = get_paged_gfn(rd, _gfn(gfn), readonly, NULL, page);
+    switch ( rc )
     {
-        put_page(*page);
-        *page = NULL;
-
+    case -EAGAIN:
+        return GNTST_eagain;
+    case -EINVAL:
         return GNTST_bad_page;
+    default:
+        ASSERT_UNREACHABLE();
+    case 0:
+        break;
     }
 
     *mfn = page_to_mfn(*page);
-
-    return rc;
+    return GNTST_okay;
 }
 
 static inline void
diff --git a/xen/common/memory.c b/xen/common/memory.c
index ad7aa09a5c..4c34e6c2d9 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1571,37 +1571,73 @@ void destroy_ring_for_helper(
     }
 }
 
-int prepare_ring_for_helper(
-    struct domain *d, unsigned long gmfn, struct page_info **_page,
-    void **_va)
+/*
+ * Acquire a pointer to struct page_info for a specified domain and GFN,
+ * checking whether the page has been paged out, or needs unsharing.
+ * If the function succeeds then zero is returned and page_p is written
+ * with a pointer to the struct page_info with a reference taken. The
+ * caller is responsible for dropping the reference. If p2mt_p is non-NULL
+ * then it is also written with the P2M type of the page.
+ * If the function fails then an appropriate errno is returned and the
+ * values referenced by page_p and p2mt_p are undefined.
+ */
+int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
+                  p2m_type_t *p2mt_p, struct page_info **page_p)
 {
-    struct page_info *page;
+    p2m_query_t q = readonly ? P2M_ALLOC : P2M_UNSHARE;
     p2m_type_t p2mt;
-    void *va;
+    struct page_info *page;
 
-    page = get_page_from_gfn(d, gmfn, &p2mt, P2M_UNSHARE);
+    page = get_page_from_gfn(d, gfn_x(gfn), &p2mt, q);
 
 #ifdef CONFIG_HAS_MEM_PAGING
     if ( p2m_is_paging(p2mt) )
     {
         if ( page )
             put_page(page);
-        p2m_mem_paging_populate(d, gmfn);
-        return -ENOENT;
+
+        p2m_mem_paging_populate(d, gfn_x(gfn));
+        return -EAGAIN;
     }
 #endif
 #ifdef CONFIG_HAS_MEM_SHARING
-    if ( p2m_is_shared(p2mt) )
+    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
     {
         if ( page )
             put_page(page);
-        return -ENOENT;
+
+        return -EAGAIN;
     }
 #endif
 
     if ( !page )
         return -EINVAL;
 
+    if ( !p2m_is_ram(p2mt) || (!readonly && p2m_is_readonly(p2mt)) )
+    {
+        put_page(page);
+        return -EINVAL;
+    }
+
+    if ( p2mt_p )
+        *p2mt_p = p2mt;
+
+    *page_p = page;
+    return 0;
+}
+
+int prepare_ring_for_helper(
+    struct domain *d, unsigned long gmfn, struct page_info **_page,
+    void **_va)
+{
+    struct page_info *page;
+    void *va;
+    int rc;
+
+    rc = get_paged_gfn(d, _gfn(gmfn), false, NULL, &page);
+    if ( rc )
+        return (rc == -EAGAIN) ? -ENOENT : rc;
+
     if ( !get_page_type(page, PGT_writable_page) )
     {
         put_page(page);
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 8823707c17..a39a4faabd 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -303,6 +303,9 @@ static inline struct page_info *get_page_from_gfn(
     return page;
 }
 
+int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
+                  p2m_type_t *p2mt_p, struct page_info **page_p);
+
 int get_page_type(struct page_info *page, unsigned long type);
 bool is_iomem_page(mfn_t mfn);
 static inline int get_page_and_type(struct page_info *page,
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index d4b3cfcb6e..e890bcd3e1 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -492,6 +492,8 @@ static inline struct page_info *get_page_from_gfn(
     return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL;
 }
 
+int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
+                  p2m_type_t *p2mt_p, struct page_info **page_p);
 
 /* General conversion function from mfn to gfn */
 static inline unsigned long mfn_to_gfn(struct domain *d, mfn_t mfn)
-- 
2.11.0



* [PATCH v4 14/15] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (12 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 13/15] memory: add get_paged_gfn() as a wrapper Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  2018-08-01 13:40 ` [PATCH v4 15/15] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch adds iommu_ops to add (map) or remove (unmap) frames in the
domain's IOMMU mappings, and an iommu_op to synchronize (flush) those
manipulations with the hardware.

Mappings added by the map operation are tracked and only those mappings
may be removed by a subsequent unmap operation. Frames are specified by the
owning domain and GFN. It is, of course, permissible for a domain to map
its own frames using DOMID_SELF.

NOTE: The owning domain and GFN must also be specified in the unmap
      operation, as well as the BFN, so that they can be cross-checked
      with the existing mapping.
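
By way of illustration (not part of the patch), a domain mapping one of
its own writable frames would fill in the op along the following lines;
the BFN and GFN values are arbitrary examples and the buffer wrapping
and hypercall invocation are as introduced earlier in the series:

    xen_iommu_op_t op = {
        .op = XEN_IOMMUOP_map,
        .u.map = {
            .domid = DOMID_SELF,
            .flags = 0,          /* or XEN_IOMMUOP_map_readonly */
            .bfn = 0x100000,     /* IOMMU (bus) frame to populate */
            .gfn = 0x80000,      /* guest frame to be mapped */
        },
    };

    /* issue the hypercall, check op.status, then XEN_IOMMUOP_flush
     * before relying on the new mapping */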

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v4:
 - Fixed logic inversion when checking return of iommu_unmap_page().

v3:
 - Add type pinning.

v2:
 - Heavily re-worked in v2, including explicit tracking of mappings.
   This avoids the need to clear non-reserved mappings from IOMMU
   at start of day, which would be prohibitively slow on a large host.
---
 xen/arch/x86/iommu_op.c       | 151 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/iommu_op.h |  43 ++++++++++++
 xen/include/xlat.lst          |   2 +
 3 files changed, 196 insertions(+)

diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
index b29547bffd..35daeeed92 100644
--- a/xen/arch/x86/iommu_op.c
+++ b/xen/arch/x86/iommu_op.c
@@ -114,6 +114,131 @@ static int iommu_op_enable_modification(void)
     return 0;
 }
 
+static int iommuop_map(struct xen_iommu_op_map *op)
+{
+    struct domain *d, *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
+    bfn_t bfn = _bfn(op->bfn);
+    struct page_info *page;
+    unsigned int prot;
+    int rc, ignore;
+
+    if ( op->pad || (op->flags & ~XEN_IOMMUOP_map_readonly) )
+        return -EINVAL;
+
+    if ( !iommu->iommu_op_ranges )
+        return -EOPNOTSUPP;
+
+    /* Check whether the specified BFN falls in a reserved region */
+    if ( rangeset_contains_singleton(iommu->reserved_ranges, bfn_x(bfn)) )
+        return -EINVAL;
+
+    d = rcu_lock_domain_by_any_id(op->domid);
+    if ( !d )
+        return -ESRCH;
+
+    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
+    if ( rc )
+        goto unlock;
+
+    rc = -EINVAL;
+    if ( !readonly && !get_page_type(page, PGT_writable_page) )
+    {
+        put_page(page);
+        goto unlock;
+    }
+
+    prot = IOMMUF_readable;
+    if ( !readonly )
+        prot |= IOMMUF_writable;
+
+    rc = -EIO;
+    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
+        goto release;
+
+    rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
+    if ( rc )
+        goto unmap;
+
+    rc = 0;
+    goto unlock; /* retain mapping and references */
+
+ unmap:
+    ignore = iommu_unmap_page(currd, bfn);
+
+ release:
+    if ( !readonly )
+        put_page_type(page);
+    put_page(page);
+
+ unlock:
+    rcu_unlock_domain(d);
+    return rc;
+}
+
+static int iommuop_unmap(struct xen_iommu_op_unmap *op)
+{
+    struct domain *d, *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    bfn_t bfn = _bfn(op->bfn);
+    mfn_t mfn;
+    bool readonly;
+    unsigned int prot;
+    struct page_info *page;
+    int rc;
+
+    if ( op->pad0 || op->pad1 )
+        return -EINVAL;
+
+    if ( !iommu->iommu_op_ranges )
+        return -EOPNOTSUPP;
+
+    if ( !rangeset_contains_singleton(iommu->iommu_op_ranges, bfn_x(bfn)) ||
+         iommu_lookup_page(currd, bfn, &mfn, &prot) ||
+         !mfn_valid(mfn) )
+        return -ENOENT;
+
+    readonly = !(prot & IOMMUF_writable);
+
+    d = rcu_lock_domain_by_any_id(op->domid);
+    if ( !d )
+        return -ESRCH;
+
+    rc = get_paged_gfn(d, _gfn(op->gfn), !(prot & IOMMUF_writable), NULL,
+                       &page);
+    if ( rc )
+        goto unlock;
+
+    put_page(page); /* release extra reference just taken */
+
+    rc = -EINVAL;
+    if ( !mfn_eq(page_to_mfn(page), mfn) )
+        goto unlock;
+
+    /* release reference taken in map */
+    if ( !readonly )
+        put_page_type(page);
+    put_page(page);
+
+    rc = rangeset_remove_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
+    if ( rc )
+        goto unlock;
+
+    if ( iommu_unmap_page(currd, bfn) )
+        rc = -EIO;
+
+ unlock:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
+static int iommuop_flush(void)
+{
+    return !iommu_iotlb_flush_all(current->domain) ? 0 : -EIO;
+}
+
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
@@ -126,6 +251,22 @@ static void iommu_op(xen_iommu_op_t *op)
         op->status = iommu_op_enable_modification();
         break;
 
+    case XEN_IOMMUOP_map:
+        this_cpu(iommu_dont_flush_iotlb) = 1;
+        op->status = iommuop_map(&op->u.map);
+        this_cpu(iommu_dont_flush_iotlb) = 0;
+        break;
+
+    case XEN_IOMMUOP_unmap:
+        this_cpu(iommu_dont_flush_iotlb) = 1;
+        op->status = iommuop_unmap(&op->u.unmap);
+        this_cpu(iommu_dont_flush_iotlb) = 0;
+        break;
+
+    case XEN_IOMMUOP_flush:
+        op->status = iommuop_flush();
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -139,6 +280,9 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct xen_iommu_op_query_reserved),
         [XEN_IOMMUOP_enable_modification] = 0,
+        [XEN_IOMMUOP_map] = sizeof(struct xen_iommu_op_map),
+        [XEN_IOMMUOP_unmap] = sizeof(struct xen_iommu_op_unmap),
+        [XEN_IOMMUOP_flush] = 0,
     };
     size_t size;
     int rc;
@@ -226,6 +370,9 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
         [XEN_IOMMUOP_enable_modification] = 0,
+        [XEN_IOMMUOP_map] = sizeof(struct compat_iommu_op_map),
+        [XEN_IOMMUOP_unmap] = sizeof(struct compat_iommu_op_unmap),
+        [XEN_IOMMUOP_flush] = 0,
     };
     size_t size;
     xen_iommu_op_t nat;
@@ -263,6 +410,8 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
      * we need to fix things up here.
      */
 #define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
+#define XLAT_iommu_op_u_map XEN_IOMMUOP_map
+#define XLAT_iommu_op_u_unmap XEN_IOMMUOP_unmap
     u = cmp.op;
 
 #define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
@@ -322,6 +471,8 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
                                 &cmp, status) )
         return -EFAULT;
 
+#undef XLAT_iommu_op_u_unmap
+#undef XLAT_iommu_op_u_map
 #undef XLAT_iommu_op_u_query_reserved
 
     return 0;
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index 9bf74bd007..c8dc531c83 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -67,6 +67,47 @@ struct xen_iommu_op_query_reserved {
  */
 #define XEN_IOMMUOP_enable_modification 2
 
+/*
+ * XEN_IOMMUOP_map: Map a guest page in the IOMMU.
+ */
+#define XEN_IOMMUOP_map 3
+
+struct xen_iommu_op_map {
+    /* IN - The domid of the guest */
+    domid_t domid;
+    uint16_t flags;
+
+#define _XEN_IOMMUOP_map_readonly 0
+#define XEN_IOMMUOP_map_readonly (1 << (_XEN_IOMMUOP_map_readonly))
+
+    uint32_t pad;
+    /* IN - The IOMMU frame number which will hold the new mapping */
+    xen_bfn_t bfn;
+    /* IN - The guest frame number of the page to be mapped */
+    xen_pfn_t gfn;
+};
+
+/*
+ * XEN_IOMMUOP_unmap: Remove a mapping in the IOMMU.
+ */
+#define XEN_IOMMUOP_unmap 4
+
+struct xen_iommu_op_unmap {
+    /* IN - The domid of the guest */
+    domid_t domid;
+    uint16_t pad0;
+    uint32_t pad1;
+    /* IN - The IOMMU frame number which holds the mapping to be removed */
+    xen_bfn_t bfn;
+    /* IN - The guest frame number of the page that is mapped */
+    xen_pfn_t gfn;
+};
+
+/*
+ * XEN_IOMMUOP_flush: Flush the IOMMU TLB.
+ */
+#define XEN_IOMMUOP_flush 5
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
@@ -74,6 +115,8 @@ struct xen_iommu_op {
                     /* 0 for success otherwise, negative errno */
     union {
         struct xen_iommu_op_query_reserved query_reserved;
+        struct xen_iommu_op_map map;
+        struct xen_iommu_op_unmap unmap;
     } u;
 };
 typedef struct xen_iommu_op xen_iommu_op_t;
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index d2f9b1034b..3ad7eadb5a 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -79,7 +79,9 @@
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
 !	iommu_op			iommu_op.h
 !	iommu_op_buf			iommu_op.h
+!	iommu_op_map			iommu_op.h
 !	iommu_op_query_reserved		iommu_op.h
+!	iommu_op_unmap			iommu_op.h
 ?	iommu_reserved_range		iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
-- 
2.11.0



* [PATCH v4 15/15] x86: extend the map and unmap iommu_ops to support grant references
  2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (13 preceding siblings ...)
  2018-08-01 13:40 ` [PATCH v4 14/15] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
@ 2018-08-01 13:40 ` Paul Durrant
  14 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-01 13:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch allows a domain to add or remove foreign frames from its
IOMMU mappings by grant reference as well as GFN. This is necessary,
for example, to support a PV network backend that needs to construct a
packet buffer that can be directly accessed by a NIC.
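
As a sketch (not part of the patch), a backend would now map a granted
frontend frame by setting the new flag and passing the grant reference
in place of a GFN; domid, bfn and gref below stand for whatever the
backend already knows about the grant:

    xen_iommu_op_t op = {
        .op = XEN_IOMMUOP_map,
        .u.map = {
            .domid = domid,               /* granting (frontend) domain */
            .flags = XEN_IOMMUOP_map_gref,
            .bfn = bfn,                   /* bus frame chosen by the backend */
            .u.gref = gref,               /* grant reference to map */
        },
    };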

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v2:
 - New in v2.
---
 xen/arch/x86/iommu_op.c       |  83 ++++++++++++++++--------
 xen/common/grant_table.c      | 143 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/iommu_op.h |  31 +++++++--
 xen/include/xen/grant_table.h |   7 +++
 4 files changed, 231 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
index 35daeeed92..6b563c0227 100644
--- a/xen/arch/x86/iommu_op.c
+++ b/xen/arch/x86/iommu_op.c
@@ -23,6 +23,7 @@
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
 #include <xen/nospec.h>
+#include <xen/grant_table.h>
 
 struct get_reserved_ctxt {
     unsigned int max_entries;
@@ -121,10 +122,12 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
     bfn_t bfn = _bfn(op->bfn);
     struct page_info *page;
+    mfn_t mfn;
     unsigned int prot;
     int rc, ignore;
 
-    if ( op->pad || (op->flags & ~XEN_IOMMUOP_map_readonly) )
+    if ( op->pad || (op->flags & ~(XEN_IOMMUOP_map_readonly |
+                                   XEN_IOMMUOP_map_gref)) )
         return -EINVAL;
 
     if ( !iommu->iommu_op_ranges )
@@ -138,15 +141,28 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     if ( !d )
         return -ESRCH;
 
-    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
-    if ( rc )
-        goto unlock;
+    if ( op->flags & XEN_IOMMUOP_map_gref )
+    {
+        rc = acquire_gref_for_iommu(d, op->u.gref, readonly, &mfn);
+        if ( rc )
+            goto unlock;
 
-    rc = -EINVAL;
-    if ( !readonly && !get_page_type(page, PGT_writable_page) )
+        page = mfn_to_page(mfn);
+    }
+    else
     {
-        put_page(page);
-        goto unlock;
+        rc = get_paged_gfn(d, _gfn(op->u.gfn), readonly, NULL, &page);
+        if ( rc )
+            goto unlock;
+
+        rc = -EINVAL;
+        if ( !readonly && !get_page_type(page, PGT_writable_page) )
+        {
+            put_page(page);
+            goto unlock;
+        }
+
+        mfn = page_to_mfn(page);
     }
 
     prot = IOMMUF_readable;
@@ -154,7 +170,7 @@ static int iommuop_map(struct xen_iommu_op_map *op)
         prot |= IOMMUF_writable;
 
     rc = -EIO;
-    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
+    if ( iommu_map_page(currd, bfn, mfn, prot) )
         goto release;
 
     rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
@@ -168,9 +184,14 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     ignore = iommu_unmap_page(currd, bfn);
 
  release:
-    if ( !readonly )
-        put_page_type(page);
-    put_page(page);
+    if ( op->flags & XEN_IOMMUOP_map_gref )
+        release_gref_for_iommu(d, op->u.gref, readonly, mfn);
+    else
+    {
+        if ( !readonly )
+            put_page_type(page);
+        put_page(page);
+    }
 
  unlock:
     rcu_unlock_domain(d);
@@ -185,11 +206,9 @@ static int iommuop_unmap(struct xen_iommu_op_unmap *op)
     mfn_t mfn;
     bool readonly;
     unsigned int prot;
-    struct page_info *page;
     int rc;
 
-    if ( op->pad0 || op->pad1 )
-        return -EINVAL;
+    if ( op->pad || (op->flags & ~XEN_IOMMUOP_unmap_gref) )
+        return -EINVAL;
 
     if ( !iommu->iommu_op_ranges )
         return -EOPNOTSUPP;
@@ -205,21 +224,31 @@ static int iommuop_unmap(struct xen_iommu_op_unmap *op)
     if ( !d )
         return -ESRCH;
 
-    rc = get_paged_gfn(d, _gfn(op->gfn), !(prot & IOMMUF_writable), NULL,
-                       &page);
-    if ( rc )
-        goto unlock;
+    if ( op->flags & XEN_IOMMUOP_unmap_gref )
+    {
+        rc = release_gref_for_iommu(d, op->u.gref, readonly, mfn);
+        if ( rc )
+            goto unlock;
+    }
+    else
+    {
+        struct page_info *page;
 
-    put_page(page); /* release extra reference just taken */
+        rc = get_paged_gfn(d, _gfn(op->u.gfn), readonly, NULL, &page);
+        if ( rc )
+            goto unlock;
 
-    rc = -EINVAL;
-    if ( !mfn_eq(page_to_mfn(page), mfn) )
-        goto unlock;
+        put_page(page); /* release extra reference just taken */
 
-    /* release reference taken in map */
-    if ( !readonly )
-        put_page_type(page);
-    put_page(page);
+        rc = -EINVAL;
+        if ( !mfn_eq(page_to_mfn(page), mfn) )
+            goto unlock;
+
+        /* release reference taken in map */
+        if ( !readonly )
+            put_page_type(page);
+        put_page(page);
+    }
 
     rc = rangeset_remove_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
     if ( rc )
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index cc7080bc98..fc285f9753 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3898,6 +3898,149 @@ int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
     return rc;
 }
 
+int
+acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t *mfn)
+{
+    struct domain *currd = current->domain;
+    struct grant_table *gt = d->grant_table;
+    grant_entry_header_t *shah;
+    struct active_grant_entry *act;
+    uint16_t *status;
+    int rc;
+
+    grant_read_lock(gt);
+
+    rc = -ENOENT;
+    if ( gref >= nr_grant_entries(gt) )
+        goto unlock;
+
+    act = active_entry_acquire(gt, gref);
+    shah = shared_entry_header(gt, gref);
+    status = ( gt->gt_version == 2 ) ?
+        &status_entry(gt, gref) :
+        &shah->flags;
+
+    rc = -EACCES;
+    if ( (shah->flags & GTF_type_mask) != GTF_permit_access ||
+         (shah->flags & GTF_sub_page) )
+        goto release;
+
+    rc = -ERANGE;
+    if ( act->pin && ((act->domid != currd->domain_id) ||
+                      (act->pin & 0x80808080U) != 0) )
+        goto release;
+
+    rc = -EINVAL;
+    if ( !act->pin ||
+         (!readonly && !(act->pin & GNTPIN_devw_mask)) ) {
+        if ( _set_status(gt->gt_version, currd->domain_id, readonly,
+                         0, shah, act, status) != GNTST_okay )
+            goto release;
+    }
+
+    if ( !act->pin )
+    {
+        gfn_t gfn = gt->gt_version == 1 ?
+            _gfn(shared_entry_v1(gt, gref).frame) :
+            _gfn(shared_entry_v2(gt, gref).full_page.frame);
+        struct page_info *page;
+
+        rc =  get_paged_gfn(d, gfn, readonly, NULL, &page);
+        if ( rc )
+            goto clear;
+
+        act_set_gfn(act, gfn);
+        act->mfn = page_to_mfn(page);
+        act->domid = currd->domain_id;
+        act->start = 0;
+        act->length = PAGE_SIZE;
+        act->is_sub_page = false;
+        act->trans_domain = d;
+        act->trans_gref = gref;
+    }
+    else
+    {
+        ASSERT(mfn_valid(act->mfn));
+        if ( !get_page(mfn_to_page(act->mfn), d) )
+            goto clear;
+    }
+
+    rc = 0;
+    act->pin += readonly ? GNTPIN_devr_inc : GNTPIN_devw_inc;
+    *mfn = act->mfn;
+    goto release;
+
+ clear:
+    if ( !readonly && !(act->pin & GNTPIN_devw_mask) )
+        gnttab_clear_flag(_GTF_writing, status);
+
+    if ( !act->pin )
+        gnttab_clear_flag(_GTF_reading, status);
+
+ release:
+    active_entry_release(act);
+
+ unlock:
+    grant_read_unlock(gt);
+
+    return rc;
+}
+
+int
+release_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t mfn)
+{
+    struct domain *currd = current->domain;
+    struct grant_table *gt = d->grant_table;
+    grant_entry_header_t *shah;
+    struct active_grant_entry *act;
+    uint16_t *status;
+    int rc;
+
+    grant_read_lock(gt);
+
+    rc = -ENOENT;
+    if ( gref >= nr_grant_entries(gt) )
+        goto unlock;
+
+    act = active_entry_acquire(gt, gref);
+    shah = shared_entry_header(gt, gref);
+    status = ( gt->gt_version == 2 ) ?
+        &status_entry(gt, gref) :
+        &shah->flags;
+
+    rc = -EINVAL;
+    if ( !act->pin || (act->domid != currd->domain_id) ||
+         !mfn_eq(act->mfn, mfn) )
+        goto release;
+
+    rc = 0;
+    if ( readonly )
+        act->pin -= GNTPIN_devr_inc;
+    else
+    {
+        gnttab_mark_dirty(d, mfn);
+
+        act->pin -= GNTPIN_devw_inc;
+        if ( !(act->pin & GNTPIN_devw_mask) )
+            gnttab_clear_flag(_GTF_writing, status);
+    }
+
+    if ( !act->pin )
+        gnttab_clear_flag(_GTF_reading, status);
+
+    put_page(mfn_to_page(mfn));
+
+ release:
+    active_entry_release(act);
+
+ unlock:
+    grant_read_unlock(gt);
+
+    return rc;
+}
+
 static void gnttab_usage_print(struct domain *rd)
 {
     int first = 1;
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index c8dc531c83..5a672bcead 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -24,6 +24,7 @@
 #define XEN_PUBLIC_IOMMU_OP_H
 
 #include "xen.h"
+#include "grant_table.h"
 
 typedef uint64_t xen_bfn_t;
 
@@ -79,12 +80,20 @@ struct xen_iommu_op_map {
 
 #define _XEN_IOMMUOP_map_readonly 0
 #define XEN_IOMMUOP_map_readonly (1 << (_XEN_IOMMUOP_map_readonly))
+#define _XEN_IOMMUOP_map_gref 1
+#define XEN_IOMMUOP_map_gref (1 << (_XEN_IOMMUOP_map_gref))
 
     uint32_t pad;
     /* IN - The IOMMU frame number which will hold the new mapping */
     xen_bfn_t bfn;
-    /* IN - The guest frame number of the page to be mapped */
-    xen_pfn_t gfn;
+    /*
+     * IN - The guest frame number or grant reference of the page to
+     * be mapped.
+     */
+    union {
+        xen_pfn_t gfn;
+        grant_ref_t gref;
+    } u;
 };
 
 /*
@@ -95,12 +104,22 @@ struct xen_iommu_op_map {
 struct xen_iommu_op_unmap {
     /* IN - The domid of the guest */
     domid_t domid;
-    uint16_t pad0;
-    uint32_t pad1;
+    uint16_t flags;
+
+#define _XEN_IOMMUOP_unmap_gref 0
+#define XEN_IOMMUOP_unmap_gref (1 << (_XEN_IOMMUOP_unmap_gref))
+
+    uint32_t pad;
     /* IN - The IOMMU frame number which holds the mapping to be removed */
     xen_bfn_t bfn;
-    /* IN - The guest frame number of the page that is mapped */
-    xen_pfn_t gfn;
+    /*
+     * IN - The guest frame number or grant reference of the page that
+     * is mapped.
+     */
+    union {
+        xen_pfn_t gfn;
+        grant_ref_t gref;
+    } u;
 };
 
 /*
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 0286ba33dd..0f44c6587c 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -59,6 +59,13 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
 int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
                      mfn_t *mfn);
 
+int
+acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t *mfn);
+int
+release_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t mfn);
+
 unsigned int gnttab_dom0_frames(void);
 
 #endif /* __XEN_GRANT_TABLE_H__ */
-- 
2.11.0



* Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible"...
  2018-08-01 13:40 ` [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible" Paul Durrant
@ 2018-08-01 16:15   ` Roger Pau Monné
  2018-08-02 10:26     ` Paul Durrant
  2018-08-02  7:19   ` Jan Beulich
  1 sibling, 1 reply; 24+ messages in thread
From: Roger Pau Monné @ 2018-08-01 16:15 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, xen-devel

On Wed, Aug 01, 2018 at 02:40:14PM +0100, Paul Durrant wrote:
> ...to simplify the implementation and turn need_iommu back into a boolean.
> 
> As noted in [1] the tri-state nature of need_iommu after commit 3e06b989 is
> confusing, as is the implementation of pre-emption using relmem_list.
> 
> This patch instead uses a simple count of pages already populated stored in
> the x86 variant of struct arch_iommu and skips over that number of pages
> if arch_iommu_populate_page_table() is re-invoked after pre-emption.

Urg, all this is quite ugly. It would be quite better if the need for
an IOMMU was decided before populating the p2m (ie: at domain_create
time by passing an extra XEN_DOMCTL_CDF_iommu or some such), so that
the normal addition of RAM would also populate the IOMMU page tables
if not shared.

> [1] https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg01870.html
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.


* Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible"...
  2018-08-01 13:40 ` [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible" Paul Durrant
  2018-08-01 16:15   ` Roger Pau Monné
@ 2018-08-02  7:19   ` Jan Beulich
  2018-08-02  8:02     ` Paul Durrant
  1 sibling, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-08-02  7:19 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, xen-devel

>>> On 01.08.18 at 15:40, <paul.durrant@citrix.com> wrote:
> ...to simplify the implementation and turn need_iommu back into a boolean.
> 
> As noted in [1] the tri-state nature of need_iommu after commit 3e06b989 is
> confusing, as is the implementation of pre-emption using relmem_list.
> 
> This patch instead uses a simple count of pages already populated stored in
> the x86 variant of struct arch_iommu and skips over that number of pages
> if arch_iommu_populate_page_table() is re-invoked after pre-emption.

Well, yes, I would have used that model in said commit if it was reliable,
but it isn't: What if the list of pages changed between two (re-)invocations?
Furthermore, with huge enough a guest even the skipping of the already
processed several million pages may exhaust the time acceptable between
preemption points.

Jan




* Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible"...
  2018-08-02  7:19   ` Jan Beulich
@ 2018-08-02  8:02     ` Paul Durrant
  2018-08-02  8:04       ` Paul Durrant
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Durrant @ 2018-08-02  8:02 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, xen-devel, Ian Jackson

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 02 August 2018 08:20
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make
> page table population preemptible"...
> 
> >>> On 01.08.18 at 15:40, <paul.durrant@citrix.com> wrote:
> > ...to simplify the implementation and turn need_iommu back into a
> boolean.
> >
> > As noted in [1] the tri-state nature of need_iommu after commit 3e06b989
> is
> > confusing, as is the implementation of pre-emption using relmem_list.
> >
> > This patch instead uses a simple count of pages already populated stored in
> > the x86 variant of struct arch_iommu and skips over that number of pages
> > if arch_iommu_populate_page_table() is re-invoked after pre-emption.
> 
> Well, yes, I would have used that model in said commit if it was reliable,
> but it isn't: What if the list of pages changed between two (re-)invocations?

Is that really going to happen? This is the result of a domctl, which is a tools-only hypercall right?

> Furthermore, with huge enough a guest even the skipping of the already
> processed several million pages may exhaust the time acceptable between
> preemption points.

Ok, I'll keep a pointer into the page list instead.

  Paul

> 
> Jan
> 



* Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible"...
  2018-08-02  8:02     ` Paul Durrant
@ 2018-08-02  8:04       ` Paul Durrant
  2018-08-02  8:18         ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Durrant @ 2018-08-02  8:04 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: 'Stefano Stabellini',
	Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, 'Julien Grall', 'xen-devel',
	Ian Jackson

> -----Original Message-----
> From: Paul Durrant
> Sent: 02 August 2018 09:03
> To: 'Jan Beulich' <JBeulich@suse.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: RE: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make
> page table population preemptible"...
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 02 August 2018 08:20
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> > <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
> > Dunlap <George.Dunlap@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>;
> > Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> > devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> > <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> > Subject: Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make
> > page table population preemptible"...
> >
> > >>> On 01.08.18 at 15:40, <paul.durrant@citrix.com> wrote:
> > > ...to simplify the implementation and turn need_iommu back into a
> > boolean.
> > >
> > > As noted in [1] the tri-state nature of need_iommu after commit
> 3e06b989
> > is
> > > confusing, as is the implementation of pre-emption using relmem_list.
> > >
> > > This patch instead uses a simple count of pages already populated stored
> in
> > > the x86 variant of struct arch_iommu and skips over that number of
> pages
> > > if arch_iommu_populate_page_table() is re-invoked after pre-emption.
> >
> > Well, yes, I would have used that model in said commit if it was reliable,
> > but it isn't: What if the list of pages changed between two (re-)invocations?
> 
> Is that really going to happen? This is the result of a domctl, which is a tools-
> only hypercall right?

Oh, I see what you mean... the guest could do something like a decrease_reservation... I was overlooking that setting up the iommu is happening while the guest is live. Would it be reasonable to domain_pause() for safety then?

  Paul

> 
> > Furthermore, with huge enough a guest even the skipping of the already
> > processed several million pages may exhaust the time acceptable between
> > preemption points.
> 
> Ok, I'll keep a pointer into the page list instead.
> 
>   Paul
> 
> >
> > Jan
> >



* Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible"...
  2018-08-02  8:04       ` Paul Durrant
@ 2018-08-02  8:18         ` Jan Beulich
  2018-08-02  8:49           ` Paul Durrant
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-08-02  8:18 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim Deegan,
	george.dunlap, Julien Grall, xen-devel, Ian Jackson

>>> On 02.08.18 at 10:04, <Paul.Durrant@citrix.com> wrote:
>> From: Paul Durrant
>> Sent: 02 August 2018 09:03
>> > From: Jan Beulich [mailto:JBeulich@suse.com]
>> > Sent: 02 August 2018 08:20
>> > >>> On 01.08.18 at 15:40, <paul.durrant@citrix.com> wrote:
>> > > ...to simplify the implementation and turn need_iommu back into a
>> > boolean.
>> > >
>> > > As noted in [1] the tri-state nature of need_iommu after commit
>> 3e06b989
>> > is
>> > > confusing, as is the implementation of pre-emption using relmem_list.
>> > >
>> > > This patch instead uses a simple count of pages already populated stored
>> in
>> > > the x86 variant of struct arch_iommu and skips over that number of
>> pages
>> > > if arch_iommu_populate_page_table() is re-invoked after pre-emption.
>> >
>> > Well, yes, I would have used that model in said commit if it was reliable,
>> > but it isn't: What if the list of pages changed between two (re-)invocations?
>> 
>> Is that really going to happen? This is the result of a domctl, which is a tools-
>> only hypercall right?
> 
> Oh, I see what you mean... the guest could do something like a 
> decrease_reservation... I was overlooking that setting up the iommu is 
> happening while the guest is live. Would it be reasonable to domain_pause() 
> for safety then?

I'm hesitant to see domains (or vcpus) paused other than when absolutely
necessary. If everyone else thinks this is a good idea here, I think I won't
object, but please don't forget that any pausing for perhaps an extended
period of time may cause the guest to misbehave subsequently.

Jan




* Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible"...
  2018-08-02  8:18         ` Jan Beulich
@ 2018-08-02  8:49           ` Paul Durrant
  2018-08-02 10:01             ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Durrant @ 2018-08-02  8:49 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, xen-devel, Ian Jackson

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 02 August 2018 09:19
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu
> <wei.liu2@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel <xen-devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: RE: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make
> page table population preemptible"...
> 
> >>> On 02.08.18 at 10:04, <Paul.Durrant@citrix.com> wrote:
> >> From: Paul Durrant
> >> Sent: 02 August 2018 09:03
> >> > From: Jan Beulich [mailto:JBeulich@suse.com]
> >> > Sent: 02 August 2018 08:20
> >> > >>> On 01.08.18 at 15:40, <paul.durrant@citrix.com> wrote:
> >> > > ...to simplify the implementation and turn need_iommu back into a
> >> > boolean.
> >> > >
> >> > > As noted in [1] the tri-state nature of need_iommu after commit
> >> 3e06b989
> >> > is
> >> > > confusing, as is the implementation of pre-emption using relmem_list.
> >> > >
> >> > > This patch instead uses a simple count of pages already populated
> stored
> >> in
> >> > > the x86 variant of struct arch_iommu and skips over that number of
> >> pages
> >> > > if arch_iommu_populate_page_table() is re-invoked after pre-
> emption.
> >> >
> >> > Well, yes, I would have used that model in said commit if it was reliable,
> >> > but it isn't: What if the list of pages changed between two (re-
> )invocations?
> >>
> >> Is that really going to happen? This is the result of a domctl, which is a
> tools-
> >> only hypercall right?
> >
> > Oh, I see what you mean... the guest could do something like a
> > decrease_reservation... I was overlooking that setting up the iommu is
> > happening while the guest is live. Would it be reasonable to
> domain_pause()
> > for safety then?
> 
> I'm hesitant to see domains (or vcpus) paused other than when absolutely
> necessary. If everyone else thinks this is a good idea here, I think I won't
> object, but please don't forget that any pausing for perhaps an extended
> period of time may cause the guest to misbehave subsequently.
> 

Yes, true, but I also wonder how safe it is to empty the page_list of a running guest. I guess it may be the best way but I think having a dedicated page_list in the iommu struct to host pages that have already been mapped and then transfer them back at the end would be cleaner and should allow need_iommu to stay boolean.

Also, given the expense of the operation to set up the mappings, I guess it may also be a good idea to leave them alone until domain destruction once the set-up has been done rather than removing them when the last device is de-assigned.

  Paul

> Jan
> 



* Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible"...
  2018-08-02  8:49           ` Paul Durrant
@ 2018-08-02 10:01             ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-08-02 10:01 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim Deegan,
	george.dunlap, Julien Grall, xen-devel, Ian Jackson

>>> On 02.08.18 at 10:49, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 02 August 2018 09:19
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; George Dunlap
>> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu
>> <wei.liu2@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
>> devel <xen-devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
>> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
>> Subject: RE: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make
>> page table population preemptible"...
>> 
>> >>> On 02.08.18 at 10:04, <Paul.Durrant@citrix.com> wrote:
>> >> From: Paul Durrant
>> >> Sent: 02 August 2018 09:03
>> >> > From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> > Sent: 02 August 2018 08:20
>> >> > >>> On 01.08.18 at 15:40, <paul.durrant@citrix.com> wrote:
>> >> > > ...to simplify the implementation and turn need_iommu back into a
>> >> > boolean.
>> >> > >
>> >> > > As noted in [1] the tri-state nature of need_iommu after commit
>> >> 3e06b989
>> >> > is
>> >> > > confusing, as is the implementation of pre-emption using relmem_list.
>> >> > >
>> >> > > This patch instead uses a simple count of pages already populated
>> stored
>> >> in
>> >> > > the x86 variant of struct arch_iommu and skips over that number of
>> >> pages
>> >> > > if arch_iommu_populate_page_table() is re-invoked after pre-
>> emption.
>> >> >
>> >> > Well, yes, I would have used that model in said commit if it was reliable,
>> >> > but it isn't: What if the list of pages changed between two (re-
>> )invocations?
>> >>
>> >> Is that really going to happen? This is the result of a domctl, which is a
>> tools-
>> >> only hypercall right?
>> >
>> > Oh, I see what you mean... the guest could do something like a
>> > decrease_reservation... I was overlooking that setting up the iommu is
>> > happening while the guest is live. Would it be reasonable to
>> domain_pause()
>> > for safety then?
>> 
>> I'm hesitant to see domains (or vcpus) paused other than when absolutely
>> necessary. If everyone else thinks this is a good idea here, I think I won't
>> object, but please don't forget that any pausing for perhaps an extended
>> period of time may cause the guest to misbehave subsequently.
>> 
> 
> Yes, true, but I also wonder how safe it is to empty the page_list of a 
> running guest. I guess it may be the best way but I think having a dedicated 
> page_list in the iommu struct to host pages that have already been mapped and 
> then transfer them back at the end would be cleaner and should allow 
> need_iommu to stay boolean.

The issue is with the case of the guest trying to free a page - see
arch_free_heap_page(): The way page lists work when link fields
are 32-bit PDXes, all lists a page may possibly be on need to be
inspected, in case the page is at the head of that list. Introducing a
3rd list to take into consideration seems undesirable to me (and
again I did consider that option back when writing things the way
they are now).

Jan




* Re: [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible"...
  2018-08-01 16:15   ` Roger Pau Monné
@ 2018-08-02 10:26     ` Paul Durrant
  0 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-08-02 10:26 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson, xen-devel

> -----Original Message-----
> From: Roger Pau Monne
> Sent: 01 August 2018 17:15
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wei.liu2@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>; Jan Beulich
> <jbeulich@suse.com>
> Subject: Re: [Xen-devel] [PATCH v4 01/15] re-work commit 3e06b989
> "IOMMU: make page table population preemptible"...
> 
> On Wed, Aug 01, 2018 at 02:40:14PM +0100, Paul Durrant wrote:
> > ...to simplify the implementation and turn need_iommu back into a
> > boolean.
> >
> > As noted in [1] the tri-state nature of need_iommu after commit 3e06b989
> > is confusing, as is the implementation of pre-emption using relmem_list.
> >
> > This patch instead uses a simple count of pages already populated stored
> > in the x86 variant of struct arch_iommu and skips over that number of
> > pages if arch_iommu_populate_page_table() is re-invoked after pre-emption.
> 
> Urg, all this is quite ugly. It would be much better if the need for
> an IOMMU was decided before populating the p2m (i.e. at domain_create
> time by passing an extra XEN_DOMCTL_CDF_iommu or some such), so that
> the normal addition of RAM would also populate the IOMMU page tables
> if they are not shared.

Given the problems that Jan has highlighted with using a skip count, and
essentially being forced to mess with the domain's page list, I think your
suggestion is best. Then we can entirely bin
arch_iommu_populate_page_table(). That is a bigger piece of work though, so
in the interest of not blocking the rest of this series I'll drop this
patch and replace it with a simpler one that's just sufficient to make
need_iommu boolean again.

  Paul

> 
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg01870.html
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Thanks, Roger.
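
To make the alternative concrete, here is a rough, self-contained sketch
(hypothetical names and structures throughout; this is not actual Xen code
and the flag value is invented) of the domain_create-time decision Roger
suggests above:

#include <stdbool.h>

/* Illustrative stand-ins only. */
struct domain_sketch {
    bool need_iommu;        /* known from creation, a plain boolean again */
    bool iommu_shared_pt;   /* IOMMU sharing the CPU page tables? */
};

/* Made-up bit standing in for the suggested XEN_DOMCTL_CDF_iommu flag. */
#define CDF_IOMMU_SKETCH (1u << 4)

/* Stubs so the sketch is self-contained. */
static int p2m_add_sketch(struct domain_sketch *d,
                          unsigned long gfn, unsigned long mfn)
{ (void)d; (void)gfn; (void)mfn; return 0; }

static int iommu_map_sketch(struct domain_sketch *d,
                            unsigned long gfn, unsigned long mfn)
{ (void)d; (void)gfn; (void)mfn; return 0; }

/* Creation-time decision: the toolstack passes the flag up front. */
static void domain_create_sketch(struct domain_sketch *d, unsigned int flags)
{
    d->need_iommu = !!(flags & CDF_IOMMU_SKETCH);
}

/* RAM-addition path: because need_iommu is known from the start, IOMMU
 * mappings are made as memory is populated, so there is no page list to
 * walk afterwards and no preemption/resume bookkeeping. */
static int add_ram_page_sketch(struct domain_sketch *d,
                               unsigned long gfn, unsigned long mfn)
{
    int rc = p2m_add_sketch(d, gfn, mfn);

    if ( !rc && d->need_iommu && !d->iommu_shared_pt )
        rc = iommu_map_sketch(d, gfn, mfn);

    return rc;
}

With the decision fixed at creation time, arch_iommu_populate_page_table()
has nothing left to do and need_iommu can stay a plain boolean, which is
what makes the suggestion attractive here.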



Thread overview: 24+ messages
2018-08-01 13:40 [PATCH v4 00/15] paravirtual IOMMU interface Paul Durrant
2018-08-01 13:40 ` [PATCH v4 01/15] re-work commit 3e06b989 "IOMMU: make page table population preemptible" Paul Durrant
2018-08-01 16:15   ` Roger Pau Monné
2018-08-02 10:26     ` Paul Durrant
2018-08-02  7:19   ` Jan Beulich
2018-08-02  8:02     ` Paul Durrant
2018-08-02  8:04       ` Paul Durrant
2018-08-02  8:18         ` Jan Beulich
2018-08-02  8:49           ` Paul Durrant
2018-08-02 10:01             ` Jan Beulich
2018-08-01 13:40 ` [PATCH v4 02/15] iommu: introduce the concept of BFN Paul Durrant
2018-08-01 13:40 ` [PATCH v4 03/15] iommu: make use of type-safe BFN and MFN in exported functions Paul Durrant
2018-08-01 13:40 ` [PATCH v4 04/15] iommu: push use of type-safe BFN and MFN into iommu_ops Paul Durrant
2018-08-01 13:40 ` [PATCH v4 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
2018-08-01 13:40 ` [PATCH v4 06/15] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
2018-08-01 13:40 ` [PATCH v4 07/15] iommu: track reserved ranges using a rangeset Paul Durrant
2018-08-01 13:40 ` [PATCH v4 08/15] x86: add iommu_op to query reserved ranges Paul Durrant
2018-08-01 13:40 ` [PATCH v4 09/15] vtd: add lookup_page method to iommu_ops Paul Durrant
2018-08-01 13:40 ` [PATCH v4 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
2018-08-01 13:40 ` [PATCH v4 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt() Paul Durrant
2018-08-01 13:40 ` [PATCH v4 12/15] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
2018-08-01 13:40 ` [PATCH v4 13/15] memory: add get_paged_gfn() as a wrapper Paul Durrant
2018-08-01 13:40 ` [PATCH v4 14/15] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
2018-08-01 13:40 ` [PATCH v4 15/15] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
